Commit f919acd: notebook polished
kislayabhi committed Aug 18, 2014
1 parent 37feede commit f919acd
Showing 1 changed file with 54 additions and 33 deletions.
87 changes: 54 additions & 33 deletions doc/ipython-notebooks/computer_vision/Scene_classification.ipynb
@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:8f3bc27b3fa8431f807aa2d495969f8b1db0f6c5eaef9020e691ddfa7e85a2a7"
"signature": "sha256:5b1851d5a815506d7130f5bea071f5f491fba4a5ee446ff0d618893f064447de"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -73,13 +73,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"SIFT extracts keypoints and compute its descriptors. It requires the following steps to be done:\n",
"* **Scale-space Extrema Detection**: Difference of Gaussian(DOG) are used to search for local extrema over scale and space.\n",
"SIFT extracts keypoints and computes its descriptors. It requires the following steps to be done:\n",
"* **Scale-space Extrema Detection**: Difference of Gaussian (DOG) are used to search for local extrema over scale and space.\n",
"* **Keypoint Localization**: Once potential keypoints are found, we refine them by eliminating low-contrast keypoints and edge keypoints.\n",
"* **Orientation Assignment**: Now an orientation is assigned to each keypoint to achieve invariance to image rotation.\n",
"* **Keypoint Descriptor**: Now a keypoint descriptor is created. A total of 128 elements are available for each keypoint.\n",
"\n",
"To get more details about SIFT in OpenCV, Do read OpenCV python Documentation <a href=\"http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_feature2d/py_sift_intro/py_sift_intro.html\">here</a>.\n",
"To get more details about SIFT in OpenCV, do read OpenCV python documentation <a href=\"http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_feature2d/py_sift_intro/py_sift_intro.html\">here</a>.\n",
"\n",
"OpenCV has a nice API for using SIFT. Let's see what we are looking at:"
]
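A minimal sketch of that API, assuming an OpenCV 2.4-era build where the extractor is constructed as cv2.SIFT() (newer releases use cv2.SIFT_create()) and a hypothetical input file sample.jpg:

```python
import cv2

# hypothetical input image; the 0 flag loads it as grayscale
gray = cv2.imread('sample.jpg', 0)

# construct the SIFT extractor (cv2.SIFT_create() on OpenCV >= 3)
sift = cv2.SIFT()

# detect keypoints and compute their 128-element descriptors in one call
keypoints, descriptors = sift.detectAndCompute(gray, None)
print len(keypoints), descriptors.shape  # n keypoints, an (n, 128) array
```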
@@ -97,12 +97,12 @@
"# get the list of all jpg images from the path provided\n",
"import os\n",
"def get_imlist(path):\n",
" return [[os.path.join(path,f) for f in os.listdir(path) if f.endswith('.jpg')]]\n",
" return [[os.path.join(path,f) for f in os.listdir(path) if (f.endswith('.jpg') or f.endswith('.png'))]]\n",
"\n",
"#Use the following function when reading an image through OpenCV and displaying through plt.\n",
"def showfig(image, ucmap):\n",
" #There is a difference in pixel ordering in OpenCV and Matplotlib.\n",
" #OpenCV follows BGR order, while matplotlib likely follows RGB order.\n",
" #OpenCV follows BGR order, while matplotlib follows RGB order.\n",
" if len(image.shape)==3 :\n",
" b,g,r = cv2.split(image) # get b,g,r\n",
" image = cv2.merge([r,g,b]) # switch it to rgb\n",
Expand All @@ -118,15 +118,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We try to construct the vocabulary from a set of template images. It is a set of three general images belonging to the category of car, plane and train. "
"We try to construct the vocabulary from a set of template images. It is a set of three general images belonging to the category of car, plane and train. \n",
"\n",
"OpenCV also provides **cv2.drawKeyPoints()** function which draws the small circles on the locations of keypoints. If you pass a flag, **cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS** to it, it will draw a circle with size of keypoint and it will even show its orientation. See below example."
]
},
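A standalone sketch of that call, under the same assumptions as above (hypothetical filename; showfig() as defined earlier):

```python
import cv2

img = cv2.imread('sample.jpg')  # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detect keypoints only; descriptors are not needed for drawing
keypoints = cv2.SIFT().detect(gray, None)

# rich drawing: circles scaled to keypoint size, with orientation lines
annotated = cv2.drawKeypoints(img, keypoints,
                              flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
showfig(annotated, None)
```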
{
"cell_type": "code",
"collapsed": false,
"input": [
"plt.rcParams['figure.figsize'] = 17, 4\n",
"filenames=get_imlist('sample/template/')\n",
"filenames=get_imlist('../../../data/SIFT/template/')\n",
"filenames=np.array(filenames)\n",
"\n",
"# for keeping all the descriptors from the template images\n",
@@ -162,8 +164,8 @@
"source": [
"***2. Group similar descriptors into an arbitrary number of clusters***.\n",
"\n",
"We take all the descriptors that we got from the above 3 images and find similarity in between them.\n",
"Here, similarity is decided by <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CEuclideanDistance.html\">Euclidean distance</a> between the 128-element SIFT descriptors. Similar descriptors are clustered into **k** number of groups. This can be done using Shogun's <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKNN.html\">**KMeans class**</a>. These clusters are called **bags of keypoints** or **visual words** and they collectively represent the **vocabulary** of the program. Each cluster has a **cluster center**, which can be thought of as the representative descriptor of all the descriptors belonging to that cluster. These cluster centers can be found using <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKMeans.html#a5d8a09aeadada018747786a5470d3653\">**get_cluster_centers()**</a> method.\n",
"We take all the descriptors that we got from the three images above and find similarity in between them.\n",
"Here, similarity is decided by <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CEuclideanDistance.html\">Euclidean distance</a> between the 128-element SIFT descriptors. Similar descriptors are clustered into **k** number of groups. This can be done using Shogun's <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKNN.html\">**KMeans class**</a>. These clusters are called **bags of keypoints** or **visual words** and they collectively represent the **vocabulary** of the program. Each cluster has a **cluster center**, which can be thought of as the representative descriptor of all the descriptors belonging to that cluster. These cluster centers can be found using the <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKMeans.html#a5d8a09aeadada018747786a5470d3653\">**get_cluster_centers()**</a> method.\n",
"\n",
"To perform clustering into **k** groups, we define the **get_similar_descriptors()** function below."
]
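The core of get_similar_descriptors() looks roughly like the sketch below (Shogun's python_modular interface; descriptors is assumed to be an (n, 128) NumPy array stacked from all template images, and cluster_descriptors is a hypothetical name):

```python
import numpy as np
from modshogun import RealFeatures, EuclideanDistance, KMeans

def cluster_descriptors(descriptors, k):
    # Shogun expects float64 features laid out column-wise
    feats = RealFeatures(descriptors.astype(np.float64).T)

    # k-means under the Euclidean distance between SIFT descriptors
    kmeans = KMeans(k, EuclideanDistance(feats, feats))
    kmeans.train()

    # one 128-element center per visual word: a (128, k) matrix
    return kmeans.get_cluster_centers()
```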
Expand Down Expand Up @@ -226,19 +228,19 @@
"In short, we approximated each training image into a **k** element vector. This can be utilized to train any multiclass classifier.\n",
"\n",
"\n",
"First, let us see few training images"
"First, let us see a few training images"
]
},
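A minimal sketch of that per-image k-element vector, using the same modshogun KNN API the notebook relies on later (bow_histogram is a hypothetical helper; cluster_centers is the (128, k) matrix from the clustering step):

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, EuclideanDistance, KNN

def bow_histogram(descriptors, cluster_centers):
    k = cluster_centers.shape[1]
    centers = RealFeatures(cluster_centers)
    # label each visual word 0..k-1
    labels = MulticlassLabels(np.arange(k, dtype=np.float64))

    # 1-nearest-neighbour maps every descriptor to its closest visual word
    knn = KNN(1, EuclideanDistance(centers, centers), labels)
    knn.train(centers)
    words = knn.apply(RealFeatures(descriptors.astype(np.float64).T)).get_labels()

    # the k-element vector representing this image
    hist, _ = np.histogram(words, bins=k, range=(0, k))
    return hist
```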
{
"cell_type": "code",
"collapsed": false,
"input": [
"# name of all the folders together\n",
"folders=['cars','planes','train']\n",
"folders=['cars','planes','trains']\n",
"training_sample=[]\n",
"for folder in folders:\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
" for i in xrange(10):\n",
" temp=cv2.imread(filenames[0][i])\n",
" training_sample.append(temp)\n",
Expand All @@ -258,7 +260,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"we here define **get_sift_training()** function to get all the **SIFT** descriptors present in all the training images."
"We here define **get_sift_training()** function to get all the **SIFT** descriptors present in all the training images."
]
},
{
@@ -268,7 +270,7 @@
"def get_sift_training():\n",
" \n",
" # name of all the folders together\n",
" folders=['cars','planes','train']\n",
" folders=['cars','planes','trains']\n",
" \n",
" folder_number=-1\n",
" des_training=[]\n",
@@ -277,7 +279,7 @@
" folder_number+=1\n",
"\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
" filenames=np.array(filenames)\n",
" \n",
" des_per_folder=[]\n",
@@ -350,14 +352,14 @@
" knn.train(sg_cluster_centers)\n",
"\n",
" # name of all the folders together\n",
" folders=['cars','planes','train']\n",
" folders=['cars','planes','trains']\n",
" folder_number=-1\n",
"\n",
" for folder in folders:\n",
" folder_number+=1\n",
"\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
"\n",
" for image_name in xrange(len(filenames[0])):\n",
" \n",
@@ -456,7 +458,7 @@
"# Lets see the testing images\n",
"testing_sample=[]\n",
"#get all the testing images \n",
"filenames=get_imlist('sample/test_image/')\n",
"filenames=get_imlist('../../../data/SIFT/test_image/')\n",
"for i in xrange(len(filenames[0])):\n",
" temp=cv2.imread(filenames[0][i])\n",
" testing_sample.append(temp)\n",
@@ -484,7 +486,7 @@
"collapsed": false,
"input": [
"def get_sift_testing():\n",
" filenames=get_imlist('sample/test_image/')\n",
" filenames=get_imlist('../../../data/SIFT/test_image/')\n",
" filenames=np.array(filenames)\n",
" des_testing=[]\n",
" for image_name in filenames[0]:\n",
@@ -531,7 +533,7 @@
" \n",
" # a list to hold histograms of all the test images\n",
" all_histograms=[]\n",
" filenames=get_imlist('sample/test_image/')\n",
" filenames=get_imlist('../../../data/SIFT/test_image/')\n",
" \n",
" for image_name in xrange(len(filenames[0])):\n",
" \n",
@@ -619,13 +621,14 @@
"collapsed": false,
"input": [
"import re\n",
"filenames=get_imlist('sample/test_image/')\n",
"filenames=get_imlist('../../../data/SIFT/test_image/')\n",
"# get the formation of the files, later to be used for calculating the confusion matrix\n",
"formation=([int(''.join(x for x in filename if x.isdigit())) for filename in filenames[0]])\n",
" \n",
"# associate them with the correct labels by making a dictionary\n",
"keys=range(len(filenames[0]))\n",
"values=[0,1,0,2,1,0,1,0,0,0,1,1,2,0,0,2,2,2,1,1,1,1]\n",
"\n",
"values=[0,1,0,2,1,0,1,0,0,0,1,2,2,2,2,1,1,1,1,1]\n",
"label_dict=dict(zip(keys, values))\n",
"\n",
"# the following list holds the actual labels\n",
@@ -648,6 +651,9 @@
"cell_type": "code",
"collapsed": false,
"input": [
"best_k=1\n",
"max_accuracy=0\n",
"\n",
"for k in xrange(1,5):\n",
" k=100*k\n",
" \n",
@@ -662,10 +668,16 @@
" \n",
" # step 5\n",
" predicted=classify_svm(k, knn, descriptor_testing)\n",
" print \"for a k=%d, accuracy is %d%%\"%(k,sum(predicted==expected)*100/float(len(expected)))\n",
" accuracy=sum(predicted==expected)*100/float(len(expected))\n",
" print \"for a k=%d, accuracy is %d%%\"%(k, accuracy)\n",
" \n",
" #step 6\n",
" m=create_conf_matrix(expected, predicted, 3)\n",
"\n",
" if accuracy>max_accuracy:\n",
" best_k=k\n",
" max_accuracy=accuracy\n",
" best_prediction=predicted\n",
" \n",
" print \"confusion matrix for k=%d\"%k\n",
" print m"
@@ -678,9 +690,26 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"From all the above k's we choose the one which has the best accuracy. Number of k's can be extended further to enhance the overall accuracy."
"From all the above k's we choose the one which has the best accuracy. Number of k's can be extended further to enhance the overall accuracy.\n",
"\n",
"Test images along with their predicted labels are shown below for the most optimum value of k: "
]
},
+{
+"cell_type": "code",
+"collapsed": false,
+"input": [
+"plt.rcParams['figure.figsize']=20,8\n",
+"fig=plt.figure()\n",
+"for image_no in xrange(len(filenames[0])):\n",
+" fig.add_subplot(3,8, image_no+1)\n",
+" plt.title('pred. class: '+folders[int(best_prediction[image_no])])\n",
+" showfig(testing_sample[image_no], None)"
+],
+"language": "python",
+"metadata": {},
+"outputs": []
+},
{
"cell_type": "heading",
"level": 2,
@@ -714,14 +743,6 @@
"\n",
"* Practical OpenCV by Samarth Brahmbhatt, University of Pennsylvania "
]
-},
-{
-"cell_type": "code",
-"collapsed": false,
-"input": [],
-"language": "python",
-"metadata": {},
-"outputs": []
}
],
"metadata": {}