Commit f919acd: notebook polished
kislayabhi committed Aug 18, 2014
1 parent 37feede commit f919acd
Showing 1 changed file with 54 additions and 33 deletions.
87 changes: 54 additions & 33 deletions doc/ipython-notebooks/computer_vision/Scene_classification.ipynb
@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:8f3bc27b3fa8431f807aa2d495969f8b1db0f6c5eaef9020e691ddfa7e85a2a7"
"signature": "sha256:5b1851d5a815506d7130f5bea071f5f491fba4a5ee446ff0d618893f064447de"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -73,13 +73,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"SIFT extracts keypoints and compute its descriptors. It requires the following steps to be done:\n",
"* **Scale-space Extrema Detection**: Difference of Gaussian(DOG) are used to search for local extrema over scale and space.\n",
"SIFT extracts keypoints and computes its descriptors. It requires the following steps to be done:\n",
"* **Scale-space Extrema Detection**: Difference of Gaussian (DOG) are used to search for local extrema over scale and space.\n",
"* **Keypoint Localization**: Once potential keypoints are found, we refine them by eliminating low-contrast keypoints and edge keypoints.\n",
"* **Orientation Assignment**: Now an orientation is assigned to each keypoint to achieve invariance to image rotation.\n",
"* **Keypoint Descriptor**: Now a keypoint descriptor is created. A total of 128 elements are available for each keypoint.\n",
"\n",
"To get more details about SIFT in OpenCV, Do read OpenCV python Documentation <a href=\"http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_feature2d/py_sift_intro/py_sift_intro.html\">here</a>.\n",
"To get more details about SIFT in OpenCV, do read OpenCV python documentation <a href=\"http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_feature2d/py_sift_intro/py_sift_intro.html\">here</a>.\n",
"\n",
"OpenCV has a nice API for using SIFT. Let's see what we are looking at:"
]
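A minimal sketch of that API, assuming an OpenCV 2.4-era build where the extractor is constructed as cv2.SIFT() (newer releases use cv2.SIFT_create()) and a hypothetical input file sample.jpg:

```python
import cv2

# hypothetical input image; the 0 flag loads it as grayscale
gray = cv2.imread('sample.jpg', 0)

# construct the SIFT extractor (cv2.SIFT_create() on OpenCV >= 3)
sift = cv2.SIFT()

# detect keypoints and compute their 128-element descriptors in one call
keypoints, descriptors = sift.detectAndCompute(gray, None)
print len(keypoints), descriptors.shape  # n keypoints, an (n, 128) array
```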
@@ -97,12 +97,12 @@
"# get the list of all jpg images from the path provided\n",
"import os\n",
"def get_imlist(path):\n",
" return [[os.path.join(path,f) for f in os.listdir(path) if f.endswith('.jpg')]]\n",
" return [[os.path.join(path,f) for f in os.listdir(path) if (f.endswith('.jpg') or f.endswith('.png'))]]\n",
"\n",
"#Use the following function when reading an image through OpenCV and displaying through plt.\n",
"def showfig(image, ucmap):\n",
" #There is a difference in pixel ordering in OpenCV and Matplotlib.\n",
" #OpenCV follows BGR order, while matplotlib likely follows RGB order.\n",
" #OpenCV follows BGR order, while matplotlib follows RGB order.\n",
" if len(image.shape)==3 :\n",
" b,g,r = cv2.split(image) # get b,g,r\n",
" image = cv2.merge([r,g,b]) # switch it to rgb\n",
Expand All @@ -118,15 +118,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We try to construct the vocabulary from a set of template images. It is a set of three general images belonging to the category of car, plane and train. "
"We try to construct the vocabulary from a set of template images. It is a set of three general images belonging to the category of car, plane and train. \n",
"\n",
"OpenCV also provides **cv2.drawKeyPoints()** function which draws the small circles on the locations of keypoints. If you pass a flag, **cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS** to it, it will draw a circle with size of keypoint and it will even show its orientation. See below example."
]
},
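A standalone sketch of that call, under the same assumptions as above (hypothetical filename; showfig() as defined earlier):

```python
import cv2

img = cv2.imread('sample.jpg')  # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detect keypoints only; descriptors are not needed for drawing
keypoints = cv2.SIFT().detect(gray, None)

# rich drawing: circles scaled to keypoint size, with orientation lines
annotated = cv2.drawKeypoints(img, keypoints,
                              flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
showfig(annotated, None)
```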
{
"cell_type": "code",
"collapsed": false,
"input": [
"plt.rcParams['figure.figsize'] = 17, 4\n",
"filenames=get_imlist('sample/template/')\n",
"filenames=get_imlist('../../../data/SIFT/template/')\n",
"filenames=np.array(filenames)\n",
"\n",
"# for keeping all the descriptors from the template images\n",
@@ -162,8 +164,8 @@
"source": [
"***2. Group similar descriptors into an arbitrary number of clusters***.\n",
"\n",
"We take all the descriptors that we got from the above 3 images and find similarity in between them.\n",
"Here, similarity is decided by <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CEuclideanDistance.html\">Euclidean distance</a> between the 128-element SIFT descriptors. Similar descriptors are clustered into **k** number of groups. This can be done using Shogun's <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKNN.html\">**KMeans class**</a>. These clusters are called **bags of keypoints** or **visual words** and they collectively represent the **vocabulary** of the program. Each cluster has a **cluster center**, which can be thought of as the representative descriptor of all the descriptors belonging to that cluster. These cluster centers can be found using <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKMeans.html#a5d8a09aeadada018747786a5470d3653\">**get_cluster_centers()**</a> method.\n",
"We take all the descriptors that we got from the three images above and find similarity in between them.\n",
"Here, similarity is decided by <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CEuclideanDistance.html\">Euclidean distance</a> between the 128-element SIFT descriptors. Similar descriptors are clustered into **k** number of groups. This can be done using Shogun's <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKNN.html\">**KMeans class**</a>. These clusters are called **bags of keypoints** or **visual words** and they collectively represent the **vocabulary** of the program. Each cluster has a **cluster center**, which can be thought of as the representative descriptor of all the descriptors belonging to that cluster. These cluster centers can be found using the <a href=\"http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKMeans.html#a5d8a09aeadada018747786a5470d3653\">**get_cluster_centers()**</a> method.\n",
"\n",
"To perform clustering into **k** groups, we define the **get_similar_descriptors()** function below."
]
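The core of get_similar_descriptors() looks roughly like the sketch below (Shogun's python_modular interface; descriptors is assumed to be an (n, 128) NumPy array stacked from all template images, and cluster_descriptors is a hypothetical name):

```python
import numpy as np
from modshogun import RealFeatures, EuclideanDistance, KMeans

def cluster_descriptors(descriptors, k):
    # Shogun expects float64 features laid out column-wise
    feats = RealFeatures(descriptors.astype(np.float64).T)

    # k-means under the Euclidean distance between SIFT descriptors
    kmeans = KMeans(k, EuclideanDistance(feats, feats))
    kmeans.train()

    # one 128-element center per visual word: a (128, k) matrix
    return kmeans.get_cluster_centers()
```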
Expand Down Expand Up @@ -226,19 +228,19 @@
"In short, we approximated each training image into a **k** element vector. This can be utilized to train any multiclass classifier.\n",
"\n",
"\n",
"First, let us see few training images"
"First, let us see a few training images"
]
},
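A minimal sketch of that per-image k-element vector, using the same modshogun KNN API the notebook relies on later (bow_histogram is a hypothetical helper; cluster_centers is the (128, k) matrix from the clustering step):

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, EuclideanDistance, KNN

def bow_histogram(descriptors, cluster_centers):
    k = cluster_centers.shape[1]
    centers = RealFeatures(cluster_centers)
    # label each visual word 0..k-1
    labels = MulticlassLabels(np.arange(k, dtype=np.float64))

    # 1-nearest-neighbour maps every descriptor to its closest visual word
    knn = KNN(1, EuclideanDistance(centers, centers), labels)
    knn.train(centers)
    words = knn.apply(RealFeatures(descriptors.astype(np.float64).T)).get_labels()

    # the k-element vector representing this image
    hist, _ = np.histogram(words, bins=k, range=(0, k))
    return hist
```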
{
"cell_type": "code",
"collapsed": false,
"input": [
"# name of all the folders together\n",
"folders=['cars','planes','train']\n",
"folders=['cars','planes','trains']\n",
"training_sample=[]\n",
"for folder in folders:\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
" for i in xrange(10):\n",
" temp=cv2.imread(filenames[0][i])\n",
" training_sample.append(temp)\n",
Expand All @@ -258,7 +260,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"we here define **get_sift_training()** function to get all the **SIFT** descriptors present in all the training images."
"We here define **get_sift_training()** function to get all the **SIFT** descriptors present in all the training images."
]
},
{
@@ -268,7 +270,7 @@
"def get_sift_training():\n",
" \n",
" # name of all the folders together\n",
" folders=['cars','planes','train']\n",
" folders=['cars','planes','trains']\n",
" \n",
" folder_number=-1\n",
" des_training=[]\n",
@@ -277,7 +279,7 @@
" folder_number+=1\n",
"\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
" filenames=np.array(filenames)\n",
" \n",
" des_per_folder=[]\n",
@@ -350,14 +352,14 @@
" knn.train(sg_cluster_centers)\n",
"\n",
" # name of all the folders together\n",
" folders=['cars','planes','train']\n",
" folders=['cars','planes','trains']\n",
" folder_number=-1\n",
"\n",
" for folder in folders:\n",
" folder_number+=1\n",
"\n",
" #get all the training images from a particular class \n",
" filenames=get_imlist('sample/%s'%folder)\n",
" filenames=get_imlist('../../../data/SIFT/%s'%folder)\n",
"\n",
" for image_name in xrange(len(filenames[0])):\n",
" \n",
@@ -456,7 +458,7 @@
"# Lets see the testing images\n",
"testing_sample=[]\n",
"#get all the testing images \n",
"filenames=get_imlist('sample/test_image/')\n",
"filenames=get_imlist('../../../data/SIFT/test_image/')\n",
"for i in xrange(len(filenames[0])):\n",
" temp=cv2.imread(filenames[0][i])\n",
" testing_sample.append(temp)\n",
@@ -484,7 +486,7 @@
"collapsed": false,
"input": [
"def get_sift_testing():\n",
" filenames=get_imlist('sample/test_image/')\n",
" filenames=get_imlist('../../../data/SIFT/test_image/')\n",
" filenames=np.array(filenames)\n",
" des_testing=[]\n",
" for image_name in filenames[0]:\n",
@@ -531,7 +533,7 @@
" \n",
" # a list to hold histograms of all the test images\n",
" all_histograms=[]\n",
" filenames=get_imlist('sample/test_image/')\n",
" filenames=get_imlist('../../../data/SIFT/test_image/')\n",
" \n",
" for image_name in xrange(len(filenames[0])):\n",
" \n",
@@ -619,13 +621,14 @@
"collapsed": false,
"input": [
"import re\n",
"filenames=get_imlist('sample/test_image/')\n",
"filenames=get_imlist('../../../data/SIFT/test_image/')\n",
"# get the formation of the files, later to be used for calculating the confusion matrix\n",
"formation=([int(''.join(x for x in filename if x.isdigit())) for filename in filenames[0]])\n",
" \n",
"# associate them with the correct labels by making a dictionary\n",
"keys=range(len(filenames[0]))\n",
"values=[0,1,0,2,1,0,1,0,0,0,1,1,2,0,0,2,2,2,1,1,1,1]\n",
"\n",
"values=[0,1,0,2,1,0,1,0,0,0,1,2,2,2,2,1,1,1,1,1]\n",
"label_dict=dict(zip(keys, values))\n",
"\n",
"# the following list holds the actual labels\n",
@@ -648,6 +651,9 @@
"cell_type": "code",
"collapsed": false,
"input": [
"best_k=1\n",
"max_accuracy=0\n",
"\n",
"for k in xrange(1,5):\n",
" k=100*k\n",
" \n",
@@ -662,10 +668,16 @@
" \n",
" # step 5\n",
" predicted=classify_svm(k, knn, descriptor_testing)\n",
" print \"for a k=%d, accuracy is %d%%\"%(k,sum(predicted==expected)*100/float(len(expected)))\n",
" accuracy=sum(predicted==expected)*100/float(len(expected))\n",
" print \"for a k=%d, accuracy is %d%%\"%(k, accuracy)\n",
" \n",
" #step 6\n",
" m=create_conf_matrix(expected, predicted, 3)\n",
"\n",
" if accuracy>max_accuracy:\n",
" best_k=k\n",
" max_accuracy=accuracy\n",
" best_prediction=predicted\n",
" \n",
" print \"confusion matrix for k=%d\"%k\n",
" print m"
@@ -678,9 +690,26 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"From all the above k's we choose the one which has the best accuracy. Number of k's can be extended further to enhance the overall accuracy."
"From all the above k's we choose the one which has the best accuracy. Number of k's can be extended further to enhance the overall accuracy.\n",
"\n",
"Test images along with their predicted labels are shown below for the most optimum value of k: "
]
},
+{
+"cell_type": "code",
+"collapsed": false,
+"input": [
+"plt.rcParams['figure.figsize']=20,8\n",
+"fig=plt.figure()\n",
+"for image_no in xrange(len(filenames[0])):\n",
+" fig.add_subplot(3,8, image_no+1)\n",
+" plt.title('pred. class: '+folders[int(best_prediction[image_no])])\n",
+" showfig(testing_sample[image_no], None)"
+],
+"language": "python",
+"metadata": {},
+"outputs": []
+},
{
"cell_type": "heading",
"level": 2,
@@ -714,14 +743,6 @@
"\n",
"* Practical OpenCV by Samarth Brahmbhatt, University of Pennsylvania "
]
-},
-{
-"cell_type": "code",
-"collapsed": false,
-"input": [],
-"language": "python",
-"metadata": {},
-"outputs": []
}
],
"metadata": {}