shogun-toolbox · Jiaolong · Mar 11, 2014 · Mar 11, 2014 · Mar 11, 2014 · Mar 12, 2014
diff --git a/.gitignore b/.gitignore
@@ -28,6 +28,7 @@ configure.log
 .localvimrc
 *.DS_Store
 cscope.*
+cpplint.py
 
 *~
 \#*\#

diff --git a/NEWS b/NEWS
@@ -5,6 +5,7 @@
 	* Features:
 		- ID3 algorithm for decision tree learning [Parijat Mazumdar]
 		- New modes for PCA matrix factorizations: SVD & EVD, in-place or reallocating [Parijat Mazumdar]
+		- Added kernel multiclass strategy examples in multiclass notebook [Saurabh Mahindre]
 	* Bugfixes:
 		- Fix memory problem in PCA::apply_to_feature_matrix [Parijat Mazumdar]
 		- Fix crash in LeastAngleRegression for the case D greater than N [Parijat Mazumdar]

diff --git a/data b/data
diff --git a/doc/ipython-notebooks/multiclass/multiclass_reduction.ipynb b/doc/ipython-notebooks/multiclass/multiclass_reduction.ipynb
@@ -214,7 +214,7 @@
      "collapsed": false,
      "input": [
       "def evaluate(strategy, C):\n",
-      "        bin_machine = LibLinear()\n",
+      "        bin_machine = LibLinear(L2R_L2LOSS_SVC)\n",
       "        bin_machine.set_bias_enabled(True)\n",
       "        bin_machine.set_C(C, C)\n",
       "\n",
@@ -453,7 +453,7 @@
       "\n",
       "print \"\\nRandom Dense Encoder + Margin Loss based Decoder\"\n",
       "print \"=\"*60\n",
-      "evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()))"
+      "evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()), 2.0)"
      ],
      "language": "python",
      "metadata": {},
@@ -471,9 +471,9 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "Expanding on the idea of creating a generic multiclass machine and then assigning a particular multiclass strategy and a base binary machine, one can also use the [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/current/classshogun_1_1CKernelMulticlassMachine.html) with a kernel of choice.\n",
+      "Expanding on the idea of creating a generic multiclass machine and then assigning a particular multiclass strategy and a base binary machine, one can also use the [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelMulticlassMachine.html) with a kernel of choice.\n",
       "\n",
-      "Here we will use a [GaussianKernel](http://www.shogun-toolbox.org/doc/en/3.0.0/classshogun_1_1CGaussianKernel.html) with LibSVM as the classifer.\n",
+      "Here we will use a [GaussianKernel](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CGaussianKernel.html) with [LibSVM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CLibSVM.html) as the classifier.\n",
       "All we have to do is define a new helper evaluate function with the features defined as in the above examples."
      ]
     },
@@ -516,9 +516,229 @@
      "language": "python",
      "metadata": {},
      "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "So we have seen that we can classify multiclass samples using a base binary machine. If we dwell on this a bit more, we can easily spot the intuition behind this.\n",
+      "\n",
+      "The [MulticlassOneVsRestStrategy](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CMulticlassOneVsRestStrategy.html) classifies one class against the rest of the classes. This is done for every class by training a separate classifier for it. So we will have a total of $k$ classifiers, where $k$ is the number of classes.\n",
+      "\n",
+      "To see this in action, let's create some data using the Gaussian mixture model class ([GMM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CGMM.html)), from which we sample the data points. Four different classes are created and plotted."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from modshogun import *\n",
+      "from numpy import *\n",
+      "\n",
+      "num=1000\n",
+      "dist=1.0\n",
+      "\n",
+      "gmm=GMM(4)\n",
+      "gmm.set_nth_mean(array([-dist*4,-dist]),0)\n",
+      "gmm.set_nth_mean(array([-dist*4,dist*4]),1)\n",
+      "gmm.set_nth_mean(array([dist*4,dist*4]),2)\n",
+      "gmm.set_nth_mean(array([dist*4,-dist]),3)\n",
+      "gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),0)\n",
+      "gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),1)\n",
+      "gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),2)\n",
+      "gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),3)\n",
+      "\n",
+      "gmm.set_coef(array([1.0,0.0,0.0,0.0]))\n",
+      "x0=array([gmm.sample() for i in xrange(num)]).T\n",
+      "x0t=array([gmm.sample() for i in xrange(num)]).T\n",
+      "\n",
+      "gmm.set_coef(array([0.0,1.0,0.0,0.0]))\n",
+      "x1=array([gmm.sample() for i in xrange(num)]).T\n",
+      "x1t=array([gmm.sample() for i in xrange(num)]).T\n",
+      "\n",
+      "gmm.set_coef(array([0.0,0.0,1.0,0.0]))\n",
+      "x2=array([gmm.sample() for i in xrange(num)]).T\n",
+      "x2t=array([gmm.sample() for i in xrange(num)]).T\n",
+      "\n",
+      "gmm.set_coef(array([0.0,0.0,0.0,1.0]))\n",
+      "x3=array([gmm.sample() for i in xrange(num)]).T\n",
+      "x3t=array([gmm.sample() for i in xrange(num)]).T\n",
+      "\n",
+      "\n",
+      "traindata=concatenate((x0,x1,x2,x3), axis=1)\n",
+      "testdata=concatenate((x0t,x1t,x2t,x3t), axis=1)\n",
+      "\n",
+      "l0 = array([0.0 for i in xrange(num)])\n",
+      "l1 = array([1.0 for i in xrange(num)])\n",
+      "l2 = array([2.0 for i in xrange(num)])\n",
+      "l3 = array([3.0 for i in xrange(num)])\n",
+      "\n",
+      "trainlab=concatenate((l0,l1,l2,l3))\n",
+      "testlab=concatenate((l0,l1,l2,l3))"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "_=jet()\n",
+      "_=scatter(traindata[0,:], traindata[1,:], c=trainlab, s=100)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Now that we have the data ready, let's convert it to Shogun format features."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "feats_tr=RealFeatures(traindata)\n",
+      "labels=MulticlassLabels(trainlab)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelMulticlassMachine.html) is used with [LibSVM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CLibSVM.html) as the classifier, just as in the above example.\n",
+      "\n",
+      "We now have four different classes, so, as explained above, we will have four classifiers, which in Shogun terms are submachines.\n",
+      "\n",
+      "We can see the outputs of two of the four individual submachines (specified by the index) and of the main machine. The plots show how each submachine treats its class as a binary classification problem, which forms the basis of the overall multiclass classification."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from modshogun import KernelMulticlassMachine, LibSVM, GaussianKernel\n",
+      "\n",
+      "width=2.1\n",
+      "epsilon=1e-5\n",
+      "\n",
+      "kernel=GaussianKernel(feats_tr, feats_tr, width)\n",
+      "\n",
+      "classifier=LibSVM()\n",
+      "classifier.set_epsilon(epsilon)\n",
+      "\n",
+      "mc_machine=KernelMulticlassMachine(MulticlassOneVsRestStrategy(), kernel, classifier, labels)\n",
+      "\n",
+      "mc_machine.train()\n",
+      "\n",
+      "size=100\n",
+      "x1=linspace(-10, 10, size)\n",
+      "x2=linspace(-10, 10, size)\n",
+      "x, y=meshgrid(x1, x2)\n",
+      "grid=RealFeatures(array((ravel(x), ravel(y)))) #test features\n",
+      "\n",
+      "out=mc_machine.apply_multiclass(grid) #main output\n",
+      "z=out.get_labels().reshape((size, size))\n",
+      "\n",
+      "sub_out0=mc_machine.get_submachine_outputs(0) #first submachine\n",
+      "sub_out1=mc_machine.get_submachine_outputs(1) #second submachine\n",
+      "\n",
+      "z0=sub_out0.get_labels().reshape((size, size))\n",
+      "z1=sub_out1.get_labels().reshape((size, size))\n",
+      "\n",
+      "figure(figsize=(20,5))\n",
+      "subplot(131, title=\"Submachine 1\")\n",
+      "c0=pcolor(x, y, z0)\n",
+      "_=contour(x, y, z0, linewidths=1, colors='black', hold=True)\n",
+      "_=colorbar(c0)\n",
+      "\n",
+      "subplot(132, title=\"Submachine 2\")\n",
+      "c1=pcolor(x, y, z1)\n",
+      "_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
+      "_=colorbar(c1)\n",
+      "\n",
+      "subplot(133, title=\"Multiclass output\")\n",
+      "c2=pcolor(x, y, z)\n",
+      "_=contour(x, y, z, linewidths=1, colors='black', hold=True)\n",
+      "_=colorbar(c2)\n"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The `MulticlassOneVsOneStrategy` works a bit differently and uses more machines.\n",
+      "Since it trains a classifier for each pair of classes, we will have a total of $\\frac{k(k-1)}{2}$ submachines for $k$ classes (six for our four classes). Binary classification then takes place on each pair.\n",
+      "Let's visualize this in a plot."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "C=2.0\n",
+      "\n",
+      "bin_machine = LibLinear(L2R_L2LOSS_SVC)\n",
+      "bin_machine.set_bias_enabled(True)\n",
+      "bin_machine.set_C(C, C)\n",
+      "\n",
+      "mc_machine1 = LinearMulticlassMachine(MulticlassOneVsOneStrategy(), feats_tr, bin_machine, labels)\n",
+      "mc_machine1.train()\n",
+      "\n",
+      "out1=mc_machine1.apply_multiclass(grid) #main output\n",
+      "z1=out1.get_labels().reshape((size, size))\n",
+      "\n",
+      "sub_out10=mc_machine1.get_submachine_outputs(0) #first submachine\n",
+      "sub_out11=mc_machine1.get_submachine_outputs(1) #second submachine\n",
+      "\n",
+      "z10=sub_out10.get_labels().reshape((size, size))\n",
+      "z11=sub_out11.get_labels().reshape((size, size))\n",
+      "\n",
+      "no_color=array([5.0 for i in xrange(num)])\n",
+      "\n",
+      "figure(figsize=(20,5))\n",
+      "subplot(131, title=\"Submachine 1\")              #plot submachine and traindata\n",
+      "c10=pcolor(x, y, z10)\n",
+      "_=contour(x, y, z10, linewidths=1, colors='black', hold=True)\n",
+      "lab1=concatenate((l0,l1,no_color,no_color))\n",
+      "_=scatter(traindata[0,:], traindata[1,:], c=lab1, cmap='gray', s=100)\n",
+      "_=colorbar(c10)\n",
+      "\n",
+      "subplot(132, title=\"Submachine 2\")\n",
+      "c11=pcolor(x, y, z11)\n",
+      "_=contour(x, y, z11, linewidths=1, colors='black', hold=True)\n",
+      "lab2=concatenate((l0, no_color, l2, no_color))\n",
+      "_=scatter(traindata[0,:], traindata[1,:], c=lab2, cmap=\"gray\", s=100)\n",
+      "_=colorbar(c11)\n",
+      "\n",
+      "subplot(133, title=\"Multiclass output\")\n",
+      "c12=pcolor(x, y, z1)\n",
+      "_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
+      "_=colorbar(c12)\n"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The first two plots help us visualize how the submachines perform binary classification on each pair of classes. For each test sample, the class with the most votes is chosen, leading to the refined multiclass output shown in the last plot."
+     ]
     }
    ],
    "metadata": {}
   }
  ]
-}
+}