
Commit

Merge pull request #2004 from Saurabh7/mcnb
extend multiclass nb
karlnapf committed Mar 19, 2014
2 parents d5be548 + 9ad4aea commit 9f0f2b7
Showing 1 changed file with 222 additions and 2 deletions.
224 changes: 222 additions & 2 deletions doc/ipython-notebooks/multiclass/multiclass_reduction.ipynb
@@ -453,7 +453,7 @@
"\n",
"print \"\\nRandom Dense Encoder + Margin Loss based Decoder\"\n",
"print \"=\"*60\n",
"evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()))"
"evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()), 2.0)"
],
"language": "python",
"metadata": {},
@@ -516,9 +516,229 @@
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So we have seen that we can classify multiclass samples using a base binary machine. If we dwell on this a bit more, we can easily spot the intuition behind this.\n",
"\n",
"The `MulticlassOneVsRestStrategy` classifies one class against the rest of the classes. This is done for each and every class by training a separate classifier for it.So we will have total $k$ classifiers where $k$ is the number of classes.\n",
"\n",
"Just to see this in action lets create some data using the gaussian mixture model class ([GMM](http://www.shogun-toolbox.org/doc/en/3.0.0/classshogun_1_1CGMM.html)) from which we sample the data points.Four different classes are created and plotted."
]
},
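{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before generating the data, here is a minimal pure-numpy sketch (not part of the Shogun API) of the one-vs-rest decision rule: one binary scorer per class rates \"this class vs. the rest\", and the class whose scorer is most confident wins. The scorers below are hypothetical stand-ins (negative distance to a class center) for trained binary classifiers."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def ovr_predict(scorers, x):\n",
"    # scorers[i](x) returns a confidence that x belongs to class i (vs. the rest);\n",
"    # the class whose scorer is most confident wins\n",
"    return np.argmax([s(x) for s in scorers])\n",
"\n",
"# hypothetical stand-ins for trained binary classifiers:\n",
"# score a point by its negative distance to a class center\n",
"centers = np.array([[-4., -1.], [-4., 4.], [4., 4.], [4., -1.]])\n",
"scorers = [lambda x, c=c: -np.linalg.norm(x - c) for c in centers]\n",
"\n",
"print ovr_predict(scorers, np.array([-3.5, 3.5]))  # expected: class 1"
],
"language": "python",
"metadata": {},
"outputs": []
},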
{
"cell_type": "code",
"collapsed": false,
"input": [
"from modshogun import *\n",
"from numpy import *\n",
"\n",
"num=1000;\n",
"dist=1.0;\n",
"\n",
"gmm=GMM(4)\n",
"gmm.set_nth_mean(array([-dist*4,-dist]),0)\n",
"gmm.set_nth_mean(array([-dist*4,dist*4]),1)\n",
"gmm.set_nth_mean(array([dist*4,dist*4]),2)\n",
"gmm.set_nth_mean(array([dist*4,-dist]),3)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),0)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),1)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),2)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),3)\n",
"\n",
"gmm.set_coef(array([1.0,0.0,0.0,0.0]))\n",
"x0=array([gmm.sample() for i in xrange(num)]).T\n",
"x0t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,1.0,0.0,0.0]))\n",
"x1=array([gmm.sample() for i in xrange(num)]).T\n",
"x1t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,0.0,1.0,0.0]))\n",
"x2=array([gmm.sample() for i in xrange(num)]).T\n",
"x2t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,0.0,0.0,1.0]))\n",
"x3=array([gmm.sample() for i in xrange(num)]).T\n",
"x3t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"\n",
"traindata=concatenate((x0,x1,x2,x3), axis=1)\n",
"testdata=concatenate((x0t,x1t,x2t,x3t), axis=1)\n",
"\n",
"l0 = array([0.0 for i in xrange(num)])\n",
"l1 = array([1.0 for i in xrange(num)])\n",
"l2 = array([2.0 for i in xrange(num)])\n",
"l3 = array([3.0 for i in xrange(num)])\n",
"\n",
"trainlab=concatenate((l0,l1,l2,l3))\n",
"testlab=concatenate((l0,l1,l2,l3))"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"_=jet()\n",
"_=scatter(traindata[0,:], traindata[1,:], c=trainlab, s=100)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have the data ready , lets convert it to shogun format features."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"feats_tr=RealFeatures(traindata)\n",
"labels=MulticlassLabels(trainlab)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/current/classshogun_1_1CKernelMulticlassMachine.html) is used with LibSVM as the classifer just as in the above example.\n",
"\n",
"Now we have four different classes, so as explained above we will have four classifiers which in shogun terms are submachines.\n",
"\n",
"We can see the outputs of two of the four individual submachines (specified by the index) and of the main machine. The plots clearly show how the submachine classify each class as if it is a binary classification problem and this provides the base for the whole multiclass classification."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from modshogun import KernelMulticlassMachine, LibSVM, GaussianKernel\n",
"\n",
"width=2.1\n",
"epsilon=1e-5\n",
" \n",
"kernel=GaussianKernel(feats_tr, feats_tr, width)\n",
" \n",
"classifier=LibSVM()\n",
"classifier.set_epsilon(epsilon)\n",
"\n",
"mc_machine=KernelMulticlassMachine(MulticlassOneVsRestStrategy(), kernel, classifier, labels)\n",
"\n",
"mc_machine.train()\n",
"\n",
"size=100\n",
"x1=linspace(-10, 10, size)\n",
"x2=linspace(-10, 10, size)\n",
"x, y=meshgrid(x1, x2)\n",
"grid=RealFeatures(array((ravel(x), ravel(y)))) #test features\n",
"\n",
"out=mc_machine.apply_multiclass(grid) #main output\n",
"z=out.get_labels().reshape((size, size))\n",
"\n",
"sub_out0=mc_machine.get_submachine_outputs(0) #first submachine\n",
"sub_out1=mc_machine.get_submachine_outputs(1) #second submachine\n",
"\n",
"z0=sub_out0.get_labels().reshape((size, size))\n",
"z1=sub_out1.get_labels().reshape((size, size))\n",
"\n",
"figure(figsize=(20,5))\n",
"subplot(131, title=\"Submachine 1\")\n",
"c0=pcolor(x, y, z0)\n",
"_=contour(x, y, z0, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c0)\n",
"\n",
"subplot(132, title=\"Submachine 2\")\n",
"c1=pcolor(x, y, z1)\n",
"_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c1)\n",
"\n",
"subplot(133, title=\"Multiclass output\")\n",
"c2=pcolor(x, y, z)\n",
"_=contour(x, y, z, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c2)\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `MulticlassOneVsOneStrategy` is a bit different with more number of machines.\n",
"Since it trains a classifer for each pair of classes, we will have a total of $\\frac{k*(k-1)}{2}$ submachines for $k$ classes.Binary classification then takes place on each pair.\n",
"Let's visualize this in a plot."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"C=2.0\n",
" \n",
"bin_machine = LibLinear()\n",
"bin_machine.set_bias_enabled(True)\n",
"bin_machine.set_C(C, C)\n",
"\n",
"mc_machine1 = LinearMulticlassMachine(MulticlassOneVsOneStrategy(), feats_tr, bin_machine, labels)\n",
"mc_machine1.train()\n",
"\n",
"out1=mc_machine1.apply_multiclass(grid) #main output\n",
"z1=out1.get_labels().reshape((size, size))\n",
"\n",
"sub_out10=mc_machine1.get_submachine_outputs(0) #first submachine\n",
"sub_out11=mc_machine1.get_submachine_outputs(1) #second submachine\n",
"\n",
"z10=sub_out10.get_labels().reshape((size, size))\n",
"z11=sub_out11.get_labels().reshape((size, size))\n",
"\n",
"no_color=array([5.0 for i in xrange(num)])\n",
"\n",
"figure(figsize=(20,5))\n",
"subplot(131, title=\"Submachine 1\") #plot submachine and traindata\n",
"c10=pcolor(x, y, z10)\n",
"_=contour(x, y, z10, linewidths=1, colors='black', hold=True)\n",
"lab1=concatenate((l0,l1,no_color,no_color))\n",
"_=scatter(traindata[0,:], traindata[1,:], c=lab1, cmap='gray', s=100)\n",
"_=colorbar(c10)\n",
"\n",
"subplot(132, title=\"Submachine 2\")\n",
"c11=pcolor(x, y, z11)\n",
"_=contour(x, y, z11, linewidths=1, colors='black', hold=True)\n",
"lab2=concatenate((l0, no_color, l2, no_color))\n",
"_=scatter(traindata[0,:], traindata[1,:], c=lab2, cmap=\"gray\", s=100)\n",
"_=colorbar(c11)\n",
"\n",
"subplot(133, title=\"Multiclass output\")\n",
"c12=pcolor(x, y, z1)\n",
"_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c12) \n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first two plots help us visualize how the submachines do binary classification for each pair. The class with maximum votes is chosen for test samples, leading to a refined multiclass output as in the last plot."
]
}
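{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following is a minimal pure-numpy sketch (not using the Shogun API) of how such one-vs-one voting could combine the $\\frac{k(k-1)}{2}$ pairwise decisions into a single label. The function `toy_decision` is a hypothetical stand-in (based on distances to class centers) for the trained submachines."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def ovo_vote(pairwise_decision, x, k):\n",
"    # pairwise_decision(i, j, x) is positive if x is assigned to class i, negative for class j;\n",
"    # every pair casts one vote and the class with the most votes wins\n",
"    votes = np.zeros(k)\n",
"    for i in range(k):\n",
"        for j in range(i + 1, k):\n",
"            if pairwise_decision(i, j, x) > 0:\n",
"                votes[i] += 1\n",
"            else:\n",
"                votes[j] += 1\n",
"    return np.argmax(votes)\n",
"\n",
"# hypothetical stand-in for the trained submachines: assign to the closer class center\n",
"centers = np.array([[-4., -1.], [-4., 4.], [4., 4.], [4., -1.]])\n",
"toy_decision = lambda i, j, x: np.linalg.norm(x - centers[j]) - np.linalg.norm(x - centers[i])\n",
"\n",
"print ovo_vote(toy_decision, np.array([3.5, 3.5]), 4)  # expected: class 2"
],
"language": "python",
"metadata": {},
"outputs": []
}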
],
"metadata": {}
}
]
}
}
