Added multilabel reader in LibSVMFile. Fixed a bug in so_multiclass.cpp #2062

Closed
wants to merge 111 commits into from
111 commits
717da9b
add a new LaplacianInferenceMethodWithLBFGS class, where the Newton m…
yorkerlin Mar 11, 2014
3363791
modify for Google C++ style
yorkerlin Mar 11, 2014
6611e6a
ignore the coding style script
yorkerlin Mar 11, 2014
e8262c7
add methods used for LBFGS in CLaplacianInferenceMethodWithLBFGS
yorkerlin Mar 12, 2014
f2a43b5
complete the implementation of LaplacianInferenceMethodWithLBFGS
yorkerlin Mar 12, 2014
854ff67
we need to set the private data members in LaplacianInferenceMethod t…
yorkerlin Mar 12, 2014
faef016
remove the useless header file
yorkerlin Mar 12, 2014
2107beb
add some comments in CLogitLikelihood
yorkerlin Mar 12, 2014
7b8cf6d
chang the order of matrix multiplication to gain improvement
yorkerlin Mar 12, 2014
9babe65
enable unit test
yorkerlin Mar 13, 2014
1c39d9b
remove the useless code optimization for matrix mulitiplication
yorkerlin Mar 13, 2014
3154086
updated the L-BFGS class
yorkerlin Mar 13, 2014
fa83523
add unit test for L-BFGS used the same result produced by the Newton …
yorkerlin Mar 13, 2014
41e90df
Revert "remove the useless code optimization for matrix mulitiplication"
yorkerlin Mar 13, 2014
df441a1
Revert "chang the order of matrix multiplication to gain improvement"
yorkerlin Mar 13, 2014
f9ee042
Revert "enable unit test"
yorkerlin Mar 13, 2014
10e7de7
add corresponding Matlab code for the LaplacianInferenceMethod_unittest
yorkerlin Mar 14, 2014
6fd01b6
Revert "add unit test for L-BFGS used the same result produced by the…
yorkerlin Mar 14, 2014
18b0f5e
softmax function added
mazumdarparijat Mar 14, 2014
f227a1c
extend multiclass nb
Saurabh7 Mar 15, 2014
4082c64
fix argument
Saurabh7 Mar 15, 2014
24271f7
Update RandomKitchenSinksDotFeatures.h
hwl596 Mar 16, 2014
7822050
Merge pull request #1 from hwl596/features/RKSDotFeatureHeaderFile
hwl596 Mar 16, 2014
18577fd
Fixing memory leak in NCBM solver.
tklein23 Mar 16, 2014
4a55113
Update RandomKitchenSinksDotFeatures.h
hwl596 Mar 17, 2014
6ee3991
Merge pull request #2014 from tklein23/fix_libncbm_leak
vigsterkr Mar 17, 2014
5b5334d
licence added + log_sum_exp trick added
mazumdarparijat Mar 17, 2014
9c31fa0
Revert "add corresponding Matlab code for the LaplacianInferenceMetho…
yorkerlin Mar 17, 2014
a0f2502
add the corresponding Matlab code
yorkerlin Mar 17, 2014
07d0f98
add the corresponding Matlab used for generating the result for the u…
yorkerlin Mar 17, 2014
51d0e58
updated the class
yorkerlin Mar 17, 2014
803a979
add corresponding Matlab fot the unit test
yorkerlin Mar 17, 2014
707e9d4
add the python script for generate code for numerical comparison
yorkerlin Mar 17, 2014
1fa1517
change the private data member in LaplacianInferenceMethod
yorkerlin Mar 17, 2014
521e441
add the LaplacianInferenceMethodWithLBFGS_unittest
yorkerlin Mar 17, 2014
4098574
GaussianProcessBinaryClassificationWithLBFGS_unittest
yorkerlin Mar 17, 2014
cd7a6ce
Merge pull request #2012 from hwl596/develop
karlnapf Mar 17, 2014
8716514
Merge pull request #1997 from mazumdarparijat/gp
karlnapf Mar 17, 2014
0d09ba7
Revert "add the python script for generate code for numerical compari…
yorkerlin Mar 18, 2014
24eed4d
Revert "add the corresponding Matlab used for generating the result f…
yorkerlin Mar 18, 2014
fd1ab3a
Revert "add corresponding Matlab fot the unit test"
yorkerlin Mar 18, 2014
3ba2b63
Revert "add the corresponding Matlab code"
yorkerlin Mar 18, 2014
d1c7b01
Revert "change the private data member in LaplacianInferenceMethod"
yorkerlin Mar 18, 2014
415ed6b
Revert "we need to set the private data members in LaplacianInference…
yorkerlin Mar 18, 2014
f17b291
change the private data member to protected member in order to implem…
yorkerlin Mar 18, 2014
b629a0a
change the default LBFGS parameters in order to be consistent with th…
yorkerlin Mar 18, 2014
6134bc2
Merge pull request #2020 from tklein23/besser_import
karlnapf Mar 18, 2014
a4b7540
add titles,cleanup
Saurabh7 Mar 18, 2014
9ad4aea
OvO plots
Saurabh7 Mar 18, 2014
3bb6578
Added different number of samples for quadratic time MMD
lambday Mar 17, 2014
8963986
fix include config.h in SGRefObject
lambday Mar 18, 2014
8ef92a7
Revert "updated the class"
yorkerlin Mar 18, 2014
0dad5bd
Revert "updated the L-BFGS class"
yorkerlin Mar 18, 2014
d7dc6fb
Revert "remove the useless header file"
yorkerlin Mar 18, 2014
06c2b53
Revert "complete the implementation of LaplacianInferenceMethodWithLB…
yorkerlin Mar 18, 2014
51302d0
Revert "add methods used for LBFGS in CLaplacianInferenceMethodWithLB…
yorkerlin Mar 18, 2014
148aca7
Revert "modify for Google C++ style"
yorkerlin Mar 18, 2014
e12dd8e
Revert "add a new LaplacianInferenceMethodWithLBFGS class, where the …
yorkerlin Mar 18, 2014
981b91f
Revert "GaussianProcessBinaryClassificationWithLBFGS_unittest"
yorkerlin Mar 18, 2014
583daba
Revert "add the LaplacianInferenceMethodWithLBFGS_unittest"
yorkerlin Mar 18, 2014
e8484ff
fix null spectrum approximation formula in quadratic time MMD
lambday Mar 18, 2014
7cfa149
pca ipython notebook revised 3rd time
kislayabhi Mar 18, 2014
26ffa04
add a updated LaplacianInferenceMethodWithLBFGS header file
yorkerlin Mar 18, 2014
6bfd373
upadted the header file
yorkerlin Mar 18, 2014
c2adfa0
add two constructors for LaplacianInferenceMethodWithLBFGS
yorkerlin Mar 18, 2014
834e63c
add the init method for LaplacianInferenceMethodWithLBFGS
yorkerlin Mar 18, 2014
d935183
add methods used for L-BFGS
yorkerlin Mar 18, 2014
0acaaac
finally add the update_alpha() method
yorkerlin Mar 18, 2014
92bc0e8
add the unit test for LaplacianInferenceMethodWithLBFGS
yorkerlin Mar 18, 2014
848f376
finally add a unit test to test whether the class can be used for Cla…
yorkerlin Mar 18, 2014
8d18215
fix the indentation issue in these two unit test files
yorkerlin Mar 18, 2014
a27830e
Revert "fix the indentation issue in these two unit test files"
yorkerlin Mar 18, 2014
9f4dc20
fix indentation issues
yorkerlin Mar 18, 2014
d5be548
Merge pull request #2027 from tklein23/more_import_cleanups
tklein23 Mar 19, 2014
c9181fb
this is the first commit of the Entrance task #1971 without code clea…
yorkerlin Mar 19, 2014
ef57566
ID3 modular interface setup+ID3 API example added
mazumdarparijat Mar 19, 2014
9f0f2b7
Merge pull request #2004 from Saurabh7/mcnb
karlnapf Mar 19, 2014
d48f117
Merge pull request #2022 from lambday/develop
karlnapf Mar 19, 2014
7a6d2e5
Merge pull request #2028 from kislayabhi/develop
karlnapf Mar 19, 2014
ca7a81d
Update RandomKitchenSinksDotFeatures.h
hwl596 Mar 19, 2014
a804ca4
minor doc changes
mazumdarparijat Mar 20, 2014
a3aada7
Merge pull request #2021 from mazumdarparijat/pca
iglesias Mar 20, 2014
edb252d
Merge pull request #2042 from tklein23/fixing_broken_imports
tklein23 Mar 20, 2014
c456949
Merge pull request #2045 from tklein23/fixing_broken_imports
tklein23 Mar 20, 2014
fcd588b
Merge pull request #2038 from hwl596/RKSDotFeatures_documentation
karlnapf Mar 20, 2014
689e5d9
Revert "this is the first commit of the Entrance task #1971 without c…
yorkerlin Mar 20, 2014
7c97b23
minor fixes in multiclass nb
Saurabh7 Mar 20, 2014
c925db1
Merge pull request #2051 from tklein23/fixing_broken_imports
tklein23 Mar 20, 2014
6145d92
streaming_onlineliblinear_sparse.cpp: Use mktemp in example
vperic Mar 20, 2014
1f24f77
Merge pull request #2052 from vperic/examples-mktemp-fix
tklein23 Mar 20, 2014
5a30bd5
Update README_cmake.md
dhruv13J Mar 20, 2014
9e2beb7
Update README_cmake.md
dhruv13J Mar 21, 2014
700a727
remove the GaussianProcessBinaryClassificationWithLBFGS unit test and…
yorkerlin Mar 21, 2014
afcd973
modify a method name and fix the indentation issue in header file
yorkerlin Mar 21, 2014
e24a38a
modify a method name and fix the indentation issue in the cpp file
yorkerlin Mar 21, 2014
d3cd50f
fix indentation issue
yorkerlin Mar 21, 2014
f63a68a
merge the LBFGS unit test for classification to GaussianProcessBinary…
yorkerlin Mar 21, 2014
33419a6
Merge pull request #2058 from dhruv13J/patch-1
iglesias Mar 21, 2014
1176064
add config.h in all .h files in shogun/classifier
Mar 21, 2014
85e78d8
minor fix
yorkerlin Mar 21, 2014
683999f
Merge pull request #2060 from frank0523/develop
iglesias Mar 21, 2014
c32840f
Added multilabel reader in LibSVMFile. Fixed a bug in so_multiclass.cpp
Jiaolong Mar 19, 2014
5936eda
Merge pull request #1988 from yorkerlin/develop
karlnapf Mar 21, 2014
299befb
Merge pull request #2047 from Saurabh7/develop
karlnapf Mar 21, 2014
f9be77f
Update to latest data version
vigsterkr Mar 21, 2014
69f916d
Merge pull request #2061 from tklein23/cleaning_io_imports
tklein23 Mar 21, 2014
f80b69c
Added multilabel reader in LibSVMFile. Fixed a bug in so_multiclass.cpp
Jiaolong Mar 19, 2014
0b80abd
updating revision of data submodule
Jiaolong Mar 20, 2014
ecbc414
Merge branch 'io_libsvm_multilable' of https://github.com/Jiaolong/sh…
Jiaolong Mar 22, 2014
23f804f
Added multilabel reader in LibSVMFile. Fixed a bug in so_multiclass.cpp
Jiaolong Mar 19, 2014
e9dc8e0
Merge branch 'io_libsvm_multilable' of https://github.com/Jiaolong/sh…
Jiaolong Mar 22, 2014
1 change: 1 addition & 0 deletions .gitignore
@@ -28,6 +28,7 @@ configure.log
.localvimrc
*.DS_Store
cscope.*
cpplint.py

*~
\#*\#
1 change: 1 addition & 0 deletions NEWS
@@ -5,6 +5,7 @@
* Features:
- ID3 algorithm for decision tree learning [Parijat Mazumdar]
- New modes for PCA matrix factorizations: SVD & EVD, in-place or reallocating [Parijat Mazumdar]
- Added kernel multiclass strategy examples in multiclass notebook [Saurabh Mahindre]
* Bugfixes:
- Fix memory problem in PCA::apply_to_feature_matrix [Parijat Mazumdar]
- Fix crash in LeastAngleRegression for the case D greater than N [Parijat Mazumdar]
2 changes: 1 addition & 1 deletion data
Submodule data updated 393 files
230 changes: 225 additions & 5 deletions doc/ipython-notebooks/multiclass/multiclass_reduction.ipynb
@@ -214,7 +214,7 @@
"collapsed": false,
"input": [
"def evaluate(strategy, C):\n",
" bin_machine = LibLinear()\n",
" bin_machine = LibLinear(L2R_L2LOSS_SVC)\n",
" bin_machine.set_bias_enabled(True)\n",
" bin_machine.set_C(C, C)\n",
"\n",
@@ -453,7 +453,7 @@
"\n",
"print \"\\nRandom Dense Encoder + Margin Loss based Decoder\"\n",
"print \"=\"*60\n",
"evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()))"
"evaluate(ECOCStrategy(ECOCRandomDenseEncoder(), ECOCLLBDecoder()), 2.0)"
],
"language": "python",
"metadata": {},
@@ -471,9 +471,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Expanding on the idea of creating a generic multiclass machine and then assigning a particular multiclass strategy and a base binary machine, one can also use the [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/current/classshogun_1_1CKernelMulticlassMachine.html) with a kernel of choice.\n",
"Expanding on the idea of creating a generic multiclass machine and then assigning a particular multiclass strategy and a base binary machine, one can also use the [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelMulticlassMachine.html) with a kernel of choice.\n",
"\n",
"Here we will use a [GaussianKernel](http://www.shogun-toolbox.org/doc/en/3.0.0/classshogun_1_1CGaussianKernel.html) with LibSVM as the classifer.\n",
"Here we will use a [GaussianKernel](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CGaussianKernel.html) with [LibSVM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CLibSVM.html) as the classifier.\n",
"All we have to do is define a new helper evaluate function with the features defined as in the above examples."
]
},
@@ -516,9 +516,229 @@
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So we have seen that we can classify multiclass samples using a base binary machine. If we dwell on this a bit more, we can easily spot the intuition behind this.\n",
"\n",
"The [MulticlassOneVsRestStrategy](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CMulticlassOneVsRestStrategy.html) classifies one class against the rest of the classes. This is done for every class by training a separate classifier for it. So we will have a total of $k$ classifiers, where $k$ is the number of classes.\n",
"\n",
"Just to see this in action, let's create some data using the Gaussian mixture model class ([GMM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CGMM.html)), from which we sample the data points. Four different classes are created and plotted."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from modshogun import *\n",
"from numpy import *\n",
"\n",
"num=1000;\n",
"dist=1.0;\n",
"\n",
"gmm=GMM(4)\n",
"gmm.set_nth_mean(array([-dist*4,-dist]),0)\n",
"gmm.set_nth_mean(array([-dist*4,dist*4]),1)\n",
"gmm.set_nth_mean(array([dist*4,dist*4]),2)\n",
"gmm.set_nth_mean(array([dist*4,-dist]),3)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),0)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),1)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),2)\n",
"gmm.set_nth_cov(array([[1.0,0.0],[0.0,1.0]]),3)\n",
"\n",
"gmm.set_coef(array([1.0,0.0,0.0,0.0]))\n",
"x0=array([gmm.sample() for i in xrange(num)]).T\n",
"x0t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,1.0,0.0,0.0]))\n",
"x1=array([gmm.sample() for i in xrange(num)]).T\n",
"x1t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,0.0,1.0,0.0]))\n",
"x2=array([gmm.sample() for i in xrange(num)]).T\n",
"x2t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"gmm.set_coef(array([0.0,0.0,0.0,1.0]))\n",
"x3=array([gmm.sample() for i in xrange(num)]).T\n",
"x3t=array([gmm.sample() for i in xrange(num)]).T\n",
"\n",
"\n",
"traindata=concatenate((x0,x1,x2,x3), axis=1)\n",
"testdata=concatenate((x0t,x1t,x2t,x3t), axis=1)\n",
"\n",
"l0 = array([0.0 for i in xrange(num)])\n",
"l1 = array([1.0 for i in xrange(num)])\n",
"l2 = array([2.0 for i in xrange(num)])\n",
"l3 = array([3.0 for i in xrange(num)])\n",
"\n",
"trainlab=concatenate((l0,l1,l2,l3))\n",
"testlab=concatenate((l0,l1,l2,l3))"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"_=jet()\n",
"_=scatter(traindata[0,:], traindata[1,:], c=trainlab, s=100)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have the data ready, let's convert it to Shogun format features."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"feats_tr=RealFeatures(traindata)\n",
"labels=MulticlassLabels(trainlab)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [KernelMulticlassMachine](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CKernelMulticlassMachine.html) is used with [LibSVM](http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CLibSVM.html) as the classifier, just as in the above example.\n",
"\n",
"Now we have four different classes, so as explained above we will have four classifiers, which in Shogun terms are submachines.\n",
"\n",
"We can see the outputs of two of the four individual submachines (specified by the index) and of the main machine. The plots clearly show how each submachine treats its class as a binary classification problem, which provides the basis for the whole multiclass classification."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from modshogun import KernelMulticlassMachine, LibSVM, GaussianKernel\n",
"\n",
"width=2.1\n",
"epsilon=1e-5\n",
" \n",
"kernel=GaussianKernel(feats_tr, feats_tr, width)\n",
" \n",
"classifier=LibSVM()\n",
"classifier.set_epsilon(epsilon)\n",
"\n",
"mc_machine=KernelMulticlassMachine(MulticlassOneVsRestStrategy(), kernel, classifier, labels)\n",
"\n",
"mc_machine.train()\n",
"\n",
"size=100\n",
"x1=linspace(-10, 10, size)\n",
"x2=linspace(-10, 10, size)\n",
"x, y=meshgrid(x1, x2)\n",
"grid=RealFeatures(array((ravel(x), ravel(y)))) #test features\n",
"\n",
"out=mc_machine.apply_multiclass(grid) #main output\n",
"z=out.get_labels().reshape((size, size))\n",
"\n",
"sub_out0=mc_machine.get_submachine_outputs(0) #first submachine\n",
"sub_out1=mc_machine.get_submachine_outputs(1) #second submachine\n",
"\n",
"z0=sub_out0.get_labels().reshape((size, size))\n",
"z1=sub_out1.get_labels().reshape((size, size))\n",
"\n",
"figure(figsize=(20,5))\n",
"subplot(131, title=\"Submachine 1\")\n",
"c0=pcolor(x, y, z0)\n",
"_=contour(x, y, z0, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c0)\n",
"\n",
"subplot(132, title=\"Submachine 2\")\n",
"c1=pcolor(x, y, z1)\n",
"_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c1)\n",
"\n",
"subplot(133, title=\"Multiclass output\")\n",
"c2=pcolor(x, y, z)\n",
"_=contour(x, y, z, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c2)\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `MulticlassOneVsOneStrategy` is a bit different and uses a larger number of machines.\n",
"Since it trains a classifier for each pair of classes, we will have a total of $\\frac{k(k-1)}{2}$ submachines for $k$ classes. Binary classification then takes place on each pair.\n",
"Let's visualize this in a plot."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"C=2.0\n",
" \n",
"bin_machine = LibLinear(L2R_L2LOSS_SVC)\n",
"bin_machine.set_bias_enabled(True)\n",
"bin_machine.set_C(C, C)\n",
"\n",
"mc_machine1 = LinearMulticlassMachine(MulticlassOneVsOneStrategy(), feats_tr, bin_machine, labels)\n",
"mc_machine1.train()\n",
"\n",
"out1=mc_machine1.apply_multiclass(grid) #main output\n",
"z1=out1.get_labels().reshape((size, size))\n",
"\n",
"sub_out10=mc_machine1.get_submachine_outputs(0) #first submachine\n",
"sub_out11=mc_machine1.get_submachine_outputs(1) #second submachine\n",
"\n",
"z10=sub_out10.get_labels().reshape((size, size))\n",
"z11=sub_out11.get_labels().reshape((size, size))\n",
"\n",
"no_color=array([5.0 for i in xrange(num)])\n",
"\n",
"figure(figsize=(20,5))\n",
"subplot(131, title=\"Submachine 1\") #plot submachine and traindata\n",
"c10=pcolor(x, y, z10)\n",
"_=contour(x, y, z10, linewidths=1, colors='black', hold=True)\n",
"lab1=concatenate((l0,l1,no_color,no_color))\n",
"_=scatter(traindata[0,:], traindata[1,:], c=lab1, cmap='gray', s=100)\n",
"_=colorbar(c10)\n",
"\n",
"subplot(132, title=\"Submachine 2\")\n",
"c11=pcolor(x, y, z11)\n",
"_=contour(x, y, z11, linewidths=1, colors='black', hold=True)\n",
"lab2=concatenate((l0, no_color, l2, no_color))\n",
"_=scatter(traindata[0,:], traindata[1,:], c=lab2, cmap=\"gray\", s=100)\n",
"_=colorbar(c11)\n",
"\n",
"subplot(133, title=\"Multiclass output\")\n",
"c12=pcolor(x, y, z1)\n",
"_=contour(x, y, z1, linewidths=1, colors='black', hold=True)\n",
"_=colorbar(c12) \n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first two plots help us visualize how the submachines do binary classification for each pair. The class with maximum votes is chosen for test samples, leading to a refined multiclass output as in the last plot."
]
}
],
"metadata": {}
}
]
}
}
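
As a standalone illustration of the submachine counts and the voting step described in the notebook text above, here is a minimal sketch in plain NumPy. It does not use the Shogun API: `num_submachines`, `one_vs_one_vote`, and the pairwise outcomes are hypothetical names and made-up values chosen for illustration, and the tie-breaking rule (lowest class index wins) is an assumption rather than Shogun's documented behaviour.

```python
from itertools import combinations

import numpy as np


def num_submachines(k, strategy):
    """Return how many binary submachines are trained for k classes."""
    if strategy == "one_vs_rest":
        # One classifier per class: that class against all the others.
        return k
    if strategy == "one_vs_one":
        # One classifier per unordered pair of classes.
        return k * (k - 1) // 2
    raise ValueError("unknown strategy: %s" % strategy)


def one_vs_one_vote(pair_winners, k):
    """Majority vote over pairwise decisions.

    pair_winners maps each class pair (i, j) to the class that the
    corresponding binary submachine picked for one test sample.
    """
    votes = np.zeros(k, dtype=int)
    for winner in pair_winners.values():
        votes[winner] += 1
    # argmax returns the lowest index on ties (an assumption of this sketch).
    return int(np.argmax(votes)), votes


k = 4
pairs = list(combinations(range(k), 2))

# Hypothetical outcomes for a single test sample: class 2 wins every pair
# it takes part in; in the remaining pairs the lower class index wins.
pair_winners = {(i, j): (2 if 2 in (i, j) else i) for (i, j) in pairs}

predicted, votes = one_vs_one_vote(pair_winners, k)
print(num_submachines(k, "one_vs_rest"), num_submachines(k, "one_vs_one"))
print(predicted, votes.tolist())
```

With four classes this gives four one-vs-rest submachines and six one-vs-one submachines, which matches the counts $k$ and $\frac{k(k-1)}{2}$ quoted in the notebook.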