
Example for learning/predicting multiclass on own data #1868

Closed
tklein23 opened this issue Feb 13, 2014 · 8 comments

Comments

tklein23 (Contributor) commented Feb 13, 2014

This task is about creating examples in any of the available interfaces to train a multiclass SVM and to predict multiclass labels using the learned SVM:

$ python train_multiclass_svm.py train.data train.labels multiclass-svm.model
$ python predict_multiclass_svm.py multiclass-svm.model eval.data predicted.labels 

The example should be as simple as possible. The goals are:

  • applying algorithms to one's own data without needing to change the scripts/sources
  • providing simple examples for new developers to start their own scripts/experiments
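As a rough illustration of the intended shape of those two scripts, here is a minimal, dependency-free sketch. The file layout (whitespace-separated features in *.data, one label per line in *.labels) and the nearest-centroid "classifier" are assumptions standing in for the real multiclass SVM; only the read/train/predict flow matters here.

```python
def read_data(path):
    """Read one whitespace-separated feature vector per line."""
    with open(path) as f:
        return [[float(v) for v in line.split()] for line in f if line.strip()]

def read_labels(path):
    """Read one label per line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def train(vectors, labels):
    """Placeholder 'training': per-class mean vectors (stand-in for an SVM)."""
    sums, counts = {}, {}
    for x, y in zip(vectors, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, vectors):
    """Assign each vector the label of the nearest class centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(model, key=lambda y: dist2(model[y], x)) for x in vectors]
```

The CLI wrappers would then only parse sys.argv and persist the model dict (e.g. via pickle) between the train and predict scripts.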

Additionally, we could provide a script that evaluates the output of the above scripts. For example:

$ python evaluate_multiclass_labels.py eval.label predicted.labels 
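A sketch of what that evaluation script's core could look like; the function name and the simple accuracy/confusion output are my assumptions, not a fixed spec:

```python
def evaluate(true_labels, predicted_labels):
    """Compare two equal-length label sequences; return (accuracy, confusion).

    confusion maps (true_label, predicted_label) -> count, so the script can
    evaluate any multiclass output, not just SVM predictions.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label files differ in length")
    confusion = {}
    correct = 0
    for t, p in zip(true_labels, predicted_labels):
        confusion[(t, p)] = confusion.get((t, p), 0) + 1
        correct += (t == p)
    return correct / len(true_labels), confusion
```

A thin CLI wrapper reading one label per line from eval.label and predicted.labels would complete the script.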

Disclaimer: This task could also be solved in interfaces other than Python and for algorithms other than multiclass SVMs.

PirosB3 (Contributor) commented Mar 7, 2014

I would love to take this task; can I start?
It would also be good experience for me, because it relates somewhat to my SoC proposal.

Thanks

PirosB3 (Contributor) commented Mar 7, 2014

Hi @tklein23
What format would you like for the *.data and *.labels files? I was thinking it would be a good idea to use something standard like svmlight. That way we reduce the files from 2 to 1 (svmlight contains both target and features) and provide a tool that is compatible with standard formats.

<line> .=. <target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>

What do you think?

Thanks,
Dan
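For reference, parsing one line of that svmlight format takes only a few lines of code. A minimal sketch (ignoring svmlight extras such as qid and cost factors):

```python
def parse_svmlight_line(line):
    """Parse '<target> <feature>:<value> ... # <info>' into (target, features)."""
    line = line.split('#', 1)[0].strip()  # drop the optional trailing info field
    tokens = line.split()
    target = tokens[0]
    features = {}
    for token in tokens[1:]:
        index, value = token.split(':')
        features[int(index)] = float(value)
    return target, features
```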

tklein23 (Contributor, Author) commented Mar 7, 2014

Hey Dan! Feel free to take this task!

I totally agree that we should use standard formats. I think we can go with the SVMlight format, as shown in data/toy/7class_example4_train.light. No need to split it into two files.

The goal is simply to have something that can be applied easily to one's own data (without touching the source). Btw., you're not limited to Python; feel free to do it in C++ if you like.

PirosB3 added a commit to PirosB3/shogun that referenced this issue Mar 7, 2014
PirosB3 (Contributor) commented Mar 7, 2014

@tklein23 I have started working on this new feature. When you have a second, could you take a look at my initial commit? It's nowhere near a production version, but if I am doing something wrong, please let me know.

Also, wouldn't it be better to have a single script that does training + evaluation (like the evaluate_multiclass_labels.py you mentioned, but without the other two scripts)? There is a lot of reusable functionality between the two.

Let me know,
Dan

vigsterkr (Member) commented Mar 7, 2014
@PirosB3 we would like you to send a PR (pull request) instead of asking people to check your forked repository... it is essential that you start working with PRs, as that's how we do development during the whole GSoC.

Even if your code is not ready, it's OK to send a PR; we'll discuss things in that PR, and then you can make changes and add more commits to it.

PirosB3 (Contributor) commented Mar 7, 2014

Perfect!
I will send a PR


tklein23 (Contributor, Author) commented Mar 7, 2014

You wrote: Wouldn't it be better to have a single script that does training + evaluation, since there is a lot of reusable functionality between the two?

I think it's better to have individual scripts for training, prediction, and evaluation. Evaluation, for example, is not limited to a specific learning algorithm; you can use it to evaluate anything that outputs multiclass labels.

Anyway, wherever you see reusable code, factor it out into methods/includes/whatever code blocks make sense.

karlnapf (Member) commented Mar 7, 2014

As for the format, keep in mind that we can serialise objects in Shogun.
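Shogun's own serialisation machinery isn't shown in this thread; as a generic illustration of handing a trained model from the training script to the prediction script, a plain pickle round-trip works for a pure-Python sketch (the model dict below is hypothetical):

```python
import os
import pickle
import tempfile

# Hypothetical trained-model state; Shogun would serialise its own objects instead.
model = {"classes": ["a", "b"], "centroids": {"a": [0.0, 0.5], "b": [5.5, 5.0]}}

path = os.path.join(tempfile.mkdtemp(), "multiclass-svm.model")
with open(path, "wb") as f:   # train_multiclass_svm.py would end here
    pickle.dump(model, f)
with open(path, "rb") as f:   # predict_multiclass_svm.py would start here
    restored = pickle.load(f)
```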

cameo54321 pushed a commit to cameo54321/shogun that referenced this issue Mar 17, 2014
@tklein23 tklein23 closed this as completed Apr 3, 2014