New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added multilabel reader in LibSVMFile #2073
Conversation
@tklein23 @vigsterkr |
(1) when user tries to read singlelabel file by calling (2) when user tries to read multilabel file by calling (3) when user uses |
get_sparse_matrix(matrix, num_feat, num_vec, labels, false); \ | ||
SGVector<float64_t>* mat_label; \ | ||
int32_t num_classes; \ | ||
get_sparse_matrix(mat_feat, num_feat, num_vec, mat_label, num_classes, false); \ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest "mat_label" to "labels" to align naming in both functions. Even better would be "multilabel", so the name is telling exactly what it will contain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will change the naming.
All-in-all it looks really cool. It's almost ready to merge. I'm "just" missing: a simply unit test, that reads an example file and compares the resulting labels in a few cases:
Please run your test(s) in valgrind and check if memory is clean. |
@tklein23 Thanks for your comments. Yes, the unit test is quite necessary. I will update the code soon. |
@tklein23 I have updated the LibSVMFile:
|
SG_SPRINT("Unable to open file: %s\n", fname); | ||
return; | ||
} | ||
fclose(pfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an additional space here.
Just go through the code, don't understand why you change |
@hushell Thanks for your comments. There are a lot of details I didn't check carefully. Regarding to your main concern |
On Monday 31 March 2014 05:13:02 Jiaolong wrote:
"empty" multilabels are valid es well; just a line that starts with a |
@tklein23 Note that in the above example |
On Monday 31 March 2014 14:04:47 Jiaolong wrote:
I missed that. That's okay. (I guess the space is important to let |
@tklein23 Given a space at the head, I think it can be parsed by the current code as well. Anyway, I will add this in the unit test. |
If you ran some tests, let me know if it worked! |
@tklein23 There is an example in libshogun: io_libsvm_multilabel, which reads and displays multilabels from yeast dataset. I pasted the output as follows:
|
@tklein23 I got a Travis error: Could you help me to check it? I only changed the format issue and the previous versions have all passed the Travis test. |
@tklein23 For the memory leak issue, I pasted the the valgrind output in gist |
*/ | ||
#include <shogun/io/LibSVMFile.h> | ||
#include <shogun/lib/SGVector.h> | ||
#include <shogun/lib/SGVector.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including SGVector.h more than once? ;)
@tklein23 I have updated the code with minor changes. Now the tests are passed. |
Added multilabel reader in LibSVMFile - thanks to Jiaolong
There are several issues need to be discussed:
(1) Use SGVector or SGSparseVector to store labels. Currently, it is SGVector but there should be no trouble to change it into SGVector.
(2) There is just one unified function for parsing single-label and multi-label files. However, I kept the olde API of reading single-label file. So the SGSparseVector labels are converted internally. To move this converter outside, I guess many files dependent on the old API need to be changed.
@tklein23 @vigsterkr
For (1), if you all agree, I can use the unified type SGVector, since it is more compatible with @tklein23 requirement.
@vigsterkr
For(2), I need your help since I don't know how to put the converter outside and how many files need to be changed correspondingly. To be honest, I don't fully understand your design. But I think all the necessary code are already in LibSVMFile right now.