ENH: io: support sparse format in loadarff
#3535
base: main
@cournape you added the ARFF reader, so it would be great if you could have a look at this enhancement. |
@fbenites there's a mix of tabs and spaces in your commits: https://travis-ci.org/scipy/scipy/jobs/22599231 |
There's also lots of You can install pep8 on your own machine; running |
See https://travis-ci.org/scipy/scipy/builds/22940084: the code now seems to be PEP8-conformant and the tests built successfully. |
Looks much better now. It may be good to ask on the scipy-dev mailing list if there's anyone who has a use for this functionality and wants to test / provide feedback. |
Adding a third return value to |
@WarrenWeckesser agreed, returning a 3rd value should be avoided. Returning a 3rd value conditionally is even worse IMO. We should add a new API to use the new features, with a provision to be more extensible (looking at my original code is humbling :) ). |
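To make the concern about a conditional third return value concrete, here is a minimal sketch. All names in it (`load_conditional`, `ArffResult`, `load_result`) are hypothetical and are not part of scipy; the loaders return placeholder strings instead of parsing a file. It only shows why existing two-value unpacking breaks, and how a result object could leave room for extra fields.

```python
from dataclasses import dataclass
from typing import Any, Optional


def load_conditional(with_classes=False):
    """Hypothetical loader whose number of return values depends on a flag."""
    if with_classes:
        return "data", "meta", "classes"
    return "data", "meta"


@dataclass
class ArffResult:
    """Hypothetical result container; new optional fields can be added later."""
    data: Any
    meta: Any
    classes: Optional[Any] = None


def load_result(with_classes=False):
    """Hypothetical loader that always returns the same kind of object."""
    return ArffResult("data", "meta", "classes" if with_classes else None)


# Callers written for two return values fail once a third one appears.
try:
    data, meta = load_conditional(with_classes=True)
    unpack_ok = True
except ValueError:
    unpack_ok = False

# With a result object, callers that only use .data and .meta are unaffected.
result = load_result(with_classes=True)
```

Whether a separate function, a keyword argument, or a container like this is the right shape for the new API is a design decision for the maintainers; the sketch only illustrates the trade-off being discussed.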
I must admit that this class thing is pretty much a Meka thing. I could write the code for a conditional 3rd argument, since this is a special case. Mulan handles this problem with a separate XML file for the classes. Further, there is a feature, which I have never seen used in practice but which exists in theory, that gives a weight to each instance. I did not implement it, so this does not cover the whole spec as in http://weka.wikispaces.com/ARFF+%28developer+version%29 . I also did not test for sparse and undefined. |
How about putting the class data in the |
From the docs: Knows about attributes names and types. The classes are the classes for each object. In multi-label data an object can have multiple classes assigned to it, like tags, so every instance has both attributes and classes. In normal Weka the classes are part of the data; I wanted to split them out. It is also possible to implement it so that the data contains the classes as well, in which case we could pass the number of classes in the metadata. Meka uses the first x attributes as classes, while MULAN (another multi-label library built on Weka) uses the last x as classes, so the metadata should make that clear too, if that is important for conformity. I hoped to use it like this for now and later, if many people are interested in the functionality, change it according to what most users need. |
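As a rough illustration of the two conventions described above (labels in the first x columns for Meka, in the last x columns for MULAN), the split could be done after loading. The function name `split_labels` and its parameters are hypothetical and are not taken from the pull request; rows are plain Python sequences here to keep the sketch self-contained.

```python
def split_labels(rows, n_labels, labels_first=True):
    """Split each row into (features, labels).

    labels_first=True  -> the first n_labels columns are labels (Meka-style).
    labels_first=False -> the last n_labels columns are labels (MULAN-style).
    """
    features, labels = [], []
    for row in rows:
        row = list(row)
        if not 0 <= n_labels <= len(row):
            raise ValueError("n_labels must be between 0 and the row length")
        # Column index where the row is cut into two parts.
        cut = n_labels if labels_first else len(row) - n_labels
        head, tail = row[:cut], row[cut:]
        if labels_first:
            labels.append(head)
            features.append(tail)
        else:
            features.append(head)
            labels.append(tail)
    return features, labels


rows = [[1, 0, 0.5, 2.0],
        [0, 1, 1.5, 3.0]]
meka_features, meka_labels = split_labels(rows, 2, labels_first=True)
mulan_features, mulan_labels = split_labels(rows, 2, labels_first=False)
```

With this approach the loader only needs to report the number of label columns and which end they sit on, for example through the metadata, and the data array itself stays in one piece.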
loadarff also for sparse format with support for meka multi-label assignments
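For readers unfamiliar with the sparse ARFF layout, a data row is written as `{index value, index value, ...}` with zero-based attribute indices, and attributes that are not listed take the value 0. The sketch below expands one such row into a dense list. It is not the code from this pull request: `parse_sparse_row` is a hypothetical helper, and it assumes numeric values only (no quoted strings, nominal values, or `?` missing markers).

```python
def parse_sparse_row(line, n_attributes):
    """Expand one sparse ARFF data row into a dense list of floats.

    Assumes numeric attributes only; unlisted attributes default to 0.
    """
    line = line.strip()
    if not (line.startswith("{") and line.endswith("}")):
        raise ValueError("not a sparse ARFF row: %r" % line)
    row = [0.0] * n_attributes
    body = line[1:-1].strip()
    if not body:
        return row  # "{}" means every attribute is 0
    for entry in body.split(","):
        # Each entry is "<zero-based index> <value>".
        index_text, value_text = entry.split(None, 1)
        index = int(index_text)
        if not 0 <= index < n_attributes:
            raise ValueError("attribute index out of range: %d" % index)
        row[index] = float(value_text)
    return row


dense = parse_sparse_row("{1 2.5, 3 1}", 5)
# dense == [0.0, 2.5, 0.0, 1.0, 0.0]
```

A real reader would also have to handle quoted values containing commas and build a sparse matrix rather than dense rows; both are left out here for brevity.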