Export OpenML data to data package, Import data package to OpenML #482
Would be nice if we could do that. It would make it much easier for people to upload data to OpenML and to work with data from OpenML.
Data packages are defined by:
To improve user friendliness. See also:
How is metadata specified in data packages?
This is related to #457
To be able to import data packages into OpenML I think we need to first do the following steps:
Any help on this issue would be very much appreciated.
Copying my response from gitter.im at the request of @HeidiSeibold
I note there are a few libs in Python for ARFF
And we have a documented way to convert to/from data backends here:
And some example implementations of the storage API at:
So writing an ARFF backend would be great!
Interesting dataset: Maybe this one is a good place to start:
Nice thing is that they have all the attribute file types, offered as a JSON file. Should be easy to convert to ARFF. What is still missing is the task, i.e. what you want to predict. There is also no description of what the dataset is about.
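To make the "easy to convert" part concrete, here is a rough sketch of turning the field list from a data package descriptor into an ARFF header. It assumes the descriptor follows the Table Schema layout (`schema.fields` with `name`, `type` and an optional `constraints.enum`). The type mapping and the `arff_header` helper are my own illustration, not something from OpenML or the frictionlessdata libraries. It only writes the header and does not touch the data rows, the task, or the description.

```python
# Hypothetical mapping from Table Schema field types to ARFF attribute types.
# Anything not listed here falls back to STRING.
TABLE_SCHEMA_TO_ARFF = {
    "integer": "NUMERIC",
    "number": "NUMERIC",
    "string": "STRING",
    "boolean": "{true,false}",
    "date": 'DATE "yyyy-MM-dd"',
}


def quote_name(name):
    """Quote attribute names that contain whitespace, as ARFF requires."""
    return f"'{name}'" if any(ch.isspace() for ch in name) else name


def arff_header(resource):
    """Build an ARFF header from one resource of a data package descriptor."""
    lines = [f"@RELATION {quote_name(resource['name'])}", ""]
    for field in resource["schema"]["fields"]:
        enum = field.get("constraints", {}).get("enum")
        if enum:
            # An enumerated field becomes a nominal attribute.
            arff_type = "{" + ",".join(str(value) for value in enum) + "}"
        else:
            arff_type = TABLE_SCHEMA_TO_ARFF.get(field.get("type", "string"), "STRING")
        lines.append(f"@ATTRIBUTE {quote_name(field['name'])} {arff_type}")
    lines += ["", "@DATA"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up resource, for illustration only.
    example = {
        "name": "iris",
        "schema": {
            "fields": [
                {"name": "sepallength", "type": "number"},
                {"name": "class", "type": "string",
                 "constraints": {"enum": ["setosa", "versicolor"]}},
            ]
        },
    }
    print(arff_header(example))
```

The target attribute would still have to be chosen by hand, since the descriptor does not say what to predict.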
I also couldn't figure out how to navigate DataHub. There are apparently 200+ datasets but I can only see a few of them on the website.
Feedback from the frictionlessdata gitter:
There are now some machine learning data sets available as data packages: http://datahub.io/machine-learning
I guess a first step now would be to check:
See also discussion datopian/datahub-qa#33 (comment)