[joss] feature request: accessible utility to import a dataset #19
Comments
Hi again @hbaniecki! Wow, this is an amazing idea 👏👏👏 What do you think about adding a method for this? Perhaps:
from pyss3 import SS3
from pyss3.util import Dataset
x_train, y_train = Dataset.load_from_url("https://url/to/movie_review.zip", "train")
x_test, y_test = Dataset.load_from_url("https://url/to/movie_review.zip", "test")
clf = SS3()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
What do you think?
Thanks! If the data won't be saved on disk and only loaded into
Now datasets can be directly loaded via a given url, not only from disk. To achieve this, two methods have been added to the ``Dataset`` class:
- ``Dataset.load_from_url(...)``
- ``Dataset.load_from_url_multilabel(...)``
These methods download and extract the zip file (given by the url) into the system's temporary folder and then call ``Dataset.load_from_files()`` to load it (or ``Dataset.load_from_files_multilabel()``, respectively).
Note: If the same url is used consecutively, the already downloaded files will be used as a cache (to avoid downloading and extracting them again).
Resolves: #19
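The download, extract and cache steps described in the commit message can be sketched roughly as follows. This is only an illustration using the Python standard library, not PySS3's actual implementation: the function name fetch_and_extract, the cache_root parameter, and the hash-based folder naming are all assumptions made for this example.

```python
import hashlib
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, cache_root=None):
    """Download the zip at `url`, extract it, and return the extraction folder.

    Hypothetical helper: name, parameters and cache layout are illustrative.
    """
    # Default to the system's temporary folder, as the commit message describes.
    cache_root = Path(cache_root or tempfile.gettempdir())

    # Derive a stable folder name from the URL so repeated calls find the cache.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    target = cache_root / f"dataset_{key}"
    if target.is_dir():
        return target  # already downloaded and extracted earlier

    zip_path = cache_root / f"dataset_{key}.zip"
    partial = cache_root / f"dataset_{key}.partial"

    urllib.request.urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        # Extract to a scratch folder first so an interrupted run
        # is not mistaken for a complete cache entry.
        archive.extractall(partial)
    partial.rename(target)
    zip_path.unlink()  # the archive itself is no longer needed
    return target
```

The returned folder could then be handed to a disk-based loader such as ``Dataset.load_from_files()``, which the commit message names as the next step.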
Hi @hbaniecki! Sorry for the delay, I just had to wait for the weekend to get down to this. I've added the suggested methods and also updated the ... Below I'm pasting the commit message that marked this issue as closed:
No worries. Thanks! Works great.
openjournals/joss-reviews#3934
This package has good documentation. Going through the examples, I came up with a feature request that would greatly help with introducing newcomers and prototyping code.
I like the first example in a README to be straightforward and copy-paste ready, which is not the case here (looking at the missing code "...").
How about implementing some import_dataset(url)/download(url) functionality in utils or Dataset that would, for example, download the dataset .zip file and unpack it (sample code) so that one can load the data into exemplary code:
Implementation details and naming may vary, but it would be nice to easily run code from the README.