📥 Command-line downloader for public datasets

dafter: the data fetcher

You have just found dafter.

Dafter is a command-line downloader for public datasets. It downloads and formats each dataset's files, so you can spend your time building models instead of hunting for datasets and their URLs.

Install dafter

To install dafter, run:

pip install dafter

Commands

To download the MNIST dataset:

dafter get mnist

To delete MNIST from your machine:

dafter delete mnist

To search among downloadable datasets:

# Search all available datasets
dafter search
# Search all available datasets that have the tags "image" and "deep-learning"
# and whose name contains "mni"
dafter search mni --tags image deep-learning
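
The matching rule described in the comments above (the name contains the query and every requested tag is present) can be sketched in Python. This only illustrates the semantics and is not dafter's actual implementation: the `matches` helper and the sample records are hypothetical, and case-sensitive matching is an assumption.

```python
from typing import Iterable, Optional


def matches(dataset: dict, query: Optional[str] = None,
            tags: Iterable[str] = ()) -> bool:
    """Return True if the dataset name contains `query` and has all `tags`."""
    # Hypothetical helper: dafter's real matching code may differ.
    if query and query not in dataset["name"]:
        return False
    # Every requested tag must be present on the dataset.
    return set(tags) <= set(dataset.get("tags", []))


# Made-up records, used only to exercise the helper.
datasets = [
    {"name": "mnist", "tags": ["image", "deep-learning"]},
    {"name": "fashion-mnist", "tags": ["image"]},
]

found = [d["name"] for d in datasets
         if matches(d, "mni", ["image", "deep-learning"])]
print(found)
```

With these sample records, only the entry carrying both tags is kept; calling `matches` with no query and no tags accepts every record, which mirrors a bare `dafter search`.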

To list all the datasets that have been downloaded and are stored on your machine:

# List all datasets in database
dafter list
# List all datasets in database that have the tag "twitter" and whose name
# contains "sentiment"
dafter list sentiment --tags twitter

Update

To update dafter, run:

pip install --upgrade dafter

Uninstall

To uninstall dafter, run:

pip uninstall dafter

How to contribute?

Add a new dataset

To add a new dataset, add a JSON file named name-of-the-dataset.json to the datasets-configs folder. The file should follow this template:

{
  "name": "name-of-the-dataset",
  "urls": [
    {
      "url": "https://site.com/file1.tar.gz",
      "bytes": 45221
    },
    {
      "url": "https://site.com/file2.tar.gz",
      "bytes": 1147803
    }
  ],
  "type": "tar.gz",
  "tags": ["tag1", "tag2", "tag3"],
  "description": "This is a description of the dataset",
  "source": "https://site.com/"
}
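
Before opening a pull request, it can help to sanity-check a new config file. The sketch below is one possible check, not part of dafter: the `check_config` function is hypothetical, and treating all six fields of the template as required is an assumption.

```python
import json

# Assumption: every field shown in the template is required.
REQUIRED_FIELDS = {"name", "urls", "type", "tags", "description", "source"}


def check_config(text: str) -> list:
    """Return a list of problems found in a dataset config (empty if none)."""
    config = json.loads(text)
    problems = [f"missing field: {field}"
                for field in sorted(REQUIRED_FIELDS - set(config))]
    # Each entry of "urls" should carry a string "url" and an integer "bytes".
    for i, entry in enumerate(config.get("urls", [])):
        if not isinstance(entry.get("url"), str):
            problems.append(f"urls[{i}]: 'url' must be a string")
        if not isinstance(entry.get("bytes"), int):
            problems.append(f"urls[{i}]: 'bytes' must be an integer")
    return problems


example = """
{
  "name": "name-of-the-dataset",
  "urls": [{"url": "https://site.com/file1.tar.gz", "bytes": 45221}],
  "type": "tar.gz",
  "tags": ["tag1", "tag2", "tag3"],
  "description": "This is a description of the dataset",
  "source": "https://site.com/"
}
"""
print(check_config(example))
```

An empty list means the file has the same shape as the template; any other result names the fields to fix.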