NLP Dataset Downloader #243

Closed
alanakbik opened this issue Nov 26, 2018 · 0 comments
Labels
feature A new feature

Comments

@alanakbik
Collaborator

Some NLP datasets, such as the Universal Dependencies corpora, are freely available online. To make experimentation easier, add a feature for automatically fetching these datasets and putting them in a default folder structure.

Perhaps the existing NLPTaskDataFetcher can be extended such that:

corpus = NLPTaskDataFetcher.fetch_data(NLPTask.UD_ENGLISH)

When this is called, it first checks whether the dataset is already present. If not, it triggers a download in which the corresponding UD corpus is downloaded and unpacked into the default folder structure.
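
A rough sketch of the check-then-download behaviour described above might look like the following. It is not the flair implementation: `fetch_dataset`, `cache_root`, and the archive URL are hypothetical names chosen for illustration, and the sketch assumes the corpus is distributed as a zip archive.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def fetch_dataset(name: str, url: str, cache_root: Path) -> Path:
    """Return the folder for `name`, downloading and unpacking it only if missing.

    Hypothetical helper: the name, signature, and folder layout are assumptions.
    """
    target = cache_root / name

    # Step 1: reuse the local copy if the dataset folder already has content.
    if target.is_dir() and any(target.iterdir()):
        return target

    # Step 2: download the archive into a temporary directory.
    target.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "archive.zip"
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            shutil.copyfileobj(response, out)

        # Step 3: unpack into the default folder structure.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)

    return target
```

A second call with the same `name` returns the cached folder without touching the network. A fuller version would probably also clean up a partially unpacked folder if extraction fails.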

alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
alanakbik pushed a commit that referenced this issue Nov 26, 2018
tabergma added a commit that referenced this issue Nov 26, 2018
tabergma added a commit that referenced this issue Nov 27, 2018
2 participants