Terms
- I have searched open and closed feature requests
- I agree to follow Scribe-Data's Code of Conduct
Description
Scribe-Data will be expanding its functionality to work from Wikidata dumps. The first step is to add the ability for the CLI to download Wikidata Lexeme dumps. The following commands should be added in this issue:
# Latest dump:
scribe-data download --wikidata-dump
scribe-data d -wd
# Specific dump:
scribe-data download --wikidata-dump YYYYMMDD
scribe-data d -wd YYYYMMDD
# Specific output directory:
scribe-data download --wikidata-dump --output-dir DIRECTORY_PATH
scribe-data d -wd -od DIRECTORY_PATH

The above will download the dumps from dumps.wikimedia.org/wikidatawiki/entities/. In the first set of commands the latest .json.bz2 file will be downloaded, and in the second the URL for the given YYYYMMDD stamp will be checked and a .json.bz2 dump will be downloaded to the PWD. The third adds an output directory path as is done on the get command, but let's not change the file name. We'll just allow the user to put it in a directory 😊
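The three behaviors described above could be sketched roughly as follows. This is only an illustration, not the planned implementation: the function names (`build_dump_url`, `resolve_output_path`, `download_dump`) are hypothetical, and the lexeme file names (`latest-lexemes.json.bz2` and `wikidata-YYYYMMDD-lexemes.json.bz2`) are assumptions about how the dumps directory is laid out that should be checked against the live listing.

```python
import shutil
import urllib.request
from datetime import datetime
from pathlib import Path
from typing import Optional

# Base location named in the issue text.
BASE_URL = "https://dumps.wikimedia.org/wikidatawiki/entities"


def build_dump_url(dump_date: Optional[str] = None) -> str:
    """Return the dump URL for a YYYYMMDD stamp, or the latest dump if None."""
    if dump_date is None:
        # Assumed file name for the latest lexeme dump.
        return f"{BASE_URL}/latest-lexemes.json.bz2"
    if len(dump_date) != 8 or not dump_date.isdigit():
        raise ValueError(f"Expected a YYYYMMDD stamp, got {dump_date!r}")
    # Raises ValueError if the digits do not form a real calendar date.
    datetime.strptime(dump_date, "%Y%m%d")
    # Assumed naming scheme for dated lexeme dumps.
    return f"{BASE_URL}/{dump_date}/wikidata-{dump_date}-lexemes.json.bz2"


def resolve_output_path(url: str, output_dir: Optional[str] = None) -> Path:
    """Keep the remote file name; only the directory is chosen by the user."""
    directory = Path(output_dir) if output_dir else Path.cwd()
    return directory / url.rsplit("/", 1)[-1]


def download_dump(
    dump_date: Optional[str] = None, output_dir: Optional[str] = None
) -> Path:
    """Download the dump to the PWD or to output_dir and return its path."""
    url = build_dump_url(dump_date)
    target = resolve_output_path(url, output_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlopen raises urllib.error.HTTPError when no dump exists at the URL,
    # which doubles as the "check the URL for the given stamp" step.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return target
```

Streaming with `shutil.copyfileobj` avoids holding the whole archive in memory, which matters because the dumps can be large. A real implementation would likely also want progress reporting and a check for an already-downloaded file.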
The functionality should be added in a file src/scribe_data/cli/download.py, with the option being added into src/scribe_data/cli/main.py :)
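One way the option wiring might look, using only the standard library `argparse`. The flag names come from the commands in this issue; the `build_parser` helper, the `"latest"` sentinel, and the overall parser layout are assumptions for illustration and may not match how `main.py` is actually organized.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing a download command with the alias d."""
    parser = argparse.ArgumentParser(prog="scribe-data")
    subparsers = parser.add_subparsers(dest="command")

    download = subparsers.add_parser("download", aliases=["d"])
    download.add_argument(
        "-wd",
        "--wikidata-dump",
        nargs="?",  # the YYYYMMDD stamp is optional
        const="latest",  # flag given without a stamp: fetch the latest dump
        default=None,  # flag not given at all
        metavar="YYYYMMDD",
        help="Download a Wikidata lexeme dump (latest if no date is given).",
    )
    download.add_argument(
        "-od",
        "--output-dir",
        default=None,
        metavar="DIRECTORY_PATH",
        help="Directory to save the dump in (defaults to the PWD).",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Using `nargs="?"` with a `const` value lets the parser tell apart `-wd` alone, `-wd YYYYMMDD`, and the flag being absent, which covers the first two command sets without a separate date option.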
Contribution
Being worked on by @axif0 as a part of Outreachy! 📶🚀