forked from turian/grab-wikipedia-abstracts
Grab all Wikipedia abstracts, in all languages
gaybro8777/grab-wikipedia-abstracts
For every dump in:
    http://dumps.wikimedia.org/backup-index.html
find the file abstract.xml and wget it.

USAGE:
    ./grab-wikipedia-abstracts.py

This will create a directory download.wikimedia.org/ with the abstract.xml files.

REQUIREMENTS:
    * BeautifulSoup
    * wget
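The procedure described above (read the dump index, visit each dump, fetch its abstract.xml with wget) could be sketched roughly as follows. This is not the script's actual code. It is a minimal illustration that uses the standard-library html.parser in place of BeautifulSoup so that it has no third-party dependencies. The link pattern DUMP_HREF (hrefs shaped like "enwiki/20240101"), the substring match on "abstract.xml", and all function names are assumptions made for the example, not details confirmed by the README.

```python
import re
import subprocess
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

INDEX_URL = "http://dumps.wikimedia.org/backup-index.html"

# Assumption: per-dump links on the index look like "enwiki/20240101".
DUMP_HREF = re.compile(r"^[a-z0-9_]+/\d{8}/?$")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.hrefs


def dump_page_urls(index_html, base_url=INDEX_URL):
    """Absolute URLs of the per-dump pages linked from the index page."""
    return [urljoin(base_url, h) for h in extract_links(index_html)
            if DUMP_HREF.match(h)]


def abstract_urls(dump_html, dump_url):
    """Absolute URLs of links on a dump page that mention abstract.xml."""
    return [urljoin(dump_url, h) for h in extract_links(dump_html)
            if "abstract.xml" in h]


def wget_command(url):
    """Build the wget call; -x mirrors host/path as local directories, -c resumes."""
    return ["wget", "-x", "-c", url]


def download_all(index_url=INDEX_URL):
    """Walk the index and hand every abstract.xml URL to wget (network access)."""
    index_html = urlopen(index_url).read().decode("utf-8", "replace")
    for dump_url in dump_page_urls(index_html, index_url):
        dump_url = dump_url.rstrip("/") + "/"
        dump_html = urlopen(dump_url).read().decode("utf-8", "replace")
        for url in abstract_urls(dump_html, dump_url):
            subprocess.run(wget_command(url), check=False)
```

Calling download_all() would start the downloads. The parsing helpers are kept separate from the network and wget steps so they can be tried on a saved copy of the index page first.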