- The scrape directory contains our "Big Scrape" scripts, the TSVs created by those scripts, and two tables (a README and a TSV) describing those TSVs.
- More information on the "Big Scrape" scripts, including instructions on how to run your own scrape, can be found here.
- The phones directory contains the .phones files used to filter the TSVs produced by the "Big Scrape", scripts that facilitate the creation of .phones files, and two tables (a README and a TSV) describing those .phones files.
- The frequencies directory contains scripts used to merge word frequencies into the TSVs produced by the "Big Scrape".
- Details on the specific function of each script and how we acquire the frequencies can be found here.
- The morphology directory contains scripts that download UniMorph data for all languages covered by both UniMorph and the "Big Scrape".
- Details can be found here.