Data directories

  • The scrape directory contains our "Big Scrape" scripts, the TSVs created by those scripts, and two tables (a README and a TSV) describing those TSVs.
    • More information on the "Big Scrape" scripts, including instructions on how to run your own scrape, can be found here.
  • The phones directory contains the .phones files used to filter the TSVs produced by the "Big Scrape", scripts that facilitate the creation of .phones files, and two tables (a README and a TSV) describing those .phones files.
    • More information on the files within the phones directory, including instructions on how to create your own .phones file, can be found here.
  • The frequencies directory contains scripts used to merge word frequencies into the TSVs produced by the "Big Scrape".
    • Details on the specific function of each script and how we acquire the frequencies can be found here.
  • The morphology directory contains scripts that download UniMorph data for all languages covered by both UniMorph and the "Big Scrape".
    • Details on these scripts can be found here.
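
To make the relationship between the phones and scrape directories more concrete, here is a minimal sketch of filtering a scraped TSV with a .phones file. It is not the repository's own script. It assumes a .phones file lists one phone per line with optional `#` comments, and that each TSV row holds a word followed by a space-separated pronunciation. The function names `read_phones` and `filter_rows` and the sample data are hypothetical.

```python
import csv
import io


def read_phones(lines):
    """Collect the allowed phones, skipping blank lines and '#' comments."""
    allowed = set()
    for line in lines:
        # Assumed format: one phone per line, optional trailing comment.
        phone = line.split("#", 1)[0].strip()
        if phone:
            allowed.add(phone)
    return allowed


def filter_rows(rows, allowed):
    """Keep rows whose pronunciation uses only phones in the allowed set."""
    kept = []
    for row in rows:
        if len(row) < 2:
            continue  # skip malformed rows
        # Assumed layout: column 0 is the word, column 1 the pronunciation.
        if all(phone in allowed for phone in row[1].split()):
            kept.append(row)
    return kept


# Hypothetical sample data standing in for a .phones file and a scraped TSV.
phones_text = "a  # open vowel\nk\nt\n"
tsv_text = "cat\tk a t\ncab\tk a b\n"

allowed = read_phones(io.StringIO(phones_text))
rows = list(
    csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
)
kept = filter_rows(rows, allowed)
print(kept)
```

In this sketch the second row is dropped because `b` is not in the allowed set. The actual scripts and file formats are documented in the directory READMEs linked above.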