About

This repo contains lists of English words extracted from Wikipedia articles, built by Lexipedia.

  • wikipedia_words.zip: over 2.9M words. Contains all words from English Wikipedia articles, including nouns, pronouns, etc. Also contains non-English words.

  • wiktionary_words.zip: 280k words. Contains only words found in English Wiktionary.

Each line has 4 values separated by spaces: word, length, frequency, and document frequency (the number of Wikipedia articles in which the word occurs).
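
The line format described above could be read with something like the following Python sketch. It is not part of the repo: the names `WordEntry`, `parse_lines`, and `read_zip` are hypothetical. It also makes three assumptions: each archive holds a single text file, the file is UTF-8 encoded, and the three numeric fields are integers.

```python
import io
import zipfile
from typing import Iterable, Iterator, NamedTuple


class WordEntry(NamedTuple):
    """One line of a word list (hypothetical helper type)."""
    word: str
    length: int
    frequency: int
    doc_frequency: int  # number of Wikipedia articles containing the word


def parse_lines(lines: Iterable[str]) -> Iterator[WordEntry]:
    """Yield one WordEntry per well-formed line, skipping anything else."""
    for raw in lines:
        parts = raw.split()
        if len(parts) != 4:
            continue  # blank line or unexpected number of fields
        word, length, freq, doc_freq = parts
        try:
            yield WordEntry(word, int(length), int(freq), int(doc_freq))
        except ValueError:
            continue  # a numeric field was not an integer


def read_zip(path: str) -> Iterator[WordEntry]:
    """Stream entries from the first member of a zip archive.

    Assumption: the archive holds one UTF-8 text file; the member
    name is not documented, so the first listed member is used.
    """
    with zipfile.ZipFile(path) as zf:
        member = zf.namelist()[0]
        with zf.open(member) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8")
            yield from parse_lines(text)
```

Calling `read_zip("wikipedia_words.zip")` would then stream entries one at a time, which avoids loading all 2.9M lines into memory; filtering on `doc_frequency` is one way to drop rare or non-English tokens.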