GitHub - avinassh/gre-classics

An attempt to find the frequency of the top 800 GRE words in classic books.

#Requirements:

The dependencies are listed in requirements.txt. Install them using pip:
```
  pip install -r requirements.txt
```
The NLTK WordNet corpus is also required.
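
One way to fetch the corpus is through NLTK's own downloader. This is a minimal sketch, assuming `nltk` is already installed; the helper name `ensure_wordnet` is hypothetical and not part of this repository:

```python
def ensure_wordnet():
    """Fetch the WordNet corpus with NLTK's downloader.

    Returns True if the download succeeded or the corpus was already present.
    """
    # Imported here so this file can be loaded even before nltk is installed.
    import nltk

    # nltk.download skips the download when the package is already up to date.
    return nltk.download("wordnet")
```

Calling `ensure_wordnet()` once before running the scripts should be enough.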

#Log:

16/Feb/2014:
- High-frequency words and corpus were not lemmatized
- Output is saved in output.json

18/Feb/2014:
- Both the high-frequency words and the corpus were lemmatized
- Lemmatization was done using WordNetLemmatizer:
```
  from nltk.stem import WordNetLemmatizer
  wnl = WordNetLemmatizer()
  wnl.lemmatize(word)  # applied to each word; default part of speech is noun
```
- Output is saved in output-wnl.json
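
The difference between the two log entries can be sketched as follows. This is an illustration under stated assumptions, not the code in main.py: the function names, the regex tokenizer, and the toy lemmatizer are all hypothetical. The lemmatizer is passed in as a function, so the default (identity) corresponds to the un-lemmatized run.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_word_frequencies(target_words, text, lemmatize=lambda w: w):
    """Count how often each target word occurs in text.

    Both the target words and the corpus tokens go through `lemmatize`,
    so inflected forms are counted under a single lemma.
    """
    counts = Counter(lemmatize(token) for token in tokenize(text))
    lemmas = {lemmatize(word.lower()) for word in target_words}
    # Counter returns 0 for lemmas that never appear in the text.
    return {lemma: counts[lemma] for lemma in lemmas}


def toy_lemmatize(word):
    # Hypothetical stand-in that strips a trailing "s"; not a real lemmatizer.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


text = "The zealots argued; one zealot was candid, the others less candid."
words = ["zealot", "candid", "abate"]

plain = count_word_frequencies(words, text)
lemmatized = count_word_frequencies(words, text, toy_lemmatize)
```

Here `plain` counts "zealot" once, while `lemmatized` also folds "zealots" into it. With NLTK installed, `WordNetLemmatizer().lemmatize` can be passed in place of `toy_lemmatize`.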

