Indexes Word documents after stopword removal and lemmatization, and supports simple boolean conjunctive (AND) queries over the index.

clydedacruz/worddoc-indexer-py

worddoc-indexer-py

Dependencies

Python version: 3.5 or greater

To install dependencies, run: pip install nltk

Then download the NLTK data. At the Python prompt, run:

import nltk
nltk.download('wordnet')

Usage

To create the index: python create_index.py data

To query: python query_index.py <term1> <term2> ... <termN>

Because the query is conjunctive, only documents that contain every given term are returned.
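To illustrate what a boolean conjunctive query over an index means, here is a minimal sketch. It is not the code in create_index.py or query_index.py. The names STOPWORDS, normalize, build_index and query_and are hypothetical, the stopword list is a tiny stand-in, and lemmatization and Word file parsing are left out so the example runs without NLTK.

```python
from collections import defaultdict

# Hypothetical stand-in for a real stopword list (assumption, not the project's list).
STOPWORDS = {"a", "an", "the", "and", "of", "is", "in", "to"}


def normalize(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords.

    A real pipeline would also lemmatize each token; that step is omitted here.
    """
    tokens = (tok.strip(".,;:!?\"'()").lower() for tok in text.split())
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


def build_index(docs):
    """Build an inverted index: term -> set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for term in normalize(text):
            index[term].add(name)
    return index


def query_and(index, terms):
    """Return the documents that contain every query term (conjunctive query)."""
    postings = []
    for term in terms:
        normalized = normalize(term)
        if not normalized:
            continue  # the term was a stopword, so it cannot constrain the result
        postings.append(index.get(normalized[0], set()))
    if not postings:
        return set()
    return set.intersection(*postings)


# Plain strings stand in for text already extracted from Word documents.
docs = {
    "a.docx": "The quick brown fox",
    "b.docx": "A quick red fox and a dog",
    "c.docx": "The lazy dog",
}
index = build_index(docs)
print(sorted(query_and(index, ["quick", "fox"])))
print(sorted(query_and(index, ["fox", "dog"])))
```

Query terms go through the same normalization as the indexed text, so a term matches regardless of case. A term that appears in no document yields an empty result, since the intersection with an empty set is empty.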
