A parser, written in Python, that extracts information from resumes in PDF and DOCX formats.
##Dependencies
The parser requires two Python modules to work as intended. The modules used for tokenizing and stop word removal are:
- word_tokenize from nltk.tokenize (can be replaced with the built-in split() string method)
- stopwords from nltk.corpus
To get both, you'll need to install the Python NLTK module.
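
As a rough illustration of how these two pieces fit together, here is a minimal sketch of tokenizing followed by stop word removal. It is not the parser's actual code: the helper names `load_tools` and `remove_stop_words` are hypothetical, and the small fallback stop list is an assumption used only when NLTK or its data (the stopwords corpus and the tokenizer models, normally fetched with `nltk.download`) is unavailable.

```python
def load_tools():
    """Return (tokenize, stop_words), preferring NLTK when it is usable."""
    try:
        from nltk.corpus import stopwords
        from nltk.tokenize import word_tokenize

        stop_words = set(stopwords.words("english"))
        word_tokenize("probe")  # raises LookupError if tokenizer data is missing
        return word_tokenize, stop_words
    except (ImportError, LookupError):
        # Fallback (assumption): built-in split() and a tiny illustrative stop list.
        fallback_stop_words = {"a", "an", "and", "in", "of", "the", "to", "with"}
        return (lambda text: text.split()), fallback_stop_words


def remove_stop_words(text):
    """Tokenize text and drop stop words, comparing case-insensitively."""
    tokenize, stop_words = load_tools()
    return [tok for tok in tokenize(text) if tok.lower() not in stop_words]


print(remove_stop_words("Experience in Python and SQL"))
```

The fallback mirrors the note above that split() can stand in for word_tokenize; split() only breaks on whitespace, so punctuation stays attached to words, which may or may not matter for a given resume.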
The script is written in Python 2.7.6.
##License
The script is licensed under the General Public License (GPL). For more details, see LICENSE.md.