implemented review comments
- Also updated the README with instructions to install a spaCy NLP model, so the assets folder is no longer needed.
joriscram committed Mar 8, 2018
1 parent e62c877 commit bb73005
Showing 2 changed files with 6 additions and 3 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -13,6 +13,9 @@ conda env create -f environment.yml
 # Activate Python virtual environment
 source activate resume

+#Retrieve language model from spacy
+python -m spacy download en
+
 # Run code (with default configurations)
 cd bin/
 python main.py
6 changes: 3 additions & 3 deletions bin/main.py
@@ -12,8 +12,8 @@

 import lib
 import field_extraction
+import spacy

-import en_core_web_sm

 def main():
     """
@@ -27,7 +27,7 @@ def main():
     observations = extract()

     # Spacy: Spacy NLP
-    nlp = en_core_web_sm.load()
+    nlp = spacy.load('en')

     # Transform data to have appropriate fields
     observations, nlp = transform(observations, nlp)
@@ -42,7 +42,7 @@ def text_extract_utf8(f):
     try:
         return unicode(textract.process(f), "utf-8")
     except UnicodeDecodeError, e:
-        return e
+        return ''

 def extract():
     logging.info('Begin extract')
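
The last hunk in bin/main.py changes what text_extract_utf8 returns when decoding fails: an empty string instead of the exception object. A minimal Python 3 sketch of that pattern follows. The function name decode_utf8_or_empty is hypothetical, and it takes raw bytes directly rather than calling textract, so it is an illustration under those assumptions and not the repository's code.

```python
def decode_utf8_or_empty(raw: bytes) -> str:
    """Decode bytes as UTF-8; return '' if the bytes are not valid UTF-8.

    Hypothetical helper for illustration only (not part of the repository).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Returning an empty string keeps the return type consistent,
        # so callers always receive text and never an exception object.
        return ""


if __name__ == "__main__":
    # Valid UTF-8 input decodes normally.
    print(repr(decode_utf8_or_empty("résumé".encode("utf-8"))))
    # Invalid UTF-8 input falls back to the empty string.
    print(repr(decode_utf8_or_empty(b"\xff\xfe")))
```

Keeping the return type a string in both branches means later steps that treat the result as text do not need a separate check for an exception value.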
