proof of concept using NLTK for named entity extraction
Ruby Python Shell
spec
.gitignore
README
batch.sh
extract_body.rb
ferret_demo.rb
interesting
map_to_places.rb
message_sanitisation.rb
notes
noun_phrases.py

README

gem install ferret
Requires an NLTK install, including the corpus used for chunking (need to look up which one again).

head data/msgs_200k.27.csv | ./extract_body.rb | ./noun_phrases.py
See also batch.sh.
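
The last stage of the pipeline above pulls noun phrases out of message bodies. As a rough illustration of what chunking means here, the sketch below groups part-of-speech-tagged tokens into phrases of the form "optional determiners/adjectives, then one or more nouns". It is a dependency-free stand-in, not the contents of noun_phrases.py: the function name, the sample sentence, and the hand-written Penn Treebank style tags are all assumptions. With NLTK, the tags would come from its tagger and the pattern from a chunk grammar.

```python
def noun_phrases(tagged):
    """Return noun phrases from a list of (word, tag) pairs.

    Pattern (an assumption for illustration): any run of DT/JJ
    modifiers followed by one or more NN* nouns.
    """
    phrases = []
    modifiers = []  # pending determiners/adjectives not yet attached to a noun
    nouns = []      # nouns collected for the current phrase

    def flush():
        # Emit the current phrase only if it contains at least one noun.
        if nouns:
            phrases.append(" ".join(modifiers + nouns))
        modifiers.clear()
        nouns.clear()

    for word, tag in tagged:
        if tag.startswith("NN"):
            nouns.append(word)
        elif tag in ("DT", "JJ"):
            if nouns:  # a modifier after nouns starts a new phrase
                flush()
            modifiers.append(word)
        else:
            flush()    # any other tag ends the current phrase
    flush()
    return phrases


# Hand-tagged sample sentence (hypothetical data).
sample = [("We", "PRP"), ("visited", "VBD"), ("the", "DT"), ("old", "JJ"),
          ("town", "NN"), ("hall", "NN"), ("in", "IN"), ("Lisbon", "NNP")]
print(noun_phrases(sample))
```

Phrases such as place names that survive this step are the natural candidates to feed into a later lookup.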

head -n1000 lp_places.csv > test
Run ./map_to_places.rb and enter test queries.
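
The query step above matches typed text against a list of places. The README does not describe the layout of lp_places.csv or how map_to_places.rb searches it, so the following is only a minimal sketch of the general idea: build a small inverted index from tokens to place names, then return the names that contain every token in the query. The place names and both function names are hypothetical, and a real full-text index such as Ferret does considerably more (ranking, analysis, persistence).

```python
from collections import defaultdict


def build_index(names):
    """Map each lowercased token to the set of place names containing it."""
    index = defaultdict(set)
    for name in names:
        for token in name.lower().split():
            index[token].add(name)
    return index


def lookup(index, query):
    """Return place names that contain every token of the query, sorted."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # index.get avoids creating empty entries for unknown tokens.
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical place names standing in for rows of the real file.
places = ["Lisbon", "New York", "York", "Newcastle upon Tyne"]
index = build_index(places)
print(lookup(index, "york"))
print(lookup(index, "new york"))
```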