This repository has been archived by the owner. It is now read-only.
proof of concept using NLTK for named entity extraction
spec
.gitignore
README
batch.sh
extract_body.rb
ferret_demo.rb
interesting
map_to_places.rb
message_sanitisation.rb
notes
noun_phrases.py

README

gem install ferret
requires an NLTK install, including the corpus used for chunking (need to look up which one again)

head data/msgs_200k.27.csv | ./extract_body.rb | ./noun_phrases.py
see also batch.sh
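
A rough idea of the chunking step, as a minimal sketch only: this is not the contents of noun_phrases.py. The grammar, the function name noun_phrases and the hand-tagged sample are all assumptions made for illustration. It uses nltk.RegexpParser on tokens that are already POS-tagged, so it runs without any downloaded NLTK data; a real run would tag the message body first, which is where the corpus mentioned above comes in.

```python
import nltk

# Hypothetical chunk grammar (an assumption, not taken from this repo):
# an optional determiner, any number of adjectives, then one or more nouns.
GRAMMAR = "NP: {<DT>?<JJ>*<NN.*>+}"


def noun_phrases(tagged_tokens):
    """Return the noun phrases found in a list of (word, POS tag) pairs."""
    tree = nltk.RegexpParser(GRAMMAR).parse(tagged_tokens)
    return [
        " ".join(word for word, _tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
    ]


# Hand-tagged sample sentence, invented for this sketch.
sample = [
    ("the", "DT"),
    ("old", "JJ"),
    ("harbour", "NN"),
    ("in", "IN"),
    ("Sydney", "NNP"),
]

if __name__ == "__main__":
    for phrase in noun_phrases(sample):
        print(phrase)
```

Noun phrases containing proper nouns (NNP) are the likely candidates for named entities; anything beyond that is left to the actual script.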

head -n1000 lp_places.csv > test
run ./map_to_places.rb and enter test queries
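
For a feel of what a query test does, here is a small stand-in written in Python rather than Ruby. It does not use Ferret and is not how map_to_places.rb works internally; it swaps in a plain in-memory inverted index. The function names, the sample place names and the assumption that each entry is just a place name are all hypothetical, since the layout of lp_places.csv is not described here.

```python
from collections import defaultdict


def build_index(place_names):
    """Map each lowercased token to the set of place names containing it."""
    index = defaultdict(set)
    for name in place_names:
        for token in name.lower().split():
            index[token].add(name)
    return index


def lookup(index, query):
    """Return the place names that contain every token of the query, sorted."""
    token_sets = [index.get(token, set()) for token in query.lower().split()]
    if not token_sets:
        return []
    return sorted(set.intersection(*token_sets))


# Invented sample data; the real file's columns are unknown.
places = ["Sydney", "New York", "York"]
index = build_index(places)

if __name__ == "__main__":
    for query in ("york", "new york", "paris"):
        print(query, "->", lookup(index, query))
```

A full-text engine like Ferret adds ranking, stemming and fuzzy matching on top of this kind of token lookup, which is why the real script relies on it.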