No place names detected...why? #32

brekhusr opened this Issue Mar 5, 2013 · 1 comment



brekhusr commented Mar 5, 2013

For my set of documents (878 PDFs of New York Times articles from 1851-1870 that include the word "Missouri"), no locations at all seemed to be picked up by either the mapping applications or the DBpedia annotation. I know there were lots of locations in these texts: they show up in the phrase nets, and I know there were all sorts of articles mentioning sites of various battles, conventions, etc. Can you help me understand what might be happening here?


corajr commented Mar 6, 2013

The Europeana Connect geoparsing service, on which Paper Machines relies, appears to have been taken offline. Before that, Paper Machines used Yahoo! Placemaker, which became a pay-only service in November.

I am not yet aware of any alternatives to these services apart from EDINA's Unlock Text, which requires the texts to be hosted online. Another possibility is to use an offline system such as Pete Warden's geodict; the database would add some 200 megabytes to the download, but could be packaged separately.

There is an older version of Paper Machines floating around that uses the geodict program; this evening I'll try to update this and package it so that the geoparsing option at least exists.
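
For anyone curious what an offline, dictionary-based approach looks like in principle, here is a minimal sketch. It is not geodict's actual code or the Paper Machines integration: the `GAZETTEER` table, its approximate coordinates, and the `find_places` function are all hypothetical names made up for illustration. It just scans the text for the longest word sequences that appear in a local lookup table.

```python
import re

# Hypothetical miniature gazetteer: lowercase name -> approximate (lat, lon).
# A real offline database would hold far more entries.
GAZETTEER = {
    "missouri": (38.5, -92.5),
    "new york": (40.71, -74.01),
    "st. louis": (38.63, -90.20),
    "kansas city": (39.10, -94.58),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return (name, (lat, lon)) pairs for gazetteer names found in text."""
    max_words = max(len(name.split()) for name in gazetteer)
    # Words, optionally followed by a period so "St." survives as a token.
    tokens = re.findall(r"[A-Za-z]+\.?", text)
    found = []
    i = 0
    while i < len(tokens):
        step = 1
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            key = " ".join(tokens[i:i + n]).lower()
            if key not in gazetteer:
                key = key.rstrip(".")  # drop a sentence-final period
            if key in gazetteer:
                found.append((key, gazetteer[key]))
                step = n  # skip past the matched words
                break
        i += step
    return found


sample = "The convention met at St. Louis, Missouri. Delegates from New York attended."
for name, coords in find_places(sample):
    print(name, coords)
```

A plain lookup like this cannot tell a place from a person or ship with the same name, so a real tool would need some disambiguation on top of it.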

corajr closed this May 11, 2013
