No place names detected...why? #32

Closed
brekhusr opened this issue Mar 5, 2013 · 1 comment

Comments

@brekhusr

brekhusr commented Mar 5, 2013

For my set of documents (878 PDFs of New York Times articles from 1851-1870 that include the word "Missouri"), no locations at all seemed to be picked up by either the mapping applications or the DBPedia annotation. I know there were lots of locations in these texts: they show up in the phrase nets, and there were all sorts of articles mentioning sites of various battles, conventions, etc. Can you help me understand what might be happening here?

@corajr
Contributor

corajr commented Mar 6, 2013

The Europeana Connect geoparsing service, on which Paper Machines relies, appears to have been taken offline. Previously, it used Yahoo! Placemaker, which became a pay-only service in November.

I am not yet aware of any alternatives to these services apart from EDINA's Unlock Text, which requires the texts to be hosted online. Another possibility is to use an offline system such as Pete Warden's geodict; the database would add some 200 megabytes to the download, but could be packaged separately.
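
As a rough illustration of what an offline, dictionary-based approach involves, here is a minimal Python sketch. It is not geodict's actual code or database: the three-entry gazetteer, its approximate coordinates, and the function name `find_places` are all made up for the example. A real gazetteer would be far larger (hence the download size mentioned above) and would need disambiguation that this sketch does not attempt.

```python
import re

# Hypothetical toy gazetteer: lowercase place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "missouri": (38.5, -92.5),
    "st. louis": (38.63, -90.20),
    "new york": (40.71, -74.01),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return (name, (lat, lon), offset) for each gazetteer name found in text.

    Matching is case-insensitive and only accepts whole names, so
    "Missourian" does not count as a mention of "Missouri".
    """
    lowered = text.lower()
    hits = []
    for name, coords in gazetteer.items():
        # Require that the name is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for match in re.finditer(pattern, lowered):
            hits.append((name, coords, match.start()))
    # Report mentions in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "Troops left St. Louis for New York."
    for name, coords, offset in find_places(sample):
        print(name, coords, offset)
```

Because the lookup runs entirely on the local machine, it does not depend on a remote service staying online, which is the main appeal of this route.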

There is an older version of Paper Machines floating around that uses the geodict program; this evening I'll try to update this and package it so that the geoparsing option at least exists.

@corajr corajr closed this as completed May 11, 2013