Empty set of entities #2

Closed
waqarhameed opened this Issue Oct 22, 2012 · 5 comments

I have installed Pyner successfully. However, when I run the example, it returns an empty set of entities, as shown below:

$ python
Python 2.7.3 (default, Sep 26 2012, 21:53:58)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import ner
>>> tagger = ner.HttpNER(host='localhost', port=1234)
>>> tagger.get_entities("University of California is located in California, United States")
{}

The command I use to run Stanford NER is:
java -mx1000m -cp stanford-ner.jar edu.stanford.nlp.ie.NERServer -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -port 1234

Owner

dat commented Feb 7, 2013

This is because you're deploying it via the Java server option, which uses a plain socket. You should instead build it as a .war file (see the Makefile) and deploy it as a servlet under Tomcat.

Owner

dat commented Feb 7, 2013

If the socket mode is the one you're interested in running, take a look at the SocketNER class. That is what you're looking for.
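For anyone debugging this, it can help to talk to the socket server directly and look at the raw reply before pyner parses it. The following is only a rough Python 3 sketch, not pyner's actual code. It assumes the NERServer reads one line of text per connection, writes the tagged result, and then closes the connection; the function name tag_over_socket and the defaults are made up for illustration.

```python
import socket


def tag_over_socket(text, host="localhost", port=1234, timeout=10.0):
    """Send one line of text to a line-based tagging server and return the raw reply.

    Assumption: the server handles one request per connection and closes
    the socket once it has written its answer.
    """
    # The assumed protocol is line-based, so newlines inside the text are flattened.
    payload = text.replace("\n", " ").encode("utf-8") + b"\n"
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        # Read until the server closes the connection (recv returns b"").
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8").strip()
```

If the string this returns already contains tags but get_entities still gives {}, the problem is more likely on the parsing side (for example the output format) than in the server.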

dat closed this Feb 13, 2013

I found that this worked with a fresh stanford-ner 3.4 and pyner:

$ cp stanford-ner.jar stanford-ner-with-classifier.jar
$ jar -uf stanford-ner-with-classifier.jar classifiers/english.all.3class.distsim.crf.ser.gz
$ java -mx500m -cp stanford-ner-with-classifier.jar edu.stanford.nlp.ie.NERServer -port 2020 \
    -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz &

Then define the tagger:

tagger = ner.SocketNER(host='localhost', port=2020, output_format='slashTags')

If I leave out output_format, or try another format, I get {} as output. What get_entities actually returns, however, is not a slashTags string but a more informative dict:

tagger.get_entities(text)
{u'ORGANIZATION': [u'UNIVERSITY OF CALIFORNIA'], u'LOCATION': [u'CALIFORNIA', u'UNITED STATES'], u'O': [u'IS LOCATED IN', u',']}

Incidentally, I also discovered that pyner works fine with stanford-pos too, although none of the custom formatting of the result string works there. Just use tag_text for raw output and you effectively have a "pypos".

I think this issue should be reopened. Even with SocketNER I ran into the same problem, but once I set output_format to "slashTags" everything worked.

Without specifying output_format:

tagger = ner.SocketNER(host='localhost',port=9191)
tagger.get_entities("University of California is located in California, United States")

Returned:

{}

With output_format specified:

tagger = ner.SocketNER(host='localhost',port=9191, output_format='slashTags')
tagger.get_entities("University of California is located in California, United States")

Returned:

{u'LOCATION': [u'California', u'United States'],
 u'O': [u'is located in', u','],
 u'ORGANIZATION': [u'University of California']}
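
To make the relationship between slashTags text and the dict above concrete, here is a small standalone sketch of how "word/TAG" tokens could be grouped into phrases per tag. It is not pyner's implementation; group_slash_tags is a hypothetical helper, and the sample string is hand-written rather than captured from a real server.

```python
from itertools import groupby


def group_slash_tags(tagged_text):
    """Group a slashTags string such as "Foo/ORG bar/O" into {tag: [phrase, ...]}."""
    # Each whitespace-separated token is expected to look like "word/TAG".
    # Split on the last slash so words containing "/" keep their text;
    # tokens without any slash are skipped.
    pairs = [tok.rsplit("/", 1) for tok in tagged_text.split() if "/" in tok]
    entities = {}
    # Consecutive tokens that share a tag are joined into a single phrase.
    for tag, run in groupby(pairs, key=lambda pair: pair[1]):
        phrase = " ".join(word for word, _ in run)
        entities.setdefault(tag, []).append(phrase)
    return entities


# Hand-written sample (an assumption about what the tagged text looks like).
sample = (
    "University/ORGANIZATION of/ORGANIZATION California/ORGANIZATION "
    "is/O located/O in/O California/LOCATION ,/O "
    "United/LOCATION States/LOCATION"
)
print(group_slash_tags(sample))
```

With this sample, the grouping gives the same three keys and phrases as the dict returned above.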

Either way, I appreciate the work you're doing, dat. Thank you so much for all that you do.
