This repository has been archived by the owner on Dec 12, 2018. It is now read-only.

NER now returns the index corresponding to the sentence each named entity was found in.
Diane M. Napolitano committed Jan 28, 2014
1 parent be489fb commit d7f63b5
Showing 8 changed files with 148 additions and 31 deletions.
1 change: 1 addition & 0 deletions README_ner.md
@@ -9,6 +9,7 @@ The core return type here is a data structure called `NamedEntity` which has fou
* `tag`: A string containing the tag assigned to this named entity (PERSON, LOCATION, etc.). Should always be upper-case.
* `startOffset`: All named entities exist in some sentence. This integer represents the starting character offset of this named entity in its sentence.
* `endOffset`: Like `startOffset`, only tells you the character offset of the last character of the named entity in its sentence.
+* `sentence_num`: An `integer` referring to the index (starting from 0) into the original `list` of sentences provided to Stanford NER, showing in which sentence this named entity occurred.

In order to get these `NamedEntity` objects, you have three choices, depending on what kind of data you'd like to recognize named entities in. The return type for ALL of these is a Java `ArrayList`/Python list containing `NamedEntity` objects corresponding to entities recognized across the ENTIRETY of your text, no matter how many sentences, parse trees, etc. were passed in. If you'd like to recognize named entities in:

3 changes: 2 additions & 1 deletion corenlp.thrift
@@ -12,7 +12,8 @@ struct NamedEntity
1:string entity,
2:string tag,
3:i32 startOffset,
-4:i32 endOffset
+4:i32 endOffset,
+5:i32 sentence_number
}
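
The new field makes it possible to map each entity back to the sentence it came from. Below is a minimal Python sketch of that lookup, not code from this repository. The `NamedEntity` dataclass is a hypothetical stand-in for the Thrift-generated class, `group_entities_by_sentence` is an illustrative helper, and the sample data is made up. The README hunk calls the field `sentence_num` while `corenlp.thrift` declares `sentence_number`; the sketch assumes the Thrift name is the one exposed to clients.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NamedEntity:
    """Hypothetical stand-in for the generated class; fields mirror corenlp.thrift."""
    entity: str
    tag: str
    startOffset: int
    endOffset: int
    sentence_number: int  # 0-based index into the list of input sentences


def group_entities_by_sentence(sentences, entities):
    """Pair every input sentence with the entities whose sentence_number points at it."""
    by_index = defaultdict(list)
    for ne in entities:
        by_index[ne.sentence_number].append(ne)
    # Sentences with no entities get an empty list.
    return [(sentence, by_index.get(i, [])) for i, sentence in enumerate(sentences)]


# Illustrative data only, not real server output. Offsets follow the README
# wording (endOffset = offset of the entity's last character in its sentence).
sentences = ["Alice visited Paris.", "It rained all day."]
entities = [
    NamedEntity("Alice", "PERSON", 0, 4, 0),
    NamedEntity("Paris", "LOCATION", 14, 18, 0),
]

grouped = group_entities_by_sentence(sentences, entities)
for sentence, found in grouped:
    print(sentence, [(ne.entity, ne.tag) for ne in found])
```

Because the result list is flat across the entire input, grouping on the index like this is likely the simplest way to recover per-sentence results without re-running NER one sentence at a time.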

struct TaggedToken
102 changes: 98 additions & 4 deletions gen-java/CoreNLP/NamedEntity.java

Some generated files are not rendered by default.

15 changes: 14 additions & 1 deletion gen-py/corenlp/ttypes.py

Some generated files are not rendered by default.
