Skip to content

Implementation of Hobbs' algorithm for coreference resolution in python

License

Notifications You must be signed in to change notification settings

ArunkumarRamanan/hobbs

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

hobbs.py


Enter python hobbs.py demo to see a demo.

Otherwise, enter a file containing parsed sentences to search, with one sentence per line, and then enter the pronoun you'd like to resolve, e.g.,

python hobbs.py demosents.txt "He"
python hobbs.py demosents.txt "it"
python hobbs.py demorefl.txt "herself"

The sentences must use Treebank tags and be parsed such that they can be converted into NLTK Trees. The pronoun must be in the last sentence of the file.

This program uses Hobbs’ algorithm to find the antecedent of a pronoun. It has also been expanded to handle reflexive pronouns. The algorithm is given below:

  1. Begin at the NP node immediately dominating the pronoun
  2. Go up the tree to the first NP or S node encountered. Call this node X and the path used to reach it p.
  3. Traverse all branches below node X to the left of path p in a left-to-right, breadth-first fashion. Propose as an antecedent any NP node that is encountered which has an NP or S node between it and X.
  4. If node X is the highest S node in the sentence, traverse the surface parse trees of previous sentences in the text in order of recency, the most recent first; each tree is traversed in a left-to-right, breadth-first manner, and when an NP node is encountered, it is proposed as an antecedent. If X is not the highest S node in the sentence, continue to step 5.
  5. From node X, go up the tree to the first NP or S node encountered. Call this new node X, and call the path traversed to reach it p.
  6. If X is an NP node and if the path p to X did not pass through the Nominal node that X immediately dominates, propose X as the antecedent.
  7. Traverse all branches below node X to the left of path p in a left-to-right, breadth-first manner. Propose any NP node encountered as the antecedent.
  8. If X is an S node, traverse all the branches of node X to the right of path p in a left-to-right, breadth-first manner, but do not go below any NP or S node encountered. Propose any NP node encountered as the antecedent.
  9. Go to step 4.

About

Implementation of Hobbs' algorithm for coreference resolution in python

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%