# Coreference Resolution for Input Text

Coreference resolution is the task of finding all expressions in a text that refer to the same entity and, optionally, replacing them with a single representative mention. Two well-known ways to do this in Python are NeuralCoref, a spaCy pipeline extension, and AllenNLP's coreference resolution model.
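
To make the "find and replace" idea concrete before bringing in a model, here is a minimal sketch that works on hand-annotated clusters. The sentence, the cluster spans, and the `resolve` helper are all made up for illustration; neither library is involved.

```python
# Toy illustration with hand-written clusters (no model involved).
tokens = "Alice dropped her keys because she was late .".split()

# Each cluster lists inclusive (start, end) token spans that refer to one entity.
# These spans were annotated by hand for this example.
clusters = [[(0, 0), (2, 2), (5, 5)]]  # "Alice", "her", "she"


def resolve(tokens, clusters):
    """Replace every later mention in a cluster with the cluster's first mention."""
    replacement = {}
    for cluster in clusters:
        first_start, first_end = cluster[0]
        head = tokens[first_start:first_end + 1]
        for start, end in cluster[1:]:
            replacement[start] = (end, head)

    out, i = [], 0
    while i < len(tokens):
        if i in replacement:
            end, head = replacement[i]
            out.extend(head)  # emit the representative mention
            i = end + 1       # skip over the replaced span
        else:
            out.append(tokens[i])
            i += 1
    return out


resolved = resolve(tokens, clusters)
print(" ".join(resolved))
```

Note that this naive substitution ignores grammar: "her keys" becomes "Alice keys" rather than "Alice's keys". Handling possessives and nested mentions is part of what a real resolver has to add on top.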

## Using spaCy - neuralcoref

In [None]:
!python -m spacy download en_core_web_sm

In [None]:
import spacy
import neuralcoref

nlp = spacy.load('en_core_web_sm')  # load the model
neuralcoref.add_to_pipe(nlp)

text = "Joseph Robinette Biden Jr. is an American politician who is the 46th and \
current president of the United States. A member of the Democratic Party, \
he served as the 47th vice president from 2009 to 2017 under Barack Obama and \
represented Delaware in the United States Senate from 1973 to 2009."

doc = nlp(text)  # run the pipeline to get a spaCy Doc (composed of Tokens)

print(doc._.coref_clusters)  # clusters of mentions that refer to the same entity


In [None]:
print(doc._.coref_resolved)  # text with each mention replaced by its cluster's main mention


## Using AllenNLP - allennlp-models

In [None]:
%pip install allennlp allennlp-models


In [None]:
from allennlp.predictors.predictor import Predictor

model_url = "https://storage.googleapis.com/allennlp-public-models/coref-spanbert-large-2020.02.27.tar.gz"
predictor = Predictor.from_path(model_url)

In [None]:
text = "Link awakens from a deep slumber and a mysterious voice guides him to discover what has become of the ruined country of Hyrule Kingdom. Link leaves the Shrine of Resurrection and looks out at Hyrule on top of the Great Plateau. Link then meets an Old Man by a campfire. The Old Man promises Link his Paraglider, which is the only way to get down from the plateau. However, he first wants Spirit Orbs from nearby Shrines, in particular the Oman Au Shrine, Ja Baij Shrine, Owa Daim Shrine, and the Keh Namut Shrine. After Link gets the spirit orbs, the Old Man appears, then mysteriously disappears, telling Link to meet him in the Temple of Time. The Old Man reveals himself as the spirit of the deceased King of Hyrule, King Rhoam. Link learns from King Rhoam that 100 years prior, a great evil known as the Calamity Ganon rose up and laid waste to the kingdom and its people. Unable to be defeated, it was sealed within Hyrule Castle, while the ruins of the land were ravaged by nature over time. Although trapped, the Calamity Ganon has grown in power, and Link must defeat it before it breaks free once more and destroys the world. The mysterious voice turns out to be Zelda, who is the daughter of King Rhoam."

_text = "After escaping the confines of the plateau, Link is directed to meet the wise Sheikah elder Impa, and learn about the Guardians and Divine Beasts: 10,000 years prior these machines were created and successfully used by another Hero and another Princess to defeat the Calamity Ganon. But throughout the ages, knowledge about the ancient technology was lost until excavations in Hyrule Kingdom brought them to light once more, coinciding with the expected return of Calamity Ganon a hundred years ago. The Guardians were reactivated and four Champions were chosen to control the Divine Beasts: the Zora princess Mipha, the Goron warrior Daruk, the Gerudo chief Urbosa, and the Rito archer Revali. All the while, Zelda was unsuccessfully trying to gain access to her own prophesied powers, accompanied on her quests by her knight, the Hylian Champion Link. When the Calamity Ganon ultimately attacked, it devastated the Kingdom of Hyrule Kingdom by taking control of the ancient machines and turning them against the Hyruleans. As a last resort, Zelda was able to place the gravely wounded Link in the Shrine of Resurrection and use her awoken sealing powers to trap herself with Calamity Ganon in Hyrule Castle."

_text = "As Link sets off on his quest to defeat Calamity Ganon, he is asked to investigate the fate of the Divine Beasts and their former Champions. His ultimate goal remains to reach the Calamity Ganon and free the trapped Zelda before the whole world is laid to waste. But with the entire Kingdom of Hyrule before him to explore, it is up to Link himself to decide how he wishes to fulfill his foretold role as the Hylian Champion, and to save Hyrule Kingdom."

prediction = predictor.predict(document=text)  # get prediction
print("Clusters:")
for cluster in prediction['clusters']:
    print(cluster)  # each cluster is a list of [start, end] spans (inclusive indices of spaCy tokens)

Clusters:
[[0, 0], [11, 11], [25, 25], [43, 43], [57, 57], [106, 106], [122, 122], [150, 150], [217, 217]]
[[18, 23], [35, 35], [145, 145], [174, 175], [177, 177]]
[[46, 48], [53, 55], [58, 58], [75, 75], [112, 114], [125, 125], [132, 134], [136, 136]]
[[39, 41], [70, 71]]
[[78, 103], [108, 110]]
[[141, 148], [153, 154], [246, 247]]
[[160, 167], [185, 185], [208, 210], [220, 220], [222, 222]]
[[7, 9], [232, 234]]


In [None]:
print(prediction['document'])

['Link', 'awakens', 'from', 'a', 'deep', 'slumber', 'and', 'a', 'mysterious', 'voice', 'guides', 'him', 'to', 'discover', 'what', 'has', 'become', 'of', 'the', 'ruined', 'country', 'of', 'Hyrule', 'Kingdom', '.', 'Link', 'leaves', 'the', 'Shrine', 'of', 'Resurrection', 'and', 'looks', 'out', 'at', 'Hyrule', 'on', 'top', 'of', 'the', 'Great', 'Plateau', '.', 'Link', 'then', 'meets', 'an', 'Old', 'Man', 'by', 'a', 'campfire', '.', 'The', 'Old', 'Man', 'promises', 'Link', 'his', 'Paraglider', ',', 'which', 'is', 'the', 'only', 'way', 'to', 'get', 'down', 'from', 'the', 'plateau', '.', 'However', ',', 'he', 'first', 'wants', 'Spirit', 'Orbs', 'from', 'nearby', 'Shrines', ',', 'in', 'particular', 'the', 'Oman', 'Au', 'Shrine', ',', 'Ja', 'Baij', 'Shrine', ',', 'Owa', 'Daim', 'Shrine', ',', 'and', 'the', 'Keh', 'Namut', 'Shrine', '.', 'After', 'Link', 'gets', 'the', 'spirit', 'orbs', ',', 'the', 'Old', 'Man', 'appears', ',', 'then', 'mysteriously', 'disappears', ',', 'telling', 'Link', 'to',
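
The clusters are easier to read once the index pairs are mapped back to words. The sketch below assumes each pair is an inclusive `[start, end]` span into `prediction['document']`, which matches the output shown here (span `[18, 23]` covers "the ruined country of Hyrule Kingdom"). The `clusters_to_mentions` helper is hypothetical, not part of AllenNLP, and the sample data is only the first 26 tokens of the document together with the spans that fall inside them.

```python
def clusters_to_mentions(tokens, clusters):
    """Map each cluster of inclusive [start, end] token spans to its mention strings."""
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


# First 26 tokens of the tokenized document printed by the notebook.
tokens = ['Link', 'awakens', 'from', 'a', 'deep', 'slumber', 'and', 'a',
          'mysterious', 'voice', 'guides', 'him', 'to', 'discover', 'what',
          'has', 'become', 'of', 'the', 'ruined', 'country', 'of', 'Hyrule',
          'Kingdom', '.', 'Link']

# Spans from the printed clusters, truncated to those within the tokens above.
clusters = [[[0, 0], [11, 11], [25, 25]], [[18, 23]], [[7, 9]]]

mentions = clusters_to_mentions(tokens, clusters)
for cluster_mentions in mentions:
    print(cluster_mentions)
```

With the full prediction, the same call would be `clusters_to_mentions(prediction['document'], prediction['clusters'])`.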

In [None]:
print('Coref resolved: ', predictor.coref_resolved(text))  # resolved text

Coref resolved:  Link awakens from a deep slumber and a mysterious voice guides Link to discover what has become of the ruined country of Hyrule Kingdom. Link leaves the Shrine of Resurrection and looks out at the ruined country of Hyrule Kingdom on top of the Great Plateau. Link then meets an Old Man by a campfire. an Old Man promises Link an Old Man's Paraglider, which is the only way to get down from the Great Plateau. However, an Old Man first wants Spirit Orbs from nearby Shrines, in particular the Oman Au Shrine, Ja Baij Shrine, Owa Daim Shrine, and the Keh Namut Shrine. After Link gets Spirit Orbs from nearby Shrines, in particular the Oman Au Shrine, Ja Baij Shrine, Owa Daim Shrine, and the Keh Namut Shrine, an Old Man appears, then mysteriously disappears, telling Link to meet an Old Man in the Temple of Time. an Old Man reveals an Old Man as the spirit of the deceased King of the ruined country of Hyrule Kingdom, King Rhoam. Link learns from the deceased King of Hyrule, King 