# What is coreference resolution?
Coreference Resolution:
Coreference resolution (CR) is the task of finding all linguistic expressions (called mentions) in a given text that refer to the same real-world entity. After finding and grouping these mentions we can resolve them by replacing, as stated above, pronouns with noun phrases.
Example:
“I voted for Imran Khan because he was most aligned with my values”, Usama said.
“Usama voted for Imran Khan because Imran Khan was most aligned with Usama’s values”, Usama said.
Coreference resolution is an exceptionally versatile tool and can be applied to a variety of NLP tasks such as text understanding, information extraction, machine translation, sentiment analysis, or document summarization. It is a great way to obtain unambiguous sentences which can be much more easily understood by computers.

How it works?
Example:
The first step in order to apply coreference resolution is to decide whether we would like to work with single words/tokens or spans. 
But what exactly is a span? It’s most often the case that what we want to swap or what we are swapping for is not a single word but multiple adjacent tokens. Therefore, span is a whole expression. Another name for it you may come across is a mention. They are often used interchangeably.
Before Coreference Resolution: (English)
“Ali Ahmed, Patient with Hepatitis, fallen from 6th floor, as the 68-year-old became the weak day by day. It is widely known that he is one of the richest people in the town.”
Here we have spans like “he” that have only a single token in them, but we also see the span “Ali Ahmed, Patient with Hepatitis” consisting of six consecutive words.
As a result, we obtained a text without any pronouns while still being valid grammatically and semantically. 
After Coreference Resolution:
“Ali Ahmed, a Patient with Hepatitis, fell from the 6th floor, as Ali Ahmed, a Patient of Hepatitis became weak day by day. It is widely known that Ali Ahmed, a Patient with Hepatitis is one of the richest people in the town.”

Before Coreference Resolution: (Urdu)

ہیپاٹائٹس کا مریض علی احمد چھٹی منزل سے گر گیا، 68 سالہ بوڑھا دن بدن کمزور ہوتا جا رہا تھا۔ یہ بڑے پیمانے پر جانا جاتا ہے کہ وہ شہر کے امیر ترین لوگوں میں سے ایک ہے۔

After Coreference Resolution:
ہیپاٹائٹس کا مریض علی احمد چھٹی منزل سے گر گیا، علی احمد ہیپاٹائٹس کا مریض دن بدن کمزور ہوتا گیا۔ یہ بات مشہور ہے کہ ہیپاٹائٹس کا مریض علی احمد شہر کے امیر ترین لوگوں میں سے ایک ہے۔



In [3]:
# Import necessary libraries
import spacy
import neuralcoref
# import el_core_news_md
# nlp = el_core_news_md.load()
# Load the model
nlp = spacy.load('en_core_web_sm')  
neuralcoref.add_to_pipe(nlp)
text=""
# Sample text to use to do coreference resolution
with open('/content/data.txt') as f:
    f = f.readlines()
for word in f:
  text=word+text
# print("mmmm",text)
sentences=text.split("\n")
for sentence in sentences:
  doc = nlp(sentence) 
  if len(doc._.coref_clusters) >0:
      print(doc._.coref_clusters)




[They: [They, them, they, They]]
[my eyes: [my eyes, my eyes], they: [they, they, they]]
[Our: [Our, we]]
[Your reports: [Your reports, these reports]]
[the doctor: [the doctor, The doctor]]
[the next two hours: [the next two hours, he, he]]
[He: [He, his], his card: [his card, it], her: [her, She]]
[the doctor: [the doctor, He]]
[The doctor: [The doctor, his], his letterhead: [his letterhead, it]]
[She: [She, her]]
[doctor: [doctor, him]]
[radiation therapy: [radiation therapy, It]]
[We: [We, we], Radiation therapy: [Radiation therapy, it]]
[The cancer: [The cancer, it]]
[The doctor: [The doctor, his, his, the doctor], his observations on his letterhead: [his observations on his letterhead, it], the attendant: [the attendant, The attendant]]
[few tests: [few tests, these tests]]
[we: [we, We, We’ve], the reports: [the reports, the reports]]
[some snacks on the roadside eatery: [some snacks on the roadside eatery, It]]
[last night: [last night, the night]]
[how many times in the day yo