<h1 style="text-align: center;">Named Entity Recognition</h1>

<p style="text-align: center;">Named entity recognition (NER) is a natural language processing (NLP) task that identifies spans of text naming real-world entities, such as people, organizations, dates, and quantities, and assigns each span a category label.</p>


In [1]:
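import spacy                  # NLP pipeline that provides the NER component
import pandas as pd           # used below to tabulate the entities
from spacy import displacy    # spaCy's built-in entity visualizer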

In [2]:
nlp = spacy.load("en_core_web_sm")
nlp

<spacy.lang.en.English at 0x1815712ff10>

In [3]:
content = """In 1969, Neil Armstrong became the first person to walk on the moon during NASA's Apollo 11 mission. 
The historic event was watched by millions of people around the world. Armstrong's famous words, 
"That's one small step for man, one giant leap for mankind," are etched in history books.
"""

In [4]:
doc = nlp(content)

print(f"Document: {doc}")
print(f"Named Entities: {doc.ents}")

Document: In 1969, Neil Armstrong became the first person to walk on the moon during NASA's Apollo 11 mission. 
The historic event was watched by millions of people around the world. Armstrong's famous words, 
"That's one small step for man, one giant leap for mankind," are etched in history books.

Named Entities: (1969, Neil Armstrong, first, NASA, Apollo 11, millions, Armstrong, one, one)


In [5]:
for ent in doc.ents:
    print(f"{ent.text:<20} {ent.start_char:<5} {ent.end_char:<5} {ent.label_}")

1969                 3     7     DATE
Neil Armstrong       9     23    PERSON
first                35    40    ORDINAL
NASA                 75    79    ORG
Apollo 11            82    91    LAW
millions             136   144   CARDINAL
Armstrong            173   182   PERSON
one                  208   211   CARDINAL
one                  232   235   CARDINAL
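The two numeric columns above are `start_char` and `end_char`, which are character offsets into the original string, with the end offset exclusive. As a minimal sketch that does not need spaCy installed, the snippet below slices the first sentence of the sample text with the offsets copied from the output and recovers the same entity strings. The names `text`, `spans`, and `recovered` are illustrative only and do not appear in the notebook.

```python
# First sentence of the sample text, copied from the cell above.
text = (
    "In 1969, Neil Armstrong became the first person to walk on the moon "
    "during NASA's Apollo 11 mission."
)

# (start_char, end_char, label) triples copied from the printed output.
spans = [
    (3, 7, "DATE"),
    (9, 23, "PERSON"),
    (35, 40, "ORDINAL"),
    (75, 79, "ORG"),
    (82, 91, "LAW"),
]

# Slicing with [start:end] reproduces each entity's surface text,
# because end_char points one past the last character.
recovered = [(text[start:end], label) for start, end, label in spans]

for surface, label in recovered:
    print(f"{surface:<20} {label}")
```

Because the offsets refer to the raw string rather than to tokens, they can be used to highlight or replace entities in the source text directly.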


In [6]:
displacy.render(doc, style="ent")

In [7]:
entities = [(ent.text, ent.label_, ent.lemma_) for ent in doc.ents]

df = pd.DataFrame(entities, columns=['text', 'type', 'lemma'])
df

             text      type           lemma
0            1969      DATE            1969
1  Neil Armstrong    PERSON  Neil Armstrong
2           first   ORDINAL           first
3            NASA       ORG            NASA
4       Apollo 11       LAW       Apollo 11
5        millions  CARDINAL         million
6       Armstrong    PERSON       Armstrong
7             one  CARDINAL             one
8             one  CARDINAL             one
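Once the entities are in tabular form, grouping them by label makes the model's decisions easier to review. For example, "Apollo 11" is tagged `LAW` in the output above, which looks like a misclassification by the small model rather than a meaningful label. The sketch below uses only the standard library and the (text, label) pairs copied from the table; `group_by_label` is a hypothetical helper, not part of spaCy or pandas.

```python
from collections import defaultdict

# (text, label) pairs copied from the table above.
entities = [
    ("1969", "DATE"),
    ("Neil Armstrong", "PERSON"),
    ("first", "ORDINAL"),
    ("NASA", "ORG"),
    ("Apollo 11", "LAW"),
    ("millions", "CARDINAL"),
    ("Armstrong", "PERSON"),
    ("one", "CARDINAL"),
    ("one", "CARDINAL"),
]


def group_by_label(pairs):
    """Map each label to its distinct entity texts, in order of first appearance."""
    grouped = defaultdict(list)
    for text, label in pairs:
        if text not in grouped[label]:  # skip repeated mentions such as "one"
            grouped[label].append(text)
    return dict(grouped)


by_label = group_by_label(entities)

for label, texts in by_label.items():
    print(f"{label:<10} {texts}")
```

With the notebook's own objects, the same pairs could be built from `doc.ents` as `[(ent.text, ent.label_) for ent in doc.ents]`.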
