# Named Entity Recognition
Named Entity Recognition (NER) is a natural language processing (NLP) task that identifies named entities in text and classifies them into predefined categories such as people, organizations, locations, dates, and monetary values. It extracts structured information from unstructured text and is a core component of NLP applications such as information retrieval, question answering, and entity linking.

### Using NLTK

In [None]:
corpus = """Apple Inc. is headquartered in Cupertino, California.
The company was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in 1976.
It is known for its iPhone, iPad, Mac, and other consumer electronics products.
Google LLC is an American multinational technology company that specializes in Internet-related services and products."""

In [None]:
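import nltk

# Resources needed by word_tokenize, pos_tag, and ne_chunk below.
nltk.download('punkt')                       # tokenizer models
nltk.download('averaged_perceptron_tagger')  # part-of-speech tagger
nltk.download('maxent_ne_chunker')           # named-entity chunker
nltk.download('words')                       # word list used by the chunker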

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data]   Package words is already up-to-date!


True

In [None]:
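# Split the text into word and punctuation tokens.
words = nltk.word_tokenize(corpus)

# Tag each token with its part of speech.
pos_tags = nltk.pos_tag(words)

# Group the tagged tokens into a tree whose labeled subtrees are named entities.
chunks = nltk.ne_chunk(pos_tags)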

In [None]:
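named_entities = []
for chunk in chunks:
    # Entity subtrees have a label; plain (word, tag) tuples do not.
    if hasattr(chunk, 'label'):
        # Join the subtree's tokens back into the entity text.
        named_entities.append((chunk.label(), ' '.join(c[0] for c in chunk)))

print("Named Entities using NLTK:")
print(named_entities)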

Named Entities using NLTK:
[('PERSON', 'Apple'), ('ORGANIZATION', 'Inc.'), ('GPE', 'Cupertino'), ('GPE', 'California'), ('PERSON', 'Steve Jobs'), ('PERSON', 'Steve Wozniak'), ('PERSON', 'Ronald Wayne'), ('ORGANIZATION', 'iPhone'), ('ORGANIZATION', 'iPad'), ('PERSON', 'Mac'), ('PERSON', 'Google LLC'), ('GPE', 'American')]
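
Grouping the entities by label makes the output above easier to inspect. The following is a minimal sketch, not part of the original pipeline: it copies the printed list into a literal so it runs without NLTK, and the name `entities_by_label` is introduced here for illustration only.

```python
from collections import defaultdict

# Copied from the NLTK output above so the sketch runs on its own.
named_entities = [
    ('PERSON', 'Apple'), ('ORGANIZATION', 'Inc.'), ('GPE', 'Cupertino'),
    ('GPE', 'California'), ('PERSON', 'Steve Jobs'), ('PERSON', 'Steve Wozniak'),
    ('PERSON', 'Ronald Wayne'), ('ORGANIZATION', 'iPhone'), ('ORGANIZATION', 'iPad'),
    ('PERSON', 'Mac'), ('PERSON', 'Google LLC'), ('GPE', 'American'),
]

# Collect entity texts under each label, preserving document order.
entities_by_label = defaultdict(list)
for label, text in named_entities:
    entities_by_label[label].append(text)

for label, texts in entities_by_label.items():
    print(f"{label}: {texts}")
```

Viewed this way, the questionable labels stand out: 'Apple' and 'Google LLC' both land under PERSON, and 'Apple Inc.' has been split into two separate entities.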


In [None]:
# Opens the entity tree in a separate Tkinter window, so it needs a local display.
chunks.draw()

### Using spaCy

In [None]:
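import spacy

# Load the small English pipeline (install it first with:
# python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")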

In [None]:
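# Run the full pipeline (tokenizer, tagger, parser, NER) over the text.
doc = nlp(corpus)

# doc.ents holds the recognized entity spans; keep each span's text and label.
named_entities = [(ent.text, ent.label_) for ent in doc.ents]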

In [None]:
print("Named Entities using spaCy:")
print(named_entities)

Named Entities using spaCy:
[('Apple Inc.', 'ORG'), ('Cupertino', 'GPE'), ('California', 'GPE'), ('Steve Jobs', 'PERSON'), ('Steve Wozniak', 'PERSON'), ('Ronald Wayne', 'PERSON'), ('1976', 'DATE'), ('iPhone', 'ORG'), ('iPad', 'ORG'), ('Mac', 'PERSON'), ('Google LLC', 'ORG'), ('American', 'NORP')]
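
To see where the two libraries agree and differ on this corpus, the printed results can be compared directly. This is a rough sketch under a few assumptions: the lists are copied from the outputs above, the names `nltk_entities`, `spacy_entities`, and `LABEL_MAP` are introduced here for illustration, and mapping NLTK's `ORGANIZATION` onto spaCy's `ORG` is a convenience for the comparison rather than an official correspondence. Note that the NLTK tuples are `(label, text)` while the spaCy tuples are `(text, label)`.

```python
# Copied from the NLTK and spaCy outputs above.
nltk_entities = [
    ('PERSON', 'Apple'), ('ORGANIZATION', 'Inc.'), ('GPE', 'Cupertino'),
    ('GPE', 'California'), ('PERSON', 'Steve Jobs'), ('PERSON', 'Steve Wozniak'),
    ('PERSON', 'Ronald Wayne'), ('ORGANIZATION', 'iPhone'), ('ORGANIZATION', 'iPad'),
    ('PERSON', 'Mac'), ('PERSON', 'Google LLC'), ('GPE', 'American'),
]
spacy_entities = [
    ('Apple Inc.', 'ORG'), ('Cupertino', 'GPE'), ('California', 'GPE'),
    ('Steve Jobs', 'PERSON'), ('Steve Wozniak', 'PERSON'), ('Ronald Wayne', 'PERSON'),
    ('1976', 'DATE'), ('iPhone', 'ORG'), ('iPad', 'ORG'), ('Mac', 'PERSON'),
    ('Google LLC', 'ORG'), ('American', 'NORP'),
]

# Assumption: treat NLTK's ORGANIZATION and spaCy's ORG as the same category.
LABEL_MAP = {'ORGANIZATION': 'ORG'}

# Index both results by entity text, using one label vocabulary.
nltk_by_text = {text: LABEL_MAP.get(label, label) for label, text in nltk_entities}
spacy_by_text = dict(spacy_entities)

shared = set(nltk_by_text) & set(spacy_by_text)
agree = sorted(t for t in shared if nltk_by_text[t] == spacy_by_text[t])
disagree = sorted(t for t in shared if nltk_by_text[t] != spacy_by_text[t])
only_nltk = sorted(set(nltk_by_text) - set(spacy_by_text))
only_spacy = sorted(set(spacy_by_text) - set(nltk_by_text))

print("Same text, same label:     ", agree)
print("Same text, different label:", disagree)
print("Found only by NLTK:        ", only_nltk)
print("Found only by spaCy:       ", only_spacy)
```

On this corpus the two libraries disagree on 'Google LLC' and 'American'. spaCy keeps 'Apple Inc.' as a single span and also picks up the date '1976', while NLTK splits the company name into 'Apple' and 'Inc.'. Both label 'Mac' as PERSON, so agreement between the two is not evidence that a label is correct.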
