# Using NER from spaCy

Install spaCy and download the large English model (`en_core_web_lg`):

In [None]:
!pip install spacy
!python -m spacy download en_core_web_lg

Collecting en-core-web-lg==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.1/en_core_web_lg-3.7.1-py3-none-any.whl (587.7 MB)
Installing collected packages: en-core-web-lg
Successfully installed en-core-web-lg-3.7.1
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_lg')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


The code below uses the spaCy library to perform Named Entity Recognition (NER), a Natural Language Processing (NLP) task that identifies and classifies key entities in a text, such as people, organizations, dates, and monetary amounts.

In [None]:
import spacy

# Load the large English spaCy model.
# The pre-trained pipeline handles tokenization, part-of-speech tagging,
# and Named Entity Recognition (NER), among other NLP tasks.
nlp = spacy.load("en_core_web_lg")

# Sample text
# The sentence mentions a company (Apple), a date (Tuesday), and a monetary value ($75 billion).
text_from_fig = "On Tuesday, Apple announced its plans for another major chunk of the money: It will buy back a further $75 billion in stock."

# Process the text with the model.
# The result (doc) is a spaCy Doc object holding the tokenized text and its
# linguistic annotations (e.g., POS tags, entities).
doc = nlp(text_from_fig)

# doc.ents contains the entities spaCy recognized in the text.
# For each entity, print its text (ent.text) and its label (ent.label_).
for ent in doc.ents:
    print(ent.text, "\t", ent.label_)


Tuesday 	 DATE
Apple 	 ORG
$75 billion 	 MONEY


**Output:**

Tuesday is recognized as a DATE. \
Apple is recognized as an ORG (organization).\
$75 billion is recognized as MONEY.
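
Once entities are extracted, a common next step is to collect them by label. The sketch below is one possible way to do that, not part of spaCy itself: `group_by_label` and `sample_entities` are hypothetical names introduced here, and the sample pairs are copied from the output above rather than produced by the model, so the block runs without downloading `en_core_web_lg`.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group (text, label) pairs into a dict mapping label -> list of texts."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


# Assumption: these pairs are copied from the output shown above instead of
# being produced by the model, so the sketch runs without en_core_web_lg.
# With a processed doc, pass ((ent.text, ent.label_) for ent in doc.ents).
sample_entities = [
    ("Tuesday", "DATE"),
    ("Apple", "ORG"),
    ("$75 billion", "MONEY"),
]

grouped = group_by_label(sample_entities)
for label, texts in grouped.items():
    print(label, "\t", texts)
```

Keeping the helper independent of spaCy objects means it accepts any iterable of `(text, label)` pairs, which makes it easy to try on hand-written samples before running the full pipeline.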
