# NER using spaCy

spaCy is widely regarded as one of the fastest natural language processing (NLP) libraries for Python. It achieves this with optimized, largely Cython-based components for each task in its processing pipeline. It is also easy to learn and use: simple tasks take only a few lines of code.

## Installation

In [2]:
pip install spacy


Note: you may need to restart the kernel to use updated packages.
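
After installing, it can be useful to confirm that the package is visible to the current kernel. The sketch below uses only the Python standard library; `installed_version` is a hypothetical helper name introduced here for illustration, not part of spaCy.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether spaCy can be found in the current environment
version = installed_version("spacy")
print("spacy", version if version else "not installed")
```

If this prints "not installed" right after a successful `pip install`, restarting the kernel is usually the first thing to try.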

## Code for NER using spaCy

In [7]:
# en_core_web_sm is a small pre-trained English pipeline, distributed as a model package for spaCy.

In [6]:
import spacy

# Download the model
spacy.cli.download("en_core_web_sm")

# Load the model
nlp = spacy.load("en_core_web_sm")

sentence = "Apple is looking at buying U.K. startup for $1 billion"

doc = nlp(sentence)

for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)


✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
Apple 0 5 ORG
U.K. 27 31 GPE
$1 billion 44 54 MONEY


The first column of the output is the entity text. The next two columns are the character offsets at which the entity starts and ends in the sentence or document (the end offset is exclusive). The last column is the entity label, such as ORG (organization), GPE (geopolitical entity), or MONEY.
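
The character offsets can be checked with plain string slicing. The short sketch below uses only the sentence and the offsets printed above (no spaCy call is made). It illustrates that `end_char` is exclusive, so `sentence[start_char:end_char]` gives back the entity text.

```python
sentence = "Apple is looking at buying U.K. startup for $1 billion"

# (start_char, end_char, label) triples copied from the output above
entities = [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]

# end_char is exclusive, so a plain slice recovers each entity's text
spans = [(sentence[start:end], label) for start, end, label in entities]

for text, label in spans:
    print(f"{text!r} -> {label}")
```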

spaCy's NER model also uses capitalization as one of its cues for recognizing named entities, so a slightly modified example can produce a different result. The next cell repeats the example with "Apple" written in lowercase.

In [8]:
import spacy

nlp = spacy.load('en_core_web_sm')

sentence = "apple is looking at buying U.K. startup for $1 billion" # 'Apple' is written as 'apple'

doc = nlp(sentence)

for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)


apple 0 5 ORG
U.K. 27 31 GPE
$1 billion 44 54 MONEY


Here, the lowercase "apple" is still labeled as an organization (ORG), just like "Apple" in the original sentence, so the output is unchanged. In this case the surrounding context was apparently enough for the model to recognize the company name. Capitalization remains only one of the cues the model uses to separate named entities from common nouns, and with other sentences or other model versions a lowercased name may be missed or labeled differently.
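
To see how much casing matters for a given sentence, the two runs can be compared programmatically. The following is a minimal sketch, not part of the original notebook. `entity_set` and `casing_diff` are hypothetical helper names, and `nlp` is assumed to be any loaded spaCy pipeline, such as `en_core_web_sm`. The helpers rely only on `doc.ents`, `ent.text`, and `ent.label_`, as used in the cells above.

```python
def entity_set(nlp, text):
    """Return the entities found in `text` as a set of (lowercased text, label) pairs."""
    doc = nlp(text)
    return {(ent.text.lower(), ent.label_) for ent in doc.ents}


def casing_diff(nlp, original, variant):
    """Compare two casings of the same sentence.

    Returns (lost, gained): entities found only in the original,
    and entities found only in the variant.
    """
    before = entity_set(nlp, original)
    after = entity_set(nlp, variant)
    return before - after, after - before
```

Called with the two example sentences, `casing_diff` returns two empty sets whenever both casings yield the same entities and labels, which matches the two outputs shown in this section.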