#### NER

We use spaCy for named entity recognition (NER) here; models from Hugging Face are another option.

In [1]:
import spacy

nlp = spacy.load('en_core_web_sm')

In [2]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [4]:
doc = nlp("Tesla Ince is going to acquire twitter for $45 billion")

for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Tesla Ince | PERSON | People, including fictional
$45 billion | MONEY | Monetary values, including unit


In [6]:
from spacy import displacy

displacy.render(doc, style='ent')

In [7]:
nlp.pipe_labels['ner']

['CARDINAL',
 'DATE',
 'EVENT',
 'FAC',
 'GPE',
 'LANGUAGE',
 'LAW',
 'LOC',
 'MONEY',
 'NORP',
 'ORDINAL',
 'ORG',
 'PERCENT',
 'PERSON',
 'PRODUCT',
 'QUANTITY',
 'TIME',
 'WORK_OF_ART']

In [10]:
doc = nlp("Micheal Bloomberg founded Bloomberg in 1982.")

for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Micheal Bloomberg | PERSON | People, including fictional
Bloomberg | PERSON | People, including fictional
1982 | DATE | Absolute or relative dates or periods


In [11]:
doc = nlp("Micheal Bloomberg founded Bloomberg Inc in 1982.")

for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Micheal Bloomberg | PERSON | People, including fictional
Bloomberg Inc | ORG | Companies, agencies, institutions, etc.
1982 | DATE | Absolute or relative dates or periods


#### Span

In [12]:
doc = nlp("Tesla is going to acquire twitter for $45 billion.")

for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Tesla | ORG | Companies, agencies, institutions, etc.
$45 billion | MONEY | Monetary values, including unit


spaCy's NER model fails to recognize "twitter" as an ORG entity, so we set it manually by wrapping the token in a Span labeled ORG.

In [16]:
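from spacy.tokens import Span

# Token 5 is "twitter"; Span(doc, 5, 6) covers that single token (end index is exclusive)
s1 = Span(doc, 5, 6, label="ORG")

# Add the new span while leaving the entities the model already found untouched
doc.set_ents([s1], default="unmodified")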

In [17]:
for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Tesla | ORG | Companies, agencies, institutions, etc.
twitter | ORG | Companies, agencies, institutions, etc.
$45 billion | MONEY | Monetary values, including unit
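
The token indices 5 and 6 used above are hardcoded, so they break as soon as the sentence changes. A minimal sketch of finding the indices from the token text instead is shown below. It assumes a blank English pipeline (`spacy.blank("en")`), which only tokenizes, so no statistical entities appear in the result; with `en_core_web_sm` the model's own entities would be kept as well.

```python
import spacy
from spacy.tokens import Span

# Assumption: a blank English pipeline (tokenizer only) is enough for this sketch
nlp = spacy.blank("en")
doc = nlp("Tesla is going to acquire twitter for $45 billion.")

# Build a one-token ORG span for every token whose lowercase text is "twitter"
new_ents = [
    Span(doc, token.i, token.i + 1, label="ORG")
    for token in doc
    if token.lower_ == "twitter"
]

# Keep any existing entities and add the new spans
doc.set_ents(new_ents, default="unmodified")

found = [(ent.text, ent.label_) for ent in doc.ents]
print(found)
```

Looking up `token.i` avoids counting tokens by hand, and the list comprehension handles a sentence that mentions the name more than once.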


### How can I build my own NER?

1. Simple Lookup
2. Rule-Based NER
3. Conditional Random Fields
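
The first two approaches can be sketched in plain Python without any model. The example below is only an illustration: the `GAZETTEER` dictionary, the `MONEY_RE` pattern, and the function names are all hypothetical and are not part of spaCy. A simple lookup matches known names from a fixed list, while a rule-based step uses a pattern (here a regular expression for dollar amounts).

```python
import re

# Hypothetical gazetteer for illustration: lowercase surface form -> label
GAZETTEER = {
    "tesla": "ORG",
    "twitter": "ORG",
    "bloomberg inc": "ORG",
}

# Hypothetical rule: dollar amounts such as "$45 billion" or "$3.5 million"
MONEY_RE = re.compile(
    r"\$\d+(?:\.\d+)?(?:\s(?:thousand|million|billion|trillion))?",
    re.IGNORECASE,
)


def lookup_entities(text):
    """Simple lookup: find every gazetteer phrase as a whole-word match."""
    found = []
    for phrase, label in GAZETTEER.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            found.append((m.start(), m.end(), label))
    return found


def rule_entities(text):
    """Rule-based: label anything matching the money pattern as MONEY."""
    return [(m.start(), m.end(), "MONEY") for m in MONEY_RE.finditer(text)]


def extract_entities(text):
    """Combine both strategies; return (text, label) pairs in document order."""
    spans = sorted(lookup_entities(text) + rule_entities(text))
    return [(text[start:end], label) for start, end, label in spans]


entities = extract_entities("Tesla is going to acquire twitter for $45 billion.")
for ent_text, label in entities:
    print(ent_text, "|", label)
```

This sketch does not resolve overlapping matches and only knows the names in its list, which is the main limitation of lookup and rule-based NER compared with a trained model such as a CRF.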