#### This notebook covers the *Named Entity Recognition (NER)* part of the project.

The pre-trained model is loaded with only the `ner` (EntityRecognizer) component enabled, which speeds up loading and inference. The other pipeline components, such as those for POS tagging, lemmatization, and dependency parsing, are disabled.

In [24]:
import spacy
from spacy import displacy

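# Use the GPU if one is available; otherwise spaCy stays on the CPU.
spacy.prefer_gpu()
# Load the small English pipeline with only the NER component enabled.
nlp = spacy.load("en_core_web_sm", enable=["ner"])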
print("Spacy successfully loaded")

Spacy successfully loaded


In [25]:
nlp.pipe_labels['ner']

['CARDINAL',
 'DATE',
 'EVENT',
 'FAC',
 'GPE',
 'LANGUAGE',
 'LAW',
 'LOC',
 'MONEY',
 'NORP',
 'ORDINAL',
 'ORG',
 'PERCENT',
 'PERSON',
 'PRODUCT',
 'QUANTITY',
 'TIME',
 'WORK_OF_ART']
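
Downstream steps often need only a few of these labels. The following is a minimal sketch, assuming a hypothetical subset `LABELS_OF_INTEREST` and a hypothetical helper `check_labels` (neither is part of spaCy or of this project). It validates the chosen labels against the list printed above, so that a typo such as `CITY` fails early.

```python
# Labels copied from the `nlp.pipe_labels['ner']` output above.
AVAILABLE_LABELS = [
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE",
    "LAW", "LOC", "MONEY", "NORP", "ORDINAL", "ORG",
    "PERCENT", "PERSON", "PRODUCT", "QUANTITY", "TIME", "WORK_OF_ART",
]

# Hypothetical choice for illustration; adjust to the task at hand.
LABELS_OF_INTEREST = {"PERSON", "ORG", "GPE", "LOC"}


def check_labels(wanted, available=AVAILABLE_LABELS):
    """Return the wanted labels sorted, or raise if any is not available."""
    unknown = sorted(set(wanted) - set(available))
    if unknown:
        raise ValueError(f"Labels not produced by the model: {unknown}")
    return sorted(wanted)


print(check_labels(LABELS_OF_INTEREST))
```

With the pipeline loaded, `nlp.pipe_labels['ner']` can be passed as `available` instead of the hard-coded list.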

The pretrained model comes from the NLP library *spaCy*. It takes raw text as input and supports several annotation tasks; with the configuration used here, only NER, the task we are interested in, is run.

In [20]:
example_text = "klevio is a singer from Albania who usually goes to Greece and works in UBS. He lives in Lake Geneva and owns a Mercedes car."
doc = nlp(example_text)

The *displaCy* visualizer that ships with spaCy renders the recognized entities inline, highlighting each span together with its label.

In [22]:
displacy.render(doc, style = 'ent')

In [23]:
doc2 = nlp('The government in Senegal just passed a law on the 2nd of February regarding universal healthcare, named Universal Care Act, passed in parliament also in French')
displacy.render(doc2, style = 'ent')

In [7]:
# ent.label_ is the label as a string; ent.label is the same label as an integer ID.
for ent in doc.ents:
    print(ent.text, ent.label_, ent.label)

klevio PERSON 380
Albania GPE 384
Greece GPE 384
UBS ORG 383
Lake Geneva LOC 385
Mercedes PRODUCT 386
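
A flat list like the one above is often easier to use once it is grouped by label. The following is a minimal sketch, assuming a hypothetical helper `group_entities_by_label` that only relies on the `text` and `label_` attributes. The `SimpleNamespace` objects are stand-ins filled with the values printed above, so the sketch runs without loading a model.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_entities_by_label(ents):
    """Map each label to the entity texts carrying it, in order of appearance."""
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Stand-ins that mimic the `text` and `label_` attributes of spaCy spans.
sample_ents = [
    SimpleNamespace(text="klevio", label_="PERSON"),
    SimpleNamespace(text="Albania", label_="GPE"),
    SimpleNamespace(text="Greece", label_="GPE"),
    SimpleNamespace(text="UBS", label_="ORG"),
    SimpleNamespace(text="Lake Geneva", label_="LOC"),
    SimpleNamespace(text="Mercedes", label_="PRODUCT"),
]

for label, texts in group_entities_by_label(sample_ents).items():
    print(label, texts)
```

In the notebook itself, `group_entities_by_label(doc.ents)` should produce the same grouping, since spaCy spans expose the same two attributes.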


In [8]:
doc3 = nlp('World Health Organization Geneva')
displacy.render(doc3, style = 'ent')
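
displaCy can also render spans that were computed elsewhere: `displacy.render` accepts `manual=True` together with a dictionary holding the text and the character offsets of each entity. The following is a minimal sketch, assuming a hypothetical helper `to_displacy_manual` (not part of spaCy). It locates each entity string by a left-to-right search, which is only adequate when the strings are listed in the order they occur in the text. The pairs are the ones printed earlier in this notebook for the first example.

```python
def to_displacy_manual(text, labelled_strings):
    """Build the dict that displaCy's manual 'ent' mode expects.

    `labelled_strings` is a list of (surface_string, label) pairs in the
    order they occur in `text`.
    """
    ents = []
    cursor = 0
    for surface, label in labelled_strings:
        start = text.find(surface, cursor)
        if start == -1:
            continue  # string not found after the cursor; skip it
        end = start + len(surface)
        ents.append({"start": start, "end": end, "label": label})
        cursor = end
    return {"text": text, "ents": ents, "title": None}


example_text = (
    "klevio is a singer from Albania who usually goes to Greece "
    "and works in UBS. He lives in Lake Geneva and owns a Mercedes car."
)
printed = [
    ("klevio", "PERSON"), ("Albania", "GPE"), ("Greece", "GPE"),
    ("UBS", "ORG"), ("Lake Geneva", "LOC"), ("Mercedes", "PRODUCT"),
]

data = to_displacy_manual(example_text, printed)
print(data["ents"])
```

The resulting dictionary can then be passed as `displacy.render(data, style='ent', manual=True)`.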