<a href="https://colab.research.google.com/github/ghassenov/NLP_Basics/blob/main/Name_Entity_Recognition.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Part-of-speech (POS) tagging is a fundamental NLP task: it labels each word in a sentence with its part of speech (noun, verb, adjective, and so on).

* Words are grouped into classes according to their grammatical function in the sentence.

* POS tagging supports syntactic parsing, machine translation, speech recognition, and text-to-speech systems.

* POS tags can also improve named entity recognition (NER), which the rest of this notebook covers, and sentiment analysis.


In [1]:
import spacy

In [2]:
nlp = spacy.load('en_core_web_sm')

In [4]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [6]:
text = nlp('Tesla is going to acquire twitter for $45 billion')
# Print each detected entity with its label and a short description of that label
for ent in text.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Tesla | ORG | Companies, agencies, institutions, etc.
$45 billion | MONEY | Monetary values, including unit


In [7]:
from spacy import displacy
displacy.render(text, style='ent')

In [10]:
text2 = nlp('messi is the best footballer')
displacy.render(text2, style='ent')

In [11]:
nlp.pipe_labels['ner']

['CARDINAL',
 'DATE',
 'EVENT',
 'FAC',
 'GPE',
 'LANGUAGE',
 'LAW',
 'LOC',
 'MONEY',
 'NORP',
 'ORDINAL',
 'ORG',
 'PERCENT',
 'PERSON',
 'PRODUCT',
 'QUANTITY',
 'TIME',
 'WORK_OF_ART']
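
The label names above are terse, so it can help to print a description next to each one. The following is a minimal sketch that reuses `spacy.explain`, the same helper called in the entity loop earlier. The `labels` list is copied by hand from the output above rather than read from a loaded pipeline, so the snippet runs without downloading a model.

```python
import spacy

# Labels copied from the nlp.pipe_labels['ner'] output above.
labels = [
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE",
    "LAW", "LOC", "MONEY", "NORP", "ORDINAL", "ORG",
    "PERCENT", "PERSON", "PRODUCT", "QUANTITY", "TIME", "WORK_OF_ART",
]

# spacy.explain looks a label up in spaCy's glossary (None if it is unknown).
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label:12} | {description}")
```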

### Setting our own entities

In [12]:
doc = nlp('Tesla is going to acquire twitter for $45 billion')
for ent in doc.ents:
    print(ent.text, '|', ent.label_, '|')

Tesla | ORG |
$45 billion | MONEY |


In [15]:
doc

Tesla is going to acquire twitter for $45 billion

In [16]:
s = doc[2:5]
s

going to acquire

In [17]:
type(s)

spacy.tokens.span.Span
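
Slices of a `Doc` are indexed by token, not by character, and the end index is exclusive. To pick the right indices it helps to list every token with its position. The sketch below assumes a blank English pipeline (`spacy.blank("en")`, tokenizer only) splits this sentence the same way as `en_core_web_sm`; the names `nlp_blank`, `doc_blank`, and `money` are illustrative only.

```python
import spacy

# A blank pipeline contains only the tokenizer, so no model download is needed.
nlp_blank = spacy.blank("en")
doc_blank = nlp_blank("Tesla is going to acquire twitter for $45 billion")

# Show the token index next to each token.
for i, token in enumerate(doc_blank):
    print(i, token.text)

# "$", "45" and "billion" are separate tokens at positions 7, 8 and 9,
# so the slice has to end at 10 to cover all three.
money = doc_blank[7:10]
print(money.text)
```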

In [18]:
from spacy.tokens import Span

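# Span(doc, start, end) takes token indices with an exclusive end: 0:1 is "Tesla", 7:10 is "$45 billion"
s1 = Span(doc, 0, 1, label='ORG')
s2 = Span(doc, 7, 10, label='MONEY')
# default='unmodified' leaves the entity annotation of all other tokens as it was
doc.set_ents([s1, s2], default='unmodified')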

In [19]:
for ent in doc.ents:
    print(ent.text, '|', ent.label_, '|')

Tesla | ORG |
$45 billion | MONEY |
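
In the outputs above, "twitter" is never reported as an entity. One way to add it by hand is sketched below with the same `Span` and `set_ents` calls. It runs on a blank English pipeline so that it needs no model download; this is an assumption made for the example, and it means the document starts with no entities at all. The names `nlp_blank`, `doc_blank`, `twitter`, and `entities` are illustrative.

```python
import spacy
from spacy.tokens import Span

# Tokenizer-only pipeline: doc_blank.ents is empty before we set anything.
nlp_blank = spacy.blank("en")
doc_blank = nlp_blank("Tesla is going to acquire twitter for $45 billion")

# Token 5 is "twitter"; the end index 6 is exclusive.
twitter = Span(doc_blank, 5, 6, label="ORG")

# default="unmodified" keeps whatever annotation the other tokens already have.
doc_blank.set_ents([twitter], default="unmodified")

entities = [(ent.text, ent.label_) for ent in doc_blank.ents]
print(entities)
```

On a `Doc` produced by a trained pipeline, the same call would add "twitter" alongside the entities the model already found, because `default="unmodified"` does not touch tokens outside the given spans.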
