# Language processing pipeline (spaCy)

In [1]:
import spacy
nlp = spacy.blank("en") # load a blank English pipeline (tokenizer only, no components)

In [2]:
doc = nlp("Tylor wants to play with Brian but Brain doesn't want to play with him")
for token in doc:
    print(token)

Tylor
wants
to
play
with
Brian
but
Brain
does
n't
want
to
play
with
him


In [3]:
nlp.pipe_names # returns an empty list because the blank pipeline has no components

[]
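
A blank pipeline can be extended one component at a time. As a minimal sketch (the example sentence is made up for illustration), adding spaCy's built-in rule-based `sentencizer` makes `pipe_names` non-empty and enables sentence splitting without downloading a trained model:

```python
import spacy

nlp = spacy.blank("en")          # tokenizer only, no pipeline components yet
nlp.add_pipe("sentencizer")      # built-in rule-based sentence boundary detector

# Illustrative text, not taken from the cells above
doc = nlp("Tylor wants to play. Brian does not.")
sentences = [sent.text for sent in doc.sents]

print(nlp.pipe_names)            # the sentencizer is now the only component
print(sentences)
```
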

In [5]:
nlp = spacy.load("en_core_web_sm") # now load the pretrained small English pipeline
doc = nlp("Tylor wants to play with Brian but Brain doesn't want to play with him")
for token in doc:
    print(token, " | ", token.pos_, " | ", token.lemma_) # token | part of speech | lemma (base form)

Tylor  |  PROPN  |  Tylor
wants  |  VERB  |  want
to  |  PART  |  to
play  |  VERB  |  play
with  |  ADP  |  with
Brian  |  PROPN  |  Brian
but  |  CCONJ  |  but
Brain  |  PROPN  |  Brain
does  |  AUX  |  do
n't  |  PART  |  not
want  |  VERB  |  want
to  |  PART  |  to
play  |  VERB  |  play
with  |  ADP  |  with
him  |  PRON  |  he


In [6]:
nlp.pipeline # list of (name, component) tuples in processing order

[('tok2vec', <spacy.pipeline.tok2vec.Tok2Vec at 0x273d2b4f7c0>),
 ('tagger', <spacy.pipeline.tagger.Tagger at 0x273d6b08400>),
 ('parser', <spacy.pipeline.dep_parser.DependencyParser at 0x273d69a2c00>),
 ('attribute_ruler',
  <spacy.pipeline.attributeruler.AttributeRuler at 0x273d6c09d00>),
 ('lemmatizer', <spacy.lang.en.lemmatizer.EnglishLemmatizer at 0x273d6cf5380>),
 ('ner', <spacy.pipeline.ner.EntityRecognizer at 0x273d69a2c70>)]

In [7]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
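
Components do not all have to run on every call. The sketch below uses a blank pipeline with a single `sentencizer` so it runs without a model download (that setup is an assumption for illustration); the same `select_pipes` call should apply to the trained pipeline, for example with `disable=["parser", "ner"]`:

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Temporarily disable a component; it is restored when the block exits
with nlp.select_pipes(disable=["sentencizer"]):
    names_inside = list(nlp.pipe_names)   # only active components are listed

names_after = list(nlp.pipe_names)

print(names_inside)
print(names_after)
```
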

## Named entity recognition (NER)

In [11]:
doc = nlp("Nvida INC is aquired by Google LLC for 1$")
for ent in doc.ents:
    print(ent.text, " | ", ent.label_, " | ", spacy.explain(ent.label_)) # entity text | label | label description from named entity recognition (NER)

Nvida INC  |  ORG  |  Companies, agencies, institutions, etc.
Google  |  ORG  |  Companies, agencies, institutions, etc.
1$  |  MONEY  |  Monetary values, including unit
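
The statistical `ner` component tagged only `Google` here, not `Google LLC`. One possible way to get exact spans for known names is spaCy's rule-based `entity_ruler`. This is a sketch on a blank pipeline, not what the cell above does, and the two patterns and the example sentence are assumptions chosen for illustration:

```python
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")   # rule-based entity matcher

# Hypothetical patterns: exact phrases mapped to an entity label
ruler.add_patterns([
    {"label": "ORG", "pattern": "Nvidia Inc"},
    {"label": "ORG", "pattern": "Google LLC"},
])

doc = nlp("Nvidia Inc is acquired by Google LLC")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```
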


In [12]:
from spacy import displacy # built-in visualizer for rendering the document

displacy.render(doc, style="ent")
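
Outside a notebook, `displacy.render` can return the markup as a string instead of displaying it. A minimal sketch, assuming a blank pipeline with one entity set by hand so that no trained model is needed:

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("Google LLC is a company")

# Manually mark tokens 0..1 ("Google LLC") as an ORG entity for illustration
doc.ents = [Span(doc, 0, 2, label="ORG")]

# jupyter=False forces render() to return the HTML instead of displaying it
html = displacy.render(doc, style="ent", jupyter=False)
print(html[:80])
```

The returned string can be written to an `.html` file and opened in a browser.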