## spaCy Pipeline

In [1]:
import spacy

In [3]:
nlp = spacy.blank("en")

doc = nlp("The vintage watch was a steal at just $145, a fraction of its original price.")

for token in doc:
    print(token)

The
vintage
watch
was
a
steal
at
just
$
145
,
a
fraction
of
its
original
price
.


In [11]:
for sent in doc.sents:
    print(sent)

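ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: `nlp.add_pipe('sentencizer')`. Alternatively, add the dependency parser or sentence recognizer, or set sentence boundaries by setting `doc[i].is_sent_start`.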

In [5]:
nlp.pipe_names

[]

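The error occurs because a blank pipeline has no components: `spacy.blank("en")` only tokenizes, so sentence boundaries are never set. To iterate over `doc.sents`, the pipeline needs a component that sets sentence boundaries, such as the dependency parser included in a trained pipeline.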
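If only sentence splitting is needed, a lighter option is to add spaCy's rule-based `sentencizer` to the blank pipeline instead of loading a trained model. The following is a minimal sketch; the names `nlp_blank` and `doc_blank` and the two-sentence example text are illustrative and not part of the original notebook.

```python
import spacy

# Start from a blank English pipeline (tokenizer only, no components).
nlp_blank = spacy.blank("en")

# Add the rule-based sentencizer, which sets sentence boundaries
# based on punctuation rather than a dependency parse.
nlp_blank.add_pipe("sentencizer")

# Illustrative two-sentence text (an assumption, not from the notebook).
doc_blank = nlp_blank(
    "The vintage watch was a steal. It was a fraction of its original price."
)

# doc.sents now works because sentence boundaries have been set.
sentences = [sent.text for sent in doc_blank.sents]

print(nlp_blank.pipe_names)
for sentence in sentences:
    print(sentence)
```

The sentencizer splits on punctuation only, so it is fast but less accurate than the parser on text with unusual punctuation.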

In [6]:
nlp = spacy.load("en_core_web_sm")

In [9]:
doc = nlp("The vintage watch was a steal at just $145, a fraction of its original price.")

for sent in doc.sents:
    print(sent)

The vintage watch was a steal at just $145, a fraction of its original price.


In [12]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

#### Lemmatization

In [14]:
for token in doc:
    print(token, "-----", token.pos_, "-----", token.lemma_)

The ----- DET ----- the
vintage ----- ADJ ----- vintage
watch ----- NOUN ----- watch
was ----- AUX ----- be
a ----- DET ----- a
steal ----- NOUN ----- steal
at ----- ADP ----- at
just ----- ADV ----- just
$ ----- SYM ----- $
145 ----- NUM ----- 145
, ----- PUNCT ----- ,
a ----- DET ----- a
fraction ----- NOUN ----- fraction
of ----- ADP ----- of
its ----- PRON ----- its
original ----- ADJ ----- original
price ----- NOUN ----- price
. ----- PUNCT ----- .


#### Named Entity Recognition

In [17]:
doc = nlp("Investors eagerly awaited the quarterly earnings reports of Apple Inc., Tesla, and Amazon, anticipating their impact on the stock market.")

for ent in doc.ents:
    print(ent.text, "-----", ent.label_, "-----", spacy.explain(ent.label_))

quarterly ----- DATE ----- Absolute or relative dates or periods
Apple Inc. ----- ORG ----- Companies, agencies, institutions, etc.
Tesla ----- ORG ----- Companies, agencies, institutions, etc.
Amazon ----- ORG ----- Companies, agencies, institutions, etc.


In [18]:
from spacy import displacy

In [19]:
displacy.render(doc, style="ent")