Trained pipeline

In [2]:
import spacy

In [3]:
# Load the small English pipeline (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

In [4]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [5]:
doc = nlp("Apple is looking at buying U.K. startup for $1 billion. This is a great opportunity.")

# Print each token with its coarse part-of-speech tag and its lemma
for token in doc:
    print(token, " | ", token.pos_, " | ", token.lemma_)

Apple  |  PROPN  |  Apple
is  |  AUX  |  be
looking  |  VERB  |  look
at  |  ADP  |  at
buying  |  VERB  |  buy
U.K.  |  PROPN  |  U.K.
startup  |  VERB  |  startup
for  |  ADP  |  for
$  |  SYM  |  $
1  |  NUM  |  1
billion  |  NUM  |  billion
.  |  PUNCT  |  .
This  |  PRON  |  this
is  |  AUX  |  be
a  |  DET  |  a
great  |  ADJ  |  great
opportunity  |  NOUN  |  opportunity
.  |  PUNCT  |  .
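The pos_ and lemma_ values above come from trained components. Some token attributes are purely lexical and are set by the tokenizer alone. The following is a minimal sketch of that distinction, assuming a blank English pipeline (spacy.blank) so that it runs without a downloaded model; the `rows` variable is only an illustrative name.

```python
import spacy

# A blank English pipeline contains only the rule-based tokenizer,
# so no trained model needs to be installed for this sketch.
nlp = spacy.blank("en")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")

# Lexical flags are derived from the token text itself, not from a tagger.
rows = [(t.text, t.is_alpha, t.like_num, t.is_currency, t.is_punct) for t in doc]

print("text | is_alpha | like_num | is_currency | is_punct")
for row in rows:
    print(" | ".join(str(value) for value in row))
```

With a blank pipeline, token.pos_ and token.lemma_ stay empty because no tagger or lemmatizer has run.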


In [6]:
doc = nlp("Apple is looking at buying Vietnam startup for $1 billion. This is a great opportunity.")

# Print each named entity with its label and spaCy's description of that label
for ent in doc.ents:
    print(ent.text, " | ", ent.label_, " | ", spacy.explain(ent.label_))

Apple  |  ORG  |  Companies, agencies, institutions, etc.
Vietnam  |  GPE  |  Countries, cities, states
$1 billion  |  MONEY  |  Monetary values, including unit


In [7]:
from spacy import displacy
# Highlight the recognized entities inline in the notebook output
displacy.render(doc, style="ent")
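Outside a notebook the same visualization can be obtained as an HTML string by passing jupyter=False. The sketch below is an illustration under assumptions, not the notebook's own output: it uses a blank pipeline and hand-labelled spans in place of model predictions so that it runs without en_core_web_sm.

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("Apple is looking at buying Vietnam startup for $1 billion.")

# Hand-labelled entities standing in for model predictions (illustrative only).
# Token 0 is "Apple" and token 5 is "Vietnam".
doc.ents = [Span(doc, 0, 1, label="ORG"), Span(doc, 5, 6, label="GPE")]

# jupyter=False returns the markup instead of displaying it;
# page=True wraps it in a full HTML page.
html = displacy.render(doc, style="ent", jupyter=False, page=True)
print(html[:200])
```

The returned string can be written to a file and opened in a browser.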

Adding a component to a blank pipeline

In [8]:
# Copy the trained "ner" component from en_core_web_sm into an otherwise empty English pipeline
source_nlp = spacy.load("en_core_web_sm")
nlp = spacy.blank("en")
nlp.add_pipe("ner", source=source_nlp)
nlp.pipe_names

['ner']

In [9]:
doc = nlp("Apple is looking at buying Vietnam startup for $1 billion. This is a great opportunity.")

for ent in doc.ents:
    print(ent.text, " | ", ent.label_)

Apple  |  ORG
Vietnam  |  GPE
$1 billion  |  MONEY
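A sourced component brings its trained weights with it. Rule-based components need no source pipeline and can be added to a blank pipeline by factory name alone. As a hedged sketch, the example below adds spaCy's built-in entity_ruler with two hand-written patterns; the patterns are assumptions chosen to mirror the entities above and are not part of the original notebook.

```python
import spacy

nlp = spacy.blank("en")

# "entity_ruler" is a built-in rule-based factory; it needs no trained weights.
ruler = nlp.add_pipe("entity_ruler")

# Illustrative patterns only: exact phrase matches mapped to entity labels.
ruler.add_patterns([
    {"label": "ORG", "pattern": "Apple"},
    {"label": "GPE", "pattern": "Vietnam"},
])

doc = nlp("Apple is looking at buying Vietnam startup for $1 billion.")
found = [(ent.text, ent.label_) for ent in doc.ents]

for text, label in found:
    print(text, " | ", label)
```

Unlike the sourced ner component, the ruler finds only what its patterns list, so "$1 billion" is not labelled here.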
