### Loading the Library

In [2]:
import spacy

### Load English tokenizer, tagger, parser and NER (the small model ships without static word vectors)

In [3]:
nlp = spacy.load("en_core_web_sm")

### Sample Text

In [5]:
# Process whole documents
text = "Google was initially funded by an August 1998 investment of $100,000 from Andy Bechtolsheim, co-founder of Sun Microsystems, a few weeks prior to September 7, 1998, the day Google was officially incorporated. This initial investment served as a motivation to incorporate the company to be able to use the funds."

In [6]:
text

'Google was initially funded by an August 1998 investment of $100,000 from Andy Bechtolsheim, co-founder of Sun Microsystems, a few weeks prior to September 7, 1998, the day Google was officially incorporated. This initial investment served as a motivation to incorporate the company to be able to use the funds.'

### NLP at Work

In [7]:
doc = nlp(text)

In [9]:
doc

Google was initially funded by an August 1998 investment of $100,000 from Andy Bechtolsheim, co-founder of Sun Microsystems, a few weeks prior to September 7, 1998, the day Google was officially incorporated. This initial investment served as a motivation to incorporate the company to be able to use the funds.

### Tokenization

In [8]:
for token in doc:
    print(token)

Google
was
initially
funded
by
an
August
1998
investment
of
$
100,000
from
Andy
Bechtolsheim
,
co
-
founder
of
Sun
Microsystems
,
a
few
weeks
prior
to
September
7
,
1998
,
the
day
Google
was
officially
incorporated
.
This
initial
investment
served
as
a
motivation
to
incorporate
the
company
to
be
able
to
use
the
funds
.


### Nouns Only

In [13]:
for token in doc:
    if token.pos_ == "NOUN":
        print(token)

investment
co
-
founder
weeks
day
investment
motivation
company
funds
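
The noun list above contains repeats (`investment` appears twice), so a frequency tally can be a useful next step. The following is a minimal sketch using `collections.Counter`. The `nouns` list is copied by hand from the output above so the snippet runs without spaCy; with the `doc` object from earlier it could presumably be built as `[token.text for token in doc if token.pos_ == "NOUN"]`.

```python
from collections import Counter

# Nouns copied from the output above (a stand-in for the spaCy tokens).
nouns = ["investment", "co", "-", "founder", "weeks", "day",
         "investment", "motivation", "company", "funds"]

# Counter maps each distinct noun to the number of times it occurs.
noun_counts = Counter(nouns)

# most_common(n) returns the n most frequent items, highest count first.
for noun, count in noun_counts.most_common(3):
    print(noun, count)
```

Note that the tagger labelled the pieces of "co-founder" (including the hyphen) as separate nouns, so they are counted separately here as well.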


### Named Entity Recognition

In [15]:
for entity in doc.ents:
    print(entity.text, entity.label_)

August 1998 DATE
100,000 MONEY
Andy Bechtolsheim PERSON
Sun Microsystems ORG
a few weeks DATE
September 7, 1998 DATE
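
The entities above come out in document order, with the three `DATE` entries interleaved among the others. One way to organise them is to group the texts by label. The sketch below does this with a hypothetical helper, `group_by_label`, which is not part of spaCy. The `entities` list is copied from the output above so the snippet runs on its own; with the `doc` object from earlier, the same pairs could be produced with `[(entity.text, entity.label_) for entity in doc.ents]`.

```python
from collections import defaultdict

# (text, label) pairs copied from the output above.
entities = [
    ("August 1998", "DATE"),
    ("100,000", "MONEY"),
    ("Andy Bechtolsheim", "PERSON"),
    ("Sun Microsystems", "ORG"),
    ("a few weeks", "DATE"),
    ("September 7, 1998", "DATE"),
]

def group_by_label(pairs):
    """Hypothetical helper: collect entity texts under their label, keeping order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)

by_label = group_by_label(entities)
for label, texts in by_label.items():
    print(label, texts)
```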
