# Setup
- python -m spacy download en
- python -m spacy download en_core_web_md
- python -m spacy download parser
- python -m spacy download glove



In [22]:
import spacy

In [23]:
nlp = spacy.load('en_core_web_md')

In [24]:
doc1 = nlp(u"this's spacy tokenize test")
print(doc1)


this's spacy tokenize test


In [25]:
for token in doc1:
    print(token)

this
's
spacy
tokenize
test


### Sentence Tokenization (Sentence Segmentation) Test:


In [26]:
doc2 = nlp(u"this is spacy sentence tokenize test. this is second sent! is this the third sent? final test.")

In [27]:
for sent in doc2.sents:
    print(sent)

this is spacy sentence tokenize test.
this is second sent!
is this the third sent? final test.
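
Note that the output above has three lines although the input contains four sentences: "final test." stays attached to the third one. With a statistical model, `doc.sents` is normally derived from the dependency parse rather than from punctuation alone, so lowercase, fragmentary input like this can be merged. For comparison only, here is a minimal sketch of a purely punctuation-based splitter. The helper `naive_sentence_split` is hypothetical and is not part of spaCy; it will mis-split abbreviations such as "e.g. this".

```python
import re


def naive_sentence_split(text):
    """Hypothetical helper (not a spaCy API): split after '.', '!' or '?'
    whenever the punctuation mark is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


text = ("this is spacy sentence tokenize test. this is second sent! "
        "is this the third sent? final test.")
sentences = naive_sentence_split(text)

# Print one sentence per line, mirroring the loop over doc2.sents.
for sentence in sentences:
    print(sentence)
```

On this input the rule-based version separates "final test." because it only looks at punctuation, whereas the parser-based segmentation also weighs the syntactic structure it predicts.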


### Lemmatize Test:

In [28]:
doc3 = nlp(u"this is spacy lemmatize testing. programming books are more better than others")

In [29]:
for token in doc3:
    print(token, token.lemma, token.lemma_)

this 552 this
is 536 be
spacy 776980 spacy
lemmatize 776982 lemmatize
testing 4191 testing
. 453 .
programming 2171 programming
books 1300 book
are 536 be
more 597 more
better 761 better
than 626 than
others 655 other
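
In the output above, `token.lemma` is an integer ID and `token.lemma_` is the corresponding string; "is" and "are" both print 536 because they share the lemma "be". The same pairing shows up later for `pos`/`pos_` and `label`/`label_`. The toy class below is only an illustration of that ID-to-string idea. `ToyStringStore` is a hypothetical name, and this is not how spaCy implements its own string table.

```python
class ToyStringStore:
    """Hypothetical two-way mapping between strings and integer IDs.

    Illustration only: spaCy's real string table works differently.
    """

    def __init__(self):
        self._ids = {}       # string -> integer ID
        self._strings = []   # integer ID -> string

    def add(self, string):
        # Reuse the existing ID if the string was seen before.
        if string not in self._ids:
            self._ids[string] = len(self._strings)
            self._strings.append(string)
        return self._ids[string]

    def lookup(self, key):
        return self._strings[key]


store = ToyStringStore()
be_id = store.add("be")      # lemma of "is"
book_id = store.add("book")  # lemma of "books"
same_id = store.add("be")    # lemma of "are": same string, same ID

print(be_id, store.lookup(be_id))
print(book_id, store.lookup(book_id))
print(same_id, store.lookup(same_id))
```

Comparing integer IDs is cheaper than comparing strings, which is one reason to expose both forms; use the underscore attribute whenever a readable value is needed.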


### POS Tagging Test:

In [30]:
doc4 = nlp(u"This is pos tagger test for spacy pos tagger")

In [31]:
for token in doc4:
    print(token, token.pos, token.pos_)

This 88 DET
is 98 VERB
pos 82 ADJ
tagger 90 NOUN
test 90 NOUN
for 83 ADP
spacy 90 NOUN
pos 90 NOUN
tagger 90 NOUN


### Named Entity Recognizer (NER) Test:

In [32]:
doc5 = nlp(u"Rami Eid is studying at Stony Brook University in New York")

In [33]:
for ent in doc5.ents:
    print(ent, ent.label, ent.label_)

Rami Eid 377 PERSON
Stony Brook University 380 ORG


### Noun Chunk Test:

In [34]:
doc6 = nlp(u"Natural language processing (NLP) deals with the application of computational models to text or speech data.")

In [35]:
for np in doc6.noun_chunks:
    print(np)

Natural language processing (NLP) deals
the application
computational models
text or speech data


### Word Vectors Test:

In [36]:
doc7 = nlp(u"Apples and oranges are similar. Boots and hippos aren't.")
apples = doc7[0]
oranges = doc7[2]
boots = doc7[6]
hippos = doc7[8]
print(apples.similarity(oranges))
print(boots.similarity(hippos))

0.0
0.0
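
Both scores above are exactly 0.0, which is not what the example sentence suggests. `similarity` is a cosine similarity over word vectors, so one possible explanation (not verified in this notebook) is that no usable vectors were loaded for these tokens; checking `token.has_vector` and `token.vector_norm` would confirm or rule that out. The sketch below uses made-up three-dimensional vectors, not real model data, to show how a zero vector leads to a score of 0.0 when the computation guards against division by zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has zero length, instead of
    dividing by zero.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration; real word vectors have hundreds of dimensions.
v_apples = [1.0, 2.0, 0.0]
v_oranges = [2.0, 4.0, 0.0]   # same direction as v_apples
v_missing = [0.0, 0.0, 0.0]   # stands in for a token without a vector

sim_parallel = cosine_similarity(v_apples, v_oranges)
sim_missing = cosine_similarity(v_apples, v_missing)

print(sim_parallel)
print(sim_missing)
```

If the tokens in `doc7` turn out to have no vectors, reinstalling or reloading a model package that ships with vectors would be the next thing to try.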
