### spaCy Library
spaCy is similar to NLTK and TextBlob, but it is more advanced than either and is widely used for text-related NLP tasks such as tokenization, part-of-speech tagging, dependency parsing, and named entity recognition.

In [11]:
# importing library

import spacy

In [12]:
# load the small English model and tokenize a sentence

nlp = spacy.load('en_core_web_sm')
doc = nlp('This time we are trying different library')

for token in doc:
    print(token.text,token.pos_,token.dep_)

# text returns the original token text
# pos_ returns the coarse-grained part-of-speech tag
# dep_ returns the syntactic dependency label

This DET det
time NOUN npadvmod
we PRON nsubj
are AUX aux
trying VERB ROOT
different ADJ amod
library NOUN dobj
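
The abbreviations in the output above can be looked up with `spacy.explain`, which returns a short description for a tag or label it recognizes. The snippet below is a small sketch; the list of labels is simply picked from the output above, and no model needs to be loaded for the lookup.

```python
import spacy

# Labels taken from the output above; any tag or dependency label can be tried.
labels = ["DET", "NOUN", "nsubj", "amod", "dobj"]

# spacy.explain returns a description string, or None for an unknown label.
explanations = {label: spacy.explain(label) for label in labels}

for label, meaning in explanations.items():
    print(f"{label:>6}  {meaning}")
```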


In [13]:
# example of tokenizing text that contains abbreviations:

doc = nlp("I went to U.S. last year x.y.")

for token in doc:
    print(token.text)
    
# spaCy keeps U.S. together as a single token, whereas NLTK split it into
# separate tokens (U, ., S, and so on).

I
went
to
U.S.
last
year
x.y
.


In [14]:
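# example showing several token attributes:
# lemma_ (base form), pos_ (coarse POS tag), tag_ (fine-grained POS tag), dep_ (dependency label),
# shape_ (word shape), is_alpha (alphabetic characters only), is_stop (stop word)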

text = nlp("virus is very dangerous for human beings")

for token in text:
    print(token.text,token.lemma_,token.pos_,token.tag_,token.dep_,token.shape_,token.is_alpha,token.is_stop)

virus virus NOUN NN nsubj xxxx True False
is be AUX VBZ ROOT xx True True
very very ADV RB advmod xxxx True True
dangerous dangerous ADJ JJ acomp xxxx True False
for for ADP IN prep xxx True True
human human ADJ JJ amod xxxx True False
beings being NOUN NNS pobj xxxx True False
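
The `shape_` column above can look surprising: `virus` and `dangerous` both come out as `xxxx`. The sketch below is a rough pure-Python approximation of that behaviour, assuming letters map to `x`/`X`, digits to `d`, and runs of the same shape character are cut off after four. The function name `approx_word_shape` and the `max_run` parameter are hypothetical and not part of spaCy.

```python
def approx_word_shape(text, max_run=4):
    """Approximate a token's shape: x/X for letters, d for digits, runs capped."""
    shape = []
    last = ""
    run = 0
    for char in text:
        if char.isalpha():
            shape_char = "X" if char.isupper() else "x"
        elif char.isdigit():
            shape_char = "d"
        else:
            shape_char = char  # punctuation is kept as-is
        if shape_char == last:
            run += 1
        else:
            run = 1
            last = shape_char
        if run <= max_run:
            shape.append(shape_char)
    return "".join(shape)


for word in ["virus", "is", "for", "dangerous", "U.S."]:
    print(word, approx_word_shape(word))
```

Capping the runs means that words of different lengths share one shape, which is why most of the longer words in the output above look identical.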


#### Visualizing with displaCy

In [16]:
from spacy import displacy

nlp = spacy.load("en_core_web_sm")
text = nlp("This is display example")
displacy.serve(text,style="dep")




Using the 'dep' visualizer
Serving on http://0.0.0.0:5000 ...

Shutting down server on port 5000.


#### Named Entity Recognition

In [22]:
nlp = spacy.load("en_core_web_sm")
text = nlp("Delhi is a capital of india.Total states in india are 28")


for ent in text.ents:
    print(ent.text,ent.start_char,ent.end_char,ent.label_)

Delhi 0 5 GPE
india 22 27 GPE
india 44 49 GPE
28 54 56 CARDINAL
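
The `start_char` and `end_char` values printed above are character offsets into the original string, with the end offset exclusive. As a quick check that needs no model, the sketch below slices the sentence with the offsets copied from the output; the `reported` list is just that output retyped by hand.

```python
sentence = "Delhi is a capital of india.Total states in india are 28"

# (text, start_char, end_char, label) copied from the output above
reported = [
    ("Delhi", 0, 5, "GPE"),
    ("india", 22, 27, "GPE"),
    ("india", 44, 49, "GPE"),
    ("28", 54, 56, "CARDINAL"),
]

# Slicing with [start:end] recovers each entity's text.
recovered = [sentence[start:end] for _, start, end, _ in reported]

for (text, start, end, label), piece in zip(reported, recovered):
    print(f"{label:<9} [{start}:{end}] -> {piece!r}")
```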


#### Visualizing NER

In [23]:
displacy.serve(text,style = 'ent')




Using the 'ent' visualizer
Serving on http://0.0.0.0:5000 ...

Shutting down server on port 5000.
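
`displacy.serve` starts a local web server, which is why the cells above report "Serving on ..." and only finish once the server is shut down. Inside a notebook, `displacy.render` may be more convenient because it returns (or displays) the markup directly. The sketch below uses displaCy's manual mode so that it runs without a model; the entity spans in `manual_doc` are hand-written from the NER output above, not produced by the pipeline.

```python
from spacy import displacy

# Hand-written input for manual mode: offsets copied from the NER output above.
manual_doc = {
    "text": "Delhi is a capital of india.Total states in india are 28",
    "ents": [
        {"start": 0, "end": 5, "label": "GPE"},
        {"start": 22, "end": 27, "label": "GPE"},
        {"start": 44, "end": 49, "label": "GPE"},
        {"start": 54, "end": 56, "label": "CARDINAL"},
    ],
    "title": None,
}

# jupyter=False forces render to return the HTML string instead of displaying it.
html = displacy.render(manual_doc, style="ent", manual=True, jupyter=False)
print(html[:200])
```

With a parsed `Doc` from `nlp(...)`, the same call without `manual=True` should work, and passing `jupyter=True` displays the result inline in the notebook.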


#### Word vectors

In [27]:
nlp = spacy.load("en_core_web_md")
text = nlp("banana bear lion book xyz sdbfrofewo")

for token in text:
    print(token.text,token.has_vector,token.vector_norm,token.is_oov)

banana True 6.700014 False
bear True 5.881604 False
lion True 6.5120897 False
book True 6.4986963 False
xyz True 6.2403135 False
sdbfrofewo False 0.0 True
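
The `vector_norm` column above is the L2 (Euclidean) norm of each token's vector, which is why the out-of-vocabulary token with no vector reports `0.0`. A minimal sketch of that computation in plain Python, using made-up toy vectors rather than real embeddings:

```python
import math


def l2_norm(vector):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in vector))


toy_vector = [3.0, 4.0]        # hypothetical 2-d "embedding"
zero_vector = [0.0, 0.0, 0.0]  # stands in for a token without a vector

print(l2_norm(toy_vector))
print(l2_norm(zero_vector))
```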


#### Similarity check

In [28]:
text = nlp("lion bear cow book spinach")

for token1 in text:
    for token2 in text:
        print(token1.text,token2.text,token1.similarity(token2))

lion lion 1.0
lion bear 0.6390859
lion cow 0.4780627
lion book 0.13908103
lion spinach 0.10201545
bear lion 0.6390859
bear bear 1.0
bear cow 0.43222296
bear book 0.22685005
bear spinach 0.125805
cow lion 0.4780627
cow bear 0.43222296
cow cow 1.0
cow book 0.09767053
cow spinach 0.318152
book lion 0.13908103
book bear 0.22685005
book cow 0.09767053
book book 1.0
book spinach 0.06958954
spinach lion 0.10201545
spinach bear 0.125805
spinach cow 0.318152
spinach book 0.06958954
spinach spinach 1.0
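
By default, spaCy's `similarity` is the cosine similarity of the two vectors, which explains why the table above is symmetric and why every word scores 1.0 against itself. The sketch below reproduces that pattern with hypothetical three-dimensional toy vectors (the numbers are invented for illustration) and uses `itertools.combinations` so that each unordered pair is computed only once.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no direction to compare for an all-zero vector
    return dot / (norm_a * norm_b)


# Invented toy vectors; real word vectors have hundreds of dimensions.
toy_vectors = {
    "lion": [0.9, 0.8, 0.1],
    "bear": [0.8, 0.9, 0.2],
    "book": [0.1, 0.0, 0.9],
}

# One score per unordered pair, skipping the redundant mirrored entries.
pairs = {
    (a, b): cosine_similarity(toy_vectors[a], toy_vectors[b])
    for a, b in combinations(toy_vectors, 2)
}

for (a, b), score in pairs.items():
    print(a, b, round(score, 3))
```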


In [32]:
text = nlp("Wikipedia was launched on January 15, 2001, by Jimmy Wales and Larry Sanger; Sanger coined its name as a portmanteau of  and Initially available only in English, versions in other languages were quickly developed. The English Wikipedia, with 6.3 million articles as of March 2021, is the largest of the 321 language editions. Combined, Wikipedia's editions comprise more than 56 million articles, and attract more than 17 million "
           "edits and more than 1.7 billion unique visitors per month. "
        "Wikipedia has been criticized for its uneven accuracy and for exhibiting systemic bias, particularly gender bias, with the majority of editors being male.[11] In 2006, Time magazine stated that the open-door policy of allowing anyone to edit had made Wikipedia t")
displacy.serve(text,style = 'ent')




Using the 'ent' visualizer
Serving on http://0.0.0.0:5000 ...

Shutting down server on port 5000.
