## NLP with TextBlob

We've seen NLP functionality in NLTK and UDPipe. It turns out textblob offers quite a bit of it too. Do remember that textblob uses NLTK as a back end for some of its NLP and interfaces with Google Translate for some other features.

Let's quickly see some samples.

The plan is to demo [1] basic tokenization, [2] POS tagging, [3] noun-phrase extraction, [4] sentiment analysis and [5] spelling correction with textblob.

In [1]:
from textblob import TextBlob

test1 = TextBlob(""" Python is a high-level, general-purpose programming language.
I am intentionally putting some spelling mistakes in here like choclate, vanila, pyton, gogle, etc..
Textblob is so cool, easy andamazing to use, isn't it?! 
Feel freee to addd anyting else here. """)

print(test1)

 Python is a high-level, general-purpose programming language.
I am intentionally putting some spelling mistakes in here like choclate, vanila, pyton, gogle, etc..
Textblob is so cool, easy andamazing to use, isn't it?! 
Feel freee to addd anyting else here. 


In [2]:
# Basic tokenization
words1 = test1.words
print(words1[:5])
print("\n")
print(test1.sentences)

['Python', 'is', 'a', 'high-level', 'general-purpose']


[Sentence(" Python is a high-level, general-purpose programming language."), Sentence("I am intentionally putting some spelling mistakes in here like choclate, vanila, pyton, gogle, etc..
Textblob is so cool, easy andamazing to use, isn't it?!"), Sentence("Feel freee to addd anyting else here.")]


In [3]:
# POS tagging with the blob
tags_list = test1.tags  # Penn Treebank-style tags, as (word, tag) tuples
print(tags_list[:9])

[('Python', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('high-level', 'JJ'), ('general-purpose', 'JJ'), ('programming', 'NN'), ('language', 'NN'), ('I', 'PRP'), ('am', 'VBP')]
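
Once the tags are in hand, ordinary Python is enough to summarise them. The sketch below is only an illustration: it hard-codes the nine `(word, tag)` pairs printed above so that it runs without textblob installed, and the names `tag_counts` and `nouns` are my own, not part of textblob.

```python
from collections import Counter

# (word, tag) pairs copied from the output above, so this runs without textblob.
tags_list = [('Python', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('high-level', 'JJ'),
             ('general-purpose', 'JJ'), ('programming', 'NN'), ('language', 'NN'),
             ('I', 'PRP'), ('am', 'VBP')]

# How often does each tag occur?
tag_counts = Counter(tag for _, tag in tags_list)

# Penn Treebank noun tags all start with "NN" (NN, NNS, NNP, NNPS).
nouns = [word for word, tag in tags_list if tag.startswith("NN")]

print(tag_counts)
print(nouns)
```

With a live blob, the same two lines should work on `test1.tags` directly, since it is a list of tuples of the same shape.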


In [4]:
# noun phrase extraction
np_list = test1.noun_phrases  
print(np_list)

['python', 'textblob', 'feel']


In [5]:
# Sentiment analysis with textblob
# Yields polarity in [-1, 1] and subjectivity in [0, 1];
# subjectivity = 0 means a fully objective sentence.
print(test1.sentiment)
print("\n")
for sentence in test1.sentences:
    print(sentence.sentiment)


Sentiment(polarity=0.44583333333333336, subjectivity=0.7416666666666667)


Sentiment(polarity=0.0, subjectivity=0.0)
Sentiment(polarity=0.44583333333333336, subjectivity=0.7416666666666667)
Sentiment(polarity=0.0, subjectivity=0.0)
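
Raw polarity scores are often easier to read as coarse labels. Here is a minimal sketch of that idea, assuming a cut-off of 0.1 that I picked arbitrarily. The `Sentiment` namedtuple below is a stand-in I define so the snippet runs without textblob, and `label` is a hypothetical helper, not a textblob function. The scores are copied from the per-sentence output above.

```python
from collections import namedtuple

# Stand-in for the (polarity, subjectivity) result; values copied from the output above.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])
sentence_scores = [
    Sentiment(0.0, 0.0),
    Sentiment(0.44583333333333336, 0.7416666666666667),
    Sentiment(0.0, 0.0),
]

def label(score, threshold=0.1):
    """Map polarity to a coarse label; the 0.1 threshold is an arbitrary assumption."""
    if score.polarity > threshold:
        return "positive"
    if score.polarity < -threshold:
        return "negative"
    return "neutral"

labels = [label(score) for score in sentence_scores]
print(labels)
```

Only the middle sentence comes out as positive here, which matches the scores: the first and last sentences have zero polarity.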


In [6]:
# Spelling Correction
## Use the correct() method to attempt spelling correction.
spell_checked = test1.correct()
print(spell_checked)

 Python is a high-level, general-purpose programming language.
I am intentionally putting some spelling mistakes in here like chocolate, manila, platon, gone, etc..
Textblob is so cool, easy andamazing to use, isn't it?! 
Feel free to added anything else here. 


In [7]:
# Spell checking on 'Word' objects via .spellcheck(), which returns (candidate, confidence) pairs
from textblob import Word
for word1 in test1.words:
    print(word1.spellcheck())

[('Python', 0.0)]
[('is', 1.0)]
[('a', 1.0)]
[('high-level', 0.0)]
[('general-purpose', 0.0)]
[('programming', 1.0)]
[('language', 1.0)]
[('I', 1.0)]
[('am', 1.0)]
[('intentionally', 1.0)]
[('putting', 1.0)]
[('some', 1.0)]
[('spelling', 1.0)]
[('mistakes', 1.0)]
[('in', 1.0)]
[('here', 1.0)]
[('like', 1.0)]
[('chocolate', 1.0)]
[('manila', 0.6), ('vanilla', 0.4)]
[('platon', 0.45614035087719296), ('ton', 0.12280701754385964), ('patron', 0.07017543859649122), ('pon', 0.05263157894736842), ('lyon', 0.05263157894736842), ('anton', 0.05263157894736842), ('seton', 0.03508771929824561), ('piston', 0.03508771929824561), ('alton', 0.03508771929824561), ('yon', 0.017543859649122806), ('eton', 0.017543859649122806), ('dayton', 0.017543859649122806), ('byron', 0.017543859649122806), ('baton', 0.017543859649122806)]
[('gone', 0.3187250996015936), ('gold', 0.16600265604249667), ('sole', 0.09296148738379814), ('noble', 0.06374501992031872), ('role', 0.055776892430278883), ('angle', 0.05046480743691
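
The confidences explain the odd results from `correct()` earlier: it appears to take the top-ranked candidate, so 'vanila' becomes 'manila' (0.6) rather than 'vanilla' (0.4). One way to be more conservative is to accept a correction only when the top candidate is confident enough. The sketch below assumes a 0.9 cut-off chosen purely for illustration. `safe_correct` is a hypothetical helper of mine, and the suggestions are copied from the output above so the snippet runs without textblob.

```python
# Suggestions copied from the spellcheck() output above.
suggestions = {
    "Python": [("Python", 0.0)],
    "choclate": [("chocolate", 1.0)],
    "vanila": [("manila", 0.6), ("vanilla", 0.4)],
}

def safe_correct(word, candidates, min_confidence=0.9):
    """Return the top candidate only if its confidence clears the (assumed) cut-off."""
    best, confidence = candidates[0]  # candidates are ranked best first
    return best if confidence >= min_confidence else word

corrected = {word: safe_correct(word, cands) for word, cands in suggestions.items()}
print(corrected)
```

With this guard, 'choclate' is still fixed while 'vanila' is left alone instead of being turned into 'manila'.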

While most of the other NLP functions we saw are available elsewhere, I particularly wanted to demo the last one, spell checking, here.
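
For a feel of where those confidence numbers might come from: textblob's documentation describes its spelling correction as based on Peter Norvig's "How to Write a Spelling Corrector". Below is a rough standard-library sketch of that idea, not textblob's actual code. It only looks one edit away (the real approach also considers two edits), and `WORD_COUNTS` is a toy frequency table with counts I made up so that 'vanila' yields the same 0.6 / 0.4 split seen above.

```python
from collections import Counter

# Hypothetical toy word counts; a real corrector learns these from a large corpus.
WORD_COUNTS = Counter({"manila": 3, "vanilla": 2, "chocolate": 5, "free": 4})
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)

def suggest(word):
    """Return (candidate, confidence) pairs, most frequent candidate first."""
    if word in WORD_COUNTS:
        return [(word, 1.0)]  # known word: nothing to fix
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return [(word, 0.0)]  # unknown word with no known neighbours
    total = sum(WORD_COUNTS[w] for w in candidates)
    ranked = sorted(candidates, key=lambda w: WORD_COUNTS[w], reverse=True)
    # Confidence is the candidate's share of the total count among all candidates.
    return [(w, WORD_COUNTS[w] / total) for w in ranked]

for misspelt in ["choclate", "vanila", "freee", "pyton"]:
    print(misspelt, suggest(misspelt))
```

The takeaway is that a candidate's confidence is just its relative frequency among the known words reachable by small edits, which is why a more common word like 'manila' can beat the intended 'vanilla'.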

I'm yet to reticulate textblob into R, but I intend to get there soon enough.

Well, with that, I'll close this markdown.

Ciao
Sudhir