
# **TEXTBLOB**

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common **natural language processing** (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

In [3]:
from textblob import TextBlob

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
text=TextBlob("Article 370 of the Indian constitution gave special status to Jammu and Kashmir, a region located in the northern part of Indian subcontinent and part of the larger region of Kashmir which has been the subject of a dispute between India, Pakistan and China since 1947")

In [11]:
type(text)

textblob.blob.TextBlob

### **POS Tagging using TextBlob**

Part-of-speech (POS) tagging is the task of assigning a part of speech (noun, verb, adjective, and so on) to each word in a text. In TextBlob, the tags property returns the words of a blob as a list of (word, tag) tuples. POS taggers are commonly grouped into three families:



1.   Rule-based POS tagging 
2.   Stochastic POS Tagging
3.   Transformation-based tagging






In [31]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('brown')
nltk.download('wordnet')
nltk.download('omw-1.4')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package brown to /root/nltk_data...
[nltk_data]   Package brown is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...


True

In [19]:
sentence=text.tags
print(sentence)

[('Article', 'NN'), ('370', 'CD'), ('of', 'IN'), ('the', 'DT'), ('Indian', 'JJ'), ('constitution', 'NN'), ('gave', 'VBD'), ('special', 'JJ'), ('status', 'NN'), ('to', 'TO'), ('Jammu', 'NNP'), ('and', 'CC'), ('Kashmir', 'NNP'), ('a', 'DT'), ('region', 'NN'), ('located', 'VBN'), ('in', 'IN'), ('the', 'DT'), ('northern', 'JJ'), ('part', 'NN'), ('of', 'IN'), ('Indian', 'JJ'), ('subcontinent', 'NN'), ('and', 'CC'), ('part', 'NN'), ('of', 'IN'), ('the', 'DT'), ('larger', 'JJR'), ('region', 'NN'), ('of', 'IN'), ('Kashmir', 'NNP'), ('which', 'WDT'), ('has', 'VBZ'), ('been', 'VBN'), ('the', 'DT'), ('subject', 'NN'), ('of', 'IN'), ('a', 'DT'), ('dispute', 'NN'), ('between', 'IN'), ('India', 'NNP'), ('Pakistan', 'NNP'), ('and', 'CC'), ('China', 'NNP'), ('since', 'IN'), ('1947', 'CD')]
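
Because the tags property returns an ordinary list of (word, tag) tuples, the result can be filtered with plain Python. The sketch below copies a few pairs from the output above and keeps only the proper nouns (tag NNP). The helper name words_with_tag is hypothetical and not part of TextBlob.

```python
# A few (word, tag) pairs copied from the tagged output above.
tags = [
    ('Article', 'NN'), ('370', 'CD'), ('Indian', 'JJ'),
    ('Jammu', 'NNP'), ('Kashmir', 'NNP'), ('India', 'NNP'), ('1947', 'CD'),
]

def words_with_tag(tagged, prefix):
    """Return the words whose POS tag starts with the given prefix.

    Hypothetical helper for illustration; it is not a TextBlob function.
    """
    return [word for word, tag in tagged if tag.startswith(prefix)]

proper_nouns = words_with_tag(tags, 'NNP')
print(proper_nouns)  # ['Jammu', 'Kashmir', 'India']
```

Matching on a prefix rather than the exact tag means that 'NN' would also pick up 'NNP' and 'NNS', which may or may not be what is wanted.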


### **Noun Phrases using TextBlob**

In [22]:
# List of all noun phrases in the example sentence
text.noun_phrases

WordList(['article', 'indian constitution', 'special status', 'jammu', 'kashmir', 'northern part', 'indian subcontinent', 'kashmir', 'india', 'pakistan', 'china'])

### **Sentiment Analysis using TextBlob**

Sentiment analysis is the process of determining whether a piece of text expresses a positive, negative, or neutral opinion. It can help gauge the mood and feelings of the general public and extract useful information about the context in which a topic is discussed.

For an input text, TextBlob’s sentiment property returns a named tuple with polarity and subjectivity scores. Polarity ranges from -1.0 (negative sentiment) to 1.0 (positive sentiment). Subjectivity ranges from 0.0 (an objective statement) to 1.0 (a subjective statement). Negation words such as “not” change the polarity of the words they modify.

In [23]:
text.sentiment

Sentiment(polarity=0.0634920634920635, subjectivity=0.4682539682539682)
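
A polarity score is often easier to work with once it is mapped to a coarse label. The following is a minimal sketch in plain Python, not a TextBlob feature: the function name label_polarity and the threshold of 0.1 are assumptions chosen only for illustration.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    The threshold is an arbitrary illustrative choice, not a TextBlob default.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

score = 0.0634920634920635  # polarity copied from the output above
label = label_polarity(score)
print(label)  # neutral
```

With this threshold, the Article 370 sentence is labelled neutral, which fits its factual tone. A different threshold would move the boundary.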

### **Tokenization using TextBlob**

We can easily break a text down into words or sentences. The words and sentences properties do this.

In [24]:
text.words

WordList(['Article', '370', 'of', 'the', 'Indian', 'constitution', 'gave', 'special', 'status', 'to', 'Jammu', 'and', 'Kashmir', 'a', 'region', 'located', 'in', 'the', 'northern', 'part', 'of', 'Indian', 'subcontinent', 'and', 'part', 'of', 'the', 'larger', 'region', 'of', 'Kashmir', 'which', 'has', 'been', 'the', 'subject', 'of', 'a', 'dispute', 'between', 'India', 'Pakistan', 'and', 'China', 'since', '1947'])

In [26]:
text.sentences

[Sentence("Article 370 of the Indian constitution gave special status to Jammu and Kashmir, a region located in the northern part of Indian subcontinent and part of the larger region of Kashmir which has been the subject of a dispute between India, Pakistan and China since 1947")]

### **Word Inflection using TextBlob**

We can easily singularize and pluralize words with the “singularize” and “pluralize” methods, respectively.

In [27]:
text.words[4].pluralize() # the word "Indian"

'Indians'

### **Lemmatization using TextBlob**

In [35]:
from textblob import Word
w = Word("radii")
w.lemmatize()

'radius'

In [36]:
w = Word("went")
w.lemmatize("v")

'go'

### **Definition using TextBlob**

TextBlob can also look up the definitions of a word. The “definitions” property of a Word returns a list of definition strings drawn from WordNet.


In [37]:
Word("blog").definitions

['a shared on-line journal where people can post diary entries about their personal experiences and hobbies',
 'read, write, or edit a shared on-line journal']

In [38]:
Word("Admire").definitions

['feel admiration for', 'look at with admiration']

In [39]:
Word("Persuade").definitions

['win approval or support for',
 "cause somebody to adopt a certain position, belief, or course of action; twist somebody's arm"]

In [40]:
Word("synsets").definitions

['a set of one or more synonyms']

Synsets: sets of one or more synonyms. The “synsets” property lists the WordNet synsets that a word belongs to.

In [41]:
word = Word("phone")
word.synsets

[Synset('telephone.n.01'),
 Synset('phone.n.02'),
 Synset('earphone.n.01'),
 Synset('call.v.03')]

In [42]:
word = Word("Adorable")
word.synsets

[Synset('adorable.s.01')]

In [45]:
word = Word("best")
print(word.synsets)

[Synset('best.n.01'), Synset('best.n.02'), Synset('best.n.03'), Synset('outdo.v.02'), Synset('best.a.01'), Synset('better.s.03'), Synset('good.a.01'), Synset('full.s.06'), Synset('good.a.03'), Synset('estimable.s.02'), Synset('beneficial.s.01'), Synset('good.s.06'), Synset('good.s.07'), Synset('adept.s.01'), Synset('good.s.09'), Synset('dear.s.02'), Synset('dependable.s.04'), Synset('good.s.12'), Synset('good.s.13'), Synset('effective.s.04'), Synset('good.s.15'), Synset('good.s.16'), Synset('good.s.17'), Synset('good.s.18'), Synset('good.s.19'), Synset('good.s.20'), Synset('good.s.21'), Synset('best.r.01'), Synset('best.r.02'), Synset('better.r.02'), Synset('well.r.01'), Synset('well.r.02'), Synset('well.r.03'), Synset('well.r.04'), Synset('well.r.05'), Synset('well.r.06'), Synset('well.r.07'), Synset('well.r.08'), Synset('well.r.09'), Synset('well.r.10'), Synset('well.r.11'), Synset('well.r.12'), Synset('well.r.13')]


### **Spelling Correction using TextBlob**

In [46]:
my_sentence = TextBlob("I am not in denger. I am the dyangr.")
my_sentence.correct()

TextBlob("I am not in danger. I am the danger.")

In [48]:
my_sentence = TextBlob("I am the manster and you are miy slame.")  # correct() has limits: see the imperfect result below
my_sentence.correct()

TextBlob("I am the master and you are may same.")

In [49]:
# spellcheck() returns a list of (candidate, confidence) tuples for a single word
w = Word('neumonia')
w.spellcheck() 

[('pneumonia', 1.0)]

In [52]:
w = Word('sycology')
w.spellcheck()

[('psychology', 0.6666666666666666), ('sociology', 0.3333333333333333)]

In [53]:
w = Word('nife')
w.spellcheck()

[('life', 0.6231369765791341),
 ('wife', 0.2604684173172463),
 ('nine', 0.0454222853087296),
 ('nice', 0.03761533002129169),
 ('knife', 0.029808374733853796),
 ('nile', 0.0021291696238466998),
 ('rife', 0.0014194464158978)]
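
The result of spellcheck() is a plain list of (word, confidence) tuples, so the top suggestion can be picked with standard Python. The sketch below works on the list copied from the output above and does not call TextBlob itself.

```python
# Candidates copied from the spellcheck() output for 'nife' above.
candidates = [
    ('life', 0.6231369765791341),
    ('wife', 0.2604684173172463),
    ('nine', 0.0454222853087296),
    ('nice', 0.03761533002129169),
    ('knife', 0.029808374733853796),
    ('nile', 0.0021291696238466998),
    ('rife', 0.0014194464158978),
]

# Pick the candidate with the highest confidence.
best_word, best_score = max(candidates, key=lambda pair: pair[1])
print(best_word, round(best_score, 2))  # life 0.62

# In this output the confidences add up to roughly 1.
total = sum(score for _, score in candidates)
```

The highest-scoring candidate is not necessarily the intended word: if the writer meant “knife”, the top suggestion “life” would be wrong, and “knife” only ranks fifth here.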

### **Word frequencies using TextBlob**

In [61]:
betty = TextBlob("Betty Botter bought some butter. But she said the Butter’s bitter. If I put it in my batter, it will make my batter bitter. But a bit of better butter will make my batter better.")
print(betty.word_counts['butter'])  # word_counts lowercases the words, so "Butter" is counted too
print(betty.words.count('butter', case_sensitive=True))  # counts exact-case matches only

3
2
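
The difference between the two counts comes down to case handling. The sketch below reproduces it in plain Python with collections.Counter. The regular-expression tokenizer is a simplifying assumption and is not how TextBlob tokenizes, and the sentence uses a straight apostrophe in place of the curly one.

```python
import re
from collections import Counter

sentence = (
    "Betty Botter bought some butter. But she said the Butter's bitter. "
    "If I put it in my batter, it will make my batter bitter. "
    "But a bit of better butter will make my batter better."
)

# Simplified tokenizer (assumption): keep runs of ASCII letters only.
tokens = re.findall(r"[A-Za-z]+", sentence)

# Case-insensitive count: lowercase every token before counting.
lowercase_counts = Counter(token.lower() for token in tokens)
case_insensitive = lowercase_counts['butter']

# Case-sensitive count: only exact matches of 'butter'.
case_sensitive = tokens.count('butter')

print(case_insensitive)  # 3
print(case_sensitive)    # 2
```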


In [59]:
count=len(betty.words)
print(count)

37


### **Parsing using TextBlob**

The term “parsing” comes from the Latin word “pars” (which means “part”). Parsing, also called syntactic analysis or syntax analysis, determines the grammatical structure of a text by analyzing it against the rules of a formal grammar. In the result of parse(), each token is written as word/part-of-speech tag/chunk tag/prepositional noun phrase tag, with one line per sentence.

In [63]:
betty.parse()

'Betty/NNP/B-NP/O Botter/NNP/I-NP/O bought/VBD/B-VP/O some/DT/B-NP/O butter/NN/I-NP/O ././O/O\nBut/CC/O/O she/PRP/B-NP/O said/VBD/B-VP/O the/DT/B-NP/O Butter/NN/I-NP/O ’/NN/I-NP/O s/PRP/I-NP/O bitter/JJ/B-ADJP/O ././O/O\nIf/IN/B-PP/B-PNP I/PRP/B-NP/I-PNP put/VB/B-VP/O it/PRP/B-NP/O in/IN/B-PP/B-PNP my/PRP$/B-NP/I-PNP batter/NN/I-NP/I-PNP ,/,/O/O it/PRP/B-NP/O will/MD/B-VP/O make/VB/I-VP/O my/PRP$/B-NP/O batter/NN/I-NP/O bitter/JJ/B-ADJP/O ././O/O\nBut/CC/O/O a/DT/B-NP/O bit/NN/I-NP/O of/IN/B-PP/B-PNP better/JJR/B-NP/I-PNP butter/NN/I-NP/I-PNP will/MD/B-VP/O make/VB/I-VP/O my/PRP$/B-NP/O batter/NN/I-NP/O better/JJR/B-ADJP/O ././O/O'
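
The parsed string can be split back into fields with ordinary string methods. The sketch below uses the first sentence of the output above and assumes the four slash-separated fields are word, POS tag, chunk tag, and PNP tag. The helper name split_parse is hypothetical.

```python
# First sentence of the parse() output above.
parsed = (
    "Betty/NNP/B-NP/O Botter/NNP/I-NP/O bought/VBD/B-VP/O "
    "some/DT/B-NP/O butter/NN/I-NP/O ././O/O"
)

def split_parse(parsed_sentence):
    """Split 'word/POS/chunk/PNP' tokens into 4-tuples.

    rsplit is used so that a word containing '/' keeps its slashes.
    Hypothetical helper for illustration; it is not a TextBlob function.
    """
    rows = []
    for token in parsed_sentence.split():
        word, pos, chunk, pnp = token.rsplit("/", 3)
        rows.append((word, pos, chunk, pnp))
    return rows

rows = split_parse(parsed)

# Words that sit inside a noun-phrase chunk (B-NP or I-NP).
noun_phrase_words = [word for word, pos, chunk, pnp in rows if chunk.endswith("NP")]
print(noun_phrase_words)  # ['Betty', 'Botter', 'some', 'butter']
```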

### **Similarities of TextBlob with Python string**

In [68]:
my_sentence = TextBlob("Simple is better than complex.")
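my_sentence[0:16]  # slicing works as with str, but only a cell's last expression is displayed, so nothing is shown here
print(my_sentence.upper())
print(my_sentence.lower())
print(my_sentence.find("better"))
print(my_sentence.find("then"))  # returns -1: substring not found
print(my_sentence.find("strong"))  # returns -1: substring not found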

SIMPLE IS BETTER THAN COMPLEX.
simple is better than complex.
10
-1
-1


In [65]:
#my_sentence = TextBlob("Simple is better than complex.")
text[0:36]

TextBlob("Article 370 of the Indian constituti")

In [72]:
a = TextBlob("Black")
b = TextBlob("Blue water policy")
print(a + ' and ' + b)

print("{0} and {1}".format(a,b))

Black and Blue water policy
Black and Blue water policy


### **N-gram using TextBlob**

An n-gram is a contiguous sequence of n words.
N-gram models include unigram (1-gram), bigram (2-gram), and trigram (3-gram) models, as well as models with larger n. Web-scale n-gram models have been built by researchers for a number of applications, including spelling correction, word breaking, and text summarization.

1. “A cat is in the bag” taken as a whole is a 6-gram (n = 6)
2. “Say my name” taken as a whole is a 3-gram (n = 3)
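
To make the definition concrete, n-grams can be produced with a sliding window over a list of words. This is a minimal plain-Python sketch of the idea, splitting on whitespace as a simplifying assumption. It is not TextBlob's implementation.

```python
def ngrams(words, n):
    """Return every run of n consecutive words as a list of lists."""
    return [words[i:i + n] for i in range(len(words) - n + 1)]

words = "Say my name".split()

bigrams = ngrams(words, 2)
trigrams = ngrams(words, 3)

print(bigrams)   # [['Say', 'my'], ['my', 'name']]
print(trigrams)  # [['Say', 'my', 'name']]
```

A text of k words yields k - n + 1 n-grams, so a three-word sentence has exactly one 3-gram and no 4-grams.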





In [73]:
bob = TextBlob("How many roads should a man must walk before we can call him a man?")
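print(bob.ngrams(n=10))  # each WordList holds 10 consecutive words
print(bob.ngrams(n=8))  # each WordList holds 8 consecutive words
print(bob.ngrams(n=7))  # each WordList holds 7 consecutive words
print(bob.ngrams(n=5))  # each WordList holds 5 consecutive words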

[WordList(['How', 'many', 'roads', 'should', 'a', 'man', 'must', 'walk', 'before', 'we']), WordList(['many', 'roads', 'should', 'a', 'man', 'must', 'walk', 'before', 'we', 'can']), WordList(['roads', 'should', 'a', 'man', 'must', 'walk', 'before', 'we', 'can', 'call']), WordList(['should', 'a', 'man', 'must', 'walk', 'before', 'we', 'can', 'call', 'him']), WordList(['a', 'man', 'must', 'walk', 'before', 'we', 'can', 'call', 'him', 'a']), WordList(['man', 'must', 'walk', 'before', 'we', 'can', 'call', 'him', 'a', 'man'])]
[WordList(['How', 'many', 'roads', 'should', 'a', 'man', 'must', 'walk']), WordList(['many', 'roads', 'should', 'a', 'man', 'must', 'walk', 'before']), WordList(['roads', 'should', 'a', 'man', 'must', 'walk', 'before', 'we']), WordList(['should', 'a', 'man', 'must', 'walk', 'before', 'we', 'can']), WordList(['a', 'man', 'must', 'walk', 'before', 'we', 'can', 'call']), WordList(['man', 'must', 'walk', 'before', 'we', 'can', 'call', 'him']), WordList(['must', 'walk', 'be