# TextBlob

### What is TextBlob?

TextBlob is a Python library for text processing. It offers a simple API for common NLP tasks such as POS tagging, noun phrase extraction, tokenization, word and phrase frequencies, word inflection, n-grams, classification, translation and sentiment analysis.

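TextBlob is easy to understand and has detailed documentation. It also does well at language detection because it delegates to Google Translate. Since that API is not free, the limit is 1000 word translations per day. Note that `detect_language()` and `translate()` were deprecated in TextBlob 0.16.0 and are not available in recent releases, so the language detection examples in this post need an older version.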

### What is the difference between TextBlob and NLTK?

Either can be used, depending on your requirements. NLTK is a good option for beginners, but it lacks certain functionality. TextBlob combines features of both NLTK and Pattern and provides an intuitive interface on top of NLTK, so it works well for general text processing. However, for tasks like token and word integration, NLTK works better.

### Exploring TextBlob and its functionalities:

#### Import TextBlob

In [102]:
from textblob import TextBlob

#### Parts Of Speech tagging
Tags each word with its grammatical category (noun, verb, determiner and so on)

In [103]:
text=TextBlob("A day without laughter is a day wasted - Charlie Chaplin.")

In [104]:
text.tags

[('A', 'DT'),
 ('day', 'NN'),
 ('without', 'IN'),
 ('laughter', 'NN'),
 ('is', 'VBZ'),
 ('a', 'DT'),
 ('day', 'NN'),
 ('wasted', 'VBD'),
 ('Charlie', 'NNP'),
 ('Chaplin', 'NNP')]
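
Because `tags` is a plain list of `(word, tag)` tuples, it can be filtered with ordinary Python. The sketch below hard-codes the output shown above instead of calling TextBlob, and the helper name `words_with_tag_prefix` is made up for illustration. It assumes the tags follow the Penn Treebank convention, where noun tags start with `NN` and verb tags start with `VB`.

```python
# The (word, tag) pairs printed above, copied in as plain data.
tags = [
    ("A", "DT"), ("day", "NN"), ("without", "IN"), ("laughter", "NN"),
    ("is", "VBZ"), ("a", "DT"), ("day", "NN"), ("wasted", "VBD"),
    ("Charlie", "NNP"), ("Chaplin", "NNP"),
]


def words_with_tag_prefix(tagged, prefix):
    """Return the words whose POS tag starts with `prefix` (hypothetical helper)."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


nouns = words_with_tag_prefix(tags, "NN")  # matches NN, NNS, NNP, NNPS
verbs = words_with_tag_prefix(tags, "VB")  # matches VB, VBD, VBZ, ...
print(nouns)
print(verbs)
```

The same filter works on `text1.tags`, since it has the same shape.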

In [105]:
text1=TextBlob('Speech is human vocal communication using language. Each language uses phonetic combinations of vowel and consonant sounds that form the sound of its words that is, all English words sound different from all French words, even if they are the same word, e.g., "role" or "hotel", and using those words in their semantic character as words in the lexicon of a language according to the syntactic constraints that govern lexical words function in a sentence. In speaking, speakers perform many different intentional speech acts, e.g., informing, declaring, asking, persuading, directing, and can use enunciation, intonation, degrees of loudness, tempo, and other non-representational or paralinguistic aspects of vocalization to convey meaning. In their speech speakers also unintentionally communicate many aspects of their social position such as sex, age, place of origin , physical states , psychic states, physico-psychic states, education or experience and the like')

In [106]:
text1.tags

[('Speech', 'NN'),
 ('is', 'VBZ'),
 ('human', 'JJ'),
 ('vocal', 'JJ'),
 ('communication', 'NN'),
 ('using', 'VBG'),
 ('language', 'NN'),
 ('Each', 'DT'),
 ('language', 'NN'),
 ('uses', 'VBZ'),
 ('phonetic', 'JJ'),
 ('combinations', 'NNS'),
 ('of', 'IN'),
 ('vowel', 'NN'),
 ('and', 'CC'),
 ('consonant', 'JJ'),
 ('sounds', 'NNS'),
 ('that', 'WDT'),
 ('form', 'VBP'),
 ('the', 'DT'),
 ('sound', 'NN'),
 ('of', 'IN'),
 ('its', 'PRP$'),
 ('words', 'NNS'),
 ('that', 'DT'),
 ('is', 'VBZ'),
 ('all', 'DT'),
 ('English', 'NNP'),
 ('words', 'NNS'),
 ('sound', 'JJ'),
 ('different', 'JJ'),
 ('from', 'IN'),
 ('all', 'DT'),
 ('French', 'JJ'),
 ('words', 'NNS'),
 ('even', 'RB'),
 ('if', 'IN'),
 ('they', 'PRP'),
 ('are', 'VBP'),
 ('the', 'DT'),
 ('same', 'JJ'),
 ('word', 'NN'),
 ('e.g.', 'NN'),
 ('role', 'NN'),
 ('or', 'CC'),
 ('hotel', 'NN'),
 ('and', 'CC'),
 ('using', 'VBG'),
 ('those', 'DT'),
 ('words', 'NNS'),
 ('in', 'IN'),
 ('their', 'PRP$'),
 ('semantic', 'JJ'),
 ('character', 'NN'),
 ('as', 'IN')

#### Noun Phrases
Extracts noun phrases from a given string

In [107]:
for noun in text1.noun_phrases:
    print(noun)

speech
human vocal communication
language uses phonetic combinations
consonant sounds
english
french words
semantic character
syntactic constraints
lexical words function
different intentional speech acts
paralinguistic aspects
speech speakers
social position
physical states
psychic states
physico-psychic states


In [108]:
text.noun_phrases

WordList(['charlie chaplin'])

In [109]:
# introducing the article "a" changes the wordlist and splits the phrase further
text2=TextBlob("Each language uses a phonetic combination of vowel and consonant sounds that form the sound of its words.")

In [110]:
text2.noun_phrases

WordList(['language uses', 'phonetic combination', 'consonant sounds'])

#### Tokenization
Splits the given string into a list of words or sentences

In [111]:
text.words

WordList(['A', 'day', 'without', 'laughter', 'is', 'a', 'day', 'wasted', 'Charlie', 'Chaplin'])

In [112]:
text.sentences

[Sentence("A day without laughter is a day wasted - Charlie Chaplin.")]

In [113]:
text1.words

WordList(['Speech', 'is', 'human', 'vocal', 'communication', 'using', 'language', 'Each', 'language', 'uses', 'phonetic', 'combinations', 'of', 'vowel', 'and', 'consonant', 'sounds', 'that', 'form', 'the', 'sound', 'of', 'its', 'words', 'that', 'is', 'all', 'English', 'words', 'sound', 'different', 'from', 'all', 'French', 'words', 'even', 'if', 'they', 'are', 'the', 'same', 'word', 'e.g', 'role', 'or', 'hotel', 'and', 'using', 'those', 'words', 'in', 'their', 'semantic', 'character', 'as', 'words', 'in', 'the', 'lexicon', 'of', 'a', 'language', 'according', 'to', 'the', 'syntactic', 'constraints', 'that', 'govern', 'lexical', 'words', 'function', 'in', 'a', 'sentence', 'In', 'speaking', 'speakers', 'perform', 'many', 'different', 'intentional', 'speech', 'acts', 'e.g', 'informing', 'declaring', 'asking', 'persuading', 'directing', 'and', 'can', 'use', 'enunciation', 'intonation', 'degrees', 'of', 'loudness', 'tempo', 'and', 'other', 'non-representational', 'or', 'paralinguistic', 'asp

In [114]:
text1.sentences

[Sentence("Speech is human vocal communication using language."),
 Sentence("Each language uses phonetic combinations of vowel and consonant sounds that form the sound of its words that is, all English words sound different from all French words, even if they are the same word, e.g., "role" or "hotel", and using those words in their semantic character as words in the lexicon of a language according to the syntactic constraints that govern lexical words function in a sentence."),
 Sentence("In speaking, speakers perform many different intentional speech acts, e.g., informing, declaring, asking, persuading, directing, and can use enunciation, intonation, degrees of loudness, tempo, and other non-representational or paralinguistic aspects of vocalization to convey meaning."),
 Sentence("In their speech speakers also unintentionally communicate many aspects of their social position such as sex, age, place of origin , physical states , psychic states, physico-psychic states, education or ex
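
To make the idea of tokenization concrete, here is a rough plain-Python sketch using regular expressions. It is not how TextBlob tokenizes (TextBlob builds on NLTK's tokenizers); the function names `simple_words` and `simple_sentences` and both patterns are assumptions chosen for illustration.

```python
import re


def simple_words(text):
    """Rough word tokens: runs of letters, digits, apostrophes and hyphens
    that start with a letter or digit, so stray punctuation is dropped."""
    return re.findall(r"[A-Za-z0-9][A-Za-z0-9'-]*", text)


def simple_sentences(text):
    """Naive sentence split: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


quote = "A day without laughter is a day wasted - Charlie Chaplin."
words = simple_words(quote)
sentences = simple_sentences(
    "Speech is human vocal communication using language. "
    "Each language uses phonetic combinations."
)
print(words)
print(sentences)
```

For the Chaplin quote this reproduces the word list shown above, but the naive rules break down quickly: an abbreviation such as "e.g." is split into separate pieces and treated as a sentence end, which is exactly the kind of case a real tokenizer handles.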

#### Lemmatization

In [115]:
from textblob import Word

In [116]:
w=Word("birds")

In [117]:
print("birds :", w.lemmatize()) 

birds : bird


In [118]:
l=Word("left")

In [119]:
#Verb form of left using "v"
print("left :", l.lemmatize("v")) 

left : leave


#### WordLists

In [96]:
# Conversion to singular and plural forms
sent=TextBlob("Happy days are yet to come")

In [98]:
# Avoid naming the variable `list`, which would shadow the Python builtin
word_list=sent.words
word_list

WordList(['Happy', 'days', 'are', 'yet', 'to', 'come'])

In [99]:
text.words[1].singularize()

'day'

In [101]:
text.words[1].pluralize()

'days'

#### Parsing
Creating a parse sequence using .parse()

In [120]:
text.parse()

'A/DT/B-NP/O day/NN/I-NP/O without/IN/B-PP/B-PNP laughter/NN/B-NP/I-PNP is/VBZ/B-VP/O a/DT/B-NP/O day/NN/I-NP/O wasted/VBN/B-VP/O -/:/O/O Charlie/NNP/B-NP/O Chaplin/NNP/I-NP/O ././O/O'
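
Each token in the parse string appears to carry four slash-separated fields: the word, its POS tag, a chunk tag (`B-NP` begins a noun phrase, `I-NP` continues one) and a prepositional noun phrase tag. The sketch below works on the string printed above, copied in as a literal, and regroups the noun phrase chunks. The helper names `split_tokens` and `noun_chunks` are hypothetical and not part of TextBlob.

```python
# The output of text.parse() shown above, copied in as a literal.
parsed = (
    "A/DT/B-NP/O day/NN/I-NP/O without/IN/B-PP/B-PNP laughter/NN/B-NP/I-PNP "
    "is/VBZ/B-VP/O a/DT/B-NP/O day/NN/I-NP/O wasted/VBN/B-VP/O -/:/O/O "
    "Charlie/NNP/B-NP/O Chaplin/NNP/I-NP/O ././O/O"
)


def split_tokens(parse_string):
    """Split into (word, pos, chunk, pnp) tuples.
    rsplit keeps the last three fields intact for tokens like '././O/O'."""
    return [tuple(token.rsplit("/", 3)) for token in parse_string.split()]


def noun_chunks(tokens):
    """Join consecutive B-NP / I-NP tokens into noun phrase strings."""
    chunks, current = [], []
    for word, _pos, chunk, _pnp in tokens:
        if chunk == "B-NP":
            if current:
                chunks.append(" ".join(current))
            current = [word]
        elif chunk == "I-NP" and current:
            current.append(word)
        else:
            if current:
                chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


tokens = split_tokens(parsed)
print(tokens[0])
print(noun_chunks(tokens))
```

Note that these chunk groups differ from `text.noun_phrases`, which uses its own extractor and returned only `charlie chaplin` for the same sentence.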

In [124]:
text1.parse()

'Speech/NN/B-NP/O is/VBZ/B-VP/O human/JJ/B-NP/O vocal/JJ/I-NP/O communication/NN/I-NP/O using/VBG/B-VP/O language/NN/B-NP/O ././O/O\nEach/DT/B-NP/O language/NN/I-NP/O uses/VBZ/B-VP/O phonetic/JJ/B-NP/O combinations/NNS/I-NP/O of/IN/B-PP/B-PNP vowel/NN/B-NP/I-PNP and/CC/O/O consonant/JJ/B-ADJP/O sounds/VBZ/B-VP/O that/IN/B-PP/B-PNP form/NN/B-NP/I-PNP the/DT/I-NP/I-PNP sound/NN/I-NP/I-PNP of/IN/B-PP/B-PNP its/PRP$/B-NP/I-PNP words/NNS/I-NP/I-PNP that/IN/B-PP/O is/VBZ/B-VP/O ,/,/O/O all/DT/B-NP/O English/NNP/I-NP/O words/NNS/I-NP/O sound/NN/I-NP/O different/JJ/B-ADJP/O from/IN/B-PP/B-PNP all/DT/B-NP/I-PNP French/JJ/I-NP/I-PNP words/NNS/I-NP/I-PNP ,/,/O/O even/RB/B-ADVP/O if/IN/B-PP/B-PNP they/PRP/B-NP/I-PNP are/VBP/B-VP/O the/DT/B-NP/O same/JJ/I-NP/O word/NN/I-NP/O ,/,/O/O e.g./FW/O/O ,/,/O/O "/"/O/O role/NN/B-NP/O "/"/O/O or/CC/O/O "/"/O/O hotel/NN/B-NP/O "/"/O/O ,/,/O/O and/CC/O/O using/VBG/B-VP/O those/DT/B-NP/O words/NNS/I-NP/O in/IN/B-PP/B-PNP their/PRP$/B-NP/I-PNP semantic/JJ/I-NP/I

#### Frequency with words.count()

In [128]:
text.words.count('day')

2

In [129]:
text.words.count('charlie')

1

In [130]:
text.words.count('charlie', case_sensitive=True) #Treats upper and lower case as different

0

In [131]:
text1.words.count('speech')

3

In [133]:
text1.words.count('Speech', case_sensitive=True)

1
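
The effect of `case_sensitive` can be mimicked in plain Python. This is only a sketch of the behaviour seen in the outputs above, not TextBlob's implementation; `count_word` is a hypothetical helper and the word list is copied from the `text.words` output earlier.

```python
# Words of the Chaplin quote, as returned by text.words above.
words = ["A", "day", "without", "laughter", "is", "a", "day",
         "wasted", "Charlie", "Chaplin"]


def count_word(word_list, target, case_sensitive=False):
    """Count occurrences of `target`; ignore case unless case_sensitive=True."""
    if case_sensitive:
        return sum(1 for w in word_list if w == target)
    target = target.lower()
    return sum(1 for w in word_list if w.lower() == target)


print(count_word(words, "day"))
print(count_word(words, "charlie"))
print(count_word(words, "charlie", case_sensitive=True))
```

The default is case-insensitive, which is why `'charlie'` matches `'Charlie'` once, while the case-sensitive lookup finds nothing.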

#### Wordnet Integration
Shows the WordNet synsets (groups of related senses) available for a given word

In [134]:
from textblob.wordnet import VERB

In [135]:
w=Word("play")

In [137]:
w.synsets

[Synset('play.n.01'),
 Synset('play.n.02'),
 Synset('play.n.03'),
 Synset('maneuver.n.03'),
 Synset('play.n.05'),
 Synset('play.n.06'),
 Synset('bid.n.02'),
 Synset('play.n.08'),
 Synset('playing_period.n.01'),
 Synset('free_rein.n.01'),
 Synset('shimmer.n.01'),
 Synset('fun.n.02'),
 Synset('looseness.n.05'),
 Synset('play.n.14'),
 Synset('turn.n.03'),
 Synset('gambling.n.01'),
 Synset('play.n.17'),
 Synset('play.v.01'),
 Synset('play.v.02'),
 Synset('play.v.03'),
 Synset('act.v.03'),
 Synset('play.v.05'),
 Synset('play.v.06'),
 Synset('play.v.07'),
 Synset('act.v.05'),
 Synset('play.v.09'),
 Synset('play.v.10'),
 Synset('play.v.11'),
 Synset('play.v.12'),
 Synset('play.v.13'),
 Synset('play.v.14'),
 Synset('play.v.15'),
 Synset('play.v.16'),
 Synset('play.v.17'),
 Synset('play.v.18'),
 Synset('toy.v.02'),
 Synset('play.v.20'),
 Synset('dally.v.04'),
 Synset('play.v.22'),
 Synset('dally.v.01'),
 Synset('play.v.24'),
 Synset('act.v.10'),
 Synset('play.v.26'),
 Synset('bring.v.03'),
 Syn

#### Definitions

In [138]:
Word("butterfly").definitions

['diurnal insect typically having a slender body with knobbed antennae and broad colorful wings',
 'a swimming stroke in which the arms are thrown forward together out of the water while the feet kick up and down',
 'flutter like a butterfly',
 'cut and spread open, as in preparation for cooking',
 'talk or behave amorously, without serious intentions']

In [139]:
Word("moon").definitions

['the natural satellite of the Earth',
 'any object resembling a moon',
 'the period between successive new moons (29.531 days)',
 'the light of the Moon',
 'United States religious leader (born in Korea) who founded the Unification Church in 1954; was found guilty of conspiracy to evade taxes (born in 1920)',
 'any natural satellite of a planet',
 'have dreamlike musings or fantasies while awake',
 'be idle in a listless or dreamy way',
 "expose one's buttocks to"]

#### N-grams
Returns a list of WordLists, each holding n successive words from the string

In [140]:
text.ngrams(n=2)

[WordList(['A', 'day']),
 WordList(['day', 'without']),
 WordList(['without', 'laughter']),
 WordList(['laughter', 'is']),
 WordList(['is', 'a']),
 WordList(['a', 'day']),
 WordList(['day', 'wasted']),
 WordList(['wasted', 'Charlie']),
 WordList(['Charlie', 'Chaplin'])]

In [141]:
text.ngrams(n=4)

[WordList(['A', 'day', 'without', 'laughter']),
 WordList(['day', 'without', 'laughter', 'is']),
 WordList(['without', 'laughter', 'is', 'a']),
 WordList(['laughter', 'is', 'a', 'day']),
 WordList(['is', 'a', 'day', 'wasted']),
 WordList(['a', 'day', 'wasted', 'Charlie']),
 WordList(['day', 'wasted', 'Charlie', 'Chaplin'])]

In [142]:
text.ngrams(n=7)

[WordList(['A', 'day', 'without', 'laughter', 'is', 'a', 'day']),
 WordList(['day', 'without', 'laughter', 'is', 'a', 'day', 'wasted']),
 WordList(['without', 'laughter', 'is', 'a', 'day', 'wasted', 'Charlie']),
 WordList(['laughter', 'is', 'a', 'day', 'wasted', 'Charlie', 'Chaplin'])]
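
An n-gram list is just a sliding window over the words. The following plain-Python sketch reproduces the counts seen above (9 bigrams, 7 four-grams and 4 seven-grams for a 10-word sentence); the function `ngrams` here is an illustrative stand-in written for this post, not TextBlob's method.

```python
# Words of the Chaplin quote, as returned by text.words above.
words = ["A", "day", "without", "laughter", "is", "a", "day",
         "wasted", "Charlie", "Chaplin"]


def ngrams(word_list, n):
    """Return every run of n consecutive words, in order.
    A list of k words yields k - n + 1 n-grams (none if n > k or n <= 0)."""
    if n <= 0:
        return []
    return [word_list[i:i + n] for i in range(len(word_list) - n + 1)]


bigrams = ngrams(words, 2)
print(len(bigrams), bigrams[0], bigrams[-1])
print(len(ngrams(words, 4)), len(ngrams(words, 7)))
```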

#### Sentiment Analysis
Sentiment analysis indicates whether a given sentence or string is positive, negative or neutral.
The `sentiment` property returns polarity and subjectivity.
Polarity ranges from -1 to 1, where -1 is negative, 0 is neutral and 1 is positive.
Subjectivity ranges from 0 to 1, where 0 is objective and 1 is subjective.

In [146]:
text

TextBlob("A day without laughter is a day wasted - Charlie Chaplin.")

In [147]:
text.sentiment

Sentiment(polarity=-0.2, subjectivity=0.0)

In [148]:
text1

TextBlob("Speech is human vocal communication using language. Each language uses phonetic combinations of vowel and consonant sounds that form the sound of its words that is, all English words sound different from all French words, even if they are the same word, e.g., "role" or "hotel", and using those words in their semantic character as words in the lexicon of a language according to the syntactic constraints that govern lexical words function in a sentence. In speaking, speakers perform many different intentional speech acts, e.g., informing, declaring, asking, persuading, directing, and can use enunciation, intonation, degrees of loudness, tempo, and other non-representational or paralinguistic aspects of vocalization to convey meaning. In their speech speakers also unintentionally communicate many aspects of their social position such as sex, age, place of origin , physical states , psychic states, physico-psychic states, education or experience and the like")

In [149]:
text1.sentiment

Sentiment(polarity=0.12202380952380953, subjectivity=0.30782312925170074)
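
A polarity score is often turned into a label for reporting. The sketch below shows one way to do that with the two scores printed above. The `Sentiment` namedtuple is a stand-in defined here so the example runs without TextBlob, and both the helper `polarity_label` and its `threshold` of 0.05 are assumptions, not TextBlob defaults.

```python
from collections import namedtuple

# Stand-in for the (polarity, subjectivity) result shown above.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def polarity_label(polarity, threshold=0.05):
    """Map a polarity in [-1, 1] to a label.
    Scores within +/- threshold of zero count as neutral (assumed cutoff)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


quote_sentiment = Sentiment(polarity=-0.2, subjectivity=0.0)
speech_sentiment = Sentiment(polarity=0.12202380952380953,
                             subjectivity=0.30782312925170074)

print(polarity_label(quote_sentiment.polarity))
print(polarity_label(speech_sentiment.polarity))
```

With this cutoff the Chaplin quote comes out negative and the speech paragraph mildly positive; a different threshold would move borderline texts into the neutral bucket.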

#### Language detection

In [151]:
lang=TextBlob("여보세요")

In [153]:
print(lang.detect_language())

ko


In [154]:
print(lang.translate(to='en'))

Hello


In [158]:
lang1=TextBlob("आप कैसे हैं?")

In [159]:
print(lang1.detect_language())

hi


In [160]:
print(lang1.translate(to='en'))

How are you?


Various other features, such as spelling correction and converting between lower and upper case, are also available; the official TextBlob documentation covers them in full. TextBlob has its drawbacks too: it does not provide neural network models, is relatively slow and has no word vectorization scheme.

### Conclusion
For text processing and tasks like language detection and manipulation, TextBlob is preferable. It is also easy to use and has the functionality of both NLTK and Pattern. For beginners who have explored NLTK, it is definitely worth a try.