## TextBlob

**TextBlob** is a library for processing textual data. It provides a simple API for common **NLP** tasks such as sentiment analysis, classification, and translation.

**TextBlob** is built on top of **NLTK**.

```bash
pip install textblob
```

_**Documentation:** https://textblob.readthedocs.io/en/dev/_

In [None]:
!pip install textblob

In [7]:
import textblob # Imported to check the installed version
from textblob import TextBlob

In [8]:
texto = """The titular threat of The Blob has always struck me as the ultimate movie
            monster: an insatiably hungry, amoeba-like mass able to penetrate
            virtually any safeguard, capable of--as a doomed doctor chillingly
            describes it--"assimilating flesh on contact.
            Snide comparisons to gelatin be damned, it's a concept with the most
            devastating of potential consequences, not unlike the grey goo scenario
            proposed by technological theorists fearful of
            artificial intelligence run rampant."""


# TextBlob() converts a string into a TextBlob object
blob = TextBlob(texto)

In [9]:
type(blob)

textblob.blob.TextBlob

In [10]:
# Tokenize

print(blob.words)

['The', 'titular', 'threat', 'of', 'The', 'Blob', 'has', 'always', 'struck', 'me', 'as', 'the', 'ultimate', 'movie', 'monster', 'an', 'insatiably', 'hungry', 'amoeba-like', 'mass', 'able', 'to', 'penetrate', 'virtually', 'any', 'safeguard', 'capable', 'of', 'as', 'a', 'doomed', 'doctor', 'chillingly', 'describes', 'it', 'assimilating', 'flesh', 'on', 'contact', 'Snide', 'comparisons', 'to', 'gelatin', 'be', 'damned', 'it', "'s", 'a', 'concept', 'with', 'the', 'most', 'devastating', 'of', 'potential', 'consequences', 'not', 'unlike', 'the', 'grey', 'goo', 'scenario', 'proposed', 'by', 'technological', 'theorists', 'fearful', 'of', 'artificial', 'intelligence', 'run', 'rampant']


In [11]:
# Part-of-speech Tagging

blob.tags

[('The', 'DT'),
 ('titular', 'JJ'),
 ('threat', 'NN'),
 ('of', 'IN'),
 ('The', 'DT'),
 ('Blob', 'NNP'),
 ('has', 'VBZ'),
 ('always', 'RB'),
 ('struck', 'VBN'),
 ('me', 'PRP'),
 ('as', 'IN'),
 ('the', 'DT'),
 ('ultimate', 'JJ'),
 ('movie', 'NN'),
 ('monster', 'NN'),
 ('an', 'DT'),
 ('insatiably', 'RB'),
 ('hungry', 'JJ'),
 ('amoeba-like', 'JJ'),
 ('mass', 'NN'),
 ('able', 'JJ'),
 ('to', 'TO'),
 ('penetrate', 'VB'),
 ('virtually', 'RB'),
 ('any', 'DT'),
 ('safeguard', 'NN'),
 ('capable', 'JJ'),
 ('of', 'IN'),
 ('as', 'IN'),
 ('a', 'DT'),
 ('doomed', 'JJ'),
 ('doctor', 'NN'),
 ('chillingly', 'RB'),
 ('describes', 'VBZ'),
 ('it', 'PRP'),
 ('assimilating', 'VBG'),
 ('flesh', 'NN'),
 ('on', 'IN'),
 ('contact', 'NN'),
 ('Snide', 'JJ'),
 ('comparisons', 'NNS'),
 ('to', 'TO'),
 ('gelatin', 'VB'),
 ('be', 'VB'),
 ('damned', 'VBN'),
 ('it', 'PRP'),
 ("'s", 'VBZ'),
 ('a', 'DT'),
 ('concept', 'NN'),
 ('with', 'IN'),
 ('the', 'DT'),
 ('most', 'RBS'),
 ('devastating', 'JJ'),
 ('of', 'IN'),
 ('potenti

| Tag | Meaning |
|-----|---------|
|**`CC`**| Coordinating conjunction|
|**`CD`**| Cardinal number|
|**`DT`**| Determiner|
|**`EX`**| Existential there|
|**`FW`**| Foreign word|
|**`IN`**| Preposition or subordinating conjunction|
|**`JJ`**| Adjective|
|**`JJR`**| Adjective, comparative|
|**`JJS`**| Adjective, superlative|
|**`LS`**| List item marker|
|**`MD`**| Modal|
|**`NN`**| Noun, singular or mass|
|**`NNS`**| Noun, plural|
|**`NNP`**| Proper noun, singular|
|**`NNPS`**| Proper noun, plural|
|**`PDT`**| Predeterminer|
|**`POS`**| Possessive ending|
|**`PRP`**| Personal pronoun|
|**`PRP$`**| Possessive pronoun|
|**`RB`**| Adverb|
|**`RBR`**| Adverb, comparative|
|**`RBS`**| Adverb, superlative|
|**`RP`**| Particle|
|**`SYM`**| Symbol|
|**`TO`**| to|
|**`UH`**| Interjection|
|**`VB`**| Verb, base form|
|**`VBD`**| Verb, past tense|
|**`VBG`**| Verb, gerund or present participle|
|**`VBN`**| Verb, past participle|
|**`VBP`**| Verb, non-3rd person singular present|
|**`VBZ`**| Verb, 3rd person singular present|
|**`WDT`**| Wh-determiner|
|**`WP`**| Wh-pronoun|
|**`WP$`**| Possessive wh-pronoun|
|**`WRB`**| Wh-adverb|
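
As a small illustration of how the table can be used, the sketch below selects words by the leading letters of their tag (every noun tag starts with `NN`, every verb tag with `VB`). The `tagged` list is copied from the first entries of the `blob.tags` output above, and the helper name `filter_by_tag_prefix` is hypothetical, not part of TextBlob.

```python
# A few (word, tag) pairs copied from the blob.tags output above.
tagged = [
    ("The", "DT"), ("titular", "JJ"), ("threat", "NN"), ("of", "IN"),
    ("The", "DT"), ("Blob", "NNP"), ("has", "VBZ"), ("always", "RB"),
    ("struck", "VBN"), ("me", "PRP"),
]


def filter_by_tag_prefix(pairs, prefix):
    """Return the words whose Penn Treebank tag starts with `prefix`."""
    return [word for word, tag in pairs if tag.startswith(prefix)]


nouns = filter_by_tag_prefix(tagged, "NN")  # matches NN, NNS, NNP, NNPS
verbs = filter_by_tag_prefix(tagged, "VB")  # matches VB, VBD, VBG, VBN, VBP, VBZ

print(nouns)  # ['threat', 'Blob']
print(verbs)  # ['has', 'struck']
```

The same helper should work on the full `blob.tags` list, since it only expects an iterable of `(word, tag)` pairs.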

In [12]:
# Noun phrases

blob.noun_phrases

WordList(['titular threat', 'blob', 'ultimate movie monster', 'amoeba-like mass', 'snide', 'potential consequences', 'grey goo scenario', 'technological theorists fearful', 'artificial intelligence run rampant'])

In [13]:
# Sentences

blob.sentences

[Sentence("The titular threat of The Blob has always struck me as the ultimate movie
             monster: an insatiably hungry, amoeba-like mass able to penetrate
             virtually any safeguard, capable of--as a doomed doctor chillingly
             describes it--"assimilating flesh on contact."),
 Sentence("Snide comparisons to gelatin be damned, it's a concept with the most
             devastating of potential consequences, not unlike the grey goo scenario
             proposed by technological theorists fearful of
             artificial intelligence run rampant.")]

In [14]:
blob

TextBlob("The titular threat of The Blob has always struck me as the ultimate movie
            monster: an insatiably hungry, amoeba-like mass able to penetrate
            virtually any safeguard, capable of--as a doomed doctor chillingly
            describes it--"assimilating flesh on contact.
            Snide comparisons to gelatin be damned, it's a concept with the most
            devastating of potential consequences, not unlike the grey goo scenario
            proposed by technological theorists fearful of
            artificial intelligence run rampant.")

### Polarity and Subjectivity

In [15]:
for sentence in blob.sentences:
    print(sentence.sentiment)
    print("-"*100)

Sentiment(polarity=0.06000000000000001, subjectivity=0.605)
----------------------------------------------------------------------------------------------------
Sentiment(polarity=-0.34166666666666673, subjectivity=0.7666666666666666)
----------------------------------------------------------------------------------------------------
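
TextBlob documents `polarity` as a float in the range [-1.0, 1.0] and `subjectivity` as a float in the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective. The sketch below shows one possible way to turn the polarity values printed above into coarse labels. The `Sentiment` namedtuple is only a stand-in for TextBlob's result, and both the `label` helper and its `threshold=0.1` cutoff are arbitrary assumptions.

```python
from collections import namedtuple

# Stand-in for the Sentiment tuples printed above (values copied from the output).
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])
scores = [
    Sentiment(0.06000000000000001, 0.605),
    Sentiment(-0.34166666666666673, 0.7666666666666666),
]


def label(polarity, threshold=0.1):
    """Map a polarity in [-1, 1] to a coarse label (the threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


labels = [label(s.polarity) for s in scores]
print(labels)  # ['neutral', 'negative']
```

With this cutoff the first sentence is close enough to zero to count as neutral, while the second is clearly negative. A different threshold would move that boundary.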


### Lemmatization

In [16]:
from textblob import Word

w = Word("octopi")
w.lemmatize()

'octopus'

In [17]:
w = Word("went")
w.lemmatize("v") # "verb"

'go'

In [19]:
w.lemmatize() # Without a POS argument the word is treated as a noun, so "went" is returned unchanged

'went'

### Spelling Correction

In [20]:
b = TextBlob("Alici in wonderland is an amzingg bouk")
print(b.correct())

Alice in wonderland is an amazing book


In [21]:
# .correct() fails quite easily

b = TextBlob("Alic in wunderland is an amzingg buuk")
print(b.correct())

Lie in wunderland is an amazing bulk
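
The TextBlob documentation states that spelling correction is based on Peter Norvig's "How to Write a Spelling Corrector". The sketch below is a simplified, plain-Python version of that idea, not TextBlob's actual code. It only looks at candidates one edit away, and the `WORD_COUNTS` table and its frequencies are invented for illustration.

```python
import string

# Tiny, made-up word-frequency table; a real corrector would learn this from a large corpus.
WORD_COUNTS = {"alice": 50, "wonderland": 5, "amazing": 20, "book": 60, "bulk": 10}


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, or the word itself."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    return max(candidates, key=WORD_COUNTS.get) if candidates else word


corrected = [correct(w) for w in ["alici", "bouk", "buuk"]]
print(corrected)  # ['alice', 'book', 'bulk']
```

In this toy vocabulary, "buuk" is one edit away from "bulk" but two edits away from "book", so a corrector that prefers the closest known word picks "bulk". This illustrates why a few extra typos can lead to a confident but wrong correction.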


### Word Counts

In [22]:
with open("../Data/alice_in_wonderland.txt") as f:
    blob = TextBlob(f.read())

In [23]:
blob.words.count("Alice", case_sensitive = True)

396

In [24]:
blob.words.count("Alice", case_sensitive = False)

399
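
The difference between the two results comes from how case is handled when comparing tokens. A minimal plain-Python sketch of the same idea follows. The `words` list is a hypothetical stand-in for `blob.words`, and `count_word` is an illustrative helper, not TextBlob's implementation.

```python
# Hypothetical token list standing in for blob.words.
words = ["Alice", "was", "beginning", "ALICE", "alice", "Alice"]


def count_word(tokens, target, case_sensitive=False):
    """Count occurrences of `target`, optionally ignoring case."""
    if case_sensitive:
        return sum(1 for token in tokens if token == target)
    target = target.lower()
    return sum(1 for token in tokens if token.lower() == target)


exact = count_word(words, "Alice", case_sensitive=True)
any_case = count_word(words, "Alice", case_sensitive=False)
print(exact, any_case)  # 2 4
```

The case-insensitive count can never be smaller than the case-sensitive one, which matches the 396 versus 399 results above.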

### N-grams

In [25]:
texto = "The quick brown fox jumps over the lazy dog."

In [26]:
blob = TextBlob(texto)

In [28]:
blob.ngrams(n = 3)

[WordList(['The', 'quick', 'brown']),
 WordList(['quick', 'brown', 'fox']),
 WordList(['brown', 'fox', 'jumps']),
 WordList(['fox', 'jumps', 'over']),
 WordList(['jumps', 'over', 'the']),
 WordList(['over', 'the', 'lazy']),
 WordList(['the', 'lazy', 'dog'])]

In [29]:
len(blob.words)

9

In [30]:
len(blob.ngrams(n = 3))

7
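
The two lengths above are related: a sequence of `k` tokens has `k - n + 1` n-grams, here `9 - 3 + 1 = 7`. The sketch below reproduces this with a sliding window in plain Python. The `ngrams` function is an illustrative helper, not TextBlob's implementation, and the tokens come from a simple `split()` on the sentence without its final period.

```python
def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens (sliding window)."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


tokens = "The quick brown fox jumps over the lazy dog".split()
trigrams = ngrams(tokens, n=3)

print(len(tokens))    # 9
print(len(trigrams))  # 7
print(trigrams[0])    # ['The', 'quick', 'brown']
```

When `n` is larger than the number of tokens, the range is empty and the helper returns an empty list.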

In [31]:
for sentence in blob.sentences:
    print(sentence.sentiment)
    print("-"*100)

Sentiment(polarity=0.04166666666666666, subjectivity=0.75)
----------------------------------------------------------------------------------------------------


### POS and NEG

In [32]:
from textblob.sentiments import NaiveBayesAnalyzer

with open("../Data/alice_in_wonderland.txt") as file:
    texto = file.read()  # Loaded here, but not used by the cells below

blob = TextBlob("I'm feeling good", analyzer = NaiveBayesAnalyzer())

In [None]:
for sent in blob.sentences:
    print(sent.sentiment)
    print("-"*100)

In [34]:
blob = TextBlob("I'm feeling good", analyzer = NaiveBayesAnalyzer())
blob.sentiment

Sentiment(classification='pos', p_pos=0.5706365032741384, p_neg=0.42936349672586166)
