### Sentiment Analyzers

There are two sentiment analysis implementations in the __textblob.sentiments__ module:
+ __PatternAnalyzer__ (based on the pattern library)
+ __NaiveBayesAnalyzer__ (an NLTK classifier trained on a movie reviews corpus)

The default is __PatternAnalyzer__. To override it, pass an instance of another analyzer to the TextBlob constructor through the __analyzer__ argument.

In [2]:
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

blob = TextBlob("I love this library.", analyzer=NaiveBayesAnalyzer())

blob.sentiment

Sentiment(classification='pos', p_pos=0.7996209910191279, p_neg=0.2003790089808724)
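
The result shown above is a named tuple, so its fields can be read by name or unpacked. The sketch below rebuilds that result with _collections.namedtuple_ purely for illustration: the _Sentiment_ type defined here is a stand-in rather than an import from TextBlob, and the values are copied from the output above.

```python
from collections import namedtuple

# Stand-in for the result type shown above; not imported from TextBlob.
Sentiment = namedtuple("Sentiment", ["classification", "p_pos", "p_neg"])

result = Sentiment(
    classification="pos",
    p_pos=0.7996209910191279,
    p_neg=0.2003790089808724,
)

# Fields can be read by name or unpacked positionally.
label, p_pos, p_neg = result

# In this output the label matches the larger of the two probabilities,
# and the probabilities sum to approximately one.
predicted = "pos" if result.p_pos > result.p_neg else "neg"
total = p_pos + p_neg
print(predicted, round(total, 6))
```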

### Tokenizers
The _words_ and _sentences_ properties are helpers that use the following classes, respectively:
+ __textblob.tokenizers.WordTokenizer__
+ __textblob.tokenizers.SentenceTokenizer__

You can use other tokenizers, such as those provided by _NLTK_, by passing them into the _TextBlob_ constructor and then accessing the _tokens_ property.

In [4]:
from textblob import TextBlob
from nltk.tokenize import TabTokenizer
tokenizer = TabTokenizer()
blob = TextBlob("This is \ta rather tabby\tblob.", tokenizer=tokenizer)
blob.tokens

WordList(['This is ', 'a rather tabby', 'blob.'])

Another way is to use the __tokenize([tokenizer])__ method.

In [5]:
from textblob import TextBlob
from nltk.tokenize import BlanklineTokenizer
tokenizer = BlanklineTokenizer()
blob = TextBlob("A token\n\nof appreciation")
blob.tokenize(tokenizer)

WordList(['A token', 'of appreciation'])
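
To make the two splitting rules above concrete, here is a plain-Python approximation that needs neither NLTK nor TextBlob. The helper names _tab_split_ and _blankline_split_ are hypothetical, and the real NLTK classes may behave differently in edge cases; this is only a sketch of the idea.

```python
import re


def tab_split(text):
    """Split on tab characters, keeping any surrounding spaces."""
    return text.split("\t")


def blankline_split(text):
    """Split wherever a blank line (plus surrounding whitespace) occurs."""
    return [part for part in re.split(r"\s*\n\s*\n\s*", text) if part]


tab_tokens = tab_split("This is \ta rather tabby\tblob.")
para_tokens = blankline_split("A token\n\nof appreciation")
print(tab_tokens)
print(para_tokens)
```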

### Noun Phrase Chunkers
TextBlob currently has two noun phrase chunker implementations:
+ __textblob.np_extractors.FastNPExtractor__ (the default, based on Shlomi Babluki’s implementation, which he described in a blog post)
+ __textblob.np_extractors.ConllExtractor__ (uses the CoNLL 2000 corpus to train a tagger)

You can change the chunker implementation (or even use your own) by explicitly passing an instance of a noun phrase extractor to a TextBlob’s constructor.

In [7]:
from textblob.np_extractors import ConllExtractor
extractor = ConllExtractor()
blob = TextBlob("Python is a high-level programming language.", np_extractor=extractor)

In [8]:
blob.noun_phrases

WordList(['python', 'high-level programming language'])

### POS Taggers

TextBlob currently has two POS tagger implementations, both located in __textblob.taggers__. The default is __PatternTagger__, which uses the same implementation as the pattern library.

The second implementation is __NLTKTagger__, which uses NLTK’s __TreeBank__ tagger. NumPy is required to use __NLTKTagger__.

Similar to the tokenizers and noun phrase chunkers, you can explicitly specify which POS tagger to use by passing a tagger instance to the constructor.

In [9]:
from textblob.taggers import NLTKTagger
nltk_tagger = NLTKTagger()
blob = TextBlob("Tag! You're it!", pos_tagger=nltk_tagger)

blob.pos_tags

[('Tag', 'NN'), ('You', 'PRP'), ("'re", 'VBP'), ('it', 'PRP')]
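
Because _pos_tags_ is a list of _(word, tag)_ pairs, it can be post-processed with ordinary Python. As a small sketch, the snippet below groups the pairs from the output above by tag; the list is copied by hand here so the example runs without TextBlob installed.

```python
from collections import defaultdict

# Pairs copied from the output above.
pos_tags = [("Tag", "NN"), ("You", "PRP"), ("'re", "VBP"), ("it", "PRP")]

# Group words under their part-of-speech tag.
words_by_tag = defaultdict(list)
for word, tag in pos_tags:
    words_by_tag[tag].append(word)

pronouns = words_by_tag["PRP"]
print(pronouns)
```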

### Parsers

_New in version 0.6.0_

Parser implementations can also be passed to the TextBlob constructor.

In [10]:
from textblob.parsers import PatternParser

blob = TextBlob('Parseing is fun.', parser=PatternParser())
blob.parse()

'Parseing/NNP/B-NP/O is/VBZ/B-VP/O fun/NN/B-NP/O ././O/O'
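
The parsed string above packs four slash-separated slots into each token. The sketch below splits it into records with plain Python. The field names (_word_, _pos_, _chunk_, _pnp_) are this example's assumption about what the slots mean, and _split_parse_ is a hypothetical helper, not part of TextBlob.

```python
from collections import namedtuple

# Field names are an assumption about the four slash-separated slots.
Token = namedtuple("Token", ["word", "pos", "chunk", "pnp"])


def split_parse(parsed):
    """Turn a 'word/POS/chunk/PNP ...' string into a list of Token records."""
    tokens = []
    for item in parsed.split():
        # Split from the right so a '/' inside the word slot stays intact.
        word, pos, chunk, pnp = item.rsplit("/", 3)
        tokens.append(Token(word, pos, chunk, pnp))
    return tokens


parsed = "Parseing/NNP/B-NP/O is/VBZ/B-VP/O fun/NN/B-NP/O ././O/O"
tokens = split_parse(parsed)
for token in tokens:
    print(token)
```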

### Blobber: A TextBlob Factory
_New in version 0.4.0_

Use the __Blobber__ class to create TextBlobs that share the same models.

First, _**instantiate**_ a Blobber with the __tagger__, __NP extractor__, __sentiment analyzer__, __classifier__, and/or __tokenizer__ of your choice. Then call that instance on a string to create each TextBlob.

In [11]:
from textblob import Blobber
from textblob.taggers import NLTKTagger
tb = Blobber(pos_tagger=NLTKTagger())

blob1 = tb("this is a blob.")
blob2 = tb('this is another blob.')
blob1.pos_tagger is blob2.pos_tagger

True
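
The sharing behaviour checked above follows from a simple factory pattern: the factory stores one model instance and hands the same object to every blob it creates. The sketch below illustrates that pattern in plain Python. _MiniBlob_, _MiniBlobber_, and _DummyTagger_ are hypothetical names invented for this example, and the code is not TextBlob's actual implementation.

```python
class MiniBlob:
    """Hypothetical stand-in for a blob: some text plus a shared model."""

    def __init__(self, text, pos_tagger):
        self.text = text
        self.pos_tagger = pos_tagger


class MiniBlobber:
    """Hypothetical factory that stores one model and reuses it."""

    def __init__(self, pos_tagger):
        self.pos_tagger = pos_tagger

    def __call__(self, text):
        # No new tagger is built here; the stored instance is passed along.
        return MiniBlob(text, self.pos_tagger)


class DummyTagger:
    """Placeholder for a model that would be expensive to construct."""


make_blob = MiniBlobber(pos_tagger=DummyTagger())
first = make_blob("this is a blob.")
second = make_blob("this is another blob.")
shared = first.pos_tagger is second.pos_tagger
print(shared)
```

Constructing the model once and reusing it avoids repeating any loading or training cost for every new blob.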

### API Reference

In [12]:
from textblob import TextBlob

b = TextBlob("Simple is better than complex.")
b.tags

[('Simple', 'NN'),
 ('is', 'VBZ'),
 ('better', 'JJR'),
 ('than', 'IN'),
 ('complex', 'JJ')]

In [13]:
b.noun_phrases

WordList(['simple'])

In [14]:
b.words

WordList(['Simple', 'is', 'better', 'than', 'complex'])

In [15]:
b.sentiment

Sentiment(polarity=0.06666666666666667, subjectivity=0.41904761904761906)

In [29]:
w = b.words[0].synsets
w[0]

Synset('simple.n.01')

In [21]:
TextBlob('hhello').correct()

TextBlob("hello")

In [22]:
b.ngrams()

[WordList(['Simple', 'is', 'better']),
 WordList(['is', 'better', 'than']),
 WordList(['better', 'than', 'complex'])]
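
The output above is a sliding window of three words over the word list. A minimal plain-Python sketch of the same idea follows; _sliding_ngrams_ is a hypothetical helper, not a TextBlob function, and the default window size of three is taken from the output shown.

```python
def sliding_ngrams(words, n=3):
    """Return every run of n consecutive words, in order."""
    if n <= 0:
        return []
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = ["Simple", "is", "better", "than", "complex"]
trigrams = sliding_ngrams(words)
bigrams = sliding_ngrams(words, n=2)
print(trigrams)
print(bigrams)
```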

In [24]:
b.np_counts

defaultdict(int, {'simple': 1})

In [32]:
b.lower()

TextBlob("simple is better than complex.")

In [34]:
b.parse()

'Simple/JJ/B-ADJP/O is/VBZ/B-VP/O better/JJR/B-ADJP/O than/IN/B-PP/O complex/JJ/B-ADJP/O ././O/O'

In [35]:
b.pos_tags

[('Simple', 'NN'),
 ('is', 'VBZ'),
 ('better', 'JJR'),
 ('than', 'IN'),
 ('complex', 'JJ')]

In [37]:
b.polarity

0.06666666666666667

In [39]:
b.replace('S','a')

TextBlob("aimple is better than complex.")

In [40]:
b.correct()

TextBlob("Simple is better than complex.")

In [41]:
b

TextBlob("Simple is better than complex.")

In [44]:
b.sentiment

Sentiment(polarity=0.06666666666666667, subjectivity=0.41904761904761906)

In [45]:
b.sentiment_assessments

Sentiment(polarity=0.06666666666666667, subjectivity=0.41904761904761906, assessments=[(['simple'], 0.0, 0.35714285714285715, None), (['better'], 0.5, 0.5, None), (['complex'], -0.3, 0.4, None)])
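
The numbers in the output above are consistent with the overall scores being the arithmetic mean of the per-word assessments. The sketch below checks this on the values shown, copied by hand; it is an observation about this one example and does not claim to reproduce the analyzer's full logic.

```python
# Assessments copied from the output above:
# (words, polarity, subjectivity, label)
assessments = [
    (["simple"], 0.0, 0.35714285714285715, None),
    (["better"], 0.5, 0.5, None),
    (["complex"], -0.3, 0.4, None),
]

# Average the per-word polarity and subjectivity scores.
polarity = sum(a[1] for a in assessments) / len(assessments)
subjectivity = sum(a[2] for a in assessments) / len(assessments)
print(polarity, subjectivity)
```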