# Advanced usage

## Sentiment Analyzers

The **textblob.sentiments** module contains two sentiment analysis implementations: ***PatternAnalyzer*** (the default, based on the pattern library) and ***NaiveBayesAnalyzer*** (an NLTK classifier trained on a movie reviews corpus).

In [1]:
# Override the default PatternAnalyzer
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

text = TextBlob('I love this library', analyzer=NaiveBayesAnalyzer())
text.sentiment

Sentiment(classification='pos', p_pos=0.7996209910191279, p_neg=0.2003790089808724)

In [5]:
# Use the PatternAnalyzer explicitly (it is also the default)
from textblob.sentiments import PatternAnalyzer
blob = TextBlob('I love this library', analyzer=PatternAnalyzer())
blob.sentiment

Sentiment(polarity=0.5, subjectivity=0.6)
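The two analyzers return differently shaped results: ***PatternAnalyzer*** gives a numeric polarity and subjectivity, while ***NaiveBayesAnalyzer*** gives a label with class probabilities. Code that should work with either one can normalize both into a single label. The sketch below is an illustration only: the namedtuples are stand-ins that mirror the fields in the outputs above so that the snippet runs without TextBlob, and *to_label* and its *threshold* parameter are hypothetical names, not part of the library.

```python
from collections import namedtuple

# Stand-ins mirroring the fields of the two results shown above (assumption:
# used here only so the example runs without TextBlob installed).
PatternResult = namedtuple("PatternResult", ["polarity", "subjectivity"])
BayesResult = namedtuple("BayesResult", ["classification", "p_pos", "p_neg"])


def to_label(result, threshold=0.0):
    """Map either result shape to 'pos' or 'neg' (hypothetical helper)."""
    if hasattr(result, "classification"):
        # Naive Bayes style result: it already carries a label.
        return result.classification
    # Pattern style result: polarity above the threshold counts as positive.
    return "pos" if result.polarity > threshold else "neg"


print(to_label(PatternResult(polarity=0.5, subjectivity=0.6)))
print(to_label(BayesResult("pos", 0.7996, 0.2004)))
```

With a real blob, the same helper could presumably be called as *to_label(blob.sentiment)*; the threshold of 0.0 is an arbitrary choice for this sketch.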

## Tokenizers

The *words* and *sentences* properties are helpers that use the ***textblob.tokenizers.WordTokenizer***
and ***textblob.tokenizers.SentenceTokenizer*** classes, respectively.

Other tokenizers, such as **TabTokenizer** and **BlanklineTokenizer** from NLTK, can be used instead by passing an instance to the constructor through the *tokenizer* argument.

In [6]:
from nltk.tokenize import TabTokenizer,BlanklineTokenizer


In [7]:
tokenizer = TabTokenizer()
blob = TextBlob("This is\ta rather tabby\tblob.", tokenizer=tokenizer)
blob.tokens

WordList(['This is', 'a rather tabby', 'blob.'])

In [9]:
tokenizer = BlanklineTokenizer()
blob = TextBlob("This is\n\n a rather tabby\n\nblob.", tokenizer=tokenizer)
blob.tokens

WordList(['This is', 'a rather tabby', 'blob.'])
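To make clearer what these two tokenizers split on, here is a rough standard-library approximation. It is a sketch, not NLTK's code: *tab_tokenize* and *blankline_tokenize* are hypothetical names, and the regular expression is an assumption about how blank lines are matched.

```python
import re


def tab_tokenize(text):
    """Split on tab characters, roughly what TabTokenizer does."""
    return text.split("\t")


def blankline_tokenize(text):
    """Split on blank lines, dropping surrounding whitespace and empty pieces."""
    pieces = re.split(r"\s*\n\s*\n\s*", text)
    return [piece for piece in pieces if piece]


print(tab_tokenize("This is\ta rather tabby\tblob."))
print(blankline_tokenize("This is\n\n a rather tabby\n\nblob."))
```

Both calls print the same three pieces as the **blob.tokens** outputs above.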

## Noun Phrase chunkers

TextBlob currently has two noun phrase chunker implementations: ***textblob.np_extractors.FastNPExtractor*** (the default, based on Shlomi Babluki’s implementation) and ***textblob.np_extractors.ConllExtractor***, which uses the CoNLL 2000 corpus to train a tagger.

In [10]:
# ConllExtractor
from textblob import TextBlob
from textblob.np_extractors import ConllExtractor
extractor = ConllExtractor()
blob = TextBlob("Python is a high-level programming language.", np_extractor=extractor)
blob.noun_phrases

WordList(['python', 'high-level programming language'])

In [11]:
# FastNPExtractor (the default)
from textblob.np_extractors import FastNPExtractor
extractor = FastNPExtractor()
blob = TextBlob("Python is a high-level programming language.", np_extractor=extractor)
blob.noun_phrases

WordList(['python'])

## POS Taggers

TextBlob currently has two POS tagger implementations, both located in *textblob.taggers*. The default is ***PatternTagger***, which uses the same implementation as the pattern library.

The second implementation is ***NLTKTagger***, which uses NLTK’s TreeBank tagger. NumPy is required to use the ***NLTKTagger***.

As with the tokenizers and noun phrase chunkers, we can explicitly specify which POS tagger to use by passing a tagger instance to the constructor.

In [12]:
from textblob import TextBlob
from textblob.taggers import NLTKTagger
nltk_tagger = NLTKTagger()
blob = TextBlob("Tag! You're It!", pos_tagger=nltk_tagger)
blob.pos_tags

[('Tag', 'NN'), ('You', 'PRP'), ("'re", 'VBP'), ('It', 'PRP')]

In [13]:
from textblob.taggers import PatternTagger
pattern_tagger = PatternTagger()
blob = TextBlob("Tag! You're It!", pos_tagger=pattern_tagger)
blob.pos_tags

[('Tag', 'NN'), ('You', 'PRP'), ("'", 'POS'), ('re', 'NN'), ('It', 'PRP')]
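Since **pos_tags** comes back as a list of (word, tag) tuples, ordinary Python tools are enough to work with it. The sketch below copies the ***NLTKTagger*** output shown above into a plain list, so it runs without TextBlob, and groups the words by tag; *words_by_tag* is a name chosen for this example.

```python
from collections import defaultdict

# Output of the NLTKTagger example above, copied as a plain list.
nltk_tags = [("Tag", "NN"), ("You", "PRP"), ("'re", "VBP"), ("It", "PRP")]

# Group the words by their part-of-speech tag.
words_by_tag = defaultdict(list)
for word, tag in nltk_tags:
    words_by_tag[tag].append(word)

# PRP is the Penn Treebank tag for personal pronouns.
pronouns = words_by_tag["PRP"]
print(pronouns)  # ['You', 'It']
```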

## Parsers

Parsers can also be passed to the TextBlob constructor.

In [14]:
from textblob import TextBlob
from textblob.parsers import PatternParser
blob = TextBlob("Parsing is fun.", parser=PatternParser())
blob.parse()

'Parsing/VBG/B-VP/O is/VBZ/I-VP/O fun/NN/B-NP/O ././O/O'
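The parse result is a single string in which each token carries slash-separated annotations. A small helper can split it into structured records. This is a sketch: *split_parse* and the *Token* tuple are hypothetical, and the field names assume the pattern library's word / part-of-speech / chunk / prepositional-phrase layout.

```python
from collections import namedtuple

# Field names are an assumption based on pattern's word/tag/chunk/PNP layout.
Token = namedtuple("Token", ["word", "pos", "chunk", "pnp"])


def split_parse(parsed):
    """Turn a slash-delimited parse string into a list of Token tuples."""
    tokens = []
    for item in parsed.split():
        # Split from the right so the last three fields are always
        # tag, chunk and PNP, whatever the word itself looks like.
        word, pos, chunk, pnp = item.rsplit("/", 3)
        tokens.append(Token(word, pos, chunk, pnp))
    return tokens


parsed = "Parsing/VBG/B-VP/O is/VBZ/I-VP/O fun/NN/B-NP/O ././O/O"
for token in split_parse(parsed):
    print(token)
```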

## Classifiers

class ***textblob.classifiers.BaseClassifier***(train_set, feature_extractor=<function basic_extractor>, format=None, **kwargs)

class ***textblob.classifiers.DecisionTreeClassifier***(train_set, feature_extractor=<function basic_extractor>, format=None, **kwargs)
A classifier based on the decision tree algorithm, as implemented in NLTK.

class ***textblob.classifiers.MaxEntClassifier***(train_set, feature_extractor=<function basic_extractor>, format=None, **kwargs)

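class ***textblob.classifiers.NLTKClassifier***(train_set, feature_extractor=<function basic_extractor>, format=None, **kwargs)
An abstract class that wraps around the nltk.classify module. Descendant classes are expected to define a class variable *nltk_class* naming the nltk.classify class to wrap.

Example:

    class MyClassifier(NLTKClassifier):
        nltk_class = nltk.classify.svm.SvmClassifier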

class ***textblob.classifiers.NaiveBayesClassifier***(train_set, feature_extractor=<function basic_extractor>, format=None, **kwargs)
A classifier based on the Naive Bayes algorithm, as implemented in NLTK.


class ***textblob.classifiers.PositiveNaiveBayesClassifier***(positive_set, unlabeled_set, feature_extractor=<function contains_extractor>, positive_prob_prior=0.5, **kwargs)
A variant of the Naive Bayes Classifier that performs binary classification with partially-labeled training sets, i.e. when only one class is labeled and the other is not. Assuming a prior distribution on the two labels, uses the unlabeled set to estimate the frequencies of the features.
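Each constructor above accepts a *feature_extractor* argument. As far as the TextBlob documentation describes it, a feature extractor is a function that takes a document and returns a dictionary of features. A minimal sketch of such a function follows, assuming documents arrive as plain strings. *first_last_extractor* is a hypothetical name, and the commented lines that pass it to ***NaiveBayesClassifier*** assume TextBlob and its corpora are installed.

```python
def first_last_extractor(document):
    """Return features describing the first and last word of a document."""
    tokens = document.lower().split()
    if not tokens:
        return {}
    return {
        "first({0})".format(tokens[0]): True,
        "last({0})".format(tokens[-1]): True,
    }


print(first_last_extractor("I love this library"))

# With TextBlob installed, the function could be passed like this
# (train is a list of (text, label) pairs):
# from textblob.classifiers import NaiveBayesClassifier
# train = [("I love this library", "pos"), ("I hate this bug", "neg")]
# classifier = NaiveBayesClassifier(train, feature_extractor=first_last_extractor)
```

Keeping the extractor a plain function makes it easy to test on its own before any classifier is trained.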

## Blobber

class ***textblob.blob.Blobber***(tokenizer=None, pos_tagger=None, np_extractor=None, analyzer=None, parser=None, classifier=None)
A factory for TextBlobs that all share the same tagger, tokenizer, parser, classifier, and np_extractor.

In [16]:
from textblob import Blobber
from textblob.taggers import NLTKTagger
from textblob.tokenizers import SentenceTokenizer
tb = Blobber(pos_tagger=NLTKTagger(), tokenizer=SentenceTokenizer())
blob1 = tb("This is one blob.")
blob2 = tb("This blob has the same tagger and tokenizer.")
blob1.pos_tagger is blob2.pos_tagger

True
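The sharing behavior checked by the identity test above can be pictured with a small stand-in factory. This is not TextBlob's implementation: *MiniBlob* and *MiniBlobber* are hypothetical classes written only to show the idea of building components once and handing the same objects to every blob.

```python
class MiniBlob:
    """Stand-in for a blob: holds text plus references to shared components."""

    def __init__(self, text, pos_tagger=None, tokenizer=None):
        self.text = text
        self.pos_tagger = pos_tagger
        self.tokenizer = tokenizer


class MiniBlobber:
    """Stand-in factory: stores components once and reuses them for every blob."""

    def __init__(self, pos_tagger=None, tokenizer=None):
        self.pos_tagger = pos_tagger
        self.tokenizer = tokenizer

    def __call__(self, text):
        return MiniBlob(text, pos_tagger=self.pos_tagger, tokenizer=self.tokenizer)


tagger = object()  # placeholder for a component that is expensive to build
make_blob = MiniBlobber(pos_tagger=tagger)
blob1 = make_blob("This is one blob.")
blob2 = make_blob("This blob has the same tagger.")
print(blob1.pos_tagger is blob2.pos_tagger)  # True
```

A factory like this pays off most when a component must be trained or loaded before first use, since that cost is then paid once rather than per blob.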