# TextBlob can't distinguish names from other proper nouns

In [1]:
import textblob
from textblob import TextBlob
from textblob.en.taggers import PatternTagger

#### Sentences: "Alice is next to Bob", "Alice is next to China"

In [2]:
blob1 = TextBlob("Alice is next to Bob.")
blob2 = TextBlob("Alice is next to China.")

### Starting with the default tagger (NLTKTagger):

In [3]:
print(f"blob1 and blob2 taggers: {blob1.pos_tagger}, {blob2.pos_tagger}")

blob1 and blob2 taggers: <textblob.en.taggers.NLTKTagger object at 0x112eb56c0>, <textblob.en.taggers.NLTKTagger object at 0x112eb56c0>


In [4]:
print(f"blob1.tags: {blob1.tags}")
print(f"blob2.tags: {blob2.tags}")
print(f"blob1.parse(): {blob1.parse()}")
print(f"blob2.parse(): {blob2.parse()}")

blob1.tags: [('Alice', 'NNP'), ('is', 'VBZ'), ('next', 'JJ'), ('to', 'TO'), ('Bob', 'NNP')]
blob2.tags: [('Alice', 'NNP'), ('is', 'VBZ'), ('next', 'JJ'), ('to', 'TO'), ('China', 'NNP')]
blob1.parse(): Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP Bob/NNP/B-NP/I-PNP ././O/O
blob2.parse(): Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP China/NNP/B-NP/I-PNP ././O/O


### Trying again with the other provided tagger (PatternTagger) gives the exact same results:

In [5]:
from textblob.en.taggers import PatternTagger
blob1.pos_tagger = PatternTagger()
blob2.pos_tagger = PatternTagger()

In [6]:
print(f"blob1.tags: {blob1.tags}")
print(f"blob2.tags: {blob2.tags}")
print(f"blob1.parse(): {blob1.parse()}")
print(f"blob2.parse(): {blob2.parse()}")

blob1.tags: [('Alice', 'NNP'), ('is', 'VBZ'), ('next', 'JJ'), ('to', 'TO'), ('Bob', 'NNP')]
blob2.tags: [('Alice', 'NNP'), ('is', 'VBZ'), ('next', 'JJ'), ('to', 'TO'), ('China', 'NNP')]
blob1.parse(): Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP Bob/NNP/B-NP/I-PNP ././O/O
blob2.parse(): Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP China/NNP/B-NP/I-PNP ././O/O
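
One thing worth double-checking about the comparison above: as far as I can tell, TextBlob caches `.tags` after the first access, and `.parse()` goes through the blob's parser rather than its `pos_tagger`, so reassigning `pos_tagger` on an existing blob may simply return the earlier results. A minimal sketch that sidesteps this by building a fresh blob with the tagger passed to the constructor (the helper name `tags_with_pattern_tagger` is made up for illustration, and the caching behaviour is an assumption to verify against the installed version):

```python
def tags_with_pattern_tagger(text):
    """Tag `text` with PatternTagger on a brand-new blob (hypothetical helper)."""
    # Imported lazily so this cell can be defined even before TextBlob is installed.
    from textblob import TextBlob
    from textblob.en.taggers import PatternTagger

    # Passing the tagger at construction time means no previously computed
    # tags can be reused from an earlier access to `.tags`.
    fresh_blob = TextBlob(text, pos_tagger=PatternTagger())
    return fresh_blob.tags
```

Calling `tags_with_pattern_tagger("Alice is next to Bob.")` and `tags_with_pattern_tagger("Alice is next to China.")` would show whether PatternTagger really agrees with the default tagger on these sentences.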


With the `.tags` property, both "Bob" and "China" are tagged "NNP" ([singular proper noun](https://www.guru99.com/pos-tagging-chunking-nltk.html)). With the `.parse()` method, both are tagged "NNP/B-NP/I-PNP" (singular proper noun, [beginning](https://www.nltk.org/book/ch07.html#:~:text=A%20token%20is%20tagged%20as%20B%20if%20it%20marks%20the%20beginning%20of%20a%20chunk.) of a [noun phrase](https://coling.epfl.ch/TP/corr/TP-parsing-sol.php#:~:text=S%20%2D%20Sentence-,NP%20%2D%20Noun%20Phrase,-VP%20%2D%20Verb%20Phrase), [part of](https://www.nltk.org/book/ch07.html#:~:text=Subsequent%20tokens%20within%20the%20chunk%20are%20tagged%20I.) a [prepositional noun phrase](https://coling.epfl.ch/TP/corr/TP-parsing-sol.php#:~:text=PNP%20%2D%20Prepositional%20Noun%20Phrase)).

It makes sense that names wouldn't be classified separately from other proper nouns:
`.tags` is an alias for the `.pos_tags` property, where "pos" stands for "part of speech",
and a name isn't really a separate part of speech from other proper nouns.
The `.parse()` output looks like the same POS tags plus chunk labels marking where phrases begin and continue.
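
To make the `.parse()` comparison concrete, here is a small sketch that splits the two output strings recorded above into their slash-separated fields and reports which tokens differ. The field names (`word`, `pos`, `chunk`, `pnp`) and the helper `split_parse_output` are labels chosen for this illustration, not names taken from TextBlob.

```python
from typing import List, NamedTuple


class ParsedToken(NamedTuple):
    word: str
    pos: str    # part-of-speech tag, e.g. NNP
    chunk: str  # chunk label, e.g. B-NP
    pnp: str    # prepositional noun phrase label, e.g. I-PNP


def split_parse_output(parsed: str) -> List[ParsedToken]:
    """Split 'word/POS/chunk/PNP' tokens into named fields."""
    # rsplit from the right so the last three slashes always delimit the tags.
    return [ParsedToken(*token.rsplit("/", 3)) for token in parsed.split()]


# The two strings printed by blob1.parse() and blob2.parse() above.
parsed1 = "Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP Bob/NNP/B-NP/I-PNP ././O/O"
parsed2 = "Alice/NNP/B-NP/O is/VBZ/B-VP/O next/JJ/B-ADJP/O to/TO/B-PP/B-PNP China/NNP/B-NP/I-PNP ././O/O"

# Keep only the token pairs that are not identical.
differences = [
    (a, b)
    for a, b in zip(split_parse_output(parsed1), split_parse_output(parsed2))
    if a != b
]
for a, b in differences:
    print(a, "vs", b)
```

The only pair that differs is the "Bob"/"China" token, and it differs only in the word itself; every tag is identical.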


But after reading through the [TextBlob documentation](https://buildmedia.readthedocs.org/media/pdf/textblob/latest/textblob.pdf), nothing else in the library looks up to the task either, short of training a classifier on some public dataset (TextBlob doesn't seem to provide one).

Since other libraries like NLTK and spaCy can detect people's names out of the box, we probably shouldn't use TextBlob for this.
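
For comparison, here is a rough sketch of what name detection could look like with NLTK's built-in named-entity chunker. It assumes NLTK is installed and that its tokenizer, tagger, and chunker resources have already been downloaded (the exact resource names vary between NLTK versions). The helper names `extract_person_names` and `find_person_names` are made up for this example, and the labels the chunker assigns to sentences this short aren't guaranteed.

```python
def extract_person_names(chunked):
    """Collect the text of every PERSON chunk in an NLTK-style chunk tree."""
    names = []
    for node in chunked:
        # Plain (word, tag) tuples have no label(); only entity subtrees do.
        if hasattr(node, "label") and node.label() == "PERSON":
            names.append(" ".join(word for word, _tag in node.leaves()))
    return names


def find_person_names(text):
    """Tokenize, POS-tag, and NE-chunk `text`, then keep the PERSON entities."""
    # Imported lazily so the helper above stays usable without NLTK installed.
    import nltk

    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    return extract_person_names(nltk.ne_chunk(tagged))
```

Running `find_person_names` on the two sentences from this notebook would show whether the chunker separates "Bob" from "China" in practice.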