https://stackabuse.com/python-for-nlp-introduction-to-the-pattern-library/

In [5]:
from pattern.en import parse
from pattern.en import pprint

print(parse('I drove my car to the hospital yesterday', relations=True, lemmata=True))

I/PRP/B-NP/O/NP-SBJ-1/i drove/VBD/B-VP/O/VP-1/drive my/PRP$/B-NP/O/NP-OBJ-1/my car/NN/I-NP/O/NP-OBJ-1/car to/TO/O/O/O/to the/DT/B-NP/O/O/the hospital/NN/I-NP/O/O/hospital yesterday/NN/I-NP/O/O/yesterday
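
Each token in the tagged string above is a slash-separated record. The following is a minimal sketch for splitting it with plain string operations, assuming the field order is word, part-of-speech tag, chunk tag, PNP tag, relation tag and lemma. The field labels and the `split_tagged` helper are illustrative names, not identifiers from pattern.

```python
# Recorded output of parse(..., relations=True, lemmata=True) from the cell above.
tagged = ("I/PRP/B-NP/O/NP-SBJ-1/i drove/VBD/B-VP/O/VP-1/drive "
          "my/PRP$/B-NP/O/NP-OBJ-1/my car/NN/I-NP/O/NP-OBJ-1/car "
          "to/TO/O/O/O/to the/DT/B-NP/O/O/the hospital/NN/I-NP/O/O/hospital "
          "yesterday/NN/I-NP/O/O/yesterday")

# Illustrative field labels (assumed order, not names defined by pattern).
FIELDS = ("word", "pos", "chunk", "pnp", "relation", "lemma")

def split_tagged(text):
    """Turn a tagged string into a list of dicts, one per token."""
    return [dict(zip(FIELDS, token.split("/"))) for token in text.split()]

tokens = split_tagged(tagged)
for tok in tokens:
    print(tok["word"], tok["pos"], tok["relation"], tok["lemma"])
```

Read this way, `I` is the subject (`NP-SBJ-1`) and `my car` the object (`NP-OBJ-1`) of the verb phrase `drove`, whose lemma is `drive`.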


Pluralizing and Singularizing the Tokens

In [6]:
from pattern.en import pluralize, singularize

print(pluralize('leaf'))
print(singularize('theives'))  # 'theives' is a misspelling of 'thieves'; the output below is for the misspelled input


leaves
theife


Converting Adjectives to Comparative and Superlative Degrees

In [7]:
from pattern.en import comparative, superlative

print(comparative('good'))
print(superlative('good'))


better
best


Finding N-Grams

In [9]:
from pattern.en import ngrams
print(ngrams("He goes to hospital", n=2))

[('He', 'goes'), ('goes', 'to'), ('to', 'hospital')]
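
To make the idea concrete, here is a rough plain-Python sketch of word n-grams. It is not how pattern implements `ngrams`: the `word_ngrams` helper is a hypothetical name and it only splits on whitespace, so punctuation is not handled.

```python
def word_ngrams(text, n=2):
    """Return every run of n consecutive whitespace-separated words as a tuple."""
    words = text.split()
    # A sliding window of width n; there are len(words) - n + 1 such windows.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

bigrams = word_ngrams("He goes to hospital", n=2)
print(bigrams)
```

For this sentence the sketch yields the same three bigrams as the cell above.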


Finding Sentiments

In [10]:
from pattern.en import sentiment

print(sentiment("This is an excellent movie to watch. I really love it"))

(0.75, 0.8)
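
The returned pair is (polarity, subjectivity): polarity runs from -1.0 (negative) to 1.0 (positive) and subjectivity from 0.0 (objective) to 1.0 (subjective). Below is a small sketch of turning the recorded pair into a label. The `label_polarity` helper and the 0.1 cut-off are assumptions made for illustration only.

```python
# Recorded result from the cell above: (polarity, subjectivity).
polarity, subjectivity = (0.75, 0.8)

def label_polarity(score, cutoff=0.1):
    """Map a polarity score to a coarse label; the cutoff is an illustrative choice."""
    if score > cutoff:
        return "positive"
    if score < -cutoff:
        return "negative"
    return "neutral"

label = label_polarity(polarity)
print(label, polarity, subjectivity)
```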


In [11]:
from pattern.en import parse, Sentence
from pattern.en import modality

text = "Paris is the capital of France"
sent = parse(text, lemmata=True)
sent = Sentence(sent)

print(modality(sent))

1.0


In [12]:
text = "I think we can complete this task"
sent = parse(text, lemmata=True)
sent = Sentence(sent)

print(modality(sent))

0.25


Spelling Correction

In [14]:
from pattern.en import suggest

print(suggest("Whitle"))

[('While', 0.6459209419680404), ('White', 0.2968881412952061), ('Title', 0.03280067283431455), ('Whistle', 0.023549201009251473), ('Chile', 0.0008410428931875525)]
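
`suggest` returns (word, probability) pairs. A short sketch of picking the most likely correction from the recorded list above, using only built-in functions:

```python
# Recorded output of suggest("Whitle") from the cell above.
suggestions = [('While', 0.6459209419680404), ('White', 0.2968881412952061),
               ('Title', 0.03280067283431455), ('Whistle', 0.023549201009251473),
               ('Chile', 0.0008410428931875525)]

# Choose the pair with the highest probability (the second element of each tuple).
best_word, best_prob = max(suggestions, key=lambda pair: pair[1])
print(best_word, round(best_prob, 3))
```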


In [15]:
from pattern.en import suggest
print(suggest("Fracture"))

[('Fracture', 1.0)]


Working with Numbers

In [16]:
from pattern.en import number, numerals

print(number("one hundred and twenty two"))
print(numerals(256.390, round=2))

122
two hundred and fifty-six point thirty-nine


In [17]:
from pattern.en import quantify

print(quantify(['apple', 'apple', 'apple', 'banana', 'banana', 'banana', 'mango', 'mango']))

several bananas, several apples and a pair of mangoes


In [18]:
from pattern.en import quantify

print(quantify({'strawberry': 200, 'peach': 15}))
print(quantify('orange', amount=1200))

hundreds of strawberries and a number of peaches
thousands of oranges


Pattern Library Functions for Data Mining

In [19]:
from pattern.web import download

page_html = download('https://en.wikipedia.org/wiki/Artificial_intelligence', unicode=True)

In [20]:
from pattern.web import URL, extension

page_url = URL('https://upload.wikimedia.org/wikipedia/commons/f/f1/RougeOr_football.jpg')
# Save the downloaded bytes; the context manager closes the file even if the download fails.
with open('football' + extension(page_url.page), 'wb') as image_file:
    image_file.write(page_url.download())

Finding URLs within Text

In [21]:
from pattern.web import find_urls

print(find_urls('To search anything, go to www.google.com', unique=True))

['www.google.com']


Making Asynchronous Requests for Webpages

In [23]:
from pattern.web import asynchronous, time, Google

asyn_req = asynchronous(Google().search, 'artificial intelligence', timeout=4)
while not asyn_req.done:
    time.sleep(0.1)
    print('searching...')

print(asyn_req.value)

print(find_urls(asyn_req.value, unique=True))

searching...
searching...
searching...
searching...
searching...
searching...
searching...
[Result({'url': 'https://en.wikipedia.org/wiki/Artificial_intelligence', 'title': 'Artificial intelligence - Wikipedia', 'text': '<b>Artificial intelligence</b> (<b>AI</b>), sometimes called machine intelligence, is intelligence <br>\ndemonstrated by machines, unlike the natural intelligence displayed by humans<br>\n&nbsp;...'}), Result({'url': 'https://builtin.com/artificial-intelligence', 'title': 'What is Artificial Intelligence? How Does AI Work? | Built In', 'text': '<b>Artificial intelligence</b> (<b>AI</b>) is wide-ranging branch of computer science concerned <br>\nwith building smart machines capable of performing tasks that typically require&nbsp;...'}), Result({'url': 'https://futureoflife.org/background/benefits-risks-of-artificial-intelligence/', 'title': 'Benefits & Risks of Artificial Intelligence - Future of Life Institute', 'text': 'What is <b>AI</b>? From SIRI to self-driving car
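
The same start-then-poll pattern can be sketched with the standard library alone. This is an analogy rather than pattern's implementation: `slow_search` is a hypothetical stand-in for a network call, and `concurrent.futures` replaces `asynchronous`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_search(query):
    """Hypothetical stand-in for a slow network request."""
    time.sleep(0.3)
    return ["result for " + query]

with ThreadPoolExecutor(max_workers=1) as pool:
    # Start the call in a background thread and get a handle to it straight away.
    future = pool.submit(slow_search, "artificial intelligence")
    polls = 0
    # Poll until the background call has finished, as the loop above does with .done.
    while not future.done():
        time.sleep(0.1)
        polls += 1
        print("searching...")
    results = future.result()

print(results)
```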

Getting Search Engine Results with APIs

In [24]:
from pattern.web import Google

google = Google(license=None)
for search_result in google.search('artificial intelligence'):
    print(search_result.url)
    print(search_result.text)


https://en.wikipedia.org/wiki/Artificial_intelligence
<b>Artificial intelligence</b> (<b>AI</b>), sometimes called machine intelligence, is intelligence <br>
demonstrated by machines, unlike the natural intelligence displayed by humans<br>
&nbsp;...
https://builtin.com/artificial-intelligence
<b>Artificial intelligence</b> (<b>AI</b>) is wide-ranging branch of computer science concerned <br>
with building smart machines capable of performing tasks that typically require&nbsp;...
https://futureoflife.org/background/benefits-risks-of-artificial-intelligence/
What is <b>AI</b>? From SIRI to self-driving cars, <b>artificial intelligence</b> (<b>AI</b>) is progressing <br>
rapidly. While science fiction often&nbsp;...
https://www.investopedia.com/terms/a/artificial-intelligence-ai.asp
... <b>Artificial intelligence</b> refers to the simulation of human intelligence in machines <br>
that are programmed to think and act like humans.
https://www.sas.com/en_us/insights/analytics/what-is-artific

In [25]:
from pattern.web import Twitter

twitter = Twitter()
index = None
for j in range(3):
    for tweet in twitter.search('artificial intelligence', start=index, count=3):
        print(tweet.text)
        index = tweet.id

RT @BFCXguru: 9 Soft Skills Every #Employee Will Need In The Age Of #ArtificialIntelligence (#AI) 

https://t.co/lVQd6L5hpM 

#Technology @bernardmarr @MikeQuindazzi @SpirosMargaris @Ronald_vanLoon @andi_staub @sallyeaves @Fabriziobustama @HaroldSinnott @YuHelenYu @TopCyberNews @Hal_Good
A Brief Introduction to Artificial Intelligence #ArtificialIntelligence via https://t.co/qObPsEDIzA https://t.co/dR7MNXvpiI
Does Artificial Intelligence Keep Its Promises? #MachineLearning #learning via https://t.co/NEo7LankbK https://t.co/5PHOR8y1Kh
RT @BFCXguru: 7 Reason Why #ArtificialIntelligence In #Manufacturing Revolutionizing -

https://t.co/QdbjlnUId1 https://t.co/tVAvLXSpWy 

#AI #Technology #ManufacturingIndustry @Techiexpert @SpirosMargaris @JimMarous @Xbond49 @ahier @BrettKing @missmetaverse @psb_dc @leimer @TopCyberNews
“Most strings are random. Most meaningful strings are not.

Compression = modeling + coding. Coding is a solved problem.

Modeling is provably not solvable.
Compression is

Converting HTML Data to Plain Text

In [26]:
from pattern.web import URL, plaintext

html_content = URL('https://stackabuse.com/python-for-nlp-introduction-to-the-textblob-library/').download()
cleaned_page = plaintext(html_content.decode('utf-8'))
print(cleaned_page)

Python for NLP: Introduction to the TextBlob Library

Toggle navigation Stack Abuse

* JavaScript
* Python
* Java
* Jobs

Python for NLP: Introduction to the TextBlob Library

By

Usman Malik

•0 Comments

Introduction

This is the seventh article in my series of articles on Python for NLP. In my previous article, I explained how to perform topic modeling using Latent Dirichlet Allocation and Non-Negative Matrix factorization. We used the Scikit-Learn library to perform topic modeling.

In this article, we will explore TextBlob, which is another extremely powerful NLP library for Python. TextBlob is built upon NLTK and provides an easy to use interface to the NLTK library. We will see how TextBlob can be used to perform a variety of NLP tasks ranging from parts-of-speech tagging to sentiment analysis, and language translation to text classification.

The detailed download instructions for the library can be found at the official link. I would suggest that you install the TextBlob libra
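
For intuition, here is a rough standard-library sketch of tag stripping, applied to a snippet like the search results shown earlier. `plaintext` does considerably more (block elements, lists, whitespace rules); the `TextExtractor` class and `strip_tags` helper are hypothetical names for this sketch.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True by default, so entities are decoded
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(html_text):
    """Drop all tags and collapse runs of whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())

snippet = "<b>Artificial intelligence</b> (<b>AI</b>) is progressing <br>\nrapidly."
cleaned = strip_tags(snippet)
print(cleaned)
```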

Parsing PDF Documents

In [40]:
from pattern.web import URL, PDF
pdf_doc = URL('http://demo.clab.cs.cmu.edu/NLP/syllabus_f18.pdf').download()
#print(PDF(pdf_doc.decode('utf-8')))

Clearing the Cache

In [42]:
from pattern.web import cache

cache.clear()


In [43]:
import os
os.getcwd()

'C:\\Users\\plthi'