Spell Checking by downloading tweets from selected accounts

botify-hq/python-twitter-spell-checking

Twitter Spell Checking

What is it?

Twitter Spell Checking corrects words using a text corpus built from the tweets of selected Twitter accounts.

First, choose the Twitter accounts whose vocabulary best matches the words you want to correct (e.g. news, sport, music, health). Then run the fetch command from a cron job every hour to collect new tweets (twitter_spelling fetch -n [namespace] -c [settings_file_location]). Finally, call the API to check a word.

Note that the corrections are only as good as the accounts you follow: to correct the names of music artists from a specific genre, for example, follow the Twitter accounts that specialise in that genre.

Installation

python setup.py install

Configuration

Create a configuration file like this one:

[twitter]
CONSUMER_KEY = gFlXceJo9HKBeZjXOjgw
CONSUMER_SECRET = QpCq1uQLsBC3FvreKMqzkfKUllf0R22LM6oCP50
ACCESS_TOKEN = 195869-eTemy6p0ljmq2cWL6kGTXTR1BT7BpJqX9uNwzftpo
ACCESS_TOKEN_SECRET = kfAcMwqrJ9sEUjwy1vf0ItL7Nf673DBEnTCLvaM

[namespace:fr]
files = /var/www/twitter_spelling/
accounts = Caradisiac, gizmodofr, Rue89, futurasciences, purecharts, jeuxvideo, LeNouvelObs, sports_direct, 20minutes, SportF24, Slatefr, PremiereFR, LaRedouteFR, liberation_info, purepeople, Maxisciences, aufeminin_com, ZDNet, AutoPlus, lequipe, lemondefr, Telerama, Clubic, GEOfr, lesinrocks, France24_fr, LesEchos, doctissimo, ELLEfrance, Boursier_com, francefootball

[namespace:en]
files = /var/www/twitter_spelling/
accounts = Slate, gizmodo
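
The layout above is a standard INI file, so it can be inspected with Python's configparser module. The following is a minimal sketch of how the namespace sections might be read; it is not taken from the project's code. The function name load_namespaces, the SAMPLE text, and the placeholder credential values are assumptions made for illustration.

```python
import configparser

# Hypothetical sample mirroring the documented layout; credentials are placeholders.
SAMPLE = """
[twitter]
CONSUMER_KEY = your-consumer-key
CONSUMER_SECRET = your-consumer-secret
ACCESS_TOKEN = your-access-token
ACCESS_TOKEN_SECRET = your-access-token-secret

[namespace:en]
files = /var/www/twitter_spelling/
accounts = Slate, gizmodo
"""


def load_namespaces(config_text):
    """Return {namespace: {"files": ..., "accounts": [...]}} from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    namespaces = {}
    for section in parser.sections():
        # Only sections named "namespace:<name>" describe a namespace.
        if not section.startswith("namespace:"):
            continue
        name = section.split(":", 1)[1]
        raw_accounts = parser[section]["accounts"]
        # Accounts are comma-separated; surrounding whitespace is ignored.
        accounts = [a.strip() for a in raw_accounts.split(",") if a.strip()]
        namespaces[name] = {"files": parser[section]["files"], "accounts": accounts}
    return namespaces


namespaces = load_namespaces(SAMPLE)
print(namespaces["en"]["accounts"])
```

Splitting on commas and stripping whitespace means that uneven spacing between account names does not matter in this sketch.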

Fetch the tweets

twitter_spelling fetch -n [namespace] -c [settings_file_location]

The tweets are written to the file [files]/tweets_[namespace].txt, where [files] is the directory set for that namespace in the configuration file.
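
As an illustration of the naming scheme, the path for a given namespace can be derived as in the sketch below. The helper name tweets_file is hypothetical and is not part of the twitter_spelling API.

```python
import os


def tweets_file(files_dir, namespace):
    """Build the path [files]/tweets_[namespace].txt (hypothetical helper)."""
    return os.path.join(files_dir, "tweets_%s.txt" % namespace)


path = tweets_file("/var/www/twitter_spelling/", "fr")
print(path)
```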

Spell Checking

Parts of the code are adapted from Peter Norvig's spelling corrector: http://norvig.com/spell-correct.html

from twitter_spelling import Correct
# settings_file_location is the path to your configuration file
c = Correct(settings_file_location)
c.correct('my expression')

If your namespace follows sports and news accounts:

>>> c.correct('franck riberi')
franck ribery
>>> c.correct('steve job')
steve jobs
>>> c.correct('news york')
new york
>>> c.correct('comunity')
community
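
To give an idea of the underlying technique, here is a minimal sketch in the style of Norvig's corrector: count word frequencies in a corpus, generate every string one edit away from an unknown word, and keep the most frequent known candidate. It is a simplification and not this project's implementation. The names words, edits1, correct_word and correct_expression and the tiny corpus are assumptions for illustration, and only edit distance 1 is considered. Because it treats each word independently and leaves known words untouched, it would not reproduce a correction such as 'news york' to 'new york'.

```python
import re
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word, counts):
    """Return word if known, else the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word  # nothing better found; leave the word unchanged
    return max(candidates, key=counts.get)


def correct_expression(expression, counts):
    """Correct each word of the expression independently."""
    return " ".join(correct_word(w, counts) for w in words(expression))


# Tiny stand-in corpus; in practice the text would come from the fetched tweets.
corpus = "the community welcomed franck ribery and steve jobs to the community event"
counts = Counter(words(corpus))

print(correct_expression("franck riberi", counts))
print(correct_expression("comunity", counts))
```

With a corpus this small each misspelling has a single candidate. On real tweet data the frequency counts decide between competing candidates, which is why the choice of followed accounts matters.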
