In [2]:
%%html
<style>
table {float:left}
</style>

In [91]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')

<h1 id="tocheading">Overview</h1>
<div id="toc"></div>

## Learning Goals

* Learn and understand MongoDB and how to use it from Python
* Learn and understand Twitter REST and streaming APIs and how to use them from Python
* Understand what Sentiment Analysis is
* Build a Twitter sentiment mining application
* Learn how to deploy a data science product as an API

## The Road Ahead

* Mining Twitter Sentiment
* NBC for Spam and Ham
* Kaggle Titanic Competition
* SQL in the Wild!

## Evaluation Scheme

* 10% for submission of Assignment 1 (bdap2015/NoSQL)
* 40% for submission of Assignment 2
* 50% for final exam

## Why Twitter Sentiment Analysis

* Twitter - learn to use the REST and streaming APIs
* Twitter - its data structure is a good fit for learning MongoDB
* Sentiment Analysis - an introduction to machine learning and natural language processing
* Sentiment Analysis - a problem we'll keep revisiting as we learn more sophisticated techniques

# Introduction to MongoDB

## CRUD

### Create/Insert

    mongo twitter
    
    db.users.insert({
      name: "Shakuni Mama",
      email: "shakuni.mama@mahabharata.com",
      age:42
    })

    show collections

    db.users.find()

**Notes**
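* Databases and collections are created lazily: they come into existence when the first document is inserted, not when they are first referenced.
* With greater flexibility comes greater responsibility - beware of typos, since a mistyped collection or field name silently creates a new one.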

**Note: What is special about _id?**
* Auto-generated
* Auto-generated vs Auto-incremented
* Horizontal Sharding

### Read

    db.users.find({ "_id" : ObjectId("566a247ddae35821b3a0c523") })

(select fields)

    db.users.find({ _id : ObjectId("566a247ddae35821b3a0c523") }, { name : 1 })
    db.users.find({ _id : ObjectId("566a247ddae35821b3a0c523") }, { name : 0 }) // omit only name

(more sophisticated queries)

    db.users.find(
        { name : /^P/, age : { $lt : 40 } },
        { name : 1, age : 1 }
    )

(an even more complicated example)

    var age_range = {}
    age_range['$lt'] = 1000000
    age_range['$gt'] = 10000
    
    db.users.find(
        { name : /^P/, age : age_range },
        { name: 1 }
    )

### Update

    db.users.update(
        { _id : ObjectId("4d0ada87bb30773266f39fe5") },
        { $set : { "name" : "Something Else" } }
    );

### Delete

    var bad_bacon = {
        'exports.foods' : {
            $elemMatch : {
                name : 'bacon',
                tasty : false
            }
        }
    }

    db.countries.find( bad_bacon )

    db.countries.remove( bad_bacon )
    db.countries.count()

## JSON

<img src="mongodb_record_as_json_diag.png">

## A quick comparison

<img src="sql_vs_mongodb_schema_arrangement.png">

| Concept | SQL | MongoDB |
|:---|---|---|
| One User                         | One Row                    | One Document |
| All Users                        | Users Table                | Users Collection |
| One Username Per User (1-to-1)   | Username Column            | Username Property |
| Many Emails Per User (1-to-many) | SQL JOIN with Emails Table | Embed relevant email doc in User Document |
| Many Items Owned by Many Users (many-to-many) | SQL JOIN with Items Table | Programmatically Join with Items Collection |
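
The last two rows of the table can be made concrete with plain Python dictionaries, which is how PyMongo represents documents. This is a minimal sketch: the field names (`emails`, `item_ids`), the sample values, and the helper `items_owned_by` are invented for illustration and are not part of any real schema.

```python
# One-to-many: the emails are embedded directly in the user document.
user = {
    "_id": 1,
    "username": "janedoe",
    "emails": [
        {"label": "home", "address": "jane@example.com"},
        {"label": "work", "address": "jdoe@example.org"},
    ],
    # Many-to-many: store references (ids) to documents in another collection.
    "item_ids": [10, 30],
}

# A stand-in for an "items" collection, keyed by _id (hypothetical data).
items = {
    10: {"_id": 10, "name": "laptop"},
    20: {"_id": 20, "name": "bicycle"},
    30: {"_id": 30, "name": "camera"},
}


def items_owned_by(user_doc, items_by_id):
    """Programmatic join: resolve the user's item references in application code."""
    return [items_by_id[item_id] for item_id in user_doc["item_ids"]]


owned_names = [item["name"] for item in items_owned_by(user, items)]
print(owned_names)
```

Reading a user and all of their emails takes a single lookup, while resolving the owned items takes a second step in application code, which is the trade-off the table summarises.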


## MongoDB from Python

### Connect to database

In [33]:
import sys
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

def main():
    """ Connect to MongoDB """
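    try:
        # Connect to the database. MongoClient connects lazily, so send a
        # cheap "ping" command to force a round trip and surface failures here.
        client = MongoClient(host="localhost", port=27017)
        client.admin.command("ping")
        print("Connected successfully")

    except ConnectionFailure as e:
        sys.stderr.write("Could not connect to MongoDB: %s" % e)
        sys.exit(1)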

if __name__ == "__main__":
    main()



Connected successfully


In [9]:
client = MongoClient('localhost', 27017)

# The URI format
client = MongoClient('mongodb://localhost:27017/')

In [38]:
import sys
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

def main():
    """ Connect to MongoDB """
    try:
        # Connect to the database. MongoClient connects lazily, so send a
        # cheap "ping" command to force a round trip and surface failures here.
        client = MongoClient(host="localhost", port=27017)
        client.admin.command("ping")
        print("Connected successfully")

        # Get a Database handle to a database named "twitterdb"
        dbh = client["twitterdb"]
        print("Successfully set up a database handle")

    except ConnectionFailure as e:
        sys.stderr.write("Could not connect to MongoDB: %s" % e)
        sys.exit(1)

if __name__ == "__main__":
    main()


Connected successfully
Successfully set up a database handle


In [10]:
client["twitterdb"]

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'twitterdb')

In [41]:
client.twitterdb

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'twitterdb')

### Create/Insert

In [7]:
import sys
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure
from datetime import datetime

def main():
    """ Connect to MongoDB """
    try:
        # Connect to the database. MongoClient connects lazily, so send a
        # cheap "ping" command to force a round trip and surface failures here.
        client = MongoClient(host="localhost", port=27017)
        client.admin.command("ping")
        print("Connected successfully")

        # Get a Database handle to a database named "twitterdb"
        dbh = client["twitterdb"]
        # assert dbh.client == client
        print("Successfully set up a database handle")

    except ConnectionFailure as e:
        sys.stderr.write("Could not connect to MongoDB: %s" % e)
        sys.exit(1)
        
    user_doc = {
        "username" : "janedoe",
        "firstname" : "Jane",
        "surname" : "Doe",
        "dateofbirth" : datetime(1974, 4, 12),
        "email" : "janedoe74@example.com",
        "score" : 0
        }
    dbh.users.insert_one(user_doc)
    print("Successfully inserted document: %s" % user_doc)

if __name__ == "__main__":
    main()


Connected successfully
Successfully set up a database handle
Successfully inserted document: {'username': 'janedoe', 'surname': 'Doe', 'firstname': 'Jane', 'dateofbirth': datetime.datetime(1974, 4, 12, 0, 0), 'score': 0, '_id': ObjectId('5669f598871c2559f56bb5d1'), 'email': 'janedoe74@example.com'}


**Notes**
* The PyMongo driver supports Python datetime objects: it translates between MongoDB date values and Python datetime objects, so we don't have to convert between the two ourselves.
* As noted before, we don't have to create the “users” collection before inserting documents into it.

In [11]:
result = client.twitterdb.users.insert_one({
    "username" : "Pavitra",
    "firstname" : "Pavitra",
    "surname" : "Pravakar",
    "dateofbirth" : datetime(1986, 4, 12),
    "email" : "spiderman@marvelheroes.com",
    "score" : 0
})
result.inserted_id

ObjectId('5669f5a6871c2559f56bb5d2')

### Read

In [101]:
user_doc = client.twitterdb.users.find_one({"username" : "janedoe"})
if not user_doc:
    print("no document found for username janedoe")

In [107]:
users = client.twitterdb.users.find({"username":"janedoe"})
for user in users:
    print(user.get("email"))

janedoe74@example2.com
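
The regular-expression and range queries from the shell section translate directly into Python dictionaries. The sketch below assumes `collection` is any object exposing PyMongo's `find(filter, projection)` method; `build_user_query` and `find_young_users` are illustrative helper names, not part of PyMongo, and the `name` and `age` fields follow the shell example rather than the `twitterdb.users` documents.

```python
import re


def build_user_query(name_prefix, max_age):
    """Return (filter, projection) for users whose name starts with
    name_prefix and whose age is below max_age."""
    query_filter = {
        # Shell equivalent: name : /^P/
        "name": {"$regex": "^" + re.escape(name_prefix)},
        # Shell equivalent: age : { $lt : 40 }
        "age": {"$lt": max_age},
    }
    # Return only these fields (plus _id, which is included by default).
    projection = {"name": 1, "age": 1}
    return query_filter, projection


def find_young_users(collection, name_prefix, max_age):
    """Run the query against any collection-like object with a find() method."""
    query_filter, projection = build_user_query(name_prefix, max_age)
    return collection.find(query_filter, projection)
```

With a live connection this could be called as `find_young_users(client.twitterdb.users, "P", 40)` and iterated like any other cursor.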


### Update

In [78]:
user_doc = {
    "username" : "janedoe",
    "firstname" : "Jane",
    "surname" : "Doe",
    "dateofbirth" : datetime(1974, 4, 12),
    "email" : "janedoe74@example.com",
    "score" : 0
}

In [97]:
# first query to get a copy of the current document
import copy
old_user_doc = client.twitterdb.users.find_one({"username":"janedoe"})
new_user_doc = copy.deepcopy(old_user_doc)

# modify the copy to change the email address
new_user_doc["email"] = "janedoe74@example2.com"

# run the update query
# replace the matched document with the contents of new_user_doc
client.twitterdb.users.replace_one({"username":"janedoe"}, new_user_doc)

<pymongo.results.UpdateResult at 0x7f64113405f0>

Building the whole replacement document is cumbersome and, worse, can introduce race conditions. Suppose you want to increment the score property of the “janedoe” user. With the replacement approach you would have to fetch the document, modify it with the incremented score, and then write it back to the database. If something else updates the score between your read and your write, that change is silently lost.

To solve this problem, the update document supports an additional set of MongoDB operators called “update modifiers”. These include atomic increment/decrement, atomic list push/pop, and so on. Knowing which update modifiers are available, and what they can do, is very helpful when designing your application.
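
As a sketch of the score example described above, the `$inc` modifier lets the server perform the increment in a single atomic step, so there is no read-modify-write window. The helper name `increment_score` is invented for illustration; `collection` is assumed to be a PyMongo collection (or anything with a compatible `update_one` method).

```python
def increment_score(collection, username, amount=1):
    """Atomically add `amount` to the user's score on the server.

    No document is fetched first, so concurrent increments from other
    clients cannot be overwritten by this call.
    """
    return collection.update_one(
        {"username": username},        # which document to update
        {"$inc": {"score": amount}},   # update modifier: atomic increment
    )
```

For example, `increment_score(client.twitterdb.users, "janedoe")` would raise the score by one, and a negative `amount` decrements it.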

In [87]:
client.twitterdb.users.update_one({"username":"janedoe"},
                {"$set":{"email":"janedoe74@example2.com"}})

<pymongo.results.UpdateResult at 0x7f63dffeb0f0>

In [92]:
client.twitterdb.users.update_one({"username":"janedoe"},
                 {"$set":{"email":"janedoe74@example2.com", "score":1}})

<pymongo.results.UpdateResult at 0x7f63dffeb190>

In [2]:
result = client.twitterdb.users.update_one({"username":"janedoe"},
                 {"$set":{"email":"janedoe74@example2.com", "score":1}})
result.modified_count


### Delete

In [90]:
client.twitterdb.users.delete_one({"score":1})

<pymongo.results.DeleteResult at 0x7f63dffeb140>

# The Twitter API in Action

## Organization of Twitter Data

<img src="teamindia_tweet.png">

A Tweet contains:
* date and time
* links
* user mentions (@)
* hash tags (#)
* retweets count
* locale language
* favorites count
* geocode


## Accessing Twitter Data

### REST API

* [Twitter REST API Documentation](https://dev.twitter.com/rest/public)

### Streaming API

* [Twitter Streaming API Documentation](https://dev.twitter.com/streaming/overview)

### OAuth

* [Twitter OAuth Documentation](https://dev.twitter.com/oauth)
* Instructions for getting access:
    - Create a Twitter account
    - Go to https://apps.twitter.com/
    - Create New App (button on top right corner(-ish))
    - Fill out details in the next page. Value of *Website* doesn't matter right now (use http://google.com). Create your Twitter application.
    - In the next screen, select the *Keys and Access Tokens* tab.
    - Note down the following credentials:
        * Consumer Key (API Key)
        * Consumer Secret (API Secret)
    - Click on *Create my access token*. After tokens are generated, note down the following credentials:
        * Access Token
        * Access Token Secret
    - Add the credentials to *.profile*
        * .profile vs .bashrc vs .Renviron
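
Once the credentials are exported from *.profile*, it helps to fail early with a clear message if any of them is missing. The sketch below assumes the four environment variable names used later in this notebook; the helper `load_twitter_credentials` is a hypothetical name, not part of Twython.

```python
import os

# Environment variable names assumed to be exported from .profile
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Passing a plain dictionary as `env` makes the check easy to try out without touching the real environment.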

## Introduction to Twython

    pip install twython

* [Official Twython Documentation](https://twython.readthedocs.org/en/latest/)
* Supports both REST and Streaming APIs
* For more wrappers, see https://dev.twitter.com/overview/api/twitter-libraries

### Searching by Topic

In [30]:
import os
TWITTER_CONSUMER_KEY = os.environ["TWITTER_CONSUMER_KEY"]
TWITTER_CONSUMER_SECRET = os.environ["TWITTER_CONSUMER_SECRET"]
TWITTER_ACCESS_TOKEN = os.environ["TWITTER_ACCESS_TOKEN"]
TWITTER_ACCESS_TOKEN_SECRET = os.environ["TWITTER_ACCESS_TOKEN_SECRET"]

In [33]:
from twython import Twython
twitter = Twython(TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET, TWITTER_ACCESS_TOKEN, TWITTER_ACCESS_TOKEN_SECRET)

In [35]:
result = twitter.search(q="bigdata")

**Note**
* If Twython fails to authenticate, result will have the following JSON as its value:
        {"errors":[{"message":"Bad Authentication data", "code":215}]}
* If successful, Twython converts the JSON it receives into a native Python object.

In [36]:
for status in result["statuses"]:
    print(status)

{u'contributors': None, u'truncated': False, u'text': u'RT @analyticbridge: Top LinkedIn Groups for Analytics, Big Data, and Data Science  \nhttps://t.co/PrIONSq1pU https://t.co/zSXAoDo7aZ', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 675102837379104768, u'favorite_count': 0, u'source': u'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', u'retweeted': False, u'coordinates': None, u'entities': {u'symbols': [], u'user_mentions': [{u'id': 14174897, u'indices': [3, 18], u'id_str': u'14174897', u'screen_name': u'analyticbridge', u'name': u'Big Data Science'}], u'hashtags': [], u'urls': [{u'url': u'https://t.co/PrIONSq1pU', u'indices': [84, 107], u'expanded_url': u'http://www.datasciencecentral.com/profiles/blogs/top-linkedin-groups-for-analytics-big-data-data-mining-and-data', u'display_url': u'datasciencecentral.com/profiles/blogs\u2026'}], u'media': [{u'source_user_id': 14174897, u'source_status_id_str': u'675034264702988288', u'expanded_url': u'ht

In [39]:
for status in result["statuses"]:
    print("user: {0} text: {1}".format(status["user"]["name"], 
                                       status["text"]))

user: Inside Analysis text: RT @analyticbridge: Top LinkedIn Groups for Analytics, Big Data, and Data Science  
https://t.co/PrIONSq1pU https://t.co/zSXAoDo7aZ
user: Podsystem M2M text: What Is #IoT? Why it Matters and What You Should Do @HuffPostBiz https://t.co/ViuQMrNztM #InternetOfThings #BigData https://t.co/o9BP9bHnVq
user: Podsystem Group text: What Is #IoT? Why it Matters and What You Should Do @HuffPostBiz https://t.co/4OM6sOhz6H #InternetOfThings #BigData https://t.co/MJXUqI2br7
user: Inside Analysis text: RT @TheWebAnalytics: Basic Google Analytics Filters for Every Site https://t.co/bcL93342xX
user: Sarah Austin text: #DataScience acknowledges causation to test scientifically. The reverse denies the scientific method and is called data magic. #bigdata #AI
user: Inside Analysis text: RT @BigdataProfiles: Atlanta Hawks Select RetailNext For In-Store Retail Analytics Solutions https://t.co/OtV99dWZii
user: Inside Analysis text: RT @ActiveDEMAND: Are your campaigns effective? K

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 139: ordinal not in range(128)

In [48]:
result = twitter.search(q="data science")
for status in result["statuses"]:
    print("user: {0} \n text: {1} \n".format(status["user"]["name"].encode("utf-8"), 
                                             status["text"].encode("utf-8")))

user: Jojari CannonEdwards 
 text: RT @HostingSocial: Cleversafe Founder Donates $7.6M to Computer Science at Illinois Tech: The founder of big data storage provide... https:… 

user: Hossam Hassanien 
 text: RT @DataScience_Top: Follow the top Data Science stories for Dec 11 on our topical page: https://t.co/aGhVGWKd6k 

user: Angelo N Ferrara DC 
 text: SCIENCE BASED HYPOTHESIS AND METHOD SUPPORTED BY EMPIRICAL DATA AND CONFIRMED BY CLINICAL DOCUMENTED   RESULTS 

user: kazuhiro arai 
 text: RT @O_S_M: New @O_S_M Biological Data: https://t.co/VCUlwmXp96 including compounds made by @SydneyChemistry @Sydney_Science UG! #openscience 

user: DataScienceTopNews 
 text: Follow the top Data Science stories for Dec 11 on our topical page: https://t.co/aGhVGWKd6k 

user: Inside Analysis 
 text: RT @Ronald_vanLoon: 30 Cant miss Harvard Business Review articles on Data Science, Big Data and Analytics | #Da… https://t.co/pcTrbZhdhl ht… 

user: parvez ahmed 
 text: RT @SmartDataCo: 4 Effortless T

More documentation at https://dev.twitter.com/rest/reference/get/search/tweets

### Retrieving Timeline (your own)

In [52]:
timeline = twitter.get_home_timeline()

In [64]:
for tweet in timeline:
    print(" User: {0} \n Created: {1} \n Text: {2} \n".format(tweet["user"]["name"].encode("utf-8"), 
                                                            tweet["created_at"].encode("utf-8"), 
                                                            tweet["text"].encode("utf-8")))

 User: Anuj More 
 Created: Fri Dec 11 00:01:45 +0000 2015 
 Text: The proof for this one is surprisingly difficult:
3x + 1 conjecture: https://t.co/lgNUgEZC6f 

 User: Vinay Pathak 
 Created: Thu Dec 10 23:46:23 +0000 2015 
 Text: https://t.co/RBEtnj2wn7 via youtube
Thank you @ravishndtv for an authentic journalistic and liberal view! We def need more like you ! 👍🏽👏🏽 

 User: John Myles White 
 Created: Thu Dec 10 23:36:46 +0000 2015 
 Text: This makes me want to spend my holidays doing RL coding: https://t.co/Zeys4mISfI 

 User: Data Science Central 
 Created: Thu Dec 10 23:05:40 +0000 2015 
 Text: Enterprises that Have Truly Mastered Customer Experience https://t.co/7cUM3EjbQp 

 User: Cloudera 
 Created: Thu Dec 10 23:00:05 +0000 2015 
 Text: Only 24 hrs left to help shape our #BigData Predictions! Tell us how your org's use of #BigData is evolving https://t.co/UG2PgZF5H2 

 User: David Smith 
 Created: Thu Dec 10 22:39:59 +0000 2015 
 Text: RT @EricKnorr: Matt Asay nails it: The s

### Retrieving Timeline (other users)

In [65]:
tl = twitter.get_user_timeline(screen_name = "iamsrk", count = 5)
for tweet in tl:
    print(" User: {0} \n Created: {1} \n Text: {2} \n".format(tweet["user"]["name"].encode("utf-8"),
                                                            tweet["created_at"].encode("utf-8"),
                                                            tweet["text"].encode("utf-8")))

 User: Shah Rukh Khan 
 Created: Thu Dec 10 23:40:07 +0000 2015 
 Text: It’s exactly the same!!! Well done. https://t.co/tseZAzxUaR 

 User: Shah Rukh Khan 
 Created: Thu Dec 10 23:16:20 +0000 2015 
 Text: Yes does look like mine!! https://t.co/siY2EzkhbH 

 User: Shah Rukh Khan 
 Created: Thu Dec 10 22:26:32 +0000 2015 
 Text: Thank you Pritamda. My favourite mad man.  https://t.co/Xm5414woJO 

 User: Shah Rukh Khan 
 Created: Thu Dec 10 22:24:59 +0000 2015 
 Text: I wish u the best my brother.  https://t.co/m4ClJnOvYp 

 User: Shah Rukh Khan 
 Created: Thu Dec 10 22:10:04 +0000 2015 
 Text: How beautiful are you!!! https://t.co/qwYJso9WjF 



* [Official Documentation for home timeline](https://dev.twitter.com/rest/reference/get/statuses/home_timeline)
* [Official Documentation for (other) user timeline](https://dev.twitter.com/rest/reference/get/statuses/user_timeline)

### Get a list of followers

In [66]:
followers = twitter.get_followers_list(screen_name="dataBiryani")

In [67]:
for follower in followers["users"]:
    print(" {0} \n ".format(follower))

 {u'follow_request_sent': False, u'has_extended_profile': True, u'profile_use_background_image': True, u'default_profile_image': False, u'id': 4319212154, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'blocked_by': False, u'profile_text_color': u'333333', u'muting': False, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/668283219041583105/uwLl7mid_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 10, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'4319212154', u'profile_background_color': u'C0DEED', u'listed_count': 0, u'status': {u'contributors': None, u'truncated': False, u'text': u'https://t.co/RknemKN92V', u'in_reply_to_status_id': None, u'id': 668322466410790912, u'favorite_count': 0, u'source': u'<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>', u'retweeted': False, u'coordinates

In [68]:
for follower in followers["users"]:
    print(" user: {0} \n name: {1} \n Number of tweets: {2} \n".format(follower["screen_name"],
                                                                       follower["name"],
                                                                       follower["statuses_count"]))

 user: ArmanRathod3 
 name: paritosh bhandarkar 
 Number of tweets: 1 

 user: RiddhikRathod 
 name: Riddhik Rathod 
 Number of tweets: 3 

 user: hiteshbjirawla 
 name: Hitesh B Jirawla 
 Number of tweets: 17 

 user: chiraggambhira 
 name: chirag gambhira 
 Number of tweets: 3 

 user: SaiBundel 
 name: Sai Singh Bundel 
 Number of tweets: 0 

 user: singhsaloni1151 
 name: singhsaloni 
 Number of tweets: 2 

 user: akhuperkar 
 name: Abhijit Khuperkar 
 Number of tweets: 130 

 user: shash_007 
 name: Shashwat Chaturvedi 
 Number of tweets: 1117 

 user: ChaturBalakSays 
 name: Chatur Balak 
 Number of tweets: 20 

 user: Mumbairb 
 name: Mumbai Ruby Group 
 Number of tweets: 209 

 user: JawaharKeerthi 
 name: Jawahar, keerthi 
 Number of tweets: 2 

 user: cskofjbims 
 name: Chandrashekhar SK 
 Number of tweets: 26 

 user: 17lokeshm 
 name: lokesh kumar luha 
 Number of tweets: 0 

 user: nayangupta21 
 name: NAYAN GUPTA 
 Number of tweets: 54 

 user: TrlManoj 
 name: Manoj Amar

# Sentiment Classification

## What is Sentiment Classification?

Sentiment classification is a special case of text classification: the objective is to classify a text according to the polarity of the opinions it contains, for example favorable or unfavorable, positive or negative.

<img src="sentiment_classification_process.png">

## Dataset

### Affective Norms for English Words

* ANEW (Affective Norms for English Words) provides a set of normative emotional ratings for a large number of English words.
* The words have been rated in terms of pleasure, arousal, and dominance, in order to create a standard for use in studies of emotion and attention.

### Sentiment140

* http://help.sentiment140.com/for-students
* Download link
    - http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip
    - (mirror) https://docs.google.com/file/d/0B04GJPshIjmPRnZManQwWEdTZjg/edit
* The data has been processed so that the emoticons are stripped off.
* CSV format
* Data file format has 6 fields:
    - 0 - the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
    - 1 - the id of the tweet (2087)
    - 2 - the date of the tweet (Sat May 16 23:58:44 UTC 2009)
    - 3 - the query (lyx). If there is no query, then this value is NO_QUERY.
    - 4 - the user that tweeted (robotickilldozr)
    - 5 - the text of the tweet (Lyx is cool)
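
The six-field layout above can be turned into `(text, label)` pairs with the standard `csv` module. This is a sketch under stated assumptions: the label strings and the helper name `parse_sentiment140` are invented here, and the sample row is assembled from the example values in the list, with polarity 4 assumed for it.

```python
import csv
import io

# Polarity codes from the dataset description; the label strings are our choice.
POLARITY_LABELS = {"0": "negative", "2": "neutral", "4": "positive"}


def parse_sentiment140(lines):
    """Yield (text, label) tuples from an iterable of Sentiment140 CSV lines."""
    for row in csv.reader(lines):
        if len(row) != 6:
            continue  # skip rows that do not have the six expected fields
        polarity, _tweet_id, _date, _query, _user, text = row
        label = POLARITY_LABELS.get(polarity)
        if label is not None:
            yield (text, label)


# A single in-memory sample row standing in for the real file.
sample = io.StringIO(
    '"4","2087","Sat May 16 23:58:44 UTC 2009","lyx","robotickilldozr","Lyx is cool"\n'
)
rows = list(parse_sentiment140(sample))
print(rows)
```

The same function accepts an open file object, so the downloaded CSV can be passed in place of the sample once its encoding has been checked.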

## NLTK

In [117]:
import nltk
nltk.download() # Models > punkt

showing info http://www.nltk.org/nltk_data/


True

In [111]:
nltk.word_tokenize("Busy day ahead of me. Also just remembered that I left peah slices in the fridge at work on Friday. ")

['Busy',
 'day',
 'ahead',
 'of',
 'me',
 '.',
 'Also',
 'just',
 'remembered',
 'that',
 'I',
 'left',
 'peah',
 'slices',
 'in',
 'the',
 'fridge',
 'at',
 'work',
 'on',
 'Friday',
 '.']

In [None]:
def bagOfWords(tweets):
    wordsList = []
    for (words, sentiment) in tweets:
        wordsList.extend(words)
    return wordsList

In [None]:
def wordFeatures(wordList):
    wordList = nltk.FreqDist(wordList)
    wordFeatures = wordList.keys()
    return wordFeatures

In [113]:
def getFeatures(doc):
    docWords = set(doc)
    feat = {}
    for word in wordFeatures:
        feat['contains(%s)' % word] = (word in docWords)
    return feat
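
To see what these helpers produce, here is a self-contained sketch that mirrors them without NLTK. The toy corpus is invented for illustration, the function names are deliberately different from the ones above, and the vocabulary is passed in explicitly instead of being read from a global variable.

```python
def bag_of_words(tweets):
    """Flatten [(tokens, sentiment), ...] into one list of tokens."""
    words = []
    for tokens, _sentiment in tweets:
        words.extend(tokens)
    return words


def get_features(doc_tokens, vocabulary):
    """Map every vocabulary word to True/False depending on whether the
    document contains it."""
    doc_words = set(doc_tokens)
    return dict(("contains(%s)" % word, word in doc_words) for word in vocabulary)


# Hypothetical, already tokenized training tweets.
toy_corpus = [
    (["love", "this", "phone"], "positive"),
    (["hate", "this", "weather"], "negative"),
]

vocabulary = sorted(set(bag_of_words(toy_corpus)))
features = get_features(["love", "weather"], vocabulary)
print(features)
```

Each tweet becomes a fixed-length dictionary of boolean features, one per vocabulary word, which is the input shape a Naive Bayes classifier is trained on.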

In [None]:
# Fill these with lists of (tweet_text, sentiment) tuples built from the Sentiment140 dataset
positiveTweets = ???
negativeTweets = ???

In [None]:
corpusOfTweets = []
for (words, sentiment) in positiveTweets + negativeTweets:
    wordsFiltered = [e.lower() for e in nltk.word_tokenize(words) if len(e) >= 3]
    corpusOfTweets.append((wordsFiltered, sentiment))

In [114]:
wordFeatures = wordFeatures(bagOfWords(corpusOfTweets))
training = nltk.classify.apply_features(getFeatures, corpusOfTweets)
classifier = nltk.NaiveBayesClassifier.train(training)
classifier.show_most_informative_features(32)  # prints the table itself and returns None


**Predicting Sentiment of new Tweets**

In [None]:
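from twython import Twython

# Reuse the credentials loaded from the environment earlier in the notebook
twitter = Twython(TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET,
                  TWITTER_ACCESS_TOKEN, TWITTER_ACCESS_TOKEN_SECRET)

result = twitter.search(q="python")

for status in result["statuses"]:
    # Tokenize new tweets the same way the training tweets were tokenized
    words = [e.lower() for e in nltk.word_tokenize(status["text"]) if len(e) >= 3]
    print("Tweet: {0} \n Sentiment: {1}".format(status["text"].encode("utf-8"),
                                                classifier.classify(getFeatures(words))))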

# What Next?

## The Assignment

1. Write a blog post on how to use the **OR** (`$or`) operator in MongoDB find queries.
2. Feed negative and positive tweets to the classification function for training. (using the Sentiment140 dataset)
3. Crawl all followers of ***naveen_odisha***, Odisha CM (note: you'll have to pay attention to rate limiting)
4. Crawl all followers of SRK. How can you calculate if this is feasible or not? (show the math)
5. Predict the sentiment of tweets by followers of ***naveen_odisha*** 