# Exercise 2 Solution

The purpose of this exercise is to help you get familiar with the Python tweepy library for downloading data from Twitter. Follow the instructions below to complete the exercise. Use the template notebook below to write and execute your Python program. Rename the template file to exercise2.ipynb and submit it via D2L. You should also convert the notebook to PDF or HTML and submit the converted file as well.

**Step 1.** Create a Twitter account and create a new Twitter application using the account. Make sure you write down your consumer and access tokens and secret keys.

**Step 2.** Following the Python tweepy example given in the lecture, use the Twitter Search (REST) API to download the 30 most recent tweets from the Twitter feed of "CDCgov". This can be done by changing the keyword string given in the sample code to "from:CDCgov". For more information about the query term to use, read the documentation available at https://dev.twitter.com/rest/public/search. Make sure you install tweepy on your machine before running the code. Save the text messages of the returned tweets into an ASCII text file named cdc.txt. **Make sure you save only the text part of the tweet message, NOT the entire JSON message** (which includes username, coordinates, etc.). See the example given in the lecture notes on how to access the text part of the tweets. The following is an example of how to save a string into a file:

with open("filename", "w") as f:
    text = "this is a string"
    f.write(text)
    f.write("\n")
    
**Solution:**

In [1]:
import tweepy
from tweepy import OAuthHandler
from tweepy import API

# Replace the X's with your consumer tokens and access tokens.
# To create the application, go to https://dev.twitter.com

C_KEY = 'XXXXXXX'
C_SECRET = 'XXXXXXX'
A_TOKEN_KEY = 'XXXXXXX'
A_TOKEN_SECRET = 'XXXXXXX'

auth = tweepy.OAuthHandler(C_KEY, C_SECRET)
auth.set_access_token(A_TOKEN_KEY, A_TOKEN_SECRET)
api = tweepy.API(auth)

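# The "from:" operator restricts the search to tweets posted by the CDCgov account.
keyword = 'from:CDCgov'
# Note: tweepy 4.x renamed api.search to api.search_tweets.
posts = api.search(q=keyword, count=30)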

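# Write only the text of each tweet. In Python 3, str(text.encode(...)) would
# write the bytes repr (b'...'), so write the string itself and let open()
# produce ASCII, replacing any non-ASCII character with '?'.
with open('cdc.txt', 'w', encoding='ascii', errors='replace') as f:
    for tweet in posts:
        f.write(tweet.text)
        f.write('\n')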
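
Because tweets often contain emoji and other non-ASCII characters, saving them to an ASCII file requires choosing what happens to those characters. The sketch below compares two options offered by the `errors` argument of `open()`: `"replace"` substitutes `?`, while `"ignore"` drops the character. The function name `save_ascii_lines` and the sample strings are hypothetical and stand in for real `tweet.text` values; this is an illustration, not part of the lecture code.

```python
import os
import tempfile


def save_ascii_lines(texts, path, errors="replace"):
    """Write one string per line as ASCII, using the given error policy."""
    with open(path, "w", encoding="ascii", errors=errors) as f:
        for text in texts:
            # Tweets may contain newlines; flatten them so each tweet is one line.
            f.write(text.replace("\n", " "))
            f.write("\n")


def read_lines(path):
    """Read the file back as a list of lines without trailing newlines."""
    with open(path, encoding="ascii") as f:
        return f.read().splitlines()


# Hypothetical sample strings standing in for tweet.text values.
samples = ["Get your flu shot \U0001F489 today", "Wash your hands\noften"]
tmp_dir = tempfile.mkdtemp()

replace_path = os.path.join(tmp_dir, "replace.txt")
save_ascii_lines(samples, replace_path, errors="replace")
replaced = read_lines(replace_path)

ignore_path = os.path.join(tmp_dir, "ignore.txt")
save_ascii_lines(samples, ignore_path, errors="ignore")
ignored = read_lines(ignore_path)

print(replaced)
print(ignored)
```

Either policy yields a valid ASCII file; `"replace"` keeps a visible marker where a character was lost, which can make later inspection easier.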

---
**Step 3.** Use the Twitter Streaming API to download tweets containing the keyword "health". Set the time limit to 30 seconds and store the results into a file named health.json.

**Solution:**

In [2]:
import time
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

C_KEY = 'XXXXXXX'
C_SECRET = 'XXXXXXX'
A_TOKEN_KEY = 'XXXXXXX'
A_TOKEN_SECRET = 'XXXXXXX'

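# Create a StreamListener subclass that stops the stream after time_limit seconds.
# Note: this uses the tweepy 3.x API; tweepy 4.x merged StreamListener into Stream.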
class MyListener(StreamListener):

    def __init__(self, time_limit=30):
        self.start_time = time.time()
        self.limit = time_limit
        self.outFile = open('health.json', 'w')
        super(MyListener, self).__init__()
    
    def on_data(self, data):
        if (time.time() - self.start_time) < self.limit:
            self.outFile.write(data.strip())     
            self.outFile.write('\n') 
            return True
        else:
            self.outFile.close()
            return False

    def on_error(self, status):
        print(status)
        
auth = OAuthHandler(C_KEY,C_SECRET)
auth.set_access_token(A_TOKEN_KEY,A_TOKEN_SECRET)
myStream = Stream(auth, MyListener(time_limit=30))
myStream.filter(track=[ 'health'])
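
The time-limit logic in `on_data()` can be checked without a network connection or Twitter credentials. The sketch below reproduces that logic in a hypothetical class, `TimedLineWriter`, which is not part of tweepy; it takes an injectable `clock` function (an assumption made for testability) so the 30-second limit can be simulated instantly.

```python
import io
import time


class TimedLineWriter:
    """Hypothetical stand-in for the listener: accepts messages until a time limit passes."""

    def __init__(self, out, time_limit=30, clock=time.time):
        self.out = out              # any object with a write() method
        self.limit = time_limit     # limit in seconds
        self.clock = clock          # injectable so the limit can be simulated
        self.start_time = clock()

    def on_data(self, data):
        # Same contract as the listener: True keeps the stream open, False stops it.
        if self.clock() - self.start_time < self.limit:
            self.out.write(data.strip())   # drop the trailing "\r\n" of the raw message
            self.out.write("\n")           # write exactly one record per line
            return True
        return False


# Simulated clock: construction reads 0, then the three on_data calls read 10, 20, 31.
ticks = iter([0, 10, 20, 31])
buf = io.StringIO()
writer = TimedLineWriter(buf, time_limit=30, clock=lambda: next(ticks))

results = [writer.on_data('{"id": %d}\r\n' % i) for i in range(3)]
print(results)
print(buf.getvalue())
```

The third call arrives at simulated second 31, so it is rejected and only two records are written, one per line.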

In [3]:
# Check that your JSON file is stored correctly: each line should hold one tweet.
# If the json.loads command fails, make sure your on_data() method handles the
# newline character at the end of each tweet message, as the listener above does:
#    self.outFile.write(data.strip())
#    self.outFile.write('\n')

import json

with open('health.json') as f:
    tweets = [json.loads(line) for line in f]
    for twt in tweets:
        print(twt['id'], 'posted', twt['text'], '\n')

1087895614493466625 posted @SamHeughan hi Sam, just saw you’re movie “The Spy.....” u were wonderful!!! U would be the perfect Bond; dashing s… https://t.co/X6oTlNOHo8

1087895614711611393 posted This is murder. Plan and Simple. @POTUS @VP where is a judge to stop this? 

1087895616867483648 posted RT @drtlaleng: Abortion is Health Care. 

#RoeAt46 https://t.co/pvhzx3ztU8 

1087895617538408448 posted RT @rajniwadhwa902: #StayFitStayHealthySaysStMSG  Saint Dr Gurmeet Ram Rahim Singh Ji Insan educates the young generation to be health cons…

1087895618561998859 posted RT @SitaramYechury: Wealth of top 9 billionaires, is equivalent to the bottom 50%. Centre &amp; states together spend on medical, public health…

1087895618595573761 posted Everyday I see where coaches stand with athletes, I wonder do they ever know when pushing someone too hard is enoug… https://t.co/Hg8A41t5Ui

1087895619056881668 posted RT @conservmillen: Late term abortions now legalized in NY, in whic
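
A file written from a stream may contain lines that are not complete tweets, for example blank lines, a record cut off if the process was interrupted, or non-tweet notices sent by the streaming endpoint. The sketch below shows one defensive way to load such a file. The function name `load_tweets`, the rule that a tweet must have both `id` and `text`, and the sample lines are all assumptions made for illustration.

```python
import json


def load_tweets(lines):
    """Parse JSON lines, keeping only objects that look like tweets."""
    tweets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue                      # skip a record that is not valid JSON
        # Assumption: a tweet is a JSON object with both 'id' and 'text'.
        if isinstance(obj, dict) and "id" in obj and "text" in obj:
            tweets.append(obj)
    return tweets


# Hypothetical sample input: one tweet, one blank line, one notice, one cut-off record.
sample = [
    '{"id": 1, "text": "health tips", "user": {"id": 7}}\n',
    '\n',
    '{"limit": {"track": 3}}\n',
    '{"id": 2, "text": "trunc',
]
tweets = load_tweets(sample)
for twt in tweets:
    print(twt["id"], "posted", twt["text"])
```

With a real file, the same function can be called as `load_tweets(f)` inside a `with open('health.json') as f:` block, since a file object iterates over its lines.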

**Step 4:** Install MySQL server on your machine. Go to https://dev.mysql.com/downloads/ and download the installation files. After installing, create a user named 'cse482' with the password 'cse482'. Also create a database named 'cse482'. To do this, log in to the mysql client as administrator and then perform the following steps:

mysql> CREATE USER 'cse482' IDENTIFIED BY 'cse482';

mysql> CREATE DATABASE cse482;

mysql> GRANT ALL PRIVILEGES ON cse482.* TO 'cse482'@'%';

Note that the preceding commands are only available if you're the administrator of the database.

**Step 5:** Install the mysql-connector-python library. Create a MySQL table named 'Health', which has the following three columns:

    TweetID BIGINT PRIMARY KEY,

    UserID BIGINT,
    
    Message VARCHAR(200)


In [6]:
uname = "cse482"
pwd = "cse482"
hname = "localhost"
dbname = "cse482"

import mysql.connector
from mysql.connector import errorcode

try:
    cnx = mysql.connector.connect(user=uname, password=pwd, host=hname, database=dbname)
    cursor = cnx.cursor()
    query = "DROP TABLE IF EXISTS Health"
    cursor.execute(query)

    query = """ CREATE TABLE IF NOT EXISTS Health (TweetID bigint PRIMARY KEY, UserID bigint, Message varchar(200) )"""
    cursor.execute(query)
    cnx.commit()
    cursor.close()
    cnx.close()
except mysql.connector.Error as e:
    print(e.msg)       # error message

**Step 6:** Open the health.json file you've created. For each tweet, insert the tweetID, userID, and tweet text message into the Health table in MySQL.

In [7]:
import json
try:
    cnx = mysql.connector.connect(user=uname, password=pwd, host=hname, database=dbname)
    cursor = cnx.cursor()
    
    with open('health.json') as f:
        tweets = [json.loads(line) for line in f]
        for twt in tweets:
            tweetID = twt['id']
            userID = twt['user']['id']
            msg = twt['text']

            # Pass the values as parameters so the connector handles quoting
            # and escaping (apostrophes, backslashes) in the tweet text.
            query = "INSERT INTO Health VALUES (%s, %s, %s)"
            cursor.execute(query, (tweetID, userID, msg))
            
    cnx.commit()
    cursor.close()
    cnx.close()
    
except mysql.connector.Error as e:
    print(e.msg)       # error message
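
Passing values as query parameters, rather than concatenating them into the SQL string, leaves quoting and escaping to the database driver. The sketch below demonstrates the idea with Python's built-in `sqlite3` module as a stand-in for MySQL, so it runs without a server; this substitution is an assumption made for illustration. Note that `sqlite3` uses `?` placeholders, whereas mysql-connector-python uses `%s`. The sample records are hypothetical.

```python
import sqlite3

# Hypothetical records shaped like the fields read from health.json in Step 6.
records = [
    {"id": 1, "user": {"id": 10}, "text": "It's a healthy day"},
    {"id": 2, "user": {"id": 11}, "text": 'Back\\slash and "quotes"'},
]

# In-memory SQLite database used as a stand-in for the MySQL server.
cnx = sqlite3.connect(":memory:")
cursor = cnx.cursor()
cursor.execute(
    "CREATE TABLE Health (TweetID BIGINT PRIMARY KEY, UserID BIGINT, Message VARCHAR(200))"
)

# The driver binds each value, so apostrophes and backslashes need no manual escaping.
rows = [(r["id"], r["user"]["id"], r["text"]) for r in records]
cursor.executemany("INSERT INTO Health VALUES (?, ?, ?)", rows)
cnx.commit()

# Read the rows back to confirm the messages were stored unchanged.
cursor.execute("SELECT TweetID, UserID, Message FROM Health ORDER BY TweetID")
stored = cursor.fetchall()
print(stored)

cursor.close()
cnx.close()
```

Both messages come back exactly as they went in, including the apostrophe, the backslash, and the double quotes.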