# Querying Twitter

This notebook shows how to run general search queries against Twitter with Twython, and how to retrieve the authenticated user's home timeline.

## Libraries

In [1]:
from twython import Twython
import pandas as pd
from pandas import DataFrame

## Get connection credentials

In [2]:
# Read in from csv
creds = pd.read_csv("twitter_creds.csv")
# Remove unnecessary column
creds = creds.drop(["Unnamed: 0"], axis=1)

In [4]:
# Store credentials in variables
# strip() removes leading and trailing whitespace from each value
APP_KEY = creds["Value"][0].strip()
APP_SECRET = creds["Value"][1].strip()
OAUTH_TOKEN = creds["Value"][2].strip()
OAUTH_TOKEN_SECRET = creds["Value"][3].strip()
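
The positional lookups above fail silently if the file has a blank or missing row. A minimal sketch of a stricter unpacking step follows; the helper `unpack_credentials` and the constant `CREDENTIAL_NAMES` are hypothetical names introduced here, and the row order is assumed to match the cell above.

```python
# Hypothetical helper: names and row order are assumptions taken from the cell above.
CREDENTIAL_NAMES = ("APP_KEY", "APP_SECRET", "OAUTH_TOKEN", "OAUTH_TOKEN_SECRET")


def unpack_credentials(values):
    """Strip each value and map it to its credential name, in order."""
    cleaned = [str(v).strip() for v in values]
    if len(cleaned) != len(CREDENTIAL_NAMES):
        raise ValueError(
            "expected %d credential values, got %d"
            % (len(CREDENTIAL_NAMES), len(cleaned))
        )
    # Report which credentials are empty after stripping
    empty = [name for name, v in zip(CREDENTIAL_NAMES, cleaned) if not v]
    if empty:
        raise ValueError("empty credential(s): " + ", ".join(empty))
    return dict(zip(CREDENTIAL_NAMES, cleaned))


# Made-up placeholder values, not real credentials
example = unpack_credentials([" key ", "secret\n", "token", " token_secret"])
print(example["APP_KEY"])
```

In the notebook this could be called as `unpack_credentials(creds["Value"])`, since iterating a pandas Series yields its values.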

## Get timeline, store in variable and close api connection

In [37]:
# Connect to api
twitter = Twython(APP_KEY, APP_SECRET,
                 OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# Perform query
results = twitter.get_home_timeline(count=200)
# Delete the client object (this drops the reference; it does not explicitly close a connection)
del twitter

## Perform a query, store in variable and close api connection

In [33]:
# Connect to api
twitter = Twython(APP_KEY, APP_SECRET,
                 OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# Perform query
myquery = "Trump -filter:retweets -filter:replies"
results = twitter.search(q = myquery,
                         count = 100,
                         result_type = "popular",
                         tweet_mode = "extended",
                         include_entities = False)

# Delete the client object (this drops the reference; it does not explicitly close a connection)
del twitter
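
The query string above combines a search term with two exclusion operators. As a rough sketch, the string can be assembled by a small helper so the filters can be toggled; `build_query` is a hypothetical name introduced here, not part of Twython.

```python
def build_query(term, exclude_retweets=True, exclude_replies=True):
    """Build a search string in the same form as `myquery` above."""
    parts = [term]
    if exclude_retweets:
        parts.append("-filter:retweets")
    if exclude_replies:
        parts.append("-filter:replies")
    return " ".join(parts)


myquery = build_query("Trump")
print(myquery)
```

The result can be passed as `q` to `twitter.search` exactly as in the cell above.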

## Inspect query results (for timeline results see below)

In [106]:
# Top-level keys of the dict returned by the query
for r in results:
    print(r)

statuses
search_metadata


In [34]:
# How many tweets were returned?
len(results["statuses"])

15

In [134]:
# results["statuses"][0]

In [35]:
# Create list containers to store query results, which will be combined into a dataframe
screen_name = []
verified = []
description = []
created_at = []
full_text = []
favorite_count = []
retweet_count = []

In [36]:
# Collect fields from each tweet returned by the query
# (uncomment the print calls to inspect the values)
for item in results["statuses"]:
    # print(item["user"]["screen_name"])
    screen_name.append(item["user"]["screen_name"])
    # print(item["user"]["verified"])
    verified.append(item["user"]["verified"])
    # print(item["user"]["description"])
    description.append(item["user"]["description"])
    # print(item["created_at"])
    created_at.append(item["created_at"])
    # print(item["full_text"])
    full_text.append(item["full_text"])
    # print("Favorite count:", item["favorite_count"])
    favorite_count.append(item["favorite_count"])
    # print("Retweet count:", item["retweet_count"])
    retweet_count.append(item["retweet_count"])
    # print("-" * 50)
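
The seven parallel lists above can also be replaced by one row dict per tweet. The sketch below assumes the same fields as the loop above; `flatten_status` and the sample payload are hypothetical and made up for illustration. It falls back to `text` when `full_text` is absent, which is the case when `tweet_mode = "extended"` is not requested.

```python
def flatten_status(item):
    """Pick the fields used in this notebook out of one tweet dict."""
    user = item["user"]
    return {
        "screen_name": user["screen_name"],
        "verified": user["verified"],
        "description": user["description"],
        "created_at": item["created_at"],
        # "full_text" is present only in extended mode; otherwise use "text"
        "full_text": item.get("full_text", item.get("text")),
        "favorite_count": item["favorite_count"],
        "retweet_count": item["retweet_count"],
    }


# Made-up sample in the shape of results["statuses"]
sample_statuses = [
    {
        "created_at": "Sun Mar 11 05:39:00 +0000 2018",
        "full_text": "An example tweet",
        "favorite_count": 3,
        "retweet_count": 1,
        "user": {
            "screen_name": "example_user",
            "verified": False,
            "description": "An example account",
        },
    }
]

rows = [flatten_status(s) for s in sample_statuses]
print(rows[0]["screen_name"])
```

A list of dicts like `rows` can be passed directly to `pd.DataFrame(rows)` to obtain the same columns.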

In [37]:
# Combine tweet data into a dataframe
tweet_data = {"screen_name":screen_name,
              "verified":verified,
              "description":description,
              "created_at":created_at,
              "full_text":full_text,
              "favorite_count":favorite_count,
              "retweet_count":retweet_count}
df_tweets = pd.DataFrame(tweet_data)

In [38]:
df_tweets

| | created_at | description | favorite_count | full_text | retweet_count | screen_name | verified |
|---|---|---|---|---|---|---|---|
| 0 | Sun Mar 11 05:39:00 +0000 2018 | “Ignorance, allied with power, is the most fer... | 38444 | At some point we need to confront the fact tha... | 13377 | JoyAnnReid | True |
| 1 | Sun Mar 11 04:50:20 +0000 2018 | Journalist, storyteller, and lifelong reader. ... | 30241 | Tonight Pres. Trump once again stoked a crowd ... | 11492 | DanRather | True |
| 2 | Sun Mar 11 00:48:29 +0000 2018 | Political Advisor to @HillaryClinton. Dad. Ukr... | 9981 | Trump is currently talking about Conor Lamb an... | 4885 | AdamParkhomenko | True |
| 3 | Sat Mar 10 20:27:24 +0000 2018 | President Donald J Trump's Most Outspoken & Lo... | 5212 | McCain Outed Secretly Plotting to ‘Take Down’ ... | 2839 | DiamondandSilk | True |
| 4 | Sun Mar 11 03:34:26 +0000 2018 | President of The New Agenda. Advocate for wome... | 2183 | Yea, this seems normal too:\n“Trump would be a... | 2753 | Amy_Siskind | True |
| 5 | Sat Mar 10 16:01:00 +0000 2018 | President of The New Agenda. Advocate for wome... | 8917 | Folks, to be clear: when I say this scandal co... | 2999 | Amy_Siskind | True |
| 6 | Sun Mar 11 02:10:18 +0000 2018 | Founder of @NextGenAmerica & @Need2Impeach. Wo... | 26616 | If you ever wondered whether Mr. Trump was del... | 10118 | TomSteyer | True |
| 7 | Sun Mar 11 16:41:50 +0000 2018 | Founder & Exec. Director of @TPUSA Proud capit... | 7851 | "Experts" said Trump would lose in historic fa... | 3633 | charliekirk11 | True |
| 8 | Sat Mar 10 07:25:05 +0000 2018 | Author, Lecturer, Columnist (Justia), CNN cont... | 11939 | Cult expert finds Trump and followers fit as a... | 6039 | JohnWDean | True |
| 9 | Sun Mar 11 14:44:27 +0000 2018 | 🎥 filmmaker. 📚 author. 🎤 speaker. My bestselli... | 10417 | Actually no. Trump insults white people all t... | 3614 | DineshDSouza | True |


In [39]:
# Save df_tweets to csv
df_tweets.to_csv("df_tweets.csv",
                sep = ",")

In [31]:
for t in df_tweets["full_text"]:
    print(t)
    print("-" * 50)

"It correctly identified which signals were planets and which signals were not planets 96 percent of the time"

Read more about our planet-finding model here → https://t.co/EvD6keED9Z

Learn more #MadeWithTensorFlow stories at #TFDevSummit → https://t.co/VthKCxDYz9 https://t.co/DEDAmSpPPp
--------------------------------------------------
TensorFlow 1.6.0 has been released! 

Please see the full release notes for details on added features and changes ↓ https://t.co/FnLsn99u2F
--------------------------------------------------
We’re excited to release the #TensorFlow model for processing @NASAKepler data, training our #neuralnetwork, and making predictions about new exoplanet candidate signals. We hope this release will prove a useful starting point for developing similar models https://t.co/oRzNHLwJRa
--------------------------------------------------
TensorFlow is the platform of choice for deep learning in the research community. These are deep learning framework mentions on arXiv ov

## Inspect timeline results

In [None]:
# How many items were returned
len(results)

In [48]:
# Print example of a tweet returned from timeline
# results[50]

In [68]:
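# Attributes of a tweet.
# `results` must hold the list returned by get_home_timeline. If the search
# cell was run last, `results` is a dict and results[0] raises KeyError: 0.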
for item in results[0]:
    print(item)

KeyError: 0
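
The timeline call returns a list of tweets, while the search call returns a dict whose tweets sit under `"statuses"`, so indexing a search result with `0` fails. A minimal sketch of a helper that accepts either shape follows; `as_status_list` is a hypothetical name, and the sample data is made up.

```python
def as_status_list(results):
    """Return the list of tweet dicts from either result shape."""
    if isinstance(results, dict):
        # Search results: tweets are stored under the "statuses" key
        return results["statuses"]
    # Timeline results: already a list of tweets
    return list(results)


# Made-up samples in the two shapes described above
search_like = {"statuses": [{"id": 1}, {"id": 2}], "search_metadata": {}}
timeline_like = [{"id": 3}]

print(len(as_status_list(search_like)))
print(len(as_status_list(timeline_like)))
```

With this in place, `for key in as_status_list(results)[0]:` lists a tweet's attributes regardless of which cell assigned `results` last.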

In [69]:
# Get screen names (shown with a leading @ in the Twitter UI) and display names
# of timeline tweets
for t in results:
    print("%s : %s" % (t["user"]["screen_name"], t["user"]["name"]))

In [45]:
# Print timeline tweets posted by a specific account
for t in results:
    if t["user"]["screen_name"] == "realDonaldTrump":
        print(t["user"]["screen_name"])
        print(t["text"])
        print(t["favorite_count"])
        print("-" * 50)

realDonaldTrump
Rasmussen and others have my approval ratings at around 50%, which is higher than Obama, and yet the political pund… https://t.co/x6MUw9P9sQ
45099
--------------------------------------------------
realDonaldTrump
The Democrats continue to Obstruct the confirmation of hundreds of good and talented people who are needed to run o… https://t.co/AVCxZlPVIJ
39589
--------------------------------------------------
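
The filter above can be wrapped in a small reusable function. This is only a sketch: `tweets_from` is a hypothetical helper, the sample timeline is made up, and the comparison is case-insensitive on the assumption that screen names should match regardless of capitalization.

```python
def tweets_from(timeline, screen_name):
    """Return the timeline tweets posted by the given screen name."""
    wanted = screen_name.lower()
    return [t for t in timeline if t["user"]["screen_name"].lower() == wanted]


# Made-up sample in the shape of a home timeline (a list of tweet dicts)
sample_timeline = [
    {"user": {"screen_name": "alice"}, "text": "first", "favorite_count": 2},
    {"user": {"screen_name": "bob"}, "text": "second", "favorite_count": 5},
    {"user": {"screen_name": "Alice"}, "text": "third", "favorite_count": 1},
]

for t in tweets_from(sample_timeline, "alice"):
    print(t["text"], t["favorite_count"])
```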
