# Data Science and Transnational Security Collider

### Archiving Twitter Data
The first step in our project is to download tweets from terrorist or terrorist-like sources. The following code saves a user's recent tweets to the 'archived_tweets' folder.

These functions will be used in the process:

In [11]:
def load_keys(path):
    """Loads your Twitter authentication keys from a file on disk.
    
    Args:
        path (str): The path to your key file.  The file should
          be in JSON format and look like this (but filled in):
            {
                "consumer_key": "<your Consumer Key here>",
                "consumer_secret":  "<your Consumer Secret here>",
                "access_token": "<your Access Token here>",
                "access_token_secret": "<your Access Token Secret here>"
            }
    
    Returns:
        dict: A dictionary mapping key names (like "consumer_key") to
          key values."""
    import json
    # Loading keys
    with open(path) as f:
        keys = json.load(f)
    return keys

def download_recent_tweets_by_user(user_account_name, keys):
    """Downloads tweets by one Twitter user.

    Args:
        user_account_name (str): The name of the Twitter account
          whose tweets will be downloaded.
        keys (dict): A Python dictionary with Twitter authentication
          keys (strings), like this (but filled in):
            {
                "consumer_key": "<your Consumer Key here>",
                "consumer_secret":  "<your Consumer Secret here>",
                "access_token": "<your Access Token here>",
                "access_token_secret": "<your Access Token Secret here>"
            }

    Returns:
        list: A list of Status objects, each representing one tweet."""
    import tweepy
    auth = tweepy.OAuthHandler(keys["consumer_key"], keys["consumer_secret"])
    auth.set_access_token(keys["access_token"], keys["access_token_secret"])
    api = tweepy.API(auth)

    # Getting up to 100 of the user's most recent tweets:
    tweets = list(tweepy.Cursor(api.user_timeline, screen_name=user_account_name).items(100))
    return tweets

def save_tweets(tweets, path):
    """Saves a list of tweets to a file in the local filesystem.

    Args:
        tweets (list): A list of tweet objects (of type Status) to
          be saved.
        path (str): The place where the tweets will be saved.

    Returns:
        None"""
    # Saving the tweets to a file as "pickled" objects:
    with open(path, "wb") as f:
        import pickle
        pickle.dump(tweets, f)

def archive_tweets(user_account_name, keys_path):
    """Archives recent tweets from one user.
    
    Args:
        user_account_name (str): The Twitter handle of a user, without the @.
        keys_path (str): The path to a JSON keys file in your filesystem.
    """
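    import datetime
    import os
    # Timestamp used in the file name, e.g. 04-17-2017___13-16-59
    st = datetime.datetime.now().strftime('%m-%d-%Y___%H-%M-%S')
    # Create the output folder if it does not exist yet
    os.makedirs("archived_tweets", exist_ok=True)
    save_path = "archived_tweets/{}___{}.pkl".format(user_account_name, st)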
    keys = load_keys(keys_path)
    tweets = download_recent_tweets_by_user(user_account_name, keys)
    save_tweets(tweets, save_path)
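
Pickled `Status` objects can only be loaded where tweepy is importable. As an alternative sketch, assuming each tweet object exposes the raw API payload as a `_json` dictionary (as tweepy `Status` objects do), the tweets could be stored as plain JSON instead. `save_tweets_json` is a hypothetical helper and not part of the original notebook.

```python
import json

def save_tweets_json(tweets, path):
    """Save the raw payload of each tweet to a JSON file.

    Assumes every element of `tweets` has a `_json` attribute holding
    a JSON-serializable dictionary.
    """
    payloads = [t._json for t in tweets]
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps emoji and other non-ASCII text readable
        json.dump(payloads, f, ensure_ascii=False)
```

A file written this way can be read back with `json.load`, which returns a list of dictionaries rather than `Status` objects.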

Now we will archive the tweets. Set `user_account_name` in the cell below to the handle of the user whose tweets you want to download, without the @. Then run the cell.

Ex. @ali_wetrill (follow me!!) would be "ali_wetrill"

In [15]:
"""
Saves to the 'archived_tweets' folder as 'account_name___date___time.pkl'
"""
user_account_name = "obouu_7"

keys_path = "keys.json"

archive_tweets(user_account_name, keys_path)

TweepError: Twitter error response: status code = 401
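
A 401 status means Twitter rejected the request as unauthorized. One possible cause is a key file with missing or placeholder values. The sketch below is one way to check the key file before calling `archive_tweets`. The helper `find_key_problems` is hypothetical and not part of the original notebook. It only checks the shape of the dictionary and cannot tell whether the keys themselves are valid.

```python
REQUIRED_KEYS = ("consumer_key", "consumer_secret",
                 "access_token", "access_token_secret")

def find_key_problems(keys):
    """Return a list of human-readable problems with a keys dictionary."""
    problems = []
    for name in REQUIRED_KEYS:
        value = keys.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append("missing or empty: " + name)
        elif value.startswith("<") and value.endswith(">"):
            # Looks like the "<your Consumer Key here>" template text
            problems.append("still a placeholder: " + name)
    return problems
```

Calling `find_key_problems(load_keys("keys.json"))` returns an empty list when all four entries are present and filled in.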

### Loading Twitter Data

The following cells will load the data.

In [8]:
def load_tweets(path):
    """Loads tweets that have previously been saved.
    
    Args:
        path (str): The place where the tweets were saved.

    Returns:
        list: A list of Status objects, each representing one tweet."""
    with open(path, "rb") as f:
        import pickle
        tweets = pickle.load(f)
    return tweets

Now run the next cell, setting `filename` to the file you want to load. You can copy the filename from the 'archived_tweets' folder.

In [9]:
"""
Note: the filename must end in '.pkl' and must be in quotation marks.

Ex. "ali_wetrill___02-23-2017___17-34-36.pkl" is VALID
Ex. ali_wetrill___02-23-2017___17-34-36.pkl is invalid (no quotation marks)
Ex. "ali_wetrill___02-23-2017___17-34-36" is invalid (no '.pkl')
"""

filename = "KylieJenner___04-17-2017___13-16-59.pkl"

tweet_data = load_tweets("archived_tweets/" + filename)
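
Instead of copying a filename by hand, the newest archive for an account could be found from the timestamp in its name. This is a minimal sketch that assumes files follow the 'account_name___MM-DD-YYYY___HH-MM-SS.pkl' pattern produced by `archive_tweets`. `latest_archive` and `STAMP_FORMAT` are hypothetical names, not part of the original notebook.

```python
import datetime
import glob
import os

STAMP_FORMAT = "%m-%d-%Y___%H-%M-%S"

def latest_archive(account, folder="archived_tweets"):
    """Return the path of the newest archive for an account, or None."""
    candidates = []
    for path in glob.glob(os.path.join(folder, account + "___*.pkl")):
        name = os.path.basename(path)
        # Strip the "account___" prefix and the ".pkl" suffix
        stamp = name[len(account) + 3:-len(".pkl")]
        try:
            when = datetime.datetime.strptime(stamp, STAMP_FORMAT)
        except ValueError:
            continue  # skip files that do not match the naming pattern
        candidates.append((when, path))
    return max(candidates)[1] if candidates else None
```

The timestamps are parsed rather than compared as text because the month-first format does not sort chronologically as a string.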

In [10]:
# We can access the latest tweet by running this code
tweet_data[0].text

'RT @julianarose_3: GIRL your hair is popppinnn 😻😻😭 so in love 💚 @KylieJenner https://t.co/z2eCHWXMKT'

We can load selected tweet attributes into a pandas DataFrame for further analysis.

In [24]:
import pandas as pd

def make_dataframe(tweets):
    """Make a DataFrame from a list of tweets, with a few relevant fields.
    
    Args:
        tweets (list): A list of tweets, each one a Status object.
    
    Returns:
        DataFrame: A Pandas DataFrame containing one row for each element
          of tweets and one column for each relevant field."""
    df = pd.DataFrame()
    df['text'] = [t.text for t in tweets] 
    df['created_at'] = [t.created_at for t in tweets] 
    df['favorites'] = [t.favorite_count for t in tweets] 
#     df['location'] = [t.coordinates for t in tweets]
#     df['place'] = [t.place for t in tweets]
    df['reply_users'] = [t.in_reply_to_screen_name for t in tweets]
#     df['has_link'] = [t.possibly_sensitive for t in tweets]
    return df

In [25]:
terr_df = make_dataframe(tweet_data)

pd.set_option('display.max_colwidth', 150)

terr_df.head(5)

| | text | created_at | favorites | reply_users |
|---|---|---|---|---|
| 0 | RT @julianarose_3: GIRL your hair is popppinnn 😻😻😭 so in love 💚 @KylieJenner https://t.co/z2eCHWXMKT | 2017-04-15 17:57:58 | 0 | |
| 1 | RT @gcardenas_: @kyliecosmetics @KylieJenner 's Salted Caramel kylighter 😍✨ the package is so pretty https://t.co/DC3HFaToy3 | 2017-04-14 20:49:03 | 0 | |
| 2 | RT @Anna_Stewart22: @KylieJenner nail appreciation because she changed the game with these nails💅💕 https://t.co/d3sfKuyF3G | 2017-04-13 00:04:10 | 0 | |
| 3 | I just want Clay 😫 https://t.co/abobBlzYY7 | 2017-04-12 02:48:00 | 71775 | |
| 4 | Chic-Fil-A sauce is that even a question . https://t.co/vsnxIp3Rlz | 2017-04-11 01:57:37 | 21569 | |
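
As one example of further analysis, the sketch below separates retweets from original tweets, since the retweets in the output above all show zero favorites. It assumes a retweet can be recognized by text starting with "RT @", which is a heuristic and not something the notebook establishes. `add_retweet_flag` is a hypothetical helper, and the sample rows are made up for illustration.

```python
import pandas as pd

def add_retweet_flag(df):
    """Return a copy of df with a boolean 'is_retweet' column."""
    out = df.copy()
    out["is_retweet"] = out["text"].str.startswith("RT @")
    return out

# Made-up sample data with the same column names as make_dataframe
sample = pd.DataFrame({
    "text": ["RT @someone: hello", "an original tweet", "another original"],
    "favorites": [0, 10, 30],
})
flagged = add_retweet_flag(sample)

# Average favorites over original tweets only
original_mean = flagged.loc[~flagged["is_retweet"], "favorites"].mean()
```

With a real archive, the same function could be applied to the DataFrame returned by `make_dataframe`.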
