# Tweepy
> Author: Andrew Eng | Date: 2020-10-05

## Objectives
This notebook serves as a scratch pad for learning how to use [Tweepy](http://docs.tweepy.org/en/latest/index.html).  

In [34]:
# Import tweepy as tw
import tweepy as tw

from datetime import datetime as dt
from elasticsearch import Elasticsearch

## Authentication

I setup my api credentials in a text file called twitter.keys in /home/andrew.  This is so my api keys won't accidentally upload when I git push the repository to the public.

The format of my twitter.keys is:

api_key = <key>
    
api_secret_key = <key>

access_token = <key>

access_token_secret = <key>

In [35]:
# Import keys from a saved file instead of inputting it directly into the script.  
# Strip whitespaces and split on = as I only want the key values

key_location = "/home/andrew/twitter.keys"
apikeys = []
with open(key_location) as keys:
    for i in keys:
        apikeys.append(i.split("=")[1].strip(" ").strip("\n"))
keys.close()

# Initialize dictionary
twitter_cred = dict()

# Enter API keys
twitter_cred["CONSUMER_KEY"] = apikeys[0]
twitter_cred["CONSUMER_SECRET"] = apikeys[1]

# Access Tokens
twitter_cred["ACCESS_KEY"] = apikeys[2]
twitter_cred["ACCESS_SECRET"] = apikeys[3]

# Set authentication object

auth = tw.OAuthHandler(twitter_cred["CONSUMER_KEY"], twitter_cred["CONSUMER_SECRET"])
auth.set_access_token(twitter_cred["ACCESS_KEY"], twitter_cred["ACCESS_SECRET"])

# Create api object with authentication

api = tw.API(auth, wait_on_rate_limit=True)

es = Elasticsearch('127.0.0.1', port=9200)

## Hello World

To test out if my api keys are working, let's grab some tweets from my timeline using home_timeline() method

In [36]:
public_tweets = api.home_timeline()
count = 1
for tweet in public_tweets:
    print(f"TWEET #{count} - {tweet.text} \n")
    count +=1

TWEET #1 - Share your crypto story for the chance to win Binance swag and #BNB! #MyCryptoLife 

TWEET #2 - https://t.co/1rJtYb5PPp Is a Provably Fair Gambling Casino Supporting Multiple Cryptocurrencies… https://t.co/4IPKMzkzEq 

TWEET #3 - Innovative Artificial Intelligence (AI) Tool Optimizes Care for T2D Patients. #dataresponsible #bigdata #aiethics https://t.co/uFYJe2d5IM 

TWEET #4 - [Course Video] Data Science and Machine Learning: Course Introduction https://t.co/WGAkBrxMvl https://t.co/hZXToQf1xx 

TWEET #5 - AttackDefense Labs | Maintaining Access Persistence: Leveraging PostgreSQL https://t.co/5w8vhj0EsP… https://t.co/c2proC1rD3 

TWEET #6 - Litecoin has successfully redeployed its MimbleWimble testnet as regulators grow increasingly wary of privacy-enhan… https://t.co/aHQjTnUHkW 

TWEET #7 - Over 1900+ live online labs and 1500+ HD Videos at a 70% Discount! https://t.co/TmEy1qdtyJ Stay Safe and Happy Lear… https://t.co/c3Eg8yn6jw 

TWEET #8 - Faraday Future plans to go publi

## Methods

In [29]:
user = api.get_user("JATayler")

print(f"Screen name is: {user.screen_name}")
print(f"There are {user.followers_count} followers")
print(f"Currently following:")
for friend in user.friends():
   print(f"    - {friend.screen_name}")

Screen name is: JATayler
There are 6369 followers
Currently following:
    - VanHicklestein
    - viewtoakel
    - gfstarr1
    - ms_creilly
    - theshortgirlash
    - manymanywords
    - heyadiana
    - _hkdl_
    - mariokartdwi
    - lalavin666
    - mollylambert
    - normcharlatan
    - jcmccaffrey
    - slwein
    - RichardStaff
    - JasonRRMartinez
    - CurbedNY
    - THECITYNY
    - davelevitan
    - whstancil


## Authentication

There are 2 authentication methods that can be used:
1. OAuth 1a Authentication
    - application-user authenticaiton
2. OAuth 2 Authentication
    - applicaiton-only authentication

In [30]:
auth = tw.OAuthHandler(twitter_cred['CONSUMER_KEY'], twitter_cred['CONSUMER_SECRET'])

# For applications that need a callback URL
#auth = tweepy.OAuthHandler(twitter_cred['CONSUMER_KEY'], twitter_cred['CONSUMER_SECRET'], callback_url)


Authentican workflow as follows:

1. Get a request token from twitter
2. Redirect user to twitter.com to authorize application
3. If using callback, twitter will redirect the user to our callback, otherwise, user must suppply verifier code
4. Exchange the authorized request token for an access token

In [31]:
try:
    redirect_url = auth.get_authorization_url()
    print('It works!')
except tweepy.TweepError:
    print('Error! Failed to get request token.')

It works!


## Query Data

In [18]:
feed = {}

for tweet in tw.Cursor(api.search, q='trump', tweet_mode='extended').items(1):
    feed.update(tweet._json)




In [19]:
# Use this cell to see the different keys and items

feed['retweeted_status']['user']['id']

KeyError: 'retweeted_status'

In [37]:
def acqData(search, acq):

    index_name = search.split(' ')[0] + '-' + dt.today().strftime('%Y-%m-%d')
    feed = []
    
    print('::Acquiring Data::')
   
    for tweet in tw.Cursor(api.search, q=search, tweet_mode='extended').items(acq):
        feed.append(tweet._json)

    count = 0
    
    print('::Transferring to Elasticsearch Search::')
    
    while count < len(feed):
        tweet_date = feed[count]['created_at']
        username = feed[count]['user']['screen_name']
        account_creation_date = feed[count]['created_at']
        user_description = feed[count]['user']['description']
        user_url = feed[count]['user']['url']
        verified_status = feed[count]['user']['verified']
        geo_enabled = feed[count]['user']['geo_enabled']
        friends_count = feed[count]['user']['friends_count']
        followers_count = feed[count]['user']['followers_count']
        retweeted_count = feed[count]['retweet_count']
        favorite_count = feed[count]['favorite_count']
        hashtags = feed[count]['entities']['hashtags']
        tweet_full_text = feed[count]['full_text']

        doc = {
            '@timestamp': dt.now(),
            'tweet_date': tweet_date,
            'username': str(username),
            'account_creation_date': str(account_creation_date),
            'user_description': str(user_description),
            'user_url': str(user_url),
            'verified_status': bool(verified_status),
            'geo_enabled': bool(geo_enabled),
            'friends_count': int(friends_count),
            'followers_count': int(followers_count),
            'retweeted_count': int(retweeted_count),
            'favorite_count': int(favorite_count),
            'hashtags': hashtags,
            'tweet_full_text': str(tweet_full_text),
            'word_list': str(tweet_full_text).split(' ')
        }

        es.index(index=index_name, body=doc)
        print(f'{count}: {doc}')
        
        count +=1

In [38]:
acqData('palantir OR PLTR', 100)

::Acquiring Data::
::Transferring to Elasticsearch Search::
0: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 270709), 'tweet_date': 'Tue Oct 06 07:06:37 +0000 2020', 'username': 'lindadavies56', 'account_creation_date': 'Tue Oct 06 07:06:37 +0000 2020', 'user_description': '', 'user_url': 'None', 'verified_status': False, 'geo_enabled': False, 'friends_count': 582, 'followers_count': 419, 'retweeted_count': 22, 'favorite_count': 0, 'hashtags': [], 'tweet_full_text': 'RT @StefSimanowitz: 6/. On 8 July, the UK govt sidelined SAGE in favour of the secretive Joint Biosecurity Centre headed by a senior spy\n\nO…', 'word_list': ['RT', '@StefSimanowitz:', '6/.', 'On', '8', 'July,', 'the', 'UK', 'govt', 'sidelined', 'SAGE', 'in', 'favour', 'of', 'the', 'secretive', 'Joint', 'Biosecurity', 'Centre', 'headed', 'by', 'a', 'senior', 'spy\n\nO…']}
1: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 288355), 'tweet_date': 'Tue Oct 06 07:04:49 +0000 2020', 'username': 'StockAlertsA

14: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 468150), 'tweet_date': 'Tue Oct 06 06:44:14 +0000 2020', 'username': 'KoichiTsunoda', 'account_creation_date': 'Tue Oct 06 06:44:14 +0000 2020', 'user_description': 'ヤプリってノーコードSaaSの会社で取締役CFOとベンチプレスやってます。採用したい → iOS/Android/Goエンジニア、広報 | @UCBerkeley → 外資金融 → CFO @マナボ（売却）| Living a startup dream in Tokyo. No-code B2B SaaS.', 'user_url': 'None', 'verified_status': False, 'geo_enabled': False, 'friends_count': 985, 'followers_count': 8164, 'retweeted_count': 0, 'favorite_count': 6, 'hashtags': [], 'tweet_full_text': '直接上場は広まってほしいけど、全てを解決する魔法じゃないしもっと色んな選択肢があって良い。SPACも話題だけど主幹事をつけながら自社でアロケーションしたUnityのIPOも気になる\n\nなぜAsanaやPalantir、Spotify、Slackは伝統的IPOを回避したのか？ https://t.co/CgAnRYHLr1 via @coral_capital', 'word_list': ['直接上場は広まってほしいけど、全てを解決する魔法じゃないしもっと色んな選択肢があって良い。SPACも話題だけど主幹事をつけながら自社でアロケーションしたUnityのIPOも気になる\n\nなぜAsanaやPalantir、Spotify、Slackは伝統的IPOを回避したのか？', 'https://t.co/CgAnRYHLr1', 'via', '@coral_capital']}
15: {'@timestamp': datetim

34: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 683426), 'tweet_date': 'Tue Oct 06 06:18:42 +0000 2020', 'username': 'NanaB_2010', 'account_creation_date': 'Tue Oct 06 06:18:42 +0000 2020', 'user_description': 'Psych major, Behavior Analyst  #BlackLivesMatter #Autism  #PTSD #BP #SpecialOlympics #Advocacy #Climate #Equality', 'user_url': 'None', 'verified_status': False, 'geo_enabled': False, 'friends_count': 1530, 'followers_count': 829, 'retweeted_count': 6, 'favorite_count': 0, 'hashtags': [], 'tweet_full_text': 'RT @MijenteComite: For the past two weeks, folks have been chasing them across the country in protests demanding they cut the ICE contracts…', 'word_list': ['RT', '@MijenteComite:', 'For', 'the', 'past', 'two', 'weeks,', 'folks', 'have', 'been', 'chasing', 'them', 'across', 'the', 'country', 'in', 'protests', 'demanding', 'they', 'cut', 'the', 'ICE', 'contracts…']}
35: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 695155), 'tweet_date': 'Tue Oct 06 06

53: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 890571), 'tweet_date': 'Tue Oct 06 06:03:01 +0000 2020', 'username': 'pitbullet8', 'account_creation_date': 'Tue Oct 06 06:03:01 +0000 2020', 'user_description': '元T企業勤務 ＞ FIRE中。脳内リセットしてランダムに学びながら、米国ETF・個別株の中長期投資。投資は学び続けるための一種のモチベーション', 'user_url': 'None', 'verified_status': False, 'geo_enabled': False, 'friends_count': 103, 'followers_count': 44, 'retweeted_count': 0, 'favorite_count': 0, 'hashtags': [], 'tweet_full_text': 'https://t.co/0laBeyss9A', 'word_list': ['https://t.co/0laBeyss9A']}
54: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 43, 901549), 'tweet_date': 'Tue Oct 06 05:59:22 +0000 2020', 'username': 'itarumusic', 'account_creation_date': 'Tue Oct 06 05:59:22 +0000 2020', 'user_description': '@kai_ogita とジャーナリストや専門家、作家などの個人ブランド向けに、月額ニュースレター配信サービス『theLetter (https://t.co/n7ZEF5HbKC)』を運営してます。タトゥー、音楽、映画、小説、二郎が好き。ジャーナリズムと個人のマネタイズについて日々研究しています。エッセイは下記リンクより', 'user_url': 'https://t.co/5pYM9XxEUW', 'verified_statu

73: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 44, 99199), 'tweet_date': 'Tue Oct 06 05:53:51 +0000 2020', 'username': 'blackrepublican', 'account_creation_date': 'Tue Oct 06 05:53:51 +0000 2020', 'user_description': 'Authentic #blackconservatism has always had the fundamental and explicit goal of opposing white supremacy.\n\n— Kareim Oliphant\n\n#BlackConservative', 'user_url': 'https://t.co/8b7EAlLWPg', 'verified_status': False, 'geo_enabled': False, 'friends_count': 88070, 'followers_count': 71891, 'retweeted_count': 14, 'favorite_count': 0, 'hashtags': [], 'tweet_full_text': "RT @carolineha_: Here's an excerpt, showing the scale of Palantir use within the LAPD, and some of the colleges + school districts that sha…", 'word_list': ['RT', '@carolineha_:', "Here's", 'an', 'excerpt,', 'showing', 'the', 'scale', 'of', 'Palantir', 'use', 'within', 'the', 'LAPD,', 'and', 'some', 'of', 'the', 'colleges', '+', 'school', 'districts', 'that', 'sha…']}
74: {'@timestamp': datetime.date

92: {'@timestamp': datetime.datetime(2020, 10, 6, 14, 6, 44, 307246), 'tweet_date': 'Tue Oct 06 05:34:27 +0000 2020', 'username': 'blackrepublican', 'account_creation_date': 'Tue Oct 06 05:34:27 +0000 2020', 'user_description': 'Authentic #blackconservatism has always had the fundamental and explicit goal of opposing white supremacy.\n\n— Kareim Oliphant\n\n#BlackConservative', 'user_url': 'https://t.co/8b7EAlLWPg', 'verified_status': False, 'geo_enabled': False, 'friends_count': 88070, 'followers_count': 71891, 'retweeted_count': 194, 'favorite_count': 0, 'hashtags': [], 'tweet_full_text': 'RT @StefSimanowitz: On 8 July, the UK govt sidelined SAGE in favour of the secretive Joint Biosecurity Centre headed by a senior spy.\n\nOn 2…', 'word_list': ['RT', '@StefSimanowitz:', 'On', '8', 'July,', 'the', 'UK', 'govt', 'sidelined', 'SAGE', 'in', 'favour', 'of', 'the', 'secretive', 'Joint', 'Biosecurity', 'Centre', 'headed', 'by', 'a', 'senior', 'spy.\n\nOn', '2…']}
93: {'@timestamp': datetim

In [None]:
tweets = []
for tweet in tweepy.Cursor(api.search, q='#PLTR').items(10):
    tweets.append(tweet._json) 


In [None]:
import pandas as pd
df = pd.read_csv('test.csv')

In [None]:
tweets[0]['user']['name']

In [None]:
tweets[0]['text']

In [None]:
created = tweets[0]['user']['created_at']
screen_name = tweets[0]['user']['screen_name']
name = tweets[0]['user']['name']
followers_count = tweets[0]['user']['followers_count']
friends_count = tweets[0]['user']['friends_count']
geo = tweets[0]['geo']
coordinates = tweets[0]['coordinates']
place = tweets[0]['place']
retweet_count = tweets[0]['retweet_count']
tweet_text = tweets[0]['text']

friends_list = []
followers_list = []

user = api.get_user(screen_name)
for friend in user.friends():
   friends_list.append(friend.screen_name)

In [None]:
len(friends_list)

In [None]:
print(f'{created}\n \
        Screen Name: {screen_name}\n \
        Real Name: {name}\n \
        Number of Followers {followers_count}\n \
        Number of Friends {friends_count}\n \
        Number of Retweets: {retweet_count}\n \
        Location: {geo}, {coordinates}, {place}\n \
        Friends List: {friends_list}\n \
        Tweet: {tweet_text}\n')

In [None]:
import datetime as dt
search = "palantir OR PLTR"
date_begin = (dt.datetime.today() - dt.timedelta(days=7)).strftime("%Y-""%m-""%d")

In [None]:
tweets = tw.Cursor(api.search, q = search, tweet_mode = "extended", lang = "en", since = date_begin).items(2)

In [None]:
feed = []
for tweet in tweets:
    feed.append(tweet)

In [None]:
import json
json_data = json.loads(feed[0])