## Understanding Tweepy

In [1]:
# To interact with the Twitter API
import tweepy
# To read authentication keys from the environment
from os import getenv

In [2]:
# Load Environment Variables
TWITTER_API_KEY = getenv('TWITTER_API_KEY')
TWITTER_API_KEY_SECRET = getenv('TWITTER_API_KEY_SECRET')

In [3]:
# Authenticate with the Twitter API
auth = tweepy.OAuthHandler(TWITTER_API_KEY, TWITTER_API_KEY_SECRET)
TWITTER = tweepy.API(auth)
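
`getenv` returns `None` when a variable is unset, so a missing key may only surface later, at the first API call. A minimal sketch of a guard, assuming a hypothetical helper named `require_env` (not part of Tweepy or this notebook):

```python
from os import environ

def require_env(name, env=environ):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value

# Hypothetical usage in place of the bare getenv calls:
# TWITTER_API_KEY = require_env('TWITTER_API_KEY')
# TWITTER_API_KEY_SECRET = require_env('TWITTER_API_KEY_SECRET')
```

The `env` parameter exists only so the helper can be exercised with a plain dictionary instead of the real environment.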

In [4]:
# Get Information on a Specific Twitter User
elon = TWITTER.get_user('elonmusk')

In [5]:
dir(elon)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_api',
 '_json',
 'contributors_enabled',
 'created_at',
 'default_profile',
 'default_profile_image',
 'description',
 'entities',
 'favourites_count',
 'follow',
 'follow_request_sent',
 'followers',
 'followers_count',
 'followers_ids',
 'following',
 'friends',
 'friends_count',
 'geo_enabled',
 'has_extended_profile',
 'id',
 'id_str',
 'is_translation_enabled',
 'is_translator',
 'lang',
 'listed_count',
 'lists',
 'lists_memberships',
 'lists_subscriptions',
 'location',
 'name',
 'notifications',
 'parse',
 'parse_list',
 'profile_background_color',
 'profile_background_image_url',
 'profile_back

In [6]:
dir(elon.status)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_api',
 '_json',
 'contributors',
 'coordinates',
 'created_at',
 'destroy',
 'entities',
 'favorite',
 'favorite_count',
 'favorited',
 'geo',
 'id',
 'id_str',
 'in_reply_to_screen_name',
 'in_reply_to_status_id',
 'in_reply_to_status_id_str',
 'in_reply_to_user_id',
 'in_reply_to_user_id_str',
 'is_quote_status',
 'lang',
 'parse',
 'parse_list',
 'place',
 'retweet',
 'retweet_count',
 'retweeted',
 'retweets',
 'source',
 'source_url',
 'text',
 'truncated']
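
Most of the `dir()` output above is dunder and private names. A small sketch for listing only the public attributes, using a made-up stand-in class (`FakeStatus`) because the real `elon.status` object needs a live API connection:

```python
def public_attributes(obj):
    """List attribute names that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]

# Stand-in object for illustration only; in the notebook this would be elon.status.
class FakeStatus:
    def __init__(self):
        self._json = {}
        self.text = 'hello'
        self.retweet_count = 0

print(public_attributes(FakeStatus()))
```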

In [7]:
elon.timeline()[0].text

'@juanjacobs @jwangARK Firmware is probably a slightly more accurate description, but yes'

In [8]:
elon.timeline()[0].in_reply_to_screen_name

'juanjacobs'

In [9]:
elon.timeline()[0].retweeted

False

In [10]:
elon.timeline()[0].created_at

datetime.datetime(2020, 10, 17, 19, 7, 39)

In [11]:
# Get the full (untruncated) text of a tweet via tweet_mode='extended'
elon.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[0].full_text

'The gauntlet has been thrown down! \n\nThe prophecy will be fulfilled. \n\nModel S price changes to $69,420 tonight!'

## Understanding the Analysis

In [12]:
# Import NLP Library
import spacy
# Load the medium English model, which ships with word vectors
nlp = spacy.load('en_core_web_md')

### Vectorizing a phrase/doc
-> `Doc.vector` is a real-valued meaning representation of the text.

-> It defaults to an average of the token vectors.

See:

https://spacy.io/api/doc#vector

https://spacy.io/usage/vectors-similarity
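
To make "average of the token vectors" concrete, here is a pure-Python sketch of the arithmetic, together with cosine similarity, which is the usual way two such vectors are compared. The function names and the toy 3-dimensional numbers are made up for illustration; real `en_core_web_md` vectors have 300 dimensions.

```python
import math

def average_vector(token_vectors):
    """Element-wise mean of equal-length token vectors."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "token vectors" (made-up numbers, not real embeddings).
doc_a = average_vector([[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]])
doc_b = average_vector([[2.0, 1.0, 1.0]])
print(doc_a)
print(cosine_similarity(doc_a, doc_b))
```

Because the result is an average, two texts with different words can still end up with the same document vector, as the toy example shows.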

#### Elon Musk Tweets

In [13]:
# State Example User
elon = TWITTER.get_user('elonmusk')

In [14]:
# Example of Embedding
gauntlet_tweet = elon.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[0].full_text
print(gauntlet_tweet)
print("Characters in Tweet:", len(gauntlet_tweet), "\n")

vector = nlp(gauntlet_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

The gauntlet has been thrown down! 

The prophecy will be fulfilled. 

Model S price changes to $69,420 tonight!
Characters in Tweet: 112 

Shape (300,)
[-9.4506284e-03  1.5336229e-01 -2.5372095e-03 -9.5228769e-02
 -2.7187144e-02 -1.0545822e-01 -1.6963283e-03  4.8190664e-02
 -2.1860823e-02  1.8075575e+00 -1.1649820e-01  1.1165074e-01
  5.4758918e-02 -1.8347213e-02 -4.3208938e-02 -3.8607262e-02
 -8.6710207e-02  9.1198415e-01 -1.3321900e-01 -2.7166331e-02
 -2.6441609e-02 -4.9684543e-02 -2.8614223e-02  5.8328699e-02
 -6.8295249e-03  1.0789394e-01 -8.1784010e-02  2.0494087e-02
 -2.6310230e-02  3.7932806e-02]


In [15]:
# Example of Embedding
roomy_tweet = elon.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[1].full_text
print(roomy_tweet)
print("Characters in Tweet:", len(roomy_tweet), "\n")

vector = nlp(roomy_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

Will be less roomy with 3 vacuum rocket engines added https://t.co/pKtDFdiZYC
Characters in Tweet: 77 

Shape (300,)
[-0.08404455  0.11927336 -0.13547546 -0.02787445 -0.08094203 -0.10540713
 -0.14473881 -0.14430866  0.16513301  1.3458518   0.06666209 -0.10184363
 -0.09117419  0.0021664  -0.27958182 -0.15109317 -0.01114936  1.34824
 -0.34028542  0.05826197 -0.13286272  0.01760194 -0.02729966 -0.03167208
 -0.09737719  0.09402583 -0.15103136 -0.048516    0.1156179   0.03222555]


In [16]:
# Example of Embedding
rewatched_tweet = elon.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[8].full_text
print(rewatched_tweet)
print("Characters in Tweet:", len(rewatched_tweet), "\n")

vector = nlp(rewatched_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

Rewatched Young Frankenstein this weekend. Still awesome. Ovaltine? https://t.co/WiMdyFSuiq
Characters in Tweet: 91 

Shape (300,)
[ 0.04336233  0.08990633 -0.07694017 -0.12693374 -0.01657242  0.03619684
  0.0665415  -0.09073585  0.05467108  1.3778566  -0.2515944   0.11460087
  0.08428337  0.05942682  0.00580774 -0.0925743   0.0160935   0.5512089
 -0.02123524 -0.00896758  0.171113   -0.08572566  0.11272017 -0.13973583
 -0.04344182  0.18941484 -0.16934758 -0.06531025  0.12107075  0.063259  ]


#### MorningBrew Tweets

In [17]:
mb = TWITTER.get_user('MorningBrew')

In [18]:
weekend_tweet = mb.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[0].full_text
print(weekend_tweet)
print("Characters in Tweet:", len(weekend_tweet), "\n")

vector = nlp(weekend_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

Have a great weekend everyone &lt;3
Characters in Tweet: 35 

Shape (300,)
[ 7.47425191e-04  1.77610993e-01 -1.61640152e-01 -3.16599756e-03
  1.31394118e-01 -8.04710388e-02  1.07331716e-01 -1.58713430e-01
  1.96298566e-02  2.24454284e+00 -1.20576426e-01 -7.09922910e-02
  2.43200008e-02 -3.17635722e-02 -1.58640951e-01 -1.59043297e-01
 -8.88497159e-02  7.73969591e-01 -2.03505859e-01  7.24914297e-02
 -1.11009851e-01 -1.84286699e-01  1.33420145e-02 -2.63271462e-02
  2.52532866e-02  5.07072806e-02 -7.16851428e-02 -6.18248582e-02
  6.04867525e-02 -1.76978558e-02]
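
The tweet text above contains `&lt;3`, which suggests the API returns HTML-escaped text. A sketch of unescaping it with the standard library before vectorizing; whether that helps the embedding is an assumption worth testing, not something this notebook measures:

```python
import html

raw = 'Have a great weekend everyone &lt;3'
clean = html.unescape(raw)  # turns &lt; &gt; &amp; back into < > &
print(clean)
print(len(raw), len(clean))
```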


In [19]:
most_tweet = mb.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[2].full_text
print(most_tweet)
print("Characters in Tweet:", len(most_tweet), "\n")

vector = nlp(most_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

Most companies: Let’s conduct a market analysis looking at competitive products, buying trends, option value, CPI fluctuations and then select a price that will place our product in a favorable position

Tesla: sex and weed https://t.co/lP2Bdk1BJa
Characters in Tweet: 247 

Shape (300,)
[-0.13101478  0.25659296 -0.06821233  0.02609474  0.13153425 -0.04175844
  0.02653612 -0.06788127  0.00193392  1.8733026  -0.301463    0.01727908
  0.02835705 -0.02690436 -0.0058216  -0.09144186 -0.08722001  1.3007772
 -0.1766076   0.03175659  0.01271561  0.03268842 -0.05577465 -0.08634067
 -0.04159696  0.15932964 -0.02774055  0.03130227  0.0397816   0.01205579]


In [20]:
stock_tweet = mb.timeline(
            count=200, exclude_replies=True, include_rts=False,
            tweet_mode='extended', since_id=None)[10].full_text
print(stock_tweet)
print("Characters in Tweet:", len(stock_tweet), "\n")

vector = nlp(stock_tweet).vector
print('Shape', vector.shape)
print(vector[:30])

What stock lives in your head rent free?
Characters in Tweet: 40 

Shape (300,)
[-0.03236389  0.18767375 -0.12933321 -0.01424167  0.2346611  -0.14320299
 -0.11365011 -0.11226565  0.08604945  2.293589   -0.41994277  0.10055289
  0.05933633 -0.0736953  -0.09571348  0.00844477 -0.1388231   1.28989
 -0.18791042 -0.01218244  0.03391578 -0.04419067  0.04386834 -0.03780711
 -0.02163049  0.0673554  -0.04470244  0.09453979  0.1174009   0.13391137]


In [21]:
# I need every tweet to be a doc.
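
One way to get a doc per tweet is spaCy's `nlp.pipe`, which processes texts in a batch and yields one `Doc` each. The sketch below uses hypothetical stand-in classes (`FakeNLP`, `FakeDoc`) so it runs without the model; with the real pipeline, pass the `nlp` object loaded earlier instead.

```python
def tweets_to_vectors(tweets, nlp):
    """Return one document vector per tweet, in input order."""
    return [doc.vector for doc in nlp.pipe(tweets)]

# Stand-ins for illustration only; they mimic just the pieces used above.
class FakeDoc:
    def __init__(self, text):
        # A fake 1-dimensional "vector": the character count.
        self.vector = [float(len(text))]

class FakeNLP:
    def pipe(self, texts):
        for text in texts:
            yield FakeDoc(text)

print(tweets_to_vectors(['first tweet', 'second'], FakeNLP()))
```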

## Create CSV of Tweets & Handles

In [22]:
example_users = [
    'elonmusk', 'eminem', 'shakira', 
    'BarakObama', 'espn', 'NASA', 
    'BillGates', 'Oprah', 'badbanana',
    'Buzzfeed', 'GuyKawasaki', 'KrangTNelson',
    'AOC', 'TEDTalks', 'JenSincero', 'TheOnion',
    'mindykaling', 'AdamMGrant', 'mashable', 
    'Charlie_Burris', 'NatureNews', 'ScienceChannel'
]
example_users_small = [
    'OfficialKat', 'ConanOBrien', 'funnyordie',
    'darth', 'netw3rk', 'davidchang', 'mental_floss',
    'MorningBrew'
]

In [23]:
# Get 10 Tweets from Each User
# Return a Dictionary
example_tweets = {}
for twitter_handle in example_users:
    user_api_conn = TWITTER.get_user(twitter_handle)
    # Fetch the timeline once per user instead of once per tweet
    timeline = user_api_conn.timeline(
        count=200, exclude_replies=True, include_rts=False,
        tweet_mode='extended', since_id=None)
    user_tweets = [status.full_text for status in timeline[:10]]
    print(twitter_handle)
    example_tweets.update({twitter_handle: user_tweets})

elonmusk
eminem
shakira
BarakObama
espn
NASA
BillGates
Oprah
badbanana
Buzzfeed
GuyKawasaki
KrangTNelson
AOC
TEDTalks
JenSincero
TheOnion
mindykaling
AdamMGrant
mashable
Charlie_Burris
NatureNews
ScienceChannel


In [24]:
import pandas as pd

In [25]:
large = pd.DataFrame(example_tweets)

In [26]:
print(large.shape)
large.head()

(10, 22)


,elonmusk,eminem,shakira,BarakObama,espn,NASA,BillGates,Oprah,badbanana,Buzzfeed,...,AOC,TEDTalks,JenSincero,TheOnion,mindykaling,AdamMGrant,mashable,Charlie_Burris,NatureNews,ScienceChannel
0,The gauntlet has been thrown down! \n\nThe pro...,"Proud of Matt for writing this, everyone needs...","Querido @carlosvives, lindísima esta canción! ...",The country's really getting too concerned abo...,DOMINANCE.\n\nClemson takes down Georgia Tech ...,"Born after Nov. 1, 2000? 👶 You're a member of ...",Europe has an opportunity to get its economy b...,Great job friend @GayleKing. Especially commen...,"Congratulations, Canada.",Sophie Turner And Joe Jonas Did Their Best Kyl...,...,Thank you everyone for the birthday wishes!\n\...,How algorithms (for fish!) and big data (about...,Kevin Hart listens to You Are a Badass while o...,Russian Scientists Grip Heads In Agony As Tele...,At least he has a great sense of humor. \nVia ...,A modern drag on happiness: time confetti.\n\n...,Resources for adults who think climate science...,Pruitt continuing to start JG feels like he’s ...,"Vaccines often work poorly in older people, me...","This year, the Nobel Prize award in physics we..."
1,Will be less roomy with 3 vacuum rocket engine...,So happy to be a part of the sequel- @BigSean ...,Las Latinas somos la fuerza detrás de nuestras...,Sometimes I wonder if I really should have tak...,The time for talk is over ...🥊\n\n#LomaLopez |...,It’s been quite an eventful week! \n\n👩‍🚀 A ne...,The COVID-19 pandemic has set back efforts to ...,Can’t wait to see what @ava does with @isabelw...,I think America would have learned just as muc...,I Shed A Lot And I Swear By This Broom To Get ...,...,When politicians use faith as an excuse to pas...,"""Even if climate change stopped right now, the...","When we push against who we naturally are, we ...",Blue Lives Matter Supporters Say Kyle Rittenho...,I was so lucky to work with Peter. His passion...,The best conversation partners are unafraid of...,The world's largest 3D-printed boat was built ...,Lol come on https://t.co/ss3LDnhENr,Collaborating in science is nothing new. But t...,"When a Cordyceps fungus attacks a host, the my..."
2,“What is love? Baby don’t hurt me.”— Winston C...,Honored to be @YoungMAMusic’s first guest on h...,Latinas may be a minority but our voices will ...,This health care discussion has me thinking......,Yussuf Poulsen with a stunner for RB Leipzig 😱...,This is where stars are born. \n\nThis @NASAHu...,.@StephenCurry30’s work in the community is as...,😂😂😂 #TheOprahConversation https://t.co/59s1Zy5W3F,"""For the final segment tonight, you'll each ha...",26 Teen Characters That You Forgot Were Actual...,...,Maybe we should be. https://t.co/AC5491EW56,Here's what people *actually* need to lift the...,Take the risks and stay the course until you r...,Today Now!: How To Pretend You Give A Shit Abo...,"And if that happens to be 3pm, so be it. @Bare...",The root of insecurity is craving the approval...,This new 3D printing technique is totally mesm...,"It just never ends, that’s probably the toughe...","""It seems that the only way to be successful i...",This ancient Viking mortuary house was built t...
3,🤘 The Illuminaughty 🤘,Vote for Sway! https://t.co/p4S36kWBlP,14/10/20 - Happy place. \nPics by @JaumedeLaig...,I hope Congress does not provide health care. ...,Time for Countdown to GameDay with @jasonfitz ...,"""I'd love to live on the Moon. Why not?""\n\nIn...",Dr. Tunji Funsho’s work with @Rotary was essen...,"Hey @mariahcarey, talent runs in your #lambily...",Before you focus all of your time and money to...,32 Tweets From This Month So Far That Sent Me ...,...,Genuine question for the media / law enforceme...,"In this truly one-of-a-kind Talk, mega-produce...",Share your Badassery with the world why doncha...,Fox News Limits Pandemic Coverage To Avoid Giv...,Am I the only one who makes this face while pu...,Hey managers: the solution to your diversity p...,These Adidas sneakers could be a huge step for...,"What an absolutely unacceptable, atrocious, un...",“The loss in research and human resources is i...,"The ""Wow"" signal from 1977 rocked the scientif..."
4,4th flight &amp; landing for this booster http...,"""This is just the song to go ballistic on"" 🦖 t...",Feliz cumpleaños a mi hermano Tonino!! El más ...,I haven't twittered in a while. Been very bus...,The MLB playoffs are going to be electric Satu...,"⬜⛏️ Ice mining experiment, anyone?\n\nWe're te...",This honor is well deserved. COVID-19 is not j...,That’s a good idea! We just had water for our ...,Introducing “Murder Hornets” in May of 2020 is...,Viola Davis Opened Up About Working With Chadw...,...,I still can’t get over how bad the climate cha...,Deciding by majority vote seems like the best ...,You have got to get a handle on your thoughts ...,WWII Bomb Explodes During Disposal Operation h...,What’s the gif that best personifies you? Here...,There's no moral code requiring you to finish ...,The Google Grim Reaper has claimed another app...,These announcers are talking like there’s gonn...,Scientists have mapped the location and size o...,"When frogs mate, the male climbs on the back o..."


In [27]:
# Get 10 Tweets from Each User
# Return a Dictionary
example_tweets_small = {}
for twitter_handle in example_users_small:
    user_api_conn = TWITTER.get_user(twitter_handle)
    # Fetch the timeline once per user instead of once per tweet
    timeline = user_api_conn.timeline(
        count=200, exclude_replies=True, include_rts=False,
        tweet_mode='extended', since_id=None)
    user_tweets = [status.full_text for status in timeline[:10]]
    print(twitter_handle)
    example_tweets_small.update({twitter_handle: user_tweets})

OfficialKat
ConanOBrien
funnyordie
darth
netw3rk
davidchang
mental_floss
MorningBrew


In [28]:
small = pd.DataFrame(example_tweets_small)

In [29]:
print(small.shape)
small.head()

(10, 8)


,OfficialKat,ConanOBrien,funnyordie,darth,netw3rk,davidchang,mental_floss,MorningBrew
0,I hope everyone is watching #TheWayISeeIt,My father has always cast a long shadow in my ...,Good morning! Ever see someone wish you were t...,https://t.co/5yHX2C45qm,TV AD: VOTE YES on 69 to keep millions of fami...,Isaac you still get a B-minus https://t.co/SPp...,The reason for the orange coffee pot is just o...,Have a great weekend everyone &lt;3
1,This made me barfy https://t.co/weQ47MGXJ2,I can’t remember the last time I was this shoc...,"Wait, there's more to TikTok than hot people d...",that this is even an issue is totally fucked u...,just recorded this with a really fun custom re...,Thou shall not talk shit about sliders or flat...,Is this the world's only tree donation inspire...,Which company has the best chance to pass Appl...
2,What if 2020 was just some intern fucking arou...,If you like my conversation with @HillaryClint...,"Today, we celebrate @TheRitaMoreno and her inc...",californians we can not do anything to force t...,As an official request the sports caller is co...,Hey @IsaacKLee I want to be a guest on this po...,If you see a giant toilet bowl or SpongeBob on...,Most companies: Let’s conduct a market analysi...
3,Copy and paste but change what the bunny is ho...,Damn — @WillieNelson came on the podcast and ...,New Quarantine Emojis by @ben_rosen https://t....,let me tell u about self-serving individuals s...,LOL SECOND SEASON?!?!?! https://t.co/jRotcVyyCi,He doesn’t think hard enough https://t.co/jkdV...,Halloween is ~cool~. https://t.co/Odh3kc2iNW,Read it all\n\nthen tell us what yours is http...
4,Ok well now I am sobbing?! https://t.co/rHoiM0...,"I love the fall: The crisp air, the colorful t...",.@itsjuliebowen and @lancereddick rock out in ...,i will wait for the all important trump twitte...,https://t.co/7oWqrPZxUL,This is how I also prepare for meetings. This ...,You'll be saying Walter White's name—and proba...,Holy retail sales: 1.9% vs. 0.7% expected \n\n...


In [30]:
large.to_csv('large.csv')

In [31]:
small.to_csv('small.csv')
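
The CSVs above use a wide layout, one column per handle, which only works when every user contributes the same number of tweets. A sketch of an alternative long layout with one `(handle, tweet)` row per tweet, using only the standard library; the helper name `tweets_to_rows` and the sample data are made up for illustration:

```python
import csv
import io

def tweets_to_rows(tweets_by_handle):
    """Flatten {handle: [tweet, ...]} into (handle, tweet) rows."""
    return [(handle, tweet)
            for handle, tweets in tweets_by_handle.items()
            for tweet in tweets]

# Made-up sample with unequal tweet counts per user.
sample = {'user_a': ['first tweet', 'second tweet'], 'user_b': ['only tweet']}

# Write to an in-memory buffer; swap in open('tweets.csv', 'w', newline='') for a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['handle', 'tweet'])
writer.writerows(tweets_to_rows(sample))
print(buffer.getvalue())
```

The long layout also keeps the handle next to each tweet, which is convenient if the handle later serves as a label.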