# ReThink Media Twitter API

This notebook is for the development and exploration of code for ReThink Media's Twitter API Python interface. The main goals of this notebook are:

- Search Tweets: query, date (optional)
  - Past seven days
  - Past 30 days
  - Full archive
  - Language = English
- Collect Tweets in .csv file
- Add data visualization
  - Top hashtags, keywords, influencers
  - Volume over time for queries/topics

In [1]:
# importing necessary modules
from dotenv import load_dotenv
import os
import json
import numpy as np
import pandas as pd
import tweepy

load_dotenv()

True

## Authentication

The variables below grant access to the Twitter API. I've defined them in a `.env` file and retrieve them with the code below. We then pass them to a tweepy `Client` to create a Twitter API client.

In [2]:
# retrieving environment variables
consumer_key = os.getenv("API_KEY")
consumer_secret = os.getenv("API_KEY_SECRET")
bearer_token = os.getenv("BEARER_TOKEN")
access_token = os.getenv("ACCESS_TOKEN")
access_secret = os.getenv("ACCESS_SECRET")

In [3]:
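# Twitter API authentication (OAuth 1.0a handler, used by the v1.1 tweepy.API;
# the v2 Client below takes the same credentials directly)
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)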

In [4]:
# instantiating a Twitter API instance
api = tweepy.Client(bearer_token=bearer_token,
                consumer_key=consumer_key,
                consumer_secret=consumer_secret,
                access_token=access_token,
                access_token_secret=access_secret)
api

<tweepy.client.Client at 0x7fc4d6126700>

## Recent Search

The search function available to us in the Standard API package restricts our search to the past seven days. For searches further back in the archive, we need to upgrade to the Academic API package, which is given to researchers with a clear thesis or research paper goal in mind.

A query can be at most 512 characters long, and the user can specify a `start_time` and `end_time` (as `datetime` or `str` objects) within the past seven days. Hashtags can be searched as well. White space between terms is treated as an "AND" join, e.g., hello world = hello AND world. More information about Twitter API queries can be found [in their documentation](https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/build-a-query).
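As a minimal sketch of the query rules above, the helper below joins terms with spaces (which the API treats as AND), appends a `lang:` operator, and rejects anything over the 512-character limit. `build_query` and its parameters are hypothetical names for illustration; they are not part of tweepy.

```python
MAX_QUERY_LENGTH = 512  # limit stated above for recent search queries


def build_query(terms, lang="en", exclude_retweets=False):
    """Join search terms into one query string.

    Hypothetical helper for illustration; not part of tweepy.
    """
    # Drop empty terms and surrounding white space.
    parts = [term.strip() for term in terms if term.strip()]
    if not parts:
        raise ValueError("at least one search term is required")
    if lang:
        parts.append(f"lang:{lang}")
    if exclude_retweets:
        parts.append("-is:retweet")
    # Space-separated terms are combined with AND by the API.
    query = " ".join(parts)
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(
            f"query is {len(query)} characters; the limit is {MAX_QUERY_LENGTH}"
        )
    return query


print(build_query(["hello", "world"]))  # hello world lang:en
```

The resulting string could then be passed as the `query` argument of `api.search_recent_tweets`.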

In [24]:
# searching for "hello world" over the past seven days.
response = api.search_recent_tweets(query="hello world lang:en", max_results=20)

The `response` object is a tuple, and it consists of four items: `(data, includes, errors, meta)`.

The `data` object contains the Tweets that are retrieved, and `meta` is the metadata for those Tweets. In this response object, `includes` and `errors` are empty; `includes` is only populated when the request asks for related objects through the `expansions` parameter.
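Since one goal of this notebook is to collect Tweets in a .csv file, here is a rough sketch of how the `data` part of a response might be flattened into rows. It runs offline against a stand-in named tuple with the same four fields. `FakeResponse`, `tweets_to_rows`, and `rows_to_csv` are hypothetical names, and only the default `id` and `text` fields are assumed to be present.

```python
import csv
import io
from collections import namedtuple

# Stand-in with the same four fields as the response described above,
# so the sketch runs without calling the API.
FakeResponse = namedtuple("FakeResponse", ["data", "includes", "errors", "meta"])


def tweets_to_rows(response):
    """Return one dict per Tweet, keeping only the id and text fields."""
    rows = []
    # Treat a missing data field (no matching Tweets) as an empty list.
    for tweet in response.data or []:
        rows.append({"id": tweet["id"], "text": tweet["text"]})
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "text"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = FakeResponse(
    data=[{"id": 1, "text": "Hello World !!!"}],
    includes={},
    errors=[],
    meta={"result_count": 1},
)
print(rows_to_csv(tweets_to_rows(sample)))
```

To write to disk instead, the same `csv.DictWriter` could be pointed at a file opened with `newline=""`.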

In [25]:
# printing Tweets
for i, tweet in enumerate(response.data):
    print(f"Tweet {i}:")
    print(tweet.text + "\n")

Tweet 0:
RT @Cryptorio_: Hello World !!!
#DeFi #BSC #MATIC #Solana #ETH #BTC

Tweet 1:
RT @SB19Official: 🎉 #SB19xATINAnnivMonth

Hello to all A'TIN all around the world 💙

We are glad to announce that we will be celebrating ou…

Tweet 2:
RT @RoseandPom: Hello there friends! It would mean a lot to me if you went and voted for my piece in this contest. My grandparents on both…

Tweet 3:
RT @Cryptorio_: Hello World !!!
#DeFi #BSC #MATIC #Solana #ETH #BTC

Tweet 4:
RT @bakedonline: Our holiday priced at 800 dollars a night ! (95k for 7 nights) but get it at a fraction of the price with my code “AISHABA…

Tweet 5:
@nonfungibles Hello/Hola
This a gamified NFT Collection consisting of 13 portraits of world-famous Mexican Rockstars and 87 NFTs. Check road map https://t.co/ZcLPdbNSem 
We would like to Give away one NFT during your show is that possible? Comment your wallet if so. https://t.co/bynKdbYANK Thx

Tweet 6:
RT @SB19Official: 🎉 #SB19xATINAnnivMonth

Hello to all A'TIN all around the wo

In [19]:
# printing metadata for Tweets in response
response.meta

{'newest_id': '1450507333193306117',
 'oldest_id': '1450506728676605953',
 'result_count': 20,
 'next_token': 'b26v89c19zqg8o3fpdv5skjzm2z5nknp5rxgzkxxnakxp'}
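The `next_token` in the metadata points at the next page of results. The sketch below shows one way to follow it in a loop. `collect_pages` is a hypothetical helper, and it assumes the search callable accepts `query`, `max_results`, and `next_token` keyword arguments, as `api.search_recent_tweets` does. It is demonstrated against an offline stand-in, not the live API. tweepy also provides a `Paginator` class that serves the same purpose.

```python
from types import SimpleNamespace


def collect_pages(search, query, max_pages=3, page_size=20):
    """Call `search` repeatedly, following `next_token` between pages.

    Hypothetical helper; `search` is any callable that accepts the
    keyword arguments used below and returns an object with `data`
    and `meta` attributes.
    """
    tweets = []
    next_token = None
    for _ in range(max_pages):
        kwargs = {"query": query, "max_results": page_size}
        if next_token is not None:
            kwargs["next_token"] = next_token
        response = search(**kwargs)
        tweets.extend(response.data or [])
        # The last page has no next_token in its metadata.
        next_token = response.meta.get("next_token")
        if next_token is None:
            break
    return tweets


# Offline stand-in for the API: two pages linked by a next_token.
_pages = {
    None: SimpleNamespace(data=["tweet a", "tweet b"], meta={"next_token": "page2"}),
    "page2": SimpleNamespace(data=["tweet c"], meta={"result_count": 1}),
}


def fake_search(query, max_results, next_token=None):
    return _pages[next_token]


print(collect_pages(fake_search, "hello world lang:en"))
# ['tweet a', 'tweet b', 'tweet c']
```

Capping the loop with `max_pages` keeps a broad query from exhausting the rate limit.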

## Stream

A `Stream` is an object that can filter and sample real-time Tweets.

In [11]:
# instantiating Stream object
stream = tweepy.Stream(consumer_key, consumer_secret, access_token, access_secret)
stream

<tweepy.streaming.Stream at 0x7fc4d6038130>

In [17]:
stream.sample(languages=["en"])

KeyboardInterrupt: 