# Get Tweets

This script extracts the tweets with the hashtag #Covid-19 that were posted yesterday (the day before today) and saves them to a .csv file.
We use the `tweepy` library, which can be installed with the command `pip install tweepy`.

First, we import the configuration file, `config.py`, which is located in the same directory as this script. It defines the Twitter credentials used later to authenticate.
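The contents of `config.py` are not shown here. A minimal sketch of what it might look like follows; the four variable names are the ones this script uses, but reading them from environment variables is an assumption, and the original file may simply hard-code the values.

```python
# config.py (hypothetical layout, not the original file)
import os

# Assumption: credentials are supplied through environment variables
# with the same names; an empty string is used when a variable is missing.
TWITTER_CONSUMER_KEY = os.environ.get("TWITTER_CONSUMER_KEY", "")
TWITTER_CONSUMER_SECRET = os.environ.get("TWITTER_CONSUMER_SECRET", "")
TWITTER_ACCESS_TOKEN = os.environ.get("TWITTER_ACCESS_TOKEN", "")
TWITTER_ACCESS_TOKEN_SECRET = os.environ.get("TWITTER_ACCESS_TOKEN_SECRET", "")
```

Keeping the secrets out of the notebook itself makes it easier to share the script without sharing the credentials.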

In [17]:
from config import *
import tweepy
import datetime

import sys
import logging

logger = logging.getLogger('tweets_search')

In [18]:
# Named log_format so that the built-in format() function is not shadowed
log_format = "%(asctime)s - %(levelname)s - %(message)s"
logging.basicConfig(format=log_format, stream=sys.stdout, level=logging.DEBUG)
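Setting the root logger to `DEBUG` also enables debug messages from third-party libraries, which is why the output of the search loop further down contains request-signing details. A possible alternative, sketched here under the assumption that only the script's own messages are of interest, is to keep the root logger at `INFO` and raise only the `tweets_search` logger to `DEBUG`. The `force=True` argument requires Python 3.8 or later.

```python
import logging
import sys

log_format = "%(asctime)s - %(levelname)s - %(message)s"

# Root logger at INFO: libraries stay quiet at DEBUG level.
# force=True replaces any handlers configured earlier (Python 3.8+).
logging.basicConfig(format=log_format, stream=sys.stdout,
                    level=logging.INFO, force=True)

# Only the script's own logger emits DEBUG messages.
logger = logging.getLogger("tweets_search")
logger.setLevel(logging.DEBUG)

logger.debug("visible: comes from the tweets_search logger")
logging.getLogger("some_library").debug("hidden: inherits INFO from root")
```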

We set up the connection to our Twitter App by using the `OAuthHandler` class and its `set_access_token()` method. Then we create the client for the Twitter API through the `API` class.

In [19]:
auth = tweepy.OAuthHandler(TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET)
auth.set_access_token(TWITTER_ACCESS_TOKEN, TWITTER_ACCESS_TOKEN_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True)

Now we set up the dates. We need today's date and yesterday's date.

In [20]:
today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)
today, yesterday

(datetime.date(2021, 6, 20), datetime.date(2021, 6, 19))
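Note that `datetime.date.today()` uses the local time zone of the machine. Assuming the search API interprets the `since:` and `until:` dates in UTC (an assumption worth checking against the Twitter documentation), the following sketch computes the same two dates in UTC. The helper name `utc_day_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_day_window(now=None):
    """Return (yesterday, today) as dates in UTC.

    `now` can be passed explicitly (as a timezone-aware datetime)
    to make the function easy to test.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    today = now.astimezone(timezone.utc).date()
    return today - timedelta(days=1), today


yesterday, today = utc_day_window()
print(yesterday, today)
```

Around midnight the local date and the UTC date can differ by one day, so the two approaches may select different tweets.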

We search for tweets on Twitter by using the `Cursor` class, which handles the pagination of the results.
We pass the `api.search` method to the cursor (it is named `api.search_tweets` in Tweepy 4.0 and later), together with the query string, which is specified through the `q` parameter.
The query string accepts several optional operators, such as the following:
* `from:` - restricts the search to a specific Twitter user profile
* `since:` - specifies the start date of the search
* `until:` - specifies the end date of the search

The cursor can also receive other parameters, such as the language (`lang`) and the `tweet_mode`. If `tweet_mode='extended'`, the full text of each tweet is returned; otherwise only the first 140 characters are returned.
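The operators listed above can be combined into a single query string. A minimal sketch follows; the helper `build_query` and its parameters are hypothetical names introduced for illustration, and the function only builds the string without contacting Twitter.

```python
import datetime


def build_query(hashtag, day, user=None):
    """Build a search query that covers the single day `day`."""
    parts = [hashtag]
    if user:
        # Optional: restrict the search to one user profile
        parts.append(f"from:{user}")
    parts.append(f"since:{day.isoformat()}")
    # The window ends at the start of the following day
    next_day = day + datetime.timedelta(days=1)
    parts.append(f"until:{next_day.isoformat()}")
    return " ".join(parts)


query = build_query("#Covid-19", datetime.date(2021, 6, 19))
print(query)  # #Covid-19 since:2021-06-19 until:2021-06-20
```

The resulting string can be passed as the `q` parameter of the cursor.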

In [None]:
# Example code:
# tweets = tweepy.Cursor(api.search, q="#Covid-19", tweet_mode='extended').items()
# for tweet in tweets:
#     content = tweet.full_text

In [13]:
# tweets_list = tweepy.Cursor(api.search, q="#Covid-19 since:" + str(yesterday)+ " until:" + str(today),tweet_mode='extended', lang='en').items()

In [21]:
tweets_list = tweepy.Cursor(api.search, q=f"#Covid-19 since:{yesterday} until:{today}", tweet_mode='extended', lang='en').items(5)

Now we loop over `tweets_list` and, for each tweet, extract the text, the creation date, the number of retweets and the number of favourites. We store every tweet as a dictionary in a list called `output`.

In [22]:
output = []
for tweet in tweets_list:
    # With tweet_mode='extended' the text is exposed as `full_text`
    text = tweet.full_text
    print(text)
    logger.debug(f"full_text: '{text}'")
    favourite_count = tweet.favorite_count
    retweet_count = tweet.retweet_count
    created_at = tweet.created_at

    # One dictionary per tweet; the keys become the columns of the DataFrame
    line = {'text': text, 'favourite_count': favourite_count, 'retweet_count': retweet_count, 'created_at': created_at}
    output.append(line)

2021-06-20 12:31:49,011 - DEBUG - PARAMS: {'q': b'#Covid-19 since:2021-06-19 until:2021-06-20', 'tweet_mode': b'extended', 'lang': b'en'}
2021-06-20 12:31:49,027 - DEBUG - Signing request <PreparedRequest [GET]> using client <Client client_key=pw0ihLFxH3nwDrd4HBd7pqUrc, client_secret=****, resource_owner_key=1360011857969479682-iLrxBUlqdtExwkqiN9iZsHYDXIFTZz, resource_owner_secret=****, signature_method=HMAC-SHA1, signature_type=AUTH_HEADER, callback_uri=None, rsa_key=None, verifier=None, realm=None, encoding=utf-8, decoding=None, nonce=None, timestamp=None>
2021-06-20 12:31:49,033 - DEBUG - Including body in call to sign: False
2021-06-20 12:31:49,035 - DEBUG - Collected params: [('q', '#Covid-19 since:2021-06-19 until:2021-06-20'), ('tweet_mode', 'extended'), ('lang', 'en'), ('oauth_nonce', '102771151222866023801624181509'), ('oauth_timestamp', '1624181509'), ('oauth_version', '1.0'), ('oauth_signature_method', 'HMAC-SHA1'), ('oauth_consumer_key', 'pw0ihLFxH3nwDrd4HBd7pqUrc'), ('oaut

In [23]:
output

[{'text': 'RT @StateDeptSpox: Wheels up! Our donation of 2.5 million vaccine doses is on the way to Taiwan, whose health partnership with the U.S. hel…',
  'favourite_count': 0,
  'retweet_count': 1419,
  'created_at': datetime.datetime(2021, 6, 19, 23, 59, 59)},
 {'text': '“On average, it takes about two weeks for [immunoglobulin G] antibodies to be detectable...so now we’re talking about a December 24 potential infection or sometime even prior to that.” —@kerinalthoff https://t.co/CDoLJ4GAHz',
  'favourite_count': 3,
  'retweet_count': 1,
  'created_at': datetime.datetime(2021, 6, 19, 23, 59, 59)},
 {'text': 'RT @CNNIndonesia: Jubir Satgas Covid-19 Wiku Adisasmito Positif Corona https://t.co/4SMFUBwwTY',
  'favourite_count': 0,
  'retweet_count': 191,
  'created_at': datetime.datetime(2021, 6, 19, 23, 59, 59)},
 {'text': "RT @BrianMteleSUR: Brazil reached the horrible milestone of 500,000 Covid 19 deaths today. This is US-puppet ruler Jair Bolsonaro's grim le…",
  'favourite_count': 
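In the output above, the retweets (texts starting with `RT @`) still end with an ellipsis even though `tweet_mode='extended'` is set. For a retweet, the untruncated text is usually available on the original tweet, exposed as `retweeted_status`. The sketch below shows one way to handle this; the helper `tweet_to_record` is a hypothetical name, and the objects built with `SimpleNamespace` are stand-ins for Tweepy status objects so that the example runs without network access.

```python
import datetime
from types import SimpleNamespace


def tweet_to_record(tweet):
    """Convert a tweet-like object into the dictionary stored in `output`."""
    original = getattr(tweet, "retweeted_status", None)
    if original is not None:
        # Retweet: rebuild the text from the untruncated original tweet
        text = f"RT @{original.user.screen_name}: {original.full_text}"
    else:
        text = tweet.full_text
    return {
        "text": text,
        "favourite_count": tweet.favorite_count,
        "retweet_count": tweet.retweet_count,
        "created_at": tweet.created_at,
    }


# Stand-in objects (not real API responses)
original = SimpleNamespace(
    full_text="a long original text " * 10,
    user=SimpleNamespace(screen_name="someone"),
)
retweet = SimpleNamespace(
    full_text="RT @someone: a long original text a long…",
    retweeted_status=original,
    favorite_count=0,
    retweet_count=3,
    created_at=datetime.datetime(2021, 6, 19, 23, 59, 59),
)
print(tweet_to_record(retweet)["text"])
```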

Finally, we convert the `output` list to a pandas `DataFrame` and save the results to a .csv file.

In [24]:
import pandas as pd

df = pd.DataFrame(output)
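# Append to the existing file without a header row, so that repeated runs accumulate rows
df.to_csv('output.csv', mode='a', header=False)
# To overwrite the file and write the header instead:
# df.to_csv('output.csv')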
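Because the file is always appended with `header=False`, it never receives a header row. One possible way to get a header exactly once is to write it only when the file does not exist yet. The sketch below illustrates the idea with the standard-library `csv` module instead of pandas, so that it runs without extra dependencies; the helper `append_rows` and the `FIELDS` list are hypothetical names.

```python
import csv
import os

FIELDS = ["text", "favourite_count", "retweet_count", "created_at"]


def append_rows(path, rows):
    """Append dictionaries to a CSV file, writing the header only once."""
    # The header is needed when the file is missing or still empty
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

With pandas, a similar effect can be obtained by passing `header=not os.path.exists('output.csv')` to `to_csv`.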

In [25]:
df.shape

(5, 4)

In [26]:
df.head(10)

| | text | favourite_count | retweet_count | created_at |
|---|---|---|---|---|
| 0 | RT @StateDeptSpox: Wheels up! Our donation of ... | 0 | 1419 | 2021-06-19 23:59:59 |
| 1 | “On average, it takes about two weeks for [imm... | 3 | 1 | 2021-06-19 23:59:59 |
| 2 | RT @CNNIndonesia: Jubir Satgas Covid-19 Wiku A... | 0 | 191 | 2021-06-19 23:59:59 |
| 3 | RT @BrianMteleSUR: Brazil reached the horrible... | 0 | 74 | 2021-06-19 23:59:59 |
| 4 | RT @thehill: Unvaccinated NFL player rips leag... | 0 | 40 | 2021-06-19 23:59:58 |
