# Querying an API on a loop and updating a file

This notebook shows how to write a script that queries Twitter's API repeatedly and updates a `.csv` file with any new results.

**Importing libraries**

In [1]:
import pandas as pd
import numpy as np
from requests import get
from tweepy import OAuthHandler, API

**Set keys and tokens**

In [2]:
apikey = '' # Your credentials here
apiSecret = '' # Your credentials here
accessToken = '' # Your credentials here
accessSecret = '' # Your credentials here
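
Hard-coded credentials are easy to leak when a notebook is shared. One alternative is to read them from environment variables. The following is a minimal sketch, not something Tweepy requires: the variable names (`TWITTER_API_KEY` and so on) and the helper `load_credentials` are hypothetical.

```python
import os

# Hypothetical environment variable names; any naming scheme works.
CREDENTIAL_VARS = {
    'apikey': 'TWITTER_API_KEY',
    'apiSecret': 'TWITTER_API_SECRET',
    'accessToken': 'TWITTER_ACCESS_TOKEN',
    'accessSecret': 'TWITTER_ACCESS_SECRET',
}

def load_credentials(environ=os.environ):
    """Return the four credentials as a dict, failing loudly if any is missing."""
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError('Missing environment variables: ' + ', '.join(missing))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}
```

With the variables set, `creds = load_credentials()` gives a dict whose values can be assigned to `apikey`, `apiSecret`, `accessToken` and `accessSecret`.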

**Build the api object that will interact with Twitter**

In [3]:
auth = OAuthHandler(apikey, apiSecret)
auth.set_access_token(accessToken, accessSecret)
api = API(auth)

In [4]:
print(api)

<tweepy.api.API object at 0x1181dacc0>


**Create an empty dataframe with the right columns to hold the data**

In [14]:
twitterData = pd.DataFrame(columns=['created_at', 'tweet_id', 'text', 'user_id', 'user_name'])

**Query the Twitter API once to start populating the dataframe**

In [15]:
tweets = api.search('#WWC2019') # Tweepy 3.x name; renamed to api.search_tweets in Tweepy 4.0

In [16]:
for tweet in tweets:
    created_at = tweet._json['created_at']
    tweet_id = tweet._json['id_str']
    text = tweet._json['text']
    user_id = tweet._json['user']['id_str']
    user_name = tweet._json['user']['name']
    thisRow = [created_at, tweet_id, text, user_id, user_name]
    twitterData.loc[len(twitterData)] = thisRow
twitterData.head()

Unnamed: 0,created_at,tweet_id,text,user_id,user_name
0,Fri Jun 07 14:21:00 +0000 2019,1137001517104422914,.@AndyKerrtv is joined by former #Matildas str...,576249121,beIN SPORTS
1,Fri Jun 07 14:20:49 +0000 2019,1137001467821350912,Quand tu reçois un SMS ce matin te disant que ...,2854595981,Laetitia Béraud
2,Fri Jun 07 14:20:14 +0000 2019,1137001324854296576,RT @ritaguari: Al via la #wwc2019 e domenica ...,880920558,Esedra
3,Fri Jun 07 14:19:54 +0000 2019,1137001239009357824,"The #FifaWWC kicks off today in #Paris, #Franc...",19654089,Keph Senett
4,Fri Jun 07 14:19:53 +0000 2019,1137001235360440320,RT @LaLigaEN: 🤩 The @FIFAWWC starts TODAY! 🤩\n...,881833633505521664,Asuzú
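
Assigning rows one at a time with `.loc` makes pandas enlarge the dataframe on every iteration. A common alternative is to collect the rows in a list and build the dataframe once. The sketch below assumes objects that expose a Tweepy-style `_json` dict; the helper `tweets_to_frame` and the stand-in tweet (including its `text` value) are hypothetical, so that the code runs without credentials.

```python
from types import SimpleNamespace
import pandas as pd

COLUMNS = ['created_at', 'tweet_id', 'text', 'user_id', 'user_name']

def tweets_to_frame(tweets):
    """Build a dataframe from objects exposing a Tweepy-style `_json` dict."""
    rows = []
    for tweet in tweets:
        data = tweet._json
        rows.append([data['created_at'], data['id_str'], data['text'],
                     data['user']['id_str'], data['user']['name']])
    # One DataFrame construction instead of one .loc assignment per tweet
    return pd.DataFrame(rows, columns=COLUMNS)

# Stand-in for an API result (hypothetical text value).
fake_tweets = [SimpleNamespace(_json={
    'created_at': 'Fri Jun 07 14:21:00 +0000 2019',
    'id_str': '1137001517104422914',
    'text': 'example text',
    'user': {'id_str': '576249121', 'name': 'beIN SPORTS'},
})]
frame = tweets_to_frame(fake_tweets)
```

In the notebook, `tweets_to_frame(tweets)` would stand in for the loop above.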


**Export the dataframe to a new file**

In [17]:
twitterData.to_csv('../output/TwitterData.csv', encoding='utf-8')
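
One detail to watch when this file is read back: `read_csv` infers numeric types, so IDs that were written as text return as integers unless a `dtype` is given. The sketch below shows the round trip with an in-memory string standing in for `../output/TwitterData.csv`; the two-column dataframe is a reduced, illustrative version of `twitterData`.

```python
import io
import pandas as pd

original = pd.DataFrame({'tweet_id': ['1137001517104422914'],
                         'user_name': ['beIN SPORTS']})

# to_csv() with no path returns the CSV content as a string
csv_text = original.to_csv()

# Default type inference turns the ID column into integers
inferred = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Passing dtype keeps the IDs as text, matching the API's `id_str` values
as_text = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype={'tweet_id': str})
```

Comparing an `id_str` value against the `inferred` column never matches without a type conversion, while it matches the `as_text` column directly.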

**Create a script that does the following:**
* Loads the existing file with tweets into a dataframe
* Queries the Twitter API
* For every tweet:
    * Checks whether that tweet is already in the dataframe by comparing `tweet_id`
    * If it is not in the dataframe, appends it
* Exports the updated file

In [67]:
# load the existing csv; read the ID columns as strings so they match the API's `id_str` values
myData = pd.read_csv('../output/TwitterData.csv', delimiter=',', encoding='utf-8', index_col=0,
                     dtype={'tweet_id': str, 'user_id': str})
print('There are already', len(myData), 'tweets in the file')
tweets = api.search('#WWC2019') # querying twitter api for tweets
existing_ids = set(myData['tweet_id']) # IDs already in the file, built once for fast lookups
counter = 0
for tweet in tweets:
    created_at = tweet._json['created_at']
    tweet_id = tweet._json['id_str']
    text = tweet._json['text']
    user_id = tweet._json['user']['id_str']
    user_name = tweet._json['user']['name']
    thisRow = [created_at, tweet_id, text, user_id, user_name]

    # add the tweet only if its ID is not already in the dataframe
    if tweet_id not in existing_ids:
        myData.loc[len(myData)] = thisRow
        existing_ids.add(tweet_id)
        counter += 1
myData.to_csv('../output/TwitterData.csv', encoding='utf-8')
print(counter, 'tweets were added to the file')
myData.tail(10)

There are already 377 tweets in the file
5 tweets were added to the file


Unnamed: 0,created_at,tweet_id,text,user_id,user_name
372,Fri Jun 07 15:08:02 +0000 2019,1137013352859750400,RT @LaLigaEN: 🤩 The @FIFAWWC starts TODAY! 🤩\n...,277912537,DejiFapo
373,Fri Jun 07 15:07:54 +0000 2019,1137013318479155200,🇧🇷 Si hablamos de promesas el nombre de GEYSE ...,2941481601,Somos Cantera™
374,Fri Jun 07 15:07:52 +0000 2019,1137013309037776901,🇨🇱 Elisa DURAN (17 años) fue una de la sorpres...,2941481601,Somos Cantera™
375,Fri Jun 07 15:07:50 +0000 2019,1137013299902603264,🇦🇷 La más joven del plantel de Argentina es: D...,2941481601,Somos Cantera™
376,Fri Jun 07 15:07:47 +0000 2019,1137013287965646848,🇺🇸 Las actuales campeonas del Mundial tendrá e...,2941481601,Somos Cantera™
377,Fri Jun 07 15:10:38 +0000 2019,1137014007007797248,RT @LaLigaEN: 🤩 The @FIFAWWC starts TODAY! 🤩\n...,724854349,Chuky Unadulterated
378,Fri Jun 07 15:10:34 +0000 2019,1137013989244899329,⏳Unas horas antes la Copa Mundial Femenina de ...,952104950,Francia en Guatemala
379,Fri Jun 07 15:10:24 +0000 2019,1137013947696058368,RT @Candlewick: Every school subject can be ex...,884499458863529984,Laurie Smith
380,Fri Jun 07 15:09:26 +0000 2019,1137013703822454785,RT @MikeLiggins: Watch this !! On @BBCLookEast...,196675005,Nathan Moore
381,Fri Jun 07 15:08:13 +0000 2019,1137013398502346752,RT @extra3: Heute beginnt die Frauen-WM. Wer k...,901839136012808195,xDark
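
The cell above performs a single update. To run it on a loop, the same steps can be wrapped in a function and called repeatedly with a pause between calls. The following is a minimal sketch under stated assumptions: `fetch_rows` is a caller-supplied function that returns new rows as a dataframe (in this notebook it would wrap the API query), and `update_frame`, `poll`, `rounds` and `pause_seconds` are hypothetical names. It also swaps the per-row membership test for `concat` plus `drop_duplicates`, which is a different technique from the cell above.

```python
import time
import pandas as pd

def update_frame(existing, new_rows):
    """Append new_rows to existing, keeping the first copy of each tweet_id."""
    combined = pd.concat([existing, new_rows], ignore_index=True)
    # Normalize IDs to strings so integer and text IDs compare equal
    combined['tweet_id'] = combined['tweet_id'].astype(str)
    deduped = combined.drop_duplicates(subset='tweet_id', keep='first')
    return deduped.reset_index(drop=True)

def poll(existing, fetch_rows, rounds=3, pause_seconds=60, sleep=time.sleep):
    """Call fetch_rows `rounds` times, merging results and pausing in between."""
    for i in range(rounds):
        before = len(existing)
        existing = update_frame(existing, fetch_rows())
        print(len(existing) - before, 'tweets were added')
        if i < rounds - 1:
            sleep(pause_seconds)  # no pause after the final round
    return existing

# Stand-in batches so the sketch runs without calling Twitter.
batches = iter([
    pd.DataFrame({'tweet_id': ['1', '2'], 'text': ['a', 'b']}),
    pd.DataFrame({'tweet_id': ['2', '3'], 'text': ['b', 'c']}),
])
start = pd.DataFrame({'tweet_id': ['1'], 'text': ['a']})
result = poll(start, lambda: next(batches), rounds=2, pause_seconds=0)
```

In real use, the pause would need to be long enough to stay within the API's rate limits, and the result would be written back with `to_csv` after each round or at the end.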
