# APIs

We will work through three very different APIs _from scratch_, starting from the examples in their documentation, to showcase the different flavors of APIs and the thought process behind using them.

Take-away messages:
* Googling the data you want plus "API" will usually turn up an API, if one exists.
* APIs come in different shapes and sizes:
    * Some return just the JSON data (e.g. RKI API)
    * Some take parameters that let you customize the data you get back (e.g. Google search results via SerpApi)
    * Some have dedicated libraries (e.g. Twitter API / tweepy, Genius lyrics API)

### API example

### 1. Robert Koch Institut (RKI) API

**Task: I would like to have the data for COVID incidence numbers in Germany**

In [1]:
import requests
import pandas as pd

In [2]:
url = 'https://api.corona-zahlen.org/germany/history/incidence'

In [3]:
response = requests.get(url)
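
A request can fail (no network, a 404, a rate limit), and in that case the body will not contain the data we expect. A minimal sketch of a safer fetch step follows; the helper name `fetch_json` and the 10 second default timeout are assumptions for illustration, not part of the RKI API.

```python
def fetch_json(session, url, timeout=10):
    """Send a GET request and return the decoded JSON body.

    `session` is anything with a requests-style `get` method,
    for example a `requests.Session()` object.
    """
    response = session.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of silently continuing
    response.raise_for_status()
    return response.json()
```

With the `requests` import from above, this could be called as `fetch_json(requests.Session(), url)`.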

In [4]:
dir(response)

['__attrs__',
 '__bool__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__nonzero__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_content',
 '_content_consumed',
 '_next',
 'apparent_encoding',
 'close',
 'connection',
 'content',
 'cookies',
 'elapsed',
 'encoding',
 'headers',
 'history',
 'is_permanent_redirect',
 'is_redirect',
 'iter_content',
 'iter_lines',
 'json',
 'links',
 'next',
 'ok',
 'raise_for_status',
 'raw',
 'reason',
 'request',
 'status_code',
 'text',
 'url']

In [6]:
response.text

'{"data":[{"weekIncidence":0.0012013870157263002,"date":"2020-01-07T00:00:00.000Z"},{"weekIncidence":0.0012013870157263002,"date":"2020-01-08T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-09T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-10T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-11T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-12T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-13T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-14T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-15T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-16T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-17T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-18T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-19T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-20T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-21T00:00:00.000Z"},{"weekIncidence":0,"date":"2020-01-22T00:00:00.000Z"},{"weekIncidence":0.0012013870157263002,"date":"2020-01-23T00:00:00.000Z"},{"weekIncide

In [8]:
response.json()

{'data': [{'weekIncidence': 0.0012013870157263002,
   'date': '2020-01-07T00:00:00.000Z'},
  {'weekIncidence': 0.0012013870157263002, 'date': '2020-01-08T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-09T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-10T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-11T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-12T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-13T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-14T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-15T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-16T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-17T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-18T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-19T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-20T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-21T00:00:00.000Z'},
  {'weekIncidence': 0, 'date': '2020-01-22T00:00:
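
The difference between the two outputs above is that `response.text` is one long string, while `response.json()` returns Python objects (here a dict holding a list of dicts). The sketch below reproduces roughly that decoding step with the standard `json` module on a shortened copy of the raw text; the variable names are chosen for illustration only.

```python
import json

# A shortened copy of the raw text returned by the endpoint above
raw_text = (
    '{"data":[{"weekIncidence":0.0012013870157263002,"date":"2020-01-07T00:00:00.000Z"},'
    '{"weekIncidence":0,"date":"2020-01-09T00:00:00.000Z"}]}'
)

# str -> dict: roughly the decoding that response.json() does for us
parsed = json.loads(raw_text)
first = parsed["data"][0]

print(type(raw_text).__name__, type(parsed).__name__)
print(first["date"], first["weekIncidence"])
```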

In [9]:
rki_data = response.json()

In [10]:
type(rki_data)

dict

In [11]:
rki_data.keys()

dict_keys(['data', 'meta'])

In [21]:
rki_data['meta']

{'source': 'Robert Koch-Institut',
 'contact': 'Marlon Lueckert (m.lueckert@me.com)',
 'info': 'https://github.com/marlon360/rki-covid-api',
 'lastUpdate': '2022-10-04T00:00:00.000Z',
 'lastCheckedForUpdate': '2022-10-04T08:40:56.351Z'}

In [13]:
data_of_interest = rki_data['data']

In [14]:
rki_df = pd.DataFrame(data_of_interest)

In [16]:
rki_df.head()

| | weekIncidence | date |
|---|---|---|
| 0 | 0.001201 | 2020-01-07T00:00:00.000Z |
| 1 | 0.001201 | 2020-01-08T00:00:00.000Z |
| 2 | 0.0 | 2020-01-09T00:00:00.000Z |
| 3 | 0.0 | 2020-01-10T00:00:00.000Z |
| 4 | 0.0 | 2020-01-11T00:00:00.000Z |


In [17]:
rki_df['date'] = pd.to_datetime(rki_df['date'])

In [18]:
rki_df.dtypes

weekIncidence                float64
date             datetime64[ns, UTC]
dtype: object

In [19]:
rki_df.head()

| | weekIncidence | date |
|---|---|---|
| 0 | 0.001201 | 2020-01-07 00:00:00+00:00 |
| 1 | 0.001201 | 2020-01-08 00:00:00+00:00 |
| 2 | 0.0 | 2020-01-09 00:00:00+00:00 |
| 3 | 0.0 | 2020-01-10 00:00:00+00:00 |
| 4 | 0.0 | 2020-01-11 00:00:00+00:00 |


In [20]:
rki_df.tail()

| | weekIncidence | date |
|---|---|---|
| 996 | 498.033786 | 2022-09-29 00:00:00+00:00 |
| 997 | 497.005399 | 2022-09-30 00:00:00+00:00 |
| 998 | 480.811903 | 2022-10-01 00:00:00+00:00 |
| 999 | 471.050634 | 2022-10-02 00:00:00+00:00 |
| 1000 | 373.953334 | 2022-10-03 00:00:00+00:00 |
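
Once the dates are real timestamps, they can serve as an index for filtering and later plotting. The sketch below repeats the steps of this section on three records copied from the output above; the names `records`, `series` and `nonzero` are chosen for illustration.

```python
import pandas as pd

# Three records copied from the API output shown earlier
records = [
    {"weekIncidence": 0.0012013870157263002, "date": "2020-01-07T00:00:00.000Z"},
    {"weekIncidence": 0.0012013870157263002, "date": "2020-01-08T00:00:00.000Z"},
    {"weekIncidence": 0, "date": "2020-01-09T00:00:00.000Z"},
]

df = pd.DataFrame(records)                 # list of dicts -> one row per dict
df["date"] = pd.to_datetime(df["date"])    # ISO strings -> timezone-aware timestamps

# Use the date as the index and keep only days with a non-zero incidence
series = df.set_index("date")["weekIncidence"]
nonzero = series[series > 0]
print(nonzero)
```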


### 2. Google Search API (via SerpApi)

**Task: I would like to examine Google search results for a specific keyword in a specific region**

**Method 1:**

In [None]:
# define search params
# (no trailing commas: a trailing comma would turn each value into a tuple)

api_key = "89c5510a126ddca54cecfbb2619677a79a8634a6d394a5710272762771bad372"  # my API key for authentication
engine = "google"  # search engine
q = "coffee"       # keyword
location = "Austin, Texas, United States"  # where we want the search to originate
gl = "us"          # country to use for search
hl = "en"          # language of search results
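
Writing the key directly into a notebook exposes it to anyone who sees the file. One common alternative is to read it from an environment variable. The sketch below assumes the key was exported under the name `SERPAPI_API_KEY`; both that name and the helper `load_api_key` are hypothetical and not something SerpApi prescribes.

```python
import os


def load_api_key(env_var="SERPAPI_API_KEY"):
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a readable message instead of sending an empty key
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

The result can then be assigned with `api_key = load_api_key()` in place of the literal string.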



In [22]:
# construct the url with params

base_url = "https://serpapi.com/search"

url_params = ".json?" + "engine=google" + \
             "&q=coffee" + "&location=Austin%2C+Texas%2C+United+States" + \
             "&gl=us" + "&hl=en" + \
             "&api_key=89c5510a126ddca54cecfbb2619677a79a8634a6d394a5710272762771bad372"
            
url = base_url + url_params

In [23]:
url

'https://serpapi.com/search.json?engine=google&q=coffee&location=Austin%2C+Texas%2C+United+States&gl=us&hl=en&api_key=89c5510a126ddca54cecfbb2619677a79a8634a6d394a5710272762771bad372'
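
Escaping characters by hand (`%2C` for a comma, `+` for a space) is easy to get wrong. As a sketch, the same query string can be generated from a dictionary with the standard library; `YOUR_API_KEY` is a placeholder, not a real key.

```python
from urllib.parse import urlencode

base_url = "https://serpapi.com/search.json"

params = {
    "engine": "google",
    "q": "coffee",
    "location": "Austin, Texas, United States",
    "gl": "us",
    "hl": "en",
    "api_key": "YOUR_API_KEY",  # placeholder, replace with your own key
}

# urlencode escapes each value and joins the pairs with "&"
url = base_url + "?" + urlencode(params)
print(url)
```

`requests.get(base_url, params=params)` accepts the same dictionary and performs an equivalent encoding, so the URL does not have to be assembled manually at all.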

In [24]:
response = requests.get(url)

In [25]:
response_json = response.json()

In [26]:
response_json

{'search_metadata': {'id': '633bf6642e7d6b362b45cff8',
  'status': 'Success',
  'json_endpoint': 'https://serpapi.com/searches/e23d7516facbbc80/633bf6642e7d6b362b45cff8.json',
  'created_at': '2022-10-04 09:01:24 UTC',
  'processed_at': '2022-10-04 09:01:24 UTC',
  'google_url': 'https://www.google.com/search?q=coffee&oq=coffee&uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&hl=en&gl=us&sourceid=chrome&ie=UTF-8',
  'raw_html_file': 'https://serpapi.com/searches/e23d7516facbbc80/633bf6642e7d6b362b45cff8.html',
  'total_time_taken': 3.21},
 'search_parameters': {'engine': 'google',
  'q': 'coffee',
  'location_requested': 'Austin, Texas, United States',
  'location_used': 'Austin,Texas,United States',
  'google_domain': 'google.com',
  'hl': 'en',
  'gl': 'us',
  'device': 'desktop'},
 'search_information': {'organic_results_state': 'Results for exact spelling',
  'query_displayed': 'coffee',
  'total_results': 3780000000,
  'time_taken_displayed': 0.89,
  'menu_items': [{'position': 

In [27]:
response_json.keys()

dict_keys(['search_metadata', 'search_parameters', 'search_information', 'local_map', 'local_results', 'knowledge_graph', 'inline_videos', 'related_questions', 'organic_results', 'related_searches', 'pagination', 'serpapi_pagination'])
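
Most of the interesting content usually sits under one of these keys. The sketch below pulls a compact list out of `organic_results`. The sample dictionary is made up, and the field names `position`, `title` and `link` are an assumption about how each entry is structured, so check them against the real response before relying on them.

```python
def summarize_results(response_json):
    """Return (position, title, link) tuples from the 'organic_results' list."""
    rows = []
    # .get() with a default avoids a KeyError when a search has no organic results
    for item in response_json.get("organic_results", []):
        rows.append((item.get("position"), item.get("title"), item.get("link")))
    return rows


# Hypothetical stand-in for the parsed API response
sample_response = {
    "organic_results": [
        {"position": 1, "title": "Example result A", "link": "https://example.org/a"},
        {"position": 2, "title": "Example result B", "link": "https://example.org/b"},
    ]
}

for row in summarize_results(sample_response):
    print(row)
```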

In [28]:
response_json['search_metadata']

{'id': '633bf6642e7d6b362b45cff8',
 'status': 'Success',
 'json_endpoint': 'https://serpapi.com/searches/e23d7516facbbc80/633bf6642e7d6b362b45cff8.json',
 'created_at': '2022-10-04 09:01:24 UTC',
 'processed_at': '2022-10-04 09:01:24 UTC',
 'google_url': 'https://www.google.com/search?q=coffee&oq=coffee&uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&hl=en&gl=us&sourceid=chrome&ie=UTF-8',
 'raw_html_file': 'https://serpapi.com/searches/e23d7516facbbc80/633bf6642e7d6b362b45cff8.html',
 'total_time_taken': 3.21}

In [29]:
response_json['search_information']

{'organic_results_state': 'Results for exact spelling',
 'query_displayed': 'coffee',
 'total_results': 3780000000,
 'time_taken_displayed': 0.89,
 'menu_items': [{'position': 1, 'title': 'All'},
  {'position': 2,
   'title': 'Images',
   'link': 'https://www.google.com/search?q=coffee&hl=en&gl=us&source=lnms&tbm=isch&sa=X&ved=2ahUKEwjK5pKkm8b6AhWLkWoFHdAOCoQQ_AUoAXoECAEQAw',
   'serpapi_link': 'https://serpapi.com/search.json?device=desktop&engine=google&gl=us&google_domain=google.com&hl=en&location=Austin%2C+Texas%2C+United+States&q=coffee&tbm=isch'},
  {'position': 3,
   'title': 'Maps',
   'link': 'https://maps.google.com/maps?hl=en&q=coffee&uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&gl=us&um=1&ie=UTF-8&sa=X&ved=2ahUKEwjK5pKkm8b6AhWLkWoFHdAOCoQQ_AUoAnoECAEQBA'},
  {'position': 4,
   'title': 'Shopping',
   'link': 'https://www.google.com/search?q=coffee&hl=en&gl=us&source=lnms&tbm=shop&sa=X&ved=2ahUKEwjK5pKkm8b6AhWLkWoFHdAOCoQQ_AUoA3oECAEQBQ',
   'serpapi_link': 'https://se

**Method 2:**

In [30]:
!pip install google-search-results




In [31]:
from serpapi import GoogleSearch

params = {
  "api_key": "89c5510a126ddca54cecfbb2619677a79a8634a6d394a5710272762771bad372",
  "engine": "google",
  "q": "Coffee",
  "location": "Austin, Texas, United States",
  "google_domain": "google.com",
  "gl": "us",
  "hl": "en"
}

search = GoogleSearch(params)
results = search.get_dict()

https://serpapi.com/search


In [32]:
results.keys()

dict_keys(['search_metadata', 'search_parameters', 'search_information', 'local_map', 'local_results', 'knowledge_graph', 'inline_videos', 'related_questions', 'organic_results', 'related_searches', 'pagination', 'serpapi_pagination'])

In [33]:
results['search_metadata']

{'id': '633bf7711d986ec8bb84d7ea',
 'status': 'Success',
 'json_endpoint': 'https://serpapi.com/searches/9a8252b096d3ca4c/633bf7711d986ec8bb84d7ea.json',
 'created_at': '2022-10-04 09:05:53 UTC',
 'processed_at': '2022-10-04 09:05:53 UTC',
 'google_url': 'https://www.google.com/search?q=Coffee&oq=Coffee&uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&hl=en&gl=us&sourceid=chrome&ie=UTF-8',
 'raw_html_file': 'https://serpapi.com/searches/9a8252b096d3ca4c/633bf7711d986ec8bb84d7ea.html',
 'total_time_taken': 1.55}

In [34]:
results['search_information']

{'organic_results_state': 'Results for exact spelling',
 'query_displayed': 'Coffee',
 'total_results': 3780000000,
 'time_taken_displayed': 0.74,
 'menu_items': [{'position': 1, 'title': 'All'},
  {'position': 2,
   'title': 'Images',
   'link': 'https://www.google.com/search?q=Coffee&gl=us&hl=en&source=lnms&tbm=isch&sa=X&ved=2ahUKEwj8-s2jnMb6AhXNkmoFHajlB6kQ_AUoAXoECAIQAw',
   'serpapi_link': 'https://serpapi.com/search.json?device=desktop&engine=google&gl=us&google_domain=google.com&hl=en&location=Austin%2C+Texas%2C+United+States&q=Coffee&tbm=isch'},
  {'position': 3,
   'title': 'Maps',
   'link': 'https://maps.google.com/maps?uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&gl=us&hl=en&q=Coffee&um=1&ie=UTF-8&sa=X&ved=2ahUKEwj8-s2jnMb6AhXNkmoFHajlB6kQ_AUoAnoECAIQBA'},
  {'position': 4,
   'title': 'Shopping',
   'link': 'https://www.google.com/search?q=Coffee&gl=us&hl=en&source=lnms&tbm=shop&sa=X&ved=2ahUKEwj8-s2jnMb6AhXNkmoFHajlB6kQ_AUoA3oECAIQBQ',
   'serpapi_link': 'https://se

### 3. Twitter API example

In [35]:
!pip install tweepy



In [1]:
# the following script is meant to be used with the Twitter API v2
# the minimum tweepy version to use is 4.01

from credentials_v2 import *
import tweepy
import logging

ModuleNotFoundError: No module named 'credentials_v2'

In [38]:

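##### AUTHENTICATION #####
# API_KEY / API_KEY_SECRET are the app's consumer credentials, so they belong in
# consumer_key / consumer_secret rather than access_token / access_token_secret
client = tweepy.Client(bearer_token=BEARER_TOKEN,
                       consumer_key=API_KEY,
                       consumer_secret=API_KEY_SECRET)


# note: this only confirms that the client object was created;
# the credentials are not verified until the first request is sent
if client:
    logging.critical("\nAuthentication OK")
else:
    logging.critical('\nVerify your credentials')


CRITICAL:root:
Authentication OK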


In [41]:
#### 1. LOOKUP USERS USING THEIR USERNAME

# for user_fields parameters check here:
# https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user

elon = client.get_user(username='elonmusk', user_fields=['name','id','created_at'])

greta = client.get_user(username='gretathunberg', user_fields=['name','id','created_at'])


#print(f'user with name {elon.data.name} and ID {elon.data.id} created its twitter account on {elon.data.created_at}')

print(f'user with name {greta.data.name} and ID {greta.data.id} created its twitter account on {greta.data.created_at}')

user with name Greta Thunberg and ID 1006419421244678144 created its twitter account on 2018-06-12 06:14:23+00:00


In [42]:
#### 2. LOOK UP A USER'S TIMELINE

## Elon Musk's timeline
## pass Elon's user ID into the function below
# for tweet_fields parameters check here:
# https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet

elon_tweets = client.get_users_tweets(id=elon.data.id, tweet_fields=['id','text','created_at'], max_results=10)

#print(elon_tweets.data)

for tweet in elon_tweets.data:
    print(f"the user {elon.data.name} at {tweet.created_at} wrote:{tweet.text}\n")


the user Elon Musk at 2022-10-03 23:50:00+00:00 wrote:@FrenchyRagin @Kasparov63 A small number of terminals were paid for by the govt, vast majority were not. This is just another bs WaPo hit piece.

the user Elon Musk at 2022-10-03 23:47:45+00:00 wrote:@ZelenskyyUa I still very much support Ukraine, but am convinced that massive escalation of the war will cause great harm to Ukraine and possibly the world.

the user Elon Musk at 2022-10-03 23:41:33+00:00 wrote:@Kasparov63 We gave Starlinks to Ukraine &amp; lost $80M+ in doing so, while putting SpaceX &amp; myself at serious risk of Russian cyberattack.

What have you done besides tweet?

the user Elon Musk at 2022-10-03 23:25:30+00:00 wrote:@DavidSacks SpaceX’s out of pocket cost to enable &amp; support Starlink in Ukraine is ~$80M so far. Our support for Russia is $0. Obviously, we are pro Ukraine.

Trying to retake Crimea will cause massive death, probably fail &amp; risk nuclear war. This would be terrible for Ukraine &amp; Earth.
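
Two details of the output above are worth handling in code: the tweet text arrives HTML-escaped (`&amp;` instead of `&`), and, as far as I know, tweepy sets `data` to `None` when a request returns no tweets, which would make the `for` loop fail. The helper `tweet_texts` below is a hypothetical sketch that covers both cases; the stand-in objects only mimic the shape of a tweepy response.

```python
import html
from types import SimpleNamespace


def tweet_texts(response):
    """Yield the unescaped text of each tweet; yield nothing if data is None."""
    for tweet in response.data or []:
        yield html.unescape(tweet.text)


# Stand-in objects with made-up content, shaped like a tweepy response
fake_response = SimpleNamespace(data=[SimpleNamespace(text="Starlink &amp; SpaceX")])
empty_response = SimpleNamespace(data=None)

print(list(tweet_texts(fake_response)))
print(list(tweet_texts(empty_response)))
```

With a real client this would be used as `for text in tweet_texts(elon_tweets): ...`.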


In [43]:
#### 3. SEARCHING FOR TWEETS #####

# Defining a query search string
query = 'climate change lang:en -is:retweet'  


search_tweets = client.search_recent_tweets(query=query, tweet_fields=['id','created_at','text'], max_results=10)
#print(search_tweets)

for tweet in search_tweets.data:
    logging.critical(f'\n\n\nINCOMING TWEET:\n{tweet.text}\n\n\n')

CRITICAL:root:


INCOMING TWEET:
"Imagine if that footage were accompanied by headlines and commentary linking the horrific scenes to climate change — and then further linking them to Big Oil."
https://t.co/NqBu5zjviR via @commondreams



CRITICAL:root:


INCOMING TWEET:
@RichardWellings Tantamount to climate change lockdowns



CRITICAL:root:


INCOMING TWEET:
Un message important à lire en 2022 par @ShellenbergerMD "I would like to formally apologise for the climate scare we created over the past 30 years". "Climate change is (...) not the end of the world. It’s not even our most serious environmental problem"
https://t.co/1Vs9iquFqf



CRITICAL:root:


INCOMING TWEET:
@mikejbabcock @JimMWeber Seems really out of character for him. Must be climate change.



CRITICAL:root:


INCOMING TWEET:
@jparkerbrown @Duranozfan @goodfoodgal lol. i wish i'm under the hole right now and it's not healing at all. skin cancer rates are up up up. might have stopped growing for a while then china start