# Examples using APIs in Python

## How to use APIs

APIs differ widely, but they usually consist of the following components:

**Request:** Like any other interaction with the web, using an API involves sending a request (typically GET or POST) to the API.

**Endpoint:** APIs usually consist of several endpoints, which can be thought of as different outlets. Endpoints are simply the URLs we send requests to.

**Parameters:** Parameters are the arguments the endpoint accepts. Some may be required, others are optional.

**Authentication:** Most APIs require some kind of authentication. This can be either HTTP basic authentication (username and password) or authentication via tokens. Tokens are essentially unique keys that identify who is making the request.
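To make the four components concrete, the following minimal sketch builds (but does not send) a request using only the standard library. The endpoint URL, the parameters and the token are all hypothetical placeholders, not part of any real API.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, parameters and token, for illustration only.
endpoint = "https://api.example.com/v1/items"
params = {"q": "population", "limit": 10}
token = "YOUR-TOKEN-HERE"

# Parameters of a GET request are encoded into the URL's query string.
url = f"{endpoint}?{urlencode(params)}"

# Token authentication is typically sent in an HTTP header.
request = Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")

print(request.get_method())                 # the request type
print(request.full_url)                     # endpoint plus encoded parameters
print(request.get_header("Authorization"))  # the authentication header
```

Nothing is sent over the network here; the request object only shows how the pieces fit together before a library such as `requests` transmits them.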

## Example 1: Using Statistics Denmark's API for StatBank

Link to API documentation: https://www.dst.dk/en/Statistik/brug-statistikken/muligheder-i-statistikbanken/api

Statistics Denmark's API for StatBank makes it possible to access the data in StatBank.

The following demonstrates how to interact with the API directly via Python.

*Note*: The StatBank API does not require authentication.

### Finding the right table

StatBank has several APIs. The most useful is the API for extracting data: https://api.statbank.dk/v1/data

However, making use of the data API requires knowing what to ask it, which depends on the table we want to extract data from.

The "tableinfo" API returns information regarding a specific table in StatBank: https://api.statbank.dk/v1/tableinfo

Before even interacting with the API, it makes the most sense to find the table you want to draw data from via the main StatBank page: http://www.statbank.dk/statbank5a/default.asp?w=2560


### Extracting information about the Danish population ("FOLK1C")

In the following, we use the "tableinfo" API to find information regarding the table: FOLK1C.

In [None]:
import json
import requests

statbank_api = "https://api.statbank.dk/v1/tableinfo"  # Address of the tableinfo API
table_req = {"lang": "en",
             "table": "folk1c"}  # The request body (sent as JSON) - note the table input!

stat_req = requests.post(statbank_api, json=table_req)  # Send the request

table_json = stat_req.json()  # Parse the response body as JSON (json.loads no longer accepts an `encoding` argument)
print(json.dumps(table_json, indent=4, ensure_ascii=False))  # Pretty-print the parsed JSON

With `table_json` containing the information about the table FOLK1C, we can extract specific details about the table.

In [None]:
table_json['description']

In [None]:
for variable in table_json['variables']:
    print(variable['id'])

In [None]:
table_json['variables'][0]  #OMRÅDE (area/municipality)

In [None]:
table_json['variables'][2]  #Alder (age)
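Instead of inspecting the variables one index at a time, the accepted value ids can be collected into a single lookup. The sketch below runs on a small hand-written stand-in for the "tableinfo" response; the structure (a `variables` list whose entries have an `id` and a `values` list) is an assumption based on the cells above, and the helper name `value_ids_by_variable` is made up for this example.

```python
# Hand-written stand-in for a "tableinfo" response (assumed structure, abbreviated values).
sample_tableinfo = {
    "id": "FOLK1C",
    "variables": [
        {"id": "OMRÅDE", "text": "region",
         "values": [{"id": "101", "text": "Copenhagen"},
                    {"id": "851", "text": "Aalborg"}]},
        {"id": "ALDER", "text": "age",
         "values": [{"id": "20-24", "text": "20-24 years"},
                    {"id": "25-29", "text": "25-29 years"}]},
    ],
}


def value_ids_by_variable(tableinfo):
    """Map each variable id to the list of value ids it accepts."""
    return {
        variable["id"]: [value["id"] for value in variable["values"]]
        for variable in tableinfo["variables"]
    }


lookup = value_ids_by_variable(sample_tableinfo)
print(lookup["ALDER"])
```

If the real response has the assumed structure, calling `value_ids_by_variable(table_json)` would give the same kind of lookup for the full table.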

### Extracting data from the StatBank

Using the information above, we can now request specific data from the data API.

In [None]:
statbank_api = "https://api.statbank.dk/v1/data"  # Address of the data API

# Request body (sent as JSON): table, output format and the values wanted per variable
data_req = {'table': 'folk1c',
            'format': 'CSV',
            'variables': [{'code': 'OMRÅDE', 'values': ['101', '851']},
                          {'code': 'ALDER', 'values': ['20-24', '25-29']}]
           }

data_req = requests.post(statbank_api, json=data_req)  # Send the request (data_req now holds the response)

print(data_req.text)  # Print the raw text output
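The request body has a regular shape, so it can be generated from a plain dictionary of selections rather than typed by hand. This is a minimal sketch: the helper `build_data_request` is a hypothetical name introduced here, and it only reproduces the keys used in the cell above (`table`, `format`, `variables`, `code`, `values`).

```python
def build_data_request(table, selections, fmt="CSV"):
    """Build a data API request body from {variable code: [value ids]}."""
    return {
        "table": table,
        "format": fmt,
        "variables": [
            {"code": code, "values": list(values)}
            for code, values in selections.items()
        ],
    }


payload = build_data_request(
    "folk1c",
    {"OMRÅDE": ["101", "851"], "ALDER": ["20-24", "25-29"]},
)
print(payload)
```

The resulting `payload` could then be passed as the `json=` argument of `requests.post`, in the same way as the hand-written dictionary.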

With `'format': 'CSV'`, the data API returns delimiter-separated values (CSV), here using a semicolon as the separator.

This output is directly readable by the `pandas` package (`pd.read_csv`).

In [None]:
from io import StringIO
import pandas as pd

dstdata = StringIO(data_req.text)  #Read the data output as raw text
dstdf = pd.read_csv(dstdata, sep=";")  #Read text as csv
dstdf  #Print data

In [None]:
dstdf.groupby('OMRÅDE').sum(numeric_only=True)  # Group by municipality and sum the numeric columns
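To show what the grouping step does, the sketch below performs the same aggregation with the standard library only. The sample text is made up: both the column names (`OMRÅDE`, `ALDER`, `TID`, `INDHOLD`) and the numbers are assumptions for illustration, not actual StatBank output.

```python
import csv
from collections import defaultdict
from io import StringIO

# Made-up semicolon-separated sample (assumed column names, invented numbers).
sample_csv = (
    "OMRÅDE;ALDER;TID;INDHOLD\n"
    "Copenhagen;20-24 years;2021Q1;100\n"
    "Copenhagen;25-29 years;2021Q1;150\n"
    "Aalborg;20-24 years;2021Q1;40\n"
    "Aalborg;25-29 years;2021Q1;30\n"
)

# Sum the value column per municipality.
totals = defaultdict(int)
for row in csv.DictReader(StringIO(sample_csv), delimiter=";"):
    totals[row["OMRÅDE"]] += int(row["INDHOLD"])

print(dict(totals))
```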

## Example 2: Using the Twitter API

*NOTE*: This notebook uses a token that is not included in the notebook. You will not be able to reproduce this on your own computer without proper authentication (for this you need access to the Twitter enterprise API: https://developer.twitter.com/en/docs/twitter-api/getting-started/getting-access-to-the-twitter-api).

The Twitter API contains a wide variety of endpoints for both interacting with Twitter (sending tweets, replying) and for retrieving data.

The example below uses the "Search Tweets" endpoint (full archive search): https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-all. The example retrieves tweets from Elon Musk from the last week.

It is adapted from Twitter's own sample code: https://github.com/twitterdev/Twitter-API-v2-sample-code/blob/main/Full-Archive-Search/full-archive-search.py

In [None]:
import requests
import os
import json
import time
from datetime import datetime, timedelta

# token and endpoint
with open(os.path.join("C:/", "repos", "tokens", "twitter_bearer.txt"), 'r') as f:
    bearer_token = f.read().strip()  # strip whitespace/newlines so the header value is valid

search_url = "https://api.twitter.com/2/tweets/search/all"

# set start_time
d = datetime.today() - timedelta(days=7)
start_time = f"{str(d.date())}T00:00:00Z"

query_params = {'query': 'from:elonmusk -is:retweet',
                'tweet.fields': 'entities,public_metrics,created_at,referenced_tweets',
                'expansions': 'author_id',
                'user.fields': 'created_at,description,public_metrics,url,verified', 
                'max_results': 500,
                'start_time': start_time}


def bearer_oauth(r):
    """
    Method required by bearer token authentication.
    """

    r.headers["Authorization"] = f"Bearer {bearer_token}"
    r.headers["User-Agent"] = "v2FullArchiveSearchPython"
    return r


def connect_to_endpoint(url, params):
    response = requests.get(url, auth=bearer_oauth, params=params)  # use the url argument, not the global
    #print(response.status_code)
    if response.status_code != 200:
        raise Exception(response.status_code, response.text)
    return response.json()


def initial():
    """Request the first page of results."""
    return connect_to_endpoint(search_url, query_params)

def continued(next_token):
    """Request the page of results identified by next_token."""
    new_params = query_params.copy()
    new_params['next_token'] = next_token
    return connect_to_endpoint(search_url, new_params)

data = initial()
all_data = data.copy()
all_data.pop('meta', None)

used_next_tokens = []
next_token = data.get('meta').get('next_token')

# Keep requesting pages until the response no longer contains a next_token
while next_token is not None:
    time.sleep(1)  # pause between requests
    data = continued(next_token)

    # A page may lack 'data' or 'includes', so fall back to empty containers
    all_data['data'] = all_data.get('data', []) + data.get('data', [])
    includes = all_data.setdefault('includes', {})
    includes['users'] = includes.get('users', []) + data.get('includes', {}).get('users', [])

    used_next_tokens.append(next_token)
    next_token = data.get('meta').get('next_token')
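The token-based pagination used above can be isolated into a small generator, which makes it possible to try it out without network access. This is a sketch under assumptions: `paginate` and `fake_fetch` are hypothetical names, and the fake pages only mimic the `data` / `meta.next_token` layout used in the cell above.

```python
def paginate(fetch_page):
    """Yield pages until a response carries no 'next_token' in its 'meta' block."""
    next_token = None
    while True:
        page = fetch_page(next_token)
        yield page
        next_token = page.get("meta", {}).get("next_token")
        if next_token is None:
            break


# Fake two-page "API" keyed by the token that requests each page.
fake_pages = {
    None: {"data": [{"id": "1"}, {"id": "2"}], "meta": {"next_token": "abc"}},
    "abc": {"data": [{"id": "3"}], "meta": {}},
}


def fake_fetch(next_token):
    """Stand-in for a real request; returns the stored page for the token."""
    return fake_pages[next_token]


tweets = []
for page in paginate(fake_fetch):
    tweets.extend(page.get("data", []))

print([tweet["id"] for tweet in tweets])
```

With a real API, `fetch_page` would wrap the actual request, for example by calling `initial()` when the token is `None` and `continued(next_token)` otherwise.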

In [None]:
#data

Convert the result to a data frame (this will require further data wrangling).

In [None]:
import pandas as pd

df_tweets = pd.DataFrame.from_records(all_data.get('data'))

In [None]:
df_tweets.head()
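One typical wrangling step is flattening nested fields such as `public_metrics`, which otherwise end up as dictionaries inside a single column. The sketch below shows the idea on a made-up record; `flatten_record` is a hypothetical helper and the sample values are invented.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dictionaries into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Made-up record mimicking a tweet with nested metrics.
sample_tweets = [
    {"id": "1", "text": "hello",
     "public_metrics": {"like_count": 5, "retweet_count": 2}},
]

flat_tweets = [flatten_record(tweet) for tweet in sample_tweets]
print(flat_tweets[0])
```

The flattened dictionaries can be handed to `pd.DataFrame.from_records` as before; `pd.json_normalize` offers similar flattening directly in pandas.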