# APIs for data retrieval

An Application Programming Interface, commonly known as an API, is a set of protocols, routines, and tools for building software applications. APIs allow different software systems to communicate with each other and exchange data in a standardized and efficient way.

APIs for retrieving data enable developers to access and extract information from various sources such as databases, web services, or applications. These APIs often provide a structured and consistent way of accessing data, making it easier for developers to consume and use the data in their applications.

APIs for retrieving data can be used for a variety of purposes, such as gathering information for business intelligence, analyzing user behavior, or integrating data from different sources into a single application. These APIs often use standardized data formats, such as JSON or XML, to ensure compatibility and interoperability between different systems.

As the amount of data available online continues to grow, APIs for retrieving data have become an essential tool for developers to access and analyze this data. By leveraging these APIs, developers can quickly and easily retrieve the data they need, without having to manually extract and process it themselves.

### API types
There are several types of APIs available, but two of the most commonly used are REST APIs and HTTP APIs.

- REST API (Representational State Transfer API):
REST stands for Representational State Transfer, and it's a set of architectural principles for building web services. A REST API is a type of web service that follows the REST architecture principles. REST APIs use HTTP methods (GET, POST, PUT, DELETE, etc.) to access and manipulate resources, which are identified by URIs (Uniform Resource Identifiers). REST APIs typically return data in JSON or XML format and are widely used for building web and mobile applications.


- HTTP API (Hypertext Transfer Protocol API):
HTTP stands for Hypertext Transfer Protocol, which is the protocol used for transferring data over the World Wide Web. An HTTP API is a type of web service that uses HTTP methods to access and manipulate resources. An HTTP API can be RESTful, but it doesn't have to be. HTTP APIs are often used for simple operations like CRUD (Create, Read, Update, Delete) on resources and return data in JSON or XML format.


Other types of APIs include SOAP (Simple Object Access Protocol), GraphQL, and WebSockets. SOAP is an older protocol used for building web services, while GraphQL is a newer API technology that allows clients to specify the data they need and receive it in a single request. WebSockets are used for real-time, two-way communication between a client and a server.
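
As a rough illustration of how HTTP methods map onto CRUD operations on a resource, the sketch below builds (but never sends) one request per operation using only the standard library. The URI `https://api.example.com/books/42` and the helper `build_request` are made up for this example, and the mapping shown is a common convention rather than a rule every API follows.

```python
from urllib.request import Request

# Conventional mapping from CRUD operations to HTTP methods in REST-style APIs.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def build_request(operation, resource_uri):
    """Build (but do not send) an HTTP request for a CRUD operation."""
    return Request(resource_uri, method=CRUD_TO_HTTP[operation])


# Hypothetical resource URI, used only for illustration.
book_uri = "https://api.example.com/books/42"

for operation in CRUD_TO_HTTP:
    request = build_request(operation, book_uri)
    print(f"{operation:<6} -> {request.get_method()} {request.full_url}")
```

The same URI identifies the resource in every case; only the method changes what is done to it.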

### Python and APIs

Python is a popular programming language that provides powerful tools for interacting with APIs. Here are the steps to interact with APIs using Python:

1. Import the necessary libraries: Python has several libraries that make it easy to interact with APIs, including requests, json, and urllib. Before making any API requests, you need to import the appropriate libraries.

2. Find the API endpoint: The endpoint is the URL that you will use to send your API requests. It's essential to understand the API documentation to find the correct endpoint for the specific data you want to retrieve.

3. Send a request: Once you have the endpoint, you can use Python's requests library to send an HTTP request to the API endpoint. The requests library has several methods for sending different types of HTTP requests, including GET, POST, PUT, DELETE, and more.

4. Parse the response: The API response will typically be in JSON format. Python's json library can be used to parse the JSON data and convert it into a Python dictionary that you can easily work with in your code.

5. Extract the data: Once you have the API response in a Python dictionary, you can extract the data you need and use it in your application.

Python's ease of use and powerful libraries make it an excellent language for interacting with APIs. With just a few lines of code, you can send requests to APIs, parse the response data, and extract the information you need to build powerful applications.
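
The five steps above can be sketched with the standard library alone (`urllib` and `json`). This is a minimal sketch, not a definitive implementation: the endpoint `https://api.example.com/items`, the helper names, and the response body are all invented for illustration, and the network call is defined but not executed so the example runs offline.

```python
import json
from urllib.request import urlopen  # step 1: import the libraries

# Step 2: the endpoint. This URL is a placeholder, not a real service.
ENDPOINT = "https://api.example.com/items"


def fetch_body(url, timeout=10):
    """Step 3: send an HTTP GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_body(body):
    """Step 4: parse JSON text into Python objects (dicts, lists, ...)."""
    return json.loads(body)


def extract_ids(data):
    """Step 5: pull out the fields the application needs."""
    return [item["id"] for item in data["items"]]


# Offline demonstration of steps 4 and 5 on a made-up response body.
# With a real endpoint, the body would come from fetch_body(ENDPOINT).
sample_body = '{"items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'
ids = extract_ids(parse_body(sample_body))
print(ids)
```

With the `requests` library, steps 3 and 4 usually collapse into `requests.get(url).json()`.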

### Using the requests package

To do this, we first need to know how to send requests. We will use a package called [`requests`](http://docs.python-requests.org/en/master/). If you do not have it installed, install it from your command prompt or terminal with, for example, `poetry add` or `pip install`:


```$ poetry add requests``` or ```$ pip install requests```


In [21]:
import requests # library for making HTTP requests
import pandas as pd # library for data analysis
import datetime as dt # library for handling date and time objects

###### Open DMI weather data

Go to the [documentation](https://opendatadocs.dmi.govcloud.dk/en/DMIOpenData)

Using weather as an example, we first need to know the request URL (where the request goes) and its parameters (e.g., the API key and `stationId`). In our case, we know our API key and the station to query, so we can proceed as follows.
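
To make the relationship between the request URL and its parameters concrete, here is a small sketch that assembles the final URL by hand with the standard library. The value `YOUR_API_KEY` is a placeholder, and the variable names are chosen for this example only; `requests` performs the same encoding automatically when parameters are passed through `params`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

DMI_URL = "https://dmigw.govcloud.dk/v2/metObs/collections/station/items"

# "YOUR_API_KEY" is a placeholder; substitute the key issued to you.
params = {"api-key": "YOUR_API_KEY", "stationId": "06074"}

# Join the endpoint and the encoded parameters with "?".
full_url = f"{DMI_URL}?{urlencode(params)}"
print(full_url)

# Parsing the URL again recovers the original parameters.
recovered = parse_qs(urlsplit(full_url).query)
print(recovered)
```

Because the key ends up in the URL, it is worth avoiding printing or sharing the full URL once a real key is used.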

You will have to create a user and retrieve an API key for the API you want to use; see [how to](https://opendatadocs.dmi.govcloud.dk/Authentication).

I have saved my API key in a file named ```.env```:
    
    api_key = your_api_key

Specifically we will look at [Meteorological Observation](https://opendatadocs.dmi.govcloud.dk/en/APIs/Meteorological_Observation_API)

In [12]:
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.environ['api_key']

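If the import fails with `ModuleNotFoundError: No module named 'dotenv'`, install the package first (`pip install python-dotenv` or `poetry add python-dotenv`).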

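# Alternative: read the keys and tokens into a dictionary
my_dict = {}

with open("./.env", "r") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, val = line.split('=', 1)  # split on the first '=' only
        my_dict[key.strip()] = val.strip()

api_key = my_dict['api_key']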

In [17]:
stationId = '06074' # list of stationId: https://confluence.govcloud.dk/pages/viewpage.action?pageId=41717704

In [18]:
DMI_URL = 'https://dmigw.govcloud.dk/v2/metObs/collections/station/items'
r = requests.get(DMI_URL, params={'api-key': api_key, 'stationId': stationId})  # Issues an HTTP GET request
r.url  # `requests` encodes the parameters into the URL for us


In [19]:
r.status_code # 200 means success

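
Status codes other than 200 are worth interpreting before parsing a response. The sketch below uses the standard library's `http.HTTPStatus` to turn a numeric code into a readable description; the helper `describe_status` is a name invented for this example. In `requests`, calling `r.raise_for_status()` raises an exception for 4xx and 5xx responses, which is a common way to stop early on failure.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP status code."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 500:
        outcome = "client error"
    elif 500 <= code < 600:
        outcome = "server error"
    else:
        outcome = "other"
    return f"{code} {status.phrase} ({outcome})"


for code in (200, 401, 404, 500):
    print(describe_status(code))
```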

In [None]:
station = r.json()
station

In [None]:
DMI_URL = 'https://dmigw.govcloud.dk/v2/metObs/collections/observation/items'
r = requests.get(DMI_URL, params={'api-key': api_key, 'stationId': '06074', 'period': 'latest-day', 'parameterId': 'temp_dry'})  # Issues an HTTP GET request
print(r)

In [None]:
dmi = r.json()  # Extract JSON data
dmi  # Display the parsed JSON dictionary

The JSON object is converted into a `dict`, the Python data structure that holds key-value pairs. To access particular values, we index it like any other `dict`.

In [None]:
dmi['features']

In [None]:
dmi['features'][0]

In [None]:
dmi['features'][0]['properties']

In [None]:
dmi['features'][0]['properties']['value']

In [None]:
for feature in dmi['features']:
    for key, value in feature['properties'].items():
        print(key, value)

Now it gets interesting, as we can put the values into a dataframe (more on dataframes later).

In [None]:
import pandas as pd

lst = []

for values in dmi['features']:
    lst.append(pd.DataFrame.from_dict(values['properties'], orient='index').transpose())

In [None]:
df = pd.concat(lst).reset_index(drop=True)  # drop the old per-row index instead of keeping it as a column

In [None]:
df
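
As an alternative to building one small DataFrame per observation, a list of dicts can be handed to `pd.DataFrame` directly. The sketch below assumes a payload with the same `features` / `properties` nesting used above; the sample values and the field names `observed`, `parameterId`, and `value` are made up for illustration and may not match the real response exactly.

```python
import pandas as pd

# Made-up payload that mimics the shape used above:
# a "features" list whose items carry a "properties" dict.
sample = {
    "features": [
        {"properties": {"observed": "2023-01-01T00:00:00Z", "parameterId": "temp_dry", "value": 3.1}},
        {"properties": {"observed": "2023-01-01T00:10:00Z", "parameterId": "temp_dry", "value": 3.0}},
    ]
}

# One dict per row; pandas infers the columns from the keys.
rows = [feature["properties"] for feature in sample["features"]]
sample_df = pd.DataFrame(rows)

# Parse the timestamp strings into datetime values.
sample_df["observed"] = pd.to_datetime(sample_df["observed"])
print(sample_df)
```

This keeps numeric columns numeric, whereas transposing per-row frames tends to leave every column with the generic `object` dtype.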

### HTTP API

An example from Open Data DK
https://www.opendata.dk/syddjurs-kommune/indeklima-administrationsbygningen-i-hornslet1

In [None]:
import requests
import pandas as pd

In [None]:
r = requests.get('https://os2iot-backend.prod.os2iot.kmd.dk/api/v1/open-data-dk-sharing/22/data/22')
r.json()

In [None]:
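data = r.json()  # parse the response once: a list of lists of readings
schema = data[0][0].keys()
df = pd.DataFrame(columns=schema)
df['time'] = []  # extra column for the observation timestamp

# Append one row per reading, taking the 'value' field of each attribute
for t in data: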
    for i in t:
        df.loc[len(df.index)] = [
            i['id'],
            i['type'],
            i['name']['value'],
            i['temperature']['value'],
            i['humidity']['value'],
            i['light_level']['value'],
            i['motion']['value'],
            i['co2']['value'],
            i['location']['value']['coordinates'],
            i['temperature']['observedAt'],
        ]

In [None]:
df
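
The nested readings above can also be flattened with a small helper before they reach pandas. This is a sketch under assumptions: `flatten_reading` is a hypothetical helper, and the sample reading is invented and carries fewer attributes than the real data.

```python
def flatten_reading(reading):
    """Turn one nested sensor reading into a flat dict of plain values."""
    flat = {}
    for key, field in reading.items():
        # Attributes are dicts with a 'value' entry; 'id' and 'type' are plain strings.
        flat[key] = field["value"] if isinstance(field, dict) and "value" in field else field
    return flat


# Made-up reading shaped like the items in the loop above (fewer attributes).
sample_reading = {
    "id": "sensor-1",
    "type": "IndoorClimate",
    "temperature": {"value": 21.5, "observedAt": "2023-01-01T12:00:00Z"},
    "humidity": {"value": 40, "observedAt": "2023-01-01T12:00:00Z"},
}

row = flatten_reading(sample_reading)
row["time"] = sample_reading["temperature"]["observedAt"]
print(row)
```

A list of such rows can be passed to `pd.DataFrame(rows)` in one step, which is usually faster than appending with `df.loc` row by row. Attributes whose value is itself nested, such as `location`, would still need their own handling.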

An example from Open Data DK
https://www.opendata.dk/city-of-aarhus/transaktionsdata-fra-aarhus-kommunes-biblioteker

In [None]:
url = 'https://admin.opendata.dk/api/3/action/datastore_search?resource_id=5b9b00f9-543e-4ac0-994c-dbbc8b38e7e5'
r = requests.get(url)
r.json()

In [None]:
# Filter with SQL; passing the query via `params` lets `requests` URL-encode it
sql_url = 'https://admin.opendata.dk/api/3/action/datastore_search_sql'
sql = 'SELECT * from "5b9b00f9-543e-4ac0-994c-dbbc8b38e7e5" WHERE id=3'
r = requests.get(sql_url, params={'sql': sql})
r.json()
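
The endpoint paths used here (`/api/3/action/datastore_search`) follow the CKAN DataStore API, whose responses usually wrap the rows in `result.records` next to a `success` flag. Assuming that shape, the sketch below pulls the records out of a parsed response. The helper `extract_records` and the sample payload, including its field names, are invented for illustration.

```python
def extract_records(payload):
    """Return the list of records from a CKAN-style DataStore response."""
    if not payload.get("success"):
        raise ValueError("the API reported a failed request")
    return payload["result"]["records"]


# Made-up payload; the field names inside the records are illustrative only.
sample_payload = {
    "success": True,
    "result": {
        "records": [
            {"_id": 1, "branch": "A"},
            {"_id": 2, "branch": "B"},
        ]
    },
}

records = extract_records(sample_payload)
print(len(records), records[0])
```

With a real response, `extract_records(r.json())` would give a list of dicts that `pd.DataFrame` accepts directly.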

Return to the [overview](../00_overview.ipynb)