# Table of Contents
   1. [Web Communication Fundamentals](#web-communication-fundamentals)
      1. [DNS, IP](#dns-ip)
      2. [HTTP](#http)
      3. [URL](#url)
   2. [HTTP Requests and Responses](#http-requests-and-responses)
      1. [Requests](#requests)
      2. [Response](#response)
   3. [APIs](#apis)
      1. [RESTful APIs](#restful-apis)
   4. [Requests in Python](#requests-in-python)
      1. [GET](#get)
         1. [Accessing an endpoint](#accessing-an-endpoint)
         2. [Response headers](#response-headers)
         3. [Parameters](#parameters)
         4. [Request Headers](#request-headers)
   5. [APIs examples](#apis-examples)
      1. [NewsAPI](#newsapi)
         1. [💡 Check for understanding](#check-for-understanding-1)
      2. [Pokemon API](#pokemon-api)
         1. [json_normalize()](#json_normalize)
      3. [Coincap API](#coincap-api)
      4. [Api Jokes](#api-jokes)
         1. [💡 Check for understanding](#check-for-understanding-2)
   6. [API Wrappers](#api-wrappers)
   7. [Summary](#summary)
   8. [Glossary](#glossary)
   9. [Further materials](#further-materials)

## Web Communication Fundamentals
### DNS, IP
[How do we connect to www.google.com?](https://www.youtube.com/watch?v=sUhEqT_HSBI&ab_channel=ProfeSang)
* **DNS (Domain Name System)**: essentially the phonebook of the internet. Humans access information online through domain names, like "google.com", while computers locate each other through Internet Protocol (IP) addresses. In this example, DNS maps the domain name www.google.com to the server's IP address: 216.58.222.196
* **IP address**: a numerical identifier for a device on a network, which allows information to be sent to and received by the correct parties.
* **Domain providers (registrars)**: companies that sell and register internet domains.
 
*When making API calls to specific domains, the DNS translates the human-readable domain name into an IP address.*

### HTTP
**H**yper **T**ext **T**ransfer **P**rotocol
- HTTP is a communications protocol that provides a structure for requests between the client and the server on a network.
- For example, the web browser on the user's computer (the client) uses the HTTP protocol to request information from a website on a server.

*We will use APIs that employ the **HTTP** protocol as transport.*

### URL
Contains information about the resource being requested from the **server**.
![](https://github.com/data-bootcamp-v4/lessons/blob/main/img/http.png?raw=true)
Examples:

- https://www.google.com/webhp?authuser=2
- https://www.towardsdatascience.com
- https://www.ironhack.com/
    - Protocol: https (HTTPS is HTTP sent over an encrypted connection)
    - Domain Name --> ironhack
    - TLD (Top-Level Domain) --> .com

*When calling an API, you specify the URL of the API endpoint, which tells the system where to send the request.*
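
To see the parts of a URL described above, here is a minimal sketch that splits one of the example URLs with Python's standard library. It only parses the text of the URL; no request is sent.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.google.com/webhp?authuser=2"
parts = urlsplit(url)

print(parts.scheme)            # protocol
print(parts.netloc)            # host: subdomain + domain name + TLD
print(parts.path)              # path of the resource on the server
print(parse_qs(parts.query))   # query parameters as a dict of lists
```

Note that `parse_qs` returns each value inside a list, because the same parameter name may appear more than once in a query string.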

## HTTP Requests and Responses
**Requests** and **responses** are fundamental components of the client-server communication model.
### Requests
These are queries or calls **sent by the client** (such as a web browser or other software) **to the server** in order to receive information (a **response**). 

A request typically consists of:
- A method (such as GET, POST, PUT, DELETE) that defines the action to be performed
     * GET: read the information of the resource, without modifying it in any way. Accessing the website from the browser **gets** information.
- The URL or endpoint specifying the resource
- Optional additional information such as:
    - Headers (metadata): User-Agent, Accept-Language
    - Parameters
    - Body content. 
    
For example, a client might send a GET request to retrieve information from a web page or a POST request to submit form data.

### Response
These are the answers or data sent **by the server back to the client in reply to a request**. 

A response typically includes:
- A status code that indicates the success or failure of the request
- Headers with meta-information about the server's behavior
- The actual content or data (if applicable), such as HTML, JSON (similar to a Python dictionary), images, or other media types.

An important part of the **response** is the **status code**, a three-digit number that summarizes the result of the request. Its first digit tells you whether the server carried out the request or, if not, which side the problem is on. These are some groups of status codes:

- **2xx successful**: the request was successfully received, understood, and accepted
- **3xx redirection**: more actions are required to complete the request
- **4xx client error**: the request contains incorrect syntax or cannot be fulfilled
- **5xx server error**: the server has failed to complete an apparently valid request

Complete list:
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

Much more fun:
https://http.cat/
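
The status-code groups listed above can be sketched as a small helper. This is only an illustration: the function name `describe_status` and the `STATUS_GROUPS` mapping are made up for this example, while `http.HTTPStatus` comes from Python's standard library and supplies the standard reason phrases.

```python
from http import HTTPStatus

# First digit of the status code -> group name used in this lesson
STATUS_GROUPS = {
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe_status(code: int) -> str:
    """Return the code, its standard phrase (if any) and its group."""
    group = STATUS_GROUPS.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # the code is not a registered HTTP status
        phrase = "Unknown"
    return f"{code} {phrase} ({group})"

for code in (200, 301, 401, 404, 500):
    print(describe_status(code))
```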

## APIs
**A**pplication **P**rogramming **I**nterface

- APIs define a set of rules and protocols that allow different software applications to communicate with each other.
- The client calls the server through an API, and the server responds.
- Within a project, the backend may want to share information with the frontend without giving it direct access to the database.

### RESTful APIs
As a data analyst, you often need to access data from various sources. RESTful APIs provide a **standardized way** to retrieve, update, or delete data from other systems.
- Usually, **we (the client) send a request, and they (the server) return a response, often in JSON - JavaScript Object Notation - format**. While JSON looks similar to the dictionaries we're used to in Python, it's worth noting that when represented as raw text (e.g., in an API response), the entire JSON structure is a *string*. However, within that structure, JSON can represent various data types like numbers, booleans, and arrays, not just strings.

- APIs always need to provide documentation for their various services: **endpoints**.  Each endpoint is a different URL.  
- Sometimes we have to pass **parameters** to an API endpoint, similar to when we pass parameters to a Python function.
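
To make the point about JSON being a string until it is parsed more concrete, here is a small sketch using the standard `json` module. The payload is a made-up sample (the `active` and `orbits` fields are hypothetical and only there to show different data types).

```python
import json

# Raw JSON as it would travel over the network: a single string
raw_text = '{"name": "iss", "id": 25544, "active": true, "orbits": [1, 2, 3]}'

# Parsing turns the string into Python objects
parsed = json.loads(raw_text)

print(type(raw_text))           # the raw payload is a str
print(type(parsed))             # the parsed payload is a dict
print(type(parsed["id"]))       # JSON numbers become int or float
print(type(parsed["active"]))   # JSON true/false become bool
print(type(parsed["orbits"]))   # JSON arrays become list
```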

At this point in the Bootcamp, **reading documentation becomes essential**, as each API will "work" differently, and to use it, we will need to know what it requires.

[Here](https://github.com/public-apis/public-apis) is a repository where you can find several free APIs if you feel like exploring a bit.

## Requests in Python
You can make requests to RESTful APIs using libraries like `requests`. 

In [2]:
# If you don't have it installed, you can install it using pip or pip3

#!pip install requests

In [1]:
import requests

### GET
Making a `GET request` to the API is simply a call to a `URL` that returns information when provided with the appropriate `parameters`. We will only perform **GET** requests.

Here is an example on how to make a request with Python:

```python
url = "https://api.example.com/products"
response = requests.get(url)

if response.status_code == 200:
    products = response.json()
    # Now you can analyze and work with the products data in Python
```
#### Accessing an endpoint

As we mentioned above, APIs always need to provide documentation for their various services: **endpoints**.  Each endpoint is a different URL.  

**Example: ISS API**

Let's get information from the ISS (International Space Station)! We'll start by looking at the [ISS API documentation](https://wheretheiss.at/w/developer).

This API allows you to access various data related to the International Space Station (ISS), including its current, past, or future position, timezone information for specific coordinates, and more.

**Key Features**
- **Authentication**: No authentication is currently required, but future endpoints may include this.
- **Rate Limiting**: Limited to approximately 1 request per second.
- **Responses**: Default to JSON format, with optional parameters to modify response appearance.
- **Endpoints**: Several endpoints provide different types of information:
    - **satellites**: Information about satellites, including the ISS.
    - **satellites/[id]**: Position, velocity, and related information for a satellite.
    - **satellites/[id]/positions**: Position data for specific timestamps.
    - **satellites/[id]/tles**: TLE (Two-Line Element Set) data in either JSON or text format.
    - **coordinates/[lat,lon]**: Timezone information for specific coordinates.

**Examples**
- Satellite details: `https://api.wheretheiss.at/v1/satellites`
- ISS position: `https://api.wheretheiss.at/v1/satellites/25544`
- Coordinates information: `https://api.wheretheiss.at/v1/coordinates/37.795517,-122.393693`

**Endpoint satellites**

In [17]:
# We'll use the satellites endpoint, which according to the documentation returns information about satellites

url = "https://api.wheretheiss.at/v1/satellites"

In [18]:
response = requests.get(url) # We use get method to make the request and get information from the API

In [19]:
type(response) # Let's look at the type of the response

requests.models.Response

In [20]:
response.status_code # We can access status code. 200 is OK

200

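We can access the raw response body with `.content`, which returns `bytes` (note the `b` prefix in the output below). The `.text` attribute returns the same body decoded as a string.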

In [23]:
response.content

b'[{"name":"iss","id":25544}]'

If the response's `Content-Type` is JSON, `.json()` parses the body into Python objects, which are much easier to work with.

In [25]:
info = response.json()

In [37]:
info

[{'name': 'iss', 'id': 25544}]

In [34]:
type(info) # It returns a list of dictionaries

list

In [40]:
info[0]["id"] # So we can access the id like this

25544

#### Response headers 

Response headers are part of the HTTP response that a server sends back to the client after a request has been made. These headers provide meta-information about the response and can affect how the client handles the response.

Here are some common response headers and what they typically represent:

1. **`Date`**: Represents the date and time at which the response was sent.

2. **`Server`**: Provides information about the software used by the originating server.

3. **`X-Rate-Limit`**: Sometimes used in APIs to inform the client about rate limiting policies, such as the number of allowed requests in a given time frame.

4. **`Content-Type`**: Specifies the media type of the resource or data the server is sending back. For example, it could be `application/json` for a JSON object, `text/html` for an HTML page, or `image/png` for an image.

And more.

In [64]:
response.headers # We can also access the response headers

{'Date': 'Sat, 20 Jul 2024 17:18:04 GMT', 'Server': 'Apache/2.2.22 (Ubuntu)', 'X-Powered-By': 'PHP/5.3.10-1ubuntu3.26', 'X-Rate-Limit-Limit': '350', 'X-Rate-Limit-Remaining': '339', 'X-Rate-Limit-Interval': '5 minutes', 'Access-Control-Allow-Origin': '*', 'X-Apache-Time': 'D=17602', 'Cache-Control': 'max-age=0, no-cache', 'Content-Length': '312', 'Keep-Alive': 'timeout=15, max=100', 'Connection': 'Keep-Alive', 'Content-Type': 'application/json'}
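
As a small illustration of how these headers can be used, the sketch below reads the rate-limit values from a plain dictionary copied from the output above. Keep in mind that the `X-Rate-Limit-*` names are specific to this API rather than a universal standard, and that `response.headers` from `requests` is case-insensitive while the plain `dict` used here is not.

```python
# Header values copied from the response shown above; headers always arrive as strings
headers = {
    "X-Rate-Limit-Limit": "350",
    "X-Rate-Limit-Remaining": "339",
    "X-Rate-Limit-Interval": "5 minutes",
    "Content-Type": "application/json",
}

# Convert the numeric headers before doing arithmetic with them
limit = int(headers["X-Rate-Limit-Limit"])
remaining = int(headers["X-Rate-Limit-Remaining"])
used = limit - remaining

print(f"Used {used} of {limit} requests per {headers['X-Rate-Limit-Interval']}")

# Check the media type before deciding how to parse the body
is_json = headers.get("Content-Type", "").startswith("application/json")
print(is_json)
```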

#### Parameters
As we mentioned above, sometimes we can pass **parameters** to an API endpoint, similar to when we pass parameters to a Python function.

In the example above, we didn't use any parameters in the endpoint `satellites` since the [documentation](https://wheretheiss.at/w/developer) said *Parameters: None*.

API parameters are specific values that you include in a request to an API endpoint to filter, sort, or detail the data that you want to retrieve. They allow you to customize the request to get exactly the information you need.

There are several types of parameters that can be used in API requests, such as Path Parameters, Query Parameters, Header Parameters and Request Body Parameters. Let's look at them with an example using another endpoint.

##### 1. **Path Parameters**

These are embedded in the URL path and are used to identify a specific resource. For example, in the URL above `https://api.wheretheiss.at/v1/satellites/25544`, the number `25544` is a path parameter that identifies a specific satellite.

**Endpoint satellites/id**

Let's use the `satellites/[id]` endpoint with the id we got from the previous endpoint.
Note that the whole URL must be a `string`, so we convert the numeric id with `str()` before concatenating it.

In [90]:
url_iss_position = "https://api.wheretheiss.at/v1/satellites/"+str(info[0]["id"])

In [78]:
response = requests.get(url_iss_position)

In [79]:
response.status_code

200

In [80]:
response.json()

{'name': 'iss',
 'id': 25544,
 'latitude': 25.063465304875,
 'longitude': -33.610288992219,
 'altitude': 413.22240301161,
 'velocity': 27608.258342002,
 'visibility': 'daylight',
 'footprint': 4472.8200926487,
 'timestamp': 1721497282,
 'daynum': 2460512.2370602,
 'solar_lat': 20.462692938223,
 'solar_lon': 276.26994170687,
 'units': 'kilometers'}

Every time we call the endpoint, the information changes since it gives us the current position of the satellite. Let's check that with a loop to gather information at different times.

In [104]:
import time

positions = []

for i in range(10):
    response = requests.get(url_iss_position)
    data = response.json()
    
    positions.append(data)
    time.sleep(0.5)

In [105]:
type(positions) # We created a list with responses

list

In [106]:
type(positions[0]) # Each element of the list is a dictionary

dict

In [107]:
positions[0] # So we can access each dictionary by its position in the list

{'name': 'iss',
 'id': 25544,
 'latitude': 39.625501639604,
 'longitude': -39.541678044165,
 'altitude': 413.95044846502,
 'velocity': 27616.039832108,
 'visibility': 'daylight',
 'footprint': 4476.5571044671,
 'timestamp': 1721503186,
 'daynum': 2460512.3053935,
 'solar_lat': 20.449470447273,
 'solar_lon': 251.67075991282,
 'units': 'kilometers'}

In [108]:
positions[9]

{'name': 'iss',
 'id': 25544,
 'latitude': 40.009412439537,
 'longitude': -38.90522186263,
 'altitude': 413.98911663732,
 'velocity': 27616.194753021,
 'visibility': 'daylight',
 'footprint': 4476.7554797502,
 'timestamp': 1721503196,
 'daynum': 2460512.3055093,
 'solar_lat': 20.449448028754,
 'solar_lon': 251.62909451338,
 'units': 'kilometers'}

Let's extract 10 latitudes from the 10 dictionaries in the list, with 2 decimals.

In [109]:
latitudes = [round(p["latitude"], 2) for p in positions]

In [110]:
latitudes

[39.63, 39.66, 39.7, 39.74, 39.78, 39.82, 39.89, 39.93, 39.97, 40.01]

In [111]:
import pandas as pd
pd.DataFrame(positions) # We can create a dataframe from the API response to work with it better

Unnamed: 0,name,id,latitude,longitude,altitude,velocity,visibility,footprint,timestamp,daynum,solar_lat,solar_lon,units
0,iss,25544,39.625502,-39.541678,413.950448,27616.039832,daylight,4476.557104,1721503186,2460512.0,20.44947,251.67076,kilometers
1,iss,25544,39.664071,-39.478372,413.954296,27616.055513,daylight,4476.576842,1721503187,2460512.0,20.449468,251.666593,kilometers
2,iss,25544,39.702601,-39.41499,413.958147,27616.071152,daylight,4476.596602,1721503188,2460512.0,20.449466,251.662427,kilometers
3,iss,25544,39.741093,-39.351531,413.962004,27616.08675,daylight,4476.616386,1721503189,2460512.0,20.449464,251.65826,kilometers
4,iss,25544,39.779543,-39.288,413.965864,27616.102305,daylight,4476.636191,1721503190,2460512.0,20.449461,251.654094,kilometers
5,iss,25544,39.817954,-39.224393,413.969729,27616.117817,daylight,4476.656019,1721503191,2460512.0,20.449459,251.649927,kilometers
6,iss,25544,39.894657,-39.096952,413.977471,27616.148717,daylight,4476.695739,1721503193,2460512.0,20.449455,251.641594,kilometers
7,iss,25544,39.932948,-39.033119,413.981349,27616.164104,daylight,4476.715631,1721503194,2460512.0,20.449453,251.637428,kilometers
8,iss,25544,39.9712,-38.96921,413.985231,27616.179449,daylight,4476.735545,1721503195,2460512.0,20.44945,251.633261,kilometers
9,iss,25544,40.009412,-38.905222,413.989117,27616.194753,daylight,4476.75548,1721503196,2460512.0,20.449448,251.629095,kilometers
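
The `timestamp` values are hard to read as they are. Assuming they are Unix timestamps in seconds (UTC), which is what the values above suggest, a minimal sketch with the standard library converts them to readable dates and measures how long the loop took. The two numbers are copied from `positions[0]` and `positions[9]` above.

```python
from datetime import datetime, timezone

# Timestamps copied from positions[0] and positions[9] above
first_ts = 1721503186
last_ts = 1721503196

# Interpret the number as seconds since 1970-01-01 UTC (an assumption about this API)
first_time = datetime.fromtimestamp(first_ts, tz=timezone.utc)
elapsed = last_ts - first_ts

print(first_time.isoformat())
print(f"{elapsed} seconds between the first and last reading")
```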



##### 2. **Query Parameters**

These are added to the end of the URL after a question mark (`?`) and are often used to filter or sort the response. 

For example, in the ISS documentation, it mentions that there is a parameter `units` that can take values `miles` or `kilometers`. So in the URL `https://api.wheretheiss.at/v1/satellites/25544?units=miles`, we added `units=miles` which is a query parameter that shows the data in "miles". 

In general, we add the parameters like this `?param1=value1&param2=value2...` at the end of the URL.

In [119]:
url_iss_position

'https://api.wheretheiss.at/v1/satellites/25544'

In [120]:
# We saw in the documentation that we can add a parameter units to use miles or kilometers

url_iss_position2 = 'https://api.wheretheiss.at/v1/satellites/25544?units=miles'

In [121]:
response = requests.get(url_iss_position2)

In [122]:
response.status_code

200

In [123]:
response.json()

{'name': 'iss',
 'id': 25544,
 'latitude': -13.366485042487,
 'longitude': 101.22157902734,
 'altitude': 262.41358485593,
 'velocity': 17128.322341678,
 'visibility': 'eclipsed',
 'footprint': 2808.1101704677,
 'timestamp': 1721505392,
 'daynum': 2460512.3309259,
 'solar_lat': 20.444523081233,
 'solar_lon': 242.47939603187,
 'units': 'miles'}

We can also pass the parameters as a dictionary through the `params` argument of the `get` method, and `requests` builds the query string for us.

In [125]:
parameters = {"units": "miles"}

In [126]:
url_iss_position

'https://api.wheretheiss.at/v1/satellites/25544'

In [None]:
response = requests.get(
    url=url_iss_position,
    params=parameters
)

In [None]:
response.json()
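
To show what a query string looks like once it is built from a dictionary, here is a rough sketch with the standard library's `urlencode`. It illustrates the idea behind the `params` argument and is not the internal implementation of `requests`. The `q` parameter in the second example is hypothetical and does not belong to the ISS API.

```python
from urllib.parse import urlencode

base_url = "https://api.wheretheiss.at/v1/satellites/25544"
parameters = {"units": "miles"}

# Join the base URL and the encoded parameters with "?"
full_url = f"{base_url}?{urlencode(parameters)}"
print(full_url)

# Special characters such as spaces are escaped automatically
encoded = urlencode({"q": "space station", "page": 2})
print(encoded)
```

Letting a library build the query string avoids mistakes with `?`, `&` and characters that need escaping.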

##### 3. **Header Parameters**

These are included in the request header and can be used for various purposes, such as authentication (e.g., sending an API key or token), content negotiation (e.g., defining the response format), or custom settings defined by the API. If the API requires authentication, you might include an `Authorization` header with your API key.

The ISS API documentation states that *currently there is no authentication required*, so we'll look at `authentication` through `header parameters` using a different API.

#### Request Headers
Request headers are key-value pairs sent in an HTTP request to provide information about the request itself.

Some request headers are:

1. **Content-Type**: Specifies the media type of the data sent in the request body. Common examples include "application/json" for JSON data, "text/html" for HTML content, and "application/xml" for XML data.

2. **Authorization**: Contains credentials for authenticating the client with the server, often used with tokens or other forms of authentication.

3. **User-Agent**: Provides information about the client (browser or other client), including its version and operating system.


In Python's `requests` library, you can include headers in a request by using the `headers` argument, like this:

```python
headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer YOUR_TOKEN'}
response = requests.get(url, headers=headers)
```

⚠️🚨 Be careful with the authentication token! ⚠️🚨  A token is a personal credential for accessing an API, and your request quota is tracked against it. The safest approach is to store the token in a `.env` file and read it as an environment variable, so you can use it in your code without ever displaying it publicly.

**Example: News API**

The *News API* lets you locate articles and breaking news headlines from news sources and blogs across the web.

Let's look at the [NewsAPI Authentication documentation](https://newsapi.org/docs/authentication). It mentions:

"You can attach your API key to a request in one of three ways:

- Via the apiKey querystring parameter.
- Via the X-Api-Key HTTP header.
- Via the Authorization HTTP header. Including Bearer is optional, and be sure not to base 64 encode it like you may have seen in other authentication tutorials.

We strongly recommend the either of last two so that your API key isn't visible to others in logs or via request sniffing.

If you don't append your API key correctly, or your API key is invalid, you will receive a 401 - Unauthorized HTTP error."

- Let's look at the documentation for the endpoints to see what **parameters** they accept: [NewsAPI endpoints](https://newsapi.org/docs/endpoints). We see we have the endpoint `/v2/top-headlines` – *returns breaking news headlines for countries, categories, and singular publishers. This is perfect for use with news tickers or anywhere you want to use live up-to-date news headlines.*
If we look at the [documentation of that endpoint](https://newsapi.org/docs/endpoints/top-headlines) we see we have more parameters we can add such as `country` or `category`.

In [129]:
url = "https://newsapi.org/v2/top-headlines"

In [130]:
response = requests.get(url)

In [131]:
response.status_code

401

Let's try a made up key, using the apiKey querystring parameter. As we read in the documentation, this is not the recommended way.

In [132]:
url = "https://newsapi.org/v2/top-headlines?country=us&apiKey=manolonodoesnthavekey"

In [133]:
url

'https://newsapi.org/v2/top-headlines?country=us&apiKey=manolonodoesnthavekey'

In [134]:
response = requests.get(url)

In [135]:
response.status_code

401

**4xx**: client error. This means the problem is on our side: the request did not include a valid API key.

Let's try again with a valid key. You should get your own API key through their website. Once we've saved our key, we'll send it via the X-Api-Key HTTP header.

##### Saving the API Key

Storing an API key directly in your code can expose sensitive information, especially if your code is publicly available (e.g., on a public GitHub repository). The best practice for saving and loading an API key in your code involves the following steps:

1. **Storing the API Key**:
    - **Use Environment Variables**: Store your API key in an environment variable on your system. This keeps the key out of your codebase and allows you to change it without altering your code.


    - **Create a .env File**: If you prefer, you can create a `.env` file (name it exactly `.env`, with nothing before the `.`) in your project directory to store the API key. Inside this file, you would have something like:
   
       ```
       API_KEY=your-api-key-here
       ```

        - **Add .env to .gitignore**: If you're using a version control system like Git, make sure to add the `.env` file to your `.gitignore` file. This prevents the `.env` file (and therefore your API key) from being uploaded to any public repositories. We'll talk about Git and .gitignore in more detail soon.

2. **Load the Key in Your Code**: 

    You can use libraries like `python-dotenv` to load the key into your code. 
    You would need to install `python-dotenv` first.
    ```python 
    !pip install python-dotenv

    from dotenv import load_dotenv
    import os

    load_dotenv()
    api_key = os.getenv("API_KEY")
    ```

       
    Now, `api_key` contains the value of your API key, and you can use it to authenticate your requests to the API.
       

Let's take the second approach: save the key in a `.env` file and load it into a variable.

Make sure your file is named `.env` and not `.env.txt`! This is one of the most common errors. If needed, check the file's properties to make sure the name doesn't end in `.txt`, since the extension may be hidden when you look at the folder.

In [136]:
#!pip install python-dotenv



In [150]:
import os
from dotenv import load_dotenv

load_dotenv()

my_key = os.getenv("API_KEY")

If the key doesn't load (for example, `my_key` is `None`), make sure you saved the `.env` file in the same directory as your Jupyter notebook. You can use `os.getcwd()` to check the current working directory.

In [151]:
os.getcwd()

'/Users/milenko/My Drive (1307mile@gmail.com)/bootcamp/python/w2_data_wrangling_and_retrieval'
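
Because `os.getenv` silently returns `None` when a variable is missing, it can help to fail early with a clear message. The sketch below is one possible way to do that. The helper names `require_env` and `mask` and the variable `DEMO_API_KEY` are made up for this example, and the key value is a placeholder, not a real credential.

```python
import os

def require_env(name: str) -> str:
    """Return the environment variable or raise a clear error if it is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value

def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so a key can be printed safely."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]

# Placeholder value set here only so the example runs on its own;
# normally load_dotenv() would have put the real key in the environment
os.environ["DEMO_API_KEY"] = "abcd1234efgh5678"

key = require_env("DEMO_API_KEY")
print(mask(key))
```

Printing the masked value is a quick way to confirm the key loaded without exposing it in the notebook output.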

##### Using the API Key in the Header Parameters

In [152]:
url_top_headlines = "https://newsapi.org/v2/top-headlines"

In [153]:
parameters = {"country": "us", "category": "science"}

In [154]:
headers = {"X-Api-Key": my_key}

In [155]:
response = requests.get(
    url=url_top_headlines, 
    params=parameters, 
    headers=headers
)

In [156]:
response.status_code

200

In [157]:
data = response.json()

In [None]:
data

## APIs examples
### NewsAPI

In [160]:
type(data) # response.json() returns a dictionary

dict

In [161]:
data.keys() # we can get the keys with method keys()

dict_keys(['status', 'totalResults', 'articles'])

In [162]:
data.get("status") # we can get a key's value by using get method

'ok'

In [163]:
data["totalResults"] # or by using dictionary notation to access values

67

In [164]:
# this is the same as above
data.get("totalResults")

67

`dict[key]` raises a `KeyError` if the key does not exist  
`dict.get(key)` returns `None` instead of raising an error if the key does not exist
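
The difference between the two ways of accessing a key can be checked with a tiny example. The dictionary `sample` is a made-up, trimmed-down stand-in with the same top-level keys as the response above, and `message` is a key that deliberately does not exist in it.

```python
# Made-up stand-in for the parsed response, with the same top-level keys
sample = {"status": "ok", "totalResults": 67, "articles": []}

# .get() returns None (or a default you choose) when the key is missing
missing = sample.get("message")
fallback = sample.get("message", "n/a")
print(missing, fallback)

# Square brackets raise a KeyError when the key is missing
try:
    sample["message"]
except KeyError as err:
    print(f"KeyError: {err}")
```

Use square brackets when a missing key should stop the program, and `.get()` when a missing key is an expected case you want to handle.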

In [165]:
# let's look at the articles
articles = data.get("articles")

In [166]:
type(articles) # let's see what type it is to know how to access it

list

In [167]:
len(articles)

20

In [169]:
articles[0] # since articles is a list, we can access its elements by the index

{'source': {'id': 'google-news', 'name': 'Google News'},
 'author': 'Hackaday',
 'title': 'The Continuing Venusian Mystery Of Phosphine And Ammonia - Hackaday',
 'description': None,
 'url': 'https://news.google.com/rss/articles/CBMiWWh0dHBzOi8vaGFja2FkYXkuY29tLzIwMjQvMDcvMTkvdGhlLWNvbnRpbnVpbmctdmVudXNpYW4tbXlzdGVyeS1vZi1waG9zcGhpbmUtYW5kLWFtbW9uaWEv0gEA?oc=5',
 'urlToImage': None,
 'publishedAt': '2024-07-20T02:00:00Z',
 'content': None}

In [None]:
type(articles[0]) # each element inside the list is a dictionary

In [None]:
display([art["title"] for art in articles]) # let's get all the titles using a list comprehension

In [None]:
df = pd.DataFrame(articles) # let's create a dataframe with the articles

In [178]:
df.head()

Unnamed: 0,source,author,title,description,url,urlToImage,publishedAt,content
0,"{'id': 'google-news', 'name': 'Google News'}",Hackaday,The Continuing Venusian Mystery Of Phosphine A...,,https://news.google.com/rss/articles/CBMiWWh0d...,,2024-07-20T02:00:00Z,
1,"{'id': 'google-news', 'name': 'Google News'}",Ars Technica,Armada to Apophis—scientists recycle old ideas...,,https://news.google.com/rss/articles/CBMicWh0d...,,2024-07-20T00:26:42Z,
2,"{'id': 'google-news', 'name': 'Google News'}",Space.com,'An oasis in the desert': NASA's Curiosity rov...,,https://news.google.com/rss/articles/CBMiNmh0d...,,2024-07-20T00:00:01Z,
3,"{'id': None, 'name': 'ScienceAlert'}",Universe Today,Scientists Reveal Why Fire Is So Dangerous Dur...,Astronauts face multiple risks during space fl...,https://www.sciencealert.com/scientists-reveal...,https://www.sciencealert.com/images/2024/07/Fi...,2024-07-19T23:01:20Z,Astronauts face multiple risks during space fl...
4,"{'id': 'google-news', 'name': 'Google News'}",Mentalfloss,New Study Sheds Light on the Greenland Shark's...,,https://news.google.com/rss/articles/CBMiQGh0d...,,2024-07-19T21:03:00Z,


In [None]:
print(type(data)) # response.json() returns a dictionary
print(data.keys()) # dict_keys(['key', 'key2', ...])
print(data.get("status")) # should say "ok"
print(data["totalResults"]) # returns an integer with the number of results
print(data.get("totalResults")) # same as the previous line
# data["key"] raises a KeyError if the key does not exist; data.get("key") returns None instead

articles = data.get("articles")
print(type(articles)) # it's a list
print(len(articles)) # int, number of articles
display(articles[0]) # since articles is a list, we can access its elements by the index
print(type(articles[0])) # each element inside the list is a dictionary
display([art["title"] for art in articles]) # let's get all the titles using a list comprehension
df = pd.DataFrame(articles) # let's create a dataframe with the articles
df

#### Check for understanding 1

1. What data type is the column *source*?
2. Since the column *source* is not very useful as it is, create a column called *name* that contains only the name inside the column *source*.
3. How many times does each unique name appear in the *name* column?

In [242]:
# Step 1: The column "source" has the object dtype (each cell holds a dictionary)
df["source"].dtype

# Step 2:
df["name"] = df["source"].apply(lambda x: x["name"])

# Step 3: Google News appears 15 times. The other 5 unique names in the "name" column appear once each.
df["name"].value_counts()

name
Google News         15
ScienceAlert         1
SciTechDaily         1
Foxweather.com       1
Space.com            1
Associated Press     1
Name: count, dtype: int64

### Pokemon API
[PokeAPI](https://pokeapi.co/) is a full RESTful API linked to an extensive database detailing everything about the Pokémon main game series.

In the documentation we see that we can get lots of data. Let's look at the endpoint *pokemon*: ```https://pokeapi.co/api/v2/pokemon/{id or name}/``` 

In [244]:
res=requests.get('https://pokeapi.co/api/v2/pokemon/25')
res.status_code

200

In [None]:
data=res.json()
data

In [None]:
type(data)

In [246]:
data.keys()

dict_keys(['abilities', 'base_experience', 'cries', 'forms', 'game_indices', 'height', 'held_items', 'id', 'is_default', 'location_area_encounters', 'moves', 'name', 'order', 'past_abilities', 'past_types', 'species', 'sprites', 'stats', 'types', 'weight'])

In [None]:
# Let's look at the following attributes
print(data["name"])
print(data["weight"])
print(data["sprites"])

In [None]:
# Let's get the Pokémon image from sprites - front_default.
# Remember we can read the documentation to understand how to get all the information
data['sprites']['front_default']

We will define a function, `get_pokemon`, that takes an ID number as input. This function fetches details about a Pokémon from the PokeAPI using the given ID, extracts its name, weight, and front-facing image, and then returns this data in a dictionary format. If the ID is invalid or there's an issue with the request, a ValueError is raised.

In [253]:
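def get_pokemon(id_number):
    # Request the details of a single Pokémon by its ID
    res = requests.get(f"https://pokeapi.co/api/v2/pokemon/{id_number}")

    if res.status_code != 200:
        # Any non-200 status means the Pokémon could not be retrieved
        raise ValueError(f"Cannot get pokemon {id_number} (status code {res.status_code})")

    data = res.json()
    # Keep only the fields we are interested in
    return {
        "id": id_number,
        "name": data["name"],
        "weight": data["weight"],
        "image": data["sprites"]["front_default"],
    }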


In [263]:
get_pokemon(2) # Let's use the function for pokemon id 2

{'id': 2,
 'name': 'ivysaur',
 'weight': 130,
 'image': 'https://raw.githubusercontent.com/PokeAPI/sprites/master/sprites/pokemon/2.png'}
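
One design option worth considering is to separate the field extraction from the network call, so the extraction can be tried on a saved or hand-written payload without calling the API. The sketch below assumes a hypothetical helper named `extract_pokemon`. The sample payload is a trimmed copy of the values shown in the output above, not a full API response.

```python
def extract_pokemon(id_number, payload):
    """Pick the fields of interest out of an already-parsed response."""
    # Fall back to an empty dict so a missing "sprites" key does not raise
    sprites = payload.get("sprites") or {}
    return {
        "id": id_number,
        "name": payload["name"],
        "weight": payload["weight"],
        "image": sprites.get("front_default"),
    }

# Trimmed sample with the values from the output above
sample_payload = {
    "name": "ivysaur",
    "weight": 130,
    "sprites": {
        "front_default": "https://raw.githubusercontent.com/PokeAPI/sprites/master/sprites/pokemon/2.png"
    },
}

ivysaur = extract_pokemon(2, sample_payload)
print(ivysaur)
```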

In [None]:
[get_pokemon(i+1) for i in range(5)] # Let's use a list comprehension to gather information for the first 5 Pokémon

We will now define a function, `print_pokemon`, that takes in a Pokémon dictionary and displays its ID, name, weight, and its image. 

In [282]:
from IPython.display import Image

def print_pokemon(poke):
    # PokeAPI reports weight in hectograms, so we divide by 10 to get kilograms
    print(f'{poke["id"]} {poke["name"]} {poke["weight"] / 10} kg')
    display(Image(url=poke['image']))

After defining this function, we will retrieve and display details for the first 10 Pokémon by iterating through their IDs and using both the `get_pokemon` and `print_pokemon` functions. 

In [None]:
for i in range(10):
    print_pokemon(get_pokemon(i + 1))

#### json_normalize()
Want the data as a DataFrame instead of a dictionary?

The following line creates a DataFrame from the items of the dictionary data. Each key-value pair will be treated as a row, with two columns: the first for the key and the second for the value.

In [287]:
df = pd.DataFrame(data.items())
df

Unnamed: 0,0,1
0,abilities,"[{'ability': {'name': 'static', 'url': 'https:..."
1,base_experience,112
2,cries,{'latest': 'https://raw.githubusercontent.com/...
3,forms,"[{'name': 'pikachu', 'url': 'https://pokeapi.c..."
4,game_indices,"[{'game_index': 84, 'version': {'name': 'red',..."
5,height,4
6,held_items,"[{'item': {'name': 'oran-berry', 'url': 'https..."
7,id,25
8,is_default,True
9,location_area_encounters,https://pokeapi.co/api/v2/pokemon/25/encounters


That format is still very hard to work with. Let's use `json_normalize` instead. `json_normalize` is used to normalize semi-structured JSON data into a flat table (DataFrame). It's particularly useful when dealing with nested dictionaries or lists inside JSON.

In [288]:
df = pd.json_normalize(data)
df

Unnamed: 0,abilities,base_experience,forms,game_indices,height,held_items,id,is_default,location_area_encounters,moves,...,sprites.versions.generation-vi.x-y.front_shiny,sprites.versions.generation-vi.x-y.front_shiny_female,sprites.versions.generation-vii.icons.front_default,sprites.versions.generation-vii.icons.front_female,sprites.versions.generation-vii.ultra-sun-ultra-moon.front_default,sprites.versions.generation-vii.ultra-sun-ultra-moon.front_female,sprites.versions.generation-vii.ultra-sun-ultra-moon.front_shiny,sprites.versions.generation-vii.ultra-sun-ultra-moon.front_shiny_female,sprites.versions.generation-viii.icons.front_default,sprites.versions.generation-viii.icons.front_female
0,"[{'ability': {'name': 'static', 'url': 'https:...",112,"[{'name': 'pikachu', 'url': 'https://pokeapi.c...","[{'game_index': 84, 'version': {'name': 'red',...",4,"[{'item': {'name': 'oran-berry', 'url': 'https...",25,True,https://pokeapi.co/api/v2/pokemon/25/encounters,"[{'move': {'name': 'mega-punch', 'url': 'https...",...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...,https://raw.githubusercontent.com/PokeAPI/spri...


A lot better, even though we still have nested dictionaries or lists inside the dataframe.

Let's see again the difference between creating a dataframe directly or using `json_normalize` by using `data["abilities"]`.

In [289]:
pd.DataFrame(data["abilities"])

Unnamed: 0,ability,is_hidden,slot
0,"{'name': 'static', 'url': 'https://pokeapi.co/...",False,1
1,"{'name': 'lightning-rod', 'url': 'https://poke...",True,3


In [290]:
pd.json_normalize(data["abilities"])

Unnamed: 0,is_hidden,slot,ability.name,ability.url
0,False,1,static,https://pokeapi.co/api/v2/ability/9/
1,True,3,lightning-rod,https://pokeapi.co/api/v2/ability/31/
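
When a nested list should be flattened while some top-level fields are kept on every row, `json_normalize` also accepts the `record_path` and `meta` arguments. The sketch below uses a small hand-written dictionary that imitates the shape of the Pokémon response. The `sample` variable is an illustration, not the result of a real API call.

```python
import pandas as pd

# Hand-written sample imitating the structure of a PokeAPI response (illustrative only).
sample = {
    "id": 25,
    "name": "pikachu",
    "abilities": [
        {"ability": {"name": "static", "url": "https://pokeapi.co/api/v2/ability/9/"},
         "is_hidden": False, "slot": 1},
        {"ability": {"name": "lightning-rod", "url": "https://pokeapi.co/api/v2/ability/31/"},
         "is_hidden": True, "slot": 3},
    ],
}

# record_path: the nested list whose elements become the rows.
# meta: top-level fields that are repeated on every row.
abilities = pd.json_normalize(sample, record_path="abilities", meta=["id", "name"])
print(abilities)
```

Each ability becomes one row, and the `id` and `name` columns identify which Pokémon the row belongs to. This can be handy when stacking the abilities of several Pokémon into one table.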


### Coincap API
CoinCap is a useful tool for real-time pricing and market activity for over 1,000 cryptocurrencies. Check the documentation at https://docs.coincap.io/

In [293]:
url = "https://api.coincap.io/v2/assets"

In [294]:
response = requests.get(url)

In [295]:
print(response)

<Response [200]>


In [296]:
res = response.json()

In [297]:
res # it returns a dictionary with all the data

{'data': [{'id': 'bitcoin',
   'rank': '1',
   'symbol': 'BTC',
   'name': 'Bitcoin',
   'supply': '19728778.0000000000000000',
   'maxSupply': '21000000.0000000000000000',
   'marketCapUsd': '1319228591504.5431471673790992',
   'volumeUsd24Hr': '6003420936.7281208218121979',
   'priceUsd': '66868.2364160893871464',
   'changePercent24Hr': '0.3659635037207530',
   'vwap24Hr': '67107.6062863609713849',
   'explorer': 'https://blockchain.info/'},
  {'id': 'ethereum',
   'rank': '2',
   'symbol': 'ETH',
   'name': 'Ethereum',
   'supply': '120228315.0196019700000000',
   'maxSupply': None,
   'marketCapUsd': '420373002316.5477059037945645',
   'volumeUsd24Hr': '3362961605.6427862340324318',
   'priceUsd': '3496.4559076454683915',
   'changePercent24Hr': '0.0140414430950319',
   'vwap24Hr': '3508.8164903300620981',
   'explorer': 'https://etherscan.io/'},
  {'id': 'tether',
   'rank': '3',
   'symbol': 'USDT',
   'name': 'Tether',
   'supply': '114039204814.2764600000000000',
   'maxSupply

In [298]:
# Again, let's try to have it as a dataframe
data = pd.DataFrame(res["data"])
data.head()

Unnamed: 0,id,rank,symbol,name,supply,maxSupply,marketCapUsd,volumeUsd24Hr,priceUsd,changePercent24Hr,vwap24Hr,explorer
0,bitcoin,1,BTC,Bitcoin,19728778.0,21000000.0,1319228591504.5432,6003420936.728121,66868.23641608938,0.365963503720753,67107.60628636097,https://blockchain.info/
1,ethereum,2,ETH,Ethereum,120228315.01960196,,420373002316.54767,3362961605.6427865,3496.4559076454684,0.0140414430950319,3508.816490330062,https://etherscan.io/
2,tether,3,USDT,Tether,114039204814.27646,,114078496131.85724,9245961473.386803,1.0003445421918256,-0.0271766375295686,1.0000606435256971,https://www.omniexplorer.info/asset/31
3,binance-coin,4,BNB,BNB,166801148.0,166801148.0,99093851807.11057,167747133.96506384,594.0837517923471,0.4670870832091105,595.224204866078,https://etherscan.io/token/0xB8c77482e45F1F44d...
4,solana,5,SOL,Solana,464344297.8276351,,79935622843.31358,588774937.7373472,172.14731227944517,2.1459226997984047,172.7054765818152,https://explorer.solana.com/


In [299]:
# Let's see if the result would differ in this case using json_normalize
pd.json_normalize(res["data"])

Unnamed: 0,id,rank,symbol,name,supply,maxSupply,marketCapUsd,volumeUsd24Hr,priceUsd,changePercent24Hr,vwap24Hr,explorer
0,bitcoin,1,BTC,Bitcoin,19728778.0000000000000000,21000000.0000000000000000,1319228591504.5431471673790992,6003420936.7281208218121979,66868.2364160893871464,0.3659635037207530,67107.6062863609713849,https://blockchain.info/
1,ethereum,2,ETH,Ethereum,120228315.0196019700000000,,420373002316.5477059037945645,3362961605.6427862340324318,3496.4559076454683915,0.0140414430950319,3508.8164903300620981,https://etherscan.io/
2,tether,3,USDT,Tether,114039204814.2764600000000000,,114078496131.8572307310232719,9245961473.3868054151087385,1.0003445421918257,-0.0271766375295686,1.0000606435256973,https://www.omniexplorer.info/asset/31
3,binance-coin,4,BNB,BNB,166801148.0000000000000000,166801148.0000000000000000,99093851807.1105536609491928,167747133.9650638549674074,594.0837517923470986,0.4670870832091105,595.2242048660781204,https://etherscan.io/token/0xB8c77482e45F1F44d...
4,solana,5,SOL,Solana,464344297.8276351000000000,,79935622843.3135900085873853,588774937.7373472894292529,172.1473122794451635,2.1459226997984046,172.7054765818152106,https://explorer.solana.com/
...,...,...,...,...,...,...,...,...,...,...,...,...
95,dash,96,DASH,Dash,11884405.1590432800000000,18900000.0000000000000000,340145856.7254344690125687,22077713.7275997514513223,28.6211932506024506,2.0995164724862107,28.1978492009227983,https://explorer.dash.org
96,curve-dao-token,97,CRV,Curve DAO Token,1186524249.0000000000000000,,338099602.7424678093565260,5702525.7831264709045747,0.2849495937629740,-3.6766146169191875,0.2916920001161257,https://etherscan.io/token/0xD533a949740bb3306...
97,illuvium,98,ILV,Illuvium,4478644.4244490100000000,,335346156.0911554621475062,3354366.8059952497169590,74.8767091802363346,1.7445190776747192,75.4027594529248100,https://etherscan.io/token/0x767fe9edc9e0df98e...
98,superfarm,99,SUPER,SuperVerse,487776093.4169172600000000,1000000000.0000000000000000,334708141.4825705689618059,5646092.2584251917759571,0.6861921812073828,0.3358212147607654,0.6980781981493560,https://etherscan.io/token/0xe53ec727dbdeb9e2d...
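
The raw response above delivers numeric fields such as `priceUsd` as strings, so sorting or arithmetic may not behave numerically until those columns are converted. A minimal sketch of one way to do that with `pd.to_numeric`, using two rows copied from the output above (the `sample` list and the choice of columns are illustrative):

```python
import pandas as pd

# Two trimmed rows copied from the response above; every number arrives as a string.
sample = [
    {"id": "bitcoin", "rank": "1", "priceUsd": "66868.2364160893871464",
     "maxSupply": "21000000.0000000000000000"},
    {"id": "ethereum", "rank": "2", "priceUsd": "3496.4559076454683915",
     "maxSupply": None},
]

coins = pd.json_normalize(sample)
numeric_cols = ["rank", "priceUsd", "maxSupply"]

# errors="coerce" turns values that cannot be parsed into NaN instead of raising.
coins[numeric_cols] = coins[numeric_cols].apply(pd.to_numeric, errors="coerce")
print(coins.dtypes)
```

After the conversion, expressions such as `coins.sort_values("priceUsd")` compare numbers rather than text.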


### Api Jokes

JokeAPI is a REST API that serves uniformly formatted jokes.
It can be used without any API token, membership, registration or payment.
It supports a variety of filters that can be applied to get just the right jokes you need.
Check the documentation here: https://jokeapi.dev/ 

In [414]:
url_random_joke = "https://v2.jokeapi.dev/joke/dark?amount=1"
joke = requests.get(url_random_joke).json()  # parsed JSON body of the response
joke

{'error': False,
 'category': 'Dark',
 'type': 'single',
 'joke': 'Dark humor is like food, not everyone gets it.',
 'flags': {'nsfw': False,
  'religious': False,
  'political': False,
  'racist': True,
  'sexist': False,
  'explicit': True},
 'id': 162,
 'safe': False,
 'lang': 'en'}
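
The response above has `'type': 'single'`, with the text stored under the `joke` key. According to the JokeAPI documentation, jokes of type `twopart` carry `setup` and `delivery` fields instead. A small helper can handle both shapes; the sketch below assumes exactly those field names, and `format_joke` is a hypothetical name that does not appear in the notebook.

```python
def format_joke(payload: dict) -> str:
    """Return the joke text for either response type."""
    if payload.get("error"):
        raise ValueError("JokeAPI reported an error")
    if payload["type"] == "single":
        return payload["joke"]
    # Assumed shape for two-part jokes: separate setup and delivery fields.
    return f'{payload["setup"]} {payload["delivery"]}'


# Trimmed copy of the response shown above.
single = {
    "error": False,
    "type": "single",
    "joke": "Dark humor is like food, not everyone gets it.",
}
print(format_joke(single))
```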

#### Check for understanding 2
Choose one API from the [Public APIs list](https://github.com/public-apis/public-apis). Attempt to call your selected API, either with or without parameters, and retrieve some valuable information. Document your findings.


In [410]:
# Step 1: GET
url = "https://opentdb.com/api.php?amount=1&category=12&difficulty=easy&type=boolean"
response = requests.get(url)
type(response)
print(response.status_code)
print(response.content)
info = response.json()
display(info)
print(info.keys())
print(info["response_code"])
print(info["results"])
print(info["results"][0]["type"])

200
b'{"response_code":0,"results":[{"type":"boolean","difficulty":"easy","category":"Entertainment: Music","question":"The music group Daft Punk got their name from a negative review they recieved.","correct_answer":"True","incorrect_answers":["False"]}]}'


{'response_code': 0,
 'results': [{'type': 'boolean',
   'difficulty': 'easy',
   'category': 'Entertainment: Music',
   'question': 'The music group Daft Punk got their name from a negative review they recieved.',
   'correct_answer': 'True',
   'incorrect_answers': ['False']}]}

dict_keys(['response_code', 'results'])
0
[{'type': 'boolean', 'difficulty': 'easy', 'category': 'Entertainment: Music', 'question': 'The music group Daft Punk got their name from a negative review they recieved.', 'correct_answer': 'True', 'incorrect_answers': ['False']}]
boolean


In [413]:
# Step 2: Response headers
print(response.headers)
print(response.headers["Date"])
print(response.headers["Server"])
print(response.headers["Expires"])

{'Date': 'Sun, 21 Jul 2024 16:50:34 GMT', 'Server': 'Apache', 'Expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'Cache-Control': 'no-store, no-cache, must-revalidate', 'Pragma': 'no-cache', 'Access-Control-Allow-Origin': '*', 'Set-Cookie': 'PHPSESSID=d4480bafb0e9515d7329840fcfc0b457; path=/', 'Upgrade': 'h2', 'Connection': 'Upgrade, Keep-Alive', 'Vary': 'User-Agent', 'Strict-Transport-Security': 'max-age=31536000', 'Keep-Alive': 'timeout=5, max=100', 'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json'}
Sun, 21 Jul 2024 16:50:34 GMT
Apache
Thu, 19 Nov 1981 08:52:00 GMT


In [408]:
# Step 3: normalization, attempt 1
pd.json_normalize(info)

Unnamed: 0,response_code,results
0,0,"[{'type': 'boolean', 'difficulty': 'easy', 'ca..."


In [409]:
# Step 4: normalization, attempt 2
pd.json_normalize(info["results"])

Unnamed: 0,type,difficulty,category,question,correct_answer,incorrect_answers
0,boolean,easy,Entertainment: Music,Michael Jackson wrote The Simpsons song &quot;...,False,[True]


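Documentation of the steps:
- Step 1:
  - Accessed the "Open Trivia Database API".
  - Used the following parameters: `amount=1&category=12&difficulty=easy&type=boolean`
  - Meaning: 1 question, category 12 (Entertainment: Music), easy difficulty, answerable with True/False.
- Step 2:
  - Retrieved the response headers.
- Step 3:
  - Normalized the JSON data, but this wasn't enough because the value of the "results" key is a list of dictionaries, which stays nested inside a single cell.
- Step 4:
  - Normalized the value of the "results" key. Each question is now a row with one column per field. The question shown there differs from Step 1, presumably because that cell was run against a separate API call that returned another random question.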
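
The question text in Step 4 contains `&quot;`, an HTML entity for a double quote. A minimal sketch of decoding such entities with the standard library before building the DataFrame is shown here. It assumes that `response_code` 0 indicates success, as in the output above; `clean_results` and the sample question are made up for illustration.

```python
import html


def clean_results(payload: dict) -> list[dict]:
    """Return the results with HTML entities in each question decoded."""
    if payload["response_code"] != 0:
        raise ValueError(f"Unexpected response_code: {payload['response_code']}")
    # Copy each result and replace only the question with its decoded version.
    return [
        {**item, "question": html.unescape(item["question"])}
        for item in payload["results"]
    ]


# Hand-written payload in the same shape as the API response (illustrative only).
sample = {
    "response_code": 0,
    "results": [
        {"type": "boolean", "question": "Is &quot;Thriller&quot; an album?", "correct_answer": "True"}
    ],
}
print(clean_results(sample)[0]["question"])
```

The cleaned list can then be passed to `pd.json_normalize` in the same way as `info["results"]`.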

## API Wrappers

A Python wrapper is a Python library or module that provides a more convenient or more "Pythonic" interface to another software component, such as a library in another language, a system tool, or a web API. It "wraps" the functionality of that component in a way that abstracts away its complexities and makes it easier to use in a Python context. 🙌🏻

One example is the `tweepy` library, which makes obtaining data from Twitter's API relatively straightforward.
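
To make the idea concrete, here is a minimal sketch of what a hand-rolled wrapper around the Pokémon endpoint used earlier might look like. `PokeClient` is a hypothetical class written for illustration, not part of any published library, and the fields it returns are an arbitrary choice.

```python
import requests


class PokeClient:
    """Tiny illustrative wrapper around the PokeAPI pokemon endpoint."""

    BASE_URL = "https://pokeapi.co/api/v2"

    def __init__(self, session=None, timeout=10):
        # A custom session can be injected, which also makes the class easy to test.
        self.session = session if session is not None else requests.Session()
        self.timeout = timeout

    def pokemon(self, pokemon_id):
        """Fetch one Pokémon and return only a few selected fields."""
        url = f"{self.BASE_URL}/pokemon/{pokemon_id}"
        response = self.session.get(url, timeout=self.timeout)
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
        data = response.json()
        return {"id": data["id"], "name": data["name"], "weight": data["weight"]}


# Usage (performs a real HTTP request):
# client = PokeClient()
# client.pokemon(25)
```

Callers no longer need to know the URL layout or the shape of the JSON; they call a method and receive a small dictionary. Wrappers like `tweepy` apply the same idea on a much larger scale.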

## Summary
- Import the `requests` library.
- Store the necessary credentials, such as API key or OAuth tokens, if the API requires them.
- Execute a `requests.get` call to the desired API endpoint (the API's documentation will provide details on available endpoints).
- The API typically returns a JSON response, which `response.json()` parses into Python objects.
- This JSON can be converted into a dataframe, or you can further explore its elements (typically a list of dictionaries).
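
The steps above can be sketched as one small helper. This is an illustration under assumptions rather than a universal recipe: the `Authorization: Bearer ...` header is only one common scheme (each API's documentation states the one it expects), and `build_headers`, `get_json`, and the environment variable name are hypothetical.

```python
import os

import requests


def build_headers(api_key_env=None):
    """Build request headers, reading the API key from an environment variable."""
    if api_key_env is None:
        return {}
    # Keeping the key in the environment avoids hard-coding it in the notebook.
    return {"Authorization": f"Bearer {os.environ[api_key_env]}"}


def get_json(url, params=None, api_key_env=None, timeout=10):
    """Send a GET request and return the parsed JSON body."""
    response = requests.get(
        url, params=params, headers=build_headers(api_key_env), timeout=timeout
    )
    response.raise_for_status()  # stop early on 4xx/5xx instead of parsing an error page
    return response.json()


# Usage (performs a real HTTP request):
# get_json("https://pokeapi.co/api/v2/pokemon/25")
```

Passing `timeout` and calling `raise_for_status()` are optional, but they help a notebook fail quickly and visibly when an API is slow or rejects the request.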

## Glossary
* DNS: Domain Name System; translates domain names into IP addresses.
* HTTP: the protocol used to transfer data over the web.
* API: Application Programming Interface.
* REST: an architectural style (a set of rules) for designing web APIs.
* `requests`: Python library for sending HTTP requests, commonly used to call APIs.
* URL: the address of the resource you are requesting from a server.
* endpoint: the specific service (URL path) on the server that you send a request to.
* parameters: extra key-value pairs added to a request to refine the query.
* headers: metadata sent along with a request or response, separate from the body.

## Further materials
[5 Simple-To-Use APIs for Beginners](https://dev.to/alanconstantino/5-simple-to-use-apis-for-beginners-2e0n)

[RapidAPI](https://rapidapi.com/category/Sports): access thousands of APIs.