# Lesson 6.3 - APIs

Useful links:
- [Python and REST APIs: Interacting With Web Services](https://realpython.com/api-integration-in-python/)
- [Python API Tutorial](https://www.geeksforgeeks.org/python-api-tutorial-getting-started-with-apis/)

### 6.3.1. What is an API?

An API (*Application Programming Interface*) is a set of rules, protocols, and tools that lets one piece of software use the data or services of another. APIs are a crucial part of most modern software. In this lesson, we will learn about APIs and how to use them in Python.

A RESTful API is an interface that two computer systems use to exchange information securely over the internet. Most business applications have to communicate with other internal and third-party applications to perform various tasks.

Representational State Transfer (REST) is a software architecture that imposes conditions on how an API should work. REST was initially created as a guideline to manage communication on a complex network like the internet. 

### 6.3.2. Requests Module

- In order to use APIs in Python, we need to import the `requests` module. Requests is a Python module that makes it easy to send HTTP requests using Python.
- This is not a built-in module. You need to install it first.
- You can install it with the following command: `pip install requests`
- You can check if it is installed with the following command: `pip show requests`
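The same check can also be done from inside Python. The following is a minimal sketch using only the standard library; the helper name `describe_package` is made up for this illustration.

```python
from importlib import metadata, util


def describe_package(name):
    """Return a short message saying whether `name` can be imported."""
    if util.find_spec(name) is None:
        return f"{name} is not installed; run: pip install {name}"
    try:
        version = metadata.version(name)  # version recorded by pip
    except metadata.PackageNotFoundError:
        version = "unknown"  # importable, but not installed through pip
    return f"{name} is installed (version {version})"


print(describe_package("requests"))
print(describe_package("json"))  # built-in module, so pip has no record of it
```

If the first line reports that `requests` is not installed, run the `pip install` command above and try again.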

In [2]:
!pip install requests

Collecting requests
  Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting charset-normalizer<4,>=2 (from requests)
  Downloading charset_normalizer-3.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests)
  Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests)
  Downloading urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests)
  Downloading certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
Downloading charset_normalizer-3.3.2-cp312-c

### 6.3.3. JSON File

- Before we start making API calls, we need to understand JSON. JSON is a lightweight, text-based data-interchange format that is easy for humans to read and write. It is not a Python data type; it is a format for storing data and transferring it between computers.
- JSON stands for JavaScript Object Notation.
- It is a very popular format, and its uses go well beyond the scope of this lesson.
- Structurally, a JSON document is a combination of objects (which Python reads as dictionaries) and arrays (which Python reads as lists).
- Let us see an example of a JSON file (`sample.json`).

In [3]:
!cat data/sample.json

[
  {
    "id": 1,
    "name": "John Doe",
    "email": "john.doe@example.com",
    "age": 25,
    "city": "New York"
  },
  {
    "id": 2,
    "name": "Jane Smith",
    "email": "jane.smith@example.com",
    "age": 30,
    "city": "Chicago"
  },
  {
    "id": 3,
    "name": "Mike Johnson",
    "email": "mike.johnson@example.com",
    "age": 40,
    "city": "San Francisco"
  },
  {
    "id": 4,
    "name": "Emma Wilson",
    "email": "emma.wilson@example.com",
    "age": 28,
    "city": "Los Angeles"
  },
  {
    "id": 5,
    "name": "David Lee",
    "email": "david.lee@example.com",
    "age": 35,
    "city": "Austin"
  }
]




- As you can see, this data structure is a list of dictionaries.
- Let us see how to read it in Python. For that, we need to import the `json` module.

In [4]:
import json  # module needed for decoding/encoding JSON data

with open("data/sample.json", "r") as f:
    data = json.load(f)

print(data)

[{'id': 1, 'name': 'John Doe', 'email': 'john.doe@example.com', 'age': 25, 'city': 'New York'}, {'id': 2, 'name': 'Jane Smith', 'email': 'jane.smith@example.com', 'age': 30, 'city': 'Chicago'}, {'id': 3, 'name': 'Mike Johnson', 'email': 'mike.johnson@example.com', 'age': 40, 'city': 'San Francisco'}, {'id': 4, 'name': 'Emma Wilson', 'email': 'emma.wilson@example.com', 'age': 28, 'city': 'Los Angeles'}, {'id': 5, 'name': 'David Lee', 'email': 'david.lee@example.com', 'age': 35, 'city': 'Austin'}]


- To make the output more readable, we can use the `pprint` (pretty print) module.

In [6]:
from pprint import pprint

pprint(data)

[{'age': 25,
  'city': 'New York',
  'email': 'john.doe@example.com',
  'id': 1,
  'name': 'John Doe'},
 {'age': 30,
  'city': 'Chicago',
  'email': 'jane.smith@example.com',
  'id': 2,
  'name': 'Jane Smith'},
 {'age': 40,
  'city': 'San Francisco',
  'email': 'mike.johnson@example.com',
  'id': 3,
  'name': 'Mike Johnson'},
 {'age': 28,
  'city': 'Los Angeles',
  'email': 'emma.wilson@example.com',
  'id': 4,
  'name': 'Emma Wilson'},
 {'age': 35,
  'city': 'Austin',
  'email': 'david.lee@example.com',
  'id': 5,
  'name': 'David Lee'}]
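To see how JSON values turn into Python objects and back, here is a small sketch. The JSON text is made up for illustration; `json.loads` and `json.dumps` are the string-based counterparts of `json.load` and `json.dump`.

```python
import json

# Made-up JSON text that covers the basic JSON value types
text = '{"name": "John Doe", "age": 25, "active": true, "nickname": null, "tags": ["a", "b"]}'

record = json.loads(text)  # JSON text -> Python objects

# show which Python type each JSON value became
for key, value in record.items():
    print(f"{key}: {value!r} ({type(value).__name__})")

# Python objects -> JSON text; indent=2 produces a readable layout
print(json.dumps(record, indent=2))
```

Note that JSON `true` becomes Python `True` and JSON `null` becomes `None`.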


If we wish to access a specific piece of data, we can use the `data` variable with the usual notation for indexing lists and dictionaries.

In [11]:
# fetch name of the first person
print(data[0]["name"])  # 0 is the index of the first person, and `name` is the key of the person's name

# fetch the names and the emails of all the people
for person in data:
    print(person["name"], person["email"])

John Doe
John Doe john.doe@example.com
Jane Smith jane.smith@example.com
Mike Johnson mike.johnson@example.com
Emma Wilson emma.wilson@example.com
David Lee david.lee@example.com
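The same notation combines well with list comprehensions when we only want some of the records. The sketch below uses a shortened inline copy of the sample data so that it runs without the file; the `phone` key is hypothetical and deliberately missing.

```python
import json

# Shortened inline copy of the sample data
text = """
[
  {"id": 1, "name": "John Doe", "age": 25, "city": "New York"},
  {"id": 2, "name": "Jane Smith", "age": 30, "city": "Chicago"},
  {"id": 3, "name": "Mike Johnson", "age": 40, "city": "San Francisco"}
]
"""
people = json.loads(text)

# names of everyone aged 30 or older
names_30_plus = [p["name"] for p in people if p["age"] >= 30]
print(names_30_plus)

# find one person by id; next() falls back to None if nobody matches
person = next((p for p in people if p["id"] == 2), None)
print(person["city"] if person else "not found")

# .get() avoids a KeyError when a key may be missing
print(people[0].get("phone", "no phone on record"))
```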


### 6.3.4. API URL data

Let us see how we can fetch data from an online API. We will use [**Open Trivia Database**](https://opentdb.com/) as an example.

Let us interact first with the data directly from the website.
- Click on the link above and open the **API** tab in the menu, or enter **https://opentdb.com/api_config.php** directly in the address bar of your browser.
- This is the page from which we can request the data.
![Open Trivia Database](data/open-trivia-db.png)

Here we can select different parameters to get different data. If we just click on the **Generate Api Url** button, it will generate a URL that will fetch 10 random questions. You can also change the parameters and generate the URL.

![Generated URL with questions data](data/generated-url.png)

The resulting url is `https://opentdb.com/api.php?amount=10`
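Everything after the `?` in this URL is a query string of `name=value` parameters. As a rough sketch, such a URL can be built and taken apart with the standard library. The `amount` parameter comes from the generated URL above; `difficulty` and `type` are assumed parameter names, so generate a URL with those options on the website to confirm them.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "https://opentdb.com/api.php"

# `amount` appears in the generated URL; the other two names are assumptions
params = {"amount": 5, "difficulty": "easy", "type": "boolean"}

# urlencode joins the pairs with "&" and escapes special characters
url = f"{base_url}?{urlencode(params)}"
print(url)

# the other direction: read the parameters out of an existing URL
query = parse_qs(urlsplit("https://opentdb.com/api.php?amount=10").query)
print(query)
```

`parse_qs` returns each value as a list of strings, because a parameter may appear more than once in a URL.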

- Copy the link, paste it into the address bar of your browser, and press Enter. The data is fetched and displayed in the browser as JSON.
- Copy the contents shown in the browser (the JSON text). Then create an empty file called `questions.json` in the `data` folder (the code in the next section reads `data/questions.json`) and paste the contents inside.
- Save the file.

- Now, let us open the [JSON viewer](https://jsonviewer.stack.hu/) website and paste the contents into its text field (it says `Paste JSON code here...`).
- Once the data is pasted, click the `Viewer` tab. The data is then displayed in a form that mirrors the JSON structure.
![Json Viewer](data/json-viewer.png)

Here we can more easily study the structure of the data that we will be later accessing via Python.

### 6.3.5. Accessing JSON data from a file

Let us work with the file `questions.json` that we created in the previous step. In the following steps, we will use the `json` module to access the data, read the data and go through it.

In [11]:
import json
from pprint import pprint

with open("data/questions.json", "r") as f:
    questions = json.load(f)

# pprint(questions)
pprint(questions['results'][0])  # first question record
pprint(questions['results'][0]['question'])  # first question record - question

pprint(questions['results'][0]['incorrect_answers'])  # first question record - incorrect answers

pprint(questions['results'][0]['correct_answer'])  # first question record - correct answer

{'category': 'Entertainment: Video Games',
 'correct_answer': 'Mike Harrington',
 'difficulty': 'medium',
 'incorrect_answers': ['Robin Walker', 'Marc Laidlaw', 'Stephen Bahl'],
 'question': 'Along with Gabe Newell, who co-founded Valve?',
 'type': 'multiple'}
'Along with Gabe Newell, who co-founded Valve?'
['Robin Walker', 'Marc Laidlaw', 'Stephen Bahl']
'Mike Harrington'
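For a quiz program, the correct and incorrect answers usually need to be merged and shuffled before they are shown. Here is one possible sketch, using the first record from the output above copied inline; the helper names `build_options` and `is_correct` are made up for this example.

```python
import random

# First question record from the output above, copied inline
question = {
    "category": "Entertainment: Video Games",
    "correct_answer": "Mike Harrington",
    "difficulty": "medium",
    "incorrect_answers": ["Robin Walker", "Marc Laidlaw", "Stephen Bahl"],
    "question": "Along with Gabe Newell, who co-founded Valve?",
    "type": "multiple",
}


def build_options(record):
    """Return all answers of a record in random order, leaving the record unchanged."""
    options = record["incorrect_answers"] + [record["correct_answer"]]  # new list
    random.shuffle(options)
    return options


def is_correct(record, answer):
    """Check a chosen answer against the record's correct answer."""
    return answer == record["correct_answer"]


options = build_options(question)
print(question["question"])
for number, option in enumerate(options, start=1):
    print(f"{number}. {option}")

print(is_correct(question, "Mike Harrington"))
```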


### 6.3.6. Accessing JSON data from a website with Python
Now, we will use Python to access the data from the website.
1. First, we need to import the `requests` module.
2. Next, we need to import the `json` module.
3. Now, we can use the `requests` module to fetch the data from the website.
4. We are fetching the data from the website and storing the data in a variable called `response`.
5. Then we need to check whether the data was fetched successfully, that is, whether the status code is 200 (OK). If it is not 200, we need to handle the error.
   - A request can fail for different reasons: we may have made a mistake in our request, there may be a problem with the website, and so on.
   - Here are some common status codes and their meanings:
     - `200`: OK
     - `400`: Bad Request
     - `401`: Unauthorized
     - `403`: Forbidden
     - `404`: Not Found
     - `500`: Internal Server Error
     - `502`: Bad Gateway
     - `503`: Service Unavailable
     - `504`: Gateway Timeout
     - `505`: HTTP Version Not Supported
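     - `520`: Unknown Error (non-standard; returned by some proxies such as Cloudflare)
6. If we have successfully fetched the data, we need to parse it from JSON text into Python objects. The `response.json()` method does this for us; the `json` module is then used to save the data to a file.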
7. Once the data is parsed, we can access the data we need in a way that is already known to us.
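The status codes listed in step 5 do not have to be memorized: Python's standard library knows the standard ones. The sketch below looks up the official phrase for a code and groups codes by their first digit; the function name `describe_status` is made up for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Non-standard code"  # e.g. 520 is not defined by the HTTP standard

    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"  # usually a problem with our request
    elif 500 <= code < 600:
        kind = "server error"  # usually a problem on the website's side
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


for code in (200, 404, 503, 520):
    print(describe_status(code))
```

As a rule of thumb, codes starting with 4 point to a mistake in the request, while codes starting with 5 point to a problem on the server.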

In [15]:
# import necessary modules
import requests
import json
from pprint import pprint

# fetch data
url = "https://opentdb.com/api.php?amount=10"  # this is the URL of the API endpoint that we already got from the website 

response = requests.get(url, timeout=10)  # fetch data from the URL; stop waiting if the server is silent for 10 seconds

if response.status_code == 200:  # if the response is successful
    questions = response.json()  # we are decoding the JSON data
    pprint(questions)
    pprint(questions['results'][0]['question'])
    pprint(questions['results'][0]['correct_answer'])
    pprint(questions['results'][0]['incorrect_answers'])
    pprint(questions['results'][0]['type'])

    # we can also decode the JSON data and save it to a file
    with open("API-questions.json", "w") as f:
        json.dump(questions, f)
    print("Questions saved to API-questions.json")
else:
    print(f"Error: {response.status_code}")



{'response_code': 0,
 'results': [{'category': 'Entertainment: Video Games',
              'correct_answer': 'True',
              'difficulty': 'easy',
              'incorrect_answers': ['False'],
              'question': 'In &quot;Super Mario 3D World&quot;, the Double '
                          'Cherry power-up originated from a developer '
                          'accidentally making two characters controllable.',
              'type': 'boolean'},
             {'category': 'Animals',
              'correct_answer': '28',
              'difficulty': 'easy',
              'incorrect_answers': ['30', '26', '24'],
              'question': 'How many teeth does an adult rabbit have?',
              'type': 'multiple'},
             {'category': 'Entertainment: Music',
              'correct_answer': 'Def Leppard',
              'difficulty': 'medium',
              'incorrect_answers': ['The Beatles',
                                    'Lynyrd Skynyrd',
                           

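Notice the `&quot;` in the first question of the output: the text contains HTML entities instead of plain quotation marks. One way to clean this up is `html.unescape` from the standard library. This is a sketch, assuming the text fields of a record are `question`, `correct_answer`, and `incorrect_answers` as in the output above; the helper `unescape_record` is made up for this example.

```python
import html

# Question text copied from the output above
raw = (
    "In &quot;Super Mario 3D World&quot;, the Double Cherry power-up "
    "originated from a developer accidentally making two characters controllable."
)

clean = html.unescape(raw)  # &quot; becomes "
print(clean)


def unescape_record(record):
    """Return a copy of a question record with HTML entities decoded."""
    cleaned = dict(record)
    cleaned["question"] = html.unescape(record["question"])
    cleaned["correct_answer"] = html.unescape(record["correct_answer"])
    cleaned["incorrect_answers"] = [html.unescape(a) for a in record["incorrect_answers"]]
    return cleaned


record = {"question": raw, "correct_answer": "True", "incorrect_answers": ["False"]}
print(unescape_record(record)["question"])
```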
### 6.3.7. Accessing API data from a website with Python and authentication

**Different APIs**
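- Public API: open to anyone, like the Open Trivia Database used above
- Private API: intended only for use inside an organization
- API with authentication: we must prove who we are, for example with an API key
- API with authorization: what we may do depends on the permissions of our account
- API with rate limiting: the number of requests in a given period is capped
- API with caching: repeated requests may be answered from stored responses
- API with pagination: large results are returned one page at a time

A single API often combines several of these features, for example authentication together with rate limiting.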

- We will be going through the same steps as in the previous section, but this time with authentication. Many APIs that require authentication expect credentials such as a username and password or, as in the example below, an API key sent in the request headers. We will again use the `requests` module.

The platform that we will be using is [Rapid API](https://rapidapi.com/), a marketplace that hosts many different APIs. In this lesson, we will be using the IMDb API.

![Rapid API](data/rapidapi.png)

As you can see, we will be using the free subscription, which limits us to 500 requests per month. This is important to keep in mind, especially if our program runs constantly, for example as part of a web page.

Once you have your account set up, go to the [IMDb API on Rapid API](https://rapidapi.com/Glavier/api/imdb146/) and choose `subscribe to test` for the free plan.
![Imdb API](data/imdb-api.png)

- Rapid API is very developer friendly. Go to the page of the IMDb API and select your programming language; the site then shows code that you can use to access the API, together with your access keys.

![RAPID APII](data/rapid-api-code.gif)
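Because the generated code contains your personal key, it is easy to leak it by sharing the notebook. A common precaution is to read the key from an environment variable instead. The following is a minimal sketch; the variable name `RAPIDAPI_KEY` and the helper `build_headers` are assumptions made for this example, not something Rapid API prescribes.

```python
import os


def build_headers(host, environ=os.environ, env_var="RAPIDAPI_KEY"):
    """Build the request headers, reading the API key from the environment."""
    key = environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"X-RapidAPI-Key": key, "X-RapidAPI-Host": host}


# Demo with a fake environment so that the sketch runs anywhere;
# in real use, call build_headers("imdb146.p.rapidapi.com") without `environ`.
fake_environ = {"RAPIDAPI_KEY": "xxxxxxxxxxx"}
headers = build_headers("imdb146.p.rapidapi.com", environ=fake_environ)
print(sorted(headers))
```

The resulting dictionary can be passed to `requests.get` through the `headers` argument.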


In [30]:
# code from Rapid API
import requests
from pprint import pprint

url = "https://imdb146.p.rapidapi.com/v1/find/"

querystring = {"query": "Die Hard"}

headers = {
    "X-RapidAPI-Key": "xxxxxxxxxxx",  # replace with your own key
    "X-RapidAPI-Host": "imdb146.p.rapidapi.com"
}

response = requests.get(url, headers=headers, params=querystring)

pprint(response.json())

{'companyResults': {'results': []},
 'findPageMeta': {'includeAdult': False,
                  'isExactMatch': False,
                  'searchTerm': 'die hard'},
 'keywordResults': {'results': []},
 'nameResults': {'hasExactMatches': True,
                 'nextCursor': 'eyJlc1Rva2VuIjpbIjIzMzQuNzY5NSIsIm5tNzQ3NDY2NyJdLCJmaWx0ZXIiOiJ7XCJpbmNsdWRlQWR1bHRcIjpmYWxzZSxcImlzRXhhY3RNYXRjaFwiOmZhbHNlLFwic2VhcmNoVGVybVwiOlwiZGllIGhhcmRcIixcInR5cGVcIjpbXCJOQU1FXCJdfSJ9',
                 'results': [{'akaName': 'Die Hard',
                              'displayNameText': 'Jonathan McLain',
                              'id': 'nm9590276',
                              'knownForJobCategory': 'Producer',
                              'knownForTitleText': 'She',
                              'knownForTitleYear': '2021'},
                             {'akaName': 'DHL aka "Die Hard Leon"',
                              'displayNameText': 'Leon Lee',
                              'id': 'nm6092181',
 

In [31]:
response.status_code

200

In [41]:
titles = response.json()['titleResults']['results']  # list of title records
# sort by release text; the default is '' so that records without it still compare as strings
titles_sorted = [title['titleNameText'] for title in sorted(titles, key=lambda x: x.get('titleReleaseText', ''))]

pprint(titles_sorted)

['Die Hard',
 'Die Hard 2',
 'Die Hard with a Vengeance',
 'Live Free or Die Hard',
 'A Good Day to Die Hard']
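API results are not always complete, so a record may lack the field we sort by. The sketch below shows one way to place such records last. The sample data is made up in the shape of the `titleResults` records; the release values are assumed to be year strings, and the entry without a release text is hypothetical.

```python
# Made-up sample in the shape of response.json()['titleResults']['results']
titles = [
    {"titleNameText": "Live Free or Die Hard", "titleReleaseText": "2007"},
    {"titleNameText": "Die Hard", "titleReleaseText": "1988"},
    {"titleNameText": "Untitled project"},  # hypothetical record with no release text
    {"titleNameText": "Die Hard 2", "titleReleaseText": "1990"},
]


def release_key(title):
    """Sort key: records with a release text first (in order), the rest last."""
    text = title.get("titleReleaseText")
    return (text is None, text or "")


titles_sorted = [t["titleNameText"] for t in sorted(titles, key=release_key)]
print(titles_sorted)
```

The key is a tuple, so Python compares the "is missing" flag first and the release text second.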
