# Simple APIs
## Random User and Fruityvice API Examples


Estimated time needed: **25** minutes

## Objectives

After completing this lab you will be able to:

*   Load and use the RandomUser API with the `randomuser` Python library
*   Load and use the Fruityvice API with the `requests` Python library

---

The purpose of this notebook is to provide more examples of how to use simple APIs. API stands for **Application Programming Interface**; it is a software intermediary that allows two applications to talk to each other.

Advantage(s) of using APIs:

- **Automation**
  - Less human effort is required
  - Workflows can easily be updated to become faster and more productive
- **Efficiency**
  - Reusing the capabilities of an existing API is usually faster than implementing the same functionality from scratch.

Disadvantage(s) of using APIs:

- **Security**
  - A poorly integrated API can be vulnerable to attacks, leading to data breaches or losses with financial or reputational consequences.

One of the applications we will use in this notebook is the Random User Generator. RandomUser is a free, open-source API that provides developers with randomly generated users to use as placeholders for testing. This makes the tool similar to Lorem Ipsum, except that it supplies placeholder people instead of placeholder text. The API can return multiple results per request and lets the caller specify details of the generated users, such as gender, email, image, username, address, title, and first and last name. See the [documentation](https://randomuser.me/documentation).
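
The options mentioned above are passed to the Random User API as URL query parameters. The following is a minimal sketch, assuming the `results`, `gender`, and `inc` parameters described in the linked documentation. The helper name `build_randomuser_url` is hypothetical, and the URL is only built here, not requested.

```python
from urllib.parse import urlencode

BASE_URL = "https://randomuser.me/api/"

def build_randomuser_url(results=1, gender=None, include=None):
    """Build a Random User API URL (hypothetical helper; no request is sent)."""
    params = {"results": results}
    if gender is not None:
        params["gender"] = gender
    if include:
        # `inc` takes a comma-separated list of the fields to return
        params["inc"] = ",".join(include)
    # safe="," keeps the commas readable instead of encoding them as %2C
    return BASE_URL + "?" + urlencode(params, safe=",")

url = build_randomuser_url(results=5, gender="female", include=["name", "email"])
print(url)
```

The `randomuser` library used below hides this step, so building the URL by hand is only useful when calling the API directly.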

Another simple API we will use in this notebook is Fruityvice. The Fruityvice web service provides data on all kinds of fruit, so we can use it to look up interesting information about a given fruit. The web service is completely free to use and contribute to.

## API 01: Random User Generator

### Package Installation

To start using the API, we need to install the `randomuser` library. We can use either shell commands (OS dependent) or magic commands (provided by the IPython kernel) in Jupyter Notebook cells.

Check the environment in use before running the command cells below.
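
One way to check the environment is to ask the kernel which interpreter it is running. This is a small sketch using only the standard library; building the pip command from `sys.executable` is a suggestion for avoiding an OS-specific name, not something the lab requires.

```python
import sys

# Path of the interpreter that runs this kernel; packages must be
# installed into this interpreter's environment to be importable here.
print(sys.executable)
print(sys.version_info[:3])

# Building the pip command from sys.executable avoids hard-coding
# an OS-specific name such as python.exe.
pip_command = [sys.executable, "-m", "pip", "install", "--upgrade", "pip"]
print(" ".join(pip_command))
```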

In [1]:
## Upgrade Pip
!python.exe -m pip install --upgrade pip



If we try the magic-command form (`%python.exe -m pip install --upgrade pip`) instead of the shell command above, IPython raises a `UsageError`, because `%python.exe` is not a registered magic.

If the current notebook runs in an Anaconda environment, use `conda install`. This may raise `PackagesNotFoundError`, which means that Conda cannot find the package in the active channels (see [Managing channels](https://docs.anaconda.com/navigator/tutorials/manage-channels/#)). Two possible solutions are:

1. Find a channel that contains the package
2. Switch the kernel to the Python interpreter of the environment you intend to use

In [2]:
## Install package with Conda
## 1) Platform dependent shell commands
# !conda install randomuser
## 2) Magic commands provided by IPython kernel
# %conda install randomuser

In [3]:
## Install package with Pip
## 1) Platform dependent shell commands
# !pip install randomuser
## 2) Magic commands provided by IPython kernel
%pip install randomuser

Note: you may need to restart the kernel to use updated packages.


The shell command `!pip install randomuser` produces the error below when it runs in the wrong environment. To fix it, switch the kernel to the Python interpreter of the environment you intend to use.

![shell-command-error-pip-install.png](../images/shell-command-error-pip-install.png)

Make sure that both `randomuser` and `pandas` are installed in the environment.
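
A quick way to verify this from inside the notebook is to ask the import system whether each module can be found. This is a sketch with a hypothetical helper, `is_installed`; it only checks that the module is locatable by the current interpreter, not that it imports without errors.

```python
from importlib.util import find_spec

def is_installed(module_name):
    """Return True if the module can be located by the current interpreter."""
    return find_spec(module_name) is not None

for name in ("randomuser", "pandas"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```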

### Brief Introduction

Random User Generator API is used to generate random user data for application testing. For more information, see [Documentation for the Random User Generator API](https://randomuser.me/documentation).

#### Method Overview

For details on the RandomUser class and optional parameters for these methods, see the [documentation](https://connordelacruz.com/python-randomuser/randomuser.html).

#### Getter Methods

- `get_cell()`
- `get_city()`
- `get_dob()`
- `get_email()`
- `get_first_name()`
- `get_full_name()`
- `get_gender()`
- `get_id()`
- `get_id_number()`
- `get_id_type()`
- `get_info()`
- `get_last_name()`
- `get_login_md5()`
- `get_login_salt()`
- `get_login_sha1()`
- `get_login_sha256()`
- `get_nat()`
- `get_password()`
- `get_phone()`
- `get_picture()`
- `get_postcode()`
- `get_registered()`
- `get_state()`
- `get_street()`
- `get_username()`
- `get_zipcode()`

### Example

We will load the necessary libraries.

In [4]:
from randomuser import RandomUser
import pandas as pd

Create a random user object.

In [5]:
## Generate a single user
user = RandomUser()
print(user)

<randomuser.RandomUser object at 0x0000018265ABC2B0>


We can also get a list of random users using `generate_users()`.

In [6]:
## Generate multiple users
users = user.generate_users(10)
print(users)

[<randomuser.RandomUser object at 0x0000018265AC7AC0>, <randomuser.RandomUser object at 0x0000018265AC7CD0>, <randomuser.RandomUser object at 0x00000182659C2BE0>, <randomuser.RandomUser object at 0x0000018265AC2CA0>, <randomuser.RandomUser object at 0x0000018265AC2C70>, <randomuser.RandomUser object at 0x0000018265AC2490>, <randomuser.RandomUser object at 0x0000018265AC2C40>, <randomuser.RandomUser object at 0x0000018265AC2E50>, <randomuser.RandomUser object at 0x0000018265AC2E20>, <randomuser.RandomUser object at 0x0000018265AC2D60>]


The getter methods listed above return the individual fields needed to build a dataset. For example, to get the full name, we call `get_full_name()`.

In [7]:
## Get user's full name
f_name = user.get_full_name()
print(f_name)

Jeremy Walker
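
Each getter presumably reads its value from the JSON record that the Random User API returns for a user. The sketch below illustrates that idea, assuming the response layout described in the API documentation (a `results` list whose entries have a nested `name` object). The sample record is hand-written for illustration, and this is not the library's actual implementation.

```python
import json

# Abbreviated sample in the shape of a Random User API response;
# the values are made up for illustration.
sample_response = """
{
  "results": [
    {
      "gender": "male",
      "name": {"title": "Mr", "first": "Jeremy", "last": "Walker"},
      "email": "jeremy.walker@example.com"
    }
  ]
}
"""

# Take the first (and only) user record and join first and last name
record = json.loads(sample_response)["results"][0]
full_name = f"{record['name']['first']} {record['name']['last']}"
print(full_name)
```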


Let's say we only need 10 users with full names and their email addresses. We can write a for-loop to print them.

In [8]:
## Get multiple users' info
for user in users:
    print(user.get_full_name(), "\t", user.get_email())

Filipe Costa 	 filipe.costa@example.com
Nenad Denis 	 nenad.denis@example.com
Deborah Vasquez 	 deborah.vasquez@example.com
David Campos 	 david.campos@example.com
Bill Allen 	 bill.allen@example.com
Aitor Guerrero 	 aitor.guerrero@example.com
Aapo Jokinen 	 aapo.jokinen@example.com
Zachary Andersen 	 zachary.andersen@example.com
Nanna Jensen 	 nanna.jensen@example.com
Samiha De Leest 	 samiha.deleest@example.com


To generate a table with information about the users, we can write a function that collects the desired fields, e.g., name, gender, city, etc. Which fields to include depends on the requirements of the test to be performed. We call the getter methods listed under Getter Methods above, then return a pandas DataFrame with the users.

In [9]:
def get_users():
    users = []
    for user in RandomUser.generate_users(10):
        users.append({
            "Name": user.get_full_name(), "Gender": user.get_gender(),
            "City": user.get_city(), "State": user.get_state(),
            "Email": user.get_email(), "Date of Birth": user.get_dob(),
            "Picture": user.get_picture(),
        })
    return pd.DataFrame(users)

In [10]:
get_users()

| | Name | Gender | City | State | Email | Date of Birth | Picture |
|---|---|---|---|---|---|---|---|
| 0 | Emilija Blom | female | Bryne | Bergen | emilija.blom@example.com | 1976-08-16T21:22:09.569Z | https://randomuser.me/api/portraits/women/72.jpg |
| 1 | Mark James | male | Orange | New South Wales | mark.james@example.com | 1998-02-05T10:16:57.029Z | https://randomuser.me/api/portraits/men/73.jpg |
| 2 | Noora Folgerø | female | Nesjestranda | Sør-Trøndelag | noora.folgero@example.com | 1978-10-12T05:44:59.351Z | https://randomuser.me/api/portraits/women/44.jpg |
| 3 | Berit Strunz | female | Bad König | Sachsen | berit.strunz@example.com | 1988-01-03T09:01:32.627Z | https://randomuser.me/api/portraits/women/41.jpg |
| 4 | Aleksi Ahola | male | Juupajoki | Satakunta | aleksi.ahola@example.com | 1990-03-29T11:04:05.146Z | https://randomuser.me/api/portraits/men/69.jpg |
| 5 | Philip Andersen | male | Øster Assels | Syddanmark | philip.andersen@example.com | 1987-01-15T11:43:45.202Z | https://randomuser.me/api/portraits/men/51.jpg |
| 6 | عرشيا کریمی | male | اردبیل | بوشهر | aarshy.khrymy@example.com | 1990-03-08T18:28:59.242Z | https://randomuser.me/api/portraits/men/1.jpg |
| 7 | Naomi Jackson | female | Upper Hutt | Canterbury | naomi.jackson@example.com | 1961-08-03T08:17:52.096Z | https://randomuser.me/api/portraits/women/39.jpg |
| 8 | Ali Başoğlu | male | Gümüşhane | Ardahan | ali.basoglu@example.com | 1987-09-04T18:14:11.325Z | https://randomuser.me/api/portraits/men/59.jpg |
| 9 | Andres Marquez | male | Torrente | Melilla | andres.marquez@example.com | 1944-11-05T06:20:05.592Z | https://randomuser.me/api/portraits/men/91.jpg |


In [11]:
df_users = get_users()  # get_users() already returns a DataFrame

Now we have a Pandas dataframe that can be used for any testing purposes that the tester might have.
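
Note that the "Date of Birth" values in the output above are timestamp strings such as `1976-08-16T21:22:09.569Z`, not date objects. A test that needs ages or date comparisons would have to parse them first. The sketch below does this with the standard library; `parse_dob` is a hypothetical helper, and it assumes every value follows the format seen in that output.

```python
from datetime import datetime, timezone

def parse_dob(value):
    """Parse a timestamp such as '1976-08-16T21:22:09.569Z' as UTC."""
    # The trailing 'Z' marks UTC; it is matched literally and the
    # timezone is attached explicitly afterwards.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

dob = parse_dob("1976-08-16T21:22:09.569Z")
print(dob.date())
```

For a whole column, `pd.to_datetime(df_users["Date of Birth"])` performs the equivalent conversion in pandas.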

## API 02: Fruityvice

Another, more common way to use APIs is through the `requests` library. We will obtain the [Fruityvice](https://fruityvice.com) API data using the `requests.get()` function. The data is in JSON format.

In [12]:
import requests
import json

In [13]:
## Get a web response
data = requests.get("https://fruityvice.com/api/fruit/all")
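
The request above does not check whether it succeeded, and it will wait indefinitely if the server never answers. A more defensive variant might look like the sketch below. `fetch_json` is a hypothetical helper, the 10-second timeout is an arbitrary choice, and the `get` parameter exists only so the function can be exercised without network access.

```python
def fetch_json(url, get=None, timeout=10):
    """Fetch a URL and return the decoded JSON body (hypothetical helper).

    `get` defaults to requests.get; any callable with the same
    (url, timeout=...) interface can be substituted for testing.
    """
    if get is None:
        import requests
        get = requests.get
    response = get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.json()       # decode the JSON body into Python objects

# Usage against the live service (requires network access):
# fruits = fetch_json("https://fruityvice.com/api/fruit/all")
```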

We will parse the response body using the `json.loads()` function.

In [14]:
## Retrieve data in JSON format
results = json.loads(data.text)
print(results)

[{'genus': 'Malus', 'name': 'Apple', 'id': 6, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 11.4, 'protein': 0.3, 'fat': 0.4, 'calories': 52, 'sugar': 10.3}}, {'genus': 'Prunus', 'name': 'Apricot', 'id': 35, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 3.9, 'protein': 0.5, 'fat': 0.1, 'calories': 15, 'sugar': 3.2}}, {'genus': 'Persea', 'name': 'Avocado', 'id': 84, 'family': 'Lauraceae', 'order': 'Laurales', 'nutritions': {'carbohydrates': 8.53, 'protein': 2, 'fat': 14.66, 'calories': 160, 'sugar': 0.66}}, {'genus': 'Musa', 'name': 'Banana', 'id': 1, 'family': 'Musaceae', 'order': 'Zingiberales', 'nutritions': {'carbohydrates': 22, 'protein': 1, 'fat': 0.2, 'calories': 96, 'sugar': 17.2}}, {'genus': 'Rubus', 'name': 'Blackberry', 'id': 64, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 9, 'protein': 1.3, 'fat': 0.4, 'calories': 40, 'sugar': 4.5}}, {'genus': 'Fragaria', 'name': 'Blueberry', 'id': 33, 'fam

We will convert our JSON data into a pandas DataFrame.

In [15]:
## JSON to Pandas DataFrame
print(results)
df_results = pd.DataFrame(results)
print(df_results)

[{'genus': 'Malus', 'name': 'Apple', 'id': 6, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 11.4, 'protein': 0.3, 'fat': 0.4, 'calories': 52, 'sugar': 10.3}}, {'genus': 'Prunus', 'name': 'Apricot', 'id': 35, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 3.9, 'protein': 0.5, 'fat': 0.1, 'calories': 15, 'sugar': 3.2}}, {'genus': 'Persea', 'name': 'Avocado', 'id': 84, 'family': 'Lauraceae', 'order': 'Laurales', 'nutritions': {'carbohydrates': 8.53, 'protein': 2, 'fat': 14.66, 'calories': 160, 'sugar': 0.66}}, {'genus': 'Musa', 'name': 'Banana', 'id': 1, 'family': 'Musaceae', 'order': 'Zingiberales', 'nutritions': {'carbohydrates': 22, 'protein': 1, 'fat': 0.2, 'calories': 96, 'sugar': 17.2}}, {'genus': 'Rubus', 'name': 'Blackberry', 'id': 64, 'family': 'Rosaceae', 'order': 'Rosales', 'nutritions': {'carbohydrates': 9, 'protein': 1.3, 'fat': 0.4, 'calories': 40, 'sugar': 4.5}}, {'genus': 'Fragaria', 'name': 'Blueberry', 'id': 33, 'fam

The result is in a nested JSON format. The "nutritions" column contains multiple sub-columns so the data needs to be "flattened" or normalized.

In [16]:
df_norm = pd.json_normalize(results)
print(df_norm)

           genus          name  id           family             order  \
0          Malus         Apple   6         Rosaceae           Rosales   
1         Prunus       Apricot  35         Rosaceae           Rosales   
2         Persea       Avocado  84        Lauraceae          Laurales   
3           Musa        Banana   1         Musaceae      Zingiberales   
4          Rubus    Blackberry  64         Rosaceae           Rosales   
5       Fragaria     Blueberry  33         Rosaceae           Rosales   
6         Prunus        Cherry   9         Rosaceae           Rosales   
7      Vaccinium     Cranberry  87        Ericaceae          Ericales   
8   Selenicereus   Dragonfruit  80        Cactaceae    Caryophyllales   
9          Durio        Durian  60        Malvaceae          Malvales   
10    Sellowiana        Feijoa  76        Myrtaceae        Myrtoideae   
11         Ficus           Fig  68         Moraceae           Rosales   
12         Ribes    Gooseberry  69  Grossulariaceae
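
To make the idea of flattening concrete, the sketch below flattens a single record by hand, joining nested keys with a dot in the same way as the `nutritions.calories` style column names. This is a plain-Python illustration of the concept, not how `pd.json_normalize` is implemented; the `flatten` helper is hypothetical and the record is the Apple entry from the response printed earlier.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the key prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

apple = {
    "genus": "Malus", "name": "Apple", "id": 6,
    "family": "Rosaceae", "order": "Rosales",
    "nutritions": {"carbohydrates": 11.4, "protein": 0.3, "fat": 0.4,
                   "calories": 52, "sugar": 10.3},
}
print(flatten(apple))
```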

Let's extract some information from this dataframe. Suppose we need to know the family and genus of a cherry.

In [17]:
cherry = df_norm.loc[df_norm["name"] == 'Cherry']
(cherry.iloc[0]['family']), (cherry.iloc[0]['genus'])

('Rosaceae', 'Prunus')
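
Note that `cherry.iloc[0]` raises an `IndexError` when no row matches the name. If a missing fruit is a realistic case, the lookup can be guarded. The sketch below shows the idea on a plain list of records, with a hypothetical `find_fruit` helper and a two-item sample taken from the data above.

```python
fruits = [
    {"name": "Apple", "family": "Rosaceae", "genus": "Malus"},
    {"name": "Cherry", "family": "Rosaceae", "genus": "Prunus"},
]

def find_fruit(fruits, name):
    """Return the first record whose name matches, or None if absent."""
    return next((fruit for fruit in fruits if fruit["name"] == name), None)

cherry = find_fruit(fruits, "Cherry")
print(cherry["family"], cherry["genus"])

missing = find_fruit(fruits, "Durian")
print(missing)  # None, because the sample list has no such record
```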

## Exercises

### Exercise 1

1. Generate photos of 5 random users.

In [18]:
## TODO Unable to do this
for user in users:
    print(user.get_picture())

https://randomuser.me/api/portraits/men/13.jpg
https://randomuser.me/api/portraits/men/15.jpg
https://randomuser.me/api/portraits/women/22.jpg
https://randomuser.me/api/portraits/men/21.jpg
https://randomuser.me/api/portraits/men/25.jpg
https://randomuser.me/api/portraits/men/29.jpg
https://randomuser.me/api/portraits/men/26.jpg
https://randomuser.me/api/portraits/men/78.jpg
https://randomuser.me/api/portraits/women/72.jpg
https://randomuser.me/api/portraits/women/7.jpg
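
The loop above prints the picture URLs for all 10 users, while the exercise asks for 5, and it shows links, not images. One possible way to handle both is sketched below: limit the list with a slice and build `<img>` tags from the URLs. The `photo_gallery` helper and its `limit` and `width` parameters are hypothetical; the URLs are the first six from the output above.

```python
from html import escape

def photo_gallery(urls, limit=5, width=100):
    """Build an HTML snippet with one <img> tag per URL (hypothetical helper)."""
    tags = [f'<img src="{escape(url)}" width="{width}">' for url in urls[:limit]]
    return "\n".join(tags)

urls = [
    "https://randomuser.me/api/portraits/men/13.jpg",
    "https://randomuser.me/api/portraits/men/15.jpg",
    "https://randomuser.me/api/portraits/women/22.jpg",
    "https://randomuser.me/api/portraits/men/21.jpg",
    "https://randomuser.me/api/portraits/men/25.jpg",
    "https://randomuser.me/api/portraits/men/29.jpg",
]
html = photo_gallery(urls)
print(html)

# In a notebook, the snippet can be rendered with:
# from IPython.display import HTML; HTML(html)
```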


<details><summary>Click here for the solution</summary>

```python
for user in users:
    print(user.get_picture())
```

</details>

### Exercise 2

1. Find out how many calories are contained in a banana.

In [19]:
banana_cal = df_norm.loc[df_norm["name"] == "Banana"]["nutritions.calories"]
print(banana_cal)

3    96
Name: nutritions.calories, dtype: int64


In [20]:
banana_cal = df_norm.loc[df_norm["name"] == "Banana", ["nutritions.calories"]]
print(banana_cal)

   nutritions.calories
3                   96


In [21]:
banana_cal = df_norm.loc[df_norm["name"] == "Banana"].iloc[0]['nutritions.calories']
print(banana_cal)

96


<details><summary>Click here for the solution</summary>

```python
cal_banana = df_norm.loc[df_norm["name"] == 'Banana']
cal_banana.iloc[0]['nutritions.calories']
```

</details>

### Exercise 3

This [page](https://github.com/public-apis/public-apis#public-apis) contains a list of free public APIs. Choose any API of your interest and use it to load/extract some information, as shown in the example above.

1. Use the `requests.get()` function to load data.

In [22]:
import requests
import json

In [23]:
# response = requests.get("http://api.open-notify.org/astros.json")
response = requests.get("https://api.exchangerate.host/latest")
print(response.status_code)

200


<details><summary>Click here for the solution</summary>

```python
data2 = requests.get("https://www.fishwatch.gov/api/species")
```

</details>

2. Retrieve results using the `json.loads()` function.

In [24]:
data = json.loads(response.text)
print(data)

{'motd': {'msg': 'If you or your company use this project or like what we doing, please consider backing us so we can continue maintaining and evolving this project.', 'url': 'https://exchangerate.host/#/donate'}, 'success': True, 'base': 'EUR', 'date': '2023-03-12', 'rates': {'AED': 3.907246, 'AFN': 93.192666, 'ALL': 114.096641, 'AMD': 410.57577, 'ANG': 1.90813, 'AOA': 539.893146, 'ARS': 213.586603, 'AUD': 1.616374, 'AWG': 1.916729, 'AZN': 1.809231, 'BAM': 1.956389, 'BBD': 2.12754, 'BDT': 111.596487, 'BGN': 1.950822, 'BHD': 0.401189, 'BIF': 2201.032205, 'BMD': 1.064841, 'BND': 1.434559, 'BOB': 7.314978, 'BRL': 5.55004, 'BSD': 1.064472, 'BTC': 5.2e-05, 'BTN': 86.822272, 'BWP': 14.078384, 'BYN': 2.672117, 'BZD': 2.134502, 'CAD': 1.475142, 'CDF': 2207.234449, 'CHF': 0.97952, 'CLF': 0.031247, 'CLP': 847.5732, 'CNH': 7.382481, 'CNY': 7.348301, 'COP': 5024.277453, 'CRC': 580.325192, 'CUC': 1.064427, 'CUP': 27.394718, 'CVE': 110.227635, 'CZK': 23.627513, 'DJF': 188.493697, 'DKK': 7.442021, '

<details><summary>Click here for the solution</summary>

```python
results = json.loads(data2.text)
```

</details>

3. Convert JSON data into a pandas DataFrame.

In [25]:
df = pd.DataFrame(data)
print(df)

                                                  motd  success base  \
msg  If you or your company use this project or lik...     True  EUR   
url                 https://exchangerate.host/#/donate     True  EUR   
AED                                                NaN     True  EUR   
AFN                                                NaN     True  EUR   
ALL                                                NaN     True  EUR   
..                                                 ...      ...  ...   
XPT                                                NaN     True  EUR   
YER                                                NaN     True  EUR   
ZAR                                                NaN     True  EUR   
ZMW                                                NaN     True  EUR   
ZWL                                                NaN     True  EUR   

           date       rates  
msg  2023-03-12         NaN  
url  2023-03-12         NaN  
AED  2023-03-12    3.907246  
AFN  2023-03-12
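
The frame above is hard to read because the response mixes scalar fields (`success`, `base`, `date`) with nested dicts (`motd`, `rates`), so pandas uses the union of the nested keys as the index. One way to get a tidier table is to build one row per currency from the `rates` dict first. The sketch below uses an abbreviated copy of the response printed earlier, keeping only two of the rates.

```python
# Abbreviated copy of the response printed earlier (only two rates kept)
data = {
    "success": True,
    "base": "EUR",
    "date": "2023-03-12",
    "rates": {"AED": 3.907246, "AFN": 93.192666},
}

# One row per currency, with the shared fields repeated on each row
rows = [
    {"base": data["base"], "date": data["date"], "currency": code, "rate": rate}
    for code, rate in data["rates"].items()
]
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame()` then yields one row per currency with four columns.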

<details><summary>Click here for the solution</summary>

```python
df = pd.DataFrame(results)
df
```

</details>

---

Author(s):

- [Svitlana Kramar](https://www.linkedin.com/in/svitlana-kramar)
  - Svitlana is a master’s student in Data Science and Analytics at the University of Calgary who enjoys travelling, learning new languages and cultures, and sharing her passion for Data Science.

Other Contributor(s):

- N/A