1. Required Imports

In [30]:
import json
import requests
import requests.exceptions
import csv
import pandas as pd

Below, change pathname to the directory where all files created by this code should be saved. Keep the trailing slash, since file names are appended directly to it.

In [19]:
pathname = "/Users/noahmcintire/Documents/DS 3002/"
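Because the later cells build file paths by string concatenation, a missing or doubled slash is easy to introduce. As an optional alternative, the sketch below builds the same file names with pathlib. The temporary directory is a hypothetical stand-in for pathname so the cell can run anywhere; it is not part of the original notebook.

```python
from pathlib import Path
import tempfile

# Hypothetical stand-in for `pathname`: a temporary directory keeps the sketch runnable anywhere.
base_dir = Path(tempfile.mkdtemp())

# The / operator joins path components with the correct separator.
json_path = base_dir / "fruits.json"
csv_path = base_dir / "fruit.csv"

# Path normalizes a trailing slash, so both spellings give the same result.
same_path = Path(str(base_dir) + "/") / "fruit.csv"
print(csv_path == same_path)
```

Both open() and DataFrame.to_csv() accept Path objects, so these values could be passed in place of the concatenated strings.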

2. API response function from 08 Python-API-Data
This includes several try/except blocks, which catch request errors such as an invalid URL and return a descriptive message instead of crashing (rubric ii).

In [20]:
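def get_api_response(url, response_type):
    """Fetch url and return the body as a JSON string or a DataFrame.

    If the request fails, a descriptive error string is returned instead.
    """
    try:
        response = requests.get(url)
        # Raise an HTTPError for 4xx/5xx status codes.
        response.raise_for_status()

    # Most specific exceptions first; RequestException is the base class.
    except requests.exceptions.HTTPError as errh:
        return "An Http Error occurred: " + repr(errh)
    except requests.exceptions.ConnectionError as errc:
        return "An Error Connecting to the API occurred: " + repr(errc)
    except requests.exceptions.Timeout as errt:
        return "A Timeout Error occurred: " + repr(errt)
    except requests.exceptions.RequestException as err:
        return "An Unknown Error occurred: " + repr(err)

    if response_type == 'json':
        # Pretty-printed JSON string with keys sorted alphabetically.
        result = json.dumps(response.json(), sort_keys=True, indent=4)
    elif response_type == 'dataframe':
        # Nested objects are flattened into dot-separated columns.
        result = pd.json_normalize(response.json())
    else:
        # Reached when response_type is neither 'json' nor 'dataframe'.
        result = "An unhandled error has occurred!"

    return result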

3. Fetching Data from a pre-defined data source.

In [21]:
fruit_url = "https://www.fruityvice.com/api/fruit/all"
response_type = "json"

In [22]:
json_string = get_api_response(fruit_url, response_type)
print(json_string)

[
    {
        "family": "Rosaceae",
        "genus": "Malus",
        "id": 6,
        "name": "Apple",
        "nutritions": {
            "calories": 52,
            "carbohydrates": 11.4,
            "fat": 0.4,
            "protein": 0.3,
            "sugar": 10.3
        },
        "order": "Rosales"
    },
    {
        "family": "Rosaceae",
        "genus": "Prunus",
        "id": 35,
        "name": "Apricot",
        "nutritions": {
            "calories": 15,
            "carbohydrates": 3.9,
            "fat": 0.1,
            "protein": 0.5,
            "sugar": 3.2
        },
        "order": "Rosales"
    },
    {
        "family": "Musaceae",
        "genus": "Musa",
        "id": 1,
        "name": "Banana",
        "nutritions": {
            "calories": 96,
            "carbohydrates": 22,
            "fat": 0.2,
            "protein": 1,
            "sugar": 17.2
        },
        "order": "Zingiberales"
    },
    {
        "family": "Rosaceae",
        "genus": 

In [23]:
json_path = pathname + "fruits.json"
# The context manager closes the file even if the write fails.
with open(json_path, "w") as json_file:
    json_file.write(json_string)
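Since get_api_response returns its error messages as plain strings, a failed request would be written to fruits.json as if it were data. One possible guard, sketched below, is to check that the string parses as JSON before saving it. The helper name is_valid_json and the sample strings are illustrative assumptions, not part of the original notebook.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except (TypeError, ValueError):
        # ValueError covers json.JSONDecodeError; TypeError covers non-string input.
        return False
    return True

# Illustrative inputs: a small JSON payload and an error message in the
# style that get_api_response returns on failure.
good = '[{"name": "Apple", "id": 6}]'
bad = "An Http Error occurred: HTTPError('404 Client Error')"

print(is_valid_json(good))
print(is_valid_json(bad))
```

In the notebook, the write to fruits.json could then be wrapped in a check such as `if is_valid_json(json_string):`.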

4. Converting to a CSV to modify columns

If you want the API data as a CSV, run the cells below. Loading the data into a DataFrame also lets us remove the order column before saving. Running df.info() shows additional information about the data, including the number of entries (records) and the columns present in the dataset (benchmark i5 in the rubric).

In [24]:
df = pd.read_json(json_path)
# pathname already ends with a slash, so no extra separator is needed.
df.to_csv(pathname + "fruit.csv")

In [25]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 31 entries, 0 to 30
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   family      31 non-null     object
 1   genus       31 non-null     object
 2   id          31 non-null     int64 
 3   name        31 non-null     object
 4   nutritions  31 non-null     object
 5   order       31 non-null     object
dtypes: int64(1), object(5)
memory usage: 1.6+ KB


In [26]:
df.head()

     family     genus  id        name                                          nutritions         order
0  Rosaceae     Malus   6       Apple  {'calories': 52, 'carbohydrates': 11.4, 'fat':...       Rosales
1  Rosaceae    Prunus  35     Apricot  {'calories': 15, 'carbohydrates': 3.9, 'fat': ...       Rosales
2  Musaceae      Musa   1      Banana  {'calories': 96, 'carbohydrates': 22, 'fat': 0...  Zingiberales
3  Rosaceae     Rubus  64  Blackberry  {'calories': 40, 'carbohydrates': 9, 'fat': 0....       Rosales
4  Rosaceae  Fragaria  33   Blueberry  {'calories': 29, 'carbohydrates': 5.5, 'fat': ...       Rosales
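The output above shows that the nutritions column holds a whole dictionary per row, which ends up as a single quoted string in the CSV. If separate nutrition columns are wanted, one option is to flatten each record first. The sketch below does this with only the standard library, using two records copied from the JSON output earlier; the flatten helper and the dot-separated column names (the same naming style pd.json_normalize uses by default) are assumptions for illustration.

```python
import csv
import io

# Two records copied from the API output shown earlier in this notebook.
records = [
    {"family": "Rosaceae", "genus": "Malus", "id": 6, "name": "Apple",
     "nutritions": {"calories": 52, "carbohydrates": 11.4, "fat": 0.4,
                    "protein": 0.3, "sugar": 10.3},
     "order": "Rosales"},
    {"family": "Musaceae", "genus": "Musa", "id": 1, "name": "Banana",
     "nutritions": {"calories": 96, "carbohydrates": 22, "fat": 0.2,
                    "protein": 1, "sugar": 17.2},
     "order": "Zingiberales"},
]

def flatten(record):
    """Copy the top-level fields and promote each nutrition value to its own key."""
    flat = {key: value for key, value in record.items() if key != "nutritions"}
    for key, value in record["nutritions"].items():
        flat["nutritions." + key] = value
    return flat

rows = [flatten(record) for record in records]

# Write to an in-memory buffer here; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With pandas, calling get_api_response with response_type set to 'dataframe' reaches a similar result, because that branch passes the response through pd.json_normalize.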


Dropping the order column using df.drop() with axis=1, which targets columns rather than rows.

In [33]:
df.drop('order',inplace=True, axis=1)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 31 entries, 0 to 30
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   family      31 non-null     object
 1   genus       31 non-null     object
 2   id          31 non-null     int64 
 3   name        31 non-null     object
 4   nutritions  31 non-null     object
dtypes: int64(1), object(4)
memory usage: 1.3+ KB


We can now write the modified DataFrame to the same location as the original CSV to update it (rubric i4).

In [34]:
# Overwrites the earlier CSV; pathname already ends with a slash.
df.to_csv(pathname + "fruit.csv")