# JSON and APIs

## Introduction to JSON

JSON (JavaScript Object Notation) was specified and popularized by Douglas Crockford in the early 2000s. It is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. JSON is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. It is a text format that is completely language independent but uses conventions familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, and Python. These properties make JSON an ideal data-interchange language.

Goal: Human Readable, Machine Parsable

Website: [https://www.json.org/](https://www.json.org/)

In [1]:
# let's get some sample json
import json # typically we import libraries at the top of the file
json_string = '{"first_name": "Guido", "last_name":"Rossum", "age": 64, "is_python_dev": true}'
# note that json_string is just a string, this is how JSON is exchanged between systems
print(json_string)



{"first_name": "Guido", "last_name":"Rossum", "age": 64, "is_python_dev": true}


## Parsing JSON strings to Python objects

Python has a built-in package called `json`, which can be used to work with JSON data. The `json` package provides a convenient way to convert JSON data to Python objects and vice versa.

In [2]:
# parsing JSON string
python_object = json.loads(json_string)
print(python_object)

{'first_name': 'Guido', 'last_name': 'Rossum', 'age': 64, 'is_python_dev': True}
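
Because `json.loads()` returns ordinary Python objects, the result can be used like any other dictionary. The sketch below reuses the sample string from the first cell and also shows what happens with malformed input; the `broken_string` value is made up for illustration.

```python
import json

json_string = '{"first_name": "Guido", "last_name":"Rossum", "age": 64, "is_python_dev": true}'
person = json.loads(json_string)

# the parsed result is a regular dict, so values are reached by key
print(type(person))             # <class 'dict'>
print(person["first_name"])     # Guido
print(person["age"] + 1)        # 65, JSON numbers become int or float
print(person["is_python_dev"])  # True, JSON true becomes Python True

# malformed JSON raises json.JSONDecodeError (a subclass of ValueError)
broken_string = '{"first_name": "Guido", }'  # hypothetical bad input: trailing comma
try:
    json.loads(broken_string)
    parsed_ok = True
except json.JSONDecodeError as error:
    parsed_ok = False
    print(f"Could not parse: {error.msg}")
```

Catching `json.JSONDecodeError` is worth considering whenever the string comes from outside your own program.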


## Saving Python objects as JSON strings

To save a Python object as a JSON string, you can use the `json.dumps()` function. This function takes a Python object as input and returns a JSON string.

Alternatively, the `json.dump()` function writes the JSON text directly to an open file, without building the string first.

In [3]:
# let's dump a python object to a json file
with open('guido.json', 'w') as f: # so we open file stream to write to a file
    json.dump(python_object, f)
    
# file stream f is closed automatically when we exit the with block

In [4]:
# we can indent the JSON output to make the file easier to read
with open('guido_indented.json', 'w') as f:
    json.dump(python_object, f, indent=4)
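
The cells above write to files with `json.dump()`. As a small sketch of the string-returning `json.dumps()` described at the start of this section, the example below serializes the same kind of dictionary in three ways; the variable names `compact`, `pretty` and `sorted_keys` are chosen here for illustration.

```python
import json

python_object = {"first_name": "Guido", "last_name": "Rossum", "age": 64, "is_python_dev": True}

compact = json.dumps(python_object)                      # single-line JSON string
pretty = json.dumps(python_object, indent=4)             # same data, indented for reading
sorted_keys = json.dumps(python_object, sort_keys=True)  # keys in alphabetical order

print(type(compact))  # <class 'str'>
print(compact)
print(pretty)
print(sorted_keys)
```

All three strings decode back to the same dictionary; only the layout of the text differs.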

## Dealing with non-ASCII data

In [5]:
# let's create a list of three dictionaries
# each dictionary will represent a person in Latvia with name, surname, age and is_student
people = [
    {
        "name": "Jānis",
        "surname": "Bērziņš",
        "age": 25,
        "is_student": True
    },
    {
        "name": "Pēteris",
        "surname": "Liepiņš",
        "age": 35,
        "is_student": False
    },
    {
        "name": "Anna",
        "surname": "Ozoliņa",
        "age": 22,
        "is_student": True
    }
]

In [6]:
# now I will save it
with open('latvian_people.json', 'w') as f:
    json.dump(people, f, indent=4)

In [7]:
# the above cell escaped every non-ASCII character as a \uXXXX sequence (ensure_ascii=True is the default)
# here we open the file as UTF-8 and set ensure_ascii=False so the characters are written as-is
with open('latvian_people_utf8.json', 'w', encoding='utf-8') as f:
    json.dump(people, f, indent=4, ensure_ascii=False)
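
To see the difference between the two files without opening them, the same options can be tried with `json.dumps()`. This is a minimal sketch using one name from the list above; `escaped` and `readable` are illustrative variable names.

```python
import json

person = {"name": "Jānis", "surname": "Bērziņš"}

escaped = json.dumps(person)                       # default: ensure_ascii=True
readable = json.dumps(person, ensure_ascii=False)  # keep the original characters

print(escaped)   # non-ASCII letters appear as \uXXXX escapes
print(readable)  # letters are written as-is

# both strings decode to the same Python data
print(json.loads(escaped) == json.loads(readable))  # True
```

Both forms are valid JSON, so the choice mostly affects how readable the file is for people.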

## Reading JSON files into Python objects

To read a JSON file into a Python object, you can use the `json.load()` function. This function takes a file object as input and returns a Python object.

In [8]:
# let's read latvian_people_utf8.json
with open('latvian_people_utf8.json', 'r', encoding='utf-8') as f:
    people_from_file = json.load(f) # here we have our data back
# f is closed here again

In [9]:
# let's compare contents
print(people)
print(people_from_file)

[{'name': 'Jānis', 'surname': 'Bērziņš', 'age': 25, 'is_student': True}, {'name': 'Pēteris', 'surname': 'Liepiņš', 'age': 35, 'is_student': False}, {'name': 'Anna', 'surname': 'Ozoliņa', 'age': 22, 'is_student': True}]
[{'name': 'Jānis', 'surname': 'Bērziņš', 'age': 25, 'is_student': True}, {'name': 'Pēteris', 'surname': 'Liepiņš', 'age': 35, 'is_student': False}, {'name': 'Anna', 'surname': 'Ozoliņa', 'age': 22, 'is_student': True}]
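
Comparing printed output by eye works for small data, but an equality check is more dependable. The sketch below repeats the save and load steps in a temporary folder and compares the result with `==`; the file name `people_roundtrip.json` is made up for this example.

```python
import json
import tempfile
from pathlib import Path

people = [
    {"name": "Jānis", "surname": "Bērziņš", "age": 25, "is_student": True},
    {"name": "Anna", "surname": "Ozoliņa", "age": 22, "is_student": True},
]

# the temporary folder and its contents are removed when the with block ends
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "people_roundtrip.json"  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        json.dump(people, f, indent=4, ensure_ascii=False)
    with open(path, "r", encoding="utf-8") as f:
        people_from_file = json.load(f)

round_trip_ok = people == people_from_file
print(round_trip_ok)  # True
```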


## Correspondence between JSON and Python data types

The following table shows the correspondence between JSON data types and Python data types:

| JSON data type | Python data type |
|----------------|------------------|
| object         | dict             |
| array          | list or tuple    |
| string         | str              |
| number         | int or float     |
| true           | True             |
| false          | False            |
| null           | None             |

Note: when saving, both `list` and `tuple` are written as a JSON array, so a tuple comes back as a list after a round trip.
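
A short round trip makes the type changes visible. The sketch below uses a made-up `original` dictionary; besides the tuple case from the note, it also shows that dictionary keys that are not strings are converted to strings, because JSON object keys are always strings.

```python
import json

original = {"point": (1, 2), "scores": {1: "low", 2: "high"}}
restored = json.loads(json.dumps(original))

print(restored)                  # {'point': [1, 2], 'scores': {'1': 'low', '2': 'high'}}
print(type(restored["point"]))   # <class 'list'>, the tuple came back as a list
print(list(restored["scores"]))  # ['1', '2'], integer keys came back as strings
print(original == restored)      # False
```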


## Usage of JSON in web APIs

JSON is the most commonly used data format for web APIs. When you make a request to a web API, the response is usually in JSON format. You can use the `json` package to parse the JSON response and extract the data you need.

We can use the `requests` library to make API calls and get the JSON response.

In [11]:
# we will use requests library to get some JSON data from the internet
import requests # usually at top of file
# if the import fails, install it with: pip install requests (from the command line / terminal)
# or with: !pip install requests (inside a Jupyter notebook)
url = "https://jsonplaceholder.typicode.com/todos"
print(f"I will be accessing {url}")

I will be accessing https://jsonplaceholder.typicode.com/todos


In [13]:
# let's get the data
response = requests.get(url, timeout=10) # here we act like a browser requesting the url; timeout stops us from waiting forever
# let's check response code
print(response.status_code) # 200 means OK

# 404 means not found, maybe a typo in the url
# 500 means server error, not our fault
# full list of status codes https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

200
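
The status codes mentioned in the comments can also be looked up in Python's standard library. The helper below is a small sketch, not part of `requests`; the function name `describe_status` and its category labels are made up for illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase  # standard phrase such as "Not Found"
    except ValueError:
        phrase = "Unknown status"         # code is not defined in HTTPStatus
    if 200 <= code < 300:
        category = "success"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"


for code in (200, 404, 500):
    print(describe_status(code))
```

Grouping codes by their first digit is usually enough to decide whether the problem is on our side (4xx) or on the server's side (5xx).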


In [14]:
# if response is 200 let's get the data
if response.status_code == 200:
    todos = response.json() # here we parse the data, this does not connect to the internet
else:
    print("There was an error accessing the data")
    print(f"Status code: {response.status_code}")
    todos = [] # empty list as default


In [15]:
# let's check our data
print(todos)

[{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}, {'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False}, {'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False}, {'userId': 1, 'id': 4, 'title': 'et porro tempora', 'completed': True}, {'userId': 1, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'completed': False}, {'userId': 1, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'completed': False}, {'userId': 1, 'id': 7, 'title': 'illo expedita consequatur quia in', 'completed': False}, {'userId': 1, 'id': 8, 'title': 'quo adipisci enim quam ut ab', 'completed': True}, {'userId': 1, 'id': 9, 'title': 'molestiae perspiciatis ipsa', 'completed': False}, {'userId': 1, 'id': 10, 'title': 'illo est ratione doloremque quia maiores aut', 'completed': True}, {'userId': 1, 'id': 11, 'title': 'vero rerum temporibus dolor', 'completed': True}, {'userId': 1, 'i

In [None]:
# so our todos are a list of dictionaries, which is very typical for JSON data
# most likely this data came from a database, where each dictionary is a row in a table

In [18]:
# let's find our list of todos that are not complete
# incomplete_todos = [todo for todo in todos if not todo['completed']]
# above could be written as loop
incomplete_todos = []
for todo in todos:
    if todo['completed'] is False:
        incomplete_todos.append(todo)
# how many todos are not complete
print(f"There are {len(incomplete_todos)} incomplete todos")
# first 5 incomplete todos
print(incomplete_todos[:5])

There are 110 incomplete todos
[{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}, {'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False}, {'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False}, {'userId': 1, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'completed': False}, {'userId': 1, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'completed': False}]
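
The same filtering can be written as the list comprehension shown in the comment, and `collections.Counter` gives a quick tally. The sketch below runs on four records copied from the output above rather than on the full response, so the counts apply to this sample only.

```python
from collections import Counter

# four records copied from the printed output, so no internet access is needed
todos_sample = [
    {'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False},
    {'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False},
    {'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False},
    {'userId': 1, 'id': 4, 'title': 'et porro tempora', 'completed': True},
]

# list comprehension version of the loop
incomplete = [todo for todo in todos_sample if not todo['completed']]
print(len(incomplete))  # 3

# count how many todos have each completed value
status_counts = Counter(todo['completed'] for todo in todos_sample)
print(status_counts[False], status_counts[True])  # 3 1
```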


## JSON API data sources

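There are many public APIs that provide JSON data. Some examples:

- [OpenWeatherMap API](https://openweathermap.org/api) - requires an API key
- [Open Food Facts API](https://world.openfoodfacts.org/) - used in the cells below without an API key

A longer list of open APIs: [https://github.com/public-apis/public-apis](https://github.com/public-apis/public-apis)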



In [19]:
# Let's try the Open Food Facts API
product_uri = "737628064502" # despite the variable name, this is the product's barcode
url = f"https://world.openfoodfacts.org/api/v0/product/{product_uri}.json"
print(f"I will be accessing {url}")
response = requests.get(url)
if response.status_code == 200:
    product = response.json()
else:
    print("There was an error accessing the data")
    print(f"Status code: {response.status_code}")
    raise Exception("Could not access data")
# let's check our data
print(f"Type of product data: {type(product)}")

I will be accessing https://world.openfoodfacts.org/api/v0/product/737628064502.json
Type of product data: <class 'dict'>


In [20]:
# let's save this product to file with indent and utf-8 encoding
with open(f'product_{product_uri}.json', 'w', encoding='utf-8') as f:
    json.dump(product, f, indent=4, ensure_ascii=False)

In [22]:
# let's write a function that takes a product code, returns the product data and optionally saves it to a file
def get_product_data(product_uri, api_url="https://world.openfoodfacts.org/api/v0/product/", save=False):
    """Fetch product data as a Python object; return None if the request fails.

    If save is True, the data is also written to product_<product_uri>.json.
    """
    url = f"{api_url}{product_uri}.json"
    print(f"I will be accessing {url}")
    response = requests.get(url) # here we make internet request
    if response.status_code == 200:
        product = response.json()
    else:
        print("There was an error accessing the data")
        print(f"Status code: {response.status_code}")
        return None
    if save:
        with open(f'product_{product_uri}.json', 'w', encoding='utf-8') as f:
            json.dump(product, f, indent=4, ensure_ascii=False)
    return product

In [23]:
product_uri = "710671151926"
product_data = get_product_data(product_uri, save=True)

I will be accessing https://world.openfoodfacts.org/api/v0/product/710671151926.json
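
API responses are often deeply nested, and `get_product_data` may return `None`, so it helps to read fields defensively. The helper below is a sketch: `get_nested` is a made-up name, and `sample_response` is a hypothetical stand-in whose keys (`status`, `product`, `product_name`, `nutriments`) are assumptions to check against the saved JSON file.

```python
def get_nested(data, *keys, default=None):
    """Walk nested dictionaries by key, returning default if any key is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# hypothetical stand-in for a parsed API response; real field names may differ
sample_response = {
    "status": 1,
    "product": {"product_name": "Example noodles", "nutriments": {"sugars_100g": 2.5}},
}

print(get_nested(sample_response, "product", "product_name"))
print(get_nested(sample_response, "product", "nutriments", "sugars_100g"))
print(get_nested(sample_response, "product", "brands", default="unknown"))
print(get_nested(None, "product"))  # also safe when the request failed and we got None
```

Compared with chained indexing such as `data["product"]["brands"]`, this avoids a `KeyError` or `TypeError` when a field is absent.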


In [None]:
# Open Food Facts also has its own Python library
# !pip install openfoodfacts
# which offers more functionality than just getting data