# JSON and APIs

## Introduction to JSON

JSON (JavaScript Object Notation) was specified and popularized by Douglas Crockford in the early 2000s. It is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. JSON is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON well suited as a data-interchange language.

### Alternatives to JSON

Other data-interchange formats, such as XML, YAML, and CSV, serve similar purposes. JSON has become the most widely used format for web APIs because of its simplicity and ease of use. It is less verbose than XML, which usually makes documents smaller and quicker to parse and generate. Unlike CSV, it can represent nested structures, which makes complex data easier to inspect, debug, and troubleshoot.

### Goals of JSON

- Machine Parsable: JSON is easy for machines to parse and generate.
- Human Readable and Editable: JSON is easy for humans to read and write.

Website: [JSON](https://www.json.org/json-en.html)

In [1]:
# let's create a json sample
# first let's import the json module
import json # this is part of the standard library
json_string = '{"name": "Rūta", "age": 20, "city": "Rīga"}'
# this is just a string! JSON is basically text that looks like a Python dictionary - but it is not parsed yet
print(f"json_string is of type {type(json_string)}")
print(json_string)

json_string is of type <class 'str'>
{"name": "Rūta", "age": 20, "city": "Rīga"}


In [2]:
# we will typically want to parse JSON strings into Python objects
# we do that by using the json.loads() function
data = json.loads(json_string)
print(f"data is of type {type(data)}")
print(data)

data is of type <class 'dict'>
{'name': 'Rūta', 'age': 20, 'city': 'Rīga'}


In [3]:
# so data is just a Python object now, it has no relation to JSON anymore
# let's add some key-value pairs to it
data['country'] = 'Latvia'
data['hobbies'] = ['reading', 'swimming', 'coding']
print(data)

{'name': 'Rūta', 'age': 20, 'city': 'Rīga', 'country': 'Latvia', 'hobbies': ['reading', 'swimming', 'coding']}


## Transforming Python data back to JSON strings

The `json` module in Python can also transform Python data structures back to JSON strings. The `json.dumps()` function takes a Python data structure as input and returns a JSON string as output.

If we want to save to file directly we can use the `json.dump()` function. This function takes a Python data structure and a file object as input and writes the JSON string to the file object.
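As a minimal sketch of the in-memory variant, the snippet below calls `json.dumps()` on a small dictionary. The `sample` dictionary is made up for illustration and is not part of the notebook's data.

```python
import json

# hypothetical sample data, used only for illustration
sample = {"name": "Rūta", "age": 20, "is_student": True, "nickname": None}

json_text = json.dumps(sample)  # returns a str, nothing is written to disk
print(f"json_text is of type {type(json_text)}")
print(json_text)

# parsing the string again gives back an equal dictionary
print(json.loads(json_text) == sample)
```

Note how `True` and `None` appear as `true` and `null` in the JSON text.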

In [4]:
# let's save our data into a JSON-formatted text file
with open('data.json', 'w') as f: # name could be anything but we want .json extension
    json.dump(data, f)

## Making JSON better looking

Inspecting our `data.json` file we see two potential issues:

- Unicode strings are escaped into ASCII - not very readable
- the file is not indented - not very readable

In [5]:
# the solution is to pass some extra parameters
# indent=4 (for json.dump) will make the output more readable
# ensure_ascii=False (for json.dump) will write non-ASCII characters as they are instead of \uXXXX escapes
# encoding="utf-8" (for open, not json.dump) makes sure the file itself can store those characters

with open('data_indented_utf8.json', 'w',encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)
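The effect of these parameters can also be checked without touching any files. The following sketch uses `json.dumps()`, which accepts the same `indent` and `ensure_ascii` parameters, on a small made-up `record`.

```python
import json

# small illustrative record, not the notebook's full data
record = {"name": "Rūta", "city": "Rīga"}

escaped = json.dumps(record)                       # default is ensure_ascii=True
readable = json.dumps(record, ensure_ascii=False)  # keeps ū and ī as they are
pretty = json.dumps(record, ensure_ascii=False, indent=4)

print(escaped)
print(readable)
print(pretty)
```

All three strings parse back to the same dictionary; only the text representation differs.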

## Converting List of Dictionaries to JSON 

Very often our data is in the form of a list of dictionaries. We can convert them just like before using the `json.dump()` function. 



In [6]:
person_list = [
    {"name": "Rūta", "age": 20, "city": "Rīga"},
    {"name": "Maija", "age": 30, "city": "London"},
    {"name": "Ede", "age": 25, "city": "New York"}
]
# let's save it to a JSON-formatted text file
with open('person_list.json', 'w',encoding="utf-8") as f:
    json.dump(person_list, f, indent=4, ensure_ascii=False)

## Reading JSON files

The `json` module in Python can also read JSON files. The `json.load()` function takes a file object as input and returns a Python data structure as output.



In [7]:
# let's read person_list.json back into Python
with open('person_list.json', 'r',encoding="utf-8") as f:
    also_person_list = json.load(f)

# let's compare it to the original person_list
print(f"Do lists have same data? {person_list == also_person_list}")
print(f"Are lists actually same objects in memory? {person_list is also_person_list}")

Do lists have same data? True
Are lists actually same objects in memory? False
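Reading can fail when the text is not valid JSON. A minimal sketch of handling that case follows; the helper name `parse_json_or_none` is hypothetical, and it uses `json.loads()` on strings so that it runs without any files.

```python
import json

def parse_json_or_none(text):
    """Return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # err describes what went wrong and where in the text
        print(f"Invalid JSON: {err}")
        return None

good = parse_json_or_none('{"name": "Maija", "age": 30}')
bad = parse_json_or_none("{'name': 'Maija'}")  # single quotes are not valid JSON

print(good)
print(bad)
```

The same `json.JSONDecodeError` is raised by `json.load()` when a file contains malformed JSON.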


## Correspondence of JSON and Python data types

JSON and Python data types correspond to each other as follows:

- JSON null -> Python None
- JSON true -> Python True
- JSON false -> Python False
- JSON number -> Python int or float
- JSON string -> Python str
- JSON array -> Python list
- JSON object -> Python dict

Note: Python tuples are converted to JSON arrays, so you will lose the distinction between tuples and lists when converting Python data structures to JSON strings.
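The mapping can be observed with a round trip through `json.dumps()` and `json.loads()`. The `original` dictionary below is made up for illustration.

```python
import json

# illustrative data covering several of the types listed above
original = {"point": (1, 2), "active": True, "score": 9.5, "note": None}

text = json.dumps(original)
restored = json.loads(text)

print(text)
print(restored)
# the tuple came back as a list, so the two dictionaries are no longer equal
print(f"Same data after round trip? {original == restored}")

# non-string dictionary keys are also converted to strings
print(json.loads(json.dumps({1: "one"})))
```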

Note also that JSON officially does not support comments. Some parsers accept them anyway, and there are less widely used JSON variants that do support comments, such as JSON5 and HJSON.
A common workaround in standard JSON is to add a key-value pair whose key marks it as a comment and whose value is the comment text. This is not a perfect solution, because the comment becomes part of the data, but it is widely used.
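A minimal sketch of that workaround is shown below. The key name `_comment` and the settings themselves are assumptions chosen for illustration; JSON does not prescribe any particular key.

```python
import json

# hypothetical configuration text with a comment stored as regular data
config_text = """
{
    "_comment": "timeout is in seconds",
    "timeout": 30,
    "retries": 3
}
"""

config = json.loads(config_text)

# drop the comment entries before using the settings
settings = {key: value for key, value in config.items() if not key.startswith("_")}
print(settings)
```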

## Using JSON in APIs

### What is an API in general?

API stands for Application Programming Interface. An API is a set of rules and protocols that allow one software application to interact with another software application.

JSON is often used as the data-interchange format for APIs, typically in web applications. The client sends a request to the server, with any data it needs to submit encoded as JSON, and the server sends its response back in JSON format. The client and server can be written in any programming language, as long as they can parse and generate JSON strings.
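To make the client side concrete, here is a sketch of how a request carrying JSON could be prepared with the standard library only. Nothing is sent over the network, and the endpoint URL and `payload` are hypothetical.

```python
import json
import urllib.request

# hypothetical data the client wants to submit
payload = {"name": "Rūta", "city": "Rīga"}
body = json.dumps(payload).encode("utf-8")  # request bodies are bytes

request = urllib.request.Request(
    "https://example.com/api/users",  # hypothetical endpoint, not a real API
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# the request is only constructed here, not sent
print(request.get_method())
print(request.data)
```

With the third-party `requests` library, `requests.post(url, json=payload)` performs the same encoding and sets the header for you.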

In [8]:
# we will use the requests module to get data from the internet
# requests is a third-party module, so we need to install it first
# run "pip install requests" if you do not have it installed
# home page: https://requests.readthedocs.io/en/master/

# note Google Colab has requests installed by default

import requests
# now I need a URL to get some data from
url = "https://jsonplaceholder.typicode.com/users"
print(f"Will open {url}")

Will open https://jsonplaceholder.typicode.com/users


In [9]:
# now I will use requests.get() function to get the data
# essentially this is the same as opening a url in a browser
# instead of browser displaying the data, requests.get() will return the data to us
response = requests.get(url) # here an internet connection is needed
# let's check the status code of response
print(f"Status code is {response.status_code}")

Status code is 200


In [10]:
bad_url = "https://example.com/not_a_real_page_norJSON"
print(f"Will open {bad_url}")
another_response = requests.get(bad_url)
print(f"Status code is {another_response.status_code}")
# here 500 means there was a server error (a missing page would more typically return 404)

# full list of status codes: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

Will open https://example.com/not_a_real_page_norJSON
Status code is 500
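Status codes are grouped into families by their first digit. The sketch below uses the standard library's `http.HTTPStatus` to look up the official phrase for a code; the helper name `describe_status` is hypothetical.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # the code is not one of the registered HTTP status codes
        phrase = "Unknown"
    families = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    family = families.get(code // 100, "other")
    return f"{code} {phrase} ({family})"

for code in (200, 401, 404, 500):
    print(describe_status(code))
```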


In [11]:
# our first response gave a 200 status code, so we can try to get the data
data = response.json() # the .json() method parses the JSON text of the response
# requests has the parser built in, so we do not need to call json.loads() ourselves
# again, data is now a Python object, not a JSON string
# so we use normal Python syntax to work with it
print(f"data is of type {type(data)}")
print(data)

data is of type <class 'list'>
[{'id': 1, 'name': 'Leanne Graham', 'username': 'Bret', 'email': 'Sincere@april.biz', 'address': {'street': 'Kulas Light', 'suite': 'Apt. 556', 'city': 'Gwenborough', 'zipcode': '92998-3874', 'geo': {'lat': '-37.3159', 'lng': '81.1496'}}, 'phone': '1-770-736-8031 x56442', 'website': 'hildegard.org', 'company': {'name': 'Romaguera-Crona', 'catchPhrase': 'Multi-layered client-server neural-net', 'bs': 'harness real-time e-markets'}}, {'id': 2, 'name': 'Ervin Howell', 'username': 'Antonette', 'email': 'Shanna@melissa.tv', 'address': {'street': 'Victor Plains', 'suite': 'Suite 879', 'city': 'Wisokyburgh', 'zipcode': '90566-7771', 'geo': {'lat': '-43.9509', 'lng': '-34.4618'}}, 'phone': '010-692-6593 x09125', 'website': 'anastasia.net', 'company': {'name': 'Deckow-Crist', 'catchPhrase': 'Proactive didactic contingency', 'bs': 'synergize scalable supply-chains'}}, {'id': 3, 'name': 'Clementine Bauch', 'username': 'Samantha', 'email': 'Nathan@yesenia.net', 'addr

In [12]:
# now we could save this data into a file
with open('users.json', 'w',encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)

In [13]:
# a typical task when working with JSON would be some sort of data transformation
# also called data wrangling or data munging

# let's create a new dictionary of names to phone numbers

phone_numbers = {}
for user in data:
    phone_numbers[user['name']] = user['phone']

print(phone_numbers)


{'Leanne Graham': '1-770-736-8031 x56442', 'Ervin Howell': '010-692-6593 x09125', 'Clementine Bauch': '1-463-123-4447', 'Patricia Lebsack': '493-170-9623 x156', 'Chelsey Dietrich': '(254)954-1289', 'Mrs. Dennis Schulist': '1-477-935-8478 x6430', 'Kurtis Weissnat': '210.067.6132', 'Nicholas Runolfsdottir V': '586.493.6943 x140', 'Glenna Reichert': '(775)976-6794 x41206', 'Clementina DuBuque': '024-648-3804'}
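The same pattern extends to nested values. The sketch below works on a trimmed copy of the first two user records shown earlier, plus one hypothetical record without an address, and uses `dict.get()` so that a missing key does not raise a `KeyError`.

```python
# trimmed sample of the user records; the third record is hypothetical
users = [
    {"id": 1, "name": "Leanne Graham", "address": {"city": "Gwenborough"}},
    {"id": 2, "name": "Ervin Howell", "address": {"city": "Wisokyburgh"}},
    {"id": 3, "name": "No Address"},
]

cities = {}
for user in users:
    address = user.get("address", {})  # empty dict if the key is missing
    cities[user["name"]] = address.get("city", "unknown")

print(cities)
```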


## JSON API Data sources

There are many public APIs that provide JSON data. Here are a few examples:

- [OpenWeatherMap API](https://openweathermap.org/api) - requires sign-up

List of Awesome APIs: [Awesome APIs](

In [15]:
# let's try to get data from openweathermap.org

# we have our API KEY - which SHOULD BE KEPT SECRET!! not put in public Notebook file
api_key = "dfadfafdafdfdsa" # this is not a real key
city = "Riga"
url = f"http://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}"

response = requests.get(url) # this makes internet connection
# sometimes we need to pass custom headers to requests.get() function
# such as browser type or API key
# example of custom headers:
# headers = {
#     "User-Agent": "Mozilla/5.0",
#    "Authorization": "Bearer"
# }
# response = requests.get(url, headers=headers)
print(f"Status code is {response.status_code}")
# 401 means that my API key is not valid, not activated, or not provided

Status code is 401
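One common way to keep the key out of the notebook is to read it from an environment variable. The sketch below assumes a variable named `OPENWEATHER_API_KEY`, which is a hypothetical name, and only builds the URL without sending a request.

```python
import os
from urllib.parse import urlencode

# OPENWEATHER_API_KEY is a hypothetical variable name; set it outside the notebook
api_key = os.environ.get("OPENWEATHER_API_KEY", "")

params = {"q": "Riga", "appid": api_key}
# urlencode takes care of escaping special characters in the values
url = "http://api.openweathermap.org/data/2.5/weather?" + urlencode(params)

if not api_key:
    print("No API key found in the environment, the request would return 401")
```

With `requests`, the same dictionary can be passed directly as `requests.get(base_url, params=params)`.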


In [16]:
# let us try https://www.fruityvice.com/api/fruit/all
url = "https://www.fruityvice.com/api/fruit/all"
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print(f"Status code is {response.status_code}")
    # raise Exception("API did not return 200 status code")
    

[{'name': 'Persimmon', 'id': 52, 'family': 'Ebenaceae', 'order': 'Rosales', 'genus': 'Diospyros', 'nutritions': {'calories': 81, 'fat': 0.0, 'sugar': 18.0, 'carbohydrates': 18.0, 'protein': 0.0}}, {'name': 'Strawberry', 'id': 3, 'family': 'Rosaceae', 'order': 'Rosales', 'genus': 'Fragaria', 'nutritions': {'calories': 29, 'fat': 0.4, 'sugar': 5.4, 'carbohydrates': 5.5, 'protein': 0.8}}, {'name': 'Banana', 'id': 1, 'family': 'Musaceae', 'order': 'Zingiberales', 'genus': 'Musa', 'nutritions': {'calories': 96, 'fat': 0.2, 'sugar': 17.2, 'carbohydrates': 22.0, 'protein': 1.0}}, {'name': 'Tomato', 'id': 5, 'family': 'Solanaceae', 'order': 'Solanales', 'genus': 'Solanum', 'nutritions': {'calories': 74, 'fat': 0.2, 'sugar': 2.6, 'carbohydrates': 3.9, 'protein': 0.9}}, {'name': 'Pear', 'id': 4, 'family': 'Rosaceae', 'order': 'Rosales', 'genus': 'Pyrus', 'nutritions': {'calories': 57, 'fat': 0.1, 'sugar': 10.0, 'carbohydrates': 15.0, 'protein': 0.4}}, {'name': 'Durian', 'id': 60, 'family': 'Malv

In [17]:
# how many fruits did I get?
print(f"Number of fruits is {len(data)}")

Number of fruits is 49
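Once the data is a list of dictionaries, filtering and sorting need only plain Python. The sketch below uses a trimmed sample of the first four fruits from the output above, so it runs without an internet connection.

```python
# trimmed sample of the fruit records shown above
fruits = [
    {"name": "Persimmon", "nutritions": {"calories": 81, "sugar": 18.0}},
    {"name": "Strawberry", "nutritions": {"calories": 29, "sugar": 5.4}},
    {"name": "Banana", "nutritions": {"calories": 96, "sugar": 17.2}},
    {"name": "Tomato", "nutritions": {"calories": 74, "sugar": 2.6}},
]

# names of fruits with less than 10 units of sugar
low_sugar = [fruit["name"] for fruit in fruits if fruit["nutritions"]["sugar"] < 10]

# fruits ordered from fewest to most calories
by_calories = sorted(fruits, key=lambda fruit: fruit["nutritions"]["calories"])

print(low_sugar)
print([fruit["name"] for fruit in by_calories])
```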


## Conclusion

When working with JSON APIs, making the request is the simple part.

Most of the work comes AFTER parsing the JSON into a Python data structure.

This usually involves extracting the data you need from the JSON data structure and transforming it into a format that is useful for your application.

This could mean transforming from dictionary into a list of dictionaries, or from a list of dictionaries into a dictionary of dictionaries, or from a dictionary of dictionaries into a list of tuples, etc.
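The transformations mentioned above can be sketched with comprehensions. The `people` records and their `id` values are made up for illustration.

```python
# hypothetical records, loosely modeled on the person list used earlier
people = [
    {"id": 1, "name": "Rūta", "city": "Rīga"},
    {"id": 2, "name": "Maija", "city": "London"},
]

# list of dictionaries -> dictionary of dictionaries keyed by id
by_id = {person["id"]: person for person in people}

# dictionary of dictionaries -> list of tuples
pairs = [(person_id, person["name"]) for person_id, person in by_id.items()]

# dictionary -> list of dictionaries
city_by_name = {"Rūta": "Rīga", "Maija": "London"}
rows = [{"name": name, "city": city} for name, city in city_by_name.items()]

print(by_id[2]["city"])
print(pairs)
print(rows)
```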