# Module 1: JSON files

There are a lot of different file types. Some are more common than others, and some require more work than others. JSON is one of the most important file types.

In this training we will go into detail about JSON files and how to work with them. For those familiar with Python dictionaries, JSON will feel quite familiar. The training follows this outline:
1. JSON file basics
2. The json library
3. Navigating a JSON file structure
4. JSON to information
5. Nested JSON files

Enjoy!

Run the following cell to import all necessary libraries.

In [None]:
import json
import requests
import pprint
import pandas

## Section 1: JSON file basics

JSON, or **J**ava**S**cript **O**bject **N**otation, is one of the gold standards for information exchange in the world of data. When transporting data, for example through an API, JSON is the way to go. JSON was built to be readable by many programming languages, including Python.

Because of its structure and its widespread availability and accessibility, knowledge of JSON is essential for an aspiring data engineer. So let's have a look at these so-called JSON files. In the example below we retrieve a JSON file through an API. The example is about Game of Thrones.

In [None]:
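import requests
import pprint

# Public API with Game of Thrones data; character 583 is Jon Snow.
URL = "https://anapioficeandfire.com/api/characters/583"

# Request the character and decode the JSON response body into a Python dictionary.
response = requests.get(URL, timeout=10)
response.raise_for_status()  # Stop early if the request failed.
jon_snow_json = response.json()

pprint.pprint(jon_snow_json)
# print(jon_snow_json)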

The example above shows the structure of a JSON file. As you can see, a JSON file is highly structured.
JSON files are based on key-value pairs, with each key corresponding to a specific value. The keys can be used to navigate the JSON file.
One important thing to note is that JSON files are extremely flexible. Almost any string can be used as a key.
The values are flexible as well: a value can be a string, a number, a boolean, null, a list, or another JSON object.
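To make the flexibility of the values concrete, here is a minimal sketch. The `sample_text` document is hand-written for illustration and is not taken from the API above. It uses `json.loads`, which is covered in Section 2, to decode the text and list the Python type of each value.

```python
import json

# Hypothetical JSON document, written by hand for illustration only.
sample_text = """
{
    "name": "Example Character",
    "age": 23,
    "height_m": 1.8,
    "alive": true,
    "father": null,
    "aliases": ["First Alias", "Second Alias"],
    "home": {"region": "Example Region"}
}
"""

decoded = json.loads(sample_text)

# Each JSON value type maps onto a Python type.
value_types = {key: type(value).__name__ for key, value in decoded.items()}
for key, type_name in value_types.items():
    print(f"{key}: {type_name}")
```

Note that a value can itself be a list or another object, which is what makes nested JSON structures possible.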

The JSON example above is retrieved from a service about the world of Game of Thrones (spoilers). We retrieved information about Jon Snow, an important character in the books and the show. Each key holds a value describing what is known about this character. For example, you can see the following things in the JSON file.
- The name, which is of course Jon Snow.
- There is also information about the father and the mother (both are empty in this case).
- We can find in which seasons and which books the character appears.
- And, interestingly, a list of aliases for the character.

If you look closely in the JSON, you can find these points of information, as well as other information.

One thing to remember is that JSON data can be seen as one long string. Using Python and the json library, we can decode such strings into Python objects and work with them. We can also encode Python objects as JSON strings.

For those with experience in Python: a JSON structure is very similar to a Python dictionary.
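Similar does not mean identical. The sketch below uses a made-up dictionary (`python_dict` is a hypothetical example) to show two differences: JSON keys are always strings, and JSON writes `true` and `null` where Python writes `True` and `None`. The functions `json.dumps` and `json.loads` are explained in Section 2.

```python
import json

# Hypothetical dictionary with a non-string key, for illustration only.
python_dict = {1: "one", "active": True, "nickname": None}

# Encoding turns the integer key into a string and uses JSON literals.
json_text = json.dumps(python_dict)
print(json_text)

# Decoding gives a dictionary back, but the key stays a string.
round_tripped = json.loads(json_text)
print(round_tripped)
```

So a dictionary with non-string keys will not come back unchanged after a round trip through JSON.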

## Section 2: The json library

Now that we have had a look at the structure of a JSON file, we want to work with it! And we want to work with it in Python!
For working with JSON files we can use the appropriately named json library.

With the json library we can do a lot of things with JSON files, including:
- Decode JSON files so that we can use them within Python.
- Encode Python objects so that we can store them as JSON files.

The json library is essential in working with JSON files within Python. As with (almost) all decent libraries, there is an extensive amount of documentation that can help you understand the functionalities of the library. 

While working with Python it is essential that you learn how to read documentation. This will help you speed up your work and improve your understanding of the library. So, have a look: https://docs.python.org/3/library/json.html.

Let's see what we can do with the json library. We're going to have a look at four of its functions:
- json.dumps
- json.dump
- json.loads
- json.load

Let's first create a Python dictionary that we can use as a basis for our examples. See below.

In [None]:
import json

json_structure_example = {"name": "Roger Federer",
                          "age": 40,
                          "occupation": "Professional tennis player"}

Using the 'json.dumps' function we can convert the Python dictionary to a JSON string. The result is an ordinary Python string.

In [None]:
# Use json.dumps
json_string = json.dumps(json_structure_example)

print(json_string)
print(type(json_string))

We can also save our dictionary as a JSON file. For this we can use the 'json.dump' function.

In [None]:
# Use json.dump
file_name = "my_first_json.json"

with open(file_name, "w") as file:
    json.dump(json_structure_example, file, indent=4)

So, we can use 'json.dumps' to create a JSON string, and we can use 'json.dump' to write that JSON directly to a file.
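Both functions accept optional formatting arguments. As a small sketch, the example below compares the default compact output of `json.dumps` with a more readable version that uses the `indent` and `sort_keys` parameters. The variable names are chosen for illustration only.

```python
import json

player = {"name": "Roger Federer",
          "age": 40,
          "occupation": "Professional tennis player"}

# Default output: everything on a single line.
compact = json.dumps(player)

# indent=4 puts each key on its own line; sort_keys=True orders the keys alphabetically.
pretty = json.dumps(player, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

The formatting only changes the whitespace and key order. Both strings decode to the same dictionary.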

Now let's look at reading JSON. The 'json.loads' function reads a JSON string and converts it to a Python object, in our case a dictionary.

In [None]:
print(json_string)
print(type(json_string))

converted_json_string = json.loads(json_string)
print(converted_json_string)
print(type(converted_json_string))
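Not every string is valid JSON. When decoding fails, `json.loads` raises a `json.JSONDecodeError`. The sketch below wraps the call in a small helper (`parse_json_or_none` is a hypothetical name, not part of the json library) that reports the problem instead of crashing.

```python
import json


def parse_json_or_none(text):
    """Return the decoded object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        print(f"Could not decode JSON: {error.msg} "
              f"(line {error.lineno}, column {error.colno})")
        return None


# Valid JSON uses double quotes around keys and strings.
valid_result = parse_json_or_none('{"name": "Roger Federer"}')

# Single quotes are valid in Python, but not in JSON.
invalid_result = parse_json_or_none("{'name': 'Roger Federer'}")

print(valid_result)
print(invalid_result)
```

A common source of this error is passing the printed form of a Python dictionary, which uses single quotes, instead of a real JSON string.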

And using the 'json.load' function, we can read JSON files and load them as Python dictionaries.

In [None]:
file_name = "my_first_json.json"

with open(file_name, "r") as file:
    loaded_json = json.load(file)

print(loaded_json)
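To check that saving and loading really preserve the data, the sketch below writes a dictionary to a file and reads it back. It assumes a temporary directory is acceptable for the demonstration, so no files are left behind. The file name `round_trip_example.json` is made up for this example.

```python
import json
import tempfile
from pathlib import Path

original = {"name": "Roger Federer", "age": 40}

with tempfile.TemporaryDirectory() as temp_dir:
    path = Path(temp_dir) / "round_trip_example.json"

    # Write the dictionary to disk as JSON.
    with open(path, "w", encoding="utf-8") as file:
        json.dump(original, file, indent=4)

    # Read the same file back into a new Python object.
    with open(path, "r", encoding="utf-8") as file:
        restored = json.load(file)

# The restored dictionary is equal to the original, but it is a separate object.
print(restored == original)
print(restored is original)
```

Passing `encoding="utf-8"` explicitly is a design choice that keeps the behaviour the same across operating systems.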

Now that we have seen the most important functions of the json library, it's your turn to try them out. Please complete the following assignments.

#### Assignment 1: The json library 1

Create your own Python dictionary, with your name, age and occupation.

In [None]:
### FILL IN

#### Assignment 2: The json library 2

Convert your dictionary to a JSON string, and print it.
Use the 'json.dumps' function.

In [None]:
### FILL IN

#### Assignment 3: The json library 3

Save your dictionary as a JSON file with the name "my_second_json.json".
Use the 'json.dump' function.

In [None]:
### FILL IN

#### Assignment 4: The json library 4

Read your saved JSON file and print it. It should have the name: "my_second_json.json".
Use the 'json.load' function.

In [None]:
### FILL IN

#### Assignment 5: The json library 5

Create a JSON string from your Python dictionary, and then convert it back to a Python dictionary.
First use the 'json.dumps' function, and then use the 'json.loads' function.

In [None]:
### FILL IN

Good job! These steps should give you some insight into how JSON files are structured, and how we can read, load and save them within Python.