# Hail to the King (of data interchange formats)

The modern internet is kept running by servers sending and receiving bits of JSON. It is the most common data interchange format you will encounter - most services you build will rely on parsing JSON you've received from somewhere and sending valid JSON to downstream consumers.

What is JSON? It stands for JavaScript Object Notation - a way of packaging up a piece of data into key-value associations so that it is easy for machines to parse and easy for humans to read. Let's look at some:

```
{
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": ["running", "sky diving", "singing"],
    "age": 35,
    "employed": true,
    "children": [
        {
            "firstName": "Alice",
            "age": 6
        },
        {
            "firstName": "Bob",
            "age": 8
        }
    ]
}
```

Hopefully, your knowledge...nay, *mastery* of Python data types is helping you see that this looks an awful lot like a combination of dicts and lists. And it is! But there's some complexity around what kind of quotes to use, and then how to escape quotes if the whole thing is wrapped in quotes, and booleans are a bit different, and... and... oh boy, this just got complicated. Fortunately, Python handles this complexity natively with its `json` library.
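
To make the quoting and boolean wrinkles concrete, here is a small sketch that uses only the standard `json` module. The sample values are made up for illustration.

```python
import json

# A Python string that itself contains double quotes (made-up sample value)
quote = 'She said "hello"'

# json.dumps wraps the value in double quotes and escapes the inner ones
encoded = json.dumps(quote)
print(encoded)  # "She said \"hello\""

# Python's True/False/None become JSON's true/false/null
flags = json.dumps({"employed": True, "retired": False, "nickname": None})
print(flags)  # {"employed": true, "retired": false, "nickname": null}
```

The point is that you never have to hand-write the escaping yourself; the library applies JSON's rules for you.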

## Deserialization

Let's begin by loading a JSON object into Python, a process known as deserialization. In this case, our JSON is sitting in a file, so we'll need to read it in:

In [1]:
import json

# This should now look quite familiar
with open("/home/jovyan/files/example_json.json", "r") as f:
    data = json.load(f)

In [2]:
# Did we get it?
data

{'firstName': 'Jane',
 'lastName': 'Doe',
 'hobbies': ['running', 'sky diving', 'singing'],
 'age': 35,
 'employed': True,
 'children': [{'firstName': 'Alice', 'age': 6},
  {'firstName': 'Bob', 'age': 8}]}

In [3]:
# Did it parse correctly?
type(data)

dict

Take a moment to compare the Python object and the raw JSON. See some differences? (Hint: look at the quotes, and at `true` versus `True`.)

## Deserializing a string

The `json.load` function works on file objects. What if you've just hit an API and what gets returned is a string of JSON, not a file? You can use the `json.loads` function instead (the `s` stands for string):

In [4]:
# Let's load the example up as a string:
with open("/home/jovyan/files/example_json.json") as f:
    string_data = f.read()

In [5]:
string_data

'{\n    "firstName": "Jane",\n    "lastName": "Doe",\n    "hobbies": ["running", "sky diving", "singing"],\n    "age": 35,\n    "employed": true,\n    "children": [\n        {\n            "firstName": "Alice",\n            "age": 6\n        },\n        {\n            "firstName": "Bob",\n            "age": 8\n        }\n    ]\n}\n'

In [6]:
# gross. now what?
data = json.loads(string_data)
data

{'firstName': 'Jane',
 'lastName': 'Doe',
 'hobbies': ['running', 'sky diving', 'singing'],
 'age': 35,
 'employed': True,
 'children': [{'firstName': 'Alice', 'age': 6},
  {'firstName': 'Bob', 'age': 8}]}

Yay! That wasn't too bad.
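
Strings from the outside world are not always valid JSON. As a hedged sketch, the hypothetical helper `parse_or_none` below (not part of the `json` library) shows one way to catch `json.JSONDecodeError`, which `json.loads` raises on malformed input.

```python
import json


def parse_or_none(text):
    """Hypothetical helper: return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # err.msg describes the problem; err.lineno and err.colno locate it
        print(f"Bad JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


good = parse_or_none('{"firstName": "Jane", "employed": true}')
bad = parse_or_none("{'firstName': 'Jane'}")  # single quotes are not valid JSON

print(good)  # {'firstName': 'Jane', 'employed': True}
print(bad)   # None
```

Whether to return `None`, re-raise, or log is a design choice that depends on your service; the sketch only shows where the error surfaces.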

## Serializing JSON

Ok, let's go the other way. We need to create some valid JSON for some other program to consume. Let's modify our dict first, then write the results as JSON into a new file:

In [7]:
# Let's add a hobby

data["hobbies"].append("python")
data

{'firstName': 'Jane',
 'lastName': 'Doe',
 'hobbies': ['running', 'sky diving', 'singing', 'python'],
 'age': 35,
 'employed': True,
 'children': [{'firstName': 'Alice', 'age': 6},
  {'firstName': 'Bob', 'age': 8}]}

In [8]:
# Neat - let's serialize it
with open("/home/jovyan/files/output_json.json", "w") as f:
    json.dump(data, f)

In [9]:
# What if we just need a string?
json_string = json.dumps(data) # Once again we add an s!

In [10]:
json_string

'{"firstName": "Jane", "lastName": "Doe", "hobbies": ["running", "sky diving", "singing", "python"], "age": 35, "employed": true, "children": [{"firstName": "Alice", "age": 6}, {"firstName": "Bob", "age": 8}]}'

In [11]:
# We can control the output a little bit
pretty_string = json.dumps(data, indent=4)

In [12]:
pretty_string

'{\n    "firstName": "Jane",\n    "lastName": "Doe",\n    "hobbies": [\n        "running",\n        "sky diving",\n        "singing",\n        "python"\n    ],\n    "age": 35,\n    "employed": true,\n    "children": [\n        {\n            "firstName": "Alice",\n            "age": 6\n        },\n        {\n            "firstName": "Bob",\n            "age": 8\n        }\n    ]\n}'

In [13]:
# Ew, what have we done???  Wait for it...
print(pretty_string)

{
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": [
        "running",
        "sky diving",
        "singing",
        "python"
    ],
    "age": 35,
    "employed": true,
    "children": [
        {
            "firstName": "Alice",
            "age": 6
        },
        {
            "firstName": "Bob",
            "age": 8
        }
    ]
}
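
A quick way to check that serialization did what we wanted is a round trip: dump to a string, load it back, and compare. The sketch below uses a small made-up dict rather than the full example, and also shows the `sort_keys` option of `json.dumps`.

```python
import json

# Small made-up record for illustration
record = {"lastName": "Doe", "firstName": "Jane", "hobbies": ["running", "python"]}

# Round trip: Python -> JSON string -> Python
round_tripped = json.loads(json.dumps(record))
print(round_tripped == record)  # True

# sort_keys=True writes keys in alphabetical order, handy for diffs
sorted_string = json.dumps(record, sort_keys=True)
print(sorted_string)
# {"firstName": "Jane", "hobbies": ["running", "python"], "lastName": "Doe"}
```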


## Have a pickle?

JSON is supremely useful because basically any system you interact with will know what to do with it. However, if you're shipping data from one Python environment to another, you may want to consider the `pickle` library, which handles serialization/deserialization (or SerDe) of Python-native objects. Pickle can handle very large and very complex data types (even data frames) easily - worth spending some time to get familiar with it!
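
As a minimal sketch of what that looks like, the example below round-trips an object that JSON cannot represent directly (it contains a `set` and a `date`). The sample data is made up. One well-documented caution: only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.

```python
import json
import pickle
from datetime import date

# Made-up object containing types that json.dumps rejects
profile = {"name": "Jane", "skills": {"python", "sql"}, "joined": date(2020, 1, 15)}

try:
    json.dumps(profile)
    json_ok = True
except TypeError:
    # json only knows dicts, lists, strings, numbers, booleans and None
    json_ok = False

# pickle.dumps returns bytes; pickle.loads rebuilds the original object
payload = pickle.dumps(profile)
restored = pickle.loads(payload)

print(json_ok)              # False
print(restored == profile)  # True
```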

That being said, most of your SerDe will be JSON related - we will be using the `json` library all the time once we start working with AWS and `boto3`.