### JSON Data

Let's start with a simple JSON object - note that it is just a string!

In [1]:
json_str = '''
{
    "name": "Eric Smith",
    "age": 32,
    "phoneNumbers": [
        {
            "type": "home",
            "number": "(212) 555-3276"
        },
        {
            "type": "work",
            "number": "(332) 555-1234"
        }
    ],
    "spouse": null,
    "children": [],
    "employed": true
}
'''

Apart from a few details (such as `true`, `null`), this really just looks like a Python dictionary.

We can in fact **decode** (**deserialize**) this string (this JSON object) into a Python dictionary using the `loads` function in the `json` module:

In [2]:
import json

In [3]:
eric = json.loads(json_str)

In [4]:
eric

{'name': 'Eric Smith',
 'age': 32,
 'phoneNumbers': [{'type': 'home', 'number': '(212) 555-3276'},
  {'type': 'work', 'number': '(332) 555-1234'}],
 'spouse': None,
 'children': [],
 'employed': True}
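Once decoded, the result is an ordinary Python dictionary, so the usual indexing works on it. The snippet below is a minimal sketch using a shortened copy of the JSON string (the variable name `person` is just for illustration); it shows that JSON `null` and `true` arrive as Python `None` and `True`:

```python
import json

# A shortened copy of the JSON string above, repeated so this runs on its own.
person = json.loads('''
{
    "name": "Eric Smith",
    "phoneNumbers": [{"type": "home", "number": "(212) 555-3276"}],
    "spouse": null,
    "employed": true
}
''')

print(type(person))                          # a plain dict
print(person["phoneNumbers"][0]["number"])   # nested list/dict access
print(person["spouse"] is None)              # JSON null became None
print(person["employed"] is True)            # JSON true became True
```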

This operation is in fact reversible: we can also **serialize** (**encode**) this Python dictionary into a JSON object using the `dumps` function:

In [5]:
json_str_2 = json.dumps(eric)

In [6]:
json_str_2

'{"name": "Eric Smith", "age": 32, "phoneNumbers": [{"type": "home", "number": "(212) 555-3276"}, {"type": "work", "number": "(332) 555-1234"}], "spouse": null, "children": [], "employed": true}'

You'll notice that the whitespace here is very different from what we started with. Whitespace between JSON tokens is not significant, and in fact the less whitespace the JSON contains, the fewer characters need to be transmitted. For human reading, though, it can be tough.

We can tell the `dumps` function to use some whitespace to make our (human) life easier:

In [7]:
print(json.dumps(eric, indent=2))

{
  "name": "Eric Smith",
  "age": 32,
  "phoneNumbers": [
    {
      "type": "home",
      "number": "(212) 555-3276"
    },
    {
      "type": "work",
      "number": "(332) 555-1234"
    }
  ],
  "spouse": null,
  "children": [],
  "employed": true
}
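Going the other way, if the goal is to transmit as few characters as possible, `dumps` also accepts a `separators` argument. The sketch below uses a small made-up dictionary (`data` is only an illustration) and compares the default output with a compact one that drops the space after each comma and colon:

```python
import json

data = {"name": "Eric Smith", "age": 32, "children": []}

default_str = json.dumps(data)
# (item separator, key separator) with no trailing spaces
compact_str = json.dumps(data, separators=(',', ':'))

print(default_str)
print(compact_str)
print(len(default_str), len(compact_str))
```

Both strings decode to the same dictionary; only the whitespace differs.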


The basic JSON value types are very limited: numbers, strings, booleans and null (as well as objects and arrays), so many Python data types cannot be serialized directly:

In [8]:
from datetime import datetime

d = {
    "name": "Isaac Newton",
    "dob": datetime(1643, 1, 4)
}

In [9]:
d

{'name': 'Isaac Newton', 'dob': datetime.datetime(1643, 1, 4, 0, 0)}

In [10]:
try:
    json.dumps(d)
except TypeError as ex:
    print('TypeError:', ex)

TypeError: Object of type datetime is not JSON serializable


As you can see, Python was unable to serialize that datetime object.
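The limited set of JSON types also affects some Python objects that *do* serialize. As a small illustration (the dictionary below is made up for this purpose), a tuple is written as a JSON array and a non-string key is written as a string, so decoding does not give back exactly what we started with:

```python
import json

original = {"point": (1, 2), 1: "one"}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
print(decoded)
print(decoded == original)  # the tuple came back as a list, the key 1 as '1'
```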

There are different ways we can specify custom encoders, but these are mostly beyond the scope of this course (they rely on something called inheritance).

However, there is a way to specify a simple custom encoder, using the named argument `default` in the `dumps` function.

This argument can be used to specify a function that will get called whenever the default encoder cannot serialize an object. That function should either return a value that can be serialized, or raise a `TypeError`.

In [11]:
def my_encoder(obj):
    print(f'my_encoder({obj}) called...')
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError  # only handles datetimes

In [12]:
json.dumps(d, default=my_encoder)

my_encoder(1643-01-04 00:00:00) called...


'{"name": "Isaac Newton", "dob": "1643-01-04T00:00:00"}'
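The final `raise TypeError` matters: it is how the function signals that it was handed something it does not know how to encode. The sketch below is a variant of the encoder (the error message text and the `tags` set are my own additions for illustration) showing what happens when the dictionary also contains a `set`, which neither the default encoder nor our function handles:

```python
import json
from datetime import datetime

def my_encoder(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    # Anything else is not ours to handle: report it with a helpful message.
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

data = {"dob": datetime(1643, 1, 4), "tags": {"physics"}}

try:
    json.dumps(data, default=my_encoder)
except TypeError as ex:
    message = str(ex)
    print('TypeError:', message)
```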

So `default` is a simple way for you to specify a custom encoder.

The same `default` function can handle more than one data type:

In [13]:
from decimal import Decimal
from datetime import date

In [14]:
d = {
    "symbol": "IBM",
    "date": date(2020, 9, 21),
    "day": {
        "open": Decimal('120.48'),
        "high": Decimal('120.70'),
        "low": Decimal('118.58'),
        "close": Decimal('120.25'),
        "volume": 5_205_413
    }
}

Here we'll have to handle `date` and `Decimal` types ourselves:

In [15]:
def stock_encoder(obj):
    if isinstance(obj, date):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError

In [16]:
print(json.dumps(d, default=stock_encoder, indent=2))

{
  "symbol": "IBM",
  "date": "2020-09-21",
  "day": {
    "open": "120.48",
    "high": "120.70",
    "low": "118.58",
    "close": "120.25",
    "volume": 5205413
  }
}
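Keep in mind that this encoding is one-way: when the JSON is decoded, the dates and decimals come back as plain strings, and it is up to us to convert them. A minimal sketch, assuming we know which fields hold dates and decimals (the JSON text here is a shortened copy of the output above):

```python
import json
from datetime import date
from decimal import Decimal

json_text = '{"date": "2020-09-21", "day": {"open": "120.48", "high": "120.70"}}'
raw = json.loads(json_text)

# json.loads gives us strings; we convert the fields we know about by hand.
quote_date = date.fromisoformat(raw["date"])
day = {key: Decimal(value) for key, value in raw["day"].items()}

print(type(raw["date"]))
print(quote_date)
print(day)
```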


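I chose to encode `Decimal` objects as strings, which preserves them exactly (note the trailing zero in `"120.70"`). We could instead use floats, rounded to 2 digits after the decimal point, keeping in mind that a float is only an approximation of the decimal value: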

In [17]:
def stock_encoder(obj):
    if isinstance(obj, date):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return round(float(obj), 2)
    raise TypeError

In [18]:
print(json.dumps(d, default=stock_encoder, indent=2))

{
  "symbol": "IBM",
  "date": "2020-09-21",
  "day": {
    "open": 120.48,
    "high": 120.7,
    "low": 118.58,
    "close": 120.25,
    "volume": 5205413
  }
}
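If the values are stored as JSON numbers like this, `loads` can be told to build `Decimal` objects instead of floats when decoding, using its `parse_float` argument. A short sketch on a shortened copy of the output above:

```python
import json
from decimal import Decimal

json_text = '{"open": 120.48, "high": 120.7, "volume": 5205413}'

# parse_float is called with the text of every JSON float; integers are unaffected.
day = json.loads(json_text, parse_float=Decimal)

print(day)
```

The trailing zero of `120.70` is not recovered, since it was already dropped when the value was written as a float.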
