## JSON
JavaScript Object Notation<br>

- text-based object serialization
- open standard
- human-readable

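A very common format for web APIs and for general data interchange between systems.<br>
Unlike pickling, it is considered `safe`: deserializing JSON produces plain data and does not execute code.<br>
How safe it is in practice may vary based on the JSON deserializer you use.<br>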


There are other formats too, such as XML, but XML does not translate directly to Python dictionaries the way JSON does. JSON is a far more natural fit with Python: in fact, the printed contents of a Python dictionary look a lot like JSON.

### Limited Data Types<br>


`strings` "python" ; delimited by double quotes; Unicode<br>
`numbers` 100 3.14 3.14e-05 -> a single number type (all treated as floats); no separate integer type<br>
`booleans` true, false<br>

`arrays (lists)` [1, 3.1, "python"] ; delimited by square brackets; `ordered`<br>

`dictionaries (objects)` {"a": 1, "b": "python"} ; key-value pairs; keys -> strings and values -> any supported data type; `unordered`<br>

`empty` value null<br>

Non-standard, but supported by Python's `json` module:<br>
`integers` 100<br>
`floats` NaN Infinity -Infinity<br>
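
The number handling listed above can be checked directly with the standard `json` module. The sketch below shows that Python keeps integer-looking and float-looking literals apart when parsing, and that `NaN` and `Infinity` are accepted and emitted by default; the `allow_nan=False` argument asks `dumps` to reject them instead. The variable names are only for illustration.

```python
import json
import math

# Python's parser keeps integer-looking and float-looking literals apart.
int_value = json.loads("100")
float_value = json.loads("3.14e-05")
print(type(int_value), type(float_value))

# NaN and Infinity are accepted and emitted by default (a non-standard extension).
nan_text = json.dumps(float("nan"))
inf_value = json.loads("-Infinity")
print(nan_text, math.isinf(inf_value))

# allow_nan=False makes dumps refuse values that strict JSON cannot represent.
try:
    json.dumps(float("inf"), allow_nan=False)
    strict_error = None
except ValueError as ex:
    strict_error = ex
    print("rejected:", ex)
```

If the JSON is going to be read by a non-Python consumer, `allow_nan=False` is probably the safer choice, since other parsers may reject `NaN` and `Infinity`.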





### Serialization and Deserialization<br>

JSON is a natural fit for serializing and deserializing Python dictionaries<br>
Of course, Python dictionaries are `objects`<br>
JSON is essentially a `string`<br>

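The standard library module is `json` (`import json`); its main functions are:<br>
`dump`, `load` -> write to / read from a file-like object<br>
`dumps`, `loads` -> write to / read from a string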
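
As a rough illustration of how the four functions pair up, the sketch below round-trips the same dictionary once through a string and once through a file-like object. An `io.StringIO` buffer stands in for a real open file here; that choice, like the variable names, is only an assumption for the example.

```python
import io
import json

data = {"a": 100, "b": [1, 2, 3]}

# dumps / loads: work with a str held in memory.
text = json.dumps(data)
from_text = json.loads(text)

# dump / load: work with a file-like object (io.StringIO stands in for an open file).
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)  # rewind before reading back
from_file = json.load(buffer)

print(text)
print(from_text == from_file == data)
```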

### Problems<br>

JSON keys must be strings -> but Python dictionary keys just need to be hashable.<br>
                            -> how to serialize?<br>
JSON value types are limited -> Python dictionary values can be any data type<br>
                               -> how to serialize?<br>
                               
Even if we can serialize a complex data type, such as a custom class:<br>
- how do we deserialize back to the original data type?<br>


In [1]:
d = {
    "name": {
        "first": "...",
        "last": "..."
    },
    "contact": {
        "phone": [
            {"type": "...", "number": "..."},
            {"type": "...", "number": "..."},
            {"type": "...", "number": "..."}
        ],
        "email": ["...", "...", "..."]
    },
    "address": {
        "line1": "...",
        "line2": "...",
        "city": "...",
        "country": "..."
    }
}

This is a standard Python dictionary, but if you look at the format, it is also technically JSON.<br>
The big difference is that JSON is basically just one big string, while a Python dictionary is an object containing other objects.<br>

So the big question when we want to "convert" (serialize) a Python object to JSON is: how do we represent Python objects as **strings**?<br>

Conversely, if we want to load a JSON object into a Python dictionary, how do we "convert" (deserialize) the JSON strings into Python objects?<br>

This concept of serializing/deserializing is also often called `marshalling`.

In [2]:
import json

In [3]:
d1 = {"a": 100, "b": 200}

In [4]:
d1_json = json.dumps(d1)

type(d1_json)

str

In [6]:
d1_json

'{"a": 100, "b": 200}'

We can get a better-looking JSON string by specifying an indent for the `dump` or `dumps` functions:

In [7]:
print(json.dumps(d1, indent=2))

{
  "a": 100,
  "b": 200
}


And we can deserialize the JSON string:

In [8]:
d2 = json.loads(d1_json)

In [9]:
d2, type(d2)

({'a': 100, 'b': 200}, dict)

In [10]:
d1 == d2

True

In [11]:
d1 is d2

False

#### Problem<br>
There is a big caveat here. In Python, keys can be any hashable object. But remember that in JSON keys must be strings!

In [12]:
d1 = {1: 100, 2: 200}

In [13]:
d1_json = json.dumps(d1)

In [14]:
d1_json

'{"1": 100, "2": 200}'

Notice how the keys are now strings in the JSON "object". And when we deserialize:

In [15]:
d2 = json.loads(d1_json)

In [17]:
print(d1)
print(d2)

{1: 100, 2: 200}
{'1': 100, '2': 200}
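
If we happen to know what the original key type was, one possible way to undo this is to convert the keys back after loading. The sketch below assumes every key was an `int`; that assumption is ours, because the JSON text itself no longer records it.

```python
import json

d1 = {1: 100, 2: 200}
d1_json = json.dumps(d1)

# Assumption: we know every key was originally an int, so int() is safe here.
restored = {int(key): value for key, value in json.loads(d1_json).items()}
print(restored)
print(restored == d1)

# Caveat: keys that differ only by type collapse into one after a round trip.
collided = json.loads(json.dumps({1: "a", "1": "b"}))
print(collided)
```

The last lines show why this stays fragile: two distinct Python keys can end up as the same JSON key, and only one of them survives loading.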


In [18]:
d_json = '''
{
    "name": "John Cleese",
    "age": 82,
    "height": 1.96,
    "walksFunny": true,
    "sketches": [
        {
        "title": "Dead Parrot",
        "costars": ["Michael Palin"]
        },
        {
        "title": "Ministry of Silly Walks",
        "costars": ["Michael Palin", "Terry Jones"]
        }
    ],
    "boring": null    
}
'''

In [19]:
# Let's deserialize this JSON string:
d = json.loads(d_json)
print(d)

{'name': 'John Cleese', 'age': 82, 'height': 1.96, 'walksFunny': True, 'sketches': [{'title': 'Dead Parrot', 'costars': ['Michael Palin']}, {'title': 'Ministry of Silly Walks', 'costars': ['Michael Palin', 'Terry Jones']}], 'boring': None}


In [20]:
d

{'name': 'John Cleese',
 'age': 82,
 'height': 1.96,
 'walksFunny': True,
 'sketches': [{'title': 'Dead Parrot', 'costars': ['Michael Palin']},
  {'title': 'Ministry of Silly Walks',
   'costars': ['Michael Palin', 'Terry Jones']}],
 'boring': None}

**Important**: The order of the keys appears preserved, but a JSON object is defined as an unordered collection, so other producers and consumers of the data give no guarantee of this. Do not rely on it.
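
When a stable key order does matter, for example when comparing or diffing serialized output, one option is to have `dumps` sort the keys rather than rely on insertion order. A minimal sketch with two made-up dictionaries:

```python
import json

first = {"b": 2, "a": 1}
second = {"a": 1, "b": 2}

# Equal dictionaries, but the default output follows each dict's insertion order.
print(json.dumps(first))
print(json.dumps(second))

# sort_keys=True gives the same text for both.
first_sorted = json.dumps(first, sort_keys=True)
second_sorted = json.dumps(second, sort_keys=True)
print(first_sorted == second_sorted)
```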

In [21]:
print(d['age'], type(d['age']))
print(d['height'], type(d['height']))
print(d['boring'], type(d['boring']))
print(d['sketches'], type(d['sketches']))
print(d['walksFunny'], type(d['walksFunny']))
print(d['sketches'][0], type(d['sketches'][0]))

82 <class 'int'>
1.96 <class 'float'>
None <class 'NoneType'>
[{'title': 'Dead Parrot', 'costars': ['Michael Palin']}, {'title': 'Ministry of Silly Walks', 'costars': ['Michael Palin', 'Terry Jones']}] <class 'list'>
True <class 'bool'>
{'title': 'Dead Parrot', 'costars': ['Michael Palin']} <class 'dict'>


As you can see, the JSON `array` was deserialized into a `list`, `true` into a `bool`, integer-looking values into `int`, float-looking values into `float`, `null` into `None`, and sub-objects into `dict`. Deserializing JSON objects into Python is very straightforward and intuitive.

Let's look at `tuples`, and see how serializing those works:

In [22]:
d = {'a': (1, 2, 3)}

In [23]:
json.dumps(d)

'{"a": [1, 2, 3]}'

So Python tuples are serialized into JSON arrays, which again means that if we deserialize the JSON we will not get our exact object back:

In [24]:
json.loads(json.dumps(d))

{'a': [1, 2, 3]}

Of course, JSON does not have a notion of tuples as a data type, so this will not work:

In [25]:
bad_json = '''
    {"a": (1, 2, 3)}
'''

In [26]:
json.loads(bad_json)

JSONDecodeError: Expecting value: line 2 column 11 (char 11)
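
In a program we would usually catch this error rather than let it propagate. The sketch below reuses the same malformed string and reads the position details from the exception; `lineno`, `colno`, `pos` and `msg` are attributes of `json.JSONDecodeError`, while the `location` name is just for the example.

```python
import json

bad_json = '''
    {"a": (1, 2, 3)}
'''

try:
    json.loads(bad_json)
    location = None
except json.JSONDecodeError as ex:
    # lineno and colno are 1-based; pos is the 0-based index into the string.
    location = (ex.lineno, ex.colno, ex.pos)
    print(ex.msg, location)
```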

Python was able to serialize a tuple by making it into a JSON array - but what about other data types - like Decimals, Fractions, Complex Numbers, Sets, etc?

In [27]:
from decimal import Decimal
json.dumps({'a': Decimal('0.5')})

TypeError: Object of type Decimal is not JSON serializable

So `Decimal` objects are not serializable.

In [28]:
try:
    json.dumps({"a": 1+1j})
except TypeError as ex:
    print(ex)

Object of type complex is not JSON serializable


In [29]:
try:
    json.dumps({"a": {1, 2, 3}})
except TypeError as ex:
    print(ex)

Object of type set is not JSON serializable


One workaround is to convert the `Decimal` to a string ourselves before serializing:

In [30]:
str(Decimal('0.5'))

'0.5'

In [31]:
json.dumps({"a": str(Decimal('0.5'))})

'{"a": "0.5"}'

But as you can see from the JSON, when we read that data back, we will get the string `0.5` back, not even a float!
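
One possible way around this, sketched below, is to tag the value when serializing so that it can be recognized when loading. It relies on the `default` argument of `json.dumps` and the `object_hook` argument of `json.loads`; the `__decimal__` tag and the two helper functions are hypothetical names chosen for this example, not a convention of the `json` module.

```python
import json
from decimal import Decimal

def encode_decimal(obj):
    # Called by json.dumps only for objects it cannot serialize itself.
    if isinstance(obj, Decimal):
        return {"__decimal__": str(obj)}  # hypothetical tag name
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_decimal(dct):
    # Called by json.loads for every JSON object it decodes, innermost first.
    if set(dct) == {"__decimal__"}:
        return Decimal(dct["__decimal__"])
    return dct

payload = json.dumps({"a": Decimal("0.5")}, default=encode_decimal)
restored = json.loads(payload, object_hook=decode_decimal)
print(payload)
print(restored)
```

The trade-off is that both sides have to agree on the tag, so the JSON is less convenient for consumers that do not know about it.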

### How about our own objects?

In [32]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'

In [33]:
p = Person('harsha', 82)

In [34]:
p

Person(name=harsha, age=82)

In [35]:
json.dumps({"harsha": p})

TypeError: Object of type Person is not JSON serializable

Solution: One approach is to write a custom JSON serializer in our class itself, and use that when we serialize the object:

In [36]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'
    
    def toJSON(self):
        return dict(name=self.name, age=self.age)

In [37]:
p = Person('harsha', 82)

In [38]:
p.toJSON()

{'name': 'harsha', 'age': 82}

In [39]:
print(json.dumps({"john": p.toJSON()}, indent=2))

{
  "john": {
    "name": "harsha",
    "age": 82
  }
}


In [40]:
vars(p)

{'name': 'harsha', 'age': 82}

In [41]:
p.__dict__

{'name': 'harsha', 'age': 82}

In [42]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'
    
    def toJSON(self):
        return vars(self)

In [43]:
p = Person('harsha', 82)  # re-create p so it uses the redefined class
json.dumps(dict(john=p.toJSON()))

'{"john": {"name": "harsha", "age": 82}}'
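
That covers serialization; the earlier question of how to get back to the original data type is still open. One possible approach, sketched here under the assumption that we control the class, is to pair `toJSON` with a class method that rebuilds the instance. The name `fromJSON` is hypothetical, and `toJSON` returns a copy here so that callers cannot modify the instance through the returned dictionary.

```python
import json

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'

    def toJSON(self):
        # Copy so callers cannot mutate the instance through the returned dict.
        return dict(vars(self))

    @classmethod
    def fromJSON(cls, data):  # hypothetical counterpart to toJSON
        return cls(name=data["name"], age=data["age"])

p = Person('harsha', 82)
text = json.dumps(p.toJSON())
p2 = Person.fromJSON(json.loads(text))
print(text)
print(p2)
```

Note that the caller still has to know that this particular JSON object represents a `Person`; nothing in the serialized text says so.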

#### How about dealing with sets, where we do not control the class definition?

In [44]:
s = {1, 2, 3}

In [45]:
json.dumps(dict(a=list(s)))

'{"a": [1, 2, 3]}'
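
Converting by hand gets tedious when sets are nested deep inside a structure. A possible alternative is to pass a function through the `default` argument of `json.dumps`, which is called for any object the encoder cannot handle itself. The helper name `encode_set` is hypothetical, and sorting assumes the set elements can be compared with each other.

```python
import json

def encode_set(obj):
    # json.dumps calls this only for objects it cannot serialize on its own.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)  # assumes the elements are mutually comparable
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

nested = {"a": {3, 1, 2}, "b": [{"x"}, {"y"}]}
result = json.dumps(nested, default=encode_set)
print(result)
```

As with tuples, loading this JSON gives back lists, not sets, so the conversion is one-way unless we add our own decoding step.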