# JSON

The json library can parse JSON from either strings or files. The library parses JSON into a Python dictionary or list. It can also convert Python dictionaries or lists into JSON strings.

## Parsing JSON - Convert JSON Strings to Python Objects

One of the most common tasks we perform on `JSON` is converting it to a Python object. The `json` library provides the `loads` function for this. Let's understand it with the following example.

In the example below, we take a JSON string (`json_string`) and convert it to a Python object (`parsed_json`).

In [4]:
import json
# XML equivalent of json_string
""" 
<user>
    <first_name>Guido</first_name>
    <last_name>Rossum</last_name>
</user>
"""
json_string = '{"first_name": "Guido", "last_name":"Rossum"}'

parsed_json = json.loads(json_string)
print(type(parsed_json), parsed_json)

<class 'dict'> {'first_name': 'Guido', 'last_name': 'Rossum'}


We used the `loads` function to convert a JSON string to a Python object. As `parsed_json` is a dictionary, let's read its individual elements.

In [2]:
print(parsed_json['first_name'],
      parsed_json['last_name'])

Guido Rossum


We can also traverse it with a `for` loop.

In [3]:
for k, v in parsed_json.items():
    print(k, "=>", v)

first_name => Guido
last_name => Rossum


Let's take a JSON string whose top-level value is another data type (a list).

In [4]:
# It can be of other data types also
json_string = '["first_name", "Guido", {"name": "mayank" , "last_name":"johri"}]'

parsed_json = json.loads(json_string)
print(parsed_json, type(parsed_json))

['first_name', 'Guido', {'name': 'mayank', 'last_name': 'johri'}] <class 'list'>


The example below shows how we can read a JSON file and work with its data.

In [6]:
with open("random.json") as f:
    parsed_json = json.loads(f.read())
    
print(parsed_json)
print(type(parsed_json))

{'results': [{'user': {'gender': 'male', 'name': {'title': 'mr', 'first': 'ernest', 'last': 'coleman'}, 'location': {'street': '6735 greenhaven ln', 'city': 'sunnyvale', 'state': 'connecticut', 'zip': '33332'}, 'email': 'ernest.coleman20@example.com', 'username': 'yellowdog409', 'password': 'ledzep', 'salt': '7iDblTIm', 'md5': '3c9871f954d86c58d3f49b98a7fb60c3', 'sha1': 'd6cb9b8abb27f9718ee9185b2cde298665a177b5', 'sha256': '4fdb19e71c66c91f27f5208cc2a59ca8833ed9b72c549f11ea6a881ffde23c65', 'registered': '1315463131', 'dob': '468664323', 'phone': '(348)-196-2669', 'cell': '(757)-671-8341', 'SSN': '641-99-7751', 'picture': {'large': 'http://api.randomuser.me/portraits/men/72.jpg', 'medium': 'http://api.randomuser.me/portraits/med/men/44.jpg', 'thumbnail': 'http://api.randomuser.me/portraits/thumb/men/47.jpg'}, 'version': '0.6', 'nationality': 'US'}, 'seed': '369a24e79c2f8676'}]}
<class 'dict'>
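When the JSON lives in a file, `json.load` (without the trailing `s`) accepts the open file object directly, so the explicit `f.read()` is not needed. The following is a minimal sketch, assuming a temporary file named `sample.json` and a small made-up payload created purely for illustration (it is not the `random.json` used above).

```python
import json
import os
import tempfile

# Hypothetical sample data, written to a temporary file for this illustration.
sample = {"results": [{"user": {"name": {"first": "ernest", "last": "coleman"}}}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)

    # json.load reads and parses the file object in one step.
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

# Nested values are reached by chaining dictionary keys and list indexes.
first_name = loaded["results"][0]["user"]["name"]["first"]
print(first_name)
```

Once loaded, the data is made of ordinary dictionaries and lists, so normal indexing and looping apply.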


The following table shows how Python types map to JSON types. It is useful for understanding the state of the data after serialization or deserialization.

|Python|JSON|
|------|-----|
| dict | object|
| list, tuple|array|
| None | null |
| False | false|
| True | true|
| int, float | number|
| str  |string |
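One consequence of this mapping is that a round trip through JSON is not always lossless: a tuple comes back as a list, and non-string dictionary keys come back as strings. A small sketch with made-up data (the variable names are only illustrative):

```python
import json

# A tuple value and an integer key, neither of which exists as such in JSON.
original = {"point": (1, 2), 10: "ten", "ratio": 0.5, "missing": None}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
print(decoded)
print(decoded == original)  # the round trip changed the types
```

If the exact Python types matter, they have to be restored by the application after parsing.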

### Serialization - Python object to JSON string

Serialization is the process of converting Python objects to `JSON` strings. It can be achieved by using `json.dumps`.

In [9]:
py = {
    'first_name': 'Guido',
    'second_name': 'Rossum',
    'titles': ['BDFL', 'Developer'],
}

json_string = json.dumps(py)
print(json_string)
print(type(json_string))

{"first_name": "Guido", "second_name": "Rossum", "titles": ["BDFL", "Developer"]}
<class 'str'>


In [10]:
d = ["mayank", "Venky", "Prashant Bhandarkar"]

json_string = json.dumps(d)
print(json_string)
print(type(json_string))

["mayank", "Venky", "Prashant Bhandarkar"]
<class 'str'>


Conversion from `JSON` to Python objects follows the table below.

| `JSON`        | Python    |
|:-------------:|:---------:|
| object        | dict      |
| array         | list      |
| string        | str       |
| number (int)  | int       |
| number (real) | float     |
| true          | True      |
| false         | False     |
| null          | None      |
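The mapping in this table can be checked by parsing a small document and inspecting the type of each value. This is only a sketch; the JSON text below is made up for the purpose.

```python
import json

# One value for each JSON type listed in the table.
document = (
    '{"text": "hello", "whole": 7, "real": 7.5, "flag": true, '
    '"items": [1, 2], "nothing": null, "nested": {}}'
)

parsed = json.loads(document)

# Collect the Python type name that each JSON value was converted to.
type_names = {key: type(value).__name__ for key, value in parsed.items()}
for key, name in type_names.items():
    print(key, "=>", name)
```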

In all the above examples, if we save the `JSON` string to a file, the entire content is stored on a single line, which is hard to read. `json.dumps` accepts additional arguments that control the output format. We will learn about them in the following examples.

Note that not all types of Python data can be converted to JSON, as shown in the example below.

In [3]:
import json

for data in ["Prashant Bhandarkar", True, False, None, (1, 2), [1, (2, 3)], {1, 3}, [{1, 3}]]:
    try:
        json_string = json.dumps(data)
        print(json_string)
        print(type(json_string))
    except Exception as e:
        print(e)


"Prashant Bhandarkar"
<class 'str'>
true
<class 'str'>
false
<class 'str'>
null
<class 'str'>
[1, 2]
<class 'str'>
[1, [2, 3]]
<class 'str'>
Object of type set is not JSON serializable
Object of type set is not JSON serializable


In the above example, serialization fails for the values that contain a `set`, because the Python `set` has no equivalent data type in JSON. We need to convert it to a supported type before serializing it, as shown in the example below.

In [5]:
data = {1, 3}
data = tuple(data)
json_string = json.dumps(data)
print(json_string)
print(type(json_string))

[1, 3]
<class 'str'>


#### `indent`

`indent` is an argument of `dumps` that pretty-prints the JSON elements with the given number of spaces per nesting level, as shown in the examples below.

In [21]:
d = ["mayank", "Venky", "Prashant Bhandarkar"]

data = json.dumps(d, indent=2)
print(data)

[
  "mayank",
  "Venky",
  "Prashant Bhandarkar"
]


In [22]:
py = {
    'first_name': 'Guido',
    'second_name': 'Rossum',
    'titles': ['BDFL', 'Developer'],
}

data = json.dumps(py, indent=4)
print(data)

{
    "first_name": "Guido",
    "second_name": "Rossum",
    "titles": [
        "BDFL",
        "Developer"
    ]
}
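The same `indent` argument is accepted by `json.dump`, which writes straight to a file object, so a saved file can be made readable as well. A minimal sketch, assuming a temporary file with the hypothetical name `user.json`:

```python
import json
import os
import tempfile

py = {
    "first_name": "Guido",
    "second_name": "Rossum",
    "titles": ["BDFL", "Developer"],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "user.json")  # hypothetical file name
    # json.dump writes to the file object; indent works as it does in dumps.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(py, f, indent=4)

    # Read the raw text back to see what was stored on disk.
    with open(path, encoding="utf-8") as f:
        text = f.read()

print(text)
```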


#### `sort_keys`

When `sort_keys` is set to `True`, dictionary keys are written in sorted order while converting the Python object to `JSON`, as shown in the example below.

In [23]:
py = {
    'titles': ['Developer', 'BDFL'], 'first_name': 'Guido',
    'second_name': 'Rossum',
}

data = json.dumps(py, sort_keys=True, indent=4)
print(data)

{
    "first_name": "Guido",
    "second_name": "Rossum",
    "titles": [
        "Developer",
        "BDFL"
    ]
}
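One practical use of `sort_keys` is producing the same text for dictionaries that hold the same data but were built in a different order, which helps when comparing or diffing output. A short sketch with illustrative variable names:

```python
import json

# Same content, different insertion order.
a = {"first_name": "Guido", "second_name": "Rossum"}
b = {"second_name": "Rossum", "first_name": "Guido"}

same_dicts = a == b
same_text_unsorted = json.dumps(a) == json.dumps(b)
same_text_sorted = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)

print(same_dicts, same_text_unsorted, same_text_sorted)
```

The dictionaries compare equal, but only the sorted output is guaranteed to be textually identical.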


#### `skipkeys`

Let's take an example of a Python dictionary that has a tuple as a key, which is valid in Python but not in JSON.

In [8]:
json_string = {("mohan", "rakesh"): "admin", "mayank": "user"}
try:
    data = json.dumps(json_string, indent=4)
    print(data)
except TypeError as te:
    print(te)

keys must be str, int, float, bool or None, not tuple


Let's use the `skipkeys=True` option. You can see that the `("mohan", "rakesh"): "admin"` element, which was causing the issue, has been skipped.

In [9]:
try:
    data = json.dumps(json_string, skipkeys=True, indent=4)
    print(data)
except TypeError as te:
    print(te)

{
    "mayank": "user"
}
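Skipping a key silently drops its data. When that data must be kept, one possible approach is to convert unsupported keys to strings before serializing. The helper below, `stringify_keys`, is a hypothetical function written for this illustration and is not part of the `json` library. It only handles the top level of the dictionary.

```python
import json


def stringify_keys(mapping, separator=","):
    """Return a copy of mapping with tuple keys joined into a single string."""
    converted = {}
    for key, value in mapping.items():
        if isinstance(key, tuple):
            key = separator.join(str(part) for part in key)
        converted[key] = value
    return converted


roles = {("mohan", "rakesh"): "admin", "mayank": "user"}
encoded = json.dumps(stringify_keys(roles))
print(encoded)
```

The choice of separator is an assumption; the code that reads the JSON back needs to know it in order to split the key again.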


### JSON Validation

In [26]:
# JSON validator: Quick and Basic :)

import json

json_string = '{"first_name": \'Guido\', "last_name":"Rossum"'

try:
    js_val = json.loads(json_string)
except ValueError as ve:
    print("Got invalid JSON string, skipping it.")
    print(ve)

Got invalid JSON string, skipping it.
Expecting value: line 1 column 16 (char 15)
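The same idea can be wrapped in a small reusable function. The sketch below uses `json.JSONDecodeError`, a subclass of `ValueError` that exposes the error message and position as attributes. The function name `validate_json` is an assumption made for this illustration.

```python
import json


def validate_json(text):
    """Return (True, None) for valid JSON, else (False, a short error description)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # msg, lineno and colno describe what went wrong and where.
        return False, f"{err.msg} at line {err.lineno}, column {err.colno}"
    return True, None


good = validate_json('{"first_name": "Guido"}')
bad = validate_json('{"first_name": "Guido"')  # closing brace is missing

print(good)
print(bad)
```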


### Loading JSON into an OrderedDict

In [35]:
import collections
py = {
    'titles': ['Developer', 'BDFL'], 'first_name': 'Guido',
    'second_name': 'Rossum',
}

# serialize with json.dumps so the text is valid JSON (double-quoted strings)
json_text = json.dumps(py)
json.JSONDecoder(object_pairs_hook=collections.OrderedDict).decode(json_text)

OrderedDict([('titles', ['Developer', 'BDFL']),
             ('first_name', 'Guido'),
             ('second_name', 'Rossum')])

In [18]:
from collections import OrderedDict
# OrderedDict as ordereddict

lnct_batch = """{
    "es": ["Mukesh Bansal", "Kirti Khanna", "Jyoti Pancholi", "Nishant Shrivastava", "Gajendra Bandi"],
    "cs": ["Amit Shrivastava"]
}"""

data = json.loads(lnct_batch,  object_pairs_hook=OrderedDict)
print(data)
print(json.dumps(data, indent=4))

OrderedDict([('es', ['Mukesh Bansal', 'Kirti Khanna', 'Jyoti Pancholi', 'Nishant Shrivastava', 'Gajendra Bandi']), ('cs', ['Amit Shrivastava'])])
{
    "es": [
        "Mukesh Bansal",
        "Kirti Khanna",
        "Jyoti Pancholi",
        "Nishant Shrivastava",
        "Gajendra Bandi"
    ],
    "cs": [
        "Amit Shrivastava"
    ]
}


In [40]:
data = json.loads('{"foo":1, "bar": 2}', object_pairs_hook=OrderedDict)
print(json.dumps(data, indent=4))

{
    "foo": 1,
    "bar": 2
}
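Since Python 3.7 the built-in `dict` preserves insertion order, so a plain `json.loads` call also keeps the keys in document order. `object_pairs_hook=OrderedDict` is still useful when the `OrderedDict` API itself is needed, for example `move_to_end`. A brief comparison, using a made-up JSON string:

```python
import json
from collections import OrderedDict

text = '{"foo": 1, "bar": 2, "baz": 3}'

plain = json.loads(text)
ordered = json.loads(text, object_pairs_hook=OrderedDict)

# Both keep the key order of the document.
print(list(plain))
print(list(ordered))

# OrderedDict offers extra order-related methods such as move_to_end.
ordered.move_to_end("foo")
print(list(ordered))
```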


### Examples

In [13]:
import json  
student = {"101":{"class":'V', "Name":'Rohit',  "Roll_no":7},  
           "102":{"class":'V', "Name":'David',  "Roll_no":8},  
           "103":{"class":'V', "Name":'Samiya', "Roll_no":12}}  
print(json.dumps(student)) 

{"101": {"class": "V", "Name": "Rohit", "Roll_no": 7}, "102": {"class": "V", "Name": "David", "Roll_no": 8}, "103": {"class": "V", "Name": "Samiya", "Roll_no": 12}}


In [14]:
import json  
student = {"101":{"Name":'Rohit',"Class":'V', "Roll_no":7},  
           "102":{"Name":'David',"Class":'V', "Roll_no":8},  
           "103":{"Name":'Samiya',"Class":'V', "Roll_no":12}}  
print(json.dumps(student, sort_keys=True))

{"101": {"Class": "V", "Name": "Rohit", "Roll_no": 7}, "102": {"Class": "V", "Name": "David", "Roll_no": 8}, "103": {"Class": "V", "Name": "Samiya", "Roll_no": 12}}


In [64]:
import json  
tup1 = 'Red', 'Black', 'White'
print(json.dumps(tup1))

["Red", "Black", "White"]


In [16]:
import json  
list1 = [5, 12, 13, 14]
print(json.dumps(list1))

[5, 12, 13, 14]


In [17]:
import json  
string1 = 'Python and JSON'
print(json.dumps(string1))

"Python and JSON"


In [18]:
import json  
x = True
print(json.dumps(x))

true


In [1]:
import json  
json_data = '{"103": {"class": "V", "Name": "Samiya", "Roll_n": 12}, "102": {"class": "V", "Name": "David", "Roll_no": 8}, "101": {"class": "V", "Name": "Rohit", "Roll_no": 7}}'
print(json.loads(json_data))

{'103': {'class': 'V', 'Name': 'Samiya', 'Roll_n': 12}, '102': {'class': 'V', 'Name': 'David', 'Roll_no': 8}, '101': {'class': 'V', 'Name': 'Rohit', 'Roll_no': 7}}


### Serializing Custom Object

Sometimes we need to serialize custom data, such as an instance of a custom class, or any value that is not one of the basic data types (int, float, bool, None, str, list, tuple, dict). In such cases we can write our own serialization function and pass it to `json.dumps` through the `default` argument, as shown in the example below.

In [10]:
# Problem

import json

from datetime import datetime

now = datetime.now()
user_punch_data = {
    'userid': 10021,
    'Punchtime': now
}

try:
    # we have datetime object as Value for key 'Punchtime'
    print(user_punch_data)
    rest_data = json.dumps(user_punch_data)
except Exception as e:
    print(e)

{'userid': 10021, 'Punchtime': datetime.datetime(2022, 3, 4, 10, 8, 21, 178766)}
Object of type datetime is not JSON serializable


In [12]:
# Very Basic Solution
import json
from datetime import datetime


def _serialize(obj):
    return str(obj)

now = datetime.now()
user_punch_data = {
    'userid': 10021,
    'Punchtime': now
}

try:
    rest_data = json.dumps(user_punch_data, 
                           default=_serialize)
    print(rest_data)
except Exception as e:
    print(e)

{"userid": 10021, "Punchtime": "2022-03-04 10:09:09.007845"}


In [18]:
# A bit better Solution
import json
from datetime import datetime


def _serialize(obj):
    if isinstance(obj, set):
        return list(obj)
    return str(obj)

now = datetime.now()
user_punch_data = ["Prashant Bhandarkar", True, False, None, (1, 2), [1, (2, 3)], {1, 3}, [{1, 3}]]

try:
    rest_data = json.dumps(user_punch_data, 
                           default=_serialize)
    print(rest_data)
except Exception as e:
    print(e)

["Prashant Bhandarkar", true, false, null, [1, 2], [1, [2, 3]], [1, 3], [[1, 3]]]
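Falling back to `str(obj)` for every unknown type can hide mistakes, because any object at all will be accepted. A stricter variant handles only the types it knows and raises `TypeError` for the rest, which is what `json.dumps` expects from a `default` function. The sketch below is one possible design, not the only one; the function name `to_jsonable` and the fixed timestamp are assumptions made for illustration.

```python
import json
from datetime import datetime


def to_jsonable(obj):
    """Convert selected non-JSON types; reject everything else."""
    if isinstance(obj, datetime):
        return obj.isoformat()  # ISO 8601 text can be parsed back later
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)  # assumes the elements can be compared with each other
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


punch = {"userid": 10021, "Punchtime": datetime(2022, 3, 4, 10, 9, 9), "tags": {3, 1}}
encoded = json.dumps(punch, default=to_jsonable)
print(encoded)

# The ISO string can be turned back into a datetime after parsing.
decoded = json.loads(encoded)
restored = datetime.fromisoformat(decoded["Punchtime"])
print(restored == punch["Punchtime"])
```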


### Gotchas

#### Gotcha 1: Single quotes are not valid string delimiters

In [30]:
# Single quotes are not allowed inside the JSON string to denote string

import json  
json_data = """{'103': {"class": "V", "Name": "Samiya", "Roll_n": 12}, 
            "102": {"class": "V", "Name": "David", "Roll_no": 8},
            "101": {"class": "V", "Name": "Rohit", "Roll_no": 7}}""" 
try:
    js = json.loads(json_data)
except Exception as je:
    print(je)

Expecting property name enclosed in double quotes: line 1 column 2 (char 1)


In [20]:
import collections

json_data = """{'103': {"class": "V", "Name": "Samiya", "Roll_n": 12}, 
            "102": {"class": "V", "Name": "David", "Roll_no": 8},
            "101": {"class": "V", "Name": "Rohit", "Roll_no": 7}}""" 
try:
    json.JSONDecoder(object_pairs_hook=collections.OrderedDict).decode(str(json_data))
except Exception as je:
    print(je)

Expecting property name enclosed in double quotes: line 1 column 2 (char 1)


**Solution:**

In [21]:
import ast
import json

data = json.dumps(ast.literal_eval(json_data))
print(data)

{"103": {"class": "V", "Name": "Samiya", "Roll_n": 12}, "102": {"class": "V", "Name": "David", "Roll_no": 8}, "101": {"class": "V", "Name": "Rohit", "Roll_no": 7}}
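A caveat about this workaround: `ast.literal_eval` parses Python literals, not JSON. It works here because the text happens to be a valid Python dictionary, but it rejects the JSON keywords `true`, `false` and `null`. The sketch below uses two made-up strings to show the difference.

```python
import ast
import json

python_style = "{'active': True, 'manager': None}"
json_style = '{"active": true, "manager": null}'

# Python-literal text: literal_eval succeeds and json.dumps normalizes it.
normalized = json.dumps(ast.literal_eval(python_style))
print(normalized)

# Real JSON text: literal_eval does not know true/null and raises ValueError.
try:
    ast.literal_eval(json_style)
    literal_eval_failed = False
except ValueError:
    literal_eval_failed = True
print(literal_eval_failed)

# json.loads remains the right tool for real JSON.
print(json.loads(json_style))
```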


#### Gotcha 2: A trailing `,` comma breaks parsing

In [23]:
d = '{"first_name": "Guido", "last_name":"Rossum", }'

try:
    data = json.loads(d)
    print(data, "<==>", type(data))
except Exception as e:
    print(e)

Expecting property name enclosed in double quotes: line 1 column 47 (char 46)


**Solution:**

In [24]:
import ast
import json

json_data = '{"first_name": "Guido", "last_name":"Rossum",}'
data = json.dumps(ast.literal_eval(json_data))
print(data)

{"first_name": "Guido", "last_name": "Rossum"}


In [39]:
# Do not use `eval` for this, because it executes arbitrary code; use `ast.literal_eval` instead

json_data = '[print("welcome")]'
eval(json_data)

welcome


[None]

In [25]:
import ast
import json

json_data = '[print("welcome")]'
try:
    _data = ast.literal_eval(json_data)
    print(f"{_data = }")
    data = json.dumps(_data)
    print(data)
except Exception as e:
    print(f"{e = }")


e = ValueError('malformed node or string: <ast.Call object at 0x7887b735f760>')
