### Python JSON

Handling JSON in Python is a fundamental skill for working with APIs, configuration files, and structured data exchange. JSON (JavaScript Object Notation) is a lightweight, text-based format that maps closely to Python’s native data structures—like dictionaries and lists. Python’s built-in json module makes it easy to parse, generate, and manipulate JSON data.


In [1]:
import json

In [2]:
dir(json)

['JSONDecodeError',
 'JSONDecoder',
 'JSONEncoder',
 '__all__',
 '__author__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '_default_decoder',
 '_default_encoder',
 'codecs',
 'decoder',
 'detect_encoding',
 'dump',
 'dumps',
 'encoder',
 'load',
 'loads',
 'scanner']

In [4]:
import json
from pprint import pprint

json_str = '{"name": "Sadiq", "skills": ["Python", "AI", "ML"], "active": true}'

data = json.loads(json_str)
print(data["name"])  # Output: Sadiq

Sadiq


In [5]:
pprint(data)

{'active': True, 'name': 'Sadiq', 'skills': ['Python', 'AI', 'ML']}


Here, the JSON string is parsed into a Python dictionary, allowing you to access values using standard dictionary syntax. This is especially useful when consuming data from web APIs.
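
To make the mapping between JSON types and Python types concrete, the following sketch decodes one value of each JSON type and records the Python type it becomes. The `sample` document is made up for this illustration.

```python
import json

# Hypothetical sample document covering each JSON value type.
sample = (
    '{"text": "hi", "count": 3, "ratio": 0.5, "flag": true, '
    '"nothing": null, "items": [1, 2], "nested": {"k": "v"}}'
)

parsed = json.loads(sample)

# Record the name of the Python type each JSON value was decoded into.
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)
```

JSON objects become dicts, arrays become lists, `true`/`false` become `True`/`False`, and `null` becomes `None`.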

Conversely, to convert a Python object into a JSON string, use json.dumps(). This is useful when you want to serialize Python data for storage or transmission. You can also format the output using parameters like indent and sort_keys:


In [6]:
python_data = {"name": "Sadiq", "age": 30, "skills": ["Python", "ML"]}
json_output = json.dumps(python_data, indent=2, sort_keys=True)
print(json_output)

{
  "age": 30,
  "name": "Sadiq",
  "skills": [
    "Python",
    "ML"
  ]
}


This produces a human-readable JSON string, which is ideal for logging or saving to a file.

To work with files directly, use json.dump() and json.load(), which take a file object instead of a string. For writing, open the file in write mode and pass the data and the file object to json.dump(). For reading, open the file in read mode and pass the file object to json.load():


In [1]:
import json

# Sample data to dump
data = {
    "name": "Sadiq",
    "location": "Bengaluru North, Ka, India",
    "timestamp": "2025-09-13T20:29:00+05:30",
    "president_US": "Donald Trump",
    "interests": ["AI", "ML", "Python", "Upanishads"],
    "goals": {
        "health": "Reduce visceral fat in 30 days",
        "career": "Transition to AI/ML and NLP projects",
        "finance": "Eliminate high-interest debt"
    }
}

# Dump to JSON file
with open("output.json", "w") as f:
    json.dump(data, f, indent=4)

In [11]:
with open("output.json", "r") as f:
    data = json.load(f)
    pprint(data)


{'goals': {'career': 'Transition to AI/ML and NLP projects',
           'finance': 'Eliminate high-interest debt',
           'health': 'Reduce visceral fat in 30 days'},
 'interests': ['AI', 'ML', 'Python', 'Upanishads'],
 'location': 'Bengaluru North, Ka, India',
 'name': 'Sadiq',
 'president_US': 'Donald Trump',
 'timestamp': '2025-09-13T20:29:00+05:30'}
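
A variation on the write-then-read cycle above, sketched under a few assumptions: the `record` data and the file name `record.json` are made up, a temporary directory is used so nothing is left on disk, and the encoding is set explicitly because the default file encoding differs between platforms.

```python
import json
import os
import tempfile

# Hypothetical record containing a non-ASCII character.
record = {"name": "Sadiq", "city": "Bengaluru", "note": "café"}

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "record.json")

    # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX
    # escapes, so the file encoding must be able to represent them.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2, ensure_ascii=False)

    with open(path, "r", encoding="utf-8") as f:
        restored = json.load(f)

print(restored == record)
```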


|Function|Input Type|Purpose|
|---------|---------|----------|
|json.load()|File-like object|Reads JSON from a file and converts it to a Python object|
|json.loads()|String (str, bytes, or bytearray)|Reads JSON from a string and converts it to a Python object|
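
The difference in the table can be seen side by side in a short sketch. Here `io.StringIO` stands in for a real file, which is an assumption made only to keep the example self-contained.

```python
import io
import json

text = '{"name": "Sadiq", "active": true}'

# json.loads parses the string directly.
from_string = json.loads(text)

# json.load reads from any object with a .read() method, such as an open file.
from_file = json.load(io.StringIO(text))

print(from_string == from_file)
```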

#### Serialization Formats Compared
    
|Feature|JSON|Pickle|YAML|
|---------|--------|------------|---------|
|Format type|Text (UTF-8)|Binary|Text (UTF-8)|
|Human readable|Yes|No|Yes|
|Cross-language support|Excellent|Python only|Good|
|Supports complex objects|Limited (basic types only)|Yes (most Python objects; classes and functions are stored by reference)|Partial (via PyYAML)|
|Security|Safe (no code execution)|Unsafe (loading can execute code)|Depends on the loader used|
|Speed|Slower for large data|Fast|Moderate|
|File size|Compact|Larger|Compact|
|Use case|APIs, config, web data|Internal Python state saving|Config files, readable data|

In [2]:
#Pickle
import pickle

data = {"name": "Sadiq", "skills": ["Python", "ML"]}
pickled = pickle.dumps(data)
unpickled = pickle.loads(pickled)

In [3]:
#YAML
import yaml

data = {"name": "Sadiq", "skills": ["Python", "ML"]}
yaml_str = yaml.dump(data)
parsed = yaml.safe_load(yaml_str)
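
One practical consequence of the "Supports complex objects" row is that a JSON round trip can change Python types, while a pickle round trip preserves them. The sketch below illustrates this with a made-up dictionary; only unpickle data from sources you trust.

```python
import json
import pickle

# Hypothetical data using a tuple value and an integer key.
original = {"point": (1, 2), 1: "int key"}

via_json = json.loads(json.dumps(original))
via_pickle = pickle.loads(pickle.dumps(original))

print(via_json)    # the tuple became a list and the int key became a string
print(via_pickle)  # equal to the original
```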

##### What Is JSONDecodeError?

JSONDecodeError is raised when Python’s json module encounters invalid JSON while parsing a string with json.loads() or a file with json.load().
It’s a subclass of ValueError, and typically signals:

- Malformed JSON syntax
- Unexpected characters
- Trailing commas or extra data
- Empty strings (passing None raises TypeError instead)
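
Besides the message, the exception carries attributes that locate the problem: `msg`, `pos`, `lineno`, and `colno`. A minimal sketch, using a made-up two-line document with a missing value:

```python
import json

# Hypothetical malformed document: the value for "age" is missing.
bad_json = '{"name": "Sadiq",\n "age": }'

try:
    json.loads(bad_json)
except json.JSONDecodeError as err:
    caught = err  # keep a reference; `err` is cleared when the block ends
    print(f"{err.msg} at line {err.lineno}, column {err.colno} (char {err.pos})")

# Because JSONDecodeError subclasses ValueError, `except ValueError` catches it too.
print(isinstance(caught, ValueError))
```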

##### Common Causes and Examples
##### 1. Empty String Input

In [1]:
import json

data = ""
parsed = json.loads(data)  # Raises JSONDecodeError: Expecting value

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Why? JSON text must contain at least one value, and an empty string contains none.

##### 2. Trailing Comma

In [2]:
json_str = '{"name": "Sadiq", "skills": ["Python", "ML", ]}'
json.loads(json_str)  # Raises JSONDecodeError: Expecting value

JSONDecodeError: Expecting value: line 1 column 46 (char 45)

Why? JSON does not allow trailing commas in arrays or objects.

##### 3. Multiple JSON Objects in One String

In [3]:
json_str = '{"name": "Sadiq"}{"skills": ["Python", "ML"]}'
json.loads(json_str)  # Raises JSONDecodeError: Extra data

JSONDecodeError: Extra data: line 1 column 18 (char 17)

Why? json.loads() expects exactly one JSON document. For concatenated documents, use a streaming parser or split the string first.
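
One way to read several concatenated documents with only the standard library is `json.JSONDecoder.raw_decode()`, which parses one value and returns the index where it stopped. The helper name `iter_json_values` is made up for this sketch.

```python
import json

def iter_json_values(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Returns the decoded value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value

json_str = '{"name": "Sadiq"}{"skills": ["Python", "ML"]}'
values = list(iter_json_values(json_str))
print(values)
```

Malformed input still raises JSONDecodeError from inside the generator, so callers may want to wrap the loop in a try-except block.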

##### 4. Improper Quotes

In [4]:
json_str = "{'name': 'Sadiq'}"
json.loads(json_str)  # Raises JSONDecodeError: Expecting property name enclosed in double quotes

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Why? JSON requires double quotes for strings and keys—not single quotes.

##### 5. Control Characters or Non-Printable Characters

In [5]:
json_str = '{"name": "Sadiq\u0001"}'
json.loads(json_str)  # Raises JSONDecodeError: Invalid control character

JSONDecodeError: Invalid control character at: line 1 column 16 (char 15)

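Why? Python turns the \u0001 escape in this ordinary string literal into a raw control character before the JSON parser sees it, and JSON does not allow unescaped control characters (U+0000 to U+001F) inside strings. Writing the literal as a raw string (r'...') or doubling the backslash keeps the JSON escape sequence intact, and the string then parses.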

##### How to Handle It Gracefully

In [6]:
# Use a try-except block

try:
    data = json.loads(json_str)
except json.JSONDecodeError as e:
    print(f"JSON decoding failed: {e}")

JSON decoding failed: Invalid control character at: line 1 column 16 (char 15)


- Validate JSON Before Parsing:

Use a tool like JSONLint to check syntax, and a Python library like jsonschema to check that the parsed data has the expected structure.
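
Where a full schema library is more than is needed, a few hand-written checks after parsing can cover simple cases. The sketch below is one possible shape, not a replacement for jsonschema; the function name `parse_user` and the expected keys are assumptions for illustration.

```python
import json

def parse_user(text):
    """Parse JSON text and check it has the shape this example expects.

    Returns (data, None) on success or (None, error_message) on failure.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return None, f"invalid JSON: {err.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    if not isinstance(data.get("name"), str):
        return None, "'name' must be a string"
    if not isinstance(data.get("skills"), list):
        return None, "'skills' must be an array"
    return data, None

print(parse_user('{"name": "Sadiq", "skills": ["Python"]}'))
print(parse_user('{"name": "Sadiq"}'))
```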
    

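##### Use strict=False (with caution)

With strict=False, the decoder accepts raw control characters (such as tabs and newlines) inside strings. Other syntax errors still raise JSONDecodeError.

In [8]:
with open("output.json") as f:
    data = json.load(f, strict=False)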
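
The effect of `strict` can be checked without a file. This sketch uses a made-up string containing a raw tab character and compares strict parsing, lenient parsing, and a properly escaped version.

```python
import json

# The Python escape "\t" puts a raw tab character inside the JSON string.
raw = '{"note": "col1\tcol2"}'

try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# strict=False tolerates the raw control character.
lenient = json.loads(raw, strict=False)

# Doubling the backslash keeps the JSON escape sequence intact, which is valid JSON.
escaped = json.loads('{"note": "col1\\tcol2"}')

print(strict_ok, lenient == escaped)
```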

#### What Are JSON Codecs?

In Python, a codec (short for coder-decoder) refers to the mechanism that encodes Python objects into JSON strings and decodes JSON strings back into Python objects. The built-in json module provides the default codec, but there are alternative codecs like **ujson, orjson, and rapidjson** that offer performance, memory, and feature trade-offs.

##### Standard JSON Codec (json module)


In [9]:
#Encoding (Python → JSON)

import json

data = {"name": "Sadiq", "skills": ["Python", "ML"], "active": True}
json_str = json.dumps(data)
print(json_str)


# Decoding (JSON → Python)
decoded = json.loads(json_str)
print(decoded["skills"])

{"name": "Sadiq", "skills": ["Python", "ML"], "active": true}
['Python', 'ML']


##### Custom Encoder
You can subclass json.JSONEncoder to handle non-standard types:

In [10]:
import json
from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

data = {"timestamp": datetime.now()}
print(json.dumps(data, cls=DateTimeEncoder))

{"timestamp": "2025-09-17T11:06:40.287173"}
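
Subclassing is not the only option: `json.dumps()` also accepts a `default` callable, and `json.loads()` accepts an `object_hook` for the reverse direction. The sketch below round-trips a datetime this way. The `"__datetime__"` tag and the function names are conventions invented for this example, not part of the json module.

```python
import json
from datetime import datetime

def encode_extra(obj):
    """Fallback for json.dumps: tag datetimes so they can be recognised later."""
    if isinstance(obj, datetime):
        return {"__datetime__": obj.isoformat()}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_extra(obj):
    """object_hook for json.loads: turn tagged dicts back into datetimes."""
    if "__datetime__" in obj:
        return datetime.fromisoformat(obj["__datetime__"])
    return obj

original = {"timestamp": datetime(2025, 9, 17, 11, 6, 40)}
text = json.dumps(original, default=encode_extra)
restored = json.loads(text, object_hook=decode_extra)

print(text)
print(restored == original)
```

The `object_hook` is called for every decoded JSON object, innermost first, which is why the tagged inner dict is converted before the outer dict is built.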


##### High-Performance JSON Codecs

1. UltraJSON (ujson)
- Written in C, very fast
- Doesn’t support custom encoders or Decimal

In [12]:
!pip install ujson orjson python-rapidjson

Collecting ujson
  Downloading ujson-5.11.0-cp311-cp311-win_amd64.whl.metadata (9.6 kB)
Collecting orjson
  Downloading orjson-3.11.3-cp311-cp311-win_amd64.whl.metadata (43 kB)
Collecting python-rapidjson
  Downloading python_rapidjson-1.21-cp311-cp311-win_amd64.whl.metadata (24 kB)
Downloading ujson-5.11.0-cp311-cp311-win_amd64.whl (43 kB)
Downloading orjson-3.11.3-cp311-cp311-win_amd64.whl (131 kB)
Downloading python_rapidjson-1.21-cp311-cp311-win_amd64.whl (147 kB)
Installing collected packages: ujson, python-rapidjson, orjson


Successfully installed orjson-3.11.3 python-rapidjson-1.21 ujson-5.11.0


In [15]:
import ujson

data = {"name": "Sadiq", "score": 99.5}
encoded = ujson.dumps(data)
decoded = ujson.loads(encoded)
print(decoded)

{'name': 'Sadiq', 'score': 99.5}


2. orjson

- Blazing fast, supports datetime, numpy, and dataclass
- Returns bytes instead of str

In [16]:
import orjson
from datetime import datetime

data = {"created": datetime.now()}
encoded = orjson.dumps(data)
decoded = orjson.loads(encoded)
print(decoded)

{'created': '2025-09-17T11:08:48.117337'}


3. python-rapidjson
- Fast and flexible
- Supports datetime, UUID, and more

In [18]:
import rapidjson
from datetime import datetime

data = {"created": datetime.now()}
encoded = rapidjson.dumps(data, datetime_mode=rapidjson.DM_ISO8601)
decoded = rapidjson.loads(encoded)
print(decoded)

{'created': '2025-09-17T11:09:19.297403'}


##### Benchmarking Codecs (Speed Comparison)
Here’s a simple benchmark using timeit:

In [20]:
import json, ujson, orjson, rapidjson
import timeit

data = {"name": "Sadiq", "skills": ["Python", "ML"], "active": True}

print("json:", timeit.timeit(lambda: json.dumps(data), number=100000))
print("ujson:", timeit.timeit(lambda: ujson.dumps(data), number=100000))
print("orjson:", timeit.timeit(lambda: orjson.dumps(data), number=100000))
print("rapidjson:", timeit.timeit(lambda: rapidjson.dumps(data), number=100000))

json: 0.3866023000000496
ujson: 0.12795980000009877
orjson: 0.04135450000012497
rapidjson: 0.14769030000002203
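
Single `timeit.timeit()` runs can be noisy, and the numbers depend on the machine and library versions. A somewhat more stable approach, sketched here with only the standard library, repeats each measurement and keeps the fastest run. The helper name `best_time` is made up, and the same helper could be pointed at any of the third-party codecs.

```python
import json
import timeit

data = {"name": "Sadiq", "skills": ["Python", "ML"], "active": True}
encoded = json.dumps(data)

def best_time(func, number=10_000, repeat=5):
    """Return the fastest total time (seconds) across `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, number=number, repeat=repeat))

dumps_time = best_time(lambda: json.dumps(data))
loads_time = best_time(lambda: json.loads(encoded))

print(f"json.dumps: {dumps_time:.4f}s  json.loads: {loads_time:.4f}s")
```

Taking the minimum reduces the influence of background activity on the machine, since interference can only make a run slower.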


##### Security and Compatibility Notes
|Codec|Speed|Custom Types|Security Risk|Output Type|
|-----------|---------|----------|----------|-------|
|json|Moderate (slowest in the benchmark above)|Full|Safe|str|
|ujson|Fast|Limited|Less strict|str|
|orjson|Very fast|Good|Safe|bytes|
|rapidjson|Fast|Flexible|Safe|str|


- Be cautious with codecs that silently ignore malformed input (ujson can do this). For secure applications, stick with json or orjson.



##### Flattening JSON

In [21]:
import pandas as pd

data = {
    "user": {
        "name": "Sadiq",
        "skills": ["Python", "ML"]
    },
    "location": "India"
}

df = pd.json_normalize(data)
print(df)


  location user.name   user.skills
0    India     Sadiq  [Python, ML]


In [22]:
!pip install flatten_json

Collecting flatten_json
  Downloading flatten_json-0.1.14-py3-none-any.whl.metadata (4.2 kB)
Downloading flatten_json-0.1.14-py3-none-any.whl (8.0 kB)
Installing collected packages: flatten_json
Successfully installed flatten_json-0.1.14


In [23]:
from flatten_json import flatten

nested = {
    "user": {
        "name": "Sadiq",
        "skills": ["Python", "ML"]
    },
    "location": "India"
}


flat = flatten(nested)
print(flat)

{'user_name': 'Sadiq', 'user_skills_0': 'Python', 'user_skills_1': 'ML', 'location': 'India'}
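
For small cases where an extra dependency is not wanted, similar flattening can be sketched with a short recursive function. This is an illustrative stand-in, not the implementation used by flatten_json; the name `flatten_dict` and its parameters are assumptions.

```python
def flatten_dict(obj, parent_key="", sep="_"):
    """Flatten nested dicts and lists into a single-level dict.

    Keys are joined with `sep`; list items use their index as the key part.
    """
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Scalar value: stop recursing and store it under the built-up key.
        return {parent_key: obj}

    flat = {}
    for key, value in items:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        flat.update(flatten_dict(value, new_key, sep))
    return flat

nested = {
    "user": {"name": "Sadiq", "skills": ["Python", "ML"]},
    "location": "India",
}
flat = flatten_dict(nested)
print(flat)
```

Note that in this sketch empty dicts and lists produce no keys at all, so they disappear from the flattened result.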
