---

### 🎓 **Professor**: Apostolos Filippas

### 📘 **Class**: Web Analytics

### 📋 **Topic**: JSON

🚫 **Note**: You are not allowed to share the contents of this notebook with anyone outside this class without written permission by the professor.

---


# 🚪 1. Introduction

JSON (JavaScript Object Notation) is a lightweight data-interchange format. Although it originated in JavaScript, JSON is a data format, not a programming language.
- **Interpretable**: It is easy for humans to read and write, and easy for machines to parse and generate.
- **Standardized**: JSON is widely adopted across many platforms.
- **Lightweight**: JSON has little syntactic overhead, which makes it well suited for sending and receiving data.
- **Structured**: JSON uses key-value pairs, similar to Python dictionaries.


## JSON Syntax Rules
- Data is in name/value pairs (like Python dictionaries).
- Data is separated by commas.
- Curly braces {} hold objects (key-value pairs).
- Square brackets [] hold arrays (lists).

## JSON Data Types
JSON can represent the following data types:
- String
- Number
- Object (JSON object)
- Array
- Boolean (true or false)
- null
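
To make the syntax rules and data types above concrete, here is a small sketch that parses one JSON document containing every type once. The document contents (course title, instructor name, and so on) are made up for illustration.

```python
import json

# A hypothetical JSON document that uses each JSON data type once.
json_text = """
{
    "title": "Web Analytics",
    "credits": 3,
    "rating": 4.5,
    "instructor": {"name": "Jane Doe"},
    "topics": ["JSON", "APIs"],
    "elective": true,
    "prerequisite": null
}
"""

record = json.loads(json_text)

# Show which Python type each JSON value was converted to.
for key, value in record.items():
    print(f"{key!r} -> {type(value).__name__}: {value!r}")
```

Note that keys and string values are wrapped in double quotes, and that `true` and `null` are written in lowercase.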

# 2. JSON in Python
Python has a built-in module called `json` that can be used to work with JSON data.

Let's start by parsing some JSON data.

## From JSON-encoded string to a Python object: `loads()`

- The `json.loads()` function parses a JSON string into a Python object.
- Which Python object you get back depends on the root structure of the JSON content. The result can be a dictionary, list, string, integer, float, boolean, or `None`, based on the top-level structure of the JSON string.

Here are two examples:

In [None]:
import json

# some JSON data:
json_data = '{"name": "John", "age": 30, "city": "New York"}'

# parse JSON:
python_data = json.loads(json_data)

# the result is a Python dictionary:
print(f"The type of the `python_data` object is {type(python_data)}")  
print(python_data)  # Outputs: {'name': 'John', 'age': 30, 'city': 'New York'}

# access the data:
print(f"The name is {python_data['name']}")


In [None]:
import json

# some JSON data:
json_data = '["Chicago","New York","Los Angeles"]'

# parse JSON:
python_data = json.loads(json_data)

# the result is a Python list:
print(f"The type of the `python_data` object is {python_data.__class__.__name__}.")
print(python_data)  # Outputs: ['Chicago', 'New York', 'Los Angeles']

# access the data:
print(f"The first city in the list is {python_data[0]}")
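
The two examples above have an object and an array at the root. As a small additional sketch, the root of a JSON document can also be a single scalar value, in which case `loads()` returns the matching Python scalar. The sample strings below are made up for illustration.

```python
import json

# Each string below is a complete, valid JSON document with a scalar at the root.
scalar_documents = ['"hello"', '42', '3.14', 'true', 'null']

parsed_values = [json.loads(doc) for doc in scalar_documents]

# Print each JSON text next to the Python type and value it was parsed into.
for doc, value in zip(scalar_documents, parsed_values):
    print(f"{doc:>8} -> {type(value).__name__}: {value!r}")
```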


## From Python object to JSON string: `dumps()`

The reverse of `loads()` is `dumps()`. It takes a Python object and returns a JSON-formatted string that represents it.

Here are two examples:

In [None]:
# EXAMPLE 1

# a Python dictionary:
python_data = {
  "name": "John",
  "age": 30,
  "city": "New York",
  "married": True,
  "children": None
}

# convert into JSON:
json_data = json.dumps(python_data)

# inspect the result:
print(f"The type of the `json_data` object is {type(json_data)}")
print(f"The value of the `json_data` object is {json_data}")

In [None]:
#EXAMPLE 2

# a Python list:
python_data = [
    {"name": "John", "age": 30, "city": "New York", "married": True, "children": None},
    {"name": "Peter", "age": 45, "city": "Chicago", "married": False, "children": ("Ann","Betty")},
]

# convert into JSON:
json_data = json.dumps(python_data)

# inspect the result:
print(f"The type of the `json_data` object is {type(json_data)}")
print(f"The value of the `json_data` object is {json_data}")

## Python - JSON data type mapping

Note that:
- "married", which is `True` or `False` in Python, is written as lowercase `true` or `false` in the JSON string.
- "children" is `None` in the dictionary of the first example, but in the JSON data it is represented by `null`.
- "children" is a tuple in the dictionary of the second example, but in the JSON data it is represented by an array, because JSON does not have a tuple data type.

|  Python  |  JSON   |
|:--------:|:-------:|
|  `dict`  | Object  |
|  `list`  | Array   |
|  `tuple` | Array   |
|  `str`   | String  |
|  `int`   | Number  |
|  `float` | Number  |
|  `True`  | `true`  |
|  `False` | `false` |
|  `None`  | `null`  |
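
One consequence of this mapping is that a round trip through JSON does not always give back the exact object you started with. The sketch below, which uses a made-up dictionary, encodes a tuple and decodes it again: the tuple comes back as a list because both `list` and `tuple` map to a JSON array.

```python
import json

original = {"name": "Peter", "children": ("Ann", "Betty"), "married": False, "pets": None}

encoded = json.dumps(original)   # Python object -> JSON string
decoded = json.loads(encoded)    # JSON string -> Python object

print(encoded)
print(decoded)

# The tuple was written as a JSON array, so it is read back as a list.
print(type(original["children"]).__name__, "->", type(decoded["children"]).__name__)
```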


## Writing to and reading from files

We can also write JSON data to files and read it back.
To do so, you can first use `dumps()` to convert a Python object to a JSON string, and then use `write()` to write that string to a file. Similarly, you can use `read()` to read a JSON string from a file, and then use `loads()` to convert it to a Python object.

However, you can also use `dump()` and `load()` to write and read JSON data to/from files directly.
Here's an example of `dump()`:

In [None]:
person = {
  "name": "Bob",
  "age": 25,
  "city": "Funtown"
}

with open('files/example.json', 'w') as file:
    json.dump(person, file)
    

In [None]:
# you can make the output more readable by using the indent argument
# the following will format the JSON output with an indentation of 4 spaces for each level
with open('files/example.json', 'w') as file:
    json.dump(person, file, indent=4)
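
The cells above only write the file. As a sketch of the reading side, the code below writes the same `person` dictionary and reads it back in two ways: with `load()` directly, and with `read()` followed by `loads()`. It uses a temporary directory (an assumption made here so the code runs anywhere) rather than the notebook's `files/` folder.

```python
import json
import os
import tempfile

person = {"name": "Bob", "age": 25, "city": "Funtown"}

# Write to a temporary directory so the sketch runs anywhere;
# in the notebook, the path would be 'files/example.json' instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.json")

    with open(path, "w") as file:
        json.dump(person, file, indent=4)

    # Option 1: load() parses directly from the open file object.
    with open(path) as file:
        loaded_person = json.load(file)

    # Option 2: read() the text first, then parse it with loads().
    with open(path) as file:
        loaded_again = json.loads(file.read())

print(loaded_person)
print(loaded_person == person == loaded_again)
```

Both options produce the same dictionary; `load()` simply saves the intermediate step of holding the file contents in a string.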

# 3. More on JSON with Python

## Handling errors
JSON, being a text-based format, can sometimes contain errors, either due to human editing or other factors. Python’s json module will raise exceptions when encountering improperly formatted JSON.

A common exception is  `json.JSONDecodeError` 
- This exception is raised when there's an error decoding a JSON document. 
- It's the most common error you'll encounter when parsing JSON in Python.

To gracefully handle errors, use the `try` and `except` blocks.

In [None]:
malformed_json = "{ 'name': 'Alice' }"  # Single quotes are not valid in JSON

try:
    data = json.loads(malformed_json)
except json.JSONDecodeError:
    print("Failed to decode JSON")
    data = {}


Tips for Debugging:
- Ensure that the JSON strings use double quotes for keys and string values.
- Validate the JSON using online tools to pinpoint errors.
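
When debugging, it can also help to look at the exception itself. The sketch below uses a made-up malformed string and prints the `msg`, `lineno`, `colno`, and `pos` attributes of the `json.JSONDecodeError`, which describe what went wrong and where in the text.

```python
import json

malformed_json = '{"name": "Alice", "age": }'  # the value for "age" is missing

decode_error = None
try:
    data = json.loads(malformed_json)
except json.JSONDecodeError as error:
    decode_error = error  # keep a reference; `error` is cleared after the block
    # msg says what went wrong; lineno, colno and pos say where.
    print(f"Problem: {error.msg}")
    print(f"Location: line {error.lineno}, column {error.colno} (index {error.pos})")
    data = {}  # fall back to an empty dictionary
```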

## Complex JSON structures

JSON objects and arrays can be nested, leading to more complex structures. These structures often mirror real-world data more closely.

### Accessing nested data
To access data within nested JSON, you'll chain keys (for dictionaries) or indices (for lists).

In [None]:
json_string = """{
    "name": "Alice",
    "address": {
        "street": "123 Wonderland St",
        "city": "Wonderland",
        "contacts": ["1234567890", "9876543210"]
    }
}
"""

data = json.loads(json_string)

# Accessing nested dictionary
street = data["address"]["street"]  # Outputs: "123 Wonderland St"
print(f"The street is {street}")

# Accessing item in a nested list
first_contact = data["address"]["contacts"][0]  # Outputs: "1234567890"
print(f"The first contact is {first_contact}")

### Iterating through nested data

For nested arrays or lists, you can use loops to iterate through the data.

In [None]:
for contact in data["address"]["contacts"]:
    print(contact)
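
The same idea extends to an array of objects, where one loop is nested inside another. The following is a minimal sketch with made-up data: the outer loop walks over the people, and the inner loop walks over each person's contacts.

```python
import json

# A hypothetical JSON array of objects, each with its own nested array.
json_string = """
[
    {"name": "Alice", "contacts": ["1234567890", "9876543210"]},
    {"name": "Bob", "contacts": []}
]
"""

people = json.loads(json_string)

for person in people:
    print(f"{person['name']} has {len(person['contacts'])} contact(s)")
    for position, contact in enumerate(person["contacts"], start=1):
        print(f"  {position}. {contact}")

# Flatten every contact of every person into a single list.
all_contacts = [contact for person in people for contact in person["contacts"]]
print(all_contacts)
```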


### Handling Unexpected Structures
Sometimes, the structure of the JSON data might not be what you expect. It's a good practice to check the type of a structure before accessing it.

In [None]:
# use .get() so that a missing "address" key does not raise a KeyError
address = data.get("address")
if isinstance(address, dict):
    city = address.get("city", "Unknown City")
else:
    city = "Unknown City"

print(f"The city is {city}")
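
When many nested fields need this kind of check, the pattern can be wrapped in a small helper. The function below, `get_nested`, is a hypothetical helper written for this example (it is not part of the `json` module): it follows a path of keys and indices and returns a default value as soon as the structure is not what was expected.

```python
import json


def get_nested(data, path, default=None):
    """Follow a sequence of keys/indices into nested dicts and lists.

    Returns `default` if any step is missing or has an unexpected type.
    """
    current = data
    for step in path:
        if isinstance(current, dict) and step in current:
            current = current[step]
        elif (isinstance(current, list) and isinstance(step, int)
              and -len(current) <= step < len(current)):
            current = current[step]
        else:
            return default
    return current


data = json.loads(
    '{"name": "Alice", "address": {"city": "Wonderland", "contacts": ["1234567890"]}}'
)

print(get_nested(data, ["address", "city"], "Unknown City"))
print(get_nested(data, ["address", "contacts", 0], "No contact"))
print(get_nested(data, ["address", "zip"], "Unknown ZIP"))    # missing key
print(get_nested(data, ["name", "first"], "Unknown"))         # "name" is a string, not a dict
```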
