# ENSF 400 - Winter 2025 - Lab 01 - Data Formats

# JSON
In this tutorial, you will learn to parse, read and write JSON in Python with the help of examples. Also, you will learn to convert JSON to dict and pretty print it.

JSON (JavaScript Object Notation) is a popular data format used for representing structured data. It's common to transmit and receive data between a server and web application in JSON format.

In Python, JSON exists as a string. For example:



In [None]:
p = '{"name": "Bob", "languages": ["Python", "Java"]}'
print(p)
type(p)

It's also common to store a JSON object in a file.

## Import JSON Module
To work with JSON (a string, or a file containing a JSON object), you can use Python's `json` module. You need to import the module before you can use it.

## Parse JSON in Python
The `json` module makes it easy to parse JSON strings and files containing a JSON object.

### Example 1: Python JSON to dict
You can parse a JSON string using the `json.loads()` method. The method returns a dictionary.


In [None]:
import json

person = '{"name": "Bob", "languages": ["English", "French"]}'
person_dict = json.loads(person)

In [None]:
# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(person_dict)
type(person_dict)
#person_dict['name']

In [None]:
# Output: ['English', 'French']
print(person_dict['languages'])
type(person_dict['languages'])

Here, person is a JSON string, and person_dict is a dictionary.
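
`json.loads()` raises `json.JSONDecodeError` when the string is not valid JSON. The following is a minimal sketch of guarding a parse; the helper name `parse_person` and the sample strings are assumptions made for illustration and are not part of the lab.

```python
import json

def parse_person(text):
    """Return the parsed dict, or None if text is not valid JSON.

    parse_person is a hypothetical helper used only for illustration.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # err.msg describes the problem; err.pos is the character offset
        print(f"Invalid JSON: {err.msg} (at position {err.pos})")
        return None

good = parse_person('{"name": "Bob", "languages": ["English", "French"]}')
# JSON requires double quotes, so single-quoted keys are rejected
bad = parse_person("{'name': 'Bob'}")

print(good["languages"])
print(bad)
```

Returning `None` on failure is only one possible design; re-raising the exception may be more appropriate when invalid input should stop the program.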

### Example 2: Python read JSON file
You can use the `json.load()` method to read a file containing a JSON object.

Suppose you have a file named `person.json` which contains a JSON object.

In [None]:
%%writefile person.json
{"name": "Bob", "languages": ["English", "French"]}

Here's how you can parse this file:

In [None]:
import json

with open('person.json') as f:
    data = json.load(f)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(data)
type(data)

Here, we have used the `open()` function to read the JSON file. Then, the file is parsed using the `json.load()` method, which gives us a dictionary named `data`.

If you do not know how to read and write files in Python, we recommend reviewing Python File I/O first.

## Python Convert to JSON string
You can convert a dictionary to a JSON string using the `json.dumps()` method.

### Example 3: Convert dict to JSON

In [None]:
import json

person_dict = {
    'name': 'Bob',
    'age': 12,
    'children': None
}
person_json = json.dumps(person_dict)

# Output: {"name": "Bob", "age": 12, "children": null}
print(person_json)
type(person_json)

Here's a table showing Python objects and their equivalent conversion to JSON.

| Python          | JSON Equivalent |
|-----------------|-----------------|
| dict            | object          |
| list, tuple     | array           |
| str             | string          |
| int, float      | number          |
| True            | true            |
| False           | false           |
| None            | null            |
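
The mapping in the table can be checked with a short round trip. This is a sketch with made-up sample values; note that a tuple is written as a JSON array and therefore comes back as a list.

```python
import json

# Sample values chosen for illustration only
record = {"point": (1, 2), "ratio": 0.5, "ok": True, "missing": None}

# dict -> JSON text: tuple becomes an array, True becomes true, None becomes null
encoded = json.dumps(record)

# JSON text -> dict: the array is decoded as a list, not a tuple
decoded = json.loads(encoded)

print(encoded)
print(decoded)
```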

## Writing JSON to a file
To write JSON to a file in Python, we can use the `json.dump()` method.

### Example 4: Writing JSON to a file


In [None]:
import json

person_dict = {
    "name": "Bob",
    "languages": ["English", "French"],
    "married": True,
    "age": 32
}

with open('person.txt', 'w') as json_file:
    json.dump(person_dict, json_file)


In the above program, we have opened a file named person.txt in writing mode using 'w'. If the file doesn't already exist, it will be created. Then, `json.dump()` transforms person_dict to a JSON string which will be saved in the person.txt file.

When you run the program, the person.txt file will be created. The file contains `person_dict` serialized as JSON text.
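
To confirm what `json.dump()` wrote, the file can be read back with `json.load()`. The sketch below writes into a temporary directory (an assumption made here so that no existing file is overwritten) and uses a shortened sample dictionary.

```python
import json
import os
import tempfile

person_dict = {"name": "Bob", "married": True, "age": 32}

# Write into a temporary directory so no existing file is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "person.txt")

    with open(path, "w") as json_file:
        json.dump(person_dict, json_file)

    with open(path) as json_file:
        raw_text = json_file.read()   # the JSON text as stored on disk
        json_file.seek(0)             # rewind before parsing the same file
        restored = json.load(json_file)

print(raw_text)
print(restored == person_dict)
```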

## Python pretty print JSON
To analyze and debug JSON data, we may need to print it in a more readable format. This can be done by passing the additional parameters `indent` and `sort_keys` to the `json.dumps()` and `json.dump()` methods.

### Example 5: Python pretty print JSON



In [None]:
import json

person_string = '{"name": "Bob", "languages": "English", "numbers": [2, 1.6, null]}'

# Getting dictionary
person_dict = json.loads(person_string)

# Pretty Printing JSON string back
print(json.dumps(person_dict, indent = 4, sort_keys=False))


In the above program, we have used 4 spaces for indentation. Because `sort_keys=False`, the keys keep the order they had in the input; pass `sort_keys=True` to sort them in ascending order. By the way, the default value of `indent` is `None`, and the default value of `sort_keys` is `False`.
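
To see the effect of `sort_keys` directly, the sketch below dumps the same dictionary both ways. The sample dictionary is an assumption chosen so that insertion order and alphabetical order differ.

```python
import json

person_dict = {"name": "Bob", "languages": "English", "age": 32}

# Keys appear in insertion order
unsorted_text = json.dumps(person_dict, indent=4, sort_keys=False)

# Keys appear in ascending alphabetical order
sorted_text = json.dumps(person_dict, indent=4, sort_keys=True)

print(unsorted_text)
print(sorted_text)
```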

# YAML

_YAML Ain’t Markup Language_ ([YAML](https://yaml.org/)) is a serialization language that has steadily increased in popularity over the last few years.

It’s often used as **a format for configuration files**, but its object serialization abilities make it a viable replacement for languages like JSON. YAML has broad language support and maps easily into native data structures. It’s also **easy for humans to read**, which is why it’s a good choice for configuration.

The YAML acronym originally stood for _Yet Another Markup Language_, but the maintainers renamed it to _YAML Ain’t Markup Language_ to place more emphasis on its data-oriented features.

## A Simple YAML Example

Let’s take a look at a YAML file for a brief overview.


In [None]:
%%writefile simple.yaml
---
 doe: "a deer, a female deer"
 ray: "a drop of golden sun"
 pi: 3.14159
 xmas: true
 french-hens: 3
 calling-birds:
   - huey
   - dewey
   - louie
   - fred
 xmas-fifth-day:
   calling-birds: four
   french-hens: 3
   golden-rings: 5
   partridges:
     count: 1
     location: "a pear tree"
   turtle-doves: two

The file starts with three dashes. These dashes indicate the start of a new YAML document. YAML supports multiple documents, and compliant parsers will recognize each set of dashes as the beginning of a new one.
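
A small sketch of multi-document parsing, assuming the `PyYAML` package is installed. It uses `yaml.safe_load_all()`, which yields one Python object per document; the two documents are sample data.

```python
import yaml

# Two small documents separated by the three-dash marker (sample data)
text = """\
---
doe: a deer
---
ray: a drop of golden sun
"""

# safe_load_all returns a generator with one Python object per document
documents = list(yaml.safe_load_all(text))

print(len(documents))
for doc in documents:
    print(doc)
```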

Next, we see the construct that makes up most of a typical YAML document: a key-value pair. `doe` is a key that points to a string value: `a deer, a female deer`.

YAML supports more than just string values. The file starts with five key-value pairs that have four different data types: `doe` and `ray` are strings, `pi` is a floating-point number, `xmas` is a boolean, and `french-hens` is an integer. You can enclose strings in single or double quotes, or use no quotes at all. YAML recognizes unquoted numerals as integers or floating-point numbers.

The sixth item is an array. `calling-birds` has four elements, each denoted by an opening dash.

I indented the elements in `calling-birds` with two spaces. **Indentation is how YAML denotes nesting**. The number of spaces can vary from file to file, but <u>tabs are not allowed</u>. We’ll look at how indentation works below.

Finally, we see `xmas-fifth-day`, which has five more elements inside it, each of them indented. We can view `xmas-fifth-day` as a dictionary that contains two strings, two integers, and another dictionary. YAML supports nesting of key-value pairs and mixing of types.

Before we take a deeper dive, let’s look at how this document looks in JSON. You may use an online [JSON to YAML converter](https://www.json2yaml.com/). The result is pasted below:

```json
{
  "doe": "a deer, a female deer",
  "ray": "a drop of golden sun",
  "pi": 3.14159,
  "xmas": true,
  "french-hens": 3,
  "calling-birds": [
    "huey",
    "dewey",
    "louie",
    "fred"
  ],
  "xmas-fifth-day": {
    "calling-birds": "four",
    "french-hens": 3,
    "golden-rings": 5,
    "partridges": {
      "count": 1,
      "location": "a pear tree"
    },
    "turtle-doves": "two"
  }
}
```

JSON and YAML have similar capabilities, and you can convert most documents between the formats.
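
The conversion can also be done in Python rather than with an online tool. The following is a minimal sketch, assuming `PyYAML` is installed; it parses a shortened version of the sample document with `yaml.safe_load()` and writes it back out with `json.dumps()`.

```python
import json
import yaml

# Shortened sample document for illustration
yaml_text = """\
pi: 3.14159
xmas: true
calling-birds:
  - huey
  - dewey
partridges:
  count: 1
  location: "a pear tree"
"""

# YAML text -> Python dict -> JSON text
data = yaml.safe_load(yaml_text)
json_text = json.dumps(data, indent=2)

print(json_text)
```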

## Outline Indentation and Whitespace
Whitespace is part of YAML’s formatting. Unless otherwise indicated, newlines indicate the end of a field.

You structure a YAML document with indentation. The indentation level can be one or more spaces. The specification forbids tabs because tools treat them differently.
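
The rule about tabs can be checked with `PyYAML` (assumed to be installed). In this sketch the same nested mapping is indented once with spaces and once with a tab; the second version is expected to raise a `yaml.YAMLError`.

```python
import yaml

spaces_text = "stuff:\n  foo: bar\n"   # indented with two spaces
tabs_text = "stuff:\n\tfoo: bar\n"     # indented with a tab character

print(yaml.safe_load(spaces_text))

try:
    yaml.safe_load(tabs_text)
    tab_rejected = False
except yaml.YAMLError as err:
    # PyYAML reports the tab as a character that cannot start a token
    tab_rejected = True
    print(f"Tab-indented document rejected: {type(err).__name__}")
```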

Consider this document. The items inside `stuff` are indented with two spaces.

In [None]:
%%writefile foo.yaml
foo: bar
pleh: help
stuff:
  foo: bar
  bar: foo

The `PyYAML` package will map a YAML file stream into a dictionary. We’ll define a helper, `print_yaml()`, that loads a file and prints the resulting dictionary as indented JSON so that the parsed structure and types are easy to see.

In [None]:
import yaml
import json

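def print_yaml(filename):
    # Load the YAML file into a Python object; the with block closes the file
    with open(filename, 'r') as f:
        dictionary = yaml.full_load(f)
    # Show the parsed result as indented JSON
    print(json.dumps(dictionary, indent = 4, sort_keys=False))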


print_yaml("foo.yaml")

YAML’s simple nesting gives us the power to build sophisticated objects. But that’s only the beginning.

## Comments
Comments begin with a pound sign. They can appear after a document value or take up an entire line.
```yaml
---
# This is a full line comment
foo: bar # this is a comment, too
```

## YAML Datatypes

Values in YAML’s key-value pairs are scalar. They act like the scalar types in languages like Perl, JavaScript, and Python. It’s usually good enough to enclose strings in quotes, leave numbers unquoted, and let the parser figure it out.

But that’s only the tip of the iceberg. YAML is capable of a great deal more.

### Key-Value Pairs and Dictionaries
The key-value pair is YAML’s basic building block. Every item in a YAML document is a member of at least one dictionary. The key is usually a string. The value can be of any datatype.

So, as we’ve already seen, the value can be a string, a number, or another dictionary.

### Numeric types
YAML recognizes numeric types. We saw floating point and integers above. YAML supports several other numeric types.

An integer can be decimal, hexadecimal, or octal.

In [None]:
%%writefile numeric.yaml
---
 foo: 12345
 bar: 0x12d4
 plop: 023332

In [None]:
print_yaml("numeric.yaml")

As you would expect, `0x` indicates a hexadecimal value, and a _leading zero_ denotes an octal value.
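
The sketch below parses the same three values from a string and compares them with Python's own base conversion. It assumes `PyYAML`, which follows the YAML 1.1 rules for integers; YAML 1.2 writes octal values with a `0o` prefix instead, so other parsers may treat `023332` differently.

```python
import yaml

# Same values as numeric.yaml, parsed from a string
data = yaml.safe_load("foo: 12345\nbar: 0x12d4\nplop: 023332\n")
print(data)

# Python's int() with an explicit base gives the same results
hex_value = int("12d4", 16)
octal_value = int("23332", 8)
print(hex_value, octal_value)
```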

YAML supports both fixed and exponential floating point numbers.

In [None]:
%%writefile float.yaml
---
 foo: 1230.15
 bar:  12.3015e+05

In [None]:
print_yaml("float.yaml")

Finally, we can represent not-a-number (NaN) or infinity.

In [None]:
%%writefile nan.yaml
---
foo: .inf
bar: -.Inf
plop: .NAN

In [None]:
print_yaml("nan.yaml")
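
These special values become ordinary Python floats, which matters when the data is written back out as JSON. A sketch, assuming `PyYAML`: by default `json.dumps()` emits `Infinity` and `NaN`, which are not part of the JSON standard, and passing `allow_nan=False` makes it raise `ValueError` instead.

```python
import json
import math
import yaml

data = yaml.safe_load("foo: .inf\nbar: -.Inf\nplop: .NAN\n")
print(data)
print(math.isinf(data["foo"]), math.isnan(data["plop"]))

# Default behaviour: non-standard tokens appear in the output
print(json.dumps(data))

# Strict behaviour: refuse to serialize values JSON cannot represent
try:
    json.dumps(data, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
print(strict_ok)
```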

### Strings
YAML strings are Unicode. In most situations, you don’t have to specify them in quotes.

In [None]:
%%writefile strings.yaml
---
foo: "this is not a normal string\n"
bar: this is not a normal string\n

In [None]:
print_yaml("strings.yaml")

YAML processes the first value as ending with a newline (linefeed) character, because double-quoted strings support escape sequences. Since the second value is not quoted, YAML treats the `\n` as two literal characters: a backslash and an `n`.

YAML does not process escape sequences in single-quoted strings, but the single quotes do keep the string contents from being interpreted as document formatting.
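
The three quoting styles can be compared side by side. This sketch assumes `PyYAML` and uses a raw Python string so that each `\n` reaches the YAML parser as a backslash followed by `n`; the key names are made up for illustration.

```python
import yaml

# Raw string: the backslashes are passed to YAML unchanged
text = r'''
double: "line\n"
single: 'line\n'
plain: line\n
'''

data = yaml.safe_load(text)
for key, value in data.items():
    # repr() makes the difference between a newline and backslash-n visible
    print(key, repr(value), len(value))
```

Only the double-quoted value ends with a real newline character, so it is one character shorter than the other two.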

String values can span more than one line. With the fold (greater than) character, you can write a string as an indented block; YAML folds the line breaks inside the block into spaces.


In [None]:
%%writefile newline.yaml
bar: >
  this is not a normal string it
  spans more than
  one line
  see?

In [None]:
import yaml

with open('newline.yaml', 'r') as f:
    dictionary = yaml.full_load(f)
print(dictionary['bar'])

The literal block (pipe) character has a similar function, but YAML keeps the line breaks exactly as written instead of folding them into spaces.

In [None]:
%%writefile newline2.yaml
bar: |
  this is not a normal string it
  spans more than
  one line
  see?

In [None]:
import yaml

with open('newline2.yaml', 'r') as f:
    dictionary = yaml.full_load(f)
print(dictionary['bar'])
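
Printing the two results with `repr()` makes the difference between the block styles explicit. A sketch, assuming `PyYAML`, with a shortened sample text and made-up key names:

```python
import yaml

text = """\
folded: >
  spans more than
  one line
literal: |
  spans more than
  one line
"""

data = yaml.safe_load(text)

# '>' joins the lines with spaces; '|' keeps each line break
print(repr(data["folded"]))
print(repr(data["literal"]))
```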

### Nulls
You enter nulls with a tilde or the unquoted null string literal.

In [None]:
%%writefile null.yaml
---
foo: ~
bar: null

In [None]:
print_yaml("null.yaml")

### Booleans
YAML indicates boolean values with the keywords True, On and Yes for true. False is indicated with False, Off, or No.

In [None]:
%%writefile bool.yaml
---
foo: True
bar: False
foo2: true
bar2: false
light: On
TV: Off
aaa: yes
bbb: no

In [None]:
print_yaml("bool.yaml")
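
Because words such as `yes`, `no`, `on`, and `off` are read as booleans, a value that is meant to be a string needs quotes. The sketch below shows this with `PyYAML`, which implements the YAML 1.1 boolean keywords; other parsers may behave differently, and the key names are made up for illustration.

```python
import yaml

text = """\
plain_yes: yes
quoted_yes: "yes"
plain_off: Off
country: NO
"""

data = yaml.safe_load(text)

# Unquoted keywords become booleans; the quoted value stays a string
for key, value in data.items():
    print(key, repr(value))
```

Note that the unquoted `NO` is also parsed as a boolean, which is a common surprise when the value was intended as a country code.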

### Arrays
You can specify arrays or lists on a single line.



In [None]:
%%writefile array1.yaml
---
items: [ 1, 2, 3, 4, 5 ]
names: [ "one", "two", "three", "four" ]

In [None]:
print_yaml("array1.yaml")

Or, you can put them on multiple lines.

In [None]:
%%writefile array2.yaml
---
items:
  - 5
  - 6
  - 7
  - 8
  - 9
names:
  - "five"
  - "six"
  - "seven"
  - "eight"

In [None]:
print_yaml("array2.yaml")

The multiple line format is useful for lists that contain complex objects instead of scalars.

In [None]:
%%writefile array3.yaml
---
items:
  - things:
      thing1: huey
      thing2: dewey
      thing3: louie
  - other things:
      key: value

In [None]:
print_yaml("array3.yaml")

An array can contain any valid YAML value. The values in a list do not have to be the same type.

### Dictionaries
We covered dictionaries above, but there’s more to them.

Like arrays, dictionaries can be written inline. We saw this format above. It’s how Python prints dictionaries.

In [None]:
%%writefile dict1.yaml
---
foo: { thing1: huey, thing2: louie, thing3: dewey }

In [None]:
print_yaml("dict1.yaml")

In [None]:
%%writefile dict2.yaml
---
foo: bar
bar: foo

In [None]:
print_yaml("dict2.yaml")

In [None]:
%%writefile dict3.yaml
---
foo:
  bar:
    - bar
    - rab
    - plop

In [None]:
print_yaml("dict3.yaml")

## Tasks

### Task 1

Using data file 'interface-data.json' provided with the lab, create output that resembles the following by parsing the included JSON file:

**Note: Use `dn`, `descr`, `speed` and `mtu` for parsing.**

```
Interface Status
================================================================================
DN                                                 Description    Speed    MTU  
-------------------------------------------------- ------------  ------  ------
topology/pod-1/node-201/sys/phys-[eth1/33]                       inherit   9150
topology/pod-1/node-201/sys/phys-[eth1/34]                       inherit   9150
topology/pod-1/node-201/sys/phys-[eth1/35]                       inherit   9150
```




In [1]:
import json

# Load the JSON data
with open('interface-data.json') as file:
    json_data = json.load(file)

# Print header
print("Interface Status")
print("=" * 80)
print(f"{'DN':<50} {'Description':<12} {'Speed':<7} {'MTU':<6}")
print("-" * 80)

# Iterate and extract relevant data
for interface in json_data["imdata"]:
    attributes = interface["l1PhysIf"]["attributes"]
    dn = attributes.get("dn", "")
    descr = attributes.get("descr", "")
    speed = attributes.get("speed", "inherit")
    mtu = attributes.get("mtu", "")
    print(f"{dn:<50} {descr:<12} {speed:<7} {mtu:<6}")





### Task 2

Using the same JSON input as `Task 1`, find all interfaces with `dn` attributes starting with `topology/pod-1/node-103`. Store the results in another file named `interface-103.json`, **with the same format as the original input JSON file**, using one pretty-print setting: `indent = 4`.

In [2]:
import json

# Load the JSON data
with open('interface-data.json') as file:
    json_data = json.load(file)

# Filter interfaces with `dn` starting with `topology/pod-1/node-103`
filtered_interfaces = {
    "imdata": [
        interface for interface in json_data["imdata"]
        if interface["l1PhysIf"]["attributes"]["dn"].startswith("topology/pod-1/node-103")
    ]
}

# Save filtered data to a JSON file with pretty formatting
with open('interface-103.json', 'w') as outfile:
    json.dump(filtered_interfaces, outfile, indent=4)



### Task 3
You are given a JSON file `task3_input.json`. Now perform the following operations on the JSON file:

1. Read the JSON file `task3_input.json` and convert it to a dictionary.
1. Pretty print the JSON file.
1. Change the value of `Parameters.InstanceType.Default` from `t2.micro` to `c3.large`.
1. Save the modified values as a new JSON file `task3_output1.json`.
1. Save the modified values as a new YAML file `task3_output2.yaml`.

In [None]:
import json
import yaml

# Load the JSON data into a dictionary
with open('task3_input.json') as file:
    task3_data = json.load(file)

# Pretty print the JSON data
print(json.dumps(task3_data, indent=4))

# Modify the value of Parameters.InstanceType.Default
task3_data["Parameters"]["InstanceType"]["Default"] = "c3.large"

# Save modified data to a new JSON file
with open('task3_output1.json', 'w') as json_file:
    json.dump(task3_data, json_file, indent=4)

# Save modified data to a new YAML file
with open('task3_output2.yaml', 'w') as yaml_file:
    yaml.dump(task3_data, yaml_file, default_flow_style=False, sort_keys=False)


## LAB 1 Grading

A TA will check your outputs for the 3 tasks above in this Notebook file with the following requirements:
- Task 1: Parsed output with the table described.
- Task 2: `interface-103.json` with the pretty setting.
- Task 3: `task3_output1.json` and `task3_output2.yaml` with the changed value.