In [87]:
import json
from pathlib import Path
import pandas as pd

# Writing and Reading Metadata Serialized with JSON


JSON (JavaScript Object Notation) is a widely used text format for data exchange, valued for its simplicity and readability. Its nested key-value structure is well suited to organizing experimental metadata in neuroscience, such as subject, session, and acquisition settings. Because almost every programming language can read and write JSON, it is also a convenient format for sharing metadata between labs and analysis tools.

This table covers the basic types of values that can be represented in JSON, providing a quick reference for understanding and using JSON data types in various applications:

| JSON Type    | Description                               | Example                |
|--------------|-------------------------------------------|------------------------|
| String       | Textual data enclosed in quotes           | `"exampleString"`      |
| Number       | Integer or floating-point number          | `42`, `3.14`           |
| Object       | Collection of key-value pairs             | `{"key": "value"}`     |
| Array        | Ordered list of values                    | `[1, "two", 3.0]`      |
| Boolean      | True or false value                       | `true`, `false`        |
| Null         | Represents a null or non-existent value   | `null`                 |
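
To make the table concrete, here is a small sketch of how Python's built-in `json` module converts each JSON type into a Python type. The key names in the example document (`name`, `count`, and so on) are made up for illustration.

```python
import json

# One JSON document that uses every basic JSON type from the table above.
# The key names are hypothetical and only serve as labels.
text = (
    '{"name": "exampleString", "count": 42, "ratio": 3.14, '
    '"tags": [1, "two", 3.0], "valid": true, "notes": null}'
)

parsed = json.loads(text)

# Print each value together with the Python type it was converted to.
print(f"whole document: {type(parsed).__name__}")
for key, value in parsed.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

JSON objects become `dict`, arrays become `list`, `true`/`false` become `True`/`False`, and `null` becomes `None`.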



## The Built-In `json` Library

| Code | Description |
| :-- | :-- |
| **Reading JSON** |  |
| **`text = pathlib.Path('myfile.json').read_text()`** | Read a text file into a string. |
| **`data = json.loads(text)`** | Convert JSON-formatted text to a Python data structure. |
| --- | --- |
| **Writing JSON** | |
| **`text = json.dumps(data, indent=3)`** | Convert a Python data structure to a JSON-formatted text string. |
| **`pathlib.Path("myfile.json").write_text(text)`** | Write the text to a file. |
| **`pathlib.Path("data/myfile.json").parent`** | Get the parent directory of "myfile.json" (in this case, "data"). |
| **`pathlib.Path("data").mkdir(exist_ok=True, parents=True)`** | Create a folder at the path, along with any missing parent folders. |
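
The calls in the table can be chained into a full write-then-read round trip. The sketch below uses a temporary folder and a made-up `data` dictionary so that it leaves no files behind; in the exercises you would write to a real folder instead.

```python
import json
import tempfile
from pathlib import Path

data = {"subject": "XTR2", "session": 3}  # hypothetical example metadata

# Work inside a temporary folder that is deleted when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data" / "myfile.json"

    # Writing: make sure the parent folder exists, then dump and write.
    path.parent.mkdir(exist_ok=True, parents=True)
    path.write_text(json.dumps(data, indent=3))

    # Reading: read the text back and parse it.
    loaded = json.loads(path.read_text())

# The parsed result should equal the original data structure.
print(loaded == data)
```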



**Exercises**

In [12]:
import json
from pathlib import Path

**Example**: Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): *The researcher, Sam Vimes, ran Session Number 3 with Subject XTR2 on February 4th, 2022.*

In [43]:
text = '{"Researcher": "Sam Vimes", "Session": 3, "Subject": "XTR2", "Date": "2022-02-04"}'
json.loads(text)

{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}
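
"Validating" here simply means that `json.loads` parses the text without raising an error. As a rough sketch, the hypothetical helper `is_valid_json` below wraps that check and tries it on two common mistakes: single-quoted strings and a trailing comma, both of which are fine in Python code but not in JSON.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


valid = '{"Researcher": "Sam Vimes", "Session": 3}'
single_quotes = "{'Researcher': 'Sam Vimes', 'Session': 3}"    # JSON strings need double quotes
trailing_comma = '{"Researcher": "Sam Vimes", "Session": 3,}'  # JSON does not allow a trailing comma

for label, text in [("valid", valid), ("single quotes", single_quotes), ("trailing comma", trailing_comma)]:
    print(label, "->", is_valid_json(text))
```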

**Example**: Save this data to an appropriately-named file.

In [44]:
path = Path("data1/session.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

82

**Example**: Read the data from the file back into a Python data structure.

In [45]:
data = json.loads(path.read_text())
data

{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}

---

Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): *The EEG amplifier's low-pass filter was set to 200 Hz, its high-pass filter to 0.2 Hz, and its notch filter (which was set to 50 Hz) was turned on.*


In [36]:
text = '{"lowpass_hz": 200, "highpass_hz": 0.2, "notch_hz": 50, "notch": true}'
json.loads(text)

{'lowpass_hz': 200, 'highpass_hz': 0.2, 'notch_hz': 50, 'notch': True}

Save this data to an appropriately-named file.

In [38]:
path = Path("data1/amplifier_settings.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

70

Read the data from the file back into a Python data structure.

In [41]:
data = json.loads(path.read_text())
data

{'lowpass_hz': 200, 'highpass_hz': 0.2, 'notch_hz': 50, 'notch': True}

---

Translate the following sentence to a Python data structure, then use the `json` library to convert it to JSON-formatted text: *Three electrodes were implanted into subject "Pinky", a Sprague-Dawley rat: one in the hippocampus (channel 3), one in the visual cortex (channel 4), and one in the motor cortex (channel 6).*

In [29]:
data = {
    "subject": "Pinky",
    "species": "Sprague-Dawley",
    "electrodes": [
        {"channel": 3, "region": "hippocampus"},
        {"channel": 4, "region": "visual cortex"},
        {"channel": 6, "region": "motor cortex"},
    ]
}
text = json.dumps(data, indent=3)
print(text)

{
   "subject": "Pinky",
   "species": "Sprague-Dawley",
   "electrodes": [
      {
         "channel": 3,
         "region": "hippocampus"
      },
      {
         "channel": 4,
         "region": "visual cortex"
      },
      {
         "channel": 6,
         "region": "motor cortex"
      }
   ]
}


Save the JSON data to an appropriately-named file.

In [31]:
Path("data1/recording2.json").write_text(text)

303

Read the file back into a Python data structure.

In [35]:
json.loads(Path("data1/recording2.json").read_text())


{'subject': 'Pinky',
 'species': 'Sprague-Dawley',
 'electrodes': [{'channel': 3, 'region': 'hippocampus'},
  {'channel': 4, 'region': 'visual cortex'},
  {'channel': 6, 'region': 'motor cortex'}]}

---

Translate the following sentence to a Python data structure and save it to a JSON file: *The image has a width of 1080 pixels and a height of 720 pixels, and was saved in RGB format. The camera settings had an exposure time of 8 milliseconds, an aperture of 2.8 stops, and an ISO setting of 100.*

In [48]:
data = {
  "image": {
    "height": 720,
    "width": 1080,
    "format": "RGB",
  },
  "camera": {
    "exposure": 0.008,
    "aperture": 2.8,
    "iso": 100,
  }
}
Path("data1/image_settings.json").write_text(json.dumps(data))

118

Read the file back to check that it was saved correctly.

In [49]:
json.loads(Path("data1/image_settings.json").read_text())

{'image': {'height': 720, 'width': 1080, 'format': 'RGB'},
 'camera': {'exposure': 0.008, 'aperture': 2.8, 'iso': 100}}

---

Run the following code to generate the `image_data` folder, which contains one `session.json` file of image acquisition parameters per session:

In [63]:
import json, random
from pathlib import Path

random.seed(42)

for _ in range(10):

    # Generate random parameters
    params = {
        "exposure_time": random.choice([100, 200, 300]),  # milliseconds
        "laser_power": random.choice([5, 10, 15]),  # milliwatts
        "num_frames": random.randint(200, 400),
        "frame_rate": random.choice([10, 20, 30]),  # Hz
        "region_of_interest": random.choice(["ROI1", "ROI2", "ROI3"]),
    }
    if random.random() > 0.5:
        params['start_time'] = random.randint(1, 5000)  # seconds

    # Write the data to a json file
    session_num = random.randint(1, 300)
    experimenter = random.choice(["Sophie", "Florian"])
    path = Path(f"image_data/{experimenter}_{session_num}/session.json")
    path.parent.mkdir(parents=True, exist_ok=True)
    json_text = json.dumps(params, indent=3)
    path.write_text(json_text)


Read and parse the JSON-formatted data in session 72 to get the exposure time.

In [119]:
data = json.loads(Path("image_data/Sophie_72/session.json").read_text())
data['exposure_time']

300

Read and parse the JSON-formatted data in session 177 to get the frame rate.

In [118]:
data = json.loads(Path("image_data/Florian_177/session.json").read_text())
data['frame_rate']

10

Use `list(Path().glob(pattern))` to list all the JSON session files in the `image_data` folder (tip: use the wildcard "*" wherever there are variable parts in the filename).

In [112]:
list(Path().glob("image_data/*/session.json"))
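
The wildcard only matches folders whose names actually fit the pattern. The session folders here are named `{experimenter}_{session_num}`, so a pattern such as `session_*` matches nothing and `glob` returns an empty list. The sketch below illustrates this in a temporary folder with two made-up session folders; the folder names are assumptions for the example only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create two hypothetical session folders, each with an empty JSON file.
    for folder in ["Sophie_1", "Florian_2"]:
        path = root / "image_data" / folder / "session.json"
        path.parent.mkdir(parents=True)
        path.write_text("{}")

    # "*" stands in for the whole variable folder name.
    all_sessions = sorted(p.parent.name for p in root.glob("image_data/*/session.json"))

    # A partial wildcard restricts the match to one experimenter.
    sophie_only = sorted(p.parent.name for p in root.glob("image_data/Sophie_*/session.json"))

    # No folder name starts with "session_", so nothing matches.
    no_match = list(root.glob("image_data/session_*/session.json"))

print(all_sessions)
print(sophie_only)
print(no_match)
```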

Read and parse all the `session.json` files and put them into a Pandas DataFrame. Here is a code template to help you get started:

```python
sessions = []
for path in Path().glob("image_data/Sophie_16/session.json"):
    text = path.read_text()
    session = {"A": 3}
    sessions.append(session)

df = pd.DataFrame(sessions)
df
```

In [65]:
import pandas as pd


Read and parse all the `session.json` files and put them into a Pandas DataFrame.

In [66]:
sessions = []
for path in Path().glob("image_data/*/session.json"):
    session = json.loads(path.read_text())
    sessions.append(session)

df = pd.DataFrame(sessions)
df

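|   | exposure_time | laser_power | num_frames | frame_rate | region_of_interest | start_time |
|---|---------------|-------------|------------|------------|--------------------|------------|
| 0 | 300 | 15 | 292 | 30 | ROI1 | 376.0 |
| 1 | 100 | 10 | 226 | 10 | ROI2 | NaN |
| 2 | 100 | 15 | 317 | 30 | ROI1 | 3101.0 |
| 3 | 200 | 15 | 271 | 10 | ROI1 | 2788.0 |
| 4 | 300 | 15 | 339 | 10 | ROI3 | NaN |
| 5 | 100 | 10 | 297 | 20 | ROI3 | 1800.0 |
| 6 | 100 | 5 | 225 | 20 | ROI2 | NaN |
| 7 | 100 | 5 | 329 | 30 | ROI1 | 4465.0 |
| 8 | 300 | 5 | 206 | 30 | ROI2 | NaN |
| 9 | 200 | 10 | 253 | 30 | ROI2 | 585.0 |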
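
The missing `start_time` values in the table above come from session files that do not contain that key at all. When pandas builds a DataFrame from a list of dictionaries, it fills absent keys with `NaN`, which also turns the column into floats. A minimal sketch with two made-up sessions:

```python
import pandas as pd

# Two hypothetical sessions: the second one has no "start_time" key.
sessions = [
    {"exposure_time": 300, "start_time": 376},
    {"exposure_time": 100},
]

df = pd.DataFrame(sessions)
print(df)
print(df["start_time"].dtype)

# Keep only the sessions that recorded a start time.
with_start = df.dropna(subset=["start_time"])
print(len(with_start))
```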


Read and parse all the `session.json` files and put them into a Pandas DataFrame, this time including the experimenter name and the session ID, both taken from the parent folder's name (tip: `path.parent.name`), and the path to the parent folder for later analysis (e.g. to load other data files from that session).

In [123]:
sessions = []
for path in Path().glob("image_data/*/session.json"):
    session = {}
    experimenter, session_id = path.parent.name.split('_')
    session['session_id'] = session_id
    session['experimenter'] = experimenter
    session |= json.loads(path.read_text())
    sessions.append(session)

df = pd.DataFrame(sessions)
df

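|   | session_id | experimenter | exposure_time | laser_power | num_frames | frame_rate | region_of_interest | start_time |
|---|------------|--------------|---------------|-------------|------------|------------|--------------------|------------|
| 0 | 117 | Florian | 300 | 15 | 292 | 30 | ROI1 | 376.0 |
| 1 | 177 | Florian | 100 | 10 | 226 | 10 | ROI2 | NaN |
| 2 | 41 | Florian | 100 | 15 | 317 | 30 | ROI1 | 3101.0 |
| 3 | 143 | Sophie | 200 | 15 | 271 | 10 | ROI1 | 2788.0 |
| 4 | 16 | Sophie | 300 | 15 | 339 | 10 | ROI3 | NaN |
| 5 | 167 | Sophie | 100 | 10 | 297 | 20 | ROI3 | 1800.0 |
| 6 | 187 | Sophie | 100 | 5 | 225 | 20 | ROI2 | NaN |
| 7 | 215 | Sophie | 100 | 5 | 329 | 30 | ROI1 | 4465.0 |
| 8 | 72 | Sophie | 300 | 5 | 206 | 30 | ROI2 | NaN |
| 9 | 88 | Sophie | 200 | 10 | 253 | 30 | ROI2 | 585.0 |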
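
The solution above stores the session ID as text and does not yet record the folder path. One possible variation, sketched below on two made-up session folders in a temporary directory, converts the ID to an integer and adds a `folder` entry (a hypothetical column name) so that other files from the same session can be located later.

```python
import json
import tempfile
from pathlib import Path

sessions = []
with tempfile.TemporaryDirectory() as tmp:
    # Create two hypothetical session folders with minimal parameters.
    examples = [("Sophie_72", {"exposure_time": 300}), ("Florian_177", {"exposure_time": 100})]
    for folder, params in examples:
        path = Path(tmp) / "image_data" / folder / "session.json"
        path.parent.mkdir(parents=True)
        path.write_text(json.dumps(params))

    # Sort the paths so the row order is predictable.
    for path in sorted(Path(tmp).glob("image_data/*/session.json")):
        experimenter, session_id = path.parent.name.split("_")
        session = {
            "session_id": int(session_id),   # store the ID as a number
            "experimenter": experimenter,
            "folder": str(path.parent),      # hypothetical column for later file lookups
        }
        session.update(json.loads(path.read_text()))
        sessions.append(session)

for session in sessions:
    print(session["experimenter"], session["session_id"], session["exposure_time"])
```

The resulting `sessions` list can be passed to `pd.DataFrame(sessions)` in the same way as in the solution above.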
