# Dataset Conversion: MATLAB to JSON

This notebook converts battery test data from MATLAB `.mat` files to a unified JSON format for easier processing and analysis.

## Purpose
- Load battery measurement data from MATLAB files in the `CU_Cyclic` and `CYC_Cyclic` directories
- Extract the `Dataset` variable from every cell file in each checkup directory (for example `CU000_cyc`)
- Convert numpy arrays and data types to JSON-serializable format
- Save one JSON file per checkup as `json/<aging_process>/<checkup_name>.json`

## Output Structure
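Each resulting JSON file holds one checkup and maps every cell file (file name without the `.mat` extension) to its measurement fields:
```
{
  "BW-VTC-291_1842_CU_cyc_000_BW-VTC-CYC": {
    "Time": [...],
    "I": [...],        // Current measurements
    "U": [...],        // Voltage measurements
    "Ah": [...],       // Capacity measurements
    "Command": [...]   // Test commands (Charge, Discharge, Pause)
  },
  "BW-VTC-437_6497_CU_cyc_000_BW-VTC-SupPos": { ... },
  ...
}
```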
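As a minimal sketch of how such a file might be read back, the snippet below writes a tiny stand-in file and loads it with the standard `json` module. It assumes each file maps a cell name to its fields, which is how the conversion loop builds `full_dict`. The key `cell_A`, the sample values, and the temporary path are illustrative assumptions, not values from the dataset.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for one per-checkup file; real files hold full time series.
sample = {
    "cell_A": {
        "Time": [0.0, 1.0, 2.0],
        "I": [0.0, -1.5, -1.5],
        "U": [4.10, 4.05, 4.01],
        "Ah": [0.0, -0.0004, -0.0008],
        "Command": ["Pause", "Discharge", "Discharge"],
    }
}

with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "CU000_cyc.json"
    json_path.write_text(json.dumps(sample))

    # Reading the file back yields plain dicts and lists.
    with open(json_path) as f:
        checkup = json.load(f)

cell = checkup["cell_A"]
# Keep the voltage samples recorded while the command was "Discharge".
discharge_voltage = [
    u for u, cmd in zip(cell["U"], cell["Command"]) if cmd == "Discharge"
]
print(discharge_voltage)
```

Because the fields are parallel lists, pairing them with `zip` is enough for simple filtering; converting them to numpy arrays is an option when heavier numerical work is needed.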

In [5]:
# Import required libraries for data processing
from scipy.io import loadmat  # For loading MATLAB .mat files
import os
from json import dump  # For saving data to JSON format
import numpy as np  # For handling numpy arrays and data types

## Import Required Libraries
Setting up the necessary tools for reading MATLAB files, file system operations, and JSON serialization.

In [None]:
def make_json_serializable(obj):
    """
    Recursively convert numpy objects to JSON-serializable Python types.
    This function handles various numpy data types that cannot be directly serialized to JSON.
    """
    # Handle dictionaries recursively
    if isinstance(obj, dict):
        return {k: make_json_serializable(v) for k, v in obj.items()}
    
    # Convert numpy arrays to lists and process recursively
    if isinstance(obj, np.ndarray):
        return make_json_serializable(obj.tolist())
    
    # Handle lists and tuples recursively
    if isinstance(obj, (list, tuple)):
        return [make_json_serializable(x) for x in obj]
    
    # Convert numpy integer types to Python int
    if isinstance(obj, (np.integer,)):
        return int(obj)
    
    # Convert numpy floating point types to Python float
    if isinstance(obj, (np.floating,)):
        return float(obj)
    
    # Convert numpy boolean to Python bool
    if isinstance(obj, np.bool_):
        return bool(obj)
    
    # Handle other numpy generic types
    if isinstance(obj, np.generic):
        return obj.item()
    
    # Return object as-is if already JSON serializable
    return obj

path = "/media/mods-pred/Datasets/Data_Munich/"

os.makedirs("json", exist_ok=True)

allowed = {"CU_Cyclic", "CYC_Cyclic"}

for aging_process in sorted(os.listdir(path)):
    # process only the two allowed folders
    if aging_process not in allowed:
        print("SKIP:", aging_process)
        continue

    aging_dir = os.path.join(path, aging_process)
    if not os.path.isdir(aging_dir):
        print("SKIP (not dir):", aging_dir)
        continue

    os.makedirs(os.path.join("json", aging_process), exist_ok=True)

    # Process each checkup directory in sorted order
    for checkup_name in sorted(os.listdir(aging_dir)):
        checkup_dir = os.path.join(aging_dir, checkup_name)
        if not os.path.isdir(checkup_dir):
            continue

        # Skip the failed/incomplete data directory
        if checkup_name == "failed and incomplete":
            continue

        # Initialize dictionary for this checkup
        full_dict = {}

        # Process each .mat file in the checkup directory (sorted for a reproducible order)
        for cell in sorted(os.listdir(checkup_dir)):
            if not cell.lower().endswith(".mat"):
                continue
            file_path = os.path.join(checkup_dir, cell)
            try:
                mat = loadmat(file_path, squeeze_me=True, struct_as_record=False)
                data = mat.get("Dataset", None)
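            except NotImplementedError:
                # scipy raises NotImplementedError for MATLAB v7.3 (HDF5) files;
                # fall back to the optional hdf5storage package for those
                import hdf5storage
                mat = hdf5storage.loadmat(file_path)  # returns Python types / dicts
                data = mat.get("Dataset", None)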
            except Exception as e:
                print("ERROR loading (scipy):", file_path, e)
                continue

            # normalize common matlab returns into a plain dict
            if data is None:
                print(f"Skipping {file_path}: no 'Dataset' variable")
                continue

            # sometimes stored as 0-d object array containing the struct
            if isinstance(data, np.ndarray) and data.dtype == np.object_:
                try:
                    data = data.item()
                except Exception:
                    pass

            # matlab structs can come as objects with __dict__
            if hasattr(data, "__dict__"):
                data = data.__dict__

            # now ensure it's a dict; if not, skip and log type for debugging
            if not isinstance(data, dict):
                print(f"Skipping {file_path}: Dataset not a dict (type={type(data)})")
                continue

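            # copy all fields except MATLAB bookkeeping and the Info struct,
            # keyed by the file name without its extension
            cell_name = os.path.splitext(cell)[0]
            full_dict[cell_name] = {
                key: value
                for key, value in data.items()
                if key not in ("_fieldnames", "Info")
            }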

        # Save the complete dataset to JSON file
        to_save = f"json/{aging_process}/{checkup_name}.json"
        with open(to_save, "w") as f:
            dump(make_json_serializable(full_dict), f)

SKIP: CU_Calendar
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-291_1842_CU_cyc_000_BW-VTC-CYC.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-437_6497_CU_cyc_000_BW-VTC-SupPos.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-230_2508_CU_cyc_000_BW-VTC-CYC.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-425_5058_CU_cyc_000_BW-VTC-SupPos.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-416_5162_CU_cyc_000_BW-VTC-SupPos.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-267_1073_CU_cyc_000_BW-VTC-CYC.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-323_1858_CU_cyc_000_BW-VTC-CYC.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-406_3364_CU_cyc_000_BW-VTC-SupPos.mat
LOAD: /media/mods-pred/Datasets/Data_Munich//CU_Cyclic/CU000_cyc/BW-VTC-407_5017_CU_cyc_000_BW-VTC-SupPos.mat
LOAD

KeyboardInterrupt: 

## Data Processing and Conversion
This section handles the conversion of MATLAB data to JSON format, including:
1. **JSON Serialization Function**: Converts numpy arrays and data types to JSON-compatible Python types
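2. **Data Extraction**: Loads the `Dataset` variable from each `.mat` file, falling back to `hdf5storage` when `scipy.io.loadmat` cannot read the file
3. **File Processing**: Processes every cell file in each checkup directory of `CU_Cyclic` and `CYC_Cyclic`, dropping the `Info` field
4. **JSON Export**: Saves one file per checkup to `json/<aging_process>/<checkup_name>.json`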
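One detail the serialization step leaves open is non-finite floats: Python's `json` module writes `NaN` and `Infinity` by default, which strict JSON parsers reject. Whether the measurements contain such values is not known from this notebook, so the following is only a hedged sketch of an optional extra pass that could run after `make_json_serializable`. The helper name `replace_non_finite` and the sample record are hypothetical.

```python
import json
import math

def replace_non_finite(obj):
    """Recursively replace NaN and +/-inf floats with None (JSON null)."""
    if isinstance(obj, dict):
        return {k: replace_non_finite(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(x) for x in obj]
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    return obj

# Hypothetical record with one NaN and one infinite sample.
record = {"U": [4.1, float("nan"), 4.0], "I": [0.0, float("inf"), -1.5]}

cleaned = replace_non_finite(record)

# allow_nan=False makes json raise ValueError if a non-finite float slips through.
text = json.dumps(cleaned, allow_nan=False)
print(text)
```

Mapping non-finite values to `null` keeps the lists the same length as `Time`, so samples stay aligned across fields; dropping them instead would break that alignment.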

## Conversion Complete ✅

Once the loop has finished, the `json/` directory contains one file per checkup. Each file holds:

- **Battery cells**: Measurements for every cell file found in the checkup directory
- **Time series data**: Current (I), Voltage (U), Capacity (Ah), and Commands
- **Checkups**: One file for each checkup directory in `CU_Cyclic` and `CYC_Cyclic` (for example `CU000_cyc`)
- **Format**: JSON-serialized data ready for analysis

### Next Steps
The generated JSON files can now be used for:
- Battery state analysis
- Degradation studies  
- Machine learning applications
- Data visualization
- Statistical analysis
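As a starting point for any of these uses, the sketch below walks a directory tree laid out as `<aging_process>/<checkup_name>.json`, which is the layout the conversion loop writes, and counts the samples per cell. The function name `summarize`, the cell key `cell_A`, and the stand-in files are assumptions made for illustration; the sketch also assumes every cell has a list-valued `Time` field.

```python
import json
import tempfile
from pathlib import Path

def summarize(json_root):
    """Return {(aging_process, checkup, cell): n_samples} for all JSON files."""
    summary = {}
    for json_file in sorted(Path(json_root).glob("*/*.json")):
        aging_process = json_file.parent.name
        checkup = json_file.stem
        with open(json_file) as f:
            cells = json.load(f)
        for cell_name, fields in cells.items():
            # Use the length of the Time field as the sample count.
            n_samples = len(fields.get("Time", []))
            summary[(aging_process, checkup, cell_name)] = n_samples
    return summary

# Build a tiny stand-in tree so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "CU_Cyclic"
    out_dir.mkdir()
    (out_dir / "CU000_cyc.json").write_text(
        json.dumps({"cell_A": {"Time": [0, 1, 2]}})
    )
    (out_dir / "CU001_cyc.json").write_text(
        json.dumps({"cell_A": {"Time": [0, 1]}})
    )
    summary = summarize(tmp)

for key, n_samples in summary.items():
    print(key, n_samples)
```

On the real output, `summarize("json")` would be the corresponding call.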

### File Location
📁 `json/<aging_process>/<checkup_name>.json` - Generated under the `json` directory in the current working directory