**MATLAB to HDF5 Converter for CLIMADA v6**


In CLIMADA v3, it was possible to convert matlab `.mat` files to HDF5. However, this feature was deprecated in newer CLIMADA releases (v6). This notebook documents a replacement approach to perform that same conversion **without** needing to downgrade or maintain older versions of CLIMADA.

### Approach

1. I studied the HDF5 structure produced by CLIMADA v3  
2. Reconstructed the expected HDF5 schema (groups, datasets, dimensions)  
3. Wrote a custom script to convert the MATLAB `.mat` hazard data into this HDF5 structure
4. Downloaded the file as a hdf5 format into the working folder 
5. Tested 4 in the example script in the CLIMADA codebase.

### How to use
Download this notebook and add it into the 'script' folder in the climada codebase. 
You can either **A**. create a subfolder named 'custom' and add the notebook: {'script/custom/notebook'}  or **B**. add it directly into the 'script' folder: {'script/notebook'}.

if you do A, you don't have to change the file path when using the function but make sure to use the correct file path with B

### Example File 

The hdf5 data downloaded into the applications/eca_san_salvador are: 'Salvador_hazard_FL_2015.hdf5 and 
'Salvador_hazard_FL_2040_extreme_cc.hdf5'.
They have been tested and works with all the .ipynb file in the same folder


In [3]:
from climada.util.hdf5_handler import get_sparse_csr_mat
from climada.hazard.centroids import Centroids
from climada.hazard import Hazard
import numpy as np
import h5py

def convert_mat_to_hdf5(mat_file, hdf5_file):
    with h5py.File(mat_file, 'r') as f:
    
        #define the vars_oblig [units, "units","centroids", "event_id", "frequency", "intensity", "fraction", } based on the Hazard class in climada

        #extract units 
        unit_value = f['hazard/units'][()]
        units = chr(int(unit_value.item()))
    
        # extract centroids (lat/lon)
        lats = f['hazard/lat'][()].flatten()
        lons = f['hazard/lon'][()].flatten()
        centroids = Centroids(lat=lats, lon=lons)

        #extract event_id
        event_id = f['hazard/event_ID'][()].flatten() if 'hazard/event_ID' in f else None
        if event_id is not None:
            if event_id.size == 1:
                event_id = int(event_id[0])
            else:
                event_id = event_id.astype(int)

        #extract frequency
        frequency = f['hazard/frequency'][()].flatten() if 'hazard/frequency' in f else None

        # extract intensity
        intensity_dict = {
        'data': f['hazard/intensity/data'][()],
        'ir': f['hazard/intensity/ir'][()],
        'jc': f['hazard/intensity/jc'][()]
        }

        shape_intensity = (len(event_id), len(lats))  # e.g. (n_centroids, n_events)

        intensity_csr = get_sparse_csr_mat(intensity_dict, shape_intensity)

        # extract fraction
        fraction_dict = {
        'data': f['hazard/fraction/data'][()],
        'ir': f['hazard/fraction/ir'][()],
        'jc': f['hazard/fraction/jc'][()]
        }
        shape_fraction = (len(event_id), len(lats))  # e.g. (n_centroids, n_events)

        fraction_csr = get_sparse_csr_mat(fraction_dict, shape_fraction)

        # define var_def {"date", "orig", "event_name",  "frequency_unit"} 

        #extract date
        datenum = f['hazard/datenum'][()].flatten()
        datenum_int = datenum.astype(int)
        date = date = np.insert(datenum_int[:-1], 0, 1)

        #extract event_name 
        name = f['hazard/name']
        event_name= []

        for i in range(name.shape[0]):
            ref_array = name[i]
            ref = ref_array[0]
            obj = f.file[ref]
            ascii_data = obj[()]
            s = ''.join(
                chr(int(x.item() if isinstance(x, np.ndarray) else x))
                for x in ascii_data
            )
            event_name.append(s)
            
          # build hazard
            haz = Hazard(
                haz_type='FL',
                centroids=centroids,
                event_id=event_id,
                event_name=event_name,
                intensity=intensity_csr,
                fraction=fraction_csr,
                frequency=frequency,
                units=units,
                date=date,
            )

            # save to HDF5
            haz.write_hdf5(hdf5_file)
       

In [14]:
convert_mat_to_hdf5("../applications/eca_san_salvador/Salvador_hazard_FL_2015.mat", "../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5")
convert_mat_to_hdf5("../applications/eca_san_salvador/Salvador_hazard_FL_2040_extreme_cc.mat", "../applications/eca_san_salvador/Salvador_hazard_FL_2040_extreme_cc.hdf5")

2025-07-06 17:31:47,301 - climada.hazard.io - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:47,304 - climada.hazard.centroids.centr - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:47,953 - climada.hazard.io - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:47,956 - climada.hazard.centroids.centr - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:48,595 - climada.hazard.io - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:48,598 - climada.hazard.centroids.centr - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:49,266 - climada.hazard.io - INFO - Writing ../applications/eca_san_salvador/Salvador_hazard_FL_2015.hdf5
2025-07-06 17:31:49,269 - climada.hazard.centroids.centr - INFO - Writing ../applications/eca_san_sa