<div>
<img src="../image/images.png" width="100%"/>
</div>

<h1>
    <center>
        H64 Data Product
        <br>
        <br>
        Visualisation
    </center>
</h1>

### **Introduction and Objective**
In this notebook, we explore the H64 precipitation dataset, a high-resolution satellite-based product from the Hydrology SAF (H-SAF) initiative. The goal is to offer a guide for visualizing the H64 data downloaded in the previous notebook for the desired region and time frame. 


### **Library Imports**
The necessary Python libraries for data manipulation, visualization and handling the H64 data products are imported. This section includes standard libraries, third-party modules for advanced processing, and custom functions for H64 data handling.

In [None]:
import os
import math
from datetime import datetime

import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import Polygon
import ipywidgets as widgets

from hsaf_data_access import HSAFDataAccess as data_access

### **Data Reading and Pre-processing**

The cell below performs the preliminary file handling and layout preparation needed to visualize H64 data stored in NetCDF format (`.nc` files). It lists the `.nc` files in the `data` directory, sorts them alphabetically, and determines the number of rows and columns needed to plot them in a square grid layout.

1. **File Listing and Sorting**:  
   The cell begins by scanning the `data` directory to identify all files with the `.nc` extension, which are assumed to contain H64 precipitation data. These files are then sorted alphabetically to ensure a consistent processing order. A validation step ensures that the directory contains relevant files. If no `.nc` files are found, an error is raised, guiding the user to check the data directory.

2. **Grid Layout Calculation**:  
   To prepare for data visualization, the number of files is used to calculate the dimensions of a square grid layout for subplots. This ensures that the visualization can accommodate all files efficiently while maintaining a clean and organized appearance.

In [None]:
# Directory that holds the downloaded H64 NetCDF files
data_path = 'data'

# List the .nc files in the data directory
filelist = [f for f in os.listdir(data_path) if f.endswith('.nc')]
# Stop early if no NetCDF files were found
if not filelist:
    raise ValueError("No .nc files found in the directory: {}".format(data_path))
# Sort alphabetically for a consistent processing order
filelist.sort()
    
# Take the ceiling of the square root of the number of files to get the subplot grid dimension
plt_dim = math.ceil(math.sqrt(len(filelist)))
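
To see how the grid dimension behaves for different file counts, the calculation can be sketched as a small standalone function. This is a minimal illustration, not part of the notebook's workflow: `grid_shape` is a hypothetical helper, and the row count assumes panels wrap after `cols` columns, as `col_wrap` does in the plotting cell further down.

```python
import math

def grid_shape(n_files):
    """Return (rows, cols) for a near-square grid holding n_files panels.

    Hypothetical helper for illustration only.
    """
    if n_files < 1:
        raise ValueError("n_files must be at least 1")
    cols = math.ceil(math.sqrt(n_files))  # same formula as plt_dim
    rows = math.ceil(n_files / cols)      # rows needed when wrapping after cols panels
    return rows, cols

# Print the grid shape for a few file counts
for n in (1, 4, 5, 7, 10):
    print(n, grid_shape(n))
```

For file counts that are not perfect squares, the last row is only partly filled, so some grid positions stay empty.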


This cell initiates the data exploration process by loading and previewing the contents of the first H64 data file.

1. **File Loading**:  
   The path to the first `.nc` file in the sorted file list is constructed, and the file is loaded using `xarray` with the NetCDF4 engine.

2. **Data Conversion**:  
   The loaded dataset is converted into a pandas DataFrame for tabular exploration of the dataset's structure and attributes.

3. **Preview**:  
   The `.head()` method displays the first five rows of the DataFrame to provide a quick glance at the dataset's structure and contents. Also, the dataset object `ds` is displayed to examine its metadata, dimensions, and variable attributes directly in its native xarray structure.

In [None]:
# Load the first file
file_path = os.path.join('data', filelist[0])
ds = xr.open_dataset(file_path, engine='netcdf4')

# Convert to DataFrame and view the first 5 rows
df = ds.to_dataframe().reset_index()
df.head()

In [None]:
ds

This cell processes and visualizes precipitation data from the downloaded NetCDF files. It:  

1. **Loads and Combines Data**: Opens all NetCDF files in the directory and combines them along their shared coordinates (`combine='by_coords'`).  
2. **Extracts Time Range**: Calculates the earliest (`datestart`) and latest (`dateend`) timestamps in the dataset.  
3. **Plots Rainfall Data**: Visualizes the accumulated rainfall (`acc_rr`) on subplots for each time step, laid out in a grid (`col_wrap=plt_dim`). The color scale (`cmap="rainbow"`) represents rainfall intensity. This highlights the distribution of rainfall over time, offering a spatial representation of precipitation intensity across the study area. Each map shows rainfall estimates for a specific time period, helping to uncover trends and variations in precipitation patterns. 

4. **Adds Dynamic Titles**: Sets a descriptive title based on the time range of the data.  
5. **Saves Visualization**: Exports the plot as an image file in the `output` directory.  

For statistical analysis of the dataset, follow the `Analysis` link at the bottom of this notebook.

In [None]:
# Make sure the output directory exists before saving the figure
os.makedirs('output', exist_ok=True)

with xr.open_mfdataset([os.path.join('data', f) for f in filelist], engine='netcdf4', combine='by_coords') as ds:
    # Earliest and latest timestamps in the combined dataset
    datestart = pd.to_datetime(ds['time'].min().values)
    dateend = pd.to_datetime(ds['time'].max().values)

    # One map per time step, wrapped into a grid that is plt_dim columns wide
    aa = ds['acc_rr'].plot(x="longitude", y="latitude", col="time", col_wrap=plt_dim, cmap="rainbow", cbar_kwargs={"label": "Accumulated rainfall [mm]"})

    # Build the title from the time range (datestart <= dateend always holds for min/max)
    if datestart == dateend:
        # Single time step
        title = f"{datestart.date()} H64 rainfall"
    else:
        if datestart.month != dateend.month:
            # Different months: show day and month for the start date
            title_start = datestart.strftime("%d/%m")
        else:
            # Same month, different days: show only the day
            title_start = datestart.strftime("%d")
        title_end = dateend.strftime("%d/%m/%Y")
        title = f"{title_start} - {title_end} H64 rainfall"
    aa.fig.suptitle(title, fontsize=12)
    aa.fig.subplots_adjust(top=0.9, right=0.82)
    aa.fig.savefig('output/H64_Daily_Rainfall_plot.png', dpi=300)
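
The title logic in the cell above can be isolated as a small function, which makes the three cases (single date, same month, different months) easy to check without loading any data. This is a minimal sketch using only the standard library; `build_title` and its `product` parameter are hypothetical names introduced for illustration.

```python
from datetime import datetime

def build_title(datestart, dateend, product="H64"):
    """Return a plot title describing the period from datestart to dateend.

    Hypothetical helper for illustration only.
    """
    if datestart > dateend:
        raise ValueError("Start date is after the end date.")
    if datestart == dateend:
        # Single time step: show the full date
        return f"{datestart.date()} {product} rainfall"
    # Show the month on the start date only when it differs from the end month
    start_fmt = "%d/%m" if datestart.month != dateend.month else "%d"
    title_start = datestart.strftime(start_fmt)
    title_end = dateend.strftime("%d/%m/%Y")
    return f"{title_start} - {title_end} {product} rainfall"

print(build_title(datetime(2024, 3, 1), datetime(2024, 3, 9)))
print(build_title(datetime(2024, 2, 27), datetime(2024, 3, 2)))
```

As in the notebook cell, only the months are compared, so a period that spans more than one year would still omit the start year from the title.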

Assuming the area of interest (AOI) is available as a shapefile, this cell overlays it on the precipitation plots.

1. **Generate AOI Selection Widget**:  
   - The `create_shp_widgets` method from `data_access` generates an interactive widget, allowing users to select a shapefile representing the AOI.

2. **Read AOI Shapefile**:  
   - Reads the selected shapefile into a GeoDataFrame (`aoi_gdf`) using `geopandas`. This GeoDataFrame represents the boundaries or geometry of the AOI.

3. **Overlay AOI on Precipitation Plots**:  
   - Iterates over each subplot (`ax`) in the precipitation plots and overlays the AOI boundaries (`aoi_gdf.boundary`) using black lines for clear visual delineation of the region.

4. **Save the Updated Visualization**:  
   - Exports the updated precipitation plot with the AOI overlay as a high-resolution image file for documentation or further analysis.

In [None]:
aoi_shp = data_access.create_shp_widgets()
aoi_shp

In [None]:
aoi_gdf = gpd.read_file(aoi_shp.children[0].value)
for ax in aa.axs.flat:
    # Plot the AOI boundary on top of each plot
    aoi_gdf.boundary.plot(ax=ax, color='black', linewidth=0.1)

aa.fig.savefig('output/H64_Daily_Rainfall_plot.png', dpi=300)
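
The cell above saves to the same filename as the earlier plot, so the version without the AOI overlay is overwritten. If both versions should be kept, one option is to derive a separate filename, as in the sketch below. The helper `output_path` and the `_aoi` suffix are assumptions made for illustration and are not part of the original workflow.

```python
from pathlib import Path

def output_path(name, suffix="", out_dir="output"):
    """Create out_dir if needed and return out_dir/<stem><suffix><ext>.

    Hypothetical helper for illustration only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # no error if the directory already exists
    p = Path(name)
    return out / f"{p.stem}{suffix}{p.suffix}"

# Possible usage in the notebook (the "_aoi" suffix is an assumed naming choice):
# aa.fig.savefig(output_path("H64_Daily_Rainfall_plot.png", "_aoi"), dpi=300)
```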

<table style="width: 100%; border-collapse: collapse;">
    <tr>
        <td style="text-align: left; font-size: 15px;">
            <a href="data_download.ipynb" style="text-decoration:none;">&larr; Data Download</a>
        </td>
        <td style="text-align: right; font-size: 15px;">
            <a href="data_analysis.ipynb" style="text-decoration:none;">Analysis &rarr;</a>
        </td>
    </tr>
</table>