# TAMSAT ALERT WRSI Drought IBF

## impact function 

1. **Import Required Modules**:
   - `ImpactFuncSet` from `climada.entity`: Used to handle and analyze impact functions.
   - `ENT_TEMPLATE_XLS` from `climada.util`: Provides a default template for creating impact function Excel files.
   - `matplotlib.pyplot`: Used for visualizing the impact functions.

2. **Specify Input Excel File**:
   ```python
   file_name = '../../ibf_drought_impact_ea_v0.xlsx'
   ```
   - Specifies the path to the Excel file containing impact functions (e.g., drought impact functions for East Africa).

3. **Load Impact Functions**:
   ```python
   imp_set_xlsx = ImpactFuncSet.from_excel(file_name)
   ```
   - Reads the impact functions from the specified Excel file into an `ImpactFuncSet` object (`imp_set_xlsx`).
   - The file is expected to follow the CLIMADA impact function template format.

4. **Plot Impact Functions**:
   ```python
   imp_set_xlsx.plot()
   ```
   - Visualizes the impact functions in the `ImpactFuncSet`.
   - Each function is typically plotted with hazard intensity on the x-axis and damage fraction on the y-axis.

---

### Purpose:
The code loads drought impact functions from an Excel file and plots them to visualize how hazard intensity (e.g., drought severity) affects the exposed entities (e.g., crops, infrastructure) in East Africa. This helps in understanding the relationship between hazard and impact for different scenarios.

In CLIMADA, an impact function describes the relationship between hazard intensity and the resulting damage or loss. It is defined by MDD and PAA, from which MDR is derived:

- MDD: Mean Damage Degree. The mean fraction of value lost by the affected assets at a given hazard intensity.
- PAA: Percentage of Affected Assets. The fraction of exposed assets that are affected at a given hazard intensity.
- MDR: Mean Damage Ratio. The product MDD × PAA, i.e. the expected loss as a fraction of the total exposed value (between 0 and 1).

Together these parameters model how a hazard (here, drought) affects the exposure as a function of its intensity.
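
In CLIMADA's convention MDR is the product of MDD and PAA. The sketch below illustrates that relationship with NumPy only. The intensity, MDD and PAA values are invented for illustration and are not taken from `ibf_drought_impact_ea_v0.xlsx`, and `mdr_at` is a hypothetical helper, not a CLIMADA function.

```python
import numpy as np

# Hypothetical impact function sampled at four hazard intensities
# (illustrative numbers only, not read from the notebook's Excel file).
intensity = np.array([0.0, 25.0, 50.0, 100.0])
mdd = np.array([0.0, 0.2, 0.5, 1.0])  # mean damage degree of affected assets
paa = np.array([0.0, 0.5, 0.8, 1.0])  # fraction of assets that are affected

# MDR is the element-wise product of MDD and PAA.
mdr = mdd * paa


def mdr_at(x):
    """Linearly interpolate the MDR curve at hazard intensity x."""
    return float(np.interp(x, intensity, mdr))


print(mdr)
print(mdr_at(37.5))
```

Linear interpolation between the sampled intensities is an assumption of this sketch; the plotted curves from `imp_set_xlsx.plot()` remain the reference for the actual functions.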

In [None]:
from climada.entity import ImpactFuncSet
from climada.util import ENT_TEMPLATE_XLS
import matplotlib.pyplot as plt

# provide absolute path of the input excel file
#file_name = ENT_TEMPLATE_XLS
file_name='ibf_drought_impact_ea_v0.xlsx'

imp_set_xlsx = ImpactFuncSet.from_excel(file_name)

imp_set_xlsx.plot()

## exposure 

1. **Import Libraries**:
   - Imports required libraries like `pandas`, `geopandas`, and `cartopy` for data manipulation and geospatial processing.

2. **Read CSV File (`ea_agr_spam.csv`)**:
   - Loads crop production statistics data into a pandas DataFrame (`ex_db`).

3. **Chunk the Data**:
   - Splits the large dataset into smaller chunks of `500,000` rows for efficient processing.

4. **Read Shapefile (`ea_ghcf_icpac.shp`)**:
   - Loads the geographic boundaries of East Africa into a GeoDataFrame (`ea_boundary`).

5. **Spatial Join**:
   - For each chunk of crop data:
     - Converts longitude and latitude into geometry points (`gdb`).
     - Performs a spatial join (`sjoin`) to associate each data point with its corresponding boundary region.
     - Appends the results to a list (`edf_cont`).

6. **Concatenate Processed Data**:
   - Combines all processed chunks into a single DataFrame (`edf1`).
   - Extracts the relevant columns (`latitude`, `longitude`, `value`) into a new DataFrame (`edf2`).

---

### Purpose:
Prepares geospatially-referenced crop production data for further analysis or modeling by linking crop data points to East African boundaries.
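
The chunking step can be sketched without `more_itertools`. The example below is a minimal illustration on a made-up five-row table standing in for `ea_agr_spam.csv`; `iter_chunks` is a hypothetical helper, and the spatial join itself is left out so that only the slicing logic is shown.

```python
import pandas as pd


def iter_chunks(df, chunk_size):
    """Yield consecutive row slices of df with at most chunk_size rows each."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]


# Hypothetical stand-in for the crop production table.
points = pd.DataFrame({
    "latitude": [0.5, 1.0, 1.5, 2.0, 2.5],
    "longitude": [36.0, 36.5, 37.0, 37.5, 38.0],
    "value": [10.0, 20.0, 30.0, 40.0, 50.0],
})

chunks = list(iter_chunks(points, chunk_size=2))

# Concatenating the chunks should give back every original row.
rebuilt = pd.concat(chunks)
print(len(chunks), len(rebuilt))
```

Checking that the concatenated result has the expected number of rows is a cheap way to confirm that no chunk was skipped before running the more expensive join.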

In [None]:
from climada.entity import Exposures

import numpy as np
from matplotlib import colors
from matplotlib import pyplot as plt
#from Configuration import *
import os
import pandas as pd

import cartopy.crs as ccrs

from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature

#file_path = lp_csv_files[5] # define the full file path of the CSV-file

from more_itertools import sliced
import geopandas as gp
CHUNK_SIZE = 500000

ex_db=pd.read_csv('ea_agr_spam.csv')

index_slices = sliced(range(len(ex_db)), CHUNK_SIZE)

ea_boundary=gp.read_file('shp_files/ea_ghcf_icpac.shp')

edf_cont=[]
for index_slice in index_slices:
    chunk = ex_db.iloc[index_slice]
    gdb = gp.GeoDataFrame(chunk, geometry=gp.points_from_xy(chunk.longitude, chunk.latitude))
    edf=gp.sjoin(ea_boundary,gdb)
    #edf1=edf[['GID_0', 'COUNTRY','gno','Nomotorway','primary','secondary','tertiary','unclassified','lon','lat', 'grid_name']]
    edf_cont.append(edf)


edf1=pd.concat(edf_cont)
edf2=edf1[['latitude','longitude','value']]
edf2

#file_path='/home/bulbul/Documents/07-2022/impact_weather_icpac/lab/ea_climada/KEN_2021.csv'
#new_exp = Exposures(pd.read_csv(file_path))
#new_exp.check()


In [None]:
new_exp = Exposures(edf2)
new_exp.check()

1. **Set Logarithmic Color Normalization**:
   ```python
   norm = colors.LogNorm(vmin=500, vmax=4.0e8)
   ```
   - Configures a logarithmic scale for normalizing data values during plotting.
   - Sets a minimum value (`vmin=500`) and a maximum value (`vmax=4.0e8`).
   - This ensures that data with large ranges is visually distinguishable in the plot.

2. **Plot Data as a Hexbin Map**:
   ```python
   ax = new_exp.plot_hexbin(norm=norm, pop_name=False, cmap='RdBu_r', buffer=1)
   ```
   - `new_exp`: An `Exposures` object containing geospatial exposure data.
   - `plot_hexbin`: Plots the exposures as a hexbin map where hexagonal bins aggregate data points.
     - `norm=norm`: Applies the logarithmic normalization defined earlier.
     - `pop_name=False`: Omits the name labels of populated places from the map.
     - `cmap='RdBu_r'`: Uses the reversed `RdBu` colormap, which runs from blue (low values) to red (high values).
     - `buffer=1`: Extends the plotting area slightly around the exposure data.

3. **Save the Plot as an Image**:
   ```python
   plt.savefig('../../ea_agr_spam_v1.png', bbox_inches='tight')
   ```
   - Saves the generated hexbin plot to a file named `ea_agr_spam_v1.png` in the specified directory.
   - `bbox_inches='tight'`: Ensures the image output tightly fits the plotted content.

---

### Purpose:
The code creates a hexbin plot of geospatial exposure data from the `new_exp` object, using a logarithmic color scale for better visualization of data spanning large ranges. The plot is saved as a PNG image for further use or analysis.
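
To see why a logarithmic scale helps here, the normalization can be written out by hand. This is a sketch of the standard log-scaling formula using only the standard library; `log_normalize` is a hypothetical helper, with `vmin` and `vmax` taken from the notebook.

```python
import math


def log_normalize(value, vmin=500.0, vmax=4.0e8):
    """Map value to [0, 1] on a log scale between vmin and vmax."""
    return (math.log(value) - math.log(vmin)) / (math.log(vmax) - math.log(vmin))


# Values spanning several orders of magnitude are spread evenly in log space.
for v in (500.0, 1.0e4, 1.0e6, 4.0e8):
    print(f"{v:>12.0f} -> {log_normalize(v):.3f}")
```

On a linear scale almost all of these values would land at the low end of the color range; on the log scale each factor of ten takes up the same share of it.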

In [None]:


norm = colors.LogNorm(vmin=500, vmax=4.0e8)

ax=new_exp.plot_hexbin(norm=norm, pop_name=False, cmap='RdBu_r', buffer=1)

#fname='/home/bulbul/Documents/07-2022/impact_weather_icpac/lab/ea_ibf_data_resources/exposure-data/gis/ea_global_background.shp'

#ax.add_geometries(Reader(fname).geometries(),ccrs.PlateCarree(),facecolor='None')

plt.savefig('ea_agr_spam_v1.png', bbox_inches='tight')


1. **Import Required Modules**:
   - `Exposures` from `climada.entity`: Used to handle geospatial exposure data for impact analysis.
   - Additional libraries like `pandas`, `numpy`, `matplotlib`, and `cartopy` are imported for data manipulation, visualization, and geospatial operations.

2. **Define CSV File Path**:
   ```python
   file_path = '../../ea_agr_spam.csv'
   ```
   - Specifies the file path to the `ea_agr_spam.csv`, which contains exposure data such as geographic coordinates, crop values, and other relevant information.

3. **Load the Exposure Data**:
   ```python
   new_exp = Exposures(pd.read_csv(file_path))
   ```
   - Reads the CSV file into a pandas DataFrame using `pd.read_csv(file_path)`.
   - Converts the DataFrame into an `Exposures` object (`new_exp`), which is specifically designed for geospatial exposure data in CLIMADA.

4. **Inspect the Loaded Exposure Object**:
   ```python
   new_exp
   ```
   - Outputs the `Exposures` object (`new_exp`) to examine its contents, such as the data structure, columns, and metadata.

---

### Purpose:
The code reads exposure data (e.g., agricultural statistics from `ea_agr_spam.csv`) into an `Exposures` object. This object can then be used for further geospatial analysis or impact modeling in CLIMADA.
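
Before handing a table to `Exposures`, it can help to confirm that it carries the columns this notebook relies on. The sketch below assumes those are the three columns selected earlier (`latitude`, `longitude`, `value`); `missing_columns` is a hypothetical helper and not part of CLIMADA.

```python
import pandas as pd

# Columns the notebook selects before building the Exposures object.
REQUIRED_COLUMNS = ("latitude", "longitude", "value")


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that df lacks, in the given order."""
    return [name for name in required if name not in df.columns]


# Hypothetical table with the value column forgotten.
incomplete = pd.DataFrame({"latitude": [0.5], "longitude": [36.0]})
print(missing_columns(incomplete))
```

An empty list means the table has at least the columns used here; `new_exp.check()` remains the authoritative validation.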

In [None]:
from climada.entity import Exposures

import numpy as np
from matplotlib import colors
from matplotlib import pyplot as plt
#from Configuration import *
import os
import pandas as pd

import cartopy.crs as ccrs

from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature

#file_path = lp_csv_files[5] # define the full file path of the CSV-file

file_path='ea_agr_spam.csv'


#file_path='/home/bulbul/Documents/07-2022/impact_weather_icpac/lab/ea_climada/KEN_2021.csv'
new_exp = Exposures(pd.read_csv(file_path))
new_exp

## hazard 

1. **Import Required Libraries**:
   - `xarray`: For handling multi-dimensional labeled datasets.
   - `rioxarray`: Extends xarray for geospatial raster operations.
   - `pandas`: For working with time series and tabular data.

---

2. **Define File Path for WRSI Data**:
   ```python
   wrsi_mean_path = '../../202411_wrsi/ens_mean_wrsi_2024_20241126_21.838949_51.415695_-11.745695_23.145147.nc'
   ```
   - Specifies the path to a NetCDF file containing WRSI (Water Requirement Satisfaction Index) data for November 2024.

---

3. **Open the NetCDF Dataset**:
   ```python
   db1 = xr.open_dataset(wrsi_mean_path)
   ```
   - Reads the NetCDF file into an xarray `Dataset` object (`db1`), enabling efficient handling of multi-dimensional geospatial data.

---

4. **Rename and Drop Variables**:
   ```python
   db1['spei'] = db1['__xarray_dataarray_variable__']
   db2 = db1.drop(['__xarray_dataarray_variable__'])
   ```
   - `db1['spei']`: Copies the data variable `__xarray_dataarray_variable__` under the shorter name `spei`. The name suggests SPEI (Standardized Precipitation-Evapotranspiration Index), but the values in this file are WRSI.
   - `db1.drop()`: Drops the original variable `__xarray_dataarray_variable__` from the dataset, resulting in `db2`.

---

5. **Rearrange Dimensions**:
   ```python
   db3 = db2.transpose('latitude', 'longitude')
   ```
   - Reorders the dimensions of the dataset to place `latitude` first and `longitude` second, the row/column order that raster export expects.

---

6. **Save as GeoTIFF**:
   ```python
   db3.rio.to_raster(f'../../wrsi_202412.tif', recalc_transform=False)
   ```
   - Converts the xarray dataset (`db3`) into a GeoTIFF file and saves it as `wrsi_202412.tif`.
   - `recalc_transform=False`: Prevents recalculation of the coordinate transformation during the raster export.

---

### Purpose:
This code processes a NetCDF file containing WRSI data by:
1. Renaming variables for clarity.
2. Reordering dimensions for geospatial compatibility.
3. Saving the processed data as a GeoTIFF file (`wrsi_202412.tif`), which can be used in GIS software or further analysis.

The workflow is tailored for geospatial data preparation and analysis.
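
As an alternative to copying the variable and dropping the original, xarray's `rename` relabels it in one step. The sketch below runs on a small made-up grid rather than the WRSI NetCDF file, and the name `wrsi` is an assumption chosen for clarity (the notebook itself uses `spei`).

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 2 grid standing in for the WRSI NetCDF file.
ds = xr.Dataset(
    {
        "__xarray_dataarray_variable__": (
            ("longitude", "latitude"),
            np.arange(6.0).reshape(3, 2),
        )
    },
    coords={"longitude": [30.0, 30.25, 30.5], "latitude": [1.0, 0.75]},
)

# rename() relabels the variable directly, so no copy-and-drop is needed.
ds = ds.rename({"__xarray_dataarray_variable__": "wrsi"})

# Put latitude first and longitude second before exporting to a raster.
ds = ds.transpose("latitude", "longitude")

print(list(ds.data_vars))
print(ds["wrsi"].dims)
```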

In [None]:
import xarray as xr
import rioxarray 
import pandas as pd

wrsi_mean_path=f'202411_wrsi/ens_mean_wrsi_2024_20241126_21.838949_51.415695_-11.745695_23.145147.nc'

db1=xr.open_dataset(wrsi_mean_path)
#db1=db.rename({'longitude':'lon','latitude':'lat'})

#times = pd.date_range("2024/12/","2023/02/11",freq='D')

db1['spei'] = db1['__xarray_dataarray_variable__']
db2 = db1.drop(['__xarray_dataarray_variable__'])

#db3=db2.transpose('lat', 'lon')

#db3=db2.spei.cf

db3=db2.transpose( 'latitude', 'longitude')

db3.rio.to_raster(f'../../wrsi_202412.tif',recalc_transform=False)

In [None]:
import numpy as np
from climada.hazard import Hazard

# read the GeoTIFF written by the previous cell (same '../../' location)
haz_ven = Hazard.from_raster(['../../wrsi_202412.tif'], dst_crs='epsg:4326',attrs={'frequency':np.ones(1)/2}, haz_type='DR')
haz_ven.check()
print('\n Solution 1:')
print('centroids CRS:', haz_ven.centroids.crs)
print('raster info:', haz_ven.centroids.meta)

## Impact calculation

1. **Import Impact Calculation Module**:
   ```python
   from climada.engine import ImpactCalc
   ```
   - `ImpactCalc`: A class from CLIMADA used to compute the impact of hazards on exposures using impact functions.

---

2. **Assign an Impact Function to Exposures**:
   ```python
   impact_func_id = 1  # Define or load an appropriate impact function ID
   new_exp.gdf[f'impf_DR'] = impact_func_id
   ```
   - Assigns an impact function ID (`1`) to exposures in `new_exp`. 
   - The column `impf_DR` specifies the impact function for the `DR` hazard type (e.g., drought). 
   - This links each exposure to the appropriate impact function for the hazard.

---

3. **Perform Impact Calculation**:
   ```python
   imp_calc = ImpactCalc(new_exp, imp_set_xlsx, haz_ven)
   result = imp_calc.impact()
   ```
   - `ImpactCalc(new_exp, imp_set_xlsx, haz_ven)`:
     - `new_exp`: The exposure data (e.g., crops or infrastructure) prepared with geospatial references.
     - `imp_set_xlsx`: The impact functions loaded from an Excel file.
     - `haz_ven`: The hazard data (e.g., drought intensity and frequency).
   - `result = imp_calc.impact()`:
     - Computes the impact of the hazard (`haz_ven`) on the exposures (`new_exp`) using the assigned impact functions.
     - Returns the impact result, which quantifies the expected damage or loss.

---

### Purpose:
This code calculates the impact of a specific hazard (e.g., drought) on exposures (e.g., agricultural areas) using pre-defined impact functions. The result provides a quantitative measure of the expected damage, facilitating risk assessment and decision-making.
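
Conceptually, the impact at each exposure point is its value multiplied by the damage ratio at the local hazard intensity. The sketch below reproduces that idea with NumPy for a single event. It is a simplified illustration, not CLIMADA's implementation, and all numbers except the event frequency of 0.5 (taken from the hazard cell) are invented.

```python
import numpy as np

# Hypothetical MDR curve: damage ratio as a function of hazard intensity.
intensity_grid = np.array([0.0, 50.0, 100.0])
mdr_curve = np.array([0.0, 0.4, 1.0])

# Hypothetical exposure values and the hazard intensity at each point.
exposure_value = np.array([1000.0, 2000.0, 500.0])
local_intensity = np.array([25.0, 50.0, 100.0])
frequency = 0.5  # annual frequency of the single event, as in the hazard cell

# Damage ratio at each point, then impact = value x damage ratio.
damage_ratio = np.interp(local_intensity, intensity_grid, mdr_curve)
impact_at_points = exposure_value * damage_ratio

# Expected annual impact per point and total impact of the event.
eai_exp = impact_at_points * frequency
at_event = impact_at_points.sum()

print(eai_exp, at_event)
```

The names `eai_exp` and `at_event` mirror the attributes read from `result` later in the notebook.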

In [None]:
from climada.engine import ImpactCalc

#haz_type = haz_ven.haz_type
impact_func_id = 1  # Define or load an appropriate impact function ID

# Use the gdf attribute to access and modify data
new_exp.gdf[f'impf_DR'] = impact_func_id
#imp_set = ImpactFuncSet.from_dict(imp_set_xlsx)

# Calculate the impact
imp_calc = ImpactCalc(new_exp, imp_set_xlsx, haz_ven)
result = imp_calc.impact()  # Computes the impact

In [None]:
new_exp.gdf

In [None]:
index_event_start = result.event_name.index('1')
damages_drought = np.asarray([result.at_event[index_event_start]])
print(damages_drought)

In [None]:
result.plot_scatter_eai_exposure(pop_name=False)

In [None]:
result.write_csv('../../impact_202412.csv')

## Probability maps

1. **Read and Prepare Impact Data**:
   - Load `impact_202412.csv` and extract impact-related columns (`eai_exp`, `exp_lat`, `exp_lon`).
   - Convert exposure data into a GeoDataFrame (`g_db1`) with point geometries.

2. **Classify Exposure Values**:
   - Classify `eai_exp` into four bins ("Low", "Medium-Low", etc.) and add the classifications to the DataFrame (`class` column).

3. **Read WRSI Data and Convert to GeoDataFrame**:
   - Open a NetCDF file (`prob_lower_tercile_2024...nc`) containing WRSI probabilities and convert it into a GeoDataFrame (`gdf1`).
   - Buffer each grid point in `gdf1` by 0.125° with square caps (`cap_style=3`) to create square cell polygons for the spatial join.

4. **Spatial Join for Exposure and WRSI**:
   - Perform a spatial join (`sjoin`) between exposures (`g_db1`) and WRSI probabilities (`gdf3`) to associate each exposure point with the WRSI data.

5. **Apply Probability-Based Classification**:
   - Define a function (`get_prob_ibf`) that assigns impact-based classifications (`ibf`) based on `prob_wrsi` and `eai_exp` ranges.
   - Apply this function to the joined dataset to calculate the `ibf` value for each exposure point.

6. **Prepare Final Output**:
   - Extract the columns `exp_lat`, `exp_lon`, and `ibf` into a new DataFrame (`wsd1`) and rename them to `latitude`, `longitude`, and `value`.
   - Add a `region_id` column and save the processed data as a CSV file (`probablity_ibf_output_d.csv`).

---

### Purpose:
This code combines the expected annual impact per exposure point (`eai_exp`) with the WRSI lower-tercile probabilities to classify each point into a probabilistic impact-based forecast level (`ibf`). The result is a geospatial dataset ready for visualization or further analysis.
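
The classification rules can also be written as a lookup table, which makes the probability/impact matrix easier to read and audit. The sketch below is one possible formulation, not the notebook's method: `IBF_MATRIX`, `PROB_EDGES`, `EAI_EDGES`, and `classify_ibf` are hypothetical names, the bin edges and class codes are copied from `get_prob_ibf`, and values that fall exactly on a bin edge are assigned to the lower bin here, which can differ from the chained `if` statements.

```python
import numpy as np
import pandas as pd

# Rows: eai_exp bin (low to high). Columns: prob_wrsi quartile (low to high).
IBF_MATRIX = np.array([
    [10, 10, 10, 10],
    [10, 10, 20, 20],
    [20, 20, 30, 30],
    [20, 30, 30, 40],
])
PROB_EDGES = [0.0, 0.25, 0.5, 0.75, 1.0]
EAI_EDGES = [0.0, 3166627.35, 6333254.7, 9499882.05, 12666509.4]


def classify_ibf(df):
    """Return the ibf class per row, or NaN where an input is missing or out of range."""
    prob_bin = pd.cut(df["prob_wrsi"], bins=PROB_EDGES, labels=False, include_lowest=True)
    eai_bin = pd.cut(df["eai_exp"], bins=EAI_EDGES, labels=False, include_lowest=True)
    ibf = pd.Series(np.nan, index=df.index)
    valid = prob_bin.notna() & eai_bin.notna()
    rows = eai_bin[valid].astype(int).to_numpy()
    cols = prob_bin[valid].astype(int).to_numpy()
    ibf[valid] = IBF_MATRIX[rows, cols]
    return ibf


# Hypothetical joined rows; the last one has no WRSI probability.
sample = pd.DataFrame({
    "prob_wrsi": [0.1, 0.6, 0.9, np.nan],
    "eai_exp": [1.0e6, 4.0e6, 1.2e7, 5.0e6],
})
result_ibf = classify_ibf(sample)
print(result_ibf.tolist())
```

Because the edges are hard-coded from one run of `pd.cut`, they would need to be recomputed whenever the impact results change.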

In [None]:
import geopandas as gp

db=pd.read_csv('../../impact_202412.csv')
# copy so that adding the 'class' column later does not write into a view of db
db1=db[['eai_exp','exp_lat','exp_lon']].copy()
db1.info()
g_db1 = gp.GeoDataFrame(db1, geometry=gp.points_from_xy(db1.exp_lon, db1.exp_lat))
g_db1.set_geometry("geometry")

In [None]:
# Classify into four bins from min to max
cuts, bin_edges = pd.cut(db1['eai_exp'], bins=4, retbins=True, labels=["Low", "Medium-Low", "Medium-High", "High"], right=True)
db1['class'] = cuts

# Printing the DataFrame with classes
print(db1)

# Printing the bin edges
print("Bin edges:", bin_edges)

In [None]:
dbpath=f'../../202411_wrsi/prob_lower_tercile_2024_20241126_21.838949_51.415695_-11.745695_23.145147.nc'

db1=xr.open_dataset(dbpath)

erf=db1.to_dataframe()

erf1=erf.reset_index()

gdf1 = gp.GeoDataFrame(erf1, geometry=gp.points_from_xy(erf1.longitude, erf1.latitude))

#gdf1=gdf[0:12]

gdf1['polygon']=gdf1.geometry.apply(lambda g: g.buffer(0.125, cap_style=3))

gdf2=gdf1[['__xarray_dataarray_variable__','polygon']]
gdf2.columns=['prob_wrsi','geometry']
#gdf1
gdf3=gdf2.set_geometry("geometry")

In [None]:
wsd=g_db1.sjoin(gdf3)

In [None]:


def get_prob_ibf(row):
    # default when no rule below matches (e.g. missing or out-of-range values)
    a=np.nan
    #print(row['prob_wrsi'],row['eai_exp'])
    if 0.0<= row['prob_wrsi'] <=0.25 and 0<= row['eai_exp'] <= 3166627.35:
        a=10    
    if 0.25<= row['prob_wrsi'] <=0.5 and 0<= row['eai_exp'] <= 3166627.35:
        a=10    
    if 0.5<= row['prob_wrsi'] <=0.75 and 0<= row['eai_exp'] <= 3166627.35:
        a=10    
    if 0.75<= row['prob_wrsi'] <=1 and 0<= row['eai_exp'] <= 3166627.35:
        a=10    
    ########
    if 0.0<= row['prob_wrsi'] <=0.25 and 3166627.35<= row['eai_exp'] <= 6333254.7:
        a=10    
    if 0.25<= row['prob_wrsi'] <=0.5 and 3166627.35<= row['eai_exp'] <= 6333254.7:
        a=10    
    if 0.5<= row['prob_wrsi'] <=0.75 and 3166627.35<= row['eai_exp'] <= 6333254.7:
        a=20    
    if 0.75<= row['prob_wrsi'] <=1 and 3166627.35<= row['eai_exp'] <= 6333254.7:
        a=20    
    ########
    if 0.0<= row['prob_wrsi'] <=0.25 and 6333254.7<= row['eai_exp'] <=9499882.05:
        a=20    
    if 0.25<= row['prob_wrsi'] <=0.5 and 6333254.7<= row['eai_exp'] <= 9499882.05:
        a=20    
    if 0.5<= row['prob_wrsi'] <=0.75 and 6333254.7<= row['eai_exp'] <= 9499882.05:
        a=30    
    if 0.75<= row['prob_wrsi'] <=1 and 6333254.7<= row['eai_exp'] <= 9499882.05:
        a=30    
    ########
    if 0.0<= row['prob_wrsi'] <=0.25 and 9499882.05<= row['eai_exp'] <=12666509.4:
        a=20    
    if 0.25<= row['prob_wrsi'] <=0.5 and 9499882.05<= row['eai_exp'] <=12666509.4:
        a=30    
    if 0.5<= row['prob_wrsi'] <=0.75 and 9499882.05<= row['eai_exp'] <= 12666509.4:
        a=30    
    if 0.75<= row['prob_wrsi'] <=1 and 9499882.05<= row['eai_exp'] <= 12666509.4:
        a=40
    ########
    if row['prob_wrsi'] is None and row['eai_exp'] is None: 
        a=np.nan
    if row['prob_wrsi'] is None or row['eai_exp'] is None: 
        a=np.nan 
    if pd.isna(row['prob_wrsi']):
        a=np.nan
    return a

wsd['ibf']=wsd.apply(lambda row: get_prob_ibf(row), axis = 1)
wsd

In [None]:
# copy so that renaming and adding columns does not write into a view of wsd
wsd1=wsd[['exp_lat','exp_lon','ibf']].copy()
wsd1.columns=['latitude','longitude','value']

wsd1['region_id']=1

In [None]:
wsd1.to_csv('../../probablity_ibf_output_d.csv',index=False)

## mapping of prob_ibf

In [None]:
pdb=pd.read_csv('../../probablity_ibf_output_d.csv')

from more_itertools import sliced
import geopandas as gp
CHUNK_SIZE = 500000

# slice over the table that is actually chunked below (pdb, not db)
index_slices = sliced(range(len(pdb)), CHUNK_SIZE)

ea_boundary=gp.read_file('../../shp_files/ea_ghcf_icpac.shp')

edf_cont=[]
for index_slice in index_slices:
    chunk = pdb.iloc[index_slice]
    gdb = gp.GeoDataFrame(chunk, geometry=gp.points_from_xy(chunk.longitude, chunk.latitude))
    edf=gp.sjoin(ea_boundary,gdb)
    #edf1=edf[['GID_0', 'COUNTRY','gno','Nomotorway','primary','secondary','tertiary','unclassified','lon','lat', 'grid_name']]
    edf_cont.append(edf)

In [None]:
edf1=pd.concat(edf_cont)
edf2=edf1[['latitude','longitude','value']]
edf2

In [None]:
count_df = edf2.groupby('value').count()
count_df

In [None]:
from climada.entity import Exposures
import matplotlib
from matplotlib.colors import LinearSegmentedColormap

import numpy as np
from matplotlib import colors
from matplotlib import pyplot as plt
#from Configuration import *
import os
import pandas as pd

import cartopy.crs as ccrs

from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature

#file_path = lp_csv_files[5] # define the full file path of the CSV-file

def return_colormap():
    """
    Create colormap of matplotlib based on number of class and given colorcode

    Parameters
    ----------
    params : class object
        Input/Output parameter definitions.
        
    Returns
    -------
    c_cmap : Object
        matplotlib colormap.

    """
    c = matplotlib.colors.ColorConverter().to_rgb
    colorlist=[c("#00c252"), c("#f3ff00"), c("#c85500"), c("#ff0000")]
    color_code=colorlist
    classif= [10, 20, 30, 40]
    c_cmap = LinearSegmentedColormap.from_list("my_colormap",color_code, N=len(classif), gamma=1.0)
    return c_cmap



#file_path='/home/bulbul/Documents/07-2022/impact_weather_icpac/lab/ea_climada/KEN_2021.csv'
new_exp = Exposures(edf2)
new_exp.check()

norm = colors.LogNorm(vmin=10, vmax=40)

c_cmap=return_colormap()

ax=new_exp.plot_hexbin(norm=norm, pop_name=False, cmap=c_cmap, buffer=1)

#fname='/home/bulbul/Documents/07-2022/impact_weather_icpac/lab/ea_ibf_data_resources/exposure-data/gis/ea_global_background.shp'

#ax.add_geometries(Reader(fname).geometries(),ccrs.PlateCarree(),facecolor='None')

#plt.savefig('/home/ibf.png', bbox_inches='tight')
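
The class codes 10, 20, 30 and 40 are categories rather than magnitudes, and a logarithmic norm spaces them unevenly, so two codes can end up in the same color band. One alternative, sketched here under the assumption that the plotted values are the class codes themselves, is matplotlib's `BoundaryNorm` with a `ListedColormap`. The boundary values are an illustrative choice placed halfway between the codes.

```python
from matplotlib.colors import BoundaryNorm, ListedColormap

class_codes = [10, 20, 30, 40]
class_colors = ["#00c252", "#f3ff00", "#c85500", "#ff0000"]

# One discrete colour per class, in the same order as the codes.
cmap = ListedColormap(class_colors)

# Bin edges halfway between the codes, so each code falls in its own bin.
boundaries = [5, 15, 25, 35, 45]
norm = BoundaryNorm(boundaries, ncolors=cmap.N)

# Colour index assigned to each class code.
color_index = [int(norm(code)) for code in class_codes]
print(color_index)
```

`norm` and `cmap` could then be passed to the plotting call in place of the `LogNorm` and `c_cmap` pair; whether the map shows one class per hexagon also depends on how the hexbin aggregates points that fall in the same cell.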
