# IBTrACS Data Processing

This notebook demonstrates how to use the functions in `src/ocha_lens/datasources/ibtracs.py`.

In [1]:
import ocha_lens as lens

### 1. Basic functions

Load the IBTrACS NetCDF file. The `dataset` argument can be set to `ALL`, `ACTIVE`, or `last3years`.

In [2]:
ds = lens.ibtracs.load_ibtracs(dataset="last3years")

Parse the `Dataset` to get storm-level metadata.  

In [3]:
df_storm = lens.ibtracs.get_storms(ds)

Now parse the `Dataset` to get the track-level dataframes. 

In [4]:
gdf_tracks = lens.ibtracs.get_tracks(ds)
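
Since one table is storm-level and the other is track-level, a common next step is to attach storm metadata to each track point. The following is a minimal sketch on toy data, not output from the library: it assumes both frames share a `sid` column, and the `name` column, its values, and the second `sid` are hypothetical placeholders.

```python
import pandas as pd

# Toy stand-ins for the outputs of get_storms and get_tracks.
# The `name` column and the second sid are made up for illustration.
df_storm_toy = pd.DataFrame(
    {
        "sid": ["2022008S13148", "2022010S17060"],
        "name": ["STORM_A", "STORM_B"],
    }
)
tracks_toy = pd.DataFrame(
    {
        "sid": ["2022008S13148", "2022008S13148", "2022010S17060"],
        "valid_time": pd.to_datetime(
            ["2022-01-08 00:00", "2022-01-08 06:00", "2022-01-10 00:00"]
        ),
    }
)

# validate="many_to_one" makes pandas raise a MergeError if `sid`
# is duplicated in the storm table, so track rows cannot be multiplied
tracks_with_meta = tracks_toy.merge(
    df_storm_toy, on="sid", how="left", validate="many_to_one"
)
print(tracks_with_meta)
```

A left join keeps every track point even when a storm has no metadata row, and the `validate` check guards against silently duplicating track points.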

### 2. Further customizations

What if we want to add a variable to the track-level data that was dropped from the original source?

In [31]:
sel_var = "usa_r34"

# Take the original xarray Dataset and convert it to a dataframe that we can join.
# The Dataset has 159 data variables, so ideally we select only the ones we need.
ds_subset = ds[["sid", sel_var]]
df_select = ds_subset.to_dataframe().reset_index()

# Now clean up the dataframe a bit
df_ = lens.ibtracs.normalize_radii(df_select, radii_cols=[sel_var])
df_["valid_time"] = df_["time"].dt.round("min")
df_ = lens.ibtracs._convert_string_columns(df_, ["sid"])
df_ = df_[["sid", "valid_time", sel_var]]
df_ = df_[df_.valid_time.notna()]

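# Do a basic merge back with the original gdf. With no `on` argument,
# pandas joins on the columns both frames share (expected here to be `sid` and `valid_time`).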
gdf_tracks_merged = gdf_tracks.merge(df_, how="left")

# For example, we can now see how the USA variable fills in some data missing from what the `bom` provider reported
gdf_tracks_merged[gdf_tracks_merged.provider == "bom"][
    ["sid", "valid_time", "quadrant_radius_34", "usa_r34"]
].head(5)

|      | sid           | valid_time          | quadrant_radius_34   | usa_r34                  |
|------|---------------|---------------------|----------------------|--------------------------|
| 1417 | 2022008S13148 | 2022-01-08 00:00:00 | [nan, nan, nan, nan] | [nan, nan, nan, nan]     |
| 1418 | 2022008S13148 | 2022-01-08 06:00:00 | [nan, nan, nan, nan] | [nan, nan, nan, nan]     |
| 1419 | 2022008S13148 | 2022-01-08 12:00:00 | [nan, nan, nan, nan] | [nan, nan, nan, nan]     |
| 1420 | 2022008S13148 | 2022-01-08 18:00:00 | [nan, nan, nan, nan] | [nan, nan, nan, nan]     |
| 1421 | 2022008S13148 | 2022-01-09 00:00:00 | [nan, nan, nan, nan] | [10.0, 15.0, 60.0, 45.0] |
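
To act on this, one option is to build a single column that prefers the provider's radii and falls back to `usa_r34` only when all four quadrants are missing. The sketch below runs on toy rows copied from the output above. The helper `all_missing` and the column `radius_34_filled` are hypothetical names, not part of `ocha_lens`, and the fallback rule itself is an assumption about what a reasonable fill would look like.

```python
import numpy as np
import pandas as pd


def all_missing(radii):
    """Return True when every quadrant value in the list is NaN."""
    return bool(np.all(pd.isna(np.asarray(radii, dtype=float))))


nan4 = [np.nan] * 4
toy = pd.DataFrame(
    {
        "valid_time": pd.to_datetime(["2022-01-08 18:00", "2022-01-09 00:00"]),
        "quadrant_radius_34": [nan4, nan4],
        "usa_r34": [nan4, [10.0, 15.0, 60.0, 45.0]],
    }
)

# Keep the provider's radii; use usa_r34 only when all four quadrants are NaN
toy["radius_34_filled"] = pd.Series(
    [
        usa if all_missing(own) else own
        for own, usa in zip(toy["quadrant_radius_34"], toy["usa_r34"])
    ],
    index=toy.index,
)

# Count rows where the provider had nothing but usa_r34 had at least one value
n_filled = int(
    (
        toy["quadrant_radius_34"].map(all_missing)
        & ~toy["usa_r34"].map(all_missing)
    ).sum()
)
print(toy[["valid_time", "radius_34_filled"]])
print(f"rows filled from usa_r34: {n_filled}")
```

Treating the four quadrants as a unit avoids mixing values from two agencies within one wind-radius estimate. A per-quadrant fill is also possible but would need a different rule.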
