# Components

```{glue:figure} NTR_components_stds
:scale: 50%
:align: right
```

In this notebook we'll explore the various contributions to high and low water levels. At the moment this is exploratory: we are using it to take a broad-strokes look. We'll do this by breaking down the time series of hourly water levels at a tide gauge into different frequency bands, with the idea that certain processes fall within these timescales. For example, ENSO timescales are roughly 4-7 years, PDO timescales are closer to 20 years, and the timescales of mesoscale eddies around the Hawaiian archipelago are 3-6 months. It's important to keep in mind that in this analysis we are not directly relating any of these processes to the observed sea levels at the tide gauges; rather, we are looking at _variability on similar timescales_.

Thus, we're breaking sea level down into:

$\eta = \eta_{tide} + \eta_{t_N} + \eta_{NTR}$

where

$\eta_{NTR} = \eta_{D} + \eta_{S} + \eta_{ItA} + \eta_{InA} + \eta_{W} + \eta_{HF}$

Note that $\eta_{tide}$ here is accounting for the nodal cycle modulation, and thus it is absent from the non-tidal residuals.

<!-- # make a dictionary of the timescales and the processes
timeframes = {'Decadal': 'e.g. PDO, 8-30+ yr', 
              'Seasonal': 'Annual, Semi-Annual, Qtr-Annual',
              'Interannual': 'e.g. ENSO, 1-8 yr', 
              'Intraannual': 'e.g. Mesoscale eddies', 
              'Weekly': '1 week - 2 months', 
              'Storms': '& other short-term variability',
              'Nodal': '18.6 yr tidal modulation'} -->


## Setup

## Obtain the Non-tidal Residual (NTR)

First we'll estimate the astronomical tides at our gauge locations using the selected epoch. The tide values are estimated using the Python implementation of UTide (pypi.org/project/utide, based on Codiga 2011).

The routine is as follows:

- Detrend the hourly sea level data for the 1983-2001 epoch
- Solve for coefficients with no nodal corrections using the detrended data
- Solve for coefficients with nodal corrections using the detrended data
- Reconstruct the full timeseries based on:
    - all coefficients (including nodal, annual, and semi-annual cycles)
    - all coefficients except the nodal cycle
    - all coefficients except the annual and semi-annual cycles

Here, we define the NTR as:

$NTR = SL - T - LT$

where SL is the sea level, T is the predicted tide (including nodal, annual and semi-annual) and LT is the linear trend.

The nodal modulation is the difference between the tidal predictions with and without the nodal corrections applied.
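
As a rough illustration of the definition $NTR = SL - T - LT$, the sketch below builds a synthetic hourly record and recovers the residual. It is not the UTide workflow used in this notebook: it assumes a single M2 constituent and swaps in an ordinary least-squares harmonic fit with NumPy. The function name `fit_tide_and_trend` and all synthetic amplitudes are made up for the example.

```python
import numpy as np

M2_PERIOD_HOURS = 12.4206012  # principal lunar semidiurnal period


def fit_tide_and_trend(t_hours, sea_level, period_hours=M2_PERIOD_HOURS):
    """Least-squares fit of mean + linear trend + one tidal harmonic.

    Hypothetical stand-in for a full tidal analysis. Returns (tide, trend),
    with the mean level folded into the trend term.
    """
    omega = 2 * np.pi / period_hours
    design = np.column_stack([
        np.ones_like(t_hours),      # mean level
        t_hours,                    # linear trend (LT)
        np.cos(omega * t_hours),    # tidal harmonic, cosine part
        np.sin(omega * t_hours),    # tidal harmonic, sine part
    ])
    coef, *_ = np.linalg.lstsq(design, sea_level, rcond=None)
    trend = design[:, :2] @ coef[:2]
    tide = design[:, 2:] @ coef[2:]
    return tide, trend


# Synthetic one-year hourly record (all values are illustrative, in metres)
rng = np.random.default_rng(0)
t = np.arange(0, 24 * 365, dtype=float)
true_tide = 0.5 * np.cos(2 * np.pi * t / M2_PERIOD_HOURS - 0.3)
true_trend = 1.0 + 3e-6 * t
true_ntr = 0.05 * rng.standard_normal(t.size)
sea_level = true_tide + true_trend + true_ntr

tide, trend = fit_tide_and_trend(t, sea_level)
ntr = sea_level - tide - trend  # NTR = SL - T - LT
```

In the real analysis the tide term comes from the UTide reconstruction with many constituents, so this only shows the bookkeeping of the subtraction.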

In [None]:
station_ids = ds['station_id'].values
station_ids

In [None]:
#plot sea level
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(ntr_data.time, ntr_data.sea_level, label='Sea Level')
ax.plot(ntr_data.time, ntr_data.ntr , label='NTR')
ax.plot(ntr_data.time, ntr_data.trend, label='Trend')
ax.plot(ntr_data.time, ntr_data.seasonal_cycle, label='Seasonal')

# add zero line
ax.axhline(0, color='k', linestyle='--', lw=1)

# add legend
ax.legend()


In [None]:
plt.plot(ntr_data['time'], ntr_data.tide + ntr_data.ntr, label='Tide+NTR')
plt.plot(ntr_data['time'], ntr_data.sea_level_detrended, label='Sea Level')
#truncate x axis to 2016 only
plt.xlim(pd.Timestamp('2016-01-01'), pd.Timestamp('2016-2-21'))

## First, let's look at the seasonal cycle
We obtain the seasonal cycle by using the SA and SSA coefficients from the tidal analysis. SA (Solar Annual) and SSA (Solar Semi-Annual) "mostly reflect yearly meteorological variations influencing sea level." [(NOAA Tides and Currents Glossary)](https://tidesandcurrents.noaa.gov/glossary.html#S:~:text=per%20solar%20hour.-,Sa,-Solar%20annual%20constituent)
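
To make the idea concrete, here is a minimal sketch of a seasonal cycle built as the sum of an annual and a semi-annual harmonic. The amplitudes and phases are invented for illustration, the periods are approximate, and the phase convention (radians relative to the start of the time axis) is a simplification rather than the Greenwich phase lag a tidal package reports. `seasonal_cycle` is a hypothetical helper, not a function from this notebook.

```python
import numpy as np

SA_PERIOD_DAYS = 365.24                  # approximate period of SA
SSA_PERIOD_DAYS = SA_PERIOD_DAYS / 2.0   # approximate period of SSA


def seasonal_cycle(t_days, amp_sa, phase_sa, amp_ssa, phase_ssa):
    """Sum of an annual and a semi-annual cosine (phases in radians)."""
    sa = amp_sa * np.cos(2 * np.pi * t_days / SA_PERIOD_DAYS - phase_sa)
    ssa = amp_ssa * np.cos(2 * np.pi * t_days / SSA_PERIOD_DAYS - phase_ssa)
    return sa + ssa


# Illustrative coefficients only (metres and radians)
params = dict(amp_sa=0.04, phase_sa=np.deg2rad(200.0),
              amp_ssa=0.01, phase_ssa=np.deg2rad(60.0))

t_days = np.arange(0, 366, dtype=float)
cycle = seasonal_cycle(t_days, **params)
```

Because the semi-annual term can reinforce or offset the annual one, the resulting curve is generally not a pure sinusoid, which is why both constituents are kept.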

In [None]:
# Step 1, look at the Annual and Semi-annual cycles from UTide
#convert timeseries to day of year
ntr_data['dayofyear'] = ntr_data['time'].dt.dayofyear
#plot time series with day of year
plt.figure(figsize=(12, 6))
plt.scatter(ntr_data['dayofyear'], ntr_data['seasonal_cycle'], label='Seasonal cycle (SA + SSA)', alpha=0.05)

# change x-axis to be months
plt.xticks(np.arange(0, 365, 30), ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec','Jan'])
plt.xlabel('Day of Year')
plt.ylabel('Seasonal Cycle' + ' (' + ds['sea_level'].attrs['units'] + ')')

# set xlim
plt.xlim([0, 365])

In [None]:
# # Let's try treating the nodal cycle  in terms of its envelope 
# Fit the upper envelope to a sinusoidal function with a period of 18.61 years
# The following code is adapted from Thompson et al. (2021), Projected 
# high-tide flooding in the United States: Rapid increases and extreme months, Nature Climate Change.

nodal = ntr_data['nodal'].copy()
nodal = nodal - nodal.mean()  # remove the mean
# set index to time
nodal.index = ntr_data['time']

nodal_upper_envelope = nodal.resample('YS').quantile(0.995).interpolate(method='linear')
nodal_lower_envelope = nodal.resample('YS').quantile(0.005).interpolate(method='linear')
t = nodal_upper_envelope.index.year + 0.5

def skewed_sine(t, A, phase, skew, offset):
    omega = 2 * np.pi / 18.61
    return offset + A * np.sin(omega * t + phase + skew * np.sin(omega * t + phase))

from scipy.optimize import curve_fit

def get_mod_envelope(nodal, t):

    # Initial guesses: A, phase, skew, offset
    p0 = [np.std(nodal.values), 0, 0, np.mean(nodal.values)]  # amplitude ~ std of signal, phase = 0, no skew, offset ~ mean

    # Fit
    popt, _ = curve_fit(skewed_sine, t, nodal.values, p0=p0,bounds=([0, -2*np.pi, -2*np.pi, -np.inf], [np.inf, 2*np.pi, 2*np.pi, np.inf]))

    A_fit, phase_fit, skew_fit, offset_fit = popt

    ncyc_upper = skewed_sine(t, *popt)
    # ncyc_upper = skewed_sine(t, A_fit,phase_fit, 0, offset_fit)

    # make series
    ncyc_upper = pd.Series(ncyc_upper, index=pd.to_datetime((t - 1950) * 365.25, unit='D', origin='1950-01-01'))

    return ncyc_upper

# Fit the upper envelope
ncyc_upper = get_mod_envelope(nodal_upper_envelope, t)
# Fit the lower envelope
ncyc_lower = get_mod_envelope(nodal_lower_envelope, t)

ncyc_mod_upper = np.max(ncyc_upper) - np.min(ncyc_upper)
ncyc_mod_lower = np.max(ncyc_lower) - np.min(ncyc_lower)

# ncyc_mod_upper is max minus min of the fitted envelope, i.e. a peak-to-peak range
print('Nodal cycle range (peak-to-peak):', str(round(ncyc_mod_upper*100, 2)) , 'cm')


In [None]:
plt.scatter(monthly_max_real_date,100*monthly_max, label='Monthly Maxima', color='green',s=10)
plt.plot(yearly_max.index,100*yearly_max, label='99.9th Percentile (Annual Maxima)', color='yellow')
# plt.scatter(nodal_upper_envelope.index,0.1*nodal_upper_envelope, label='Nodal Cycle Upper Envelope Monthly', color='magenta',s=3)
# plt.scatter(nodal_upper_envelope1.index,0.1*nodal_upper_envelope1, label='Nodal Cycle Upper Envelope Monthly', color='yellow',s=3)


plt.plot(pcyc_upper_double.index, 100*(pcyc_upper_double), label='4.4 Cycle Fit', color='orange', lw=2)
plt.plot(nodal_upper.index, 100*(nodal_upper), label='Nodal Cycle Fit', color='red', lw=2)
# plt.plot(ncyc_upper.index, 0.1*(nodal_upper-nodal_upper.mean()+pcyc_upper_double), label='18.61y+4.425y+8.85y', color='blue', lw=2)
# add legend
plt.legend()

# add title
plt.title('Nodal Cycle and Perigean Cycle Upper Envelope at ' + station_name)
plt.xlabel('Time')
plt.ylabel('Predicted Water Level (cm)')
# interpolate the pyc_upper_double to the nodal cycle upper envelope

#set xlimits to just 2000-2022
# plt.xlim([np.datetime64('2010-01-01'), np.datetime64('2022-12-31')])

In [None]:
perigean_cycle = pcyc_upper_double - pcyc_upper_double.mean()


In [None]:
ntr_data_mags


```{caution}
The timescales probably need some refinement. Mesoscale processes are one example: the higher the latitude, the longer the dominant period.
```
From Chen, S., and B. Qiu (2010), Mesoscale eddies northeast of the Hawaiian archipelago from satellite altimeter observations, J. Geophys. Res., 115, C03016, doi:10.1029/2009JC005698


"We define dominant periods of the mesoscale eddy activity by locating the periods at which the spectral peaks within the mesoscale range of 90–180 days. This definition is crude yet robust for the subregions with sharp spectral peaks like the 24°N–27°N, 160°W–155°W one (130 days) and the 18°N–21°N, 170°W–165°W one (90 days), but is also applicable to other subregions. In the lee of the island of Hawaii, 90 day oscillations dominate the mesoscale eddy activity. In the subregions between 24°N and 30°N, a 130 day peak often prevails, but in the 30°N–33°N band, a weak 180 day peak emerges. The pattern is that the higher the latitude, the longer the dominant period."

Also:
Firing, Y. L., and M. A. Merrifield (2004), Extreme sea level events at Hawaii: Influence of mesoscale eddies, Geophys. Res. Lett., 31, L24306, doi:10.1029/2004GL021539.
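
One way to separate a record into timescale bands like those discussed above is to cascade zero-phase low-pass filters from the longest period to the shortest, keeping the residual at each step. The sketch below is a hedged illustration using `scipy.signal.butter` and `scipy.signal.sosfiltfilt` on synthetic hourly data. The helper names, band names, and cutoff periods are assumptions for the example and are not the `butterworth_lowpass` helper or the cutoffs used in this notebook.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def lowpass_split(x, cutoff_cph, fs_cph=1.0, order=3):
    """Split x into (low, high) at a cutoff in cycles per hour (zero phase)."""
    sos = butter(order, cutoff_cph, btype='low', fs=fs_cph, output='sos')
    low = sosfiltfilt(sos, x)
    return low, x - low


def split_into_bands(x, cutoff_periods_hours):
    """Cascade low-pass filters, longest cutoff period first.

    Each band holds what the current low-pass keeps; the residual is
    passed on to the next, shorter cutoff.
    """
    bands = {}
    residual = np.asarray(x, dtype=float)
    for name, period in cutoff_periods_hours.items():
        bands[name], residual = lowpass_split(residual, 1.0 / period)
    bands['high_frequency'] = residual
    return bands


# Synthetic hourly record: 180-day, 14-day, and 12-hour oscillations
t = np.arange(24 * 365 * 2, dtype=float)
slow = 0.10 * np.sin(2 * np.pi * t / (24 * 180))
mid = 0.05 * np.sin(2 * np.pi * t / (24 * 14))
fast = 0.02 * np.sin(2 * np.pi * t / 12)
x = slow + mid + fast

# Illustrative cutoff periods in hours, ordered longest to shortest
cutoffs = {'longer_than_month': 24 * 30, 'week_to_month': 24 * 7, 'day_to_week': 24}
bands = split_into_bands(x, cutoffs)
```

Because every step returns both the filtered series and its residual, the bands add back up to the input by construction. The bands are not perfectly orthogonal, though, since a Butterworth filter has a gradual roll-off near each cutoff.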

In [None]:
ntr_data

In [None]:
def filter_ntr(ntr_data):
    # ntr_noAnnual, ntr_Annual = filter_known_frequency_components(ntr_data['ntr_detrended'], time_diffs,1/annual , width=widthSeasonal)
    # ntr_noSemiAnnual, ntr_SemiAnnual = filter_known_frequency_components(ntr_noAnnual, time_diffs, 1/semiannual, width=widthSeasonal)
    # ntr_noQtrAnnual, ntr_QtrAnnual = filter_known_frequency_components(ntr_noSemiAnnual, time_diffs, 1/qtrannual, width=widthSeasonal)
    # ntr_Seasonal = ntr_Annual + ntr_SemiAnnual + ntr_QtrAnnual

    rec_length = (ntr_data['time'].max() - ntr_data['time'].min()).days
    
    # if rec_length < 35*365.25:
        #interdecadal
        # ntr_interdecadal, ntr_highFreq = butterworth_lowpass(ntr_data['ntr_detrended'], time_diffs, 1/interdecadal, order=3, padtype='even', padlen=3)
        #decadal
    ntr_decadal, ntr_highFreq = butterworth_lowpass(ntr_data['ntr'], time_diffs, 1/decadal, order=3) 
    #     ntr_decadal = ntr_decadal + ntr_interdecadal
    # else:
    #     #interdecadal
    #     ntr_interdecadal, ntr_highFreq = butterworth_lowpass(ntr_data['ntr_detrended'], time_diffs, 1/interdecadal, order=3, padtype='even', padlen=3)
    #     #decadal
    #     ntr_decadal, ntr_highFreq = butterworth_lowpass(ntr_highFreq, time_diffs, 1/decadal, order=3)
    
    #interannual
    ntr_interannual, ntr_highFreq = butterworth_lowpass(ntr_highFreq, time_diffs, 1/annual, order=3)

    # intraannual
    # this should be done in wavelets instead of a lowpass filter??

    ntr_subannual, ntr_highFreq = butterworth_lowpass(ntr_highFreq, time_diffs, 1/monthly, order=5)

    ntr_monthly, ntr_highFreq = butterworth_lowpass(ntr_highFreq, time_diffs, 1/weekly, order=5)

    # Split off the highest frequencies (shorter than 1 day)
    # ntr_weekly holds timescales longer than 1 day but shorter than 1 week
    ntr_weekly, ntr_highFreq = butterworth_lowpass(ntr_highFreq, time_diffs, 1/daily, order=5)

    rank_tide = ntr_data['tide'].rank(method='first', ascending=False)

    # make dataframe of filtered data
    ntr_filtered = pd.DataFrame({'time': ntr_data['time'], 
                             'ntr': ntr_data['ntr'], 
                             'sea_level': ntr_data['sea_level'],
                             'sea_level_detrended': ntr_data['sea_level_detrended'],
                             'tide': ntr_data['tide'],
                             'Nodal Amp': ncyc_interp.values-ncyc_interp.mean(),
                             'Nodal Mod': ntr_data['nodal'],
                             'Perigean': perigean_cycle.values,
                             'Trend': ntr_data['trend'],
                            #  'Interdecadal': ntr_interdecadal, 
                             'Decadal': ntr_decadal, 
                             'Interannual': ntr_interannual, 
                             'Seasonal': ntr_data['seasonal_cycle'], 
                             'Subannual': ntr_subannual, 
                             'Monthly': ntr_monthly,
                             'Weekly': ntr_weekly, 
                             'Storms & HF': ntr_highFreq,
                             'Rank Tide': rank_tide } )
                        #    'NTR Trend': ntr_trend_series['ntr']
    # if rec_length < 35*365.25:
    #     ntr_filtered.pop('Interdecadal')

    # every column except the bookkeeping ones counts as a component
    excluded = {'time', 'ntr', 'sea_level', 'sea_level_detrended',
                'Nodal Amp', 'Nodal Mod', 'Rank Tide'}
    component_names = [col for col in ntr_filtered.columns if col not in excluded]

    # add trend back into ntr
    # ntr_filtered['ntr'] = ntr_filtered['ntr'] + ntr_filtered['NTR Trend']
    
    return ntr_filtered, component_names


In [None]:
# look at filtered components of nodal cycle only
# nodal_data = ntr_data.copy()
# nodal_data.index = ntr_data['time']
# envelope_demeaned = envelope - np.nanmean(envelope)
# envelope_demeaned.index = nodal_data.index
# nodal_data['ntr_detrended'] = envelope_demeaned
# nodal_filtered = filter_ntr(nodal_data)

# #rename "ntr" in nodal_filtered to "nodal upper envelope"
# nodal_filtered = nodal_filtered.rename(columns={'ntr': 'nodal envelope'})

# nodal_component_std = nodal_filtered.std() 
# nodal_component_std = nodal_component_std.drop('time')

#plot ntr_filtered['Nodal Amp']
ntr_filtered['Nodal Amp'].plot()

Note that we used monthly maxima to define the upper envelope, so it should show no correlation with the weekly or storm components.
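
The envelope idea can be sketched on synthetic data as follows. This example takes block maxima over 30-day windows (a stand-in for calendar months) and fits a plain sinusoid with an 18.61-year period by linear least squares. That is a simplification of the skewed-sine fit used earlier in this notebook, and the function name and all synthetic values are assumptions made for illustration.

```python
import numpy as np

NODAL_PERIOD_YEARS = 18.61


def fit_nodal_sinusoid(t_years, envelope):
    """Fit offset + 18.61-yr sinusoid by linear least squares.

    Returns (offset, amplitude, fitted_values).
    """
    omega = 2 * np.pi / NODAL_PERIOD_YEARS
    design = np.column_stack([
        np.ones_like(t_years),
        np.cos(omega * t_years),
        np.sin(omega * t_years),
    ])
    coef, *_ = np.linalg.lstsq(design, envelope, rcond=None)
    offset, a, b = coef
    amplitude = np.hypot(a, b)
    return offset, amplitude, design @ coef


# Synthetic daily high water: spring-neap cycle modulated by a nodal cycle
n_blocks = 480                       # number of 30-day blocks (about 39 years)
t_days = np.arange(n_blocks * 30, dtype=float)
t_years = t_days / 365.25
nodal = 1.0 + 0.1 * np.cos(2 * np.pi * t_years / NODAL_PERIOD_YEARS)
spring_neap = 0.8 + 0.2 * np.cos(2 * np.pi * t_days / 14.765)
high_water = 0.5 * nodal * spring_neap

# Upper envelope from block maxima, timed at the centre of each block
block_max = high_water.reshape(n_blocks, 30).max(axis=1)
block_time = t_years.reshape(n_blocks, 30).mean(axis=1)

offset, amplitude, fitted = fit_nodal_sinusoid(block_time, block_max)
peak_to_peak = 2 * amplitude         # comparable to a max-minus-min range
```

Taking maxima over blocks longer than the spring-neap cycle removes most of the short-period variability from the envelope, which is the property the note above relies on.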

Next, a sanity check to make sure that everything adds up to the right sum.

In [None]:
#plot ntr, then plot summed components
plt.figure(figsize=(2, 2))

# sum all components in components_names programmatically
ntr_sum = ntr_filtered[component_names].sum(axis=1)


#make a dotted 1:1 line
plt.plot(ntr_filtered['sea_level_detrended'], ntr_filtered['sea_level_detrended'], 'k:', label='1:1 line',linewidth=0.5,alpha=0.5)
plt.scatter(ntr_filtered['sea_level_detrended'], ntr_sum)

plt.xlabel('Sea Level')
plt.ylabel('Sum of components')

# add RMSD to plot, labelled with the units of the sea level data
rmsd = np.sqrt(np.mean((ntr_filtered['sea_level_detrended'] - ntr_sum)**2))
units = ds['sea_level'].attrs['units']
plt.text(0.05, 0.95, f'RMSD: {rmsd:.2f} {units}', ha='left', va='top', transform=plt.gca().transAxes, fontsize=8)
plt.title('Sum of components vs Sea Level')


Note there is correlation between some of these timeseries, likely due to the filtering process employed in this notebook. 

In [None]:
ntr_filtered

In [None]:
# Is the interannual component correlated with ENSO? Let's use the ONI index
# load ONI data

CI_dir = Path(data_dir / 'climate_indices')
climateIndex = ['AO','BEST','ONI','PDO','PMM','PNA','TNA']

CIcorr = np.zeros((len(climateIndex), 30))

# Arrays to store peak correlation and lag for each climate index
CIcorr_max_peaks = np.zeros(len(climateIndex))
CIcorr_max_lag = np.zeros(len(climateIndex))

for indCI in range(len(climateIndex)):
    CI = pd.read_csv(CI_dir / (climateIndex[indCI] + '.csv'), parse_dates=['time'])
    # ntr_CI = pd.merge_asof(ntr_filtered_monthly.sort_index(), CI.sort_index(), left_index=True, right_index=True, direction='nearest')
    CI['time'] = pd.to_datetime(CI['time'])

    # Perform the merge
    ntr_CI = pd.merge_asof(ntr_filtered_monthly, CI, left_index=True, right_on='time', direction='nearest')
    # Define the number of lags
    lag = 30
    corr = np.zeros(lag)

    if climateIndex[indCI] == 'PDO' or climateIndex[indCI] == 'PMM': #<--- IS THIS CORRECT?
        # For PDO and PMM, we need to add the decadal component to the interannual component
        ntr_CI['signal'] = ntr_CI['Interannual'] + ntr_CI['Decadal']
    else:
        # For other climate indices, we just use the interannual component
        ntr_CI['signal'] = ntr_CI['Interannual']


    # Calculate lagged correlation; corr[i - 1] holds the correlation at a lag of i months
    for i in range(1, lag + 1):
        corr[i - 1] = np.corrcoef(ntr_CI[climateIndex[indCI]][:-i], ntr_CI['signal'][i:])[0, 1]
    CIcorr[indCI,:] = corr
    # get max correlation and lag
    CIcorr_max_peaks[indCI] = np.max(abs(CIcorr[indCI,:]))
    # argmax is zero-based while the lags start at 1, so add 1 to recover the lag
    CIcorr_max_lag[indCI] = np.argmax(abs(CIcorr[indCI,:])) + 1

# Use the max correlation to determine the winning Climate Index
climateIndex_bestcorr = climateIndex[np.argmax(abs(CIcorr_max_peaks))]
climateIndex_bestlag = CIcorr_max_lag[np.argmax(abs(CIcorr_max_peaks))]

# now adjust the climateIndex by the lag and plot together with the ntr

CI = pd.read_csv(CI_dir / (climateIndex_bestcorr + '.csv'), parse_dates=['time'])
#adjust the time by the lag
CI['time'] = pd.to_datetime(CI['time'])
# the lag is stored as a float, so cast to int before building the month offset
CI['time'] = CI['time'] + pd.DateOffset(months=int(climateIndex_bestlag))
# Perform the merge
ntr_CI = pd.merge_asof(ntr_filtered_monthly, CI, left_index=True, right_on='time', direction='nearest')
# rename the columns

#plot
fig, ax1 = plt.subplots(figsize=(12, 6))
ax1.plot(ntr_CI['time'], ntr_CI['Interannual']+ntr_CI['Decadal'], label='Interannual + Decadal NTR', color='blue')
# ax1.plot(ntr_CI['time'], ntr_CI['Decadal'], label='Decadal NTR', color='blue')

# plt.plot(ntr_CI['time'], ntr_CI[climateIndex_bestcorr], label=climateIndex_bestcorr, color='red')
ax1.set_xlabel('Time')
ax1.set_ylabel('Non-Tidal Residuals (mm)', color='blue')
ax1.tick_params(axis='y', labelcolor='blue')

# Create a second y-axis for ONI
ax2 = ax1.twinx()
ax2.plot(ntr_CI['time'], ntr_CI[climateIndex_bestcorr], label=climateIndex_bestcorr, color='red')
ylabel = climateIndex_bestcorr  + f' ({climateIndex_bestlag:.0f} month lead)'
ax2.set_ylabel(ylabel, color='red')
ax2.tick_params(axis='y', labelcolor='red')

corr = ntr_CI['Interannual'].corr(ntr_CI[climateIndex_bestcorr])
if corr < 0:
    ax2.invert_yaxis()  # Flip the axis

# add text for correlation
plt.text(0.05, 0.95, f'Correlation: {corr:.2f}\n{climateIndex_bestcorr} leads NTR by {climateIndex_bestlag:.0f} months', ha='left', va='top', transform=plt.gca().transAxes, fontsize=8)
plt.title('Interannual NTR and ' + climateIndex_bestcorr)

In [None]:
ntr_component_stds
# save to csv

In [None]:
ntr_data

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(1, 6))
# ntr_component_vars_cumsum = ntr_component_vars.cumsum()/ntr_component_vars_sum * ntr_var #normalize to the variance of the ntr (not filtered)
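# Overlay the bars from the largest cumulative value to the smallest.
# The values are cumulative, so every bar starts at zero rather than being stacked.
for i in range(len(ntr_component_waveheight.index)-1, -1, -1):
    ax.bar('Components', ntr_component_waveheight.iloc[i], bottom=0, label=ntr_component_waveheight.index[i].replace('\n', ' '))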

# ax.bar('Total NTR', np.std(ntr_filled), color='white', edgecolor='black', linewidth=1)

# Labels and title
ax.set_ylabel('Height (cm)')
ax.set_title('Contributions to SWL by Frequency \n' + station_name)
ax.legend(loc='lower left', bbox_to_anchor=(1, 0))
# plt.xticks(rotation=45)
# plt.grid(axis='y', linestyle='--', alpha=0.7)

# no box
for spine in ax.spines.values():
    spine.set_visible(False)

figName = 'NTR_components_stds' + station_name
glue('NTR_components_stds',fig,display=False)

# save the wave height to csv
savepath = Path(data_dir, f'ntr_data/ntr_{station_id:03d}_component_waveheight.csv')
ntr_component_waveheight.to_csv(savepath)


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Set up figure
fig, ax = plt.subplots(figsize=(10, 6))  # adjust width based on number of stations

# Station names and component labels
stations = ntr_combined.columns

# remove 552
# stations = [station for station in stations if station != 552]
stations = [station for station in stations if station != 548]
stations = [station for station in stations if station != 14]
stations = [station for station in stations if station != 547]



# only include stations in ds
stations = [station for station in stations if station in ds.station_id.values]

# # remove French Frigate, Kaumalapau, and Barbers Point
# stations = [station for station in stations if station not in [552, 548, 14, 547]]

# get latitude of stations
latitudes = ds.lat.sel(station_id=stations).values
# sort by latitude

sorted_indices = np.argsort(latitudes)
# sort stations by latitude
stations = np.array(stations)[sorted_indices]

# Get the components
ntr_combined_norm = ntr_combined_norm[stations]
ntr_combined = ntr_combined[stations]
components = ntr_combined.index
x = np.arange(len(stations))  # one x-position per station

# Set color palette
colors = plt.cm.tab20(np.linspace(0, 1, len(components)))

# Plot each component. The heights are cumulative, so the bars are
# overlaid from zero (largest first) rather than stacked.
for i, component in reversed(list(enumerate(components))):
    heights = ntr_combined.loc[component].values
    ax.bar(x, heights, bottom=0, label=component.replace('\n', ' '))

#names of stations instead of numbers, make a dictionary
station_names = ds.station_name.sel(station_id=stations).values

# Customize axes
ax.set_ylabel("Contribution to NTR (cm)")
ax.set_xlabel("Station")
ax.set_title("Components by Frequency and Station")
ax.set_xticks(x)
ax.set_xticklabels(station_names, rotation=45, horizontalalignment='right')
ax.legend(loc='upper left', bbox_to_anchor=(1, 1))

# Remove box
for spine in ax.spines.values():
    spine.set_visible(False)

plt.tight_layout()
plt.show()



It's important to note that in the plot above we're looking at $4\sigma$, which is akin to 'significant wave height.' We do not consider each $\sigma$ independently in this plot; instead we compute the standard deviation of the combined signals as we work our way up to higher and higher frequencies. For example, the purple line above shows the contributions of any cycles that occur on timescales longer than 1 year. Each individual component has its own standard deviation, but none are truly independent signals (due to the filtering mechanism here), and therefore the variances cannot be directly added together to represent the total variance of the whole signal. (See the correlation plot above.)
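
The reason the variances cannot simply be added is the covariance term: $\mathrm{Var}(a + b) = \mathrm{Var}(a) + \mathrm{Var}(b) + 2\,\mathrm{Cov}(a, b)$. The sketch below illustrates this with two synthetic, deliberately correlated series and a hypothetical helper, `cumulative_four_sigma`, that computes $4\sigma$ of the running sum. It is an illustration of the idea only, not the code that produced the plot.

```python
import numpy as np


def cumulative_four_sigma(components):
    """4 * std of the running sum of components, lowest frequency first."""
    running = np.zeros_like(components[0], dtype=float)
    heights = []
    for comp in components:
        running = running + comp
        heights.append(4.0 * np.std(running))
    return np.array(heights)


# Two synthetic components that share a common signal (so they are correlated)
rng = np.random.default_rng(1)
shared = rng.standard_normal(5000)
slow = shared + 0.5 * rng.standard_normal(5000)
fast = 0.8 * shared + 0.5 * rng.standard_normal(5000)

heights = cumulative_four_sigma([slow, fast])

# Naive estimate that treats the components as independent
naive = 4.0 * np.sqrt(np.var(slow) + np.var(fast))

# The covariance term accounts for the difference
cov = np.cov(slow, fast, bias=True)[0, 1]
var_of_sum = np.var(slow) + np.var(fast) + 2.0 * cov
```

With positively correlated components the combined $4\sigma$ exceeds the naive estimate; with negatively correlated components it would be smaller.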

In [None]:
extremes_highest = extremes['Highest Date'].values
extremes_lowest = extremes['Lowest Date'].values

# CAUTION MANUAL ENTRY HERE
# add Oct 19 2024 8 am to extremes_highest for Nuku'alofa
# extremes_highest = np.append(extremes_highest, pd.to_datetime('2024-10-19 08:00:00'))

extremes_highest = pd.to_datetime(extremes_highest)
extremes_lowest = pd.to_datetime(extremes_lowest)

extremes_highest





In [None]:
# make separate plot for comparison of event size vs climatology

idx = 3

sl_extreme = ntr_filtered_extremes_high['sea_level'].iloc[idx]
event_date = ntr_filtered_extremes_high['time'].values[idx]
date_str = pd.to_datetime(event_date).strftime('%Y-%m-%d %H:%M')
data_on_date = ntr_filtered_extremes_high[ntr_filtered_extremes_high['time'] == event_date]


# Extract the components
components = column_order[6:]
# remove 'Nodal Mod' from components
components.remove('Nodal Mod')
components.remove('Trend')
# components.remove('NTR Trend')
component_values = data_on_date[components].values.flatten()
x_positions = np.arange(len(components))  # Positions for each component
y_stds = 0.1*ntr_component_stds[components].values.flatten()

# we want to include the y_stds from Nodal Mod, but not Nodal Amp, so replace the y_stds with the std of Nodal Mod
# y_stds[components.index('Nodal Amp')] = 0.2*ntr_component_stds['Nodal Mod']


fig, ax = plt.subplots(1, 1, figsize=(5, 3))

# assign 1 color to each component. There are XX components, so we need XX colors
colors = plt.cm.tab20(np.linspace(0, 1, len(components)))


### --- TOP LEFT PLOT: Bar Chart --- ###
bar_width = 0.2
ax.bar(x_positions, 0.1*component_values, alpha=1, color=colors)
ax.errorbar(x_positions, np.zeros_like(x_positions), yerr=y_stds, fmt='none', color='black',alpha=0.4, markersize=3, capsize=5, label='Standard Deviation')

ax.set_xticks(x_positions)

# replace 'Nodal Amp' with 'Nodal' in the x-ticks
components = [comp.replace('Nodal Amp', 'Nodal') for comp in components]
ax.set_xticklabels(components, rotation=45, ha='right')

# Title and labels
ax.set_title('Components on ' + date_str)
ax.set_ylabel('Contribution to \nNon-Tidal Residuals (cm)')

# Add dotted line at 0
ax.axhline(0, color='black', linestyle='--', linewidth=0.5)

In [None]:
# make timeseries figure
import matplotlib.pyplot as plt
# add mdates for x-axis formatting
import matplotlib.dates as mdates

componentsNTR = ['Decadal', 'Interannual', 'Subannual', 'Monthly', 'Weekly', 'Storms & HF']
componentsTide = ['Nodal Mod', 'Perigean', 'Seasonal']

fig, ax = plt.subplots(1, 1, figsize=(6, 3))


component_values = data_on_event[componentsNTR].values.flatten()
# component_values2 = data_on_event2[componentsNTR].values.flatten()
hatches = '//////'

x_positions = np.arange(len(componentsNTR))  # Positions for each component
y_stds = 100*ntr_component_stds[componentsNTR].values.flatten()

x_positionsTide = np.arange(len(componentsTide)) + 0.4  # Positions for each tide component
y_stdsTide = 100*ntr_component_stds[componentsTide].values.flatten()

# we want to include the y_stds from Nodal Mod, but not Nodal Amp, so replace the y_stds with the std of Nodal Mod
# y_stds[components.index('Nodal Amp')] = 0.2*ntr_component_stds['Nodal Mod']

# assign 1 color to each component. There are XX components, so we need XX colors
# component_order = ['Nodal', 'Decadal', 'Interannual', 'Seasonal', 'Subannual', 'Monthly','Weekly', 'Storms & HF']
tab10_colors = plt.cm.tab10.colors
colors = tab10_colors[:5] + tab10_colors[6:7]
colorsTide = tab10_colors[7:10]  # Use the last 3 colors for tide components
component_colors = {comp: colors[i % 6] for i, comp in enumerate(componentsNTR)}
component_colorsTide = {comp: colorsTide[i % 3] for i, comp in enumerate(componentsTide)}
#switch the last two colors in component_colorsTide
component_colorsTide['Seasonal'], component_colorsTide['Perigean'] = component_colorsTide['Perigean'], component_colorsTide['Seasonal']


### --- TOP PLOT: Bar Chart --- ###
bar_width = 0.5

ax.bar(
    x_positions,
    100*component_values,
    alpha=1,
    color=[component_colors[c] for c in componentsNTR],
    width=bar_width,
    label=date_str + ', NTR = ' + f'{100 * data_on_event["ntr"].values[0]:.2f} cm , Height: ' + f"{data_on_event['sea_level'].values[0] - mhhw:.2f} m"
)
# ax.bar(
#     x_positions+0.2,
#     0.1*component_values2,
#     alpha=1,
#     color=[component_colors[c] for c in componentsNTR],
#     width=bar_width,
#     hatch=hatches,
#     label=date_str2 + ', NTR = ' + f'{0.1 * data_on_event2["ntr"].values[0]:.2f} cm'
# )
ax.errorbar(x_positions, np.zeros_like(x_positions), yerr=y_stds, fmt='none', color='black',alpha=0.4, markersize=3, capsize=5)
ax.legend(fontsize=6, frameon=True, loc='upper left')
ax.set_xticks(x_positions)


# Move x-tick labels to the top
ax.set_xticklabels([])
ax_top = ax.secondary_xaxis('top')
ax_top.set_xticks(x_positions)
ax_top.set_xticklabels(componentsNTR, rotation=45, ha='left', fontsize=8)

#remove bottom x-ticks
ax.tick_params(axis='x', which='both', bottom=False, top=False, labelbottom=False)

# Title and labels
# ax.set_title('Components on ' + date_str)
ax.set_ylabel('Contribution to \nSWL (cm)')

# Add dotted line at 0
ax.axhline(0, color='black', linestyle='--', linewidth=0.5)
ax.set_ylim([-6, 10])

box = ax.get_position()
#make it skinnier
ax.set_position([box.x0, box.y0, box.width * 0.65, box.height])  # Shrink width to 65%
box_skinny = ax.get_position()
# add another axis on the right for the tide components

ax2 = fig.add_axes([box_skinny.x0 + box_skinny.width+0.025, box.y0, box.width * 0.35-0.025, box.height])  # Create a new Axes on the right side
# Plot tide components

ax2.bar(x_positionsTide, 100*data_on_event[componentsTide].values.flatten(), alpha=1, color = [component_colorsTide[c] for c in componentsTide], width=bar_width)
# ax2.bar(x_positionsTide+0.2, 0.1*data_on_event2[componentsTide].values.flatten(), alpha=1, color = [component_colorsTide[c] for c in componentsTide], width=bar_width, hatch=hatches)
ax2.set_ylabel('Tide Components (cm)')
ax2.axhline(0, color='black', linestyle='--', linewidth=0.5)
#remove bottom x-ticks
ax2.tick_params(axis='x', which='both', bottom=False, top=False, labelbottom=False)
ax2.set_ylim([-6,10])
ax2.errorbar(x_positionsTide, np.zeros_like(x_positionsTide), yerr=y_stdsTide, fmt='none', color='black',alpha=0.4, markersize=3, capsize=5, label=r'$\pm 1\sigma$')
ax.set_xticks(x_positions)

ax2.legend(fontsize=6, frameon=True, loc='upper left')


# Move x-tick labels to the top
ax2.set_xticklabels([])
ax2_top = ax2.secondary_xaxis('top')
ax2_top.set_xticks(x_positionsTide)
componentsTide = [comp.replace('Nodal Mod', 'Nodal') for comp in componentsTide]
componentsTide = [comp.replace('Seasonal', 'SA + SSA') for comp in componentsTide]
ax2_top.set_xticklabels(componentsTide, rotation=45, ha='left', fontsize=8)

#put ylabel on the right side
ax2.yaxis.set_label_position("right")
ax2.yaxis.tick_right()

adjust_axFormat(ax2)

ax2.text(0.98, 0.97, 'a2', transform=ax2.transAxes, fontsize=14,
            verticalalignment='top', horizontalalignment='right', fontweight='heavy')
    
ax.text(0.98, 0.97, 'a1', transform=ax.transAxes, fontsize=14,
            verticalalignment='top', horizontalalignment='right', fontweight='heavy')
        
# add station info on the bottom xlabel
ax.set_xlabel(station_name + ' (' + str(station_id) + ')')

# save the file to desktop as a png
figName = 'NTR_components_' + station_name + '_' + date_str
glue(figName,fig,display=False)

savepath = Path(output_dir, figName + '.png')
fig.savefig(savepath, dpi=300, bbox_inches='tight')


In [None]:
# import matplotlib.pyplot as plt
# import matplotlib.dates as mdates
# import pandas as pd
# import numpy as np

idx = 3

# sl_extreme = ntr_filtered_extremes_high['sea_level'].iloc[idx]
dates_to_plot = ntr_filtered_extremes_high['time'].values[idx]
# date_str = pd.to_datetime(dates_to_plot).strftime('%Y-%m-%d %H:%M')
# data_on_date = ntr_filtered_extremes_high[ntr_filtered_extremes_high['time'] == dates_to_plot]

# # Extract the components
# components = column_order[4:]
# # remove 'Nodal Mod' from components
# components.remove('Nodal Amp')
# component_values = data_on_date[components].values.flatten()

# # set tide and trend relative to MHHW
# component_values[components.index('tide')] -= 0.1*mhhw + 0.1*msl
# # # do the same for the trend
# component_values[components.index('Trend')] -= 0.1*mhhw + 0.1*msl

# # remove tide and trend

# component_values = np.delete(component_values, [components.index('tide'), components.index('Trend')])
# components.remove('tide')
# components.remove('Trend')

# x_positions = np.arange(len(components))  # Positions for each component

# fig, axes = plt.subplots(2, 2, figsize=(10, 6), gridspec_kw={'height_ratios': [1, 1], 'width_ratios': [1, 1]})

# # assign 1 color to each component. There are XX components, so we need XX colors
# colors = plt.cm.tab20(np.linspace(0, 1, len(components)))


# ### --- TOP LEFT PLOT: Bar Chart --- ###
# ax = axes[0, 0]  # First row, first column
# ax.bar(x_positions, 0.1*component_values, alpha=1, color=colors)
# ax.errorbar(x_positions, np.zeros_like(x_positions), yerr=y_stds, fmt='none', color='black',alpha=0.4, markersize=3, capsize=5, label='Standard Deviation')

# ax.set_xticks(x_positions)

# # replace 'Nodal Amp' with 'Nodal' in the x-ticks
# componentLables = [comp.replace('Nodal Mod', 'Nodal*') for comp in components]
# componentLables = [comp.replace('Seasonal','Seasonal*') for comp in componentLables]
# ax.set_xticklabels(componentLables, rotation=45, ha='right')



# # Title and labels
# ax.set_title('Components on ' + date_str)
# ax.set_ylabel('Contribution to \nTWL (cm)')

# # Add dotted line at 0
# ax.axhline(0, color='black', linestyle='--', linewidth=0.5)

# # Resize subplot width
# box = ax.get_position()
# # ax.set_position([box.x0, box.y0, box.width * 0.5, box.height])  # Shrink width to 75%

# ### --- TOP RIGHT PLOT: NTR & Weekly Trends --- ###
# ax = axes[0, 1]  # First row, second column

# Filter data to ±2 days around `dates_to_plot`
plusTime = pd.Timedelta('2d')
timespan = pd.date_range(start=dates_to_plot - plusTime, end=dates_to_plot + plusTime, freq='h')

# Ensure time is a datetime index
data_on_date = ntr_filtered[ntr_filtered['time'].isin(timespan)].copy()
data_on_date.set_index('time', inplace=True)

# # Plot NTR and Weekly trends
# # ax.plot(data_on_date.index, 0.1*data_on_date['ntr'], label='NTR', color='black', linewidth=0.5)
# for col in data_on_date.columns:
#     #only do for columns that aren't tide or sea level
#     if col not in ['sea_level','sea_level_detrended','tide','NTR','ntr','Storms & HF','NTR Trend','Nodal Amp','Trend']:
#         col_index = components.index(col)
#         ax.plot(data_on_date.index, 0.1*data_on_date[col], label=col, color = colors[col_index],linewidth=1)
#     if col == 'Storms & HF': #put on the bottom
#         col_index = components.index(col)
#         ax.plot(data_on_date.index, 0.1*data_on_date[col], label=col, color = colors[col_index],linewidth=1,zorder=0)

# # nodal_daily = data_on_date['Nodal Mod'].resample('D').max()
# # ax.plot(nodal_daily.index, 0.1*nodal_daily, label='Nodal Daily', color='green', linewidth=1)

# ax.axvline(dates_to_plot, color='red', linestyle='--', linewidth=0.5)

# #ensure x-axis is readable
# ax.xaxis.set_major_locator(mdates.DayLocator(interval=10))
# ax.xaxis.set_minor_locator(mdates.DayLocator(interval=1))
# ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
# ax.set_xlim([timespan[0], timespan[-1]])
# ax.set_title('TWL Components')
# ax.set_xlabel('Time')
# ax.legend(loc='upper right')

# ### --- BOTTOM PLOT: Sea Level, Tide, and Interannual --- ###
# fig.delaxes(axes[1, 1])
# fig.delaxes(axes[1, 0])
# ax = fig.add_subplot(2, 1, 2)

# # Filter data to ±10 days around `dates_to_plot`
# plusTime = pd.Timedelta('10d')
# timespan = pd.date_range(start=dates_to_plot - plusTime, end=dates_to_plot + plusTime, freq='h')

# # Ensure time is a datetime index
# data_on_date = ntr_filtered[ntr_filtered['time'].isin(timespan)].copy()
# data_on_date.set_index('time', inplace=True)

# # Plot sea level, tide, and interannual trend
# ax.plot(data_on_date.index, 0.1*(data_on_date['sea_level'] - mhhw), label='Sea Level', color='blue', linewidth=0.5)
# ax.plot(data_on_date.index, 0.1*(data_on_date['tide']+data_on_date['sea_level']-data_on_date['sea_level_detrended']-mhhw), label='Tide', color='cyan', linewidth=0.5)
# # ax.plot(data_on_date.index, 0.1*(data_on_date['ntr_withNodal'] - mhhw), label='Interannual', color='orange', linewidth=0.5)
# ax.set_ylabel('$RSL_{MHHW}$ (cm)')

# # Add legend and vertical line
# ax.legend(loc='lower right')

# # add circle at the date and height
# ax.scatter(dates_to_plot, 0.1*(sl_extreme-mhhw), color='red', s=50, zorder=5, facecolors='none')
# ax.annotate(f'{date_str}, {0.1*(sl_extreme-mhhw):.2f} cm', 
#              (dates_to_plot, 0.1*(sl_extreme-mhhw)), 
#              textcoords='offset points', 
#              xytext=(10, -10), 
#              ha='left', 
#              fontsize=8)

# ax.axvline(dates_to_plot, color='red', linestyle='--', linewidth=0.5)

# # Set x-axis limits
# ax.set_xlim([timespan[0], timespan[-1]])
# ax.set_title('Observed and Predicted Sea Level')

# # add title to entire figure
# fig.suptitle('Non-Tidal Residuals and Sea Level at ' + station_name + ' on ' + date_str, y=1.05)

# plt.tight_layout()
# plt.show()


# # save the file to desktop as a png
# figName = 'NTR_components_' + station_name + '_' + date_str
# glue(figName,fig,display=False)

# savepath = Path(output_dir, figName + '.png')
# fig.savefig(savepath, dpi=300, bbox_inches='tight')

# # data_on_date
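
The window selection in the cell above can be checked on synthetic data. This is a minimal sketch, assuming an hourly `time` column like the one in `ntr_filtered`; the frame `demo` and the event time are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly record; stands in for ntr_filtered
demo = pd.DataFrame({
    'time': pd.date_range('2020-01-01', periods=24 * 30, freq='h'),
    'ntr': np.arange(24 * 30, dtype=float),
})

event_time = pd.Timestamp('2020-01-15 12:00')  # stands in for dates_to_plot
half_window = pd.Timedelta('2d')

# Same approach as the notebook: build the hourly span, then keep matching rows
span = pd.date_range(event_time - half_window, event_time + half_window, freq='h')
window_isin = demo[demo['time'].isin(span)].set_index('time')

# Equivalent selection with a range test, which would also keep off-the-hour samples
window_between = demo[demo['time'].between(span[0], span[-1])].set_index('time')

n_rows = len(window_isin)
```

For a record sampled exactly on the hour the two selections agree; the range test is the more forgiving choice if timestamps are not aligned to the hour.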


In [None]:
# We need to treat the tide component differently here to get a better comparison
# get the daily high tides
# index the tide by time without modifying the column held in ntr_data
tide_data = ntr_data.set_index('time')['tide']
tide_max_daily = tide_data.resample('D').max()

tide_max_daily_std = tide_max_daily.std()

tide_min_daily = tide_data.resample('D').min()
tide_min_daily_std = tide_min_daily.std()
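
The reason for working with daily extremes can be seen on a synthetic tide: the hourly series swings through the full tidal range, so its standard deviation mostly reflects that range, while the daily maxima isolate how much the high-tide height changes from day to day. The sketch below uses the M2 and S2 periods with made-up amplitudes; none of the names come from the notebook.

```python
import numpy as np
import pandas as pd

hours = pd.date_range('2020-01-01', periods=24 * 60, freq='h')
t = np.arange(len(hours), dtype=float)

# Synthetic tide: M2 (12.42 h) plus S2 (12.00 h); amplitudes are illustrative only
tide = pd.Series(
    0.5 * np.cos(2 * np.pi * t / 12.42) + 0.2 * np.cos(2 * np.pi * t / 12.0),
    index=hours,
)

# One value per calendar day: the highest and lowest hourly tide
daily_high = tide.resample('D').max()
daily_low = tide.resample('D').min()

hourly_std = tide.std()
daily_high_std = daily_high.std()
daily_low_std = daily_low.std()
```

In this synthetic case the spread of the daily highs comes from the spring-neap beat between the two constituents and is noticeably smaller than the hourly spread.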

In [None]:
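def plot_component_amps(extremes_low_relative_to_std, high_or_low = 'high', station_name = ''):
    """Heatmap of component amplitudes (in standard deviations) for each extreme event.

    Despite its name, the first argument can hold either the high or the low
    extremes; `high_or_low` controls the sort order and the title.
    """
    import matplotlib.colors as mcolors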

    # Create heatmap figure
    fig, ax = plt.subplots(figsize=(7, 6))

    #drop the time column and turn it into the index
    extremes_high_relative_to_std_subset = extremes_low_relative_to_std.set_index('time')
    

    cmap = plt.cm.coolwarm
    colors = [(cmap(0.0)),  # Dark blue at -3
              (cmap(0.45)), # Light blue at -1
              (cmap(0.5)),  # White at 0
              (cmap(0.55)), # Light red at 1
              (cmap(1.0))]  # Dark red at 3
    positions = [-3, -1, 0, 1, 3]  # Data values at which each color is anchored

    # Scale the anchor values to [0, 1], as from_list expects, so each color lands where the comments say
    scaled_positions = [(p + 3) / 6 for p in positions]
    new_cmap = mcolors.LinearSegmentedColormap.from_list("modified_coolwarm", list(zip(scaled_positions, colors)))

    # Keep a **linear scale** but use the modified colormap
    norm = mcolors.Normalize(vmin=-3, vmax=3)

    # Order events by sea level: highest first for highs, lowest first for lows
    if high_or_low == 'high':
        extremes_high_relative_to_std_subset = extremes_high_relative_to_std_subset.sort_values(by='sea_level', ascending=False)
    elif high_or_low == 'low':
        extremes_high_relative_to_std_subset = extremes_high_relative_to_std_subset.sort_values(by='sea_level', ascending=True)

    # drop the columns that are not shown in the heatmap
    extremes_high_relative_to_std_subset = extremes_high_relative_to_std_subset.drop(columns=['sea_level','sea_level_detrended','Trend','Nodal Amp'])

    heatmap = ax.imshow(extremes_high_relative_to_std_subset.T, cmap=new_cmap, norm=norm,aspect='auto')

    # label rows and columns
    ax.set_xticks(np.arange(len(extremes_high_relative_to_std_subset)))
    ax.set_xticklabels(extremes_high_relative_to_std_subset.index.strftime('%Y-%m-%d %H:%M'),rotation=60, ha='right')
    ax.set_yticks(np.arange(len(extremes_high_relative_to_std_subset.columns)))
    ax.set_yticklabels(extremes_high_relative_to_std_subset.columns)

    # add colorbar, should be same height as heatmap
    cbar = fig.colorbar(heatmap, ax=ax, fraction=0.04, pad=0.04)
    cbar.set_label('Relative Amplitude\n (Standard Deviations)')

    # title depends on whether high or low extremes are shown
    if high_or_low == 'high':
        ax.set_title('Extreme High Sea Level Events:\nRelative Amplitudes of Non-Tidal Residual Components\n' + station_name)
    elif high_or_low == 'low':
        ax.set_title('Extreme Low Sea Level Events:\nRelative Amplitudes of Non-Tidal Residual Components\n' + station_name)

    return fig, ax
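
One way to pin colors to specific data values while keeping a linear `Normalize` is to scale the anchor values to the 0-1 range before handing them to `LinearSegmentedColormap.from_list`. The sketch below is a standalone illustration of that idea; `anchors`, `muted_cmap`, and `color_at_one` are names introduced here, not part of the notebook.

```python
import matplotlib.colors as mcolors
from matplotlib import cm

base = cm.coolwarm
vmin, vmax = -3.0, 3.0

# Data values (in standard deviations) and the color that should sit at each
anchors = [-3, -1, 0, 1, 3]
anchor_colors = [base(0.0), base(0.45), base(0.5), base(0.55), base(1.0)]

# from_list expects positions on [0, 1], starting at 0 and ending at 1
scaled = [(a - vmin) / (vmax - vmin) for a in anchors]
muted_cmap = mcolors.LinearSegmentedColormap.from_list(
    'muted_coolwarm', list(zip(scaled, anchor_colors))
)
norm = mcolors.Normalize(vmin=vmin, vmax=vmax)

# RGBA color assigned to a value of +1 standard deviation
color_at_one = muted_cmap(norm(1.0))
```

With this scaling, values within one standard deviation stay pale and the strong reds and blues are reserved for amplitudes approaching three standard deviations.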

## Make a Table


In [None]:
# make time the index
# extremes_table = extremes_table.set_index('time')

#change everything except time column to cm
extremes_table.iloc[:,1:] = extremes_table.iloc[:,1:]*100

#format time to be more readable
# extremes_table['time'] = extremes_table['time'].dt.strftime('%Y-%m-%d %H:%M')

# round to 1 decimal place
extremes_table = extremes_table.round(1)

# remove sum column
extremes_table = extremes_table.drop(columns='sum')

# put sea_level in first column
extremes_table = extremes_table[['time','sea_level','tide','ntr','Nodal Mod','Decadal','Interannual','Seasonal','Subannual','Weekly','Storms & HF']]


In [None]:
# now combine extremes_table with extremes_high_relative_to_std
extremes_table_relative = extremes_table.copy()
extremes_table_relative = extremes_table_relative.set_index('time')
std_table = extremes_high_relative_to_std.copy()
std_table = std_table.set_index('time')

#sort both on sea_level
extremes_table_relative = extremes_table_relative.sort_values(by='sea_level', ascending=False)
std_table = std_table.sort_values(by='sea_level', ascending=False)

extremes_table_relative

# Find common columns
common_columns = extremes_table_relative.columns.intersection(std_table.columns)

# Select only common columns from both tables
extremes_common = extremes_table_relative[common_columns]
std_common = std_table[common_columns]

# Rename std columns to make them distinct
std_common = std_common.rename(columns={col: f"{col}_std" for col in common_columns})

# Interleave columns (merge column-by-column)
interleaved_columns = sum(zip(extremes_common.columns, std_common.columns), ())  # Creates interleaved column order

# Combine the tables, ensuring interleaved order
formatted_table = pd.concat([extremes_common, std_common], axis=1)[list(interleaved_columns)]

# Add back 'sea_level' and 'ntr' at the front
formatted_table = pd.concat([extremes_table_relative[['sea_level', 'ntr']], formatted_table], axis=1)

#drop sea_level_std column
formatted_table = formatted_table.drop(columns=['sea_level_std','sea_level'])

# add 'sea level' column back in front
formatted_table.insert(0, 'sea_level', extremes_table_relative['sea_level'])

# format all to 1 decimal place
formatted_table = formatted_table.round(1)

formatted_table
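
The column interleaving used above is easier to follow on a toy example. This sketch assumes two frames with identical column names, one holding values and one holding the matching standard-deviation scores; the numbers are made up.

```python
import pandas as pd

# Hypothetical values (cm) and their amplitudes in standard deviations
values = pd.DataFrame({'tide': [40.1, 38.7], 'Seasonal': [5.2, 4.9]})
scores = pd.DataFrame({'tide': [1.8, 1.6], 'Seasonal': [0.9, 0.7]})

# Make the score columns distinct: tide_std, Seasonal_std
scores = scores.add_suffix('_std')

# Alternate value and score columns: tide, tide_std, Seasonal, Seasonal_std
order = [name for pair in zip(values.columns, scores.columns) for name in pair]
combined = pd.concat([values, scores], axis=1)[order]
```

The nested comprehension does the same job as `sum(zip(...), ())` and returns a list directly, so no conversion from a tuple is needed.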


In [None]:
formatted_table_time = formatted_table.reset_index()
formatted_table_time['time'] = formatted_table_time['time'].dt.strftime('%Y-%m-%d %H:%M')
formatted_table_time = formatted_table_time.rename(columns={'Storms & HF_std': 'Storms_std'})

formatted_table_time.columns


In [None]:
station

In [None]:
#make a pretty pdf of the table with great_tables
from great_tables import GT, html, style, loc

# make time a column again
formatted_table_time = formatted_table.reset_index()
# make time a string
formatted_table_time['time'] = formatted_table_time['time'].dt.strftime('%Y-%m-%d %H:%M')

#change 'Storms & HF_std' to 'Storms_std'
formatted_table_time = formatted_table_time.rename(columns={'Storms & HF_std': 'Storms_std'})

# ntr_columns = ['Interdecadal','Interdecadal_std',
#                'Decadal','Decadal_std',
#                'Interannual','Interannual_std',
#                'Seasonal','Seasonal_std',
#                'Intraannual','Intraannual_std',
#                'Weekly','Weekly_std',
#                'Storms & HF','Storms_std']

# 'Subannual' matches the column name used in extremes_table
ntr_columns = ['Decadal','Decadal_std',
               'Interannual','Interannual_std',
               'Seasonal','Seasonal_std',
               'Subannual','Subannual_std',
               'Weekly','Weekly_std',
               'Storms & HF','Storms_std']

# ntr_columns = ['Nodal','Nodal_std']

# col_width_dict = #make dictionary using ntr_columns
col_width_dict = {col: "30px" for col in ntr_columns}
#add time to col_width_dict
col_width_dict['time'] = "120px"
col_width_dict['Nodal Amp'] = "20px"

std_columns = [col for col in formatted_table_time.columns if 'std' in col]

# Create a Table object
table = (
    GT(formatted_table_time)
    .tab_options(table_font_size="12px")
    .cols_width(cases={"time" : "150px"})
        .cols_label(
        time=html(''),
        ntr=html('NTR'),
        sea_level=html('Sea Level'),
        tide=html('Tide'),
        tide_std=html('(σ̂)'),
        Decadal_std=html('(σ̂)'),
        Interannual_std=html('(σ̂)'),
        Interannual=html('Inter-\nannual'),
        Seasonal_std=html('(σ̂)'),
        Subannual_std=html('(σ̂)'),
        Subannual=html('Intra-\nannual'),
        Weekly_std=html('(σ̂)'),
        Storms_std=html('(σ̂)'),
        **{"Nodal Mod_std": html('(σ̂)')},  # Use quotes for column names with spaces
        #  **{"Nodal Amp_std": html('(σ̂)')},  # Use quotes for column names with spaces
        # tide=html('Tide')
        )
        # .tab_spanner(
            # label="Non-Tidal Residual", columns=ntr_columns)
        .tab_header(
            title=station_name + ' (' + str(station) + ')', subtitle='Top 10 Extreme Sea Level Events and their Non-Tidal Residual Components')
        .tab_source_note(
            source_note='Data are in cm, relative to MHHW. The (σ̂) represents the magnitude of each component relative to its standard deviation. Data source: NOAA CO-OPS Hourly Water Level. Time is GMT.')
        # .tab_source_note(
            # source_note='Data: ' +ds.attrs['title'] + ', ' + ds.attrs['publisher_url'] + ', ' + 'UHSLC Station ID: ' + str(station))
        # .fmt_number(
            # columns=std_columns, pattern='({x})',decimals=1)
        .data_color(columns=std_columns, palette='RdBu',reverse=True,domain=(-4,4),alpha=0.5)

)

# save the table to a pdf
output_path = Path(output_dir, f'1.5.2_SL_rankings_NTR_relative_amplitudes_top10_{station_name}_high_table.pdf')
table.save(str(output_path))
# save the table to a png
output_path = Path(output_dir, f'1.5.2_SL_rankings_NTR_relative_amplitudes_top10_{station_name}_high_table.png')
table.save(str(output_path))

In [None]:
# make time a column again
formatted_table_time = extremes_table_relative.reset_index()
# make time a string
formatted_table_time['time'] = formatted_table_time['time'].dt.strftime('%Y-%m-%d %H:%M')


# col_width_dict = #make dictionary using ntr_columns
col_width_dict = {col: "5%" for col in ntr_columns}
#add time to col_width_dict
# col_width_dict['time'] = "120px"

std_columns = [col for col in formatted_table_time.columns if 'std' in col]

timeframes_str = "; ".join([f"{k}: {v}" for k, v in timeframes.items()])

# Create a Table object
(
    GT(formatted_table_time)
    .tab_options(table_font_size="13px")
    .cols_width(cases={"time": "150px"})
    .cols_label(
        time = html(''),
        ntr=html('NTR'), sea_level=html('Sea Level'), 
        # Interdecadal = html('Inter-\ndecadal'),
        Interannual=html('Inter-\nannual'),Subannual=html('Sub-\nannual'),
        tide=html('Tide'))
        .tab_spanner(
            label="Non-Tidal Residual", columns=componentsNTR)
        .tab_header(
            title=station_name + ' (' + str(station) + ')',subtitle='Top 10 Extreme Sea Level Events and their Non-Tidal Residual Components')
        .tab_source_note(
            source_note='Data are in cm, relative to MHHW. '
            'The Nodal component shown here is included in the Tide component, representing the nodal modulation of the tide at the given hour. '
            'The timescales of the NTR components are as follows: ' + timeframes_str + '.')
            
        .tab_source_note(
            # source_note='Data: ' +ds.attrs['title'] + ', ' + ds.attrs['publisher_url'] + ', ' + 'UHSLC Station ID: ' + str(station))
            source_note = 'Data: NOAA CO-OPS Hourly Water Level, time in GMT')
        .fmt_number(
            columns=std_columns, pattern='({x})',decimals=1)
        .data_color(columns=std_columns, palette='RdBu',reverse=True,domain=(-4,4),alpha=0.5)
)

In [None]:
# let's turn our bar chart into a pie chart
# we'll use what we used previously to get the std of each component
# ntr_cumsum_stds
# ntr_var = ntr_filtered['ntr'].var()

# ntr_component_vars['covariance'] = ntr_var - sum_ntr_var

# get the difference between each successive component
ntr_cumsum_diff = ntr_cumsum_stds.diff()
ntr_cumsum_diff.iloc[0] = 0  # diff() leaves NaN in the first slot; set it by position
# ntr_cumsum_diff['NTR Trend'] = ntr_cumsum_stds['NTR Trend']

# ntr_cumsum_diff['Interdecadal'] = ntr_cumsum_stds['Interdecadal']

#reverse the order of the components
# ntr_cumsum_stds = ntr_cumsum_stds[::-1]
# ntr_component_vars = ntr_component_vars[::-1]

# make a pie chart, ignoring covariance for now
fig, ax = plt.subplots(figsize=(3,3))
ax.pie(ntr_cumsum_diff, labels=ntr_cumsum_diff.index, startangle=140)
center_circle = plt.Circle((0, 0), 0.5, fc='white')  # Creates a white hole
ax.add_patch(center_circle)
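
As a rough illustration of what the differenced values represent, assume `ntr_cumsum_stds` holds the standard deviation of the running sum of components (an assumption; its construction is not shown in this section). The successive differences then telescope to the standard deviation of the full sum. The components below are synthetic noise with made-up amplitudes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical components; names and amplitudes are illustrative only
components = pd.DataFrame({
    'Decadal': 2.0 * rng.standard_normal(5000),
    'Seasonal': 3.0 * rng.standard_normal(5000),
    'Weekly': 1.0 * rng.standard_normal(5000),
})

# Standard deviation of the running sum across components (column-wise cumsum)
cumsum_stds = components.cumsum(axis=1).std()

# Increment in std as each component is added; keep the first component's own std
increments = cumsum_stds.diff()
increments.iloc[0] = cumsum_stds.iloc[0]

total_std = components.sum(axis=1).std()
```

Here the first increment is set to the first component's own standard deviation so that the slices sum to the total; the cell above sets it to zero instead. Because standard deviations do not add linearly, the size of each increment depends on the order in which the components are summed.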



In [None]:
# make a locations dictionary, with stations: (lon, lat)
# from ds
stations = ds['station_name'].values
lons = ds['lon'].values
lats = ds['lat'].values

locations = {station: (lon,lat) for station, lat, lon in zip(stations, lats, lons)}
# same coordinates keyed by station id (avoid shadowing the built-in `id`)
locations_id = {station_id: lonlat for station_id, lonlat in zip(station_ids, locations.values())}

# make pie_data dictionary
ntr_component_stds_subset = ntr_component_stds.copy()
ntr_component_stds_subset = ntr_component_stds_subset.drop(['sea_level', 'tide'])
pie_data = {station: ntr_component_stds_subset for station in stations}


locations
station_ids

In [None]:
locations_id[station]

In [None]:
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes

crs = ccrs.PlateCarree(central_longitude=180)

# Create figure and axis
fig, ax = plt.subplots(figsize=(10,5), subplot_kw={'projection': crs})
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.LAND, color='lightgrey')

# cmems = xr.open_dataset(data_dir / 'cmems_L4_SSH_0.125deg_1993_2024.nc')

xlims = [ds['lon'].min()-5, ds['lon'].max()+5]
ylims = [ds['lat'].min()-5, ds['lat'].max()+5]
xlims_360 = [x + 360 if x < 0 else x for x in xlims]

ax.set_extent([xlims_360[0], xlims_360[1], ylims[0], ylims[1]], crs=ccrs.PlateCarree())

def plot_pie_inset(data, lon, lat, station, ax, width, alpha=1):
    # Convert lon/lat to the map projection's coordinates (what ax.transData expects)
    x, y = ax.projection.transform_point(lon, lat, ccrs.PlateCarree())[:2]
    y = y + 1 
    x = x + 0.5
    pie_ax = inset_axes(ax, width=width, height=width, loc=10, 
                        bbox_to_anchor=(x, y), bbox_transform=ax.transData, borderpad=0)
    pie_ax.set_facecolor('none')  # Fully transparent background

    # no autopct: an empty format string fails when matplotlib applies it to each wedge's percentage
    pie_ax.pie(data, startangle=140,
               wedgeprops={'alpha': 0.8, 'edgecolor': 'black', 'linewidth': 0.5})
    
    # pie_ax.pie(data, startangle=140)  # Draw pie chart
    pie_ax.set_xticks([])
    pie_ax.set_yticks([])
    pie_ax.set_frame_on(False)  # Hide frame
    # add title
    # pie_ax.set_title(station, fontsize=8)

for station, data in pie_data.items():
    lon, lat = locations_id[station]
    # if data is all zeros, skip
    if np.all(data == 0):
        continue
    width = 0.5*(ntr_mag[station]/max(ntr_mag.values()))

    plot_pie_inset(data, lon, lat, station,ax, width=width)

ax.scatter(lons, lats, color='black', s=5, label='Station', transform=ccrs.PlateCarree())


#add grid
gl = ax.gridlines(draw_labels=True, linestyle=':', color='black',
                  alpha=0.5,xlocs=ax.get_xticks(),ylocs=ax.get_yticks(),crs=crs)
#make all labels tiny
gl.xlabel_style = {'size': 8}
gl.ylabel_style = {'size': 8}


# Will need to fix longitude labels!!
ax.set_title('Non-Tidal Residual Component Beachballs by Station')
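
The longitude handling matters when stations sit on both sides of the dateline. A small sketch, using made-up station longitudes, of wrapping to 0-360 before taking the map limits:

```python
import numpy as np

# Made-up station longitudes in the -180..180 convention
lons_180 = np.array([-160.0, -155.5, 144.8, 171.4])

# Wrap to 0-360 so stations on either side of the dateline stay contiguous
lons_360 = np.mod(lons_180, 360.0)

pad = 5.0
extent_180 = (lons_180.min() - pad, lons_180.max() + pad)
extent_360 = (lons_360.min() - pad, lons_360.max() + pad)

# Width of the map in degrees under each convention
span_180 = extent_180[1] - extent_180[0]
span_360 = extent_360[1] - extent_360[0]
```

Wrapping the station longitudes first, rather than wrapping the limits afterwards, keeps the lower limit below the upper one when the stations straddle the dateline, and gives a much narrower map in that case.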


In [None]:
# ntr_component_stds_df drop columns that are zeros, remove from lons and lats as well
# lons = lons[(ntr_component_stds_df != 0).any(axis=0)]
# lats = lats[(ntr_component_stds_df != 0).any(axis=0)]
ntr_component_stds_df = ntr_component_stds_df.loc[:, (ntr_component_stds_df != 0).any(axis=0)]


In [None]:
ntr_component_stds_df_mags.loc[componentsNTR]


In [None]:
# make ntr components figure for all gauges
# Create figure
# fig, ax = plt.subplots(figsize=(6, 6))

# make empty dataframe ntr_component_stds_df
ntr_component_stds_df_new = pd.DataFrame(columns=ds['station_name'].values)

# get list of all ntr_component_stds in data folder, and combine them into a single dataframe
# (loop names chosen so they don't overwrite `station` or shadow the built-in `id`)
for name, station_id in zip(ds['station_name'].values, ds['station_id'].values):
    # read in the ntr_component_stds file
    station_path = Path(data_dir, f'ntr_data/ntr_{station_id:03d}_component_stds.csv')
    if not station_path.exists():
        continue
    station_data = pd.read_csv(station_path, index_col=0)
    # add to the dataframe
    ntr_component_stds_df_new[name] = station_data.squeeze()
# ntr_component_stds = pd.read_csv(Path(data_dir, f'ntr_data/ntr_{station:03d}_component_stds.csv'), index_col=0)


# ntr_component_vars_cumsum = ntr_component_vars.cumsum()/ntr_component_vars_sum * ntr_var #normalize to the variance of the ntr (not filtered)
# # Plot stacked bars
# bottom = 0
# for i in range(len(ntr_component_waveheight.index)-1, -1, -1):
#     ax.bar('Components', ntr_component_waveheight[i], bottom=0, label=ntr_component_waveheight.index[i].replace('\n', ' '))

# # ax.bar('Total NTR', np.std(ntr_filled), color='white', edgecolor='black', linewidth=1)

# # Labels and title
# ax.set_ylabel('Height (cm)')
# ax.set_title('Non-Tidal Residual Components by Frequency \n' + station_name)
# ax.legend(loc='lower left', bbox_to_anchor=(1, 0))
# # plt.xticks(rotation=45)
# # plt.grid(axis='y', linestyle='--', alpha=0.7)

# # no box
# for spine in ax.spines.values():
#     spine.set_visible(False)

# figName = 'NTR_components_stds' + station_name
# glue('NTR_components_stds',fig,display=False)

ntr_component_stds_df
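
An alternative way to assemble the per-station table is to collect one Series per station in a dictionary and concatenate once. The sketch below uses in-memory Series with made-up values in place of the CSV files; the station names are placeholders.

```python
import pandas as pd

# Hypothetical per-station standard deviations, standing in for one CSV each
per_station = {
    'Station A': pd.Series({'Decadal': 2.1, 'Seasonal': 3.4, 'Weekly': 1.2}),
    'Station B': pd.Series({'Decadal': 1.8, 'Seasonal': 2.9, 'Weekly': 1.5}),
}

# Dict keys become column labels; rows align on component name
stds_by_station = pd.concat(per_station, axis=1)
```

Stations without a file are simply never added to the dictionary, so the result has no all-empty columns to clean up afterwards.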