# Coil Normalization
We have shown the derivation of coils to describe the flow of real- and complex-valued probability. However, many dynamical systems are not typically expressed as flows of probability. A simple technique, referred to here as coil normalization, can be used to express a system's dynamics as a flow of probability.

## System Change as a Probability
For a univariate timeseries, we can assign some maximum absolute change $\Delta y_{max}$ in a single $\Delta t$, either calculated from historical dynamics or estimated. We then treat the series as having two distinct system states, $\uparrow$ and $\downarrow$, and treat every actual change as a superposition of these two states:

$$
\Delta y(t) = P(\uparrow)\Delta y_{max} - P(\downarrow)\Delta y_{max} \quad \text{where} \quad P(\uparrow) + P(\downarrow) = 1
$$

So $P(\uparrow)=1$ and $P(\downarrow)=0$ represents the system increasing by the maximum change, $P(\uparrow)=0$ and $P(\downarrow)=1$ represents the system decreasing by the maximum change, and $P(\uparrow)=0.5$ and $P(\downarrow)=0.5$ represents the system having no change between $t$ and $t+\Delta t$. 

It follows that:

$$
P(\uparrow) = \frac{\Delta y(t) + \Delta y_{max}}{2\Delta y_{max}} \quad \text{and} \quad P(\downarrow) = 1 - \frac{\Delta y(t) + \Delta y_{max}}{2\Delta y_{max}}
$$
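
The intermediate step can be written out explicitly. Substituting $P(\downarrow) = 1 - P(\uparrow)$ into the superposition and solving for $P(\uparrow)$ gives:

$$
\Delta y(t) = P(\uparrow)\Delta y_{max} - \bigl(1 - P(\uparrow)\bigr)\Delta y_{max} = \bigl(2P(\uparrow) - 1\bigr)\Delta y_{max} \implies P(\uparrow) = \frac{\Delta y(t) + \Delta y_{max}}{2\Delta y_{max}}
$$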


This operation is reversible given a known original value: the changes can be recovered from the probabilities and accumulated from that starting point.
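
As a minimal sketch of the univariate case, the snippet below maps the changes of a toy series to $P(\uparrow)$ and then rebuilds the series from its first value. The helper names `to_up_probability` and `from_up_probability` and the toy values are assumptions made for illustration only.

```python
import numpy as np

def to_up_probability(delta_y, delta_y_max):
    """Map a change in [-delta_y_max, +delta_y_max] to P(up) in [0, 1]."""
    return (delta_y + delta_y_max) / (2.0 * delta_y_max)

def from_up_probability(p_up, delta_y_max):
    """Invert the mapping, using P(down) = 1 - P(up)."""
    return (2.0 * p_up - 1.0) * delta_y_max

# Hypothetical toy series; the values are illustrative only.
y = np.array([10.0, 12.0, 11.0, 11.0, 7.0])
delta_y = np.diff(y)                       # changes between consecutive steps
delta_y_max = np.abs(delta_y).max()        # max absolute change, here fit from the data

p_up = to_up_probability(delta_y, delta_y_max)
p_down = 1.0 - p_up

# Reversal: recover the changes, then accumulate them from the known first value.
recovered_changes = from_up_probability(p_up, delta_y_max)
y_rebuilt = np.concatenate(([y[0]], y[0] + np.cumsum(recovered_changes)))
```

Here the largest drop maps to $P(\uparrow)=0$, the unchanged step maps to $P(\uparrow)=0.5$, and `y_rebuilt` matches `y`.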

## Application to Multi-Feature Systems
In multivariate systems, this same approach can be applied to each feature, and then the array of all probabilities can be normalized (divided by its sum) to yield a single probability distribution across the $\uparrow_{i}$ and $\downarrow_{i}$ states of every feature $i$. This too is reversible, by renormalizing each $\uparrow_{i}$, $\downarrow_{i}$ pair so that it sums to 1.
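
The multivariate step can be sketched with plain NumPy under a few assumptions: the toy `delta` array is hypothetical, and the interleaved (down, up) column order is simply a choice made for this sketch.

```python
import numpy as np

# Hypothetical per-step changes: 3 time steps (rows) for 2 features (columns).
delta = np.array([[ 1.0, -0.5],
                  [-2.0,  0.0],
                  [ 0.0,  0.5]])
delta_max = np.abs(delta).max(axis=0)      # per-feature max absolute change

# Per-feature probabilities, exactly as in the univariate case.
p_up = (delta + delta_max) / (2.0 * delta_max)
p_down = 1.0 - p_up

# Interleave the states as (down_0, up_0, down_1, up_1, ...).
n_steps, n_features = delta.shape
states = np.empty((n_steps, 2 * n_features))
states[:, 0::2] = p_down
states[:, 1::2] = p_up

# Each pair sums to 1, so each row sums to n_features; dividing by the
# row sum gives a single probability distribution per time step.
joint = states / states.sum(axis=1, keepdims=True)

# Reverse: renormalize each (down_i, up_i) pair to recover P(up_i).
pair_sums = joint[:, 0::2] + joint[:, 1::2]
p_up_recovered = joint[:, 1::2] / pair_sums
delta_recovered = (2.0 * p_up_recovered - 1.0) * delta_max
```

One consequence of this construction is that every feature pair carries an equal share $1/n$ of the total probability for $n$ features, so a feature with no change contributes $0.5/n$ to each of its two states.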

## Coil Normalization Class

To handle calculation of the max absolute change, normalization, and denormalization, we can make a `CoilNormalizer` class.

In [77]:

import pandas as pd
import numpy as np

class CoilNormalizer:
    def __init__(self):
        # Per-feature max absolute change, set when normalize() fits it
        self.max_change_df = None
        # Maps feature position -> positions of its [down, up] columns
        self.conserved_subgroups = None
    
    def max_absolute_change(self, df_diff, change_factor = 1.0):
        # Calculate the absolute change for each feature
        abs_change = df_diff.abs()

        # Find the maximum absolute change for each feature
        max_changes = abs_change.max() * change_factor

        # Convert the series to a DataFrame
        max_change_df = max_changes.to_frame(name='Max Absolute Change')

        return max_change_df
    
    def normalize(self, df, fit_change = True, change_factor = 1.0):
        # Calculate the change for each feature
        df_diff = df.diff()

        # Drop first row
        df_diff = df_diff.iloc[1:,:]
        
        # If we don't fit the max absolute change, the expectation is
        #  we are using a previously fit one. 
        if fit_change:
            # Calculate the max absolute change for each feature
            self.max_change_df = self.max_absolute_change(df_diff, change_factor = change_factor)
        else:
            if self.max_change_df is None:
                raise ValueError('No max absolute change previously fit')

        normalized_features = []
        conserved_subgroups = {}
        count = 0
        index = 0
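        for column in df_diff.columns:
            max_change_val = self.max_change_df.loc[column, 'Max Absolute Change']
            # A feature that never changes has zero max change,
            #  which would yield NaN in the division below
            if max_change_val == 0:
                raise ValueError(f'Max absolute change is zero for {column!r}')
            # P(up) for this feature
            normalized_feature = (df_diff[column] + max_change_val) / (2 * max_change_val)
            
            # P(down) = 1 - P(up)
            normalized_counter = 1 - normalized_feature
            
            # Columns are stored as (down, up) pairs
            normalized_features.append(normalized_counter)
            normalized_features.append(normalized_feature)
            
            # Record the positions of this feature's (down, up) columns
            conserved_subgroups[index] = [count, count + 1]
            index += 1
            count += 2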
            
        # Create dataframe
        normalized_df = pd.DataFrame(normalized_features).T
                
        # Normalize the dataframe again by row such that each row sums to 1
        normalized_df = normalized_df.div(normalized_df.sum(axis=1), axis=0)
        
        self.conserved_subgroups = conserved_subgroups
        return normalized_df
    
    def denormalize(self, normalized_df, initial_value):
        # Assumes columns come in (down, up) pairs, as produced by normalize()
        denorm_dict = {}
        for key, value in self.conserved_subgroups.items():
            df_sel = normalized_df.iloc[:,value]
            # Renormalize the pair so it sums to 1, then keep P(up)
            denorm_dict[df_sel.columns[1]] = df_sel.div(df_sel.sum(axis=1), axis=0).iloc[:,1]
            
        denormalized_df = pd.DataFrame(denorm_dict, index=normalized_df.index)
        
        # Rebuild the series by accumulating changes from the initial value
        reconstructed_array = []
        new_value = initial_value
        reconstructed_array.append(initial_value)
        delta_max = self.max_change_df.iloc[:,0]
        for i in range(denormalized_df.shape[0]):
            # Invert P(up) = (delta_y + delta_max) / (2 * delta_max)
            delta_y = 2 * denormalized_df.iloc[i,:] * delta_max - delta_max
            new_value = new_value + delta_y
            reconstructed_array.append(new_value)

        return pd.DataFrame(reconstructed_array, index = [initial_value.name] + list(denormalized_df.index))

## Coil Normalization Demo

Coil normalization can be demonstrated by loading in a Pandas dataframe of some multivariate timeseries data. 

In [78]:
# Load in csv; timestamps are parsed explicitly below because the
# day-first format is ambiguous for automatic date parsing
df = pd.read_csv(r"../data/jena_climate_2009_2016.csv",
                index_col=['Date Time'])
df.index = pd.to_datetime(df.index, format='%d.%m.%Y %H:%M:%S')

# Keep a copy of the full dataframe
df_orig = df.copy()
# For these tests we will just use a small slice of the dataset
df = df.iloc[:3000,:]

Then we can create a coil normalizer and create a new normalized dataframe. 

In [79]:
# Instantiate CoilNormalizer
coilnormer = CoilNormalizer()

coilnormed_df = coilnormer.normalize(df)
coilnormed_df

Date Time,p (mbar),p (mbar),T (degC),T (degC),Tpot (K),Tpot (K),Tdew (degC),Tdew (degC),rh (%),rh (%),...,H2OC (mmol/mol),H2OC (mmol/mol),rho (g/m**3),rho (g/m**3),wv (m/s),wv (m/s),max. wv (m/s),max. wv (m/s),wd (deg),wd (deg)
2009-01-01 00:20:00,0.033510,0.037919,0.044643,0.026786,0.044586,0.026843,0.044702,0.026727,0.035356,0.036072,...,0.044173,0.027256,0.026055,0.045373,0.039755,0.031674,0.037946,0.033482,0.037331,0.034098
2009-01-01 00:30:00,0.037478,0.033951,0.038004,0.033425,0.037989,0.033439,0.036424,0.035005,0.033925,0.037504,...,0.036654,0.034774,0.033641,0.037787,0.042623,0.028806,0.043482,0.027946,0.032172,0.039257
2009-01-01 00:40:00,0.036596,0.034832,0.031136,0.040293,0.030937,0.040491,0.030038,0.041391,0.034641,0.036788,...,0.030075,0.041353,0.040662,0.030767,0.033759,0.037669,0.036875,0.034554,0.033080,0.038349
2009-01-01 00:50:00,0.035714,0.035714,0.034799,0.036630,0.035032,0.036397,0.035005,0.036424,0.036072,0.035356,...,0.034774,0.036654,0.036609,0.034819,0.035975,0.035454,0.034554,0.036875,0.034088,0.037341
2009-01-01 01:00:00,0.036155,0.035273,0.030678,0.040751,0.030482,0.040946,0.029565,0.041864,0.034641,0.036788,...,0.030075,0.041353,0.041086,0.030343,0.037148,0.034281,0.035714,0.035714,0.037870,0.033559
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2009-01-21 19:20:00,0.031305,0.040123,0.038690,0.032738,0.038672,0.032757,0.036424,0.035005,0.033567,0.037861,...,0.037594,0.033835,0.032369,0.039060,0.034150,0.037278,0.033482,0.037946,0.037431,0.033998
2009-01-21 19:30:00,0.030423,0.041005,0.038919,0.032509,0.039126,0.032302,0.036660,0.034768,0.033209,0.038219,...,0.036654,0.034774,0.031945,0.039484,0.041319,0.030109,0.040179,0.031250,0.036523,0.034906
2009-01-21 19:40:00,0.031305,0.040123,0.039148,0.032280,0.039354,0.032075,0.037370,0.034059,0.033925,0.037504,...,0.038534,0.032895,0.031851,0.039578,0.034281,0.037148,0.032411,0.039018,0.039636,0.031792
2009-01-21 19:50:00,0.029541,0.041887,0.039377,0.032051,0.039581,0.031847,0.038553,0.032876,0.034641,0.036788,...,0.040414,0.031015,0.031238,0.040190,0.037018,0.034411,0.040179,0.031250,0.032152,0.039277


Each pair of columns in this timeseries holds the $\downarrow_{i}$, $\uparrow_{i}$ probabilities for one feature $i$, in that order, as constructed in `normalize`. As expected, each row of this normalized timeseries sums to 1:

In [80]:
coilnormed_df.sum(axis = 1)

Date Time
2009-01-01 00:20:00    1.0
2009-01-01 00:30:00    1.0
2009-01-01 00:40:00    1.0
2009-01-01 00:50:00    1.0
2009-01-01 01:00:00    1.0
                      ... 
2009-01-21 19:20:00    1.0
2009-01-21 19:30:00    1.0
2009-01-21 19:40:00    1.0
2009-01-21 19:50:00    1.0
2009-01-21 20:00:00    1.0
Length: 2999, dtype: float64

We should be able to take any slice of the coil normalized timeseries and reproduce the original data, up to floating-point error, provided an initial value.

In [81]:
start_index = 200 # Start index of the slice
end_index = 300 # End index of the slice
df_orig_slice = df.iloc[start_index:end_index,:]
coilnormed_df_slice = coilnormed_df.iloc[(start_index):(end_index-1),:]
initial_value_slice = df.iloc[start_index,:]

denormed_slice = coilnormer.denormalize(coilnormed_df_slice, initial_value_slice)

We can superimpose the denormalized and original timeseries slices on one another to confirm they are identical:

In [82]:
import plotly.graph_objects as go

# Dataframes to compare
df1 = denormed_slice
df2 = df_orig_slice

# Plotting
fig = go.Figure()

# Add traces for the first dataframe, plotted against its own index
for column in df1.columns:
    fig.add_trace(go.Scatter(x=df1.index, y=df1[column], mode='lines', name=f'Denormed: {column}'))

# Add traces for the second dataframe
for column in df2.columns:
    fig.add_trace(go.Scatter(x=df2.index, y=df2[column], mode='lines', name=f'Orig: {column}'))

# Update layout
fig.update_layout(title='Interactive Denormalization Check',
                  xaxis_title='Date',
                  yaxis_title='Value',
                  legend_title='Legend',
                  hovermode='x unified')

# Show the plot
fig.show()
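
A visual overlay can hide small discrepancies, so a numerical comparison is a useful complement. The sketch below uses a hypothetical helper, `max_abs_error`, and toy stand-in frames. In the notebook, the same calls would be applied to `denormed_slice` and `df_orig_slice`.

```python
import numpy as np
import pandas as pd

def max_abs_error(reconstructed: pd.DataFrame, original: pd.DataFrame) -> float:
    """Largest element-wise absolute difference between two equally shaped frames."""
    return float(np.abs(reconstructed.to_numpy() - original.to_numpy()).max())

# Hypothetical stand-ins for the original and reconstructed slices.
original = pd.DataFrame({"a": [1.0, 2.0, 4.0], "b": [0.5, 0.25, 0.0]})
reconstructed = original + 1e-12   # simulate floating-point round-off

err = max_abs_error(reconstructed, original)
matches = np.allclose(reconstructed.to_numpy(), original.to_numpy())
```

A small `err` and `matches == True` indicate that the round trip reproduces the data up to floating-point error.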


This coil normalization technique can be used to transform any timeseries into a flow of probabilities, which will be helpful in our efforts to model these system dynamics with coils. 