# Formulation of Real-Valued Coil as Neural Network Layer

Having established that coils can produce interesting system dynamics in the form of probability flows, and that any system dynamics can be expressed as a flow of probabilities using coil normalization, a natural next step is to model real system dynamics with coils.

We find that coils can be translated into Recurrent Neural Networks, as will be shown in this section. This was inspired by [Mamba's] handling of state spaces using RNNs / CNNs, though there is no direct code overlap. 

In [1]:
import pandas as pd
import numpy as np
import torch

We will begin with loading and coil normalizing our timeseries data as we did in the coil normalization section:

In [2]:
# Load in csv; the index is parsed below with an explicit day-first format
# so that dates such as 01.02.2009 are not read month-first
df = pd.read_csv(r"../data/jena_climate_2009_2016.csv",
                index_col=['Date Time'])
df.index = pd.to_datetime(df.index, format='%d.%m.%Y %H:%M:%S')

# Select certain columns
#df = df[['p (mbar)', 'T (degC)', 'rh (%)', 'VPmax (mbar)', 'sh (g/kg)', 'H2OC (mmol/mol)', 'rho (g/m**3)', 'wv (m/s)']]

# Save data frame
df_orig = df.copy()
# For these tests we will just use a small slice of the dataset
df = df.iloc[:3000,:]

In [3]:
# Instantiate CoilNormalizer
from coilspy.normalization import CoilNormalizer

coilnormer = CoilNormalizer()

coilnormed_df = coilnormer.normalize(df)

## Segmentation of Timeseries
For training a neural coil, we will want to segment our timeseries into tensors of size $[batch\;size, segment\;length, number\;of\; features]$. This will allow us to train a neural network (in this case just a single *neural coil* layer) to produce predictions for all features at the next step in time, for all steps in our sequence. 

In creating our segments, we can choose whether or not the segments should overlap. The following helper function can be used to segment a timeseries to a given length and overlap.

In [4]:
def segment_time_series(series, length, overlap=None):
    """
    Segment a time series into overlapping or non-overlapping segments.

    Parameters:
    - series (np.ndarray): Input time series array of shape [total_length, n_features].
    - length (int): Length of each segment.
    - overlap (int or None): Step between the starts of consecutive segments.
      A step smaller than `length` gives overlapping segments. Default is None,
      which uses a step equal to `length` (non-overlapping segments).

    Returns:
    - np.ndarray: Array of segments of shape [n_segments, length, n_features].
    """
    total_length, n_features = series.shape
    segments = []
    
    if overlap is None:
        step = length  # Non-overlapping segments with step size equal to segment length
    else:
        step = overlap  # Otherwise, use the given value as the step between segment starts

    for start in range(0, total_length - length + 1, step):
        segment = series[start:start + length]
        segments.append(segment)
    
    return np.stack(segments)
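
As a quick sanity check of the windowing logic, the following sketch applies the same idea to a toy array. The compact `segment` helper is a hypothetical stand-in that mirrors `segment_time_series` so the snippet runs on its own, and the toy shapes are chosen only for illustration.

```python
import numpy as np

def segment(series, length, step=None):
    # Hypothetical stand-in mirroring segment_time_series above:
    # `step` is the stride between segment starts (defaults to `length`).
    step = length if step is None else step
    starts = range(0, series.shape[0] - length + 1, step)
    return np.stack([series[s:s + length] for s in starts])

# Toy series: 10 timesteps, 2 features
toy = np.arange(20, dtype=float).reshape(10, 2)

non_overlapping = segment(toy, length=4)       # starts at rows 0 and 4
overlapping = segment(toy, length=4, step=2)   # starts at rows 0, 2, 4 and 6

print(non_overlapping.shape)  # (2, 4, 2)
print(overlapping.shape)      # (4, 4, 2)
```

Note that trailing rows which do not fill a complete segment (rows 8 and 9 in the non-overlapping case) are dropped.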

We can apply this function to our timeseries to get our segments. First, however, we should create a copy of the timeseries that is offset by one step, to use as the outputs we wish to produce.

In [5]:
# Generate and segment the time series
series = coilnormed_df.values
length = 36

series_x = series[:-1,]
series_y = series[1:,]
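
The one-step offset can be checked on a toy array: row `t` of the targets should equal row `t + 1` of the inputs. The array below is made up for illustration.

```python
import numpy as np

toy = np.arange(12, dtype=float).reshape(6, 2)  # 6 timesteps, 2 features

toy_x = toy[:-1]  # inputs: timesteps 0..4
toy_y = toy[1:]   # targets: timesteps 1..5

# Each target row is the input row one step later
assert np.array_equal(toy_y[:-1], toy_x[1:])
print(toy_x.shape, toy_y.shape)  # (5, 2) (5, 2)
```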

When we segment this timeseries, we will have a collection of input sequences and output sequences. 

In [6]:
segments_x = segment_time_series(series_x, length, overlap=None)
segments_y = segment_time_series(series_y, length, overlap=None)

# Convert to tensors
segments_tensor_x = torch.tensor(segments_x, dtype=torch.float)
segments_tensor_y = torch.tensor(segments_y, dtype=torch.float)

# Move inputs and targets to the GPU
X = segments_tensor_x.to("cuda")
# The targets are already offset by one timestep (see series_y above)
Y =  segments_tensor_y.to("cuda")

# Get number of features and batch size
n_features = X.shape[2]
batch_size = X.shape[0]

print("Available Batches: ", batch_size)

Available Batches:  83


## Neural Coil Formulation

Now we must consider the formulation of the neural coil layer itself. We find that we can use a very similar formulation as what is used for generation of complex-valued flows of probability, with some key differences:
1. All parameters will be real-valued
2. We can relax our earlier constraint that $\sum\limits_{i=1}^{N}\mathbf{M}_{ijk} = 1$, as this would restrict our training. We can instead simply renormalize our selected transition tensor $\mathbf{A}$ such that $\sum\limits_{i=1}^{N}\mathbf{A}_{ij}(t + \Delta t) = 1$
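
To illustrate the relaxed constraint, the sketch below renormalizes an arbitrary real-valued matrix with a softmax over the destination index $i$, so that each column sums to 1. NumPy is used here for brevity (the layer in this section uses `torch.softmax`), the matrix values are random, and the scale of 7 simply mirrors the layer's default `norm_temp`.

```python
import numpy as np

def column_softmax(a, scale=1.0):
    # Softmax over axis 0 (the destination index i), so every column j sums to 1
    z = a * scale
    z = z - z.max(axis=0, keepdims=True)  # subtract the column max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
A_raw = rng.normal(size=(4, 4))        # unconstrained, real-valued entries
A = column_softmax(A_raw, scale=7.0)   # renormalized transition matrix

state = np.array([0.1, 0.2, 0.3, 0.4])  # a probability vector
new_state = A @ state

print(A.sum(axis=0))    # each column sums to 1 (up to floating point)
print(new_state.sum())  # total probability stays at 1 (up to floating point)
```

Because every column of the renormalized matrix sums to 1, $\sum_i \sum_j \mathbf{A}_{ij} s_j = \sum_j s_j$, so the step conserves total probability even though the underlying parameters are unconstrained.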

Aside from these relaxations, the process of selecting transition tensors and stepping through the coil will be identical to those used before.

For the initialization, the only trainable parameters are those in the interaction tensor. It should be noted that this lack of hidden layers and parameters gives neural coil layers improved interpretability: every parameter can be read as a statement such as "the probability of transitioning from B to A given a transition from C to D at the last timestep". For simplicity, we initialize the transition tensor from all zeros, which the softmax turns into a uniform tensor.
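
As a rough illustration of the layer's size, the interaction tensor in the implementation that follows has shape `[n, n, n, n + 1]` for `n` features, so the parameter count grows roughly as $n^4$. The helper name and the feature counts below are chosen only for illustration.

```python
def interaction_tensor_size(n_features):
    # Shape used by the layer in this section: [n, n, n, n + 1]
    shape = (n_features, n_features, n_features, n_features + 1)
    count = 1
    for dim in shape:
        count *= dim
    return shape, count

for n in (4, 8, 16):
    shape, count = interaction_tensor_size(n)
    print(n, shape, count)
# 4 (4, 4, 4, 5) 320
# 8 (8, 8, 8, 9) 4608
# 16 (16, 16, 16, 17) 69632
```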

In [7]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralCoilLayer(nn.Module):
    def __init__(self, n_features, n_batch, sel_temp = 1e-3, norm_temp = 7, device = "cpu"):
        super(NeuralCoilLayer, self).__init__()
        self.interaction_tensors = nn.Parameter(torch.rand(n_features, n_features, n_features, n_features + 1))
        self.sel_temp = sel_temp
        self.norm_temp = norm_temp
        
        starting_tensor = torch.softmax(torch.zeros(n_batch, n_features, n_features), dim = 1)
        if device == "cuda":
            self.starting_transition_tensor = starting_tensor.to("cuda")
        else:
            self.starting_transition_tensor = starting_tensor
            
    def select_transition_tensor(self, state_tensor,transition_tensor, interaction_tensor, sel_temperature):
        # Combine state and transition tensors into a norm_subgroups tensor
        norm_subgroups = torch.cat((state_tensor.unsqueeze(-1), transition_tensor), dim=2)
        
        # Get candidate transition tensors:
        candidate_transition_tensors = (torch.mul(interaction_tensor, norm_subgroups.unsqueeze(1).unsqueeze(1))).sum(-2) # [batches, states, states, states + 1]
        
        # Determine the largest and smallest states
        high_magnitude = torch.softmax(((state_tensor) / sel_temperature), dim = 1)
        low_magnitude = torch.softmax(1 - ((state_tensor) / sel_temperature), dim = 1)
        
        # Find which transition corresponds to moving from the highest state to the lowest state, and focus on that:
        transition_focus_slices = (torch.mul(torch.mul(candidate_transition_tensors, high_magnitude.unsqueeze(2).unsqueeze(1)), low_magnitude.unsqueeze(2).unsqueeze(2)))
        
        # Determine the selection weights by which slice is the highest in the focus transition
        selection_weights = torch.softmax(transition_focus_slices.sum(2).sum(1) / sel_temperature, dim = 1)
        
        # Perform weighted averaging
        selected_transition_tensor = torch.mul(candidate_transition_tensors, selection_weights.unsqueeze(1).unsqueeze(1)).sum(-1)
        
        return selected_transition_tensor
        
    def step_coil(self, state_tensor, transition_tensor):
        
        selected_transition_tensor = self.select_transition_tensor(state_tensor, transition_tensor, self.interaction_tensors, sel_temperature= self.sel_temp)
        
        selected_transition_tensor = torch.softmax(selected_transition_tensor * self.norm_temp, dim =1)
        
        new_state_tensor = torch.mul(selected_transition_tensor, state_tensor.unsqueeze(1)).sum(dim = -1)
        

        return new_state_tensor, selected_transition_tensor


    def forward(self, x):
        batch, length, n_features = x.size()
        output = x.new_empty(batch, length, n_features)

        # Initialize the previous transition tensor for the first step:
        # the uniform tensor (softmax of zeros) built in __init__ for a fixed batch size
        transition_tensor = self.starting_transition_tensor

        for l in range(length):
            state_tensor = x[:, l, :]
            
            # Compute output for this step
            output[:, l, :], transition_tensor = self.step_coil(state_tensor, transition_tensor)

        return output, transition_tensor
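
The `sel_temp` parameter in `select_transition_tensor` controls how sharply the softmax picks out the largest state. A minimal NumPy sketch (with arbitrary state values) suggests that a small temperature such as the default `1e-3` makes the softmax behave almost like a one-hot argmax, while a temperature of 1 leaves the weights spread out.

```python
import numpy as np

def softmax(x, temperature=1.0):
    z = x / temperature
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

state = np.array([0.10, 0.35, 0.30, 0.25])

soft = softmax(state, temperature=1.0)    # weights stay close to uniform
sharp = softmax(state, temperature=1e-3)  # almost all weight on the largest state

print(np.round(soft, 3))
print(np.round(sharp, 3))
print(int(sharp.argmax()))  # 1
```

Keeping the selection as a softmax rather than a hard argmax leaves it differentiable, which is presumably why the layer uses a temperature instead of indexing directly.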

In the ```forward``` pass of the neural coil layer, the transition tensor starts from its uniform initial value (the softmax of all zeros), and the initial state is taken from the first entry of the input sequence. This state and transition tensor are passed into the coil step, which outputs a new state and transition tensor. The new state is saved as the layer output, and the process is repeated using the next state in the input sequence and the previously generated transition tensor. This is repeated for the length of the sequence.
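
The contraction inside the coil step can also be written with `einsum`, which may make the index roles easier to follow. The NumPy sketch below uses random arrays with assumed toy sizes; the names `M` and `G` are illustrative stand-ins for the interaction tensor and for the concatenated state-and-transition tensor (`norm_subgroups` in the code above).

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n = 2, 3

M = rng.random((n, n, n, n + 1))        # stand-in for the interaction tensor
state = rng.random((batch, n))          # stand-in for the state tensor
transition = rng.random((batch, n, n))  # stand-in for the previous transition tensor

# Prepend the state as an extra column: shape [batch, n, n + 1]
G = np.concatenate([state[:, :, None], transition], axis=2)

# Broadcast-multiply and sum over the second-to-last axis, as in the layer
candidates_broadcast = (M * G[:, None, None]).sum(-2)

# Equivalent einsum: sum over k, keep the candidate index l
candidates_einsum = np.einsum("ijkl,bkl->bijl", M, G)

print(candidates_einsum.shape)  # (2, 3, 3, 4)
print(np.allclose(candidates_broadcast, candidates_einsum))  # True
```

Each of the `n + 1` slices along the last axis is one candidate transition tensor, and the selection weights then blend these candidates into a single tensor.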

This procedure is used during training. Otherwise, the ```step_coil``` function can be used autoregressively, passing the previously generated state and transition tensors back into the coil step to produce new ones, which allows probability flows to be generated indefinitely. While this can be done from the very first timestep, since the transition tensor has a defined starting value, we show how it can be beneficial to "wind up" coils: loading in known states for a number of steps to burn in the transition tensor before letting the coils generate autoregressively.
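
The wind-up pattern can be sketched independently of the trained layer. In the snippet below, `toy_step` is a hypothetical stand-in for `step_coil` (a fixed column-stochastic matrix with no learned parameters), and `wind_up_then_generate` is an illustrative helper name; only the control flow is meant to carry over.

```python
import numpy as np

def toy_step(state, transition):
    # Hypothetical stand-in for step_coil: apply the matrix and pass it on unchanged
    return transition @ state, transition

def wind_up_then_generate(step_fn, known_states, transition, prediction_step, max_steps):
    # Feed known states up to `prediction_step`, then let the model run on its own outputs
    state = known_states[0]
    states = [state]
    for t in range(1, max_steps):
        state, transition = step_fn(state, transition)
        if t <= prediction_step:
            state = known_states[t]  # re-ground with the known value
        states.append(state)
    return np.stack(states)

n = 3
transition = np.full((n, n), 1.0 / n)                # uniform, column-stochastic start
known = np.tile(np.array([0.2, 0.3, 0.5]), (6, 1))   # 6 known timesteps

trajectory = wind_up_then_generate(toy_step, known, transition,
                                   prediction_step=4, max_steps=10)
print(trajectory.shape)  # (10, 3)
```

The step function is still called during the wind-up, so the transition tensor is updated at every timestep even though the generated state is discarded in favor of the known one.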

## Neural Coil Training

Note that we have not performed any train / test splits for the data. That is because we will use a separate sequence of the original timeseries data for testing the trained coils autoregressively.



We will use a ```DataLoader``` to handle our training timeseries:

In [8]:
from torch.utils.data import DataLoader, TensorDataset
from torch import optim

# Data loading
dataset = TensorDataset(X, Y)
batch_size = 10
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, drop_last=True)
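
With 83 segments and a batch size of 10, `drop_last=True` leaves 8 full batches per epoch and skips the remaining 3 segments (a different 3 each epoch, since `shuffle=True`). This matters here because the layer allocates its starting transition tensor for a fixed batch size. The arithmetic, as a small sketch with an illustrative helper name:

```python
def batches_per_epoch(n_samples, batch_size, drop_last=True):
    # Returns (number of batches per epoch, number of samples skipped per epoch)
    full, remainder = divmod(n_samples, batch_size)
    if drop_last:
        return full, remainder
    return full + (1 if remainder else 0), 0

print(batches_per_epoch(83, 10, drop_last=True))   # (8, 3)
print(batches_per_epoch(83, 10, drop_last=False))  # (9, 0)
```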

We can initialize a model that consists of a single neural coil layer. Note that neural coil layers can stack together and with other neural network layers, which will be demonstrated separately. 

In [9]:
# Single Coil
model = NeuralCoilLayer(
    n_features = n_features,
    n_batch = batch_size,
    device="cuda"
).to("cuda")

We will use the following training procedure, letting the optimizer run for a very long time to determine the best training result. While this would typically be expected to produce an overfit model, it will be interesting to see how closely coils can match system dynamics.

In [10]:
epochs = 3000

# Hyperparameters
lr = 5e-3  # Initial learning rate
weight_decay = 0.1

# Optimizer
optimizer = optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)


In [11]:
# Loss function (the AdamW optimizer was created in the previous cell)
criterion = nn.MSELoss()

# Training loop
best_loss = float('inf')
best_model_state = None

for epoch in range(epochs):
    model.train()
    for batch_X, batch_Y in dataloader:
        optimizer.zero_grad()
        # Forward pass
        outputs, _ = model(batch_X)
        loss = criterion(outputs, batch_Y)

        # Backward and optimize
        loss.backward()
        
        optimizer.step()
    
        
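    # Save the best model (judged by the loss of the last batch in the epoch).
    # state_dict() returns references to the live tensors, so clone them
    # to keep a snapshot that later optimizer steps cannot overwrite.
    if loss.item() < best_loss:
        best_loss = loss.item()
        best_model_state = {k: v.detach().clone() for k, v in model.state_dict().items()}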

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item()}')

Epoch [10/3000], Loss: 3.5206750908400863e-05
Epoch [20/3000], Loss: 2.2361318769981153e-05
Epoch [30/3000], Loss: 2.6876285119215026e-05
Epoch [40/3000], Loss: 1.971589881577529e-05
Epoch [50/3000], Loss: 1.8713511963142082e-05
Epoch [60/3000], Loss: 2.4211911295424215e-05
Epoch [70/3000], Loss: 1.4536228263750672e-05
Epoch [80/3000], Loss: 1.8514861949370243e-05
Epoch [90/3000], Loss: 2.765654062386602e-05
Epoch [100/3000], Loss: 1.8028624253929593e-05
Epoch [110/3000], Loss: 2.015966128965374e-05
Epoch [120/3000], Loss: 1.4516161172650754e-05
Epoch [130/3000], Loss: 1.714676727715414e-05
Epoch [140/3000], Loss: 2.1920122890151106e-05
Epoch [150/3000], Loss: 2.189523002016358e-05
Epoch [160/3000], Loss: 2.1727935745730065e-05
Epoch [170/3000], Loss: 1.9041561245103367e-05
Epoch [180/3000], Loss: 1.857761526480317e-05
Epoch [190/3000], Loss: 1.5232404621201567e-05
Epoch [200/3000], Loss: 1.7159480194095522e-05
Epoch [210/3000], Loss: 1.787843939382583e-05
Epoch [220/3000], Loss: 1.486

We will load the best training state to make all these epochs worth it:

In [12]:
# Load the best model state after training
model.load_state_dict(best_model_state)
print("Best model loaded with loss:", best_loss)

Best model loaded with loss: 3.031301730516134e-06
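
One detail worth checking when tracking a best state: `model.state_dict()` returns references to the live parameter tensors, so a snapshot only stays frozen if it is copied. The sketch below uses a small `nn.Linear` as a stand-in model to show the difference.

```python
import copy
import torch
import torch.nn as nn

toy_model = nn.Linear(2, 1)

aliased = toy_model.state_dict()                  # references the live parameters
snapshot = copy.deepcopy(toy_model.state_dict())  # independent copy

# Simulate further training by changing the weights in place
with torch.no_grad():
    toy_model.weight.add_(1.0)

print(torch.equal(aliased["weight"], toy_model.weight))   # True: follows the update
print(torch.equal(snapshot["weight"], toy_model.weight))  # False: kept the old values
```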


## Coil Training Input to Output Review

With the coil finally trained, we can investigate how well it has accomplished the original task of producing an output for each step of an input sequence. Recall that for this check we are not generating autoregressively; instead, for each step in the sequence, the coil is re-grounded with a known input.

In [21]:
import plotly.graph_objects as go

def plot_model_output_vs_target(model_outputs, targets, batch_index=0, feature_index=0):
    # Extract the specified feature for the given batch from both the model outputs and targets
    model_output_series = model_outputs[batch_index, :, feature_index].detach().numpy()
    target_series = targets[batch_index, :, feature_index].numpy()
    
    # Create a range for the x-axis (timesteps)
    timesteps = list(range(model_output_series.shape[0]))
    
    # Create traces
    model_trace = go.Scatter(x=timesteps, y=model_output_series, mode='lines', name='Model Output')
    target_trace = go.Scatter(x=timesteps, y=target_series, mode='lines', name='Target')
    
    # Create the figure and add traces
    fig = go.Figure()
    fig.add_trace(model_trace)
    fig.add_trace(target_trace)
    
    # Add title and labels
    fig.update_layout(title=f'Model Output vs Target for Feature {feature_index}, Batch {batch_index}',
                      xaxis_title='Timestep',
                      yaxis_title='Value')
    
    # Show the figure
    fig.show()

# `outputs` and `batch_Y` are the model outputs and targets from the last training batch
# Adjust batch_index and feature_index as needed
plot_model_output_vs_target(outputs.to("cpu"), batch_Y.to("cpu"), batch_index=4, feature_index=2)

Unsurprisingly, when only tasked with predicting the next timestep given the previous timestep, the coil does quite well. We expect to see the discrepancies in the beginning of the sequence, as this is the burn-in of the transition tensor. This is helpful as it gives us a sense of how long we may need to wind up our coil before the transition tensor is properly initialized for autoregressive generation. 
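
One way to estimate the burn-in length is to average the squared error over the batch and feature axes, leaving one value per timestep. The sketch below uses synthetic arrays in place of the real `outputs` and `batch_Y`, and the threshold is an arbitrary assumption.

```python
import numpy as np

def per_step_mse(outputs, targets):
    # Mean squared error per timestep, averaged over batch (axis 0) and features (axis 2)
    return ((outputs - targets) ** 2).mean(axis=(0, 2))

# Synthetic example: 10 sequences, 36 timesteps, 5 features
targets = np.zeros((10, 36, 5))
outputs = targets.copy()
outputs[:, :3, :] += 0.1   # pretend the first 3 steps are off by 0.1

mse = per_step_mse(outputs, targets)
threshold = 1e-3                            # arbitrary cutoff for "settled"
burn_in = int(np.argmax(mse < threshold))   # first timestep under the threshold

print(mse[:4])   # first three entries are about 0.01, the fourth is 0
print(burn_in)   # 3
```

Note that `np.argmax` returns 0 when no timestep falls under the threshold, so a real check would want to handle that case explicitly.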

## Unenforced Autoregressive Coil Generation

To give a sense of the dynamics coils are capable of producing, we can simply allow a coil to autoregressively generate from some initial state without performing any reconfirmation of state.

In [14]:
states = []
# Select the batch we want to make predictions for
batch = 8


# Grab starting state tensor
state_tensor = batch_X[:,0,:]

# How many steps do we want to run the coil overall?
max_steps = 100

batch_size = batch_X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    states.append(state_tensor[batch,:])

# Move state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]
# Transpose the data to get one trace per state
traces = list(zip(*data))


# Create the figure and add traces
fig = go.Figure()

# Plotting
for i, trace in enumerate(traces):
    model_trace = go.Scatter(y=trace, mode='lines', name=f'State {i}')
    fig.add_trace(model_trace)

# Add title and labels
fig.update_layout(title=f'Self-Perpetuating Coil Dynamics',
                    xaxis_title='Timestep',
                    yaxis_title='Value')

# Show the figure
fig.show()

Here we can see the complexity of the dynamics that a single neural coil layer can produce, which also gives a sense of why these layers can be difficult to train.

## Enforced Autoregressive Coil Generation

As previously mentioned, we can also perform coil generation where we enforce known state values for a certain number of timesteps before letting the coil generate autoregressively. This helps initialize the transition tensor.

In this example, we will enforce the coil with known values for the first 24 timesteps before allowing it to autoregressively generate probability flows. While our time series sections only have a sequence length of 36, we can have coils generate indefinitely. 

In [15]:
states = []
# Select the batch we want to make predictions for
batch = 1
# After what step do we want the coil to start making its own predictions?
prediction_step = 24

# What features do we want to plot?
feature_sel = [0, 15]

# Grab starting state tensor
state_tensor = batch_X[:,0,:]

# How many steps do we want to run the coil overall?
max_steps = 70

states.append(state_tensor[batch,feature_sel])
batch_size = batch_X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(1,max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    if step_state <= prediction_step:
        state_tensor = batch_X[:,step_state,:]
    states.append(state_tensor[batch,feature_sel])
    
# Move state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]
traces = list(zip(*data))

# Create the figure and add traces
fig = go.Figure()

# Plotting
for i, trace in enumerate(traces):
    model_trace = go.Scatter(y=trace, mode='lines', name=f'Modelled Feature {feature_sel[i]}')
    fig.add_trace(model_trace)
    
    
# Get observed data to CPU and plot
data = batch_X[batch,:,feature_sel].to('cpu')
# Transpose the data to get one trace per selected feature
traces = list(zip(*data))

# Plotting
for i, trace in enumerate(traces):
    obs_trace = go.Scatter(y=trace, mode='lines', name=f'Observed Feature {feature_sel[i]}')
    fig.add_trace(obs_trace)
    
# Add Line highlighting when the coil becomes responsible for its own predictions
fig.add_vline(x=prediction_step, line_width=3, line_dash="dash", line_color="grey", annotation_text='Point of Prediction')    

# Add title and labels
fig.update_layout(title=f'Coil Training Modelled vs Observed Comparison',
                    xaxis_title='Timestep',
                    yaxis_title='Value')

# Show the figure
fig.show()

Here we can see, at the Point of Prediction where the coil is no longer provided the support of known values, it begins freely generating dynamics. 

## Coil "Memory" Demonstration

It should be emphasized that providing known values up to a point of prediction during coil generation does not mean the coil only makes predictions based on the last known value. Due to the transition tensor, coils have a form of "memory": different inputs provided during the "wind up" stage result in different dynamics. We can demonstrate this by performing two enforced generations, where in one of them the inputs for a single timestep during the wind up period are arbitrarily changed.

In [16]:
import matplotlib.pyplot as plt

states = []
# Select the batch we want to make predictions for
batch = 8

# After what step do we want the coil to start making its own predictions?
prediction_step = 24

# What features do we want to plot?
feature_sel = [0,15]

# Grab starting state tensor
state_tensor = batch_X[:,0,:]

# How many steps do we want to run the coil overall?
max_steps = 70

# Which timestep do we want to force modify?
step_modify = 20

states.append(state_tensor[batch,feature_sel])
batch_size = batch_X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(1,max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    if step_state <= prediction_step:
        state_tensor = batch_X[:,step_state,:]
    states.append(state_tensor[batch,feature_sel])
    

# Move modelled state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]
# Transpose the data to get one trace per selected feature
traces = list(zip(*data))

# Create the figure and add traces
fig = go.Figure()

# Plotting
for i, trace in enumerate(traces):
    model_trace = go.Scatter(y=trace, mode='lines', name=f'Modelled Feature {feature_sel[i]}')
    fig.add_trace(model_trace)
    
# Try with changing one X in memory
states = []
state_tensor = batch_X[:,0,:]
states.append(state_tensor[batch,feature_sel])
batch_size = batch_X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(1,max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    if step_state <= prediction_step:
        state_tensor = batch_X[:,step_state,:]
        
    # change a single state in memory:
    if step_state == step_modify:
        state_tensor = batch_X[:,step_state,:] * 3
    states.append(state_tensor[batch,feature_sel])


# Move altered state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]
# Transpose the data to get one trace per selected feature
traces = list(zip(*data))

# Plotting
for i, trace in enumerate(traces):
    obs_trace = go.Scatter(y=trace, mode='lines', name=f'Altered Feature {feature_sel[i]}')
    fig.add_trace(obs_trace)
    
# Add Line highlighting when the coil becomes responsible for its own predictions
fig.add_vline(x=prediction_step, line_width=3, line_dash="dash", line_color="grey", annotation_text='Point of Prediction')    

# Add title and labels
fig.update_layout(title=f'Coil Memory Demonstration',
                    xaxis_title='Timestep',
                    yaxis_title='Value')


fig.add_vrect(x0=step_modify-1, x1=step_modify+1, 
              annotation_text="altered time", annotation_position="top left",
              fillcolor="grey", opacity=0.25, line_width=0)

# Show the figure
fig.show()

We can see that, although most of the inputs during the wind up are the same, a single timestep with different values leads to very different coil dynamics.
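
To put a number on this divergence, the two trajectories could be compared step by step. The helper below is an illustrative sketch; the arrays are synthetic stand-ins for the baseline and altered `states` lists, and the tolerance is an assumption.

```python
import numpy as np

def first_divergence(baseline, altered, tol=1e-6):
    # Largest absolute difference across features at each timestep
    gap = np.abs(baseline - altered).max(axis=1)
    diverged = np.flatnonzero(gap > tol)
    first = int(diverged[0]) if diverged.size else None
    return first, gap

# Synthetic trajectories: 30 timesteps, 2 features, identical until step 20
baseline = np.zeros((30, 2))
altered = baseline.copy()
altered[20:, 0] = 0.05

first, gap = first_divergence(baseline, altered)
print(first)      # 20
print(gap.max())  # 0.05
```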

## Coil Prediction Testing

To see how coils perform on unseen data, we can load in a wider range of the original timeseries, generating the sequence chunks for the coil inputs. 

In [17]:
# Take a wider slice of the original timeseries; rows beyond the first 3000 were not used in training
df_test = df_orig.iloc[:7000,:]

# Take our previously fit coil normalizer and apply it to the new data
coilnormed_df_test = coilnormer.normalize(df_test, fit_change=False)

# Generate and segment the time series
series = coilnormed_df_test.values
length = 36

series_x = series[:-1,]
segments_x = segment_time_series(series_x, length,overlap=None)

# Convert to tensors
segments_tensor_x = torch.tensor(segments_x, dtype=torch.float)

# Prepare inputs
X = segments_tensor_x.to("cuda")

# Get number of features and batch size
n_features = X.shape[2]
batch_size = X.shape[0]

print("Available Batches: ", batch_size)

Available Batches:  194
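
Because this test slice starts at row 0, its early segments overlap the 3000 rows used for training. Assuming non-overlapping segments of length 36 and that segment `b` starts at row `b * 36` (which ignores any row offset the normalizer may introduce), the first segment lying entirely outside the training rows can be computed as follows; the helper name is illustrative.

```python
import math

def first_unseen_segment(n_train_rows, segment_length):
    # Segment b covers rows [b * L, (b + 1) * L); it is unseen once b * L >= n_train_rows
    return math.ceil(n_train_rows / segment_length)

first = first_unseen_segment(3000, 36)
print(first)        # 84
print(first * 36)   # 3024, the first row of that segment
```

Under these assumptions, segment indices from 84 upward, such as the 102 used in the next cell, draw on rows the coil was not trained on.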


We can do a similar comparison of modelled and observed features as before. 

In [18]:
states = []
# Select the batch we want to make predictions for
batch = 102

# After what step do we want the coil to start making its own predictions?
prediction_step = 24

# What features do we want to plot?
feature_sel = [0,15]

# Grab starting state tensor
state_tensor = X[:,0,:]

# How many steps do we want to run the coil overall?
max_steps = 45

states.append(state_tensor[batch,feature_sel])
batch_size = X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(1,max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    if step_state <= prediction_step:
        state_tensor = X[:,step_state,:]
    states.append(state_tensor[batch,feature_sel])
    #print(sum(state_tensor[batch,:]))
    
# Move state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]
traces = list(zip(*data))

# Create the figure and add traces
fig = go.Figure()

# Plotting
for i, trace in enumerate(traces):
    model_trace = go.Scatter(y=trace, mode='lines', name=f'Modelled Feature {feature_sel[i]}')
    fig.add_trace(model_trace)
    
    
# Get observed data to CPU and plot
data = X[batch,:,feature_sel].to('cpu')
# Transpose the data to get one trace per selected feature
traces = list(zip(*data))

# Plotting
for i, trace in enumerate(traces):
    obs_trace = go.Scatter(y=trace, mode='lines', name=f'Observed Feature {feature_sel[i]}')
    fig.add_trace(obs_trace)
    
# Add Line highlighting when the coil becomes responsible for its own predictions
fig.add_vline(x=prediction_step, line_width=3, line_dash="dash", line_color="grey", annotation_text='Point of Prediction')    

# Add title and labels
fig.update_layout(title=f'Coil Testing Modelled vs Observed Comparison',
                    xaxis_title='Timestep',
                    yaxis_title='Value')

# Show the figure
fig.show()

## Denormalization Checks

To translate from flows of probability back into our original timeseries values, we can use the denormalization function that is part of our coil normalizer class. 

In [20]:
# We should be able to take any slice of the coilnormed timeseries and reproduce the original values after denormalization
start_index = 36
length = 36
end_index = start_index + length
df_test_slice = df_test.iloc[start_index:end_index,:]
coilnormed_df_test_slice = coilnormer.normalize(df_test_slice, fit_change=False)
initial_value_slice = df_test.iloc[start_index,:]

# Generate and segment the time series
series = coilnormed_df_test_slice.values

series_x = series[:-1,]

# Convert to tensors
segments_tensor_x = torch.tensor(series_x, dtype=torch.float).unsqueeze(0)

# Prepare inputs and targets
X = segments_tensor_x.to("cuda")


states = []
# Select the batch we want to make predictions for
batch = 0

# After what step do we want the coil to start making its own predictions?
prediction_step = 24

# Grab starting state tensor
state_tensor = X[:,0,:]

# How many steps do we want to run the coil overall?
max_steps = length - 1

states.append(state_tensor[batch,:])
batch_size = X.shape[0]
transition_tensor = torch.softmax(torch.zeros(batch_size, n_features, n_features), dim = 1).to("cuda")
for step_state in range(1,max_steps):
    state_tensor, transition_tensor = model.step_coil(state_tensor, transition_tensor)
    if step_state <= prediction_step:
        state_tensor = X[:,step_state,:]
    states.append(state_tensor[batch,:])
    
# Move state dynamics to CPU
data = [row.to('cpu').detach().numpy() for row in states]

for index, dat_in in enumerate(data):
    coilnormed_df_test_slice.iloc[index,:] = dat_in
    
denormed_slice = coilnormer.denormalize(coilnormed_df_test_slice, initial_value_slice)


# Dataframes to compare: denormalized model output vs. original values
df1 = denormed_slice
df2 = df_test_slice

# Plotting
fig = go.Figure()

# Add traces for the first dataframe
for column in df1.columns:
    fig.add_trace(go.Scatter(x=df2.index, y=df1[column], mode='lines', name=f'Model: {column}'))

# Add traces for the second dataframe
for column in df2.columns:
    fig.add_trace(go.Scatter(x=df2.index, y=df2[column], mode='lines', name=f'Orig: {column}'))

# Update layout
fig.update_layout(title='Model Denormalized Testing',
                  xaxis_title='Date',
                  yaxis_title='Value',
                  legend_title='Legend',
                  hovermode='x unified')

# Show the plot
fig.show()
