# Identification of Fading Memory Nonlinearities

We revisit the problem of system identification and look at an example in which a simple
convolutional neural network models an unknown system.

## What is System Identification?
The general setup for system identification is depicted in the figure below. We observe an input signal $x[n]$ and
corresponding measurements $y[n]$ of the system's output $f(x[n], x[n-1], \dots, y[n-1], y[n-2], \dots)$ that are corrupted by measurement noise $\nu[n]$.
The goal of system identification is now to find an approximate system $\hat{f}(\cdot)$ that "behaves as similarly as
possible to the true system $f(\cdot)$".

<img src="../../data/figures/system-identification.png" alt="System Identification Schematic" width="600" />

What does "as similar as possible" actually mean? We can characterize the approximation quality by considering
(functions of) the error signal $e[n] = \hat{y}[n] - y[n]$, where $\hat{y}[n]$ denotes the output of $\hat{f}(\cdot)$. Here we use a standard measure, the mean squared error
(MSE) $J = \frac{1}{N} \sum_{n=1}^N e[n]^2$, which arises naturally when we assume that the noise $\nu$
is normally distributed. See the class notes for more details.
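
The link between Gaussian noise and the MSE can be sketched in a few lines. The symbol $\sigma$ for the noise standard deviation is an assumption introduced here for illustration, as is the assumption that the noise samples are independent and zero-mean. If $\nu[n] \sim \mathcal{N}(0, \sigma^2)$ and the model output is $\hat{y}[n]$, the negative log-likelihood of the $N$ measurements is

$$
-\log p\left(y[1], \dots, y[N] \mid \hat{f}\right)
= \frac{N}{2} \log\left(2 \pi \sigma^2\right) + \frac{1}{2 \sigma^2} \sum_{n=1}^N \left(\hat{y}[n] - y[n]\right)^2
= \frac{N}{2} \log\left(2 \pi \sigma^2\right) + \frac{N}{2 \sigma^2} J .
$$

The first term does not depend on $\hat{f}(\cdot)$, so under these assumptions maximizing the likelihood is equivalent to minimizing $J$.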

**Importantly, be aware that we make several assumptions here:**
 - The system $f(\cdot)$ is deterministic.
 - The measurements $y[n]$ are corrupted by additive noise, whereas in general they could depend on the noise in an arbitrary way.
 - We assume that $x[n]$ and $y[n]$ are scalars.

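We will further assume that $f(\cdot)$ is a nonlinearity with fading memory. In the example below, the output depends nonlinearly on the current and the four preceding input samples, without any feedback of past outputs.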

So let's get started. First, we need some imports.

In [None]:
%reload_ext autoreload
%autoreload 2
import matplotlib.pyplot as plt
import numpy as np
from scipy.linalg import hankel

from src.models.simple_cnn import SimpleCNN
from src.utils.plotting import init_plot_style
%matplotlib

# initialize our global plot style
init_plot_style()

We create our true system that we want to identify later on and plot the system response for a random test signal.

In [None]:
def system(x):
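    """Implements a deterministic, fading memory nonlinearity.

    To generate an output of equal length as the input signal we apply zero-padding.

    Parameters
    ----------
    x : numpy.ndarray
        The system's input, a 1-D array of samples.

    Returns
    -------
    numpy.ndarray
        The corresponding system output, same length as ``x``.
    """
    mem_depth = 5
    weights = np.array([0.1, 0.5, 1.4, 5.3, 3.2])
    # prepend zeros so that the first output sample sees a full window
    padded_x = np.concatenate((np.zeros(mem_depth - 1), x))
    # column n of X holds the window x[n-4], ..., x[n] (row 0 is the oldest sample)
    X = hankel(padded_x[:mem_depth], padded_x[mem_depth-1:])
    # each row is overwritten in place by one nonlinear term
    X[0,:] = (X[0,:] - X[4,:])**2            # (x[n-4] - x[n])^2
    X[1,:] = np.exp(X[1,:] - X[2,:])         # exp(x[n-3] - x[n-2])
    X[2,:] = np.sqrt(np.abs(X[2,:]))         # sqrt(|x[n-2]|)
    X[3,:] = X[3,:]**3                       # x[n-1]^3
    X[4,:] = np.log(np.abs(X[4,:]) + 1e-3)   # log(|x[n]| + 1e-3)
    # the output is the weighted sum of the nonlinear terms
    return weights.dot(X)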


# draw a random test signal and compute the system output
n_samples = 50
test_signal = np.random.uniform(0, 1, n_samples)
# test_signal = np.cos(0.1*np.pi*np.arange(n_samples))
test_output = system(test_signal)

# plot the input/output behavior
plt.close('all')
plt.figure()
plt.plot(test_signal, label='Test Input, $x[n]$')
plt.plot(test_output, label='Test Output, $y[n]$')
plt.xlabel('Time Index, $n$')
plt.ylabel('$x[n]$, $y[n]$')
plt.legend()
plt.tight_layout()
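
The windowing step inside `system` is easy to misread, so here is a minimal sketch of what the `hankel` call produces, using a shorter memory depth of 3 and a toy input chosen only for illustration. Each column of the resulting matrix holds one zero-padded window of the input, with the oldest sample in the first row and the current sample in the last row.

```python
import numpy as np
from scipy.linalg import hankel

mem_depth = 3                      # shorter than in `system`, for readability
x = np.array([1.0, 2.0, 3.0, 4.0])  # toy input signal

# prepend mem_depth - 1 zeros, exactly as in `system`
padded_x = np.concatenate((np.zeros(mem_depth - 1), x))

# first column: padded_x[:mem_depth], last row: padded_x[mem_depth - 1:]
X = hankel(padded_x[:mem_depth], padded_x[mem_depth - 1:])

print(X)
# [[0. 0. 1. 2.]
#  [0. 1. 2. 3.]
#  [1. 2. 3. 4.]]
```

Column $n$ contains $x[n-2], x[n-1], x[n]$ from top to bottom, so any row-wise operation on `X` acts on a fixed delay of the input.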

We can now generate a training and a test set by drawing random input signals,
passing them through the system and corrupting the outputs with additive white Gaussian noise.

In [None]:
n_train = 500  # length of the training signal
n_test = 200  # length of the test signal
nu = 0.4  # standard deviation of the measurement noise
rng = np.random.default_rng(seed=0)

x_train = rng.uniform(size=n_train) * 2.0 - 1.0  # inputs drawn uniformly from [-1, 1)
noise = rng.normal(loc=0, scale=nu, size=n_train)  # generate noise
y_train = system(x_train) + noise  # simulate measurements

x_test = rng.uniform(size=n_test) * 2.0 - 1.0  # inputs drawn uniformly from [-1, 1)
noise = rng.normal(loc=0, scale=nu, size=n_test)  # generate noise
y_test = system(x_test) + noise  # simulate measurements
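
When judging the MSE values reported later, it helps to know what even a perfect model could achieve. Because the measurements contain zero-mean noise with standard deviation `nu`, the MSE of the noise-free output against the measurements is about `nu**2 = 0.16`. The sketch below illustrates this with a hypothetical stand-in for the noise-free output (a uniform random signal, not the output of `system`) and a deliberately large sample count.

```python
import numpy as np

def mse(y_hat, y):
    """Mean squared error between predictions and measurements."""
    return np.mean((np.asarray(y_hat) - np.asarray(y)) ** 2)

rng = np.random.default_rng(seed=0)
nu = 0.4       # noise standard deviation, as in the cell above
n = 100_000    # many samples, so that the estimate is stable

# hypothetical stand-in for the noise-free system output
clean = rng.uniform(-1.0, 1.0, size=n)
measured = clean + rng.normal(loc=0.0, scale=nu, size=n)

# MSE of a "perfect" prediction against the noisy measurements
noise_floor = mse(clean, measured)
print(noise_floor, nu**2)
```

With only a few hundred samples the estimate fluctuates more, but a test MSE far above this level suggests model error rather than noise.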

We are now ready to fit our simple neural network to the system.

In [None]:
# create and fit an instance of our simple CNN model given our training data
cnn_model = SimpleCNN(num_kernels=5, mem_depth=5)
train_loss_list, val_loss_list = cnn_model.fit(x_train, y_train, learning_rate=1e-1, max_epochs=500)

# plot the evolution of the training and validation MSE
plt.close('all')
plt.figure()
plt.plot(range(1, 1 + len(train_loss_list)), train_loss_list, label='Training')
plt.plot(range(1, 1 + len(val_loss_list)), val_loss_list, label='Validation')
plt.xlabel('Epoch')
plt.ylabel('MSE')
plt.legend()
plt.tight_layout()
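
To see why a convolutional layer suits a system with finite memory, the following NumPy sketch computes the forward pass of a hypothetical network with one hidden layer. It is not the `SimpleCNN` from `src.models.simple_cnn`, whose architecture is not shown here. The function name, the `tanh` activation, and the random weights are all assumptions made for illustration. Each kernel sees the same zero-padded window of `mem_depth` input samples that `system` uses.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_net_forward(x, W, b, v, c):
    """Forward pass of a hypothetical one-hidden-layer causal 1-D conv net."""
    mem_depth = W.shape[1]
    padded = np.concatenate((np.zeros(mem_depth - 1), x))
    # column n holds the window x[n - mem_depth + 1], ..., x[n]
    windows = sliding_window_view(padded, mem_depth).T   # (mem_depth, len(x))
    hidden = np.tanh(W @ windows + b[:, None])           # (num_kernels, len(x))
    return v @ hidden + c                                # (len(x),)

rng = np.random.default_rng(0)
num_kernels, mem_depth = 5, 5
W = rng.normal(size=(num_kernels, mem_depth))  # kernel weights (random, untrained)
b = rng.normal(size=num_kernels)               # kernel biases
v = rng.normal(size=num_kernels)               # readout weights
c = 0.0                                        # readout bias

x = rng.uniform(-1.0, 1.0, size=50)
y_hat = conv_net_forward(x, W, b, v, c)

# perturb a single input sample and find the affected output samples
x_perturbed = x.copy()
x_perturbed[30] += 1.0
y_perturbed = conv_net_forward(x_perturbed, W, b, v, c)
changed = np.flatnonzero(~np.isclose(y_hat, y_perturbed))
print(y_hat.shape, changed)
```

The perturbation at index 30 can only affect outputs 30 to 34, which is the finite-memory structure the identified model should share with the true system.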


Finally, we plot the predicted training and test outputs together with the corresponding noisy measurements.

In [None]:

prediction = cnn_model(x_train).detach().squeeze().numpy()
train_mse = np.mean((prediction - y_train)**2)
print(f'Training MSE is {train_mse}.')
plt.figure()
plt.plot(y_train, label='Measured Training Output, $y[n]$')
plt.plot(prediction, label=r'Predicted Output, $\hat{y}[n]$')
plt.xlabel('Time Index, $n$')
plt.ylabel(r'$y[n]$, $\hat{y}[n]$')
plt.legend()
plt.tight_layout()

prediction = cnn_model(x_test).detach().squeeze().numpy()
test_mse = np.mean((prediction - y_test)**2)
print(f'Test MSE is {test_mse}.')
plt.figure()
plt.plot(y_test, label='Measured Test Output, $y[n]$')
plt.plot(prediction, label=r'Predicted Output, $\hat{y}[n]$')
plt.xlabel('Time Index, $n$')
plt.ylabel(r'$y[n]$, $\hat{y}[n]$')
plt.legend()
plt.tight_layout()