# Signal functions in depth

* Geometry example, raw acc/gyro data from Wave and compute pose, 3D cube? Gravity? Linear acceleration?
* Filtering / spectrogram example with mic data / wave forms
* ML inference example
* Windowed signal functions: delay? FFT?
* Creating custom signal functions, serialization quirks

`SignalFunction`s are functions that take in one or more signals and return a new signal. They can be one of two types:

* one-to-one functions take in one sample and return one sample. They can be stateful and depend on previous samples, but the number of samples out should always be the same as the number of samples in.
* many-to-one (windowed) functions operate on a sliding window of samples. The windows can overlap, but the function only returns one value per window. They can still receive a single sample at a time, but will only return a value once the window has been filled. As well as defining how to process a window, they need to define the window length and the window overlap.
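
The difference between the two types can be sketched in plain Python. This is not the `genki_signals` API: `RunningDifference` and `WindowedMean` are hypothetical stand-ins, and the hop between windows is assumed to be `window_size - window_overlap`.

```python
from collections import deque


class RunningDifference:
    """One-to-one and stateful: exactly one output per input sample."""

    def __init__(self):
        self.previous = 0.0

    def __call__(self, sample):
        out = sample - self.previous
        self.previous = sample
        return out


class WindowedMean:
    """Many-to-one: one output per filled window (hypothetical stand-in)."""

    def __init__(self, window_size, window_overlap):
        self.window_size = window_size
        self.hop = window_size - window_overlap  # assumed meaning of overlap
        self.buffer = deque(maxlen=window_size)
        self.since_last_output = 0

    def __call__(self, sample):
        self.buffer.append(sample)
        self.since_last_output += 1
        window_full = len(self.buffer) == self.window_size
        if window_full and self.since_last_output >= self.hop:
            self.since_last_output = 0
            return sum(self.buffer) / self.window_size
        return None  # the next window is not ready yet


samples = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
diff = RunningDifference()
mean = WindowedMean(window_size=4, window_overlap=2)

diffs = [diff(s) for s in samples]
means = [m for m in (mean(s) for s in samples) if m is not None]
```

Six samples in give six differences out, but only two window means: one for samples 0 to 3 and one for samples 2 to 5.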

In [11]:
from genki_signals.system import System
from genki_signals.sources import MicSource, CameraSource, Sampler
import genki_signals.functions as f
from genki_signals.frontends import *

A good example of a windowed signal function is the short-time Fourier transform (STFT), which computes the discrete Fourier transform on a sliding window:

In [4]:
mic = MicSource()
stft = f.FourierTransform("audio", "fourier", window_size=1024, window_overlap=512)
fourier_system = System(mic, [stft])

fourier_system.start()
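
For intuition, a short-time Fourier transform can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the implementation behind `f.FourierTransform`: the helper `stft_sketch`, the 16384 Hz sample rate and the test tone are made up for the example, no window function is applied, only completely filled windows are counted, and the hop between windows is taken to be `window_size - window_overlap`.

```python
import numpy as np


def stft_sketch(x, window_size=1024, window_overlap=512):
    """Real FFT of each full sliding window; returns an array of (bins, frames)."""
    hop = window_size - window_overlap
    n_frames = 1 + (len(x) - window_size) // hop
    frames = np.stack([x[i * hop : i * hop + window_size] for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1).T


sample_rate = 16384  # Hz, chosen so that each frequency bin is 16 Hz wide
t = np.arange(sample_rate) / sample_rate  # one second of sample times
tone = np.sin(2 * np.pi * 1024 * t)  # 1024 Hz sine wave

spectrum = stft_sketch(tone)
peak_bin = int(np.abs(spectrum[:, 0]).argmax())
peak_hz = peak_bin * sample_rate / 1024  # bin index times bin width
```

Each column of `spectrum` is the spectrum of one window. It has `window_size // 2 + 1` rows because the input is real-valued, and the strongest bin of the first window lands on the frequency of the tone.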

One way to view data from a `System` is to connect it to a buffer with `System.register_data_feed`. This can be useful for checking whether a `SignalFunction` is behaving properly.

In [5]:
from genki_signals.buffers import DataBuffer

buffer = DataBuffer()

fourier_system.register_data_feed(id(buffer), lambda d: buffer.extend(d))

Now all data that goes through our `System` will be sent to the buffer. Below we can view our buffer.

In [6]:
buffer

DataBuffer(max_size=None, data=timestamp: (49,)
audio: (50176,)
fourier: (513, 98))

In [7]:
fourier_system.stop()

Note that in this case, the time series called `'fourier'` is shorter than `'audio'`: it is downsampled. The `DataBuffer` will accept data like this, but with great power comes great responsibility. This can be a source of bugs, since one tends to assume that time series grouped together are sampled at the same rate.
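
The shapes in the buffer output above can be checked with some arithmetic. The sketch assumes that the transform keeps only the non-negative frequencies of a real-valued signal and emits one frame every `window_size - window_overlap` samples. Neither is confirmed here, but both are consistent with the output shown.

```python
window_size = 1024
window_overlap = 512
n_audio_samples = 50176  # length of 'audio' in the buffer above

# A real-valued FFT of length N has N // 2 + 1 non-negative frequency bins.
n_bins = window_size // 2 + 1

# Assumed: one new frame per hop of (window_size - window_overlap) samples.
hop = window_size - window_overlap
n_frames = n_audio_samples // hop

print(n_bins, n_frames)  # 513 98
```

So `'fourier'` holds one 513-bin spectrum for every 512 audio samples, which is why its last axis is 512 times shorter than `'audio'`.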

Other examples of `SignalFunction`s are the `Inference` and `WindowedInference` functions, which take in an ONNX file and run real-time inference with the neural network.

Let's load a model which predicts the relative depth of an image. For this example to work you need to have [PyTorch](https://pytorch.org/) and [timm](https://pypi.org/project/timm/) installed.

(Note: You can also create your own `Inference` function, but more on custom `SignalFunction`s later.)

We start by loading the model and creating an ONNX file:

In [9]:
import torch

# load the model from torch.hub
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")

input_resolution = (256, 256)
dummy_input = torch.randn((1, 3, *input_resolution))
model_path = "./midas.onnx"

# export the model to onnx
torch.onnx.export(
    midas,
    dummy_input,
    model_path,
    input_names=["input"],
    output_names=["output"],
    # dynamic_axes=({"input": [0]})
)

Using cache found in /Users/egill/.cache/torch/hub/intel-isl_MiDaS_master
Loading weights:  None
Using cache found in /Users/egill/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master
verbose: False, log level: Level.ERROR

To connect the model to our camera we need to create a `CameraSource` and a `Sampler` to sample it, and connect them to a `System` that also runs the inference:

In [12]:
camera = CameraSource(resolution=input_resolution)
camera_sampler = Sampler({"model_input": camera}, 10)
model_inference = f.Inference("model_input", "model_output", model_path, stateful=False)
inference_system = System(camera_sampler, [model_inference], update_rate=50)
inference_system.start()

The `Inference` function takes in an input name, output name, model path, and the boolean parameter `stateful`. When `stateful` is `False`, the model works on one input sample at a time independently. 

With `stateful=True`, the model is assumed to also take in a state vector as input, and output a new state vector along with the output (i.e. it is a Recurrent Neural Network), allowing it to operate with some historical context.
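
The difference between the two modes can be sketched with a toy model in plain Python. The functions `stateless_model` and `stateful_model` are hypothetical stand-ins for ONNX models, and the loop is only an assumption about how a state vector is typically threaded through a recurrent model, not the code inside `Inference`.

```python
def stateless_model(x):
    """Output depends on the current sample only."""
    return 2.0 * x


def stateful_model(x, state):
    """Toy recurrent model: an exponential moving average.

    Returns (output, new_state), mirroring the input/output pairs
    described for stateful=True.
    """
    new_state = 0.5 * state + 0.5 * x
    return new_state, new_state


samples = [1.0, 1.0, 1.0, 0.0]

# Stateless: every sample is processed independently.
stateless_out = [stateless_model(x) for x in samples]

# Stateful: the state returned by one call is fed into the next call.
state = 0.0  # initial state
stateful_out = []
for x in samples:
    y, state = stateful_model(x, state)
    stateful_out.append(y)
```

The first three inputs are identical, so the stateless outputs are identical too, while the stateful outputs keep changing because each one depends on the history carried in `state`.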

Now the depth image is stored under the key `"model_output"`, and we can visualize it with our `WidgetFrontend`:

In [13]:
from genki_signals.frontends import Video, WidgetFrontend

video = Video("model_input")
depth = Video("model_output")

frontend = WidgetFrontend(inference_system, [video, depth])

frontend

VBox(children=(HBox(children=(Image(value=b'', format='jpeg'), Image(value=b'', format='jpeg'))), HBox()))

In [14]:
inference_system.stop()

Notes on using multiple signal functions:

If one `SignalFunction` depends on the output of another `SignalFunction`, it needs to come after the other in the list of functions when initializing the `System`.
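
The ordering rule can be illustrated with a small check in plain Python. The helper `check_order` is hypothetical, and the `(input_names, output_name)` pairs are a simplified stand-in for `SignalFunction` objects. Whether `System` performs a check like this itself is not stated here.

```python
def check_order(source_signals, functions):
    """Raise if a function reads a signal that is not available yet.

    `functions` is a list of (input_names, output_name) pairs.
    """
    available = set(source_signals)
    for inputs, output in functions:
        missing = [name for name in inputs if name not in available]
        if missing:
            raise ValueError(f"{output!r} needs {missing} before it is computed")
        available.add(output)
    return available


source = ["timestamp", "mouse_pos"]
good = [
    (["mouse_pos", "timestamp"], "mouse_vel"),
    (["mouse_vel", "timestamp"], "mouse_acc"),
]
bad = list(reversed(good))  # acceleration is listed before velocity

signals = check_order(source, good)  # passes

try:
    check_order(source, bad)
    order_error = None
except ValueError as err:
    order_error = str(err)
```

With the functions reversed, the acceleration step asks for `mouse_vel` before anything has produced it, which is exactly the situation the rule above rules out.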

An example of a sequence of `SignalFunction`s:

In [18]:
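# Import paths assumed to match the modules imported earlier in this notebook
from genki_signals.functions import Differentiate, Integrate
from genki_signals.sources import MouseSource

pos_to_vel = Differentiate("mouse_pos", "timestamp", "mouse_vel")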
vel_to_acc = Differentiate("mouse_vel", "timestamp", "mouse_acc")
acc_to_vel = Integrate("mouse_acc", "timestamp", "mouse_vel_2")
vel_to_pos = Integrate("mouse_vel_2", "timestamp", "mouse_pos_2", use_trapz=False)

mouse_source = MouseSource()
sampler = Sampler({"mouse_pos": mouse_source}, 100)
system = System(sampler, [pos_to_vel, vel_to_acc, acc_to_vel, vel_to_pos])

system.start()

In [19]:
from genki_signals.frontends import Line, WidgetFrontend

pos =  Line("timestamp", "mouse_pos")
pos2 = Line("timestamp", "mouse_pos_2")
vel = Line("timestamp", "mouse_vel")
vel2 = Line("timestamp", "mouse_vel_2")

frontend = WidgetFrontend(system, [pos, pos2, vel, vel2])
frontend

VBox(children=(HBox(children=(Figure(axes=[Axis(label='timestamp', scale=LinearScale()), Axis(label='mouse_pos…

In [20]:
system.stop()

Note that the "position" we end up with after differentiating twice and then integrating twice is a very poor approximation and drifts a lot. The reason is that any small errors in the twice-differentiated series (acceleration), due to numerical inaccuracies etc., get compounded when integrating. When integrating twice, the error grows as time squared.
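
The quadratic growth can be reproduced with a short numerical sketch. It assumes the simplest possible error, a constant bias in the acceleration, and integrates it twice with the rectangle rule. The bias value is hypothetical, and the helper `position_error` is not part of `genki_signals`.

```python
def position_error(acc_error, dt, n_steps):
    """Integrate a constant acceleration error twice (rectangle rule)."""
    vel_error = 0.0
    pos_error = 0.0
    for _ in range(n_steps):
        vel_error += acc_error * dt  # first integration: grows linearly
        pos_error += vel_error * dt  # second integration: grows quadratically
    return pos_error


dt = 0.01  # 100 Hz sampling
acc_error = 1e-3  # hypothetical constant bias in the acceleration

err_5s = position_error(acc_error, dt, n_steps=500)
err_10s = position_error(acc_error, dt, n_steps=1000)
ratio = err_10s / err_5s  # close to 4: doubling the time quadruples the error
```

For a constant bias the position error is approximately `acc_error * t**2 / 2`, so doubling the elapsed time roughly quadruples the drift.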

Note that this method stores all intermediate results. To prevent that, we can wrap these functions with `Combine`:

In [21]:
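# Import path assumed to match the other signal functions
from genki_signals.functions import Combine

pos_to_acc_to_pos = Combine([pos_to_vel, vel_to_acc, acc_to_vel, vel_to_pos], name="mouse_pos_2")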

system = System(sampler, [pos_to_acc_to_pos])
system.start()

In [22]:
from genki_signals.frontends import Line, WidgetFrontend

pos =  Line("timestamp", "mouse_pos")
pos2 = Line("timestamp", "mouse_pos_2")

frontend = WidgetFrontend(system, [pos, pos2])
frontend

VBox(children=(HBox(children=(Figure(axes=[Axis(label='timestamp', scale=LinearScale()), Axis(label='mouse_pos…

In [23]:
system.stop()

Lastly let's create a custom `SignalFunction`. 

To do that we need to extend the `SignalFunction` base class, whose constructor takes three arguments: `input_signals`, `name` and `params`. This is done for serialization purposes, which we will cover further in the next example notebook.

Then we only need to implement a `__call__` method for our class.

This method takes in a batch of samples of our input_signals and returns a batch of outputs. 

The signals passed into the function are defined in the `__init__` method (order matters).



In [17]:
import numpy as np
from genki_signals.functions import SignalFunction
from scipy import ndimage, signal

class ConvGrayscaleImage(SignalFunction):
    def __init__(self, input_signal, name, kernel, inverse=False):
        super().__init__(input_signal, name=name, params={"kernel": kernel, "inverse": inverse})
        self.kernel = kernel / np.linalg.norm(kernel)
        self.inverse = inverse

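    def __call__(self, image):
        # Weighted average over the color channels (axis 0) gives a grayscale image
        grayscale = np.average(image, axis=0, weights=[0.2989, 0.5870, 0.1140])
        # Convolve each frame in the batch (last axis) separately
        convolved = [ndimage.convolve(grayscale[..., i], self.kernel) for i in range(grayscale.shape[-1])]
        result = np.stack(convolved, axis=-1)
        return 255 - result if self.inverse else result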


def gaussian_kernel(n, std):
    '''
    Generates an n x n matrix containing a centered 2D gaussian
    with standard deviation std. The result is not normalised.'''
    # signal.windows.gaussian replaces the signal.gaussian alias,
    # which is deprecated and removed in newer SciPy versions
    gaussian1D = signal.windows.gaussian(n, std)
    gaussian2D = np.outer(gaussian1D, gaussian1D)
    return gaussian2D

kernel = np.array([[-1, -1, -1, -1, -1],
                   [-1,  1,  2,  1, -1],
                   [-1,  2,  4,  2, -1],
                   [-1,  1,  2,  1, -1],
                   [-1, -1, -1, -1, -1]])

gaussian = gaussian_kernel(5, 1)

gaussian_edges = ndimage.convolve(gaussian, kernel)

conv = ConvGrayscaleImage("video", "video_edges", gaussian_edges, True)
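
The `__call__` method above implies a particular array layout: the color channels are on the first axis (three weights are applied along `axis=0`) and the batch of frames is on the last axis (the loop runs over `shape[-1]`). The NumPy sketch below checks that reading on a made-up batch. The frame size and batch length are arbitrary, and the layout is inferred from the code rather than from the library documentation.

```python
import numpy as np

# Hypothetical batch: 3 color channels, 4 x 5 pixels, 2 frames
batch = np.ones((3, 4, 5, 2))
batch[0] = 0.0  # zero out the first channel in every frame

weights = [0.2989, 0.5870, 0.1140]
# np.average divides by the sum of the weights, so the result stays in range
grayscale = np.average(batch, axis=0, weights=weights)

# One 2D image per frame, taken from the last axis
frames = [grayscale[..., i] for i in range(grayscale.shape[-1])]
```

The channel axis disappears in the weighted average, leaving an array of shape `(4, 5, 2)` from which each frame can be sliced out and convolved on its own.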

In [18]:
video = CameraSource(0)
sampler = Sampler({"video": video}, sample_rate=60)
system = System(sampler, [conv])

system.start()

In [19]:
video_normal = Video("video")
video_edges = Video("video_edges")

frontend = WidgetFrontend(system, [video_normal, video_edges])

frontend

VBox(children=(HBox(children=(Image(value=b'', format='jpeg'), Image(value=b'', format='jpeg'))), HBox()))

In [None]:
system.stop()