In [None]:
import pandas as pd 
import numpy as np
import scipy.stats as ss
import time

import matplotlib as mpl
import matplotlib.pyplot as plt
from IPython.display import display
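# set some styling defaults for matplotlib
# ("seaborn-talk" was renamed "seaborn-v0_8-talk" in matplotlib 3.6)
style = "seaborn-v0_8-talk" if "seaborn-v0_8-talk" in plt.style.available else "seaborn-talk"
plt.style.use(style)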
mpl.rcParams["figure.dpi"] = 90  # change this to set apparent figure size
mpl.rcParams["figure.figsize"] = (7, 3)
mpl.rcParams["figure.frameon"] = False

# set decimal precision to 3 dec. places
%precision 3

In [None]:
#%pip install pfilter
#%pip install ipycanvas
# uncomment and run the above if you don't have pfilter and/or ipycanvas installed

In [None]:
from pfilter import ParticleFilter, gaussian_noise, squared_error, independent_sample
from scipy.stats import norm, gamma, uniform 
from particle_tools import FilterCanvas


# Example 1: probabilistic filters

## Outcomes
You will understand:
* How probabilistic filtering works
* How to implement basic probabilistic filters
* How filtering can be used to extract hidden states
* How to integrate a particle filter into an interactive system

    


## Goal
* Estimate a hidden state in a continuous process from a stream of observations.
* Be able to deal with missing or noisy observations.
* Be able to project forward into the future as needed.
* Be able to quantify uncertainty and the expected value of possible states.

## Task
We'll build a simple "swipe" gesture recogniser. This is intended to recognise a swipe-left or swipe-right movement. Obviously, this is a relatively simple problem to solve, but we'll see how to properly represent uncertainty when approaching it from a Bayesian perspective.

<img src="imgs/swipe.png" width="50%">

## Process
* We build a **filter**, which in this context means a process that combines observations that occur sequentially.
* Note: it's not really the same as a filter in the sense of signal processing, though the concept of processing sequential signals is the same.
    
* The data generating process is assumed to have some temporal coherence; predictions of the future depend on past states.
* Typically, we make a *Markov assumption*: that the current unobserved state encodes everything we know about the next state.
    
### Predictor-corrector

The concept is simple: we first build a model that just predicts what we think might be going on -- a pure simulator. Then we can take observations and use them to "filter out" predictions that are unrealistic given those observations. This is formulated as a Bayesian belief update.

* Prior at time t + evidence at time t -> posterior at time t

$$P(X_t | Y_t) \propto P(Y_t | X_t) P(X_t)$$

* The prior at time t is the posterior from time t-1, pushed forward through the dynamics:

$$P(X_t) = \int P(X_t | X_{t-1}) P(X_{t-1} | Y_{t-1}) \, dX_{t-1}$$

* (we don't typically care about normalising this distribution, we just want to track *relative* likelihoods of hypotheses)
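
As a rough illustration of the belief update, here is a minimal sketch with three discrete hypotheses. The state labels and all of the numbers are made up for this example; the point is only that the posterior is proportional to likelihood times prior, and that normalising is an optional last step.

```python
import numpy as np

# Three hypotheses about the hidden state (illustrative labels only)
states = ["left", "none", "right"]

# Prior belief before seeing the observation
prior = np.array([0.25, 0.5, 0.25])

# Likelihood of the observation under each hypothesis, P(y | x)
# (made-up numbers: the observation looks most like a right swipe)
likelihood = np.array([0.1, 0.3, 0.8])

# Unnormalised posterior: prior times likelihood
unnormalised = prior * likelihood

# Normalise only if we need true probabilities
posterior = unnormalised / unnormalised.sum()

for s, p in zip(states, posterior):
    print(f"{s}: {p:.3f}")
```

The ranking of hypotheses is already visible in `unnormalised`, which is why tracking relative likelihoods is usually enough.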

<img src="imgs/stochastic.png" width="50%">

## Package: `pfilter`

We'll use the `pfilter` package for this example. This implements a simple interface to a particle filter. *Caveat: I wrote `pfilter`, so my definition of simple may not be everyone's!*

### Particle filter
A particle filter just represents the current posterior distribution as a collection of samples (definite estimates) and updates them given the evidence observed and a forward model of what evidence *would be expected* for any given sample. It operates in discrete time, feeding the posterior from one step in as the prior for the next.

We can include dynamics, which specify how the unknown state is believed to be changing over time. Usually, these are *stochastic* dynamics -- they add some diffusion or noise to accommodate the inexact dynamics we implement.

Likelihood is implemented by *weighting* samples according to how similar their hypothesised observations are to the true observation. This gives a pseudo-likelihood that is easy to apply.

This is a sequential Monte Carlo method: we push a block of samples forward through some expected dynamics, compare them to an observation, and reproduce those samples that best approximate the current observation: predict-correct.
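
To make the predict-correct loop concrete, here is a minimal NumPy sketch of one particle filter step on a toy 1D tracking problem. This is not how `pfilter` is implemented internally; `pf_step`, the toy dynamics and all of the constants are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, observation, dynamics, observe, sigma=0.1, diffusion=0.02):
    """One predict-correct step of a simple particle filter (illustrative only)."""
    # Predict: push every particle through the dynamics, then add diffusion noise
    particles = dynamics(particles) + rng.normal(0.0, diffusion, particles.shape)
    # Correct: weight each particle by how close its hypothesised observation
    # is to the real one (Gaussian kernel on the squared error)
    error = observe(particles) - observation
    weights = np.exp(-0.5 * (error / sigma) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

# Toy problem: the hidden state is a position that drifts right by 0.05 per step
particles = rng.uniform(-1.0, 1.0, size=(500, 1))
true_position = 0.2
for _ in range(20):
    true_position += 0.05
    noisy_observation = true_position + rng.normal(0.0, 0.05)
    particles = pf_step(
        particles,
        noisy_observation,
        dynamics=lambda x: x + 0.05,
        observe=lambda x: x[:, 0],
    )

estimate = particles.mean()
print(f"true position {true_position:.2f}, estimate {estimate:.2f}")
```

Resampling every step, as done here, is the simplest choice; practical filters often resample less aggressively to avoid collapsing onto a few particles.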



## Setup

### Latent variables

We need to define the variables we want to infer; we'll form distributions over these. We'll stick to a very simple model, which has three variables:

* `direction`: - or + for a left or right swipe
* `phase`: the proportion through the swipe, as a number from 0->1
* `phase_rate`: the current rate of movement through the swipe, as a number from 0->1

From this, we'll compute a distribution over whether or not a gesture is completed, which we can use to trigger actions. We'll do this by computing how much evidence there is that the user is at the end of a swipe, with a consistent direction.

### Initial priors for time t=0
We need to set prior distributions on these variables at time t=0. We'll assume we are equally likely to be going left or right, and that swipes start at the beginning (phase close to 0) and have some variable possible rates.

* direction: uniform -1, 1
* phase: uniform 0, 0.2 (could start just a bit through a swipe)
* phase_rate: normal 1, 0.25 (going at a steady rate, with some variation)

In [None]:
columns = ["direction", "phase", "phase_rate"]

# some basic guesses for how these variables might start out
# note: direction is actually -1, 0 or 1, but I've used a continuous
# distribution just to make things a bit easier
prior_fn = independent_sample([uniform(loc=-1, scale=2).rvs, 
                                uniform(loc=0.0, scale=0.2).rvs, 
                                norm(loc=1,scale=0.25).rvs])
                               
prior_fn(10)


### Dynamic model
Our filter is applied to a *process* that unfolds over time. The previous timestep posterior is used as the prior for the next timestep. Because we can *predict* what will happen, we can push this posterior through some dynamics to make a better prior for the next timestep.

Now we need a model of what would happen in the future *if we had no information about what the user was doing* -- that is, a pure forward simulator. We'll assume that a gesture that is active will continue in the direction it was going, at the same phase rate.

In [None]:
def dynamics(x):
    # columns of x: direction, phase, phase_rate
    dt = 0.05
    # phase advances by phase_rate * dt; tanh keeps it bounded below 1
    x[:, 1] = np.tanh(x[:, 1] + x[:, 2] * dt)
    return x

In [None]:
dynamics(prior_fn(10)) # moves forward one timestep

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=100, dynamics_fn=dynamics)        
c = FilterCanvas(pf)
c.draw()
c.start()

### Diffusion
The dynamics we implemented are *deterministic*; they assume that, for example, an initial hypothesis that a gesture is swiping left at 10px per second will continue to behave that way in the future. Given that we know our estimation is uncertain, this is a very strong assumption. A way to ameliorate this is to introduce some **diffusion**: stochastic (random) dynamics, that cause the distribution over latent states to "spread out" over time. This is trivial to implement: we just add a bit of noise to each of our particles at each time step:

In [None]:
def noise(x):
    return gaussian_noise(x, sigmas=[0.00, 0.02, 0.05])

If we look at the simulation now, we can see that the particles "wander" rather than following straight line paths.
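
To see the "spreading out" numerically, here is a small sketch of pure diffusion with no drift. It assumes the noise is independent Gaussian noise added at every step, and uses 0.02 as the per-step standard deviation for phase; the particle count and number of steps are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# All particles start with identical phase, so the initial spread is zero
phase = np.zeros(1000)
sigma = 0.02  # per-step diffusion for phase (assumed value)

spread = []
for step in range(50):
    phase = phase + rng.normal(0.0, sigma, phase.shape)  # pure diffusion, no drift
    spread.append(phase.std())

# For a random walk the spread grows like sigma * sqrt(number of steps)
print(f"after 1 step:   {spread[0]:.3f}")
print(f"after 50 steps: {spread[-1]:.3f} (theory: {sigma * np.sqrt(50):.3f})")
```

Without observations to pull the particles back, uncertainty only grows; the corrector step is what keeps it in check.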

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=100, dynamics_fn=dynamics, noise_fn=noise)        
c = FilterCanvas(pf)
c.draw()
c.start()

### Observation and likelihood

So far, we have a forward model that can sample: a data generating process that behaves as we expect the hidden state to behave. We need to implement the *corrector* step, by introducing likelihood. 

To do this, we introduce observations, and specify a function that tells us: given a hidden (latent) state, predict a *definite* hypothesised observation. We can then approximate the (relative) likelihood by comparing each hypothesised observation against the real one, and weighting the result according to how close they are.

We'll assume all we can see is the mouse x position at a given instant.
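
As a sketch of what "weighting by closeness" can look like, the snippet below applies a Gaussian kernel to the squared difference between hypothesised and real observations. `gaussian_weight` is a hypothetical helper written for this example, and the exact form used by `pfilter`'s `squared_error` may differ.

```python
import numpy as np

def gaussian_weight(hypothesised, observed, sigma=0.1):
    """Relative likelihood of each hypothesised observation (unnormalised)."""
    squared_distance = (np.asarray(hypothesised) - observed) ** 2
    return np.exp(-squared_distance / (2.0 * sigma ** 2))

# Hypothesised mouse x positions from five particles, and one real reading
hypothesised_x = np.array([-0.8, 0.0, 0.45, 0.5, 0.9])
observed_x = 0.5

weights = gaussian_weight(hypothesised_x, observed_x)
weights /= weights.sum()  # normalise so the weights sum to 1
print(np.round(weights, 3))
```

A smaller `sigma` makes the filter trust the sensor more and discard distant hypotheses faster; a larger one tolerates noisier input.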

In [None]:
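def observe(x):
    # return what we *expect* to observe, given the hypotheses we have
    # treat |direction| < 0.5 as "no swipe" (0); otherwise take the sign
    direction = np.sign(np.where(np.abs(x[:,0])<0.5, 0, x[:,0]))
    # "no swipe" particles predict a random x position; the rest predict signed phase
    return np.where(direction==0, np.random.uniform(-1,1,direction.shape), x[:, 1] * direction)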

def weight(x, y):
    return squared_error(x, y, sigma=0.1)

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=200, dynamics_fn=dynamics, noise_fn=noise, observe_fn=observe,
                   weight_fn=weight, resample_proportion=0.01)  
pf.update([0.5]) # pass an observation

In [None]:
c = FilterCanvas(pf, observing=True)
c.draw()
c.start()

## Applying

### Estimating intention and triggering actions

Now we can track *state*; what we want to do is form a distribution over *intention*. We'll apply simple post-processing to
the particles to estimate how likely it is we have completed a left or right swipe. We'll just count the number of particles in different states, and use the proportions as our probabilities.

This will give us a distribution over `swipe` with three outcomes:

* no swipe
* left swipe
* right swipe


In [None]:
def estimate_swipe(particles):
    # find all particles that have finished, and separate them into left and right
    terminated_particles = particles[(particles[:,1]>0.5)]
    left_particles = terminated_particles[terminated_particles[:,0]<0]
    right_particles = terminated_particles[terminated_particles[:,0]>0]
    n_left = len(left_particles)
    n_right = len(right_particles)
    n_total = len(particles)
    n_none = n_total - (n_left+n_right)    
    
    swipe = {"left":n_left/n_total,
             "right":n_right/n_total,
             "no":n_none/n_total}
    return swipe
             
    
    

In [None]:
estimate_swipe(pf.particles)

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=100, dynamics_fn=dynamics, noise_fn=noise, observe_fn=observe,
                   weight_fn=weight, resample_proportion=0.01)      
c = FilterCanvas(pf, observing=True, estimator=estimate_swipe)
c.draw()
c.start()

### Actuation
Bayesian interfaces estimate *beliefs* about state as probability distributions. While this is elegant, it does pose problems when we want to actually do something. "Doing something" means making an irreversible change of state. This is the **barrier of action** -- the barrier that separates movement in belief space from discrete state changes.

We need to decide on a decision rule to actuate based on beliefs.

The *rational* way to do this is to compute the expected value of each outcome (using some utility function to ascribe a numerical goodness to each action) and choose the action with the highest expected value. In this case, we don't have different values for our actions, so we'll just use the probabilities directly. We'll threshold on the probability of each intention state, and if there is enough evidence, we'll trigger the action.
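
If the actions did have different values, the expected-value rule could be sketched as follows. The utility numbers here are entirely hypothetical (a wrong trigger is assumed to cost more than a missed one), and the belief dictionary just mimics the shape of a distribution over `left`, `right` and `no`.

```python
import numpy as np

# Belief over intentions (made-up values for illustration)
belief = {"left": 0.10, "right": 0.70, "no": 0.20}

# Hypothetical utilities: rows are actions, columns are true intentions.
# Correct triggers are rewarded, wrong triggers are penalised heavily,
# and doing nothing when a swipe was intended costs a little.
intentions = ["left", "right", "no"]
actions = ["left", "right", "no"]
utility = np.array([
    [ 1.0, -2.0, -2.0],   # trigger "left"
    [-2.0,  1.0, -2.0],   # trigger "right"
    [-0.5, -0.5,  0.0],   # do nothing
])

p = np.array([belief[i] for i in intentions])
expected_utility = utility @ p           # one expected value per action
best_action = actions[int(np.argmax(expected_utility))]
print(dict(zip(actions, np.round(expected_utility, 2))), "->", best_action)
```

With asymmetric costs like these, the rule behaves much like a probability threshold, but the threshold falls out of the utilities instead of being chosen by hand.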

In [None]:
def trigger_action(swipe):
    # we trigger on crossing a probability threshold
    # this is not the only choice; we could e.g. use entropy 
    # to detect a certain state, and select the max. prob. result
    p_threshold = 0.75
    if swipe["left"]>p_threshold:
        return "left"
    if swipe["right"]>p_threshold:
        return "right"
    return "no"

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=200, dynamics_fn=dynamics, noise_fn=noise, observe_fn=observe,
                   weight_fn=weight, resample_proportion=0.05)      
c = FilterCanvas(pf, observing=True, estimator=estimate_swipe, trigger=trigger_action)
c.draw()
c.start()

### Prediction
One of the major advantages of the particle filtering approach is that we have a generative model which we can *run without data* to make predictions about the future evolution of the system. This can be used to implement **predictive interfaces** which have appropriate uncertainty.

All we do is call the particle filter update but decline to provide any observations; the particles are then driven by their internal dynamics alone (like disengaging the clutch in a car and coasting along).
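
The effect of coasting can be sketched in plain NumPy, independently of `pfilter`. The hypothetical `predict_ahead` helper below applies a tanh phase update and Gaussian diffusion repeatedly, with no weighting or resampling; the starting particles are a made-up stand-in for a posterior, and the `dt` and `sigmas` defaults are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

def predict_ahead(particles, steps, dt=0.05, sigmas=(0.0, 0.02, 0.05)):
    """Run dynamics plus diffusion for `steps` timesteps with no correction."""
    x = particles.copy()
    for _ in range(steps):
        x[:, 1] = np.tanh(x[:, 1] + x[:, 2] * dt)   # deterministic phase update
        x = x + rng.normal(0.0, sigmas, x.shape)     # per-column diffusion
    return x

# Stand-in for a current posterior: columns are direction, phase, phase_rate
current = np.column_stack([
    np.ones(500),
    rng.normal(0.3, 0.02, 500),
    rng.normal(1.0, 0.1, 500),
])

future = predict_ahead(current, steps=10)
print(f"phase now:      {current[:, 1].mean():.2f} +/- {current[:, 1].std():.2f}")
print(f"phase in 10 dt: {future[:, 1].mean():.2f} +/- {future[:, 1].std():.2f}")
```

The predicted phase moves forward and its spread widens, which is the "appropriate uncertainty" a predictive interface would need to display.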

In [None]:
# update, but don't provide observations
pf.update()
pf.update()
pf.update()

print(pf.particles)

In [None]:
pf = ParticleFilter(prior_fn=prior_fn, n_particles=50, dynamics_fn=dynamics, noise_fn=noise, observe_fn=observe,
                   weight_fn=weight, resample_proportion=0.01)      
c = FilterCanvas(pf, observing=True, predict=True)
c.draw()
c.start()

## Reflection
### What did we gain?
* We were able to track gestures even with not-very-reliable mouse input.
* We maintained uncertainty, and could choose to actuate gesture actions when we were quantifiably sure the intention was there.
* We could display uncertainty in estimation to the user.
* We could also predict the future: this could be useful for display or to reduce apparent latency.

### What was difficult?
* The filter can be tricky to tune, particularly in weighting the hypothesised sensor values against the real ones.
* A sample-based approach is easy to understand, but isn't the most efficient, and the quality of interaction depends on the number of samples.
* Any time we are dealing with uncertainty in the loop we have to find ways of reflecting that to the user -- doing so effectively can be challenging.

### What else could we do?
* We could obviously extend this to much more complex gestures, or other input devices.
* We could use the uncertainty more intelligently in the interaction.
* We could take better advantage of the fact that we don't have to be locked to real time.