<img src="imgs/header.png" width="100%">


# Unit III: Probabilistic filtering
#### Inferring user intention in a noisy world
<b>[John Williamson](http://johnhw.com)</b> 

----

>All theorems are true. 
>All models are wrong. 
>And all data are inaccurate. 
>
>What are we to do? 
> We must be sure to remain uncertain.
>
> -- *[Leonard A. Smith, Proc. International School of Physics "Enrico Fermi", (1997)](http://www2.maths.ox.ac.uk/~lenny/fermi96_main_abs.html)* 


<img  src="Capture.PNG"/>
*A probabilistic filter-based gesture recogniser*

## Introduction 

-----------------

### What is probabilistic filtering?
One view on interaction is to see user intentions as **unknown values** which are partially observed through input sensors. The time series of inputs gives only a partial and noisy view of intention. 

<img src="imgs/brainspace.png" width="100%">

Probabilistic filters make it possible to track the evolution of some unknown variables [user intentions] given observed evidence [input], in a way that is **robust**. They work by inferring a **distribution** over the possible values of the hidden variables, and updating that distribution over time as new evidence arrives.

<img src="imgs/stochastic.png" width="50%">

Probabilistic filtering is an **inverse probability** approach, and it requires that we think of interaction from a unique perspective. We have to explicitly be able to write down:

* what we want to know (i.e. the **state space of intention**);
* how that will change over time (i.e. the **dynamics of intention**);
* what inputs we would expect to observe *if we knew what the user intention was* (i.e. a **function mapping intention -> expected inputs**).

Note that this last point is the **inverse** of the typical way of approaching this problem, where we try to find a mapping from sensors to intention, by design or by learning. 
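The three ingredients listed above can be sketched as a minimal bootstrap particle filter. This is an illustrative sketch only, not the model used elsewhere in this unit: the one-dimensional "intended position" state, the random-walk drift, the Gaussian sensor noise, and all function names and parameter values are assumptions made up for the example.

```python
import math
import random
import statistics

rng = random.Random(0)

# 1. State space of intention (assumed): one number, the intended position in [0, 1].
def sample_prior(n):
    return [rng.uniform(0.0, 1.0) for _ in range(n)]

# 2. Dynamics of intention (assumed): the intended position drifts slowly.
def dynamics(state, drift_sd=0.01):
    return state + rng.gauss(0.0, drift_sd)

# 3. Forward model, intention -> expected input (assumed): the sensor reports
#    the intended position directly, corrupted by Gaussian noise.
def expected_input(state):
    return state

def likelihood(observation, state, sensor_sd=0.1):
    err = observation - expected_input(state)
    return math.exp(-0.5 * (err / sensor_sd) ** 2)

def filter_step(particles, observation):
    # Predict: push every hypothesis through the dynamics.
    predicted = [dynamics(s) for s in particles]
    # Weight: score each hypothesis by how well it explains the input.
    weights = [likelihood(observation, s) for s in predicted]
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))

true_intention = 0.7
particles = sample_prior(1000)
for _ in range(50):
    observation = true_intention + rng.gauss(0.0, 0.1)  # simulated noisy sensor
    particles = filter_step(particles, observation)

estimate = statistics.mean(particles)   # point estimate of the intention
spread = statistics.pstdev(particles)   # how uncertain the filter still is
```

Note that the code never maps sensor values to intentions directly. It only states how intentions would produce sensor values, and the filter does the inversion.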


### Why is this computational HCI?
Probabilistic filtering means writing down an **executable, statistical model** of behavior. This has the two key elements of computational interaction:
* an explicit mathematical model of user-system behavior;
* an algorithmic element that, using this model, can apply computational power to improving interaction.

It satisfies the requirement that better interfaces can be achieved via 
* improved modeling;
* more powerful algorithms;  
* or increased computational power, 

rather than the workhorses of traditional HCI:
* more design ingenuity;
* and stronger evaluation.

### What are existing techniques?
* **Crafted mappings**, where we try to find (by hand) transforms from sensors to intentions that are simple or obvious. **Example:** a button, which has two physical states that map onto two intentional states via two electrical states. Pushed down = current flows = user intended to switch on. The mapping from electrical states to intentional states is **designed.**

* **Machine learning**, where we train a system to recognize a class of input patterns as being representative of an intended behavior. **Example:** Finger gesture recognizer; hundreds of examples of many users performing one of N multi-touch gestures are recorded. These are used to train a random forest to classify the intended gesture. The mapping from electrical states (capacitive sensors) to intentional states is **learned**.

### Benefits
* **Robustness to noise** Probabilistic filters (PFs) work well even with sensors that are noisy.
* **Robustness to poorly specified models** PFs can cope predictably even if our models are bad.
* **Robustness to intermittence** PFs can continue to interpolate sensibly when input cuts out.
* **Uncertainty estimates** PFs **know how certain they are** and this can be used in the interaction design.
* **Better feedback** PFs predict distributions over past, present and future, and they offer the opportunity to give rich feedback insights to users.
* **Flexible modeling** PFs can incorporate both fundamental modeling (e.g. physiological or cognitive models) and data-driven machine learning.
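The intermittence and uncertainty points above can be illustrated with a one-dimensional Kalman filter that keeps predicting when observations are missing. This is a hedged sketch: the random-walk dynamics, the noise variances, the made-up observation sequence and the name `kalman_step` are all assumptions chosen for illustration.

```python
def kalman_step(mean, var, observation, process_var=0.01, sensor_var=0.25):
    """One predict/update cycle of a 1D Kalman filter with random-walk dynamics."""
    # Predict: the mean stays put, the uncertainty grows.
    var = var + process_var
    if observation is None:
        # Input has cut out: return the prediction, with its larger variance.
        return mean, var
    # Update: blend prediction and observation according to their variances.
    gain = var / (var + sensor_var)
    mean = mean + gain * (observation - mean)
    var = (1.0 - gain) * var
    return mean, var

# Hypothetical sensor stream; None marks samples where the input dropped out.
observations = [1.1, 0.9, 1.0, None, None, None, None, 1.05]

mean, var = 0.0, 1.0  # vague initial belief
history = []
for z in observations:
    mean, var = kalman_step(mean, var, z)
    history.append((mean, var))
```

During the dropout the estimate is held at its last value while the variance climbs, and the variance falls again as soon as input returns. That variance is the quantity an interface could surface as feedback.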

### History
* 1960s Kalman filter (Swerling, Kalman, Bucy), Extended Kalman Filter (Schmidt)
* late 1960s-1990s Particle filter / sequential Monte Carlo
* 1992 Bootstrap filter (Gordon)
* 1995 Unscented Kalman Filter (Uhlmann)
* 1998 Condensation: particle filter for vision problems (Isard and Blake)

## Principles [28 mins]
### Overview diagram

### Use case
### Problem description
We are going to solve xxx
### Meat
Meat goes here!

### Key algorithm summary
| Algorithm       | Dynamics       | State distribution | Efficiency | Optimizable |
|-----------------|----------------|--------------------|------------|-----------|
| Particle        | Arbitrary      | Arbitrary          | Low        | No        |
| Kalman          | Linear         | Gaussian           | Very high  | Yes       |
| Extended Kalman | Locally linear | Gaussian           | High       | Yes       |
| Unscented Kalman| Arbitrary      | Gaussian           | High       | ?         |
| HMM             | Transitions    | Discrete           | High       | Yes       |

* Dynamics: permissible state transition functions.
* State distribution: distribution type for representing current state.
* Efficiency: computational efficiency.
* Optimizable: can the parameters of the filter be automatically optimized given training data?
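To make the discrete row of the table concrete, here is a minimal sketch of one filtering step for a hidden Markov model (the forward algorithm, normalised at each step). The two intention states, the transition and emission probabilities and the function name are hypothetical values picked for the example.

```python
def hmm_filter_step(belief, transition, emission, observation):
    """Update a discrete belief over states given one observed symbol."""
    n = len(belief)
    # Predict: propagate the belief through the transition matrix.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Update: weight each state by the probability of the observed symbol.
    unnormalised = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical intention states: 0 = idle, 1 = selecting.
transition = [[0.9, 0.1],   # transition[i][j] = P(next = j | current = i)
              [0.2, 0.8]]
# Hypothetical sensor symbols: 0 = no touch, 1 = touch.
emission = [[0.8, 0.2],     # emission[state][symbol] = P(symbol | state)
            [0.3, 0.7]]

belief = [0.5, 0.5]
for symbol in [1, 1, 1, 0, 1]:
    belief = hmm_filter_step(belief, transition, emission, symbol)
```

The structure is the same predict-then-update cycle as in the other filters in the table. Only the representation of the state distribution changes, here a list of probabilities that always sums to one.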


### Gallery
Research papers here (thumbnail + link), short description of why cool

### Pitfalls
Hands-on guru knowledge goes here.

## Outlook [5 mins]
### Scope and limitations
#### Limitations
* PFs can be computationally intensive to run. 
* The curse of dimensionality can make PFs, for all their attractive simplicity, work poorly in practice as the state space expands.
* Particle filters are simple and elegant, but inferentially weak.
* Kalman filters are rigid and restrictive, but very inferentially efficient.
* Hybrid approaches (Ensemble Kalman filter, Unscented Kalman Filter, hybrid particle/Kalman filters) can trade these off, but aren't off-the-shelf solutions.
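The curse-of-dimensionality point can be seen in a small experiment that measures the effective sample size of a particle set as the state dimension grows. This is a rough sketch under assumed conditions: particles are drawn from a standard normal prior, a single observation sits at the origin with isotropic Gaussian sensor noise, and the function name and parameter values are invented for the illustration.

```python
import math
import random

def effective_sample_size(dim, n_particles=2000, sensor_sd=0.5, seed=0):
    """Effective sample size of importance weights for a dim-dimensional state."""
    rng = random.Random(seed)
    log_weights = []
    for _ in range(n_particles):
        # Squared distance of a prior sample from the observation at the origin.
        sq_dist = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        # Gaussian log-likelihood of the observation, up to a constant.
        log_weights.append(-0.5 * sq_dist / sensor_sd ** 2)
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    # ESS = (sum of weights)^2 / (sum of squared weights).
    return sum(weights) ** 2 / sum(w * w for w in weights)

ess_1 = effective_sample_size(1)
ess_5 = effective_sample_size(5)
ess_20 = effective_sample_size(20)
```

With the same number of particles, the effective sample size drops sharply as the dimension rises, because fewer and fewer particles land close enough to the observation to carry meaningful weight.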


### Resources
How do I learn to do this in depth? Blogs, papers, software, videos, online courses, etc.
* Read the [Condensation paper](http://vision.stanford.edu/teaching/cs231b_spring1415/papers/isard-blake-98.pdf).
* Read [the Kalman filter in pictures](http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/).
* Watch [the particle filter without equations](https://www.youtube.com/watch?v=aUkBa1zMKv4).

### Future of probabilistic filtering

#### Learned models

Much use of probabilistic filters has depended on strong mathematical models of the fundamental process. For example, in rocket science, sophisticated physics models were used to specify the Kalman filters used for stable control. 

However, it is becoming increasingly possible to **infer** these models from observations. Techniques such as deep learning (for example variational autoencoders or generative adversarial networks) make it possible to learn very sophisticated *generative models* from observations of data. 

These models can be dropped into probabilistic filters to produce robust inferential engines for user interaction.