# Chapter Summary

<img src="Figures4/S40-Warehouse_robots-01.jpg" alt="Splash image with cute robot with a stalked eye" width="60%" align=center style="vertical-align:middle;margin:10px 0px">

In the previous chapters, we have mainly considered discrete probability distributions. In Chapter 2, we modeled the world state using five
discrete categories of trash, and in Chapter 3 we modeled the world as five discrete rooms.
In this chapter, we began a serious study of continuous random variables, first to represent the robot's state, and then to represent
sensor readings.
Representing and reasoning about continuous probability distributions is significantly more difficult than working in discrete domains,
and we introduced a set of representational and inference tools that were able to scale to these more difficult problems.

# Models

In this chapter, we represented the state of the robot using continuous coordinates, $x \in \mathbb{R}^2$,
and we modeled uncertainty in state using a Gaussian distribution. 
Gaussian distributions have several nice properties. 
First, they are completely characterized by two parameters,
a mean vector, $\mu \in \mathbb{R}^n$, and a covariance matrix, $\Sigma \in \mathbb{R}^{n \times n}$.
In the one-dimensional case, these are scalars: the mean $\mu$ and the variance $\sigma^2$.
Perhaps more importantly, Gaussian distributions are often very good approximations
for the stochastic aspects of real-world systems.
For this reason, roboticists, and engineers in general, often assume that noise, disturbances,
and other sources of randomness can be accurately modeled using Gaussian distributions.

To model uncertainty in the motion model, we introduced the conditional Gaussian pdf.
In particular, we assumed that noise in the motion model could be modeled as additive Gaussian noise,
so that the state at time $k+1$ is defined as

$$x_{k+1} = x_k + u_k + w_k$$

in which $x_k$ is the state at time $k$, $w_k$ is the random disturbance,
and $u_k$ is the commanded motion at time $k$.
Under our Gaussian assumption that the noise is zero-mean, $w_k \sim \mathcal{N}(0,\Sigma)$,
the probability distribution for the state at time $k+1$ is given by
the conditional Gaussian pdf

$$
p(x_{k+1}|x_{k}, u_k) = \mathcal{N}(x_{k+1}; x_{k} + u_k, \Sigma)
$$

Thus, by assuming Gaussian noise in the motion model, we arrive at a kind of
"Gaussian in/Gaussian out" formulation, which can greatly simplify certain inference
problems (e.g., by using the Kalman filter).
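
As a minimal sketch of the "Gaussian in/Gaussian out" idea, the following NumPy snippet propagates a Gaussian belief through the additive-noise motion model and compares the result with a sample-based estimate. It assumes zero-mean noise that is independent of the current state; the function name `predict_gaussian` and all numerical values are illustrative choices, not taken from the chapter.

```python
import numpy as np

def predict_gaussian(mean, cov, u, noise_cov):
    """Propagate a Gaussian belief through x_{k+1} = x_k + u_k + w_k.

    Assumes w_k is zero-mean Gaussian with covariance noise_cov and is
    independent of x_k, so the means add and the covariances add.
    """
    return mean + u, cov + noise_cov

# Illustrative (hypothetical) values for a robot in the plane.
mean = np.array([1.0, 2.0])        # belief mean at time k
cov = np.diag([0.10, 0.10])        # belief covariance at time k
u = np.array([0.5, 0.0])           # commanded motion
noise_cov = np.diag([0.04, 0.09])  # motion noise covariance Sigma

new_mean, new_cov = predict_gaussian(mean, cov, u, noise_cov)

# Sanity check by sampling: draw states and disturbances, then apply the model.
rng = np.random.default_rng(0)
n = 5000
x_k = rng.multivariate_normal(mean, cov, size=n)
w_k = rng.multivariate_normal(np.zeros(2), noise_cov, size=n)
x_next = x_k + u + w_k

sample_mean = x_next.mean(axis=0)
sample_cov = np.cov(x_next, rowvar=False)
```

With enough samples, `sample_mean` and `sample_cov` should be close to `new_mean` and `new_cov`, which is the property that lets a Kalman filter track only a mean and a covariance.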

We can also use continuous conditional pdfs to model sensors.
For example, if the ideal (i.e., noise-free) sensor reading at time $k$
is defined by a function $h(x_k)$, we can model the sensor
output by the random variable

$$z_k = h(x_k) + w_k$$

in which $w_k$ is the noise term (unrelated to the noise in our motion model).
If $w_k$ is a zero-mean Gaussian random variable with variance $\sigma^2$,
the conditional distribution for the sensor measurement given the value of $h(x_k)$
is given by

$$
\begin{aligned}
p(z_k|x_k) &= \mathcal{N}(z_k;\mu=h(x_k), \sigma^2) \\
&= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\{-\frac{1}{2\sigma^2}(z_k-h(x_k))^2\}
\end{aligned}
$$

This approach generalizes nicely to the case of multi-dimensional sensors, as we saw
for the case of GPS-like sensors with Gaussian noise.
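
To make the scalar sensor model concrete, here is a small sketch that evaluates $p(z_k|x_k)$ for a given measurement. The measurement function `h` used here (distance from the origin to the robot) is a hypothetical stand-in chosen only for illustration, as are the numerical values.

```python
import numpy as np

def h(x):
    """Hypothetical ideal sensor: distance from the origin to the robot."""
    return np.linalg.norm(x)

def measurement_likelihood(z, x, sigma):
    """Evaluate p(z | x) = N(z; h(x), sigma^2) for a scalar measurement z."""
    err = z - h(x)
    return np.exp(-0.5 * err**2 / sigma**2) / np.sqrt(2.0 * np.pi * sigma**2)

x = np.array([3.0, 4.0])  # robot state; the ideal reading is h(x) = 5
sigma = 0.5               # assumed sensor noise standard deviation

p_at_ideal = measurement_likelihood(5.0, x, sigma)  # reading equals h(x)
p_off_by_1 = measurement_likelihood(6.0, x, sigma)  # reading is two sigma away
```

The likelihood peaks when the reading equals the ideal value $h(x_k)$ and falls off as the squared error grows relative to $\sigma^2$.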

There are, of course, many problems for which the uncertainty cannot be adequately modeled using
Gaussian distributions. For example, a Gaussian distribution, which has a single mode,
cannot adequately model a multimodal distribution.
A classic example of this situation is a robot in a long hallway that senses an office door;
the robot has a strong belief that it is in front of a door, but no way to know which
door.
This situation corresponds to a probability distribution with modes at locations that are in front of office doors.
In this chapter, we saw two ways to represent complex probability distributions: grids and samples.
In the case of grids, we merely decompose the state space into a grid, and assign to each grid
cell a value that corresponds to the probability that the state lies in that cell.
In the case of samples, the situation is less structured.
Instead of a uniform grid, sampling-based approaches represent the probability distribution by
a collection of weighted samples (also called *particles*). The value of the sample specifies a state,
and the weight approximates the probability mass associated with a local neighborhood of the sample.
While the size of a grid-based representation grows exponentially with the dimension of the state space,
sampling-based approaches can be much more efficient, but they require methods
that can generate good sets of samples.
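
The two representations can be sketched for the hallway example as follows. This is an illustrative toy, not the chapter's implementation: the hallway length, door positions, bump width, and the helper names `door_density` and `mass_near` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical 10 m hallway with doors at 2 m, 5 m, and 8 m.
doors = np.array([2.0, 5.0, 8.0])
sigma = 0.3  # assumed spread of the belief around each door

def door_density(x):
    """Unnormalized multimodal belief: one Gaussian bump per door."""
    diffs = (x[:, None] - doors[None, :]) / sigma
    return np.exp(-0.5 * diffs**2).sum(axis=1)

# Grid representation: one probability value per 10 cm cell.
edges = np.linspace(0.0, 10.0, 101)
centers = 0.5 * (edges[:-1] + edges[1:])
grid = door_density(centers)
grid /= grid.sum()

# Sample representation: particles drawn uniformly along the hallway,
# each weighted by the belief density at its location.
rng = np.random.default_rng(0)
particles = rng.uniform(0.0, 10.0, size=2000)
weights = door_density(particles)
weights /= weights.sum()

def mass_near(points, probs, center, radius=1.0):
    """Total probability assigned to points within radius of center."""
    return probs[np.abs(points - center) <= radius].sum()

grid_mass_middle = mass_near(centers, grid, 5.0)
particle_mass_middle = mass_near(particles, weights, 5.0)
```

Both representations assign roughly one third of the probability mass to the neighborhood of each door, something a single Gaussian cannot express. The grid answer is deterministic, while the particle answer varies with the random draw.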

Finally, in addition to introducing these methods for dealing with uncertainty,
we also developed a simple geometric model for wheeled robot locomotion,
specifically for the case of robots with omni wheels.
In particular, we developed the differential relationships between the rotation of the
robot's wheels, and the instantaneous velocity of the robot.
There is, of course, uncertainty associated with this motion model; however,
rather than explicitly consider this uncertainty, we merely bundled up all
of the uncertainties associated with robot motion into the noise parameter
$w_k$. This simplification leads to efficient computation, but it also has
a fairly firm theoretical basis in the Central Limit Theorem,
a well-known theorem from probability that essentially ensures
that the aggregate of many sources of uncertainty can be well-characterized
using a Gaussian distribution (there are, of course, many caveats and conditions, 
which we will not consider here).
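
The following sketch illustrates the Central Limit Theorem argument numerically. It assumes, purely for illustration, that the motion disturbance is the sum of 50 independent error sources, each uniformly distributed and therefore clearly non-Gaussian; the number of sources and their range are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_trials = 50, 20000
half_width = 0.01  # each error source is uniform on [-0.01, 0.01]

# Each row holds the individual errors for one trial; their sum plays the role of w_k.
errors = rng.uniform(-half_width, half_width, size=(n_trials, n_sources))
w = errors.sum(axis=1)

# A uniform variable on [a, b] has variance (b - a)^2 / 12, and variances of
# independent terms add, which gives the standard deviation of the sum.
predicted_std = np.sqrt(n_sources * (2.0 * half_width) ** 2 / 12.0)
empirical_std = w.std()

# For a Gaussian, about 68.3% of the mass lies within one standard deviation.
fraction_within_one_sigma = np.mean(np.abs(w) <= predicted_std)
```

Even though no individual source is Gaussian, the aggregate disturbance should have a one-sigma coverage close to the Gaussian value, which is the behavior the additive Gaussian noise model relies on.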



# Reasoning

- Gaussian dynamic Bayes nets
- Localization with Bayes filters
- Markov localization (grid)
- Particle filter (sampling)
- Kalman filter
- Planning via value function
- Learning as parameter estimation (MLE, regression, EM)

# Background and 

Here are a few.
[Probabilistic Robotics](https://mitpress.mit.edu/9780262201629/probabilistic-robotics/) 
