# Course 3: Localization
## Part 1: Markov Localization in Theory
#### By Jonathan L. Moran (jonathan.moran107@gmail.com)
From the Self-Driving Car Engineer Nanodegree programme offered at Udacity.

## Objectives

* Apply the [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem) to vehicle localisation;
* Practise computing posterior probabilities for several observations;
* Use the [Markov Assumption](https://en.wikipedia.org/wiki/Markov_chain) and [law of total probability](https://en.wikipedia.org/wiki/Law_of_total_probability) to initialise a [Bayes' filter](https://en.wikipedia.org/wiki/Recursive_Bayesian_estimation) with meaningful estimates.

## 1. Introduction

In [1]:
### Importing required modules

In [2]:
from decimal import Decimal
import numpy as np
import pandas as pd
import os

In [3]:
!python --version

In [4]:
### Setting environment variables and parameters

In [5]:
ENV_COLAB = False               # True if running in Google Colab instance

In [6]:
# Root directory
DIR_BASE = '' if not ENV_COLAB else '/content/3-Localization'
DIR_BASE = os.path.abspath(DIR_BASE)
DIR_BASE

'/Users/jonathanmoran/Development/ND0013-Self-Driving-Car-Engineer/3-Localization/3-1-Markov-Localization'

In this part of the Markov Localization course we set up the foundations necessary to implement the [Bayes' filter](https://en.wikipedia.org/wiki/Recursive_Bayesian_estimation) for robot localisation. In this notebook we will not be writing much code, as we leave our C++ implementation tasks to the second notebook, [`2022-11-25-Course-3-Localization-Exercises-Part-2.ipynb`](). Instead, we will be practising working out the Bayes' theorem calculations by hand using probability values computed for simulated data.

## 2. Programming Task

### 2.1. Calculate Localization Posterior

To continue developing our intuition for this filter and prepare for later coding exercises, let's walk through the calculations for determining posterior probabilities at several pseudo-positions $x$, for a single time-step. We will start with a time-step after the filter has already been initialised and run a few times. We will cover initialisation of the filter in an upcoming concept.

In [7]:
def value_to_decimal(value):
    if value == 'NULL':
        return np.nan
    return '%.2E' % Decimal(value)
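
As a quick illustration of what this helper returns, the sketch below repeats the function so that it runs on its own and applies it to a few made-up inputs (the sample values are assumptions, not taken from the CSV). The results are strings rather than numbers, which is why later cells convert back with `astype(np.float64)`. A NaN float is rendered as the string `'NAN'`, which is how missing entries appear in the table.

```python
from decimal import Decimal
import numpy as np

def value_to_decimal(value):
    """Format a value in scientific notation with two decimal places."""
    if value == 'NULL':
        return np.nan
    return '%.2E' % Decimal(value)

formatted = value_to_decimal(0.0386)       # a float becomes a formatted string
missing = value_to_decimal(float('nan'))   # a NaN float becomes the string 'NAN'
sentinel = value_to_decimal('NULL')        # the literal 'NULL' string maps to np.nan
print(formatted, missing, sentinel)
```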

In [8]:
file_path = os.path.join(DIR_BASE, 'data/2022-11-25-Lesson-3-1-Calculate-Localization-Posterior.csv') 
df = pd.read_csv(file_path, index_col=0)
df = df.applymap(value_to_decimal)

In [9]:
df

| pseudo_position (x) | P(location) | P(observation \| location) | Raw P(location \| observation) | Normalized P(location \| observation) |
|---|---|---|---|---|
| 1 | 1.67E-02 | 0.00E+00 | 0.00E+00 | 0.00E+00 |
| 2 | 3.86E-02 | 6.99E-03 | NAN | 2.59E-02 |
| 3 | 4.90E-02 | 8.52E-02 | 4.18E-03 | 4.01E-01 |
| 4 | 3.86E-02 | NAN | 5.42E-03 | 5.21E-01 |
| 5 | 1.69E-02 | 3.13E-02 | 5.31E-04 | 5.10E-02 |
| 6 | 6.51E-03 | 9.46E-04 | 6.16E-06 | NAN |
| 7 | NAN | 3.87E-06 | 6.55E-08 | 6.29E-06 |
| 8 | 3.86E-02 | 0.00E+00 | 0.00E+00 | 0.00E+00 |


Recall the general form of the Bayes' theorem:

$$
\begin{align}
P\left(a\vert b\right) = \frac{P\left(b \vert a\right)P\left(a\right)}{P\left(b\right)}
\end{align}
$$

For the localisation problem, we have the following terms:
* $P\left(\textrm{location} \ \vert \ \textrm{observation}\right)$ — the posterior probability $P\left(a \vert b\right)$, i.e., the _normalised_ probability of a position given the observation;
* $P\left(\textrm{observation} \ \vert \ \textrm{location}\right)$ — the likelihood $P\left(b \vert a\right)$, i.e., the probability of an observation given a position;
* $P\left(\textrm{location}\right)$ — the prior probability $P\left(a\right)$, i.e., the probability of a position;
* $P\left(\textrm{observation}\right)$ — the evidence (marginal probability) $P\left(b\right)$, i.e., the total probability of an observation over all positions.

Note that in the table above, the **Normalized P(location | observation)** term is the **Raw P(location | observation)** term divided by $P\left(\textrm{observation}\right)$, the total probability $P\left(b\right)$. In other words, it is the entire fraction on the right-hand side of Bayes' rule. Consequently, the **Raw P(location | observation)** term is the posterior probability before dividing by $P\left(\textrm{observation}\right)$, i.e., the numerator of that fraction.
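
To make the distinction between the raw and normalised posterior concrete, here is a minimal sketch of a single Bayes update over three positions. The prior and likelihood values are hypothetical and chosen only for illustration; they are not taken from the lesson data.

```python
import numpy as np

# Hypothetical prior and likelihood over three positions (illustrative values only)
prior = np.array([0.2, 0.5, 0.3])        # P(location)
likelihood = np.array([0.1, 0.6, 0.0])   # P(observation | location)

raw_posterior = likelihood * prior       # numerator of Bayes' rule
evidence = raw_posterior.sum()           # P(observation), by total probability
posterior = raw_posterior / evidence     # normalised P(location | observation)

print(raw_posterior, evidence, posterior)
```

The raw values do not sum to one on their own; only after dividing by their sum do they form a probability distribution over positions.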

#### The observation likelihood

To compute the observation likelihood term, $P\left(\textrm{observation} \ \vert \ \textrm{location} \right)$, for the pseudo-position $x=4$, we use the following relation:

$$
\begin{align}
P\left(b \vert a\right) = \frac{P\left(a \vert b\right)P\left(b\right)}{P\left(a\right)}
\end{align}
$$

which we obtain after re-arranging the general form of Bayes' rule. The numerator $P\left(a \vert b\right)P\left(b\right)$ is the non-normalised posterior, so here this corresponds to dividing the posterior term **Raw P(location | observation)** by the location prior probability **P(location)**.
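
As a worked check by hand, using the table values for $x = 4$ (raw posterior $5.42 \times 10^{-3}$ and prior $3.86 \times 10^{-2}$), the division would give approximately:

$$
\begin{align}
P\left(\textrm{observation} \ \vert \ \textrm{location}\right) \approx \frac{5.42 \times 10^{-3}}{3.86 \times 10^{-2}} \approx 1.40 \times 10^{-1}
\end{align}
$$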

In [10]:
### The pseudo-position x=4
x_4 = df.iloc[3]
x_4

P(location)                             3.86E-02
P(observation | location)                    NAN
Raw P(location | observation)           5.42E-03
Normalized P(location | observation)    5.21E-01
Name: 4, dtype: object

In [11]:
### Calculating the probability value
x_4 = x_4.astype(np.float64)
p_4 = value_to_decimal(x_4['Raw P(location | observation)'] / x_4['P(location)'])
p_4

'1.40E-01'

In [12]:
### Setting the value in the DataFrame
df.loc[4, 'P(observation | location)'] = p_4    # label-based assignment avoids chained indexing

#### The posterior probability

To compute the raw posterior probability term, **Raw P(location | observation)**, for the pseudo-position $x = 2$, we use the following relation:

$$
\begin{align}
P\left(\textrm{posterior}\right) = P\left(b \vert a\right) * P\left(a\right)
\end{align}
$$

which is the non-normalised expression in the numerator on the right-hand side of Bayes' rule. In other words, it is the product of the likelihood $P\left(b \vert a\right)$ and the prior probability $P\left(a\right)$.

In [13]:
### The pseudo-position x=2
x_2 = df.iloc[1]
x_2

P(location)                             3.86E-02
P(observation | location)               6.99E-03
Raw P(location | observation)                NAN
Normalized P(location | observation)    2.59E-02
Name: 2, dtype: object

In [14]:
### Calculating the probability value
x_2 = x_2.astype(np.float64)
p_2 = value_to_decimal(x_2['P(observation | location)'] * x_2['P(location)'])
p_2

'2.70E-04'

In [15]:
### Setting the value in the DataFrame
df.loc[2, 'Raw P(location | observation)'] = p_2    # label-based assignment avoids chained indexing

#### The normalised posterior probability

To compute the normalised posterior probability for the pseudo-position $x = 6$, we first sum the **Raw P(location | observation)** terms to obtain the normalising constant. Using the expression for the normalising constant $P\left(b\right)$ we have:

$$
\begin{align}
P\left(b\right) = \sum_{x=1}^{n} P\left(b \vert a = x\right)P\left(a = x\right)
\end{align}
$$

which is the sum over all non-normalised posterior values as given by the [law of total probability](https://en.wikipedia.org/wiki/Bayesian_statistics#Bayes'_theorem). Because the pseudo-position variable $x$ is discrete, the integral of the continuous case reduces to a sum over the product of the likelihood and prior probability values.

Therefore, we add all **Raw P(location | observation)** values from $x=1$ to $x=8$:

In [16]:
P_posterior_raw = df['Raw P(location | observation)'].astype(np.float64)
P_posterior_raw

pseudo_position (x)
1    0.000000e+00
2    2.700000e-04
3    4.180000e-03
4    5.420000e-03
5    5.310000e-04
6    6.160000e-06
7    6.550000e-08
8    0.000000e+00
Name: Raw P(location | observation), dtype: float64

In [17]:
### Summing the non-normalised total posterior probability
p_sum = P_posterior_raw.sum()
p_sum

0.0104072255

Then, to find the normalised posterior probability, we divide the raw posterior probability value **Raw P(location | observation)** at the given pseudo-position $x = 6$ by the normalisation term we computed above.

In [18]:
### Calculating the normalised posterior probability
p_6 = value_to_decimal(P_posterior_raw[6] / p_sum)
p_6

'5.92E-04'

In [19]:
### Setting the value in the DataFrame
df.loc[6, 'Normalized P(location | observation)'] = p_6    # label-based assignment avoids chained indexing
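
The same normalisation can be applied to every pseudo-position at once. The sketch below hard-codes the raw posterior values printed above rather than reading the CSV, so it runs on its own; the variable names are illustrative. Because those printed values are already rounded, some results may differ in the last digit from the table's normalised column.

```python
import numpy as np

# Raw P(location | observation) values for x = 1..8, copied from the output above
raw_posterior = np.array([
    0.0, 2.70e-04, 4.18e-03, 5.42e-03, 5.31e-04, 6.16e-06, 6.55e-08, 0.0
])

evidence = raw_posterior.sum()           # normalising constant P(observation)
normalised = raw_posterior / evidence    # one division per pseudo-position

# Print each pseudo-position alongside its normalised posterior
for x, p in enumerate(normalised, start=1):
    print(f"x={x}: {p:.2E}")
```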

#### The prior position probability

To compute the prior position probability for the pseudo-position $x = 7$, we can divide the raw posterior probability $P\left(\textrm{posterior}\right)$ by the observation likelihood $P\left(b \vert a\right)$. Recalling the formula for $P\left(\textrm{posterior}\right)$,

$$
\begin{align}
P\left(\textrm{posterior}\right) = P\left(b \vert a\right) * P\left(a\right),
\end{align}
$$

and knowing that **Normalized P(location | observation)** is

$$
\begin{align}
P\left(a \vert b\right) = \frac{P\left(b \vert a\right) * P\left(a\right)}{P\left(b\right)},
\end{align}
$$

we obtain the prior position probability by dividing the posterior **Raw P(location | observation)** by the observation likelihood **P(observation | location)**.
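
As a worked check by hand, using the table values for $x = 7$ (raw posterior $6.55 \times 10^{-8}$ and likelihood $3.87 \times 10^{-6}$), the division would give approximately:

$$
\begin{align}
P\left(a\right) = \frac{P\left(\textrm{posterior}\right)}{P\left(b \vert a\right)} \approx \frac{6.55 \times 10^{-8}}{3.87 \times 10^{-6}} \approx 1.69 \times 10^{-2}
\end{align}
$$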

In [20]:
### The pseudo-position x=7
x_7 = df.iloc[6]
x_7

P(location)                                  NAN
P(observation | location)               3.87E-06
Raw P(location | observation)           6.55E-08
Normalized P(location | observation)    6.29E-06
Name: 7, dtype: object

In [21]:
### Calculating the prior position probability
x_7 = x_7.astype(np.float64)
p_7 = value_to_decimal(
    x_7['Raw P(location | observation)'] / x_7['P(observation | location)']
)
p_7

'1.69E-02'

In [22]:
### Setting the value in the DataFrame
df.loc[7, 'P(location)'] = p_7    # set only the x=7 entry, not the whole column

#### The final DataFrame

With the above calculations, we obtain a complete probability distribution with values:

In [23]:
df

| pseudo_position (x) | P(location) | P(observation \| location) | Raw P(location \| observation) | Normalized P(location \| observation) |
|---|---|---|---|---|
| 1 | 0.0167 | 0.0 | 0.0 | 0.0 |
| 2 | 0.0386 | 0.00699 | 0.00027 | 0.0259 |
| 3 | 0.049 | 0.0852 | 0.00418 | 0.401 |
| 4 | 0.0386 | 0.14 | 0.00542 | 0.521 |
| 5 | 0.0169 | 0.0313 | 0.000531 | 0.051 |
| 6 | 0.00651 | 0.000946 | 6.16e-06 | 0.000592 |
| 7 | 0.0169 | 3.87e-06 | 6.55e-08 | 6.29e-06 |
| 8 | 0.0386 | 0.0 | 0.0 | 0.0 |


Because the normalised posterior is a probability distribution over the discrete 1-D pseudo-positions, we know from the [law of total probability](https://en.wikipedia.org/wiki/Law_of_total_probability) that its values should add up to $1.0$.

To verify this, we take the sum of the resulting normalised posterior values:

In [24]:
### Summing the normalised posterior values
df['Normalized P(location | observation)'].astype(np.float64).sum()

0.9994982900000001

which gives a total probability very close to $1.0$. The small shortfall is consistent with the tabulated values being rounded to three significant figures.
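
A check like this can also be written as an explicit tolerance test. The sketch below hard-codes the normalised posterior values from the final table so that it runs on its own; the tolerance of `1e-3` is an assumption chosen to absorb rounding in the tabulated values, not a value prescribed by the lesson.

```python
import numpy as np

# Normalised posterior values for x = 1..8, copied from the final table above
normalised = np.array([0.0, 0.0259, 0.401, 0.521, 0.051, 0.000592, 6.29e-06, 0.0])

total = normalised.sum()
# Assumed tolerance: loose enough to absorb rounding to three significant figures
is_valid = np.isclose(total, 1.0, atol=1e-3)
print(total, is_valid)
```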

Hooray! This was a great start to Bayesian statistics, which we will use together with the [Markov Assumption](https://en.wikipedia.org/wiki/Markov_chain) to perform inference over the map range space using a [Bayes' filter](https://en.wikipedia.org/wiki/Recursive_Bayesian_estimation). This will allow us to estimate vehicle location using nothing but a single pair of consecutive measurements and a 1-D range map, i.e., a set of landmark positions defined relative to the ego-vehicle heading. Let's go! 

## Credits

This assignment was prepared by Aaron Brown, Tiffany Huang and Maximilian Muffert of Mercedes-Benz Research & Development of North America (MBRDNA), 2021 (link [here]()).