# Hochschule Bonn-Rhein-Sieg

# Mathematics for Robotics and Control

# Assignment 10 - Random Variables

Author(s)/team members:

* Kabilan Tamilmani  (ktamil2s)
* Karthik Sundararaj (ksunda2s)

In [1]:
import numpy as np
import IPython
import matplotlib.pyplot as plt
import typing

## Exercise 1: Expected Value Properties [5 points]

Given a discrete random variable $X$ and its probability mass function $P(X)$, the expected value of $X$ can be calculated using the formula

\begin{equation*}
    E[X] = \sum_{x}{xP(x)}
\end{equation*}

The expected value has some interesting properties that can often simplify seemingly complicated calculations. In this exercise, we'll investigate some of those properties.

a) [2 points] Show that the expected value of $Y = aX + b$, where $a$ and $b$ are constants, is equal to $aE[X] + b$. (*Hint*: Use the fact that the expected value of a function $g(X)$ is equal to $E[g(X)] = \sum_{x}{g(x)P(x)}$)

Let our function be $g(X) = Y = aX+b$. <br/>
Then $E[g(X)] = \sum_{x} g(x) P(x)$<br/>
$E[Y] = \sum_{x} (ax+b) P(x)$<br/>
Since $a$ and $b$ are constants, the sum can be split as<br/>
$E[Y] = a\sum_{x} x P(x)+b\sum_{x}P(x)$<br/>
In the second term, the probabilities sum to 1 over all $x$, so<br/>
$E[Y] = a\sum_{x} x P(x)+b\cdot 1$<br/>
$E[Y] = aE[X]+b$<br/>
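
As a quick numerical sanity check of this identity, the sketch below evaluates both sides for an arbitrary PMF. The values, probabilities, and constants are illustrative assumptions and are not part of the assignment.

```python
import numpy as np

# Hypothetical PMF and constants, chosen only for illustration
values = np.array([-1.0, 0.0, 2.0, 5.0])
pmf = np.array([0.1, 0.2, 0.3, 0.4])
a, b = 3.0, -2.0

e_x = np.sum(values * pmf)              # E[X]
e_y = np.sum((a * values + b) * pmf)    # E[aX + b] computed as E[g(X)]

# Both printed numbers should agree up to floating-point error
print(e_y, a * e_x + b)
```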

b) [2 points] The variance of $X$ is defined as

\begin{equation*}
    Var(X) = E[(X - E[X])^2]
\end{equation*}

Show that $Var(X) = E[X^2] - (E[X])^2$.

We know that $E[X] = \sum_{x}{xP(x)}$ and $\sum_{x}{P(x)} = 1$. Starting from the definition and expanding the square,
\begin{align*}
    Var(X) &= E[(X - E[X])^2]\\
    &= \sum_{x}{(x^2 - 2xE[X] + E[X]^2)P(x)}\\
    &= \sum_{x}{x^2P(x)} - 2E[X]\sum_{x}{xP(x)} + E[X]^2\sum_{x}{P(x)}\\
    &= E[X^2] - 2E[X]E[X] + E[X]^2\\
    &= E[X^2] - 2E[X]^2 + E[X]^2\\
    &= E[X^2] - E[X]^2
\end{align*}
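
The two forms of the variance can also be compared numerically. This is a minimal sketch with a made-up PMF (an assumption for illustration only); it computes the variance once from the definition and once from the shortcut formula.

```python
import numpy as np

# Hypothetical PMF used only to illustrate the identity
values = np.array([0.0, 1.0, 2.0, 3.0])
pmf = np.array([0.1, 0.2, 0.3, 0.4])

e_x = np.sum(values * pmf)
var_definition = np.sum((values - e_x) ** 2 * pmf)   # E[(X - E[X])^2]
var_shortcut = np.sum(values ** 2 * pmf) - e_x ** 2  # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)
```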

c) [1 point] Given two random variables $X$ and $Y$, show that $E[X + Y] = E[X] + E[Y]$.

Let $g(X,Y)=X+Y$ and let $P(x,y)$ be the joint probability mass function of $X$ and $Y$. The expectation of $g(X,Y)$ is<br/>
$E[g(X,Y)] = \sum_{x}\sum_{y}{g(x,y)P(x,y)}$<br/>
$E[X+Y] = \sum_{x}\sum_{y}{(x+y)P(x,y)}$<br/>
$E[X+Y] = \sum_{x}\sum_{y}{xP(x,y)} + \sum_{x}\sum_{y}{yP(x,y)}$<br/>
$E[X+Y] = \sum_{x}{x\sum_{y}{P(x,y)}} + \sum_{y}{y\sum_{x}{P(x,y)}}$<br/>
Summing the joint PMF over one variable gives the marginal PMF of the other, i.e. $\sum_{y}{P(x,y)} = P(x)$ and $\sum_{x}{P(x,y)} = P(y)$, so <br/>
$E[X+Y] = \sum_{x}{xP(x)} + \sum_{y}{yP(y)}$<br/>
$E[X+Y] = E[X] + E[Y]$
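
The identity does not require $X$ and $Y$ to be independent. The sketch below checks it numerically on a deliberately dependent joint PMF; the table and the value ranges are hypothetical and only serve as an illustration.

```python
import numpy as np

# Hypothetical joint PMF P(x, y); rows index x, columns index y.
# X and Y are deliberately dependent here.
x_values = np.array([0.0, 1.0])
y_values = np.array([10.0, 20.0, 30.0])
joint = np.array([[0.30, 0.05, 0.05],
                  [0.05, 0.25, 0.30]])

p_x = joint.sum(axis=1)   # marginal P(x)
p_y = joint.sum(axis=0)   # marginal P(y)

# E[X + Y] computed directly from the joint PMF
e_sum = np.sum((x_values[:, None] + y_values[None, :]) * joint)
e_x = np.sum(x_values * p_x)
e_y = np.sum(y_values * p_y)

print(e_sum, e_x + e_y)
```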

## Exercise 2: Random Variables and Conditional Probabilities [15 points]

Let's suppose that we have a very simple speech synthesis system that can only generate sequences of the letters \{ a, b, c, d \}. Based on the data we have used for training the system, we know that the prior probabilities of generating the letters are $P(a) = 0.1$, $P(b) = 0.4$, $P(c) = 0.2$, and $P(d) = 0.3$. This particular system works by generating letters one at a time; as only some letter combinations are considered valid words, generating letter $n$ affects the probabilities of generating letter $n+1$. In particular, we know that

\begin{align*}
    &P(a|a) = 0.2 \hspace{2cm} P(b|a) = 0.1 \hspace{2cm} P(c|a) = 0.6 \hspace{2cm} P(d|a) = 0.1\\
    &P(a|b) = 0.4 \hspace{2cm} P(b|b) = 0.2 \hspace{2cm} P(c|b) = 0.1 \hspace{2cm} P(d|b) = 0.3\\
    &P(a|c) = 0.1 \hspace{2cm} P(b|c) = 0.2 \hspace{2cm} P(c|c) = 0.4 \hspace{2cm} P(d|c) = 0.3\\
    &P(a|d) = 0.4 \hspace{2cm} P(b|d) = 0.4 \hspace{2cm} P(c|d) = 0.2 \hspace{2cm} P(d|d) = 0.0
\end{align*}

Given that the letters from $1$ to $n-1$ don't affect the probabilities of generating the subsequent letters, we say that letter $n+1$ is conditionally independent of letters $1$ to $n-1$ given letter $n$. For instance, if the system has generated the sequence *bc*, the probability of generating *d* next, namely $P(d|b,c)$, will be equal to $P(d|c)$. The probabilities given above are thus all we need for investigating how this particular speech synthesis system works.
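
The conditional independence described above means that the probability of a whole sequence is the prior of the first letter times a chain of one-step conditionals. The following is a minimal sketch of that idea; `sequence_probability` is a hypothetical helper introduced here, and the array layout (row = previous letter, column = next letter) is an assumption that matches the table above.

```python
import numpy as np

letters = ['a', 'b', 'c', 'd']
priors = np.array([0.1, 0.4, 0.2, 0.3])
# conditionals[i, j] = P(letter j | letter i), copied from the table above
conditionals = np.array([[0.2, 0.1, 0.6, 0.1],
                         [0.4, 0.2, 0.1, 0.3],
                         [0.1, 0.2, 0.4, 0.3],
                         [0.4, 0.4, 0.2, 0.0]])

def sequence_probability(sequence: str) -> float:
    """Probability of a non-empty letter sequence (hypothetical helper)."""
    idx = [letters.index(ch) for ch in sequence]
    prob = priors[idx[0]]
    # each further letter depends only on the letter directly before it
    for prev, nxt in zip(idx, idx[1:]):
        prob *= conditionals[prev, nxt]
    return float(prob)

p_bcd = sequence_probability('bcd')   # P(b) * P(c|b) * P(d|c)
print(p_bcd)
```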

a) [2 points] How many four-letter sequences of $\{ a, b, c, d \}$ are there in total? How many of these are valid for this particular speech synthesis system?

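If each letter may be used only once, there are $4! = 24$ four-letter sequences. <br/>
Since the system can repeat letters, there are $4^4 = 256$ four-letter sequences in total. <br/>
Taking "valid" to mean "can be generated with nonzero probability", the only forbidden transition is $P(d|d) = 0$, so a sequence is valid exactly when it does not contain *dd*; $40$ of the $256$ sequences contain *dd*, which leaves $216$ valid sequences. <br/>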
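
The counts can be checked by brute-force enumeration. This sketch assumes that a sequence is "valid" when every transition in it has nonzero probability, which in the table above rules out only *d* followed by *d*.

```python
from itertools import product

letters = 'abcd'
# Transitions with zero probability in the table; only P(d|d) = 0
forbidden_pairs = {('d', 'd')}

all_sequences = list(product(letters, repeat=4))
valid_sequences = [
    seq for seq in all_sequences
    if all(pair not in forbidden_pairs for pair in zip(seq, seq[1:]))
]

print(len(all_sequences), len(valid_sequences))
```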

b) [3 points] What is the probability of generating the sequence $dacb$?

In [None]:
# YOUR CODE HERE
#P(d)=0.3
p_d = 0.3

#P(a|d)=0.4
p_ad = 0.4

#P(c|d,a)=P(c|a)=0.6
p_ca = 0.6

#P(b|d,a,c) = P(b|c) = 0.2
p_bc = 0.2

#### please assign p_dacb to the probability

#P(dacb) = P(d)*P(a|d)*P(c|a)*P(b|c)
p_dacb = p_d*p_ad*p_ca*p_bc
print(p_dacb)

In [None]:
### THIS CELL CONTAINS AUTOMATED TESTS OF YOUR SOLUTION; DO NOT DELETE IT!


c) [10 points] Let us now define a random variable $X$ that counts the occurrences of a given letter in a two-letter sequence. Write the function $f$ defined below, which calculates $P(X)$, the probability mass function of $X$. Verify the results of your function by assuming that $X$ counts the number of *a*s in a two-letter sequence; for this case, calculate $P(X)$ by hand and make sure that your function returns the correct values.

*Hint 1*: Feel free to define any additional helper functions that might simplify your calculations.

*Hint 2*: A test case for your function is given in the cell below. You should obtain the probabilities $P(X=0) = 0.6$, $P(X=1) = 0.38$, and $P(X=2) = 0.02$ for that test case.
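
A hand calculation for the letter *a* can be written out as follows. The split into cases is our own bookkeeping and uses only the priors and conditionals given above: both letters are *a*, exactly one letter is *a* (either the first or the second), or neither is.

\begin{align*}
    P(X=2) &= P(a)P(a|a) = 0.1 \cdot 0.2 = 0.02\\
    P(X=1) &= P(a)\left[1 - P(a|a)\right] + P(b)P(a|b) + P(c)P(a|c) + P(d)P(a|d)\\
    &= 0.08 + 0.16 + 0.02 + 0.12 = 0.38\\
    P(X=0) &= 1 - P(X=1) - P(X=2) = 0.6
\end{align*}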

In [2]:
def f(priors: np.array,
      conditionals: np.array,
      letter_idx: int) -> float:
    p_x = np.array([0., 0., 0.])

    ### Write your code here ###
    # YOUR CODE HERE
    # A two-letter sequence over four letters has 4^2 = 16 possible combinations,
    # so we iterate over all rows (first letter) and columns (second letter)
    # of the conditional probability table.
    for row,data in enumerate(conditionals):
        for column,element in enumerate(data):
            # both letters are the counted letter l: P(X=2) = P(l) * P(l|l)
            if(row==letter_idx and column==letter_idx):
                p_x[2]+=priors[letter_idx]*element
            # exactly one letter is the counted letter: it is either the first
            # letter (row) or the second letter (column), but not both
            elif(row==letter_idx):
                p_x[1]+=priors[letter_idx]*element
            elif(column==letter_idx):
                p_x[1]+=priors[row]*element
            # the counted letter does not occur: P(X=0) sums
            # P(first) * P(second|first) over all remaining pairs
            else:
                p_x[0]+=priors[row]*element

    return p_x

letters = ['a', 'b', 'c', 'd']
priors = np.array([0.1, 0.4, 0.2, 0.3])
conditionals = np.array([[0.2, 0.1, 0.6, 0.1], [0.4, 0.2, 0.1, 0.3], \
                         [0.1, 0.2, 0.4, 0.3], [0.4, 0.4, 0.2, 0.]])

pmf = f(priors, conditionals, letters.index('a'))
assert np.allclose(pmf, np.array([0.6, 0.38, 0.02]))

In [None]:
### THIS CELL CONTAINS AUTOMATED TESTS OF YOUR SOLUTION; DO NOT DELETE IT!


## Exercise 3: Random Variables [20 points]

In this exercise, you'll create a very simple Python library for discrete random variables. In particular, given a random variable $X$ whose probability mass function $P(X)$ is known, your library will be able to:

* [2 points] *Calculate the expected value of $X$*
* [2 points] *Calculate the variance of $X$*
* [3 points] *Calculate the expected value of a function of $X$*
* [5 points] *Calculate the **conditional expectation** of $X$ given an event $Y = y$*: Given an event $Y = y$, the PMF of $X$ changes to the conditional PMF $P_{X|Y=y}(x|y)$; the conditional expectation is thus defined as
    \begin{equation*}
        E[X|Y=y] = \sum_{x}{xP_{X|Y=y}(x|y)}
    \end{equation*}
* [3 points] *Create the **cumulative distribution function (CDF)** of $X$*: The cumulative distribution function $F(x)$ returns the probability that $X$ is less than or equal to a given value $x$, i.e.
    \begin{equation*}
        F(x) = P(X \leq x)
    \end{equation*}
* [5 points] *Generate samples from the PMF*: Sampling from the PMF is best explained by an example. Let's say that we have a discrete random variable $X$ that takes the values $1$, $2$, and $3$, whose probabilities are $P(X=1) = 0.2$, $P(X=2) = 0.5$, and $P(X=3) = 0.3$. If we generate a lot of samples from this distribution (say a thousand), we would expect that roughly $200$ of those are equal to $1$, $500$ are equal to $2$, and $300$ are equal to $3$. *Hint*: You may **not** use existing library functions (e.g. in numpy) when implementing the sampling function.
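
One common way to sample from a PMF without a library sampling routine is to draw a uniform number in $[0, 1)$ and walk through the running sum of the probabilities until it exceeds that number. The sketch below illustrates this for the example with values $1$, $2$, and $3$; `sample_from_uniform` is a hypothetical helper, and only the uniform generator from the standard library is used.

```python
import random

def sample_from_uniform(u: float, values: list, pmf: list):
    """Map a uniform number u in [0, 1) to a value (hypothetical helper)."""
    cumulative = 0.0
    for value, probability in zip(values, pmf):
        cumulative += probability
        if u < cumulative:
            return value
    # Guard against round-off when the probabilities sum to slightly less than 1
    return values[-1]

values = [1, 2, 3]
pmf = [0.2, 0.5, 0.3]

rng = random.Random(0)  # seeded only to make the illustration repeatable
samples = [sample_from_uniform(rng.random(), values, pmf) for _ in range(1000)]
counts = {v: samples.count(v) for v in values}
print(counts)  # roughly 200, 500, and 300 occurrences are expected
```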

Define your library functions in the class *DiscreteRandomVariable*, whose interface is given below. Verify that your functions are working correctly by running the test cases below and making sure that none of the assertions fail.

In [None]:
class DiscreteRandomVariable(object):
    def __init__(self, values: np.ndarray, pmf: np.ndarray):
        '''Keyword arguments:
        values: np.ndarray -- allowed values for the random variable
        pmf: np.ndarray -- probability mass function of the random variable

        '''
        self.values = np.array(values)
        self.pmf = np.array(pmf)

    def expectation(self) -> float:
        '''Calculates the expected value of the random variable.
        '''
        e_x = 0.

        ### Write your code here ###
        # YOUR CODE HERE
        for i,value in enumerate(self.values):
            e_x += value*self.pmf[i]

        return e_x

    def variance(self) -> float:
        '''Calculates the variance of the random variable.
        '''
        var_x = 0.

        ### Write your code here ###
        # YOUR CODE HERE
        ex_x = self.expectation()
        for i,value in enumerate(self.values):
            var_x += ((value-ex_x)**2 * self.pmf[i])

        return var_x

    def function_expectation(self, g: typing.Callable) -> float:
        '''Calculates the expectation of a function of the random variable.

        Keyword arguments:
        g: typing.Callable -- function for transforming the values
                              of the random variable

        '''
        e_g_x = 0.

        ### Write your code here ###
        # YOUR CODE HERE
        for i,value in enumerate(self.values):
            e_g_x += g(value)*self.pmf[i]

        return e_g_x

    def conditional_expectation(self, y_values: np.ndarray,
                                conditional_pmfs: np.ndarray,
                                observed_y: int) -> float:
        '''Calculates the conditional expectation of the random variable
        given a particular value of another random variable.

        Keyword arguments:
        y_values: np.array -- a list of possible y values
        conditional_pmfs: np.ndarray -- a 2D array with as many rows as
                                        there are values in y_values,
                                        such that each row represents
                                        the conditional pmf of x given y
        observed_y: the observed value of y

        '''
        conditional_e_x = 0.

        ### Write your code here ###
        # YOUR CODE HERE
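        # row index of the observed y value; [0][0] extracts a plain integer
        # so that the result is a scalar rather than a nested array
        index = np.where(y_values==observed_y)[0][0]
        for i,value in enumerate(self.values):
            conditional_e_x += value*conditional_pmfs[index,i]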

        return conditional_e_x

    def cdf(self) -> np.ndarray:
        '''Returns a numpy array representing the cumulative
        distribution function of the random variable.
        '''
        cdf = np.zeros(len(self.pmf) + 1)

        ### Write your code here ###
        # YOUR CODE HERE
        cdf_sum = 0
        for i in range(len(self.values)+1):
            if i==0:
                cdf[i] = 0
            else:
                cdf_sum+=self.pmf[i-1]
                cdf[i] = cdf_sum

        return cdf

    def sample(self, number_of_samples: int) -> np.ndarray:
        '''Samples values from the random variable.
        Returns a numpy array with the generated samples.

        Keyword arguments:
        number_of_samples: int -- number of samples to generate

        '''
        samples = list()

        ### Write your code here ###
        # YOUR CODE HERE
        samples = np.zeros(number_of_samples)
        # draw uniform random numbers in [0, 1), one per requested sample
        numbers = np.random.random(number_of_samples)
        lower = 0.
        # partition [0, 1) into consecutive half-open intervals whose widths
        # equal the probabilities; a number in the i-th interval maps to the
        # i-th value of the random variable
        for i,value in enumerate(self.values):
            upper = lower + self.pmf[i]
            if i == len(self.values) - 1:
                # the last interval absorbs round-off in the running sum
                upper = 1.
            data = np.where((numbers>=lower) & (numbers<upper))
            samples[data] = value
            lower = upper

        return np.array(samples)

X = DiscreteRandomVariable(np.array([-2, -1, 0, 1, 2]),
                           np.array([0.4, 0.1, 0.2, 0.2, 0.1]))

# expected value test case
e_x = X.expectation()
assert abs(e_x-(-0.5)) < 1e-5

# variance test case
var_x = X.variance()
assert abs(var_x-2.05) < 1e-5

# expected value of a function test case
g = lambda x: x**2
e_g_x = X.function_expectation(g)
assert abs(e_g_x-2.3) < 1e-5

# conditional expectation test case
y_values = np.array([-2, 1, 4])
conditional_pmfs = np.array([[0.3, 0., 0.3, 0.1, 0.3], \
                             [0.2, 0.1, 0., 0.6, 0.1], [0., 0.3, 0.5, 0.2, 0.]])
observed_y = 4
conditional_e_x = X.conditional_expectation(y_values, conditional_pmfs, observed_y)
assert abs(conditional_e_x-(-0.1)) < 1e-5

# CDF test case
cdf = X.cdf()
assert np.all(np.abs(cdf - np.array([0., 0.4, 0.5, 0.7, 0.9, 1.])) < 1e-5)

# sampling test case
samples = np.array(X.sample(10000))
counts = np.zeros(5)
counts[0] = len(np.where(samples==-2)[0])
counts[1] = len(np.where(samples==-1)[0])
counts[2] = len(np.where(samples==0)[0])
counts[3] = len(np.where(samples==1)[0])
counts[4] = len(np.where(samples==2)[0])
assert counts[0] > 3750 and counts[0] < 4250
assert counts[1] > 750 and counts[1] < 1250
assert counts[2] > 1750 and counts[2] < 2250
assert counts[3] > 1750 and counts[3] < 2250
assert counts[4] > 750 and counts[4] < 1250

### THE FOLLOWING CELLS CONTAIN AUTOMATED TESTS OF YOUR SOLUTION; DO NOT DELETE THEM!

## Exercise 4: Common Discrete Random Variables [30 points]

In this exercise, you will investigate some properties of a few common discrete random variables (please look at the lab class material for an introduction to these).

a) [5 points] Find the mean and variance of a Bernoulli random variable.

A random variable $X$ with the distribution

$P(X=1) = p$ and $P(X=0) = 1-p$ is called a Bernoulli random variable.

The mean of the Bernoulli random variable $X$ is given by $E(X)=\sum_x xP(x)$: <br/>
$E(X) = 1 \cdot p + 0 \cdot (1-p)$ <br/>
$E(X) = p$

The variance of $X$ is given by <br/>
$Var(X) = E(X^{2}) - E(X)^{2}$<br/>
$E(X^{2}) = 1^{2} \cdot p + 0^{2} \cdot (1-p) = p$ <br/>
$E(X)^2 = p^2$ <br/>
$Var(X) = p - p^{2} = p(1-p)$

b) [5 points] Find the mean of a geometric random variable.

For a geometric distribution, $P(X=x) = (1 - p)^{x} p$ for $x = 0, 1, 2, \ldots$

The mean is the expected value of the random variable $X$:

$ E(X) = \sum_{x=0}^{\infty} x P(X=x) = \sum_{x=0}^{\infty} x (1 - p)^{x} p = p(1-p) + 2p(1-p)^{2} + \cdots $

Let $q = 1-p$:

$ E(X) = pq + 2pq^{2}+ 3pq^{3} + \cdots $

$ qE(X) = pq^{2} + 2pq^{3}+ 3pq^{4} + \cdots $

$ E(X) - qE(X) = (pq + 2pq^{2}+ 3pq^{3} + \cdots) - (pq^{2} + 2pq^{3}+ 3pq^{4} + \cdots) $

$ E(X) (1 - q)  = pq + pq^{2} + pq^{3} + \cdots = p(q + q^{2} + q^{3} + \cdots)$

The bracket is a geometric series with $|q| < 1$, so

$ E(X) (1 - q)  = \frac{pq}{1-q} $

We know that $p = 1 - q$, hence

$ E(X)p = \frac{pq}{1-q} $

$ E(X) = \frac{q}{1-q} = \frac{1-p}{p} $

Variance:

$ E(X^{2}) = \sum_{x=0}^{\infty} x^{2} P(X=x) = \sum_{x=0}^{\infty} x^{2} (1 - p)^{x} p = p(1-p) + 4p(1-p)^{2} + 9p(1-p)^{3} + \cdots $

With $q = 1-p$ and the series $\sum_{x=0}^{\infty} x^{2} q^{x} = \frac{q(1+q)}{(1-q)^{3}}$,

$ E(X^{2}) = pq + 4pq^{2} + 9pq^{3} + \cdots = \frac{pq(1+q)}{(1-q)^{3}} = \frac{q(1+q)}{p^{2}} = \frac{2 - 3p + p^{2}}{p^{2}}$

$ Var(X) = E(X^{2}) - [E(X)]^{2} = \frac{2 - 3p + p^{2}}{p^{2}} - \left(\frac{1-p}{p}\right)^{2} = \frac{1-p}{p^{2}}$
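
The closed forms for the mean and variance can be compared against truncated series. This is a rough numerical sketch; the value of `p` and the truncation point are arbitrary choices made for illustration.

```python
import numpy as np

p = 0.25                      # illustrative success probability
x = np.arange(0, 2000)        # truncation point chosen so the tail is negligible
pmf = (1 - p) ** x * p        # P(X = x) = (1 - p)^x p

mean = np.sum(x * pmf)
variance = np.sum(x ** 2 * pmf) - mean ** 2

print(mean, (1 - p) / p)            # both should be close to 3
print(variance, (1 - p) / p ** 2)   # both should be close to 12
```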




c) [10 points] Let's suppose that we have a robot throwing darts at a dartboard. A dartboard is divided into 20 regions for points from 1 to 20 (for the sake of this problem, we will ignore the bullseye and the regions for doubles and triples). Our robot is not very proficient at playing the game: we know that each throw will finish within the dartboard, but hitting any of the regions is equally likely.

Calculate the expected number of throws for the robot to hit a 20.

Probability of a throw hitting 20: $p=1/20$ \
Probability of a throw not hitting 20: $1-p=19/20$ \
Let $q =1-p$. \
The probability that the first 20 is hit on throw $x$ is $P(X=x) = q^{x-1} \cdot p$ \
Expected number of throws, \
$ E(X) = \sum_{x=1}^{\infty} x \cdot P(X=x)$\
$ E(X) = \sum_{x=1}^{\infty} x \cdot q^{x-1} \cdot p$\
$ E(X) = p \cdot \sum_{x=1}^{\infty} x \cdot q^{x-1}$\
$ E(X) = p \cdot (1+2q+3q^2+ \cdots)$\
The series is the derivative of the geometric series $\sum_{x=0}^{\infty} q^{x} = \frac{1}{1-q}$ with respect to $q$, so\
$ E(X) = p \frac{1}{(1-q)^2}$\
Substituting the values of $p$ and $q$,\
$ E(X) = 0.05 \frac{1}{(1- 0.95)^2}$\
$ E(X) = 20$
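
The result can be cross-checked with a small simulation. This is only a sketch: the seed and the number of trials are arbitrary choices, and `throws_until_twenty` is a hypothetical helper that models each throw as a uniform draw over the 20 regions.

```python
import random

def throws_until_twenty(rng: random.Random) -> int:
    """Count throws until the first 20, each region being equally likely."""
    throws = 1
    while rng.randint(1, 20) != 20:
        throws += 1
    return throws

rng = random.Random(42)  # seeded only so the illustration is repeatable
trials = 20000
average_throws = sum(throws_until_twenty(rng) for _ in range(trials)) / trials
print(average_throws)  # should land close to 20
```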

d) [10 points] In the analysis of randomised algorithms, we often talk about *indicator random variables*. For a certain event of interest $A$, an indicator random variable $I$ is one that takes the values $1$ and $0$ with probabilities $P(A)$ and $1 - P(A)$ respectively; indicator random variables are thus Bernoulli random variables. Indicator random variables can often simplify the calculation of what could otherwise be a relatively complicated problem.

Let's suppose that we have a fleet of $n$ mobile robots and we command them all to move from the same starting location to the same destination. Each one of these robots is known to have motion problems, so it only reaches the goal with a probability $p = 0.8$. Find the expected number of robots that will reach the destination successfully. *Hint*: Model the problem using indicator random variables.

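Let $A_i$ be the event that robot $i$ reaches the destination successfully, and let $I_i$ be the corresponding indicator random variable, which is $1$ if $A_i$ occurs and $0$ otherwise.

For a single robot, $ E[I_i] = 1 \cdot P(A_i) + 0 \cdot (1 - P(A_i)) = 0.8 $

Let $N = \sum_{i=1}^{n} I_i$ be the number of robots that reach the goal.

By linearity of expectation, the expected number of robots reaching the destination is

$ E[N] = E\left[\sum_{i=1}^{n} I_{i}\right] = \sum_{i=1}^{n} E[I_{i}]$

So $ E[N] = \sum_{i=1}^{n} 0.8 $

$ E[N] = 0.8 + 0.8 + 0.8 + \cdots$ ($n$ times) $= 0.8n $

$ E[N] = 0.8n $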
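
For comparison, the same expectation can be computed the long way from the full distribution of the number of successful robots. The sketch below additionally assumes that the robots succeed independently, so that the count is binomial; linearity of expectation itself does not need that assumption. The fleet size `n = 10` and the helper `expected_successes` are illustrative.

```python
from math import comb

def expected_successes(n: int, p: float) -> float:
    """E[N] computed directly from the binomial PMF (hypothetical helper)."""
    return sum(k * comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1))

n, p = 10, 0.8  # n = 10 is an illustrative fleet size
direct = expected_successes(n, p)
print(direct, n * p)  # both should be close to 8
```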

## Exercise 5: Bayesian Hypothesis Testing [10 points]

Consider a case in which a robot is picking chocolates from one of two bowls, each of which contains 30 chocolates.

The first bowl has the following distribution of chocolates:
* 12 Bounty bars
* 7 Mars bars
* 6 Snickers bars
* 5 KitKat bars

The second bowl has the following distribution of chocolates:
* 8 Bounty bars
* 8 Mars bars
* 10 Snickers bars
* 4 KitKat bars

The robot now selects one of the bowls at random and then picks 10 chocolates from it, such that we want to know which of the bowls is more likely to have been selected. The following chocolates were picked from the bowl:

    Bounty, Bounty, Mars, Bounty, Snickers, Snickers, Snickers, Mars, Snickers, Bounty

Which bowl did the robot select - bowl 1 or bowl 2? Test both hypotheses and find out which one is more likely.

In [4]:
# YOUR CODE HERE
# Distribution of chocolates
bowl1= {"Bounty": 12, "Mars": 7,"Snickers": 6 , "Kitkat":5}
bowl2= {"Bounty": 8 , "Mars": 8,"Snickers": 10, "Kitkat":4}
#  Robot selection
selected_chocolates = ['Bounty','Bounty','Mars','Bounty','Snickers',
                       'Snickers','Snickers','Mars','Snickers','Bounty']
# probability of picking both the bowls are equal
P_bowl1 = 0.5
P_bowl2 = 0.5
for chocolate in selected_chocolates:
    
    quantity_bowl1 = bowl1[chocolate] # number of desired chocolate
    total_chocolate_bowl1 = sum(bowl1.values()) # Total chocolates
    # probability of desired chocolate intersection bowl1
    P_chocolate_bowl1 = (quantity_bowl1/total_chocolate_bowl1) * P_bowl1 
    bowl1[chocolate] -=1 # subtracting picked chocolate from the distribution
    
    quantity_bowl2 = bowl2[chocolate] # number of desired chocolate
    total_chocolate_bowl2 = sum(bowl2.values()) # Total chocolates
    # probability of desired chocolate intersection bowl2
    P_chocolate_bowl2 = (quantity_bowl2/total_chocolate_bowl2) * P_bowl2 
    bowl2[chocolate] -=1 # subtracting picked chocolate from the distribution
    
    # probability of chocolate picked from the bowl
    P_bowl1 = P_chocolate_bowl1/ (P_chocolate_bowl1 + P_chocolate_bowl2)
    P_bowl2 = P_chocolate_bowl2/ (P_chocolate_bowl1 + P_chocolate_bowl2)
# please assign likelier_hypothesis to 1 or 2 depending on
# which hypothesis you find to be more likely
if P_bowl1 > P_bowl2 : 
    likelier_hypothesis = 1
elif P_bowl2 > P_bowl1 :
    likelier_hypothesis = 2
print("probability of the chocolates picked from the bowl 1", P_bowl1)
print("probability of the chocolates picked from the bowl 2", P_bowl2)
print("Bowl",likelier_hypothesis,"is more likely to be selected by the robot")

probability of the chocolates picked from the bowl 1 0.274745605920444
probability of the chocolates picked from the bowl 2 0.725254394079556
Bowl 2 is more likely to be selected by the robot
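
As a cross-check, the same posterior can be obtained in one step from the likelihood of the whole sequence under each bowl, using exact fractions. This is a sketch of an alternative route rather than the method used in the cell above; `sequence_likelihood` is a hypothetical helper, and drawing without replacement with equal priors is assumed.

```python
from fractions import Fraction

def sequence_likelihood(bowl: dict, picks: list) -> Fraction:
    """P(picks | bowl) when drawing without replacement (hypothetical helper)."""
    remaining = dict(bowl)  # copy so the caller's counts are not modified
    likelihood = Fraction(1)
    for chocolate in picks:
        likelihood *= Fraction(remaining[chocolate], sum(remaining.values()))
        remaining[chocolate] -= 1
    return likelihood

bowl1 = {"Bounty": 12, "Mars": 7, "Snickers": 6, "Kitkat": 5}
bowl2 = {"Bounty": 8, "Mars": 8, "Snickers": 10, "Kitkat": 4}
picks = ['Bounty', 'Bounty', 'Mars', 'Bounty', 'Snickers',
         'Snickers', 'Snickers', 'Mars', 'Snickers', 'Bounty']

prior = Fraction(1, 2)  # both bowls are equally likely to be selected
l1 = sequence_likelihood(bowl1, picks)
l2 = sequence_likelihood(bowl2, picks)
posterior_bowl1 = prior * l1 / (prior * l1 + prior * l2)
posterior_bowl2 = 1 - posterior_bowl1
print(float(posterior_bowl1), float(posterior_bowl2))
```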


In [None]:
### THIS CELL CONTAINS AUTOMATED TESTS OF YOUR SOLUTION; DO NOT DELETE IT!
