# Dealing with multiple random variables

In the previous exercises we have dealt mostly with single random variables.  In fact, the only exercise where we really dealt with pairs of random variables was the one on Bayes' theorem.  In that exercise we had two Bernoulli random variables $X$ and $Y$, and we noted that it was important to understand that the questions gave us conditional probabilities such as:

$$
P(X=1|Y=0) = \frac{P(X=1 \wedge Y=0)}{P(Y=0)}
$$

In other words, in the Bayes' theorem exercise it was important <b>to understand that $Y$ being equal to zero had an effect on the probability that $X=1$</b>.  In more precise statistical terms, the values of the random variables $X$ and $Y$ are correlated.
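To make the definition above concrete, here is a minimal sketch that estimates $P(X=1|Y=0)$ by counting, exactly as the fraction suggests: the number of samples with $X=1$ and $Y=0$ divided by the number of samples with $Y=0$.  The function names and the probabilities 0.5, 0.2 and 0.8 are illustrative assumptions, not values taken from the Bayes' theorem exercise.

```python
import random


def sample_pair(rng):
    """Draw one (X, Y) pair of dependent Bernoulli variables.

    The probabilities used here are made up for illustration.
    """
    y = 1 if rng.random() < 0.5 else 0
    # The chance that X = 1 depends on the value Y took.
    p_x = 0.2 if y == 0 else 0.8
    x = 1 if rng.random() < p_x else 0
    return x, y


def estimate_conditional(n, seed=0):
    """Estimate P(X=1 | Y=0) as count(X=1 and Y=0) / count(Y=0)."""
    rng = random.Random(seed)
    n_y0 = 0
    n_x1_and_y0 = 0
    for _ in range(n):
        x, y = sample_pair(rng)
        if y == 0:
            n_y0 += 1
            if x == 1:
                n_x1_and_y0 += 1
    return n_x1_and_y0 / n_y0


estimate = estimate_conditional(100_000)
print(estimate)  # should land near 0.2, the value built into sample_pair
```

Because the probability that $X=1$ was chosen to differ between $Y=0$ and $Y=1$, the estimate differs from the overall fraction of samples with $X=1$, which is what it means for $Y$ to have an effect on $X$.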

There are many branches of statistics that are concerned with quantifying the degree to which random variables are correlated (see https://www.youtube.com/watch?v=pR-RwDXCqe0).  Furthermore, you studied this topic in the statistics modules you did last year. In this module we are not overly concerned with assessing degrees of correlation.  It is nevertheless important that we understand what it means when we state that two variables are correlated and what it means when we state that two variables are independent.  In this exercise we will thus do some simple exercises to visualize correlated and independent random variables.       
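As a rough illustration of the difference between independent and correlated variables, the sketch below computes the Pearson correlation coefficient for two sets of pairs.  The text above does not name a particular measure, so using Pearson's coefficient is an assumption, as are the function name `pearson`, the sample size and the noise width of 0.2.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
xs = [rng.uniform(0, 1) for _ in range(5000)]

# Y drawn without reference to X: independent of X.
independent_ys = [rng.uniform(0, 1) for _ in xs]
# Y built from X plus a little noise: correlated with X.
correlated_ys = [x + rng.uniform(-0.2, 0.2) for x in xs]

r_independent = pearson(xs, independent_ys)
r_correlated = pearson(xs, correlated_ys)
print(r_independent, r_correlated)  # first value near 0, second near 1
```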

As always, there is some code below that will allow you to produce the plots we need for this exercise.  Press shift and enter on this cell now in order to load this code.

In [44]:
import math
%matplotlib notebook
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.animation as anim
from matplotlib import rc
from IPython.display import HTML
import random
rc('animation', html='none')

class plotobj(object) :
    def __init__(self, ngen, xlist, ylist ) :
        self.ngen=ngen
        self.xlist, self.ylist = xlist, ylist
        self.fig = plt.figure()
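        # Fall back to the unit interval when no outline is supplied (empty ylist)
        self.ax = plt.axes(xlim=(0, 1.0), ylim=(min(ylist, default=0.0), max(ylist, default=1.0)))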
    
    def setup(self):
        self.xdata=[]
        self.ydata=[]
        boline, = self.ax.plot(self.xlist,self.ylist,'r-')    
        self.line, = self.ax.plot([],[],'.')
        return self.line,
    
    def run(self,data):
        x,y = data
        self.xdata.append(x)
        self.ydata.append(y)
        self.line.set_data(self.xdata, self.ydata )
        return self.line,
    
def raw( ngen, myvar ):
    cnt = 0
    while cnt < ngen :
        cnt += 1
        yield myvar() 
        
            
def dynamicplot( ngen, myvar, xlist, ylist ):
    myplot = plotobj( ngen, xlist, ylist )
    return anim.FuncAnimation(fig=myplot.fig, func=myplot.run, frames=raw( ngen, myvar ), 
                                init_func=myplot.setup, interval=10, blit=False, repeat=False)

# Generating pairs of (independent) random variables

The purpose of this exercise is to generate random variables that are correlated to various degrees.  Before we get on to that, however, we need to understand how to generate pairs of random variables and plot them.  The cell below shows you how to generate two <b>independent</b>, uniform random variables $X$ and $Y$.  These are then returned, and a dynamic plotting function is used to plot these random values.  This dynamic plot function draws a dot at the point with coordinates $(X,Y)$, where $X$ and $Y$ are the values of our random variables.  This process is repeated multiple times, and we see that the points are scattered throughout the whole of the space.  Press shift and enter on the cell below now in order to see what I mean.

In [20]:
def myvar():
    X = random.uniform(0,1)
    Y = random.uniform(0,1)
    return X,Y

dynamicplot( 200, myvar, [], [] )


In order to see if you have understood how the above function works, try the exercise below.  I would like you to write a function that generates variables that are uniformly distributed but that lie inside the red box shown in the plane below.

In [16]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def myfunc():
    return 0,0

In [26]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,1,1,0,0], [0.2,0.2,0,0,0.2]
dynamicplot( 200, myfunc, xboxv, yboxv )
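
One possible way to approach boxes like these, offered as a sketch rather than the intended solution: `random.uniform(a, b)` draws a number uniformly between `a` and `b`, so choosing the limits for each coordinate separately confines the points to an axis-aligned rectangle.  The helper `uniform_in_box` and the wrapper `first_box` are hypothetical names, not part of the code loaded above.

```python
import random


def uniform_in_box(xmin, xmax, ymin, ymax):
    """Return a point drawn uniformly from the rectangle
    [xmin, xmax] x [ymin, ymax].  X and Y are drawn independently."""
    x = random.uniform(xmin, xmax)
    y = random.uniform(ymin, ymax)
    return x, y


def first_box():
    """Points in the box with 0 <= x <= 1 and 0 <= y <= 0.2."""
    return uniform_in_box(0.0, 1.0, 0.0, 0.2)


points = [first_box() for _ in range(1000)]
print(points[:3])
```

A function such as `first_box` takes no arguments and returns an `(x, y)` pair, which is the same shape as `myvar`, so it should be usable wherever the plotting cell expects `myfunc`.  The other boxes in this section differ only in the four limits.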


Now write a program to generate random numbers in the red box shown here:

In [27]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [28]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,0,0.2,0.2,0], [0,1,1,0,0]
dynamicplot( 200, mynfunc, xboxv, yboxv )


Now what about this box here:

In [29]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [32]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,0,0.6,0.6,0], [0,0.4,0.4,0,0]
dynamicplot( 200, mynfunc, xboxv, yboxv )


And this one:

In [33]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [34]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0.2,0.2,0.8,0.8,0.2], [0,0.4,0.4,0,0]
dynamicplot( 200, mynfunc, xboxv, yboxv )


And last but not least.  What about this one:

In [35]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [36]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0.2,0.2,0.8,0.8,0.2], [0.2,0.6,0.6,0.2,0.2]
dynamicplot( 200, mynfunc, xboxv, yboxv )


# Linearly correlated random variables

Let's make things more difficult now by introducing correlation.  Can you generate uniform random variables between the red lines shown here?  Hint: if your first random variable were called $X$, where would points be plotted if your function returned $X$,$X$?

In [37]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [45]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,1.2,1,0], [-0.2,1.0,1.2,0.2]
dynamicplot( 200, mynfunc, xboxv, yboxv )
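
A sketch of the idea behind the hint, assuming the red lines are $y = x - 0.2$ and $y = x + 0.2$ (read off from the corner coordinates in `xboxv` and `yboxv`): returning $X$,$X$ puts every point on the diagonal, and adding an independent uniform offset to the second coordinate spreads the points into a band around it.  The name `linear_band` and its parameters are hypothetical.

```python
import random


def linear_band(slope=1.0, intercept=0.0, half_width=0.2):
    """Return (X, Y) with X uniform on [0, 1] and
    Y = slope*X + intercept + noise, noise uniform on [-half_width, half_width].

    The band has the same vertical thickness at every x, so the points
    are spread evenly over the region between the two lines.
    """
    x = random.uniform(0.0, 1.0)
    y = slope * x + intercept + random.uniform(-half_width, half_width)
    return x, y


points = [linear_band() for _ in range(1000)]
print(points[:3])
```

Unlike the boxes in the previous section, the value of $Y$ here depends on the value of $X$, which is what makes the two variables correlated.  Changing `slope` and `intercept` moves and tilts the band.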


Can you do it now?

In [40]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [46]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,1.0,1,0], [-0.2,0.3,0.7,0.2]
dynamicplot( 200, mynfunc, xboxv, yboxv )


And this one:

In [None]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [53]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv, yboxv = [0,1.0,1,0], [0.3,0.8,1.2,0.7]
dynamicplot( 200, mynfunc, xboxv, yboxv )


# Non-linearly correlated variables

Now try to generate uniformly distributed random variables in the part of the plane that is contained within the red lines.  Hint: the red lines shown here are the functions:

$$
y_1 = x^2 - 0.2 \qquad \qquad \qquad y_2 = x^2 + 0.2
$$

In [None]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [48]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv,yboxv = [],[]
for i in range(0,201):
    x = i*1./200.
    xboxv.append(x)
    yboxv.append(x*x+0.2)
    
for i in range(0,200):
    x = 1. - i*1./200.
    xboxv.append(x)
    yboxv.append(x*x-0.2)

dynamicplot( 200, mynfunc, xboxv, yboxv )


And the last one: how do we generate random numbers in this region?  Hint: the red lines here are plots of the functions:

$$
y_1 = \sin(4x) - 0.1 \qquad \qquad \qquad y_2 = \sin(4x) + 0.1
$$

In [49]:
# You need to modify this function as at the moment 
# it just returns 0,0 which is not very random
def mynfunc():
    return 0,0

In [52]:
# Once you are convinced you have a working version of the function above run
# this cell again
xboxv,yboxv = [],[]
for i in range(0,201):
    x = i*1./200.
    xboxv.append(x)
    yboxv.append(math.sin(4*x)+0.1)
    
for i in range(0,200):
    x = 1. - i*1./200.
    xboxv.append(x)
    yboxv.append(math.sin(4*x)-0.1)

dynamicplot( 200, mynfunc, xboxv, yboxv )
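
One way to think about both non-linear regions, given as a sketch under the assumption that each region is the set of points within a fixed vertical distance of a curve $y = f(x)$: draw $X$ uniformly, evaluate the curve at $X$, and add an independent uniform offset.  The helper `band_around` and the names `parabola_band` and `sine_band` are hypothetical.

```python
import math
import random


def band_around(curve, half_width):
    """Build a generator function for points near y = curve(x).

    The returned function takes no arguments and returns (X, Y) with
    X uniform on [0, 1] and Y = curve(X) + noise, where the noise is
    uniform on [-half_width, half_width].
    """
    def draw():
        x = random.uniform(0.0, 1.0)
        y = curve(x) + random.uniform(-half_width, half_width)
        return x, y
    return draw


# Region between x^2 - 0.2 and x^2 + 0.2.
parabola_band = band_around(lambda x: x * x, 0.2)
# Region between sin(4x) - 0.1 and sin(4x) + 0.1.
sine_band = band_around(lambda x: math.sin(4 * x), 0.1)

print(parabola_band(), sine_band())
```

Because the offset has the same range at every value of $x$, the points are spread evenly over the region between the two curves, even though $Y$ is no longer linearly related to $X$.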


If you have got this far, well done.  Now, in your notes, explain how you would assess whether two random variables $X$ and $Y$ are related via one of the following three equations:

$$
\begin{aligned}
Y & = mX + c \\
Y & = mX^2 + c \\
Y & = m\sin(X) + c
\end{aligned}
$$

where in these expressions $m$ and $c$ are parameters, which you would identify by fitting the model
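
One possible approach, sketched here under several assumptions: each of the three models is linear in a transformed variable ($X$, $X^2$ or $\sin(X)$), so an ordinary least-squares straight-line fit of $Y$ against the transformed variable gives $m$ and $c$, and the squared correlation $R^2$ indicates how well that model describes the data.  The function `fit_line`, the synthetic data and the true values $m=2$, $c=0.5$ are all made up for illustration.

```python
import math
import random


def fit_line(us, ys):
    """Least-squares fit of y = m*u + c.  Returns (m, c, r_squared)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    m = s_uy / s_uu
    c = mean_y - m * mean_u
    r_squared = s_uy * s_uy / (s_uu * s_yy)
    return m, c, r_squared


# The three candidate transformations of X.
candidates = {
    "X": lambda x: x,
    "X^2": lambda x: x * x,
    "sin(X)": math.sin,
}

# Synthetic data that really does follow Y = 2 sin(X) + 0.5 plus small noise.
rng = random.Random(2)
xs = [rng.uniform(0, 3) for _ in range(2000)]
ys = [2.0 * math.sin(x) + 0.5 + rng.uniform(-0.1, 0.1) for x in xs]

results = {name: fit_line([g(x) for x in xs], ys)
           for name, g in candidates.items()}
best = max(results, key=lambda name: results[name][2])

for name, (m, c, r2) in results.items():
    print(name, round(m, 3), round(c, 3), round(r2, 3))
print("best model:", best)
```

The model whose fit has $R^2$ closest to one is the most plausible of the three for the data at hand; a scatter plot of $Y$ against the transformed variable that looks like a straight line tells the same story visually.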