# Convolving Classes

In this notebook, we will continue to explore classes by starting the creation of a ``ConvolvedDiscreteAndDiscrete`` class, per problem set 1. As a reminder, classes are Python _objects_ that organize information (**attributes**) and operations (**methods**). A class is defined in the abstract and is then given particular values upon **instantiation**. Classes pop up all over the place in Python programming; whether or not you write them yourself, you cannot avoid using them through the packages you know and love, like ``numpy`` or ``scikit-learn``. Let's remember the anatomy of a Python class by taking a look at the ``ConvolvedContinuousAndDiscrete`` class that Ethan created in class.

In [None]:
# first, some package imports
from scipy.stats import distributions as iid

In [None]:
# Reproduction of Ethan's class
class ConvolvedContinuousAndDiscrete(iid.rv_continuous):
    """Convolve (add) a continuous rv x and a discrete rv s,
       returning the resulting cdf."""

    def __init__(self,f,s):
        self.continuous_rv = f
        self.discrete_rv = s
        super(ConvolvedContinuousAndDiscrete, self).__init__(name="ConvolvedContinuousAndDiscrete")
        
    def _cdf(self,z):
        F=0
        s = self.discrete_rv
        x = self.continuous_rv
        
        for k in range(len(s.xk)):
            F = F + x.cdf(z-s.xk[k])*s.pk[k]
        return F

    def _pdf(self,z):
        f=0
        s = self.discrete_rv
        x = self.continuous_rv
        
        for k in range(len(s.xk)):
            f = f + x.pdf(z-s.xk[k])*s.pk[k]
        return f


## Class anatomy 1: Inheritance

Why does the class definition take ``iid.rv_continuous`` as an argument, and what's that line that starts with ``super()`` in the constructor?

This class is doing something rather advanced in Object-Oriented Programming (OOP), which is called **inheritance**. Inheritance is when a class takes methods and attributes from a "parent" class and builds on them. For example, we could conceive of a ``Vehicles()`` class with attributes like horsepower or gas mileage, and then a ``Truck()`` class that inherits these attributes from ``Vehicles()``, then adds more information like the volume of the truck bed. The ``super()`` line in the constructor calls the parent class's own constructor, so the parent gets set up properly before we add anything. We won't worry too much about inheritance here since it is a fairly advanced technique, but I want to show you what it's good for. Let's try something...
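Before trying it on the real class, here is a toy sketch of the vehicle example. The attribute names, the ``range_in_miles`` method, and the numbers are all made up for illustration; the only point is that ``Truck`` can use a method it never defines itself.

```python
class Vehicles:
    """Toy parent class: attributes and methods shared by every vehicle."""

    def __init__(self, horsepower, miles_per_gallon):
        self.horsepower = horsepower
        self.miles_per_gallon = miles_per_gallon

    def range_in_miles(self, gallons):
        """Distance the vehicle can cover on the given amount of fuel."""
        return self.miles_per_gallon * gallons


class Truck(Vehicles):
    """Child class: inherits everything from Vehicles and adds a truck bed."""

    def __init__(self, horsepower, miles_per_gallon, bed_volume):
        # Let the parent constructor set the shared attributes first
        super().__init__(horsepower, miles_per_gallon)
        # Then add the attribute that only trucks have
        self.bed_volume = bed_volume


my_truck = Truck(horsepower=300, miles_per_gallon=20, bed_volume=60)
# range_in_miles is defined on Vehicles, not on Truck
truck_range = my_truck.range_in_miles(10)
print(truck_range)
```

The same mechanism is at work in Ethan's class, just with ``iid.rv_continuous`` playing the role of the parent.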

In [None]:
# First, create a continuous and a discrete distribution to feed into the class
Omega = (-1,0,1)
Pr = (1/3.,1/2.,1/6.)

my_discrete_distrib = iid.rv_discrete(values=(Omega,Pr))
my_continuous_distrib = iid.norm()

In [None]:
# instantiate the class
my_convolution = ConvolvedContinuousAndDiscrete(my_continuous_distrib, my_discrete_distrib)
# call a method: median
my_convolution.median()

Whaaat? How was I able to call the ``median()`` method?? I was able to because of inheritance! ``median()`` is a method of ``iid.rv_continuous``, and since this class inherits all the methods from that class, I can use them here too! Now let's get back to the stuff we are more familiar with.

## Class anatomy 2: Attributes
Test your knowledge:
- How many attributes does this class have (not counting inherited attributes)? What are they? 

What data type is the ``continuous_rv`` attribute of ``my_convolution``? In the next cell, print the median of the continuous random variable used to instantiate ``my_convolution``.

## Class anatomy 3: Methods
Remember, methods are functions that belong to a class. 
- How many methods does this class have (again, ignore any inherited methods)?

In the next cell, print the CDF of ``my_convolution`` at ``x = 2``

## Under a microscope: What is this class doing?
Let's look closer at the ``_cdf`` method of this new class. What is it doing? First, we need some math. Let $G()$ be the CDF of the continuous variable, and let $P(x_n)$ be the probability of a point $x_n$ in the discrete distribution. Then the CDF of the sum of the two variables (their convolution) is given by:

$$F(z) = \sum_{n} P(x_n)G(z-x_n)$$

The intuition is the following:

Suppose you draw a random number from the discrete distribution, then draw another one from the continuous distribution and add the two. Conditional on a draw $x_n$ from the discrete distribution, the probability that your sum is $\leq z$ is the cumulative probability that you draw a number $\leq z-x_n$ from $G()$. Adding up these conditional probabilities, each multiplied by the probability of drawing the corresponding $x_n$, gives the unconditional cumulative probability $F(z)$.
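If the intuition feels slippery, a quick simulation can help. The sketch below is not part of Ethan's class. It assumes the same three-point discrete distribution and standard normal used in this notebook, literally draws and adds the two variables, and compares the share of sums $\leq z$ with the formula. The seed, the number of draws, and the variable names are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
n_draws = 200_000                # arbitrary sample size

support = np.array([-1, 0, 1])
probs = np.array([1/3, 1/2, 1/6])

# Draw from the discrete distribution, then from the continuous one, and add
discrete_draws = rng.choice(support, size=n_draws, p=probs)
continuous_draws = rng.standard_normal(n_draws)
sums = discrete_draws + continuous_draws

z = 2.0
# Share of simulated sums that are <= z
simulated_cdf = np.mean(sums <= z)
# F(z) = sum_n P(x_n) * G(z - x_n), with G the standard normal CDF
formula_cdf = np.sum(probs * stats.norm.cdf(z - support))

print(simulated_cdf, formula_cdf)
```

The two numbers should agree up to simulation noise, which shrinks as ``n_draws`` grows.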

So in code, what elements do we need to implement this formula?
1. The function $G()$ - we will access this through the iid.rv_continuous.cdf() method
2. The probabilities at each $x_n$ of the discrete distribution - we will access these through an attribute of the iid.rv_discrete() class
3. The points $x_n$ of the discrete distribution - we will access these through an attribute of the iid.rv_discrete() class

Let's see how to access these:

In [None]:
# The CDF method from our continuous variable calculated at 0
my_continuous_distrib.cdf(0)

In [None]:
# The support of the discrete distribution
my_discrete_distrib.xk

In [None]:
# the probabilities of each point in the support
my_discrete_distrib.pk

Now that we have these elements, let's see how Ethan implements this:

In [None]:
def _cdf(self,z):
    # Start at 0
    F=0
    # Grab our random variables via self.
    s = self.discrete_rv
    x = self.continuous_rv
    
    # for each k = 0, 1, 2, ... to the end of the s.xk vector
    for k in range(len(s.xk)):
        # (starting at 0...) iteratively add P(x_n)*CDF(z-x_n)
        F = F + x.cdf(z-s.xk[k])*s.pk[k]
    return F
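As an aside, the same sum can be written without indexing by ``k``. The sketch below is a standalone function rather than a method, and the function names are hypothetical. It assumes a scalar ``z`` and relies only on the ``xk`` and ``pk`` attributes and the ``cdf`` method shown above. The loop version is included to check that the two agree.

```python
import numpy as np
from scipy import stats

discrete_rv = stats.rv_discrete(values=((-1, 0, 1), (1/3, 1/2, 1/6)))
continuous_rv = stats.norm()


def convolved_cdf(z, continuous_rv, discrete_rv):
    """F(z) = sum_n P(x_n) * G(z - x_n) for a scalar z, using array arithmetic."""
    return np.sum(discrete_rv.pk * continuous_rv.cdf(z - discrete_rv.xk))


def convolved_cdf_loop(z, continuous_rv, discrete_rv):
    """Same sum, written as a loop over (point, probability) pairs."""
    total = 0.0
    for x_n, p_n in zip(discrete_rv.xk, discrete_rv.pk):
        total += p_n * continuous_rv.cdf(z - x_n)
    return total


vectorized_value = convolved_cdf(2.0, continuous_rv, discrete_rv)
loop_value = convolved_cdf_loop(2.0, continuous_rv, discrete_rv)
print(vectorized_value, loop_value)
```

Either style works; the loop mirrors the formula term by term, while the array version is shorter once you are comfortable with ``numpy``.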

## Now moving towards a ConvolvedDiscreteAndDiscrete
To create this class, let's start by copying the shell of Ethan's class. First, we need to think about instantiation.
1. What inputs should we require to this class upon instantiation?
2. What kind of class should we inherit from?
3. What attributes will we want to define?

In [None]:
class ConvolvedDiscreteAndDiscrete(): # no inherited type yet!
    """Convolve (add) a discrete rv x and a discrete rv s,
       returning the resulting cdf."""

    def __init__(self): # no inputs yet!
        print("No constructor yet!")
        
    def _cdf(self,z):
        print("No cdf yet!")

    def _pdf(self,z):
        print("No pdf yet!")


## The PMF function

Discrete random variables have a pmf, or probability mass function: the probability that $Z = Y + X$ equals a particular value $z$. Again, some intuition. Let's say you drew a value $x$ from the $X$ distribution. Conditional on this draw, the probability that you draw a $y$ from $Y$ such that $x+y = z$ is:

$$P(x+y = z | x) = P(y= z-x)$$

Again we'll use the fact that we can add up conditional probabilities, times the probability of the condition, to get the unconditional probability. Now let $P_x(x)=P(X=x)$ and $P_y(y) = P(Y=y)$. Then:

$$P(x+y = z) = \sum_{n} P_x(x_n)P_y(z-x_n)$$

So I am going to replace the pdf function with a pmf function (thanks to Gary for pointing out the incorrect nomenclature on a previous version of this). In code:


In [None]:
class ConvolvedDiscreteAndDiscrete(iid.rv_discrete):
    """Convolve (add) a discrete rv x and a discrete rv s,
       returning the resulting cdf."""

    def __init__(self,X,Y):
        self.X = X
        self.Y = Y
        super(ConvolvedDiscreteAndDiscrete, self).__init__(name="ConvolvedDiscreteAndDiscrete")

        
    def pmf(self,z):
        print("No pmf yet!")
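To see the pmf formula in action before wiring it into the class, here is one possible sketch written as a standalone function. It is not the official solution to the problem set. The function name ``convolved_pmf`` and the second distribution ``Y`` are made up for illustration, and the function assumes a scalar ``z``.

```python
from scipy import stats

X = stats.rv_discrete(values=((-1, 0, 1), (1/3, 1/2, 1/6)))
Y = stats.rv_discrete(values=((0, 1), (1/2, 1/2)))  # hypothetical second variable


def convolved_pmf(z, X, Y):
    """P(X + Y = z) = sum_n P_x(x_n) * P_y(z - x_n) for a scalar z."""
    total = 0.0
    for x_n, p_n in zip(X.xk, X.pk):
        # Y.pmf returns 0 whenever z - x_n is not in the support of Y
        total += p_n * Y.pmf(z - x_n)
    return total


# The sum X + Y can only take the values -1, 0, 1, 2
pmf_values = {z: convolved_pmf(z, X, Y) for z in range(-1, 3)}
print(pmf_values)
print(sum(pmf_values.values()))
```

A useful sanity check for any pmf you write is that the probabilities over the whole support add up to 1, which is what the last line prints.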