### Happy Birthday, Raf! I hope these functions spark joy.

![happy_little_bday.png](attachment:happy_little_bday.png)

# Assessments

* write a function to find the mean of a list of numbers
* write a function that calculates a dot product
* write a function that centers an array on the mean 
* write a function to calculate the standard deviation of a list of numbers (preferably using a dot product)
* write a function to calculate the correlation and covariance of two lists
* write a function to calculate the cost between ytrue and ypred
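
The first item on the list never gets its own cell below (the mean is computed inline inside the other functions), so here is a minimal sketch of what it could look like. The name `mean` and the empty-list check are assumptions for illustration, not part of the original notebook.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        # assumption: treat an empty list as an error rather than returning 0
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


print(mean([88, 81, 7, 63, 52]))  # 58.2
```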

In [59]:
from math import sqrt
from sklearn.metrics import mean_squared_error
import numpy as np
import random

In [27]:
lst1 = random.sample(range(1, 100), 10) 
lst2 = random.sample(range(1, 100), 10)
lst1[:5], lst2[:5]

([88, 81, 7, 63, 52], [25, 14, 60, 83, 87])

In [28]:
def mean_center(lst1):
    """
    write a function that takes in a list
    and returns that list but centered on the mean
    
    in other words x -> (x - mu)
    """
    mean = sum(lst1)/len(lst1)
    # note: rounding to 3 places keeps the printed output tidy,
    # but it costs precision for anything computed downstream
    return [round(item - mean, 3) for item in lst1]


def dot_product(lst1, lst2):
    """
    write a function that takes in two lists
    of numbers and returns their dot product
    """
    # multiply the lists pairwise, then sum the products
    return sum(a*b for a, b in zip(lst1, lst2))

In [29]:
mean_center(lst1)

[27.7, 20.7, -53.3, 2.7, -8.3, 34.7, 38.7, -23.3, -18.3, -21.3]

In [52]:
# check (me vs. numpy)
dot_product(lst1, lst2), np.dot(lst1, lst2)

(33166, 33166)

# Standard Deviation Formula
![](images/standard-deviation.png)

In [40]:
def standard_deviation(lst1):
    """
    write a function that takes in a list
    of numbers and returns its standard deviation
    """
    # population standard deviation (divide by n), same as np.std's default
    mean = sum(lst1)/len(lst1)
    squared_devs = [(x - mean)**2 for x in lst1]
    return sqrt(sum(squared_devs)/len(lst1))

In [49]:
# check (me vs. numpy)
standard_deviation(lst1), np.std(lst1)

(28.54137347781287, 28.54137347781287)
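
The assessment list asks for the standard deviation "preferably using a dot product". One way to read that: center the list on its mean, then take the dot product of the centered list with itself, which is the sum of squared deviations. A rough sketch under that reading (the name `std_via_dot` is made up for this example, and it divides by n, so it is the population version):

```python
from math import sqrt


def std_via_dot(values):
    """Population standard deviation computed with a dot product."""
    n = len(values)
    mu = sum(values) / n
    centered = [v - mu for v in values]
    # dot(centered, centered) is the sum of squared deviations
    sum_sq = sum(a * b for a, b in zip(centered, centered))
    return sqrt(sum_sq / n)


print(std_via_dot([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

For that sample list the mean is 5 and the squared deviations add up to 32, so the result is sqrt(32/8) = 2.0.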

# Covariance Formula
![](images/covariance.png)

In [42]:
def covariance(lst1, lst2):
    """
    write a function that takes in two lists
    of numbers and returns their covariance
    """
    n1 = mean_center(lst1)
    n2 = mean_center(lst2)
    return dot_product(n1, n2)/(len(lst1) - 1)

In [51]:
# check (me vs. numpy)
covariance(lst1, lst2), np.cov(lst1, lst2)[0][1]

(-375.0888888888889, -375.08888888888885)

# Correlation Formula
![](images/correlation.png)

In [57]:
def correlation(lst1, lst2):
    """
    write a function that takes in two lists
    of numbers and returns their correlation
    """
    # calc numerator as in covariance above
    mn1 = mean_center(lst1)
    mn2 = mean_center(lst2)
    numer = dot_product(mn1, mn2)
    
    # calc denominator from the sums of squares of the mean-centered values
    d1 = sum(x**2 for x in mn1)
    d2 = sum(x**2 for x in mn2)
    denom = sqrt(d1*d2)
    return numer/denom

In [58]:
# check (me vs. numpy)
correlation(lst1, lst2), np.corrcoef(lst1, lst2)[0][1]

(-0.4912727399578928, -0.4912727399578928)
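
Correlation can also be read as covariance rescaled by the two standard deviations. The sketch below shows that relationship with self-contained helpers; `sample_cov` and `corr_from_cov` are hypothetical names for this example. It uses the sample (n - 1) convention throughout, so the n - 1 factors cancel in the ratio.

```python
from math import sqrt


def sample_cov(xs, ys):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def corr_from_cov(xs, ys):
    """Correlation as covariance divided by the product of the standard deviations."""
    # cov(x, x) is the variance of x, so the sqrt of the product is std_x * std_y
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]            # exactly 2 * xs
neg = [-y for y in ys]           # exactly -2 * xs
print(corr_from_cov(xs, ys))     # perfectly linear, so this is 1 (up to float error)
print(corr_from_cov(xs, neg))    # flipped sign, so this is -1 (up to float error)
```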

# RMSE Formula
![](images/rmse.png)

In [104]:
def rmse(ytrue, ypred):
    """
    write a function that takes in ytrue and ypred
    and returns their root mean squared error
    """
    n = len(ytrue)
    errors = []
    for i in range(n):
        error = (ypred[i] - ytrue[i])**2
        errors.append(error)
    return sqrt(sum(errors)/n)

        

In [110]:
# check (me vs. sklearn)
ytrue = random.sample(range(2, 75), 20) 
ypred = random.sample(range(1, 100), 20)
rmse(ytrue, ypred), sqrt(mean_squared_error(ytrue, ypred))

(40.88581661163196, 40.88581661163196)

# RSS Formula 
![](images/rss.png)

In [111]:
def rss(ytrue, ypred):
    """
    write a function that takes in ytrue and ypred
    and returns their rss
    """
    n = len(ytrue)
    errors = []
    for i in range(n):
        error = (ytrue[i] - ypred[i])**2
        errors.append(error)
    return sum(errors)

In [112]:
# check (me vs. sklearn)
rss(ytrue, ypred), (mean_squared_error(ytrue, ypred)*len(ytrue))

(33433, 33433.0)
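
The check above leans on the fact that RSS is just MSE times n, and RMSE is the square root of MSE. A small sketch that ties the three together on toy numbers (the helper name `sum_sq_errors` and the data are invented for illustration):

```python
from math import sqrt


def sum_sq_errors(ytrue, ypred):
    """Residual sum of squares between two equal-length lists."""
    return sum((t - p) ** 2 for t, p in zip(ytrue, ypred))


ytrue = [3, 5, 7, 9]
ypred = [2, 5, 9, 9]

total = sum_sq_errors(ytrue, ypred)   # 1 + 0 + 4 + 0
mse = total / len(ytrue)              # mean squared error
root = sqrt(mse)                      # root mean squared error

print(total)  # 5
print(mse)    # 1.25
```

Squaring `root` and multiplying by `len(ytrue)` gets back to `total`, up to floating-point error.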