# Convert common stats formulas to numpy

Please convert the following common statistics formulas to numpy. Each formula comes with a plain Python implementation, which you can use to confirm that your solution is correct. Your task is to reproduce each result using standard numpy techniques.

You are not allowed to use `.mean()`, `np.average()`, `.std()` or `.var()` in the standard deviation and variance solutions. With them, numpy makes many of these formulas too easy to write, which defeats the purpose of practising thinking like an array programmer.

In [40]:
import math
import numpy as np

#### Sum
$\sum _{i=1}^{n}a_{i}$

In [41]:
np.random.seed(1)
scores_sum = np.random.randint(0, 100, size=25)

In [42]:
sum(scores_sum)

np.int32(1007)

In [43]:
total = 0
for i in scores_sum:
    total += i

total


np.int32(1007)

In [44]:
#Answer
total = np.add.reduce(scores_sum)
total

np.int64(1007)
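
For comparison, `np.add.reduce` is, as far as I know, the ufunc reduction that `np.sum` and the `.sum()` method rely on, so all three should agree with the loop above. A minimal sketch, reusing the seed and array from the cells above; the `total_*` names are illustrative only:

```python
import numpy as np

np.random.seed(1)
scores_sum = np.random.randint(0, 100, size=25)

# Three spellings of the same reduction over a 1-D array.
total_reduce = np.add.reduce(scores_sum)
total_func = np.sum(scores_sum)
total_method = scores_sum.sum()

# Pure Python reference: convert to Python ints first so the built-in
# sum() adds arbitrary-precision integers instead of fixed-width ones.
total_python = sum(scores_sum.tolist())

print(total_reduce, total_func, total_method, total_python)
```

Converting with `.tolist()` before the built-in `sum()` keeps the reference independent of the array's integer width, which can differ between platforms.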

#### Expected value (equally likely)
${\frac {1}{n}}\sum _{i=1}^{n}a_{i}$

In [45]:
np.random.seed(2)
scores_ev = np.random.randint(0, 100, size=25)

In [46]:
total = 0
for i in scores_ev:
    total += i

expected_value = total/len(scores_ev)
expected_value

np.float64(48.92)

In [47]:
#Answer
total = np.add.reduce(scores_ev)
expected_value = total/np.size(scores_ev)
expected_value

np.float64(48.92)
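
One way to double-check the answer is to compare it with `np.mean` using a tolerance. This is a sketch only: `np.mean` appears here purely as an independent check, not as the exercise answer, and the name `reference` is illustrative.

```python
import numpy as np

np.random.seed(2)
scores_ev = np.random.randint(0, 100, size=25)

# Answer-style computation: reduce, then divide by the element count.
expected_value = np.add.reduce(scores_ev) / np.size(scores_ev)

# Independent check only; not meant as the exercise answer.
reference = np.mean(scores_ev)

# Compare floats with a tolerance rather than with ==.
print(expected_value, np.isclose(expected_value, reference))
```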

#### Sum of squares

$\sum _{i=1}^{n}(y_{i} - \hat{y}_{i})^2$

In [48]:
np.random.seed(3)

scores_ssq = np.random.randint(0, 100, size=25)
scores_yhat_ssq = np.random.randint(0, 100, size=25)

In [49]:
total = 0
for y, y_hat in zip(scores_ssq, scores_yhat_ssq):
    total += (y - y_hat) ** 2

total

np.int32(41573)

In [50]:
#Answer
differences = np.subtract(scores_ssq, scores_yhat_ssq)
sum_square_differences = np.add.reduce(np.square(differences))
sum_square_differences


np.int64(41573)
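
The same calculation can be written with arithmetic operators, or as a dot product of the difference vector with itself. A minimal sketch under the same seed as above; `ssq_operators` and `ssq_dot` are illustrative names:

```python
import numpy as np

np.random.seed(3)
scores_ssq = np.random.randint(0, 100, size=25)
scores_yhat_ssq = np.random.randint(0, 100, size=25)

# Arithmetic operators on arrays are elementwise, like the ufuncs above.
differences = scores_ssq - scores_yhat_ssq
ssq_operators = np.add.reduce(differences ** 2)

# The sum of squares is also the dot product of the differences with themselves.
ssq_dot = np.dot(differences, differences)

print(ssq_operators, ssq_dot)
```

With values below 100 the squares stay far from the integer limits; for much larger integers it may be safer to cast to a wider or floating-point dtype before squaring.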

#### Variance

${\frac{1}{N}\sum\limits_{i = 1}^N {\left( {x_i - \bar x} \right)^2 } }$

In [51]:
np.random.seed(4)
scores_var = np.random.randint(0, 100, size=25)

In [52]:
#Calculate mean
total = 0
for i in scores_var:
    total += i

mean = total / len(scores_var)

total = 0
for i in scores_var:
    total += (i - mean)**2
    
variance = total / len(scores_var)
variance

np.float64(616.4864000000001)

In [53]:
# Answer: compute the mean first, then the mean of the squared deviations
n = np.size(scores_var)
mean = np.add.reduce(scores_var) / n
variance = np.add.reduce(np.square(np.subtract(scores_var, mean))) / n
variance

np.float64(616.4864)
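
The loop and the array version above differ in the last digits (616.4864000000001 vs 616.4864), most likely because the floating-point additions are carried out in a different order. A sketch of wrapping the calculation in a function and comparing the two results with a tolerance; `population_variance` is a hypothetical helper name, not part of numpy:

```python
import math
import numpy as np

def population_variance(values):
    """Mean squared deviation from the mean (divides by N, not N - 1)."""
    values = np.asarray(values)
    n = np.size(values)
    mean = np.add.reduce(values) / n
    return np.add.reduce(np.square(values - mean)) / n

np.random.seed(4)
scores_var = np.random.randint(0, 100, size=25)

# Pure Python reference, mirroring the loop version above.
data = scores_var.tolist()
mean = sum(data) / len(data)
variance_loop = sum((x - mean) ** 2 for x in data) / len(data)

variance_np = population_variance(scores_var)

# Summation order differs, so compare with a tolerance instead of ==.
print(math.isclose(variance_loop, variance_np))
```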

#### Standard Deviation
$\sqrt {\frac{1}{N}\sum\limits_{i = 1}^N {\left( {x_i - \bar x} \right)^2 } }$

In [54]:
np.random.seed(5)
scores_stddev = np.random.randint(0, 100, size=25)

In [55]:
#Calculate mean
total = 0
for i in scores_stddev:
    total += i

mean = total / len(scores_stddev)

total = 0
for i in scores_stddev:
    total += (i - mean)**2
    
variance = total / len(scores_stddev)
std_dev = math.sqrt(variance)
std_dev

27.735205065043232

In [56]:
# Answer: population variance as above, then its square root
n = np.size(scores_stddev)
mean = np.add.reduce(scores_stddev) / n
variance = np.add.reduce(np.square(np.subtract(scores_stddev, mean))) / n
std_dev = np.sqrt(variance)
std_dev

np.float64(27.73520506504324)
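
To see the formula at work on numbers that can be checked by hand, here is a small worked example. The data set is made up for illustration and is not part of the exercise.

```python
import numpy as np

# Small hand-checkable data set (not from the exercise):
# mean = 40 / 8 = 5, squared deviations sum to 32, variance = 32 / 8 = 4.
values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

n = np.size(values)
mean = np.add.reduce(values) / n
variance = np.add.reduce(np.square(values - mean)) / n
std_dev = np.sqrt(variance)

print(mean, variance, std_dev)  # 5.0 4.0 2.0
```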

#### Expected value, with provided probabilities
$\sum _{i=1}^{n}x_{i}\,p_{i}$

In [57]:
np.random.seed(6)
scores_ev2 = np.random.randint(0, 100, size=25)

probs_ev2 = np.random.random(25)
probs_ev2 = probs_ev2 / probs_ev2.sum()

In [58]:
expected_value = 0
for score, prob in zip(scores_ev2, probs_ev2):
    expected_value += score * prob

expected_value

np.float64(52.04016029579092)

In [59]:
#Answer

expected_value = np.add.reduce(np.multiply(scores_ev2,probs_ev2))
expected_value

np.float64(52.04016029579092)
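
A weighted sum like this is also a dot product, so `np.dot` is another way to express it. A minimal sketch under the same seed as above; the names `ev_reduce`, `ev_dot` and `weights_ok` are illustrative, and the sanity check on the weights is an addition, not part of the original exercise:

```python
import numpy as np

np.random.seed(6)
scores_ev2 = np.random.randint(0, 100, size=25)
probs_ev2 = np.random.random(25)
probs_ev2 = probs_ev2 / probs_ev2.sum()

# Multiply elementwise, then reduce (the form used in the answer above).
ev_reduce = np.add.reduce(scores_ev2 * probs_ev2)

# The same weighted sum written as a dot product.
ev_dot = np.dot(scores_ev2, probs_ev2)

# Sanity check: the weights should be non-negative and sum to 1.
weights_ok = bool(np.isclose(np.add.reduce(probs_ev2), 1.0)) and bool(np.all(probs_ev2 >= 0))

print(ev_reduce, ev_dot, weights_ok)
```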

**LaTeX formula resources**
* https://latex.codecogs.com/eqneditor/editor.php
* https://equplus.net/eqninfo/Equation-264.html
* https://gist.github.com/derekmcloughlin/896da22518ef2f3d81b0
* https://www.overleaf.com/learn/latex/Integrals%2C_sums_and_limits#Sums_and_products