# Coding exercises
Exercises 1-3 are thought exercises that don't require coding. If you need a Python crash-course/refresher, work through the [`python_101.ipynb`](./python_101.ipynb) notebook in chapter 1.

## Exercise 4: Generate the data by running this cell
This will give you a list of numbers to work with in the remaining exercises.

In [2]:
from collections import Counter
import math as m
import statistics as stat

In [4]:
import random

random.seed(0)
salaries = [round(random.random()*1000000, -3) for _ in range(100)]

## Exercise 5: Calculating statistics and verifying
### mean

In [24]:
mean = sum(salaries) / len(salaries)
mean

585690.0

In [33]:
stat.mean(salaries)

585690.0

### median

In [9]:
salaries.sort()

In [10]:
(salaries[49] + salaries[50])/2

589000.0

In [35]:
stat.median(salaries)

589000.0

### mode

In [17]:
Counter(salaries).most_common(1)

[(477000.0, 3)]

In [36]:
stat.mode(salaries)

477000.0

### sample variance
Remember to use Bessel's correction.

In [24]:
sample_variance = sum((x - mean)**2 for x in salaries) / (len(salaries) - 1)

In [25]:
sample_variance

70664054444.44444

In [37]:
stat.variance(salaries)

70664054444.44444
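
To see what Bessel's correction changes, here is a minimal sketch that computes the variance both ways and checks each against the standard library. The `data` values and the `*_demo` names are made up for illustration and are not part of the exercise.

```python
import math
import statistics

# Illustrative values only (not the salaries from exercise 4)
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
data_mean = sum(data) / n

squared_deviations = [(x - data_mean) ** 2 for x in data]

# Population variance divides by n
population_variance_demo = sum(squared_deviations) / n

# Sample variance applies Bessel's correction and divides by n - 1
sample_variance_demo = sum(squared_deviations) / (n - 1)

# statistics.pvariance() divides by n; statistics.variance() divides by n - 1
print(math.isclose(population_variance_demo, statistics.pvariance(data)))
print(math.isclose(sample_variance_demo, statistics.variance(data)))
```

The sample variance is always the larger of the two, and the gap shrinks as the number of observations grows.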

### sample standard deviation
Remember to use Bessel's correction.

In [30]:
standard_deviation = m.sqrt(sample_variance)

In [31]:
standard_deviation

265827.11382484

In [45]:
stat.stdev(salaries)

265827.11382484

## Exercise 6: Calculating more statistics
### range

In [17]:
salaries_range = max(salaries) - min(salaries)
salaries_range

995000.0

### coefficient of variation
Make sure to use the sample standard deviation.

In [42]:
cv = standard_deviation/mean
cv

0.45386998894439035

### interquartile range

In [None]:
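def quantile(x, pct):
    """Return the value at quantile `pct` (between 0 and 1) of the data in `x`."""
    x = sorted(x)  # sort a copy so the caller's list is left untouched
    # position of the quantile in the sorted data, shifted to a 0-based index
    index = (len(x) + 1) * pct - 1
    if len(x) % 2:
        # odd number of values, so grab the value at index
        return x[int(index)]
    else:
        # even number of values, so average the two values around index
        return (x[m.floor(index)] + x[m.ceil(index)]) / 2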

In [10]:
sum([x < quantile(salaries, 0.25) for x in salaries]) / len(salaries)

0.25

In [11]:
sum([x < quantile(salaries, 0.75) for x in salaries]) / len(salaries)

0.75

In [13]:
q1, q3 = quantile(salaries, 0.25), quantile(salaries, 0.75)
iqr = q3 - q1
iqr

417500.0
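
As a cross-check, the standard library provides `statistics.quantiles()` (Python 3.8+). Its default `'exclusive'` method also locates each quantile from $(n + 1) \cdot p$, but it interpolates linearly between the two neighboring values instead of averaging them, so it may differ slightly from `quantile()` when the position falls between two values. The sketch below uses made-up values where the positions land exactly on data points; the `*_demo` names are illustrative.

```python
import statistics

# Illustrative values only (not the salaries from exercise 4)
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for quartiles; the result holds the three cut points Q1, Q2, Q3
q1_demo, q2_demo, q3_demo = statistics.quantiles(data, n=4)

# The interquartile range is the spread of the middle 50% of the data
iqr_demo = q3_demo - q1_demo
print(q1_demo, q2_demo, q3_demo, iqr_demo)
```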

### quartile coefficient of dispersion

In [16]:
qcd = iqr/(q1 + q3)
qcd

0.3417928776094965

## Exercise 7: Scaling data
### min-max scaling

In [23]:
min_salary = min(salaries)
salaries_minmax = [(x - min_salary) / salaries_range for x in salaries]
salaries_minmax[:5]

[0.0,
 0.01306532663316583,
 0.07939698492462312,
 0.0814070351758794,
 0.08944723618090453]

### standardizing

In [25]:
from statistics import mean, stdev

mean_salary = mean(salaries)
salary_std = stdev(salaries)

salaries_standardized = [(x - mean_salary) / salary_std for x in salaries]
salaries_standardized[:5]

[-2.199512275430514,
 -2.150608309943509,
 -1.9023266390094862,
 -1.8948029520114855,
 -1.8647082040194827]
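
One way to sanity-check scaled data: min-max scaled values should run from 0 to 1, and standardized values should have a mean of 0 and a sample standard deviation of 1. A minimal sketch on made-up values (the `data` list and `*_demo` names are illustrative, not part of the exercise):

```python
import math
import statistics

# Illustrative values only (not the salaries from exercise 4)
data = [10, 20, 30, 40, 50]

# Min-max scaling: (x - min) / (max - min)
low, high = min(data), max(data)
minmax_demo = [(x - low) / (high - low) for x in data]

# Standardizing: (x - mean) / sample standard deviation
center, spread = statistics.mean(data), statistics.stdev(data)
standardized_demo = [(x - center) / spread for x in data]

# Min-max scaled data spans exactly [0, 1]
print(min(minmax_demo), max(minmax_demo))

# Standardized data has mean 0 and standard deviation 1, up to floating-point error
print(math.isclose(statistics.mean(standardized_demo), 0, abs_tol=1e-12))
print(math.isclose(statistics.stdev(standardized_demo), 1))
```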

## Exercise 8: Calculating covariance and correlation
### covariance

In [27]:
import numpy as np

np.cov(salaries_minmax, salaries_standardized)

array([[0.07137603, 0.26716293],
       [0.26716293, 1.        ]])

In [29]:
from statistics import mean

running_total = [
    (x - mean(salaries_minmax)) * (y - mean(salaries_standardized))
    for x, y in zip(salaries_minmax, salaries_standardized)
]

cov = mean(running_total)
cov

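0.26449129918250414

Taking the mean of the products divides by $n$, so this is the population covariance. It equals $\frac{99}{100}$ of the off-diagonal value from `np.cov()`, which divides by $n - 1$ by default.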

### Pearson correlation coefficient ($\rho$)

In [31]:
from statistics import stdev

correlation = cov / (stdev(salaries_minmax) * stdev(salaries_standardized))
correlation

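0.9900000000000001

Both lists are linear transformations of the same salaries, so the true correlation is 1. The 0.99 is the factor $\frac{99}{100}$ that results from dividing a covariance computed with $n$ by standard deviations computed with $n - 1$.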
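
For the ratio to be a proper correlation coefficient, the covariance and the standard deviations need to use the same divisor. A minimal sketch on made-up, perfectly linear data (the `x` and `y` values and the variable names are illustrative, not part of the exercise):

```python
import math
import statistics

# Illustrative values only (not the scaled salaries)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]  # y = 2x + 1, a perfect linear relationship

n = len(x)
x_mean, y_mean = statistics.mean(x), statistics.mean(y)
products = [(a - x_mean) * (b - y_mean) for a, b in zip(x, y)]

# Sample covariance divides by n - 1, matching statistics.stdev()
sample_cov = sum(products) / (n - 1)

# Population covariance divides by n (what taking the mean of the products does)
population_cov = sum(products) / n

denominator = statistics.stdev(x) * statistics.stdev(y)

# Matching divisors recover a correlation of 1 for perfectly linear data
rho = sample_cov / denominator

# Mixing divisors shrinks the result by (n - 1) / n, here 4/5
mixed = population_cov / denominator

print(rho, mixed)
```

Pairing the population covariance with `statistics.pstdev()` would work equally well, since the divisors cancel either way.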
