# Coding exercises
Exercises 1-3 are thought exercises that don't require coding. If you need a Python crash course or refresher, work through the [`python_101.ipynb`](./python_101.ipynb) notebook in chapter 1.

## Exercise 4: Generate the data by running this cell
This will give you a list of numbers to work with in the remaining exercises.

In [1]:
import random

random.seed(0)
salaries = [round(random.random()*1000000, -3) for _ in range(100)]

## Exercise 5: Calculating statistics and verifying
### mean

In [8]:
n = len(salaries)
mean = sum(salaries)/n
print(mean)

585690.0


### median

In [9]:
import statistics as stat
median = stat.median(salaries)
print(median)

589000.0


### mode

In [10]:
mode = stat.mode(salaries)
print(mode)

477000.0
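
`statistics.mode()` returns a single value even when several values tie for the highest count (in Python 3.8 and later it returns the first one encountered). The sketch below regenerates the same data and uses `statistics.multimode()` to list every tied value, so you can see whether the mode is unique. It only assumes Python 3.8+.

```python
import random
import statistics

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

mode = statistics.mode(salaries)        # first most-common value encountered
modes = statistics.multimode(salaries)  # every value tied for the highest count

print(mode)
print(modes)
```

If `modes` holds more than one value, the data is multimodal and the single number reported by `mode()` is only one of several equally common values.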


### sample variance
Remember to use Bessel's correction.
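
For reference, Bessel's correction divides by $n - 1$ rather than $n$, where $\bar{x}$ is the sample mean and $n$ the number of observations:

$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$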

In [12]:
samp_var = sum((x - mean)**2 for x in salaries)/(n-1)
print(samp_var)

70664054444.44444


### sample standard deviation
Remember to use Bessel's correction.

In [13]:
stdevv = samp_var ** 0.5
print(stdevv)

265827.11382484
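
One way to handle the "verifying" part of this exercise is to compare the hand-computed values with the `statistics` module. The following is a minimal sketch, not the only way to check; `samp_std` and `pop_var` are names introduced here for illustration.

```python
import math
import random
import statistics

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

# Hand-computed statistics, as in the cells above
n = len(salaries)
mean = sum(salaries) / n
samp_var = sum((x - mean) ** 2 for x in salaries) / (n - 1)
samp_std = samp_var ** 0.5

# statistics.variance() and statistics.stdev() apply Bessel's correction
print(math.isclose(mean, statistics.mean(salaries)))
print(math.isclose(samp_var, statistics.variance(salaries)))
print(math.isclose(samp_std, statistics.stdev(salaries)))

# The population variance divides by n, so it is smaller by a factor of (n - 1) / n
pop_var = statistics.pvariance(salaries)
print(math.isclose(samp_var, pop_var * n / (n - 1)))
```

`math.isclose()` is used instead of `==` because the two computations can differ in the last few floating-point digits.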


## Exercise 6: Calculating more statistics
### range

In [14]:
salary_range = max(salaries) - min(salaries)  # avoid shadowing the built-in range()
print(salary_range)

995000.0


### coefficient of variation
Make sure to use the sample standard deviation.

In [16]:
cv = stdevv * 100/mean  # named cv so it is not confused with covariance
print(f'{cv:.2f}%')

45.39%


### interquartile range

In [26]:
import numpy as np
from sklearn.preprocessing import MinMaxScaler,StandardScaler
q1 = np.percentile(salaries,25)
q3 = np.percentile(salaries,75)
q3-q1

413250.0

### quartile coefficient of dispersion

In [27]:
qcod = (q3 - q1)/(q3 + q1)
print(qcod)

0.338660110633067
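
Quartiles depend on the interpolation method. `np.percentile()` defaults to linear interpolation between order statistics, which corresponds to `method='inclusive'` in `statistics.quantiles()`. The sketch below reimplements that rule with the standard library only; `percentile_linear` is a hypothetical helper written for this illustration, not a library function.

```python
import math
import random
import statistics

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

def percentile_linear(data, q):
    """Return the q-th percentile (0-100) by linear interpolation."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * q / 100      # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac

q1 = percentile_linear(salaries, 25)
q3 = percentile_linear(salaries, 75)
iqr = q3 - q1
qcod = iqr / (q3 + q1)

# Cross-check against the standard library's inclusive method
q1_incl, _, q3_incl = statistics.quantiles(salaries, n=4, method='inclusive')
print(math.isclose(q1, q1_incl), math.isclose(q3, q3_incl))
print(iqr, qcod)
```

The default `method='exclusive'` in `statistics.quantiles()` uses a different rule and will generally give slightly different quartiles, so state which method you used when reporting an IQR.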


## Exercise 7: Scaling data
### min-max scaling

In [31]:
scaler = MinMaxScaler()
salaries = np.array(salaries).reshape(-1,1)
minmaxed = scaler.fit_transform(salaries)
print(minmaxed)

[[0.84723618]
 [0.76080402]
 [0.42211055]
 [0.25929648]
 [0.51256281]
 [0.40603015]
 [0.78693467]
 [0.30351759]
 [0.47839196]
 [0.58492462]
 [0.91155779]
 [0.50653266]
 [0.28241206]
 [0.75879397]
 [0.6201005 ]
 [0.25125628]
 [0.91356784]
 [0.98693467]
 [0.81306533]
 [0.90552764]
 [0.31055276]
 [0.73266332]
 [0.90251256]
 [0.68643216]
 [0.47336683]
 [0.10050251]
 [0.43517588]
 [0.61306533]
 [0.91658291]
 [0.97085427]
 [0.47839196]
 [0.86834171]
 [0.26030151]
 [0.8080402 ]
 [0.55075377]
 [0.01306533]
 [0.72261307]
 [0.4       ]
 [0.8281407 ]
 [0.67035176]
 [0.        ]
 [0.49547739]
 [0.87135678]
 [0.24422111]
 [0.32562814]
 [0.87336683]
 [0.19095477]
 [0.56984925]
 [0.23919598]
 [0.9718593 ]
 [0.80603015]
 [0.44924623]
 [0.07939698]
 [0.32060302]
 [0.50954774]
 [0.93668342]
 [0.10854271]
 [0.55276382]
 [0.70954774]
 [0.54874372]
 [0.81708543]
 [0.54170854]
 [0.9678392 ]
 [0.60502513]
 [0.58994975]
 [0.44623116]
 [0.59798995]
 [0.38592965]
 [0.57788945]
 [0.29045226]
 [0.18894472]
 [0.18
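
Min-max scaling maps each value to $(x - \min) / (\max - \min)$, so the smallest salary becomes 0 and the largest becomes 1. As a rough cross-check on `MinMaxScaler`, here is the same formula in plain Python, assuming its default output range of 0 to 1; `lo` and `hi` are names introduced for this sketch.

```python
import random

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

lo, hi = min(salaries), max(salaries)

# Shift by the minimum, then divide by the range
minmaxed = [(x - lo) / (hi - lo) for x in salaries]

print(min(minmaxed), max(minmaxed))
print(minmaxed[:5])
```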

### standardizing

In [32]:
sscaler = StandardScaler()
standard = sscaler.fit_transform(salaries)
standard

array([[ 0.97661715],
       [ 0.65146878],
       [-0.62265912],
       [-1.23514791],
       [-0.28238758],
       [-0.68315184],
       [ 0.74976945],
       [-1.06879293],
       [-0.41093461],
       [-0.01017034],
       [ 1.21858803],
       [-0.30507235],
       [-1.14818962],
       [ 0.64390719],
       [ 0.12215749],
       [-1.26539427],
       [ 1.22614962],
       [ 1.50214765],
       [ 0.84807012],
       [ 1.19590326],
       [-1.04232737],
       [ 0.54560652],
       [ 1.18456087],
       [ 0.37168995],
       [-0.42983858],
       [-1.83251351],
       [-0.57350879],
       [ 0.09569192],
       [ 1.237492  ],
       [ 1.44165493],
       [-0.41093461],
       [ 1.05601384],
       [-1.23136711],
       [ 0.82916615],
       [-0.13871737],
       [-2.16144268],
       [ 0.50779857],
       [-0.70583661],
       [ 0.90478204],
       [ 0.31119723],
       [-2.21059301],
       [-0.34666109],
       [ 1.06735623],
       [-1.29185983],
       [-0.98561544],
       [ 1
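
Standardizing computes a z-score, $(x - \bar{x}) / \sigma$, for each value. Note that `StandardScaler` divides by the population standard deviation (no Bessel's correction), so the result has a population standard deviation of 1 and a sample standard deviation slightly above 1. A minimal standard-library sketch of the same calculation:

```python
import math
import random
import statistics

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

mean = statistics.fmean(salaries)
pop_std = statistics.pstdev(salaries)  # divides by n, like StandardScaler

standard = [(x - mean) / pop_std for x in salaries]

print(statistics.fmean(standard))   # approximately 0
print(statistics.pstdev(standard))  # approximately 1
print(statistics.stdev(standard))   # approximately sqrt(n / (n - 1))
```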

## Exercise 8: Calculating covariance and correlation
### covariance

In [37]:
cov_matrix = np.cov(minmaxed.T,standard.T)
cov = cov_matrix[0,1]
cov

0.2685088459449036

### Pearson correlation coefficient ($\rho$)

In [39]:
pcc_matrix = np.corrcoef(minmaxed.T,standard.T)
pcc = pcc_matrix[0,1]
pcc

1.0
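
A correlation of 1.0 is expected here: both scaled versions are increasing linear transformations of the same salaries, so they are perfectly linearly related. The Pearson coefficient is the covariance divided by the product of the standard deviations, $\rho = \frac{\operatorname{cov}(X, Y)}{s_X s_Y}$. The sketch below works through that with the standard library; `sample_cov` is a hypothetical helper that divides by $n - 1$, matching the default of `np.cov()`.

```python
import math
import random
import statistics

# Regenerate the same data as in Exercise 4
random.seed(0)
salaries = [round(random.random() * 1000000, -3) for _ in range(100)]

# Rebuild both scaled versions by hand
lo, hi = min(salaries), max(salaries)
minmaxed = [(x - lo) / (hi - lo) for x in salaries]
mean, pop_std = statistics.fmean(salaries), statistics.pstdev(salaries)
standard = [(x - mean) / pop_std for x in salaries]

def sample_cov(xs, ys):
    """Sample covariance with Bessel's correction (divide by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

cov = sample_cov(minmaxed, standard)
pcc = cov / (statistics.stdev(minmaxed) * statistics.stdev(standard))

print(cov)
print(pcc)
```

Because the correlation is 1, the covariance here is simply the product of the two sample standard deviations.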

<hr>
<div style="overflow: hidden; margin-bottom: 10px;">
    <div style="float: left;">
        <a href="./python_101.ipynb">
            <button>Python 101</button>
        </a>
    </div>
    <div style="float: right;">
        <a href="../../solutions/ch_01/solutions.ipynb">
            <button>Solutions</button>
        </a>
        <a href="../ch_02/1-pandas_data_structures.ipynb">
            <button>Chapter 2 &#8594;</button>
        </a>
    </div>
</div>
<hr>