## Descriptive Statistics

Import **NumPy**, **SciPy**, and **Pandas**.

In [5]:
import numpy as np
import pandas as pd
from scipy import stats

Randomly generate 1,000 samples from a normal distribution (mean = 100, standard deviation = 15) using `np.random.normal()`.

In [11]:
samples = np.random.normal(loc=100, scale=15, size=1000)
samples

array([100.99052424, 103.14597072,  86.45064325,  99.58017262,
       118.21045024,  69.98519391,  86.30447936, 116.27320123,
        96.99369128,  92.89981847,  78.20143842,  91.1813365 ,
        97.69591439,  83.30602811,  75.70952157, 127.61661253,
        81.18703888, 109.79245278,  90.39484956, 104.4602251 ,
       125.32119307,  78.89741424, 108.04640155,  96.85471546,
        95.66734602,  73.58942957, 121.17457121, 115.55058228,
       100.32354832, 120.14500888, 113.72716518,  95.18478249,
        94.2690799 , 105.13438223, 100.38542392,  77.34247901,
        99.07611129,  89.61668439,  80.99059199, 106.49313889,
        98.24961977, 115.57116933,  98.03968728, 103.87174598,
        90.08522556,  78.85984668, 112.15632   ,  85.75989203,
        82.58585304, 101.36597274, 126.86085663,  90.93984712,
        85.81831017, 101.154161  , 109.85045498,  97.70244539,
       108.79650043, 115.08083892, 110.08159393, 101.08731372,
        94.31980978, 101.46349092, 144.8834995 ,  93.71
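
Because no seed is set, the values shown above will differ on every run. A minimal sketch of a reproducible variant follows, assuming NumPy's `default_rng` generator; the seed value 42 and the names `rng`, `seeded_samples`, and `repeat` are arbitrary choices for illustration and are not part of the original notebook.

```python
import numpy as np

# Assumption: the seed 42 is arbitrary; any fixed integer works.
rng = np.random.default_rng(seed=42)

# Same distribution as above: mean 100, standard deviation 15, 1,000 draws.
seeded_samples = rng.normal(loc=100, scale=15, size=1000)

# A fresh generator with the same seed reproduces exactly the same draws.
repeat = np.random.default_rng(seed=42).normal(loc=100, scale=15, size=1000)
print(np.array_equal(seeded_samples, repeat))
```

The statistics printed in the rest of this notebook come from an unseeded run, so a seeded run will not reproduce them exactly.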

Compute the **mean**, **median**, and **mode**

In [14]:
mean = np.mean(samples)
median = np.median(samples)
mode = stats.mode(samples)

In [16]:
mean

100.07195856784823

In [17]:
median

100.00490799735296

In [18]:
mode

ModeResult(mode=array([45.5215659]), count=array([1]))
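
The mode reported above has a count of 1, which suggests that every value in the continuous sample is unique, so the "mode" there is simply the smallest value. One possible workaround, sketched here under the assumption that rounding to the nearest integer is an acceptable level of binning, is to count rounded values with `np.unique`. The seed and the names `demo`, `rounded`, and `rounded_mode` are illustrative only.

```python
import numpy as np

# Assumption: a seeded stand-in for `samples`, so the sketch is self-contained.
rng = np.random.default_rng(seed=0)
demo = rng.normal(loc=100, scale=15, size=1000)

# Round to the nearest integer so that repeated values can actually occur.
rounded = np.round(demo).astype(int)

# Count how often each rounded value appears and pick the most frequent one.
values, counts = np.unique(rounded, return_counts=True)
rounded_mode = values[np.argmax(counts)]
print(rounded_mode, counts.max())
```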

Compute the **min**, **max**, **Q1**, **Q3**, and **interquartile range**

In [21]:
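min_value = np.min(samples)  # renamed to avoid shadowing the built-in min()
max_value = np.max(samples)  # renamed to avoid shadowing the built-in max()
q1 = np.percentile(samples, 25)
q3 = np.percentile(samples, 75)
iqr = q3 - q1

In [22]:
min_value

45.521565902224864

In [23]:
max_value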

155.6701409148465

In [24]:
q1

89.65275601070093

In [25]:
q3

110.0140695577609

In [26]:
iqr

20.36131354705998
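
As a cross-check on the quartile arithmetic above, the sketch below computes both quartiles in a single `np.percentile` call and compares the difference with `scipy.stats.iqr`. It uses a seeded stand-in array, so its numbers will not match the outputs above; the names `demo`, `iqr_manual`, and `iqr_scipy` are illustrative.

```python
import numpy as np
from scipy import stats

# Assumption: a seeded stand-in for `samples`.
rng = np.random.default_rng(seed=0)
demo = rng.normal(loc=100, scale=15, size=1000)

# Passing a list of percentiles returns both quartiles at once.
q1_demo, q3_demo = np.percentile(demo, [25, 75])
iqr_manual = q3_demo - q1_demo

# scipy.stats.iqr uses the 25th and 75th percentiles by default.
iqr_scipy = stats.iqr(demo)
print(iqr_manual, iqr_scipy)
```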

Compute the **variance** and **standard deviation**

In [29]:
variance = np.var(samples)  # population variance (ddof=0 by default)
std_dev = np.std(samples)   # population standard deviation (ddof=0 by default)

In [31]:
variance

222.59039543269748

In [32]:
std_dev

14.919463644270108
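
`np.var` and `np.std` divide by n by default (`ddof=0`), which gives the population variance. If the unbiased sample estimate is wanted instead, `ddof=1` divides by n - 1. The sketch below illustrates the relationship on a seeded stand-in array; the names `demo`, `pop_var`, and `sample_var` are illustrative.

```python
import numpy as np

# Assumption: a seeded stand-in for `samples`.
rng = np.random.default_rng(seed=0)
demo = rng.normal(loc=100, scale=15, size=1000)
n = demo.size

pop_var = np.var(demo)             # divides by n (ddof=0)
sample_var = np.var(demo, ddof=1)  # divides by n - 1

# The two differ only by the factor n / (n - 1).
print(pop_var, sample_var, np.isclose(sample_var, pop_var * n / (n - 1)))
```

With 1,000 observations the two estimates differ by about 0.1%, so the choice matters mostly for small samples.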

Compute the **skewness** and **kurtosis**

In [35]:
skewness = stats.skew(samples)
kurtosis = stats.kurtosis(samples)
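
Note that `stats.kurtosis` returns excess (Fisher) kurtosis by default, for which a normal distribution scores about 0 rather than 3. The sketch below, on a seeded stand-in array with illustrative names, shows both conventions side by side.

```python
import numpy as np
from scipy import stats

# Assumption: a seeded stand-in for `samples`.
rng = np.random.default_rng(seed=0)
demo = rng.normal(loc=100, scale=15, size=1000)

skew_demo = stats.skew(demo)
excess_kurt = stats.kurtosis(demo)                 # Fisher definition (default)
pearson_kurt = stats.kurtosis(demo, fisher=False)  # Pearson definition

# The Pearson value is the Fisher value plus 3.
print(skew_demo, excess_kurt, pearson_kurt)
```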

## NumPy Correlation Calculation

Create an array `x` of integers from 10 (inclusive) to 20 (exclusive) using `np.arange()`.

In [41]:
x = np.arange(10, 20)
x

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

Then use `np.array()` to create a second array y containing 10 arbitrary integers.

In [40]:
y = np.array([1, 3, 4, 4, 6, 8, 9, 10, 7, 7])
y

array([ 1,  3,  4,  4,  6,  8,  9, 10,  7,  7])

Once you have two arrays of the same length, you can compute the **correlation coefficient** between x and y with `np.corrcoef()`, which returns the full 2×2 correlation matrix.

In [43]:
r = np.corrcoef(x, y)
r

array([[1.        , 0.83170436],
       [0.83170436, 1.        ]])
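
The off-diagonal entries of the matrix above hold Pearson's r between x and y, and the diagonal is each array's correlation with itself. The sketch below pulls out the single coefficient and checks it against the textbook formula; the names `corr_matrix`, `r_xy`, and `r_manual` are illustrative.

```python
import numpy as np

x = np.arange(10, 20)
y = np.array([1, 3, 4, 4, 6, 8, 9, 10, 7, 7])

# Entry [0, 1] (equivalently [1, 0]) is the correlation between x and y.
corr_matrix = np.corrcoef(x, y)
r_xy = corr_matrix[0, 1]

# Pearson's r from its definition: covariance over the product of spreads.
dx = x - x.mean()
dy = y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(r_xy, r_manual)
```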

## Pandas Correlation Calculation

Run the code below to create two Series of equal length.

In [44]:
x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

Call the relevant method to calculate Pearson's r correlation.

In [50]:
r = x.corr(y)
r

0.7586402890911867

In [51]:
y.corr(x)  # correlation is symmetric, so this matches x.corr(y)

0.7586402890911867

OPTIONAL. Call the relevant method to calculate Spearman's rho correlation.

In [52]:
rho = x.corr(y, method='spearman')
rho

0.9757575757575757
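
Spearman's rho is higher than Pearson's r here because it depends only on the ordering of the values, so the outlier 96 counts as one rank step rather than a large jump. The sketch below illustrates this by computing Pearson's r on the ranks, which should give the same number; the names `rho_builtin` and `rho_from_ranks` are illustrative.

```python
import pandas as pd

x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

# Built-in Spearman correlation.
rho_builtin = x.corr(y, method="spearman")

# Equivalent: replace each value by its rank, then take Pearson's r.
rho_from_ranks = x.rank().corr(y.rank())

print(rho_builtin, rho_from_ranks)
```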

## Seaborn Dataset Tips

Import the Seaborn library.

In [None]:
import seaborn as sns

Load the "tips" dataset from Seaborn.

In [None]:
tips = sns.load_dataset("tips")

Generate descriptive statistics, including those that summarize the central tendency and dispersion.

Call the relevant method to calculate the pairwise Pearson's r correlation of the columns.
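
One way the two prompts above might be approached is with `DataFrame.describe()` and `DataFrame.corr()`. The sketch below runs on a small hypothetical stand-in frame so that it works without downloading anything; the rows are made up for illustration and are not taken from the real dataset, and the names `demo_tips`, `summary`, and `pairwise_r` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in with tips-like columns; the values are invented
# for illustration and are NOT rows from the real "tips" dataset.
demo_tips = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.5, 3.0, 4.0, 6.5],
    "size": [1, 2, 2, 4],
    "day": ["Thur", "Fri", "Sat", "Sun"],
})

# Count, mean, std, min, quartiles, and max for each numeric column.
summary = demo_tips.describe()

# Pairwise Pearson's r; non-numeric columns are dropped first.
pairwise_r = demo_tips.select_dtypes(include="number").corr()

print(summary)
print(pairwise_r)
```

With the real dataset, the same calls would be applied to `tips` in place of `demo_tips`.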