## Descriptive Statistics

 Import **NumPy**, **SciPy**, and **Pandas**

In [1]:
import numpy as np
import scipy as sp
import pandas as pd
from scipy import stats

Randomly generate 1,000 samples from the normal distribution (mean = 100, standard deviation = 15) using `np.random.normal()`.

In [12]:
samples = np.random.normal(100, 15, 1000)
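Because the draw above is unseeded, every run produces different numbers, so the outputs shown in this notebook will not reproduce exactly. A minimal sketch of a repeatable draw follows, assuming NumPy's `default_rng` generator; the seed value 42 and the name `samples_seeded` are arbitrary choices made for illustration, not part of the original exercise.

```python
import numpy as np

# Seeded generator so the draw is repeatable; the seed 42 is an arbitrary choice.
rng = np.random.default_rng(42)

# Same distribution as the exercise: mean 100, standard deviation 15, 1,000 values.
samples_seeded = rng.normal(loc=100, scale=15, size=1000)

print(samples_seeded.shape)
```

Re-running the cell with the same seed returns the same array, which makes the later statistics comparable between runs.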

Compute the **mean**, **median**, and **mode**

In [13]:
mean = np.mean(samples)
print("Mean: ", mean)
median = np.median(samples)
print("Median: ", median)
# For continuous data almost every value is unique, so the mode is rarely informative
mode = stats.mode(samples)
print("Mode: ", mode)

Mean:  99.37718712999846
Median:  99.47538206453825
Mode:  ModeResult(mode=array([49.01927145]), count=array([1]))
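The mode above has a count of 1: with continuous data every sampled value is effectively unique, so `stats.mode` simply reports the smallest value. One possible workaround, sketched here under the assumption that rounding to whole numbers is an acceptable bin width, is to count the rounded values instead. The data and the names `samples_demo` and `approx_mode` are illustrative, not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
samples_demo = rng.normal(100, 15, 1000)

# Raw floats: each value occurs once, so no value is "most frequent".
_, raw_counts = np.unique(samples_demo, return_counts=True)
print("Largest count among raw values:", raw_counts.max())

# Round to whole numbers, then take the most frequent rounded value.
binned_values, binned_counts = np.unique(np.round(samples_demo), return_counts=True)
approx_mode = binned_values[np.argmax(binned_counts)]
print("Approximate mode after rounding:", approx_mode)
```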


Compute the **min**, **max**, **Q1**, **Q3**, and **interquartile range**

In [14]:
# Use names that do not shadow Python's built-in min() and max()
min_value = np.min(samples)
print("Min: ", min_value)

max_value = np.max(samples)
print("Max: ", max_value)

q1 = np.percentile(samples, 25)
print("Q1: ", q1)

q3 = np.percentile(samples, 75)
print("Q3: ", q3)

iqr = stats.iqr(samples)
print("IQR: ", iqr)

Min:  49.01927144807338
Max:  146.20571451282385
Q1:  89.46250993096379
Q3:  109.39006101426239
IQR:  19.927551083298596
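The interquartile range is Q3 minus Q1, and the output above agrees: 109.39006101426239 − 89.46250993096379 ≈ 19.92755108. The sketch below computes the IQR that way and then uses it to flag potential outliers. The 1.5 × IQR fences follow Tukey's common convention, which is an assumption added here rather than something the exercise asks for, and the data is a fresh illustrative draw.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
data = rng.normal(100, 15, 1000)

# Both quartiles in one call, then the IQR as their difference.
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Tukey-style fences (assumed convention): 1.5 * IQR beyond each quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("IQR:", iqr)
print("Values outside the fences:", outliers.size)
```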


Compute the **variance** and **standard deviation**

In [15]:
# np.var and np.std default to ddof=0, i.e. the population formulas
variance = np.var(samples)
print("Var: ", variance)

std_dev = np.std(samples)
print("Std: ", std_dev)

Var:  227.1402326138196
Std:  15.071172237547403
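`np.var` and `np.std` divide by n by default (the population formulas). When the data is a sample and an unbiased variance estimate is wanted, passing `ddof=1` divides by n − 1 instead. A small sketch on a hand-picked array, chosen only because its results are easy to check by hand:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Population variance and standard deviation (divide by n).
pop_var = np.var(data)
pop_std = np.std(data)

# Sample variance and standard deviation (divide by n - 1).
sample_var = np.var(data, ddof=1)
sample_std = np.std(data, ddof=1)

print("Population:", pop_var, pop_std)
print("Sample:    ", sample_var, sample_std)
```

For this array the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7.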


Compute the **skewness** and **kurtosis**

You can use [`scipy.stats.skew`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skew.html) and [`scipy.stats.kurtosis`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosis.html)

In [16]:
from scipy.stats import kurtosis, skew

skewness = skew(samples)
print("Skew: ", skewness)

# Store the result under a new name so the imported kurtosis() function is not overwritten
kurt = kurtosis(samples)
print("Kurtosis: ", kurt)

Skew:  -0.045693514068775275
Kurtosis:  0.14995635458798073
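With their default arguments, `scipy.stats.skew` returns the biased moment-based skewness and `scipy.stats.kurtosis` returns the excess (Fisher) kurtosis, for which a normal distribution scores 0. The sketch below reproduces both definitions from central moments using NumPy only; the helper name `moment_skew_kurtosis` is hypothetical and exists purely for illustration.

```python
import numpy as np

def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from biased central moments."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    m2 = np.mean(centered ** 2)  # second central moment (population variance)
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # subtract 3 so a normal distribution gives 0
    return skewness, excess_kurtosis

demo_skew, demo_kurt = moment_skew_kurtosis([1, 2, 3, 4, 5])
print("Skew:", demo_skew)
print("Excess kurtosis:", demo_kurt)
```

The values near 0 reported for `samples` above are what one would expect for data drawn from a normal distribution.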


## NumPy Correlation Calculation

Create an array x of integers between 10 (inclusive) and 20 (exclusive) using `np.arange()`.

In [17]:
x = np.arange(10,20)
x

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

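Then create a second array y containing 10 arbitrary integers. The cell below draws them with `np.random.randint()`; typing ten values into `np.array()` by hand works equally well.

In [34]:
y = np.random.randint(1, 101, 10)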
y

array([76, 26, 86, 84, 84, 69, 51, 62, 41, 63])

In [35]:
r = ("Correlation: ", np.corrcoef(x, y))
r

('Correlation: ',
 array([[ 1.        , -0.22571011],
        [-0.22571011,  1.        ]]))
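`np.corrcoef` returns the full 2 × 2 correlation matrix; the diagonal is always 1, and the off-diagonal entry is Pearson's r between the two arrays. A short sketch of extracting that single number, reusing the y values printed above (the `_demo` and `r_xy` names are illustrative):

```python
import numpy as np

x_demo = np.arange(10, 20)
# The ten integers shown in the output of the previous cell.
y_demo = np.array([76, 26, 86, 84, 84, 69, 51, 62, 41, 63])

corr_matrix = np.corrcoef(x_demo, y_demo)

# Entry [0, 1] (equal to [1, 0]) is the correlation between x and y.
r_xy = corr_matrix[0, 1]
print(f"Correlation: {r_xy:.4f}")
```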

## Pandas Correlation Calculation

Run the code below

In [36]:
x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

Call the relevant method to calculate Pearson's r correlation.

In [37]:
r = np.corrcoef(x, y)
r

array([[1.        , 0.75864029],
       [0.75864029, 1.        ]])
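The cell above uses NumPy on the two Series. Since this section is about Pandas, a sketch of the equivalent Pandas call may be useful: `Series.corr` computes Pearson's r by default and returns a single float rather than a matrix. The variable names are illustrative.

```python
import pandas as pd

x_s = pd.Series(range(10, 20))
y_s = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

# method="pearson" is the default, so it could be omitted.
r_pearson = x_s.corr(y_s, method="pearson")
print("Pearson's r:", r_pearson)
```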

OPTIONAL. Call the relevant method to calculate Spearman's rho correlation.

In [38]:
rho = stats.spearmanr(x, y)
rho

SpearmanrResult(correlation=0.9757575757575757, pvalue=1.4675461874042197e-06)
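Spearman's rho is much higher than Pearson's r here because it depends only on the rank order of the values, and y increases almost monotonically with x. The sketch below recomputes rho by hand with the classic formula 1 − 6·Σd² / (n(n² − 1)), which is valid only when there are no tied values, as is the case for this data. The helper `ranks_without_ties` is a hypothetical name introduced for illustration.

```python
import numpy as np

x_vals = np.arange(10, 20)
y_vals = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

def ranks_without_ties(values):
    """Rank values from 1 to n; assumes all values are distinct."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

# Rank differences between the two variables.
d = ranks_without_ties(x_vals) - ranks_without_ties(y_vals)
n = len(x_vals)

rho_manual = 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))
print("Spearman's rho:", rho_manual)
```

Only two pairs of neighbouring values are swapped in rank, so Σd² = 4 and rho = 1 − 24/990, matching the `spearmanr` result above.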

## Seaborn Dataset Tips

Import the Seaborn library

In [39]:
import seaborn as sns

Load the "tips" dataset from Seaborn

In [42]:
tips = sns.load_dataset("tips")
tips.head()

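|   | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |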


Generate descriptive statistics, including those that summarize the central tendency and dispersion of each numeric column

In [46]:
tips.describe()

|   | total_bill | tip | size |
|---|---|---|---|
| count | 244.0 | 244.0 | 244.0 |
| mean | 19.785943 | 2.998279 | 2.569672 |
| std | 8.902412 | 1.383638 | 0.9511 |
| min | 3.07 | 1.0 | 1.0 |
| 25% | 13.3475 | 2.0 | 2.0 |
| 50% | 17.795 | 2.9 | 2.0 |
| 75% | 24.1275 | 3.5625 | 3.0 |
| max | 50.81 | 10.0 | 6.0 |
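One detail worth knowing when reading this summary: the `std` row from `describe()` is the sample standard deviation (divisor n − 1), unlike `np.std`, which defaults to the population formula. The sketch below checks this on a tiny frame built from the first five `total_bill` and `tip` values shown by `tips.head()`, so it runs without downloading the dataset; the name `demo` is illustrative.

```python
import numpy as np
import pandas as pd

# First five rows of total_bill and tip, copied from the tips.head() output.
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
})

summary = demo.describe()

# describe() reports the sample standard deviation (ddof=1).
std_from_describe = summary.loc["std", "total_bill"]
std_sample = np.std(demo["total_bill"], ddof=1)
std_population = np.std(demo["total_bill"])  # ddof=0, smaller than the sample value

print(std_from_describe, std_sample, std_population)
```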


Call the relevant method to calculate pairwise Pearson's r correlation of columns

In [52]:
pip install pingouin

Note: you may need to restart the kernel to use updated packages.


In [53]:
import pingouin as pg
pg.pairwise_corr(tips, method='pearson')

|   | X | Y | method | alternative | n | r | CI95% | p-unc | BF10 | power |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | total_bill | tip | pearson | two-sided | 244 | 0.675734 | [0.6, 0.74] | 6.692471e-34 | 4.952e+30 | 1.0 |
| 1 | total_bill | size | pearson | two-sided | 244 | 0.598315 | [0.51, 0.67] | 4.39351e-25 | 1.002e+22 | 1.0 |
| 2 | tip | size | pearson | two-sided | 244 | 0.489299 | [0.39, 0.58] | 4.300543e-16 | 1.472e+13 | 1.0 |
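Pingouin adds confidence intervals, p-values, and Bayes factors, but the pairwise Pearson coefficients themselves can also be obtained with Pandas alone. The sketch below is an alternative to the Pingouin call, not a replacement for it: it selects the numeric columns and calls `DataFrame.corr`. To stay runnable offline it uses only the five rows shown by `tips.head()`, so its coefficients will differ from the full-dataset values above.

```python
import pandas as pd

# The five rows shown by tips.head(); a stand-in for the full dataset.
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "size": [2, 3, 3, 2, 4],
})

# Keep only numeric columns, then compute the pairwise Pearson matrix.
numeric = demo.select_dtypes(include="number")
corr = numeric.corr(method="pearson")
print(corr)
```

On the real data, the same two lines applied to `tips` should yield a 3 × 3 matrix whose off-diagonal entries correspond to the `r` column in the Pingouin table.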
