## Descriptive Statistics

Import **NumPy**, **SciPy**, and **Pandas**.

In [1]:
import numpy as np
import pandas as pd
from scipy import stats

Randomly generate 1,000 samples from the normal distribution (mean = 100, standard deviation = 15) using `np.random.normal()`.

In [8]:
# size=(10, 10) draws 100 values arranged in a 10 x 10 grid; use size=1000 for 1,000 samples
samples = np.random.normal(loc=100, scale=15, size=(10, 10))

samples

array([[106.89604222,  70.89877698, 118.17292573, 106.32157907,
        120.73441402,  64.82146314,  79.68747689,  91.97300578,
        112.07848833,  90.5677615 ],
       [ 89.29940426,  62.41894124,  81.24277517, 108.36839061,
         86.70629766,  68.2492337 , 100.78044303,  95.87888804,
        134.50665085,  99.70743586],
       [118.33502917, 111.66470374, 103.96558951, 132.65178621,
        121.76545133,  95.27461086, 119.0388761 , 103.74656107,
         89.94256599, 120.01196075],
       [ 99.79557151,  88.33244705, 110.12614491,  97.77647563,
        112.54395135,  93.57222185,  88.95376708, 105.51555326,
         62.65043786, 105.04674743],
       [ 94.75421618,  84.7304507 , 112.33530462, 130.55983927,
        102.37067499, 112.73347857, 103.96505093, 110.99683277,
        109.36779185, 115.67626888],
       [ 96.79385407,  85.12265182, 103.37215461,  83.37570609,
         95.78092462, 104.02571187, 105.42611494, 106.37314827,
         88.13391428, 121.31074326],
       [10
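
For the 1,000 draws the exercise asks for, a one-dimensional `size` is enough. The sketch below is one possible variant, not the cell that produced the output above: it assumes NumPy's `default_rng` generator with an arbitrary seed (42) so the draw is repeatable, and the name `samples_1d` is made up for illustration.

```python
import numpy as np

# Assumption: a seeded Generator is used only to make the draw repeatable
rng = np.random.default_rng(seed=42)

# 1,000 independent draws from a normal distribution with mean 100 and sd 15
samples_1d = rng.normal(loc=100, scale=15, size=1000)

print(samples_1d.shape)
print(samples_1d.mean(), samples_1d.std())
```

With this many draws, the sample mean and standard deviation should land close to 100 and 15.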

Compute the **mean**, **median**, and **mode**

In [12]:
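mean = np.mean(samples)      # mean over all values in the array
median = np.median(samples)  # median over all values in the array
# stats.mode works along axis 0 by default, so a 2-D input yields one mode per column;
# with continuous data every value is unique, hence each count is 1
mode = stats.mode(samples)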

print(f"mean is : {mean}, \nmedian is : {median}, \nmode is : {mode}")

mean is : 98.95229771256385, 
median is : 98.89173261502361, 
mode is : ModeResult(mode=array([[75.28035153, 62.41894124, 57.72446332, 83.37570609, 86.70629766,
        64.82146314, 79.68747689, 90.10385496, 62.65043786, 83.08344562]]), count=array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]))
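
The mode is mostly informative for discrete data, because values drawn from a continuous distribution almost never repeat. As a rough sketch, the example below computes all three measures over a flattened array of integers. The array `scores` is hypothetical, and `np.unique` is used for the mode here only to keep the example independent of the SciPy version.

```python
import numpy as np

# Hypothetical discrete data, made up for illustration
scores = np.array([[98, 101, 101],
                   [99, 101, 98]])

flat = scores.ravel()  # treat the whole array as one sample

mean_value = np.mean(flat)
median_value = np.median(flat)

# Mode: the distinct value with the highest count
values, counts = np.unique(flat, return_counts=True)
mode_value = values[np.argmax(counts)]

print(mean_value, median_value, mode_value)
```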


Compute the **min**, **max**, **Q1**, **Q3**, and **interquartile range**

In [14]:
min_samp = np.min(samples)
max_samp = np.max(samples)
# np.percentile takes q on a 0-100 scale, so Q1 is q=25 and Q3 is q=75
q1 = np.percentile(samples, 25)
q3 = np.percentile(samples, 75)
iqr = q3 - q1

print(f"min : {min_samp} \nmax : {max_samp} \nq1 : {q1} \nq3 : {q3} \niqr : {iqr}")

min : 57.72446331519604 
max : 134.50665085331272 
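
A small worked example makes the scale of the percentile argument easier to check by hand. This is only a sketch on a made-up array (`data` is hypothetical): `np.percentile` expects `q` between 0 and 100, while `np.quantile` expects the same cut point as a fraction between 0 and 1.

```python
import numpy as np

# Hypothetical data: the integers 1 through 9
data = np.arange(1, 10)

q1 = np.percentile(data, 25)   # 25th percentile
q3 = np.percentile(data, 75)   # 75th percentile
iqr = q3 - q1

# np.quantile takes the same cut points as fractions
q1_alt = np.quantile(data, 0.25)

print(q1, q3, iqr)  # 3.0 7.0 4.0
```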


Compute the **variance** and **standard deviation**

In [18]:
# np.var and np.std default to ddof=0 (population formulas); pass ddof=1 for sample estimates
variance = np.var(samples)
std_dev = np.std(samples)

print(f"variance : {variance} \nstd_dev : {std_dev}")

variance : 234.4462156599211 
std_dev : 15.31163660945234
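
`np.var()` and `np.std()` divide by n by default (`ddof=0`), which gives the population formulas. When the data are a sample from a larger population, `ddof=1` divides by n - 1 instead. A minimal sketch of the difference, using a hypothetical array `data`:

```python
import numpy as np

# Hypothetical data with mean 5, made up for illustration
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(data)             # divides by n
pop_std = np.std(data)
sample_var = np.var(data, ddof=1)  # divides by n - 1
sample_std = np.std(data, ddof=1)

print(pop_var, pop_std)        # 4.0 2.0
print(sample_var, sample_std)
```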


Compute the **skewness** and **kurtosis**

In [19]:
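# Both functions reduce along axis 0 by default, giving one value per column;
# pass axis=None to treat the array as a single sample
skewness = stats.skew(samples)
kurtosis = stats.kurtosis(samples)  # excess (Fisher) kurtosis: 0 for a normal distribution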

print(f"skewness: {skewness} \nkurtosis: {kurtosis}")

skewness: [ 0.12669944  0.57974767 -1.30971886  0.2980532  -0.01278278 -0.64406664
 -0.28549137  0.22685581 -0.30683337  0.26255485] 
kurtosis: [-0.04275775 -0.62750287  0.76846419 -1.17339015 -1.30036525 -0.9539645
 -0.23287156 -1.24165241  0.44556605 -1.13534686]
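
The ten values per statistic above come from the default `axis=0`, which treats each column as its own sample. To summarize the whole array with one number, `axis=None` can be passed. The sketch below assumes a freshly seeded 10 x 10 array, so its values differ from the output above; `grid` and the seed are illustrative choices.

```python
import numpy as np
from scipy import stats

# Assumption: a seeded 10 x 10 array stands in for the notebook's samples
rng = np.random.default_rng(seed=0)
grid = rng.normal(loc=100, scale=15, size=(10, 10))

per_column_skew = stats.skew(grid)            # one value per column
overall_skew = stats.skew(grid, axis=None)    # single value for all 100 numbers

excess_kurt = stats.kurtosis(grid, axis=None)                 # normal distribution -> 0
plain_kurt = stats.kurtosis(grid, axis=None, fisher=False)    # normal distribution -> 3

print(per_column_skew.shape, overall_skew, excess_kurt, plain_kurt)
```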


## NumPy Correlation Calculation

Create an array `x` of integers between 10 (inclusive) and 20 (exclusive) using `np.arange()`.

In [20]:
x = np.arange(10, 20)
x

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

Then use `np.array()` to create a second array y containing 10 arbitrary integers.

In [22]:
y = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

y

array([ 2,  1,  4,  5,  8, 12, 18, 25, 96, 48])

Once you have two arrays of the same length, you can compute the **correlation coefficient** between x and y

In [None]:
r = 
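
One way to fill in this cell is `np.corrcoef()`, which returns a correlation matrix rather than a single number. The sketch below is self-contained and rests on an assumption: `y` reuses the integers from the Pandas section, since any 10 integers would do.

```python
import numpy as np

x = np.arange(10, 20)
# Assumption: arbitrary integers, borrowed from the Pandas section
y = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

# np.corrcoef returns a 2 x 2 matrix: ones on the diagonal, r off the diagonal
corr_matrix = np.corrcoef(x, y)
r = corr_matrix[0, 1]

print(corr_matrix)
print(r)
```

The off-diagonal entries `[0, 1]` and `[1, 0]` are equal, so either can be read as Pearson's r.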

## Pandas Correlation Calculation

Run the code below

In [None]:
x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

Call the relevant method to calculate Pearson's r correlation.

In [None]:
r =
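
A possible answer, shown as a sketch: `Series.corr()` computes Pearson's r when no `method` is given. The two Series are repeated so the example runs on its own.

```python
import pandas as pd

x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

# Pearson's r is the default method of Series.corr
r = x.corr(y)
r_swapped = y.corr(x)  # correlation is symmetric

print(r)
```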

OPTIONAL. Call the relevant method to calculate Spearman's rho correlation.

In [None]:
rho =
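
For the optional part, the same method accepts `method="spearman"`. Spearman's rho is Pearson's r applied to the ranks of the data, which the sketch below uses as a cross-check. The name `rho_from_ranks` is introduced here for illustration only.

```python
import pandas as pd

x = pd.Series(range(10, 20))
y = pd.Series([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])

# Rank-based correlation requested through the method argument
rho = x.corr(y, method="spearman")

# Cross-check: Pearson's r on the ranks gives the same number
rho_from_ranks = x.rank().corr(y.rank())

print(rho, rho_from_ranks)
```

Because `y` rises almost monotonically with `x`, rho comes out higher than Pearson's r for the same data.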

## Seaborn Dataset Tips

Import the Seaborn library.

In [6]:
import seaborn as sns

Load the "tips" dataset from Seaborn.

In [7]:
tips = sns.load_dataset("tips")

Generate descriptive statistics, including those that summarize central tendency and dispersion.
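
For a DataFrame, `describe()` is the usual method for this. The sketch below does not load the real dataset, which needs a download. It assumes a small hypothetical frame, `tips_like`, with made-up rows and a few of the same column names.

```python
import pandas as pd

# Hypothetical stand-in for the tips data: made-up rows, familiar column names
tips_like = pd.DataFrame({
    "total_bill": [12.5, 20.0, 31.4, 18.2, 25.9],
    "tip": [2.0, 3.0, 5.0, 2.5, 4.1],
    "size": [2, 2, 4, 3, 3],
    "day": ["Thur", "Fri", "Sat", "Sun", "Sat"],
})

# By default describe() summarizes only the numeric columns:
# count, mean, std, min, quartiles, and max
summary = tips_like.describe()

print(summary)
```

On the real data the call would be `tips.describe()`.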

Call the relevant method to calculate the pairwise Pearson's r correlation between columns.
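
`DataFrame.corr()` returns the pairwise correlation matrix. The tips data also contains non-numeric columns, so the sketch below first keeps only the numeric ones with `select_dtypes()`. It again uses a hypothetical stand-in frame, `tips_like`, with made-up rows rather than the downloaded dataset.

```python
import pandas as pd

# Hypothetical stand-in for the tips data: made-up rows, familiar column names
tips_like = pd.DataFrame({
    "total_bill": [12.5, 20.0, 31.4, 18.2, 25.9],
    "tip": [2.0, 3.0, 5.0, 2.5, 4.1],
    "size": [2, 2, 4, 3, 3],
    "day": ["Thur", "Fri", "Sat", "Sun", "Sat"],
})

# Keep numeric columns only, then compute pairwise Pearson's r (the default method)
numeric = tips_like.select_dtypes(include="number")
corr = numeric.corr()

print(corr)
```

The result is a symmetric matrix with ones on the diagonal. On the real data the same two steps would start from `tips`.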