<a href="https://colab.research.google.com/github/OviedoVR/CloudSolved/blob/main/Std_Var.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Variance and Standard Deviation:** *NumPy* **versus** *Pandas*

---

> **Tips dataset** (from [Seaborn](https://seaborn.pydata.org/))

* total_bill
* **tip**
* sex
* smoker
* day
* time
* size


### **Imports**

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns

### **Looking at the data**

In [3]:
tip = sns.load_dataset('tips')
tip.head()

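|   | total_bill | tip | sex | smoker | day | time | size |
|---|------------|-----|-----|--------|-----|------|------|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.5 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |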


### **Variance**

POPULATION:

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

&nbsp;

SAMPLE:

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

* Pandas

In [5]:
tip['tip'].var()

1.914454638062471

* Numpy

In [6]:
np.var(tip['tip'])

1.9066085124966412
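
The two results differ only in the divisor: NumPy's default divides the sum of squared deviations by $n$, while Pandas' default divides it by $n - 1$. As a rough check, rescaling the NumPy value by $n/(n-1)$ should recover the Pandas value. The sketch below hard-codes the two outputs printed above and assumes the tips dataset has 244 rows (an assumption, since the row count is not shown in this notebook; `len(tip)` would give the actual figure).

```python
import math

# Values copied from the two outputs above
var_pandas = 1.914454638062471   # tip['tip'].var()
var_numpy = 1.9066085124966412   # np.var(tip['tip'])

# Assumption: the tips dataset has 244 rows (use len(tip) in the notebook)
n = 244

# Rescale the divide-by-n result into a divide-by-(n - 1) result
rescaled = var_numpy * n / (n - 1)

print(f'Rescaled NumPy value: {rescaled}')
print(f'Matches Pandas value: {math.isclose(rescaled, var_pandas, rel_tol=1e-9)}')
```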

**Documentation**

[np.var()](https://numpy.org/doc/stable/reference/generated/numpy.var.html)

[df.var()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.var.html)

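```python
# NumPy: ddof defaults to 0 (population variance); pass ddof=1 for the sample variance
np.var(..., ddof=1)

# Pandas: ddof defaults to 1, so .var() already returns the sample variance
df['column'].var()
```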

**Then**

In [7]:
var_np = np.var(tip['tip'], ddof=1)
var_pd = tip['tip'].var()

print(f'Numpy: {var_np}')
print(f'Pandas: {var_pd}')

Numpy: 1.914454638062471
Pandas: 1.914454638062471
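
To connect these results back to the two formulas, here is a minimal sketch that computes both variances by hand and compares them with the library calls. The sample values are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from the tips dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, chosen only for illustration
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
x = np.array(values)
n = x.size

# Sum of squared deviations from the mean (the numerator of both formulas)
squared_dev_sum = ((x - x.mean()) ** 2).sum()

pop_var = squared_dev_sum / n           # population formula: divide by N
sample_var = squared_dev_sum / (n - 1)  # sample formula: divide by n - 1

# NumPy: default ddof=0 is the population formula, ddof=1 the sample formula
assert np.isclose(pop_var, np.var(x))
assert np.isclose(sample_var, np.var(x, ddof=1))

# Pandas: default ddof=1 is the sample formula, ddof=0 the population formula
assert np.isclose(sample_var, pd.Series(values).var())
assert np.isclose(pop_var, pd.Series(values).var(ddof=0))

print(f'Population variance: {pop_var}')
print(f'Sample variance:     {sample_var}')
```

Both libraries accept `ddof`, so either one can produce either estimate; only the defaults differ.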


### **Standard Deviation**

POPULATION:

$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

&nbsp;

SAMPLE:

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$

* Pandas

In [8]:
tip['tip'].std()

1.3836381890011822

* Numpy

In [9]:
np.std(tip['tip'])

1.3807999538298954

**Documentation**

[np.std()](https://numpy.org/doc/stable/reference/generated/numpy.std.html)

[df.std()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.std.html)

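```python
# NumPy: ddof defaults to 0 (population std); pass ddof=1 for the sample std
np.std(..., ddof=1)

# Pandas: ddof defaults to 1, so .std() already returns the sample std
df['column'].std()
```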


**Then**

In [10]:
std_np = np.std(tip['tip'], ddof=1)
std_pd = tip['tip'].std()

print(f'Numpy: {std_np}')
print(f'Pandas: {std_pd}')

Numpy: 1.3836381890011822
Pandas: 1.3836381890011822
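
The match also works in the other direction: passing `ddof=0` to Pandas should reproduce NumPy's default population estimate. The sketch below shows both directions and checks that each standard deviation is the square root of the matching variance. The sample values are hypothetical, not taken from the tips dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, chosen only for illustration
sample = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
arr = sample.to_numpy()

# Sample standard deviation (divisor n - 1): Pandas default, NumPy with ddof=1
sample_std_pd = sample.std()
sample_std_np = np.std(arr, ddof=1)

# Population standard deviation (divisor N): NumPy default, Pandas with ddof=0
pop_std_np = np.std(arr)
pop_std_pd = sample.std(ddof=0)

# Each standard deviation is the square root of the matching variance
assert np.isclose(sample_std_pd, np.sqrt(sample.var()))
assert np.isclose(pop_std_np, np.sqrt(np.var(arr)))

print(f'Sample:     NumPy {sample_std_np:.6f} | Pandas {sample_std_pd:.6f}')
print(f'Population: NumPy {pop_std_np:.6f} | Pandas {pop_std_pd:.6f}')
```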
