# Quartiles and Outliers

## Instructions

* Take a look at the list in the `samples` notebook. Identify the median, upper quartile, and lower quartile by hand.

* Use code to determine the lower and upper quartiles and be sure to account for both odd and even lengths of a data set.

  * Reference <https://www.wikihow.com/Calculate-Outliers> to choose a method for best handling this.

  * Use [numpy.percentile](https://het.as.utexas.edu/HET/Software/Numpy/reference/generated/numpy.percentile.html) to help with the calculations.

* The difference between the upper and lower quartile is called the **interquartile range**, or IQR.

  * Like the standard deviation, the IQR describes how "spread out" the data set is.

  * Calculate the IQR for this list.

* Determine the outliers.

  * Lower outliers are points that fall below the result of the equation `Q1 - 1.5 * IQR`.

  * Upper outliers are points that fall above the result of the equation `Q3 + 1.5 * IQR`.

* Finally, create a box plot of the data.
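
The by-hand step can be checked with a short sketch. It assumes one common by-hand convention, the median-split method: sort the data, split it into a lower and an upper half, and take the median of each half. When the number of points is odd, the overall median is left out of both halves. The function name `quartiles_by_hand` and the nine-point `odd_data` list are hypothetical, introduced only for illustration.

```python
from statistics import median


def quartiles_by_hand(values):
    """Return (Q1, median, Q3) using the median-split convention.

    Assumption: for an odd number of points the overall median is
    excluded from both halves. Other conventions exist.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data points")
    lower_half = ordered[: n // 2]        # points below the median position
    upper_half = ordered[(n + 1) // 2:]   # points above the median position
    return median(lower_half), median(ordered), median(upper_half)


# Even length: the ten points used in this exercise
even_data = [2.3, 10.2, 11.2, 12.3, 14.5, 14.6, 15.0, 15.1, 19.0, 24.0]
# Odd length: hypothetical variant with the last point dropped
odd_data = even_data[:-1]

q1_even, med_even, q3_even = quartiles_by_hand(even_data)
q1_odd, med_odd, q3_odd = quartiles_by_hand(odd_data)
print(q1_even, med_even, q3_even)
print(q1_odd, med_odd, q3_odd)
```

For the ten-point list this convention gives Q1 = 11.2 and Q3 = 15.1, while `np.percentile` with its default linear interpolation gives 11.475 and 15.075. Both are accepted definitions, so small differences between hand and code results are expected.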


In [52]:
%matplotlib notebook

In [53]:
# Dependencies
import matplotlib.pyplot as plt
from statistics import median
import numpy as np

In [54]:
### Data Points
arr = np.array([2.3, 10.2,11.2, 12.3, 14.5, 14.6, 15.0, 15.1, 19.0, 24.0])
arr

array([ 2.3, 10.2, 11.2, 12.3, 14.5, 14.6, 15. , 15.1, 19. , 24. ])

In [55]:
# Find the median (convert to a plain list so the result is a Python float)
median(arr.tolist())

14.55

In [56]:
# Use numpy to create quartiles
#Q1
Q_1=np.percentile(arr, 25)

In [57]:
#Q3
Q_3=np.percentile(arr, 75)

In [58]:
# Display the lower quartile (Q1)
Q_1

11.474999999999998

In [59]:
# Display the upper quartile (Q3)
Q_3

15.075

In [60]:
# Calculate the interquartile range
IQR=Q_3-Q_1
IQR

3.6000000000000014

In [61]:
# Find lower boundary
# Q1 - 1.5 * IQR
Lower=Q_1-(1.5*IQR)
Lower

6.074999999999996

In [62]:
# Find upper boundary
# Q3 + 1.5 * IQR
Upper=Q_3+(1.5*IQR)
Upper

20.475

In [63]:
# Check for any lower outliers
# A boolean mask keeps only the values below the lower boundary
arr[arr < Lower]

array([2.3])


In [64]:
# Check for any upper outliers
# A boolean mask keeps only the values above the upper boundary
arr[arr > Upper]

array([24.])
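
The quartile, IQR, and boundary steps above can be collected into one helper. This is a minimal sketch, not part of the original exercise: the function name `iqr_outliers` and the multiplier parameter `k` are hypothetical, and it relies on the default linear interpolation of `np.percentile`.

```python
import numpy as np


def iqr_outliers(data, k=1.5):
    """Return (lower_boundary, upper_boundary, outliers) using the k * IQR rule."""
    values = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower_boundary = q1 - k * iqr
    upper_boundary = q3 + k * iqr
    # Keep only the points that fall outside either boundary
    mask = (values < lower_boundary) | (values > upper_boundary)
    return lower_boundary, upper_boundary, values[mask]


arr = np.array([2.3, 10.2, 11.2, 12.3, 14.5, 14.6, 15.0, 15.1, 19.0, 24.0])
lower_boundary, upper_boundary, outliers = iqr_outliers(arr)
print(lower_boundary, upper_boundary)
print(outliers)
```

For this data set the helper flags 2.3 as a lower outlier and 24.0 as an upper outlier, matching the cell-by-cell results.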


In [65]:
# Create box plot
fig, ax = plt.subplots()
ax.boxplot(arr)

{'whiskers': [<matplotlib.lines.Line2D at 0x123d54dd0>,
  <matplotlib.lines.Line2D at 0x12375a250>],
 'caps': [<matplotlib.lines.Line2D at 0x123d5a850>,
  <matplotlib.lines.Line2D at 0x123d5ad90>],
 'boxes': [<matplotlib.lines.Line2D at 0x123d541d0>],
 'medians': [<matplotlib.lines.Line2D at 0x123d5a7d0>],
 'fliers': [<matplotlib.lines.Line2D at 0x12371c190>],
 'means': []}
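
The box plot can also be used to cross-check the outliers. The sketch below passes `whis=1.5`, which is Matplotlib's default whisker multiplier, so the points drawn as fliers should be the same ones flagged by the `1.5 * IQR` rule. The `Agg` backend, the title, and the axis label are assumptions added so the example runs outside a notebook; inside the notebook the `%matplotlib notebook` magic would be used instead.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend for running as a script

import matplotlib.pyplot as plt
import numpy as np

arr = np.array([2.3, 10.2, 11.2, 12.3, 14.5, 14.6, 15.0, 15.1, 19.0, 24.0])

fig, ax = plt.subplots()
# whis=1.5 places the whisker limits at Q1 - 1.5 * IQR and Q3 + 1.5 * IQR
parts = ax.boxplot(arr, whis=1.5)
ax.set_title("Box plot of samples")  # hypothetical title
ax.set_ylabel("Value")               # hypothetical axis label

# The "fliers" line holds the points drawn beyond the whiskers
fliers = parts["fliers"][0].get_ydata()
print(fliers)
```

Ending the plotting cell with `plt.show()` or a trailing semicolon is one way to keep the dictionary of line objects from being echoed as cell output.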