# Summary: Quartiles and Interquartile Range (IQR) in Data Sets

## Understanding Quartiles
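Quartiles are three cut points (Q1, Q2, Q3) that divide an ordered data set into four parts, each holding roughly a quarter of the values. Together they show where the data is centered and how it is spread.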

### Calculation of Quartiles
- **First Quartile (Q1)**: Also known as the lower quartile or 25th percentile, it is the median of the lower half of the data.
- **Second Quartile (Q2)**: This is the median of the data set, also the 50th percentile.
- **Third Quartile (Q3)**: Known as the upper quartile or 75th percentile, it is the median of the upper half of the data.
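
The median-of-halves rule described above can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and the sketch assumes one common convention: when the number of values is odd, the median itself is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    Assumption: for an odd number of values, the middle value is
    excluded from both halves. This is one convention among several.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower = ordered[:half]        # values before the median position
    upper = ordered[n - half:]    # values after the median position
    return median(lower), median(ordered), median(upper)


# Even count: the halves are [1, 2] and [3, 4]
print(quartiles_by_halves([1, 2, 3, 4]))             # (1.5, 2.5, 3.5)
# Odd count: the median 7 is left out of both halves
print(quartiles_by_halves([1, 3, 5, 7, 9, 11, 13]))  # (3, 7, 11)
```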

## Example: Golf Scores
- Data Set: Golf scores of 18 players, ranging from 66 to 75.
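- Median (Q2) = 69: with 18 ordered values, the median is the mean of the 9th and 10th values, which are both 69.
- Q1 and Q3 are the medians of the lower and upper halves (nine values each). For the scores listed in the code at the end of this summary, that gives Q1 = 68 and Q3 = 71. Software that interpolates between ranks, such as NumPy's default `np.percentile`, can return slightly different values (Q3 = 70.75 in the output at the end of this summary).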
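
As a worked check of the steps above, the following sketch applies the median-of-halves rule by hand to the 18 scores. The scores are copied from the code cell at the end of this summary; the variable names are illustrative only.

```python
from statistics import median

# Scores copied from the code cell at the end of this summary
golf_scores = [66, 67, 67, 68, 68, 68, 68, 69, 69,
               69, 69, 70, 70, 71, 71, 72, 73, 75]

ordered = sorted(golf_scores)
n = len(ordered)  # 18 values, so there are two middle values

# Median: mean of the 9th and 10th ordered values (indices 8 and 9)
q2 = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

# Each half holds 9 values, so its median is its 5th value
q1 = median(ordered[: n // 2])
q3 = median(ordered[n // 2:])

print(q1, q2, q3)  # 68 69.0 71
```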

## Five Number Summary
A concise description of a data set, which includes:
1. **Minimum Value**
2. **First Quartile (Q1)**
3. **Median (Q2)**
4. **Third Quartile (Q3)**
5. **Maximum Value**

- Provides a view of the data's center and spread.
- The range of the data set is the maximum value minus the minimum value (75 - 66 = 9 for the golf scores).
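
A minimal sketch of computing the five-number summary and the range with NumPy is shown below. The helper name `five_number_summary` is hypothetical, and the quartiles here rely on `np.percentile` with its default linear interpolation, so they may differ slightly from the median-of-halves values.

```python
import numpy as np


def five_number_summary(values):
    """Return the five-number summary and the range as a dict.

    Assumption: quartiles come from np.percentile with its default
    linear interpolation, which may differ from median-of-halves.
    """
    data = np.asarray(values, dtype=float)
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    return {
        "min": float(data.min()),
        "Q1": float(q1),
        "median": float(q2),
        "Q3": float(q3),
        "max": float(data.max()),
        "range": float(data.max() - data.min()),
    }


# Scores copied from the code cell at the end of this summary
golf_scores = [66, 67, 67, 68, 68, 68, 68, 69, 69,
               69, 69, 70, 70, 71, 71, 72, 73, 75]

summary = five_number_summary(golf_scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```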

## Interquartile Range (IQR)
- IQR = Q3 - Q1
- Measures the width of the interval that contains the middle 50% of the data.
- A smaller IQR indicates data points clustered near the median, while a larger IQR suggests a wider spread.
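
To illustrate the last point, the sketch below compares two made-up data sets that share the same median but differ in spread. The helper `iqr` and both data sets are hypothetical, and the quartiles use NumPy's default linear interpolation.

```python
import numpy as np


def iqr(values):
    """IQR = Q3 - Q1, with quartiles from np.percentile (linear interpolation)."""
    q1, q3 = np.percentile(values, [25, 75])
    return float(q3 - q1)


# Two made-up data sets with the same median (50) but different spread
clustered = [48, 49, 50, 50, 51, 52]
spread_out = [10, 30, 50, 50, 70, 90]

print(iqr(clustered))   # 1.5
print(iqr(spread_out))  # 30.0
```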

## Identifying Outliers with IQR
- Outliers can be identified using quartiles and IQR.
- Lower Bound for Outliers: Q1 - 1.5 * IQR
- Upper Bound for Outliers: Q3 + 1.5 * IQR
- Values below the lower bound or above the upper bound are flagged as potential outliers.
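
The bounds above can be sketched as two small helpers. The names `tukey_fences` and `flag_outliers` are hypothetical. The example also shows that the verdict can depend on how the quartiles were computed: for the golf scores, median-of-halves gives Q3 = 71 and flags nothing, while linear interpolation gives Q3 = 70.75 and flags the score of 75.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower_bound, upper_bound) = (Q1 - k*IQR, Q3 + k*IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, q1, q3, k=1.5):
    """Return the values that fall strictly outside the bounds."""
    low, high = tukey_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Scores copied from the code cell at the end of this summary
golf_scores = [66, 67, 67, 68, 68, 68, 68, 69, 69,
               69, 69, 70, 70, 71, 71, 72, 73, 75]

# Quartiles from median-of-halves: Q1 = 68, Q3 = 71
print(tukey_fences(68, 71))                    # (63.5, 75.5)
print(flag_outliers(golf_scores, 68, 71))      # []

# Quartiles from linear interpolation: Q1 = 68, Q3 = 70.75
print(tukey_fences(68, 70.75))                 # (63.875, 74.875)
print(flag_outliers(golf_scores, 68, 70.75))   # [75]
```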

## Conclusion
Quartiles and IQR provide valuable insights into the distribution of a data set, highlighting the spread and identifying potential outliers.


```python
import numpy as np

# Example data: Golf scores of 18 players
golf_scores = np.array([66, 67, 67, 68, 68, 68, 68, 69, 69, 69, 69, 70, 70, 71, 71, 72, 73, 75])

# Calculate quartiles.
# np.percentile interpolates linearly between ranks by default, so Q3 here
# (70.75) differs from the median of the upper half (71).
Q1 = np.percentile(golf_scores, 25)
Q2 = np.percentile(golf_scores, 50)
Q3 = np.percentile(golf_scores, 75)

# Calculate Interquartile Range (IQR)
IQR = Q3 - Q1

# Determine the bounds beyond which a value counts as a potential outlier
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

# Identify potential outliers
outliers = golf_scores[(golf_scores < lower_bound) | (golf_scores > upper_bound)]

# Print results
print(f"Quartiles: Q1 = {Q1}, Q2 = {Q2}, Q3 = {Q3}")
print(f"Interquartile Range (IQR) = {IQR}")
print(f"Lower Bound for Outliers = {lower_bound}")
print(f"Upper Bound for Outliers = {upper_bound}")
print(f"Potential Outliers: {outliers}")
```


Output:

```text
Quartiles: Q1 = 68.0, Q2 = 69.0, Q3 = 70.75
Interquartile Range (IQR) = 2.75
Lower Bound for Outliers = 63.875
Upper Bound for Outliers = 74.875
Potential Outliers: [75]
```
