#### Distribution Plots

- Distribution plots visualize how quantitative data is distributed.
- They show how numerical values such as `total_bill`, `tip`, and `size` are spread, rather than counting categories. See `tips.csv` for the data.


- The **tips** dataset is a sample dataset that Seaborn loads with `sns.load_dataset("tips")`. It records restaurant bills, tips, the payer's sex, smoking status, day, time, and group size.
It’s mainly used for learning data visualization and EDA.

- A copy has also been downloaded in CSV format for reference. You can find it in this directory as `tips.csv`.
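If you prefer to read the local copy instead of calling `sns.load_dataset`, something like the sketch below should work. The helper name `load_tips` is hypothetical, and the sketch assumes the CSV has the same seven columns as the printed output and no extra index column. Because `pd.read_csv` loads text columns as plain strings, the helper converts them back to the `category` dtype that Seaborn's loader uses.

```python
from pathlib import Path

import pandas as pd

# Text columns that are treated as categories (assumed to match the printed output).
CATEGORY_COLUMNS = ["sex", "smoker", "day", "time"]


def load_tips(csv_path="tips.csv"):
    """Hypothetical helper: read a local tips CSV and restore categorical dtypes."""
    tips = pd.read_csv(Path(csv_path))
    for col in CATEGORY_COLUMNS:
        tips[col] = tips[col].astype("category")
    return tips
```

Calling `tips = load_tips()` from the directory that contains `tips.csv` would then give a DataFrame you can pass to the plotting calls in the tables that follow.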

```python
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")

print(tips)          # print the whole tips dataset

print(tips.head())   # first 5 rows (default)
print(tips.head(8))  # head(n) shows the first n rows
```

```text
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
```

### Types of Distribution Plots (Cheatsheet)

#### 1. Univariate Distribution Plots (Single Variable)

| Plot        | Alternate / Second Name                 | Example Code |
|-------------|-----------------------------------------|--------------|
| Histogram   | Frequency Distribution Plot             | `sns.histplot(tips["total_bill"])` |
| KDE Plot    | Density Plot, Smoothed Histogram        | `sns.kdeplot(tips["tip"])` |
| ECDF Plot   | Empirical Cumulative Distribution Plot  | `sns.ecdfplot(tips["total_bill"])` |
| Box Plot    | Box-and-Whisker Plot, 5-Number Summary  | `sns.boxplot(x=tips["tip"])` |
| Violin Plot | KDE + Box Plot                          | `sns.violinplot(y=tips["total_bill"])` |
| Rug Plot    | Strip of Ticks                          | `sns.rugplot(tips["total_bill"])` |
| Strip Plot  | Jitter Plot                             | `sns.stripplot(y=tips["tip"])` |

---

#### 2. Bivariate Distribution Plots (Two Variables)

| Plot        | Alternate / Second Name  | Example Code |
|-------------|--------------------------|--------------|
| Joint Plot  | Bivariate Distribution Plot | `sns.jointplot(x="total_bill", y="tip", data=tips)` |
| Hexbin Plot | 2D Histogram (in jointplot) | `sns.jointplot(x="total_bill", y="tip", data=tips, kind="hex")` |
| 2D KDE Plot | Density Contour Plot        | `sns.kdeplot(x="total_bill", y="tip", data=tips, fill=True)` |

---

#### 3. Multivariate Distribution Plots (Many Variables)

| Plot        | Alternate / Second Name  | Example Code |
|-------------|--------------------------|--------------|
| Pair Plot   | Scatterplot Matrix       | `sns.pairplot(tips, vars=["total_bill","tip","size"])` |
| FacetGrid   | Small Multiples (via `displot`) | `sns.displot(tips, x="total_bill", col="day")` |

---

#### Summary

- **Univariate** → Histogram, KDE, ECDF, Box, Violin, Rug, Strip.  
- **Bivariate** → Joint Plot, Hexbin, 2D KDE.  
- **Multivariate** → Pair Plot, FacetGrid.  


### Top 5 Industry-Level Distribution Plots

These five plots are among the most widely used in **industry data analysis, ML, and dashboards**.

| Plot        | Use Case (Why Industry Uses It) | Example Code |
|-------------|---------------------------------|--------------|
| **Histogram** | First step in EDA, shows how values are spread (frequency). Very common in reports. | `sns.histplot(tips["total_bill"])` |
| **KDE Plot (Density Plot)** | Smooth curve for probability density. Useful when comparing distributions. | `sns.kdeplot(tips["tip"])` |
| **Box Plot** | Shows the median, spread, and potential outliers at a glance. Standard in dashboards. | `sns.boxplot(x=tips["day"], y=tips["total_bill"])` |
| **Violin Plot** | Combines Boxplot + KDE. Used for comparing category-wise distributions. | `sns.violinplot(x="day", y="tip", data=tips)` |
| **Pair Plot (Scatterplot Matrix)** | Shows relationships among multiple numeric variables. Standard in EDA. | `sns.pairplot(tips, vars=["total_bill","tip","size"])` |
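To see what a box plot actually summarizes, it can help to compute the numbers by hand. The sketch below applies the common 1.5 × IQR rule, which is the default whisker setting in Matplotlib and Seaborn, to ten `total_bill` values copied from the printed output. It is an illustration on a small sample, not a summary of the full dataset.

```python
import numpy as np

# Ten total_bill values copied from the printed tips output.
total_bill = np.array([16.99, 10.34, 21.01, 23.68, 24.59,
                       29.03, 27.18, 22.67, 17.82, 18.78])

# Quartiles: the box spans q1 to q3, with a line at the median.
q1, median, q3 = np.percentile(total_bill, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn individually as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = total_bill[(total_bill < lower_fence) | (total_bill > upper_fence)]

print(f"Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print(f"fences=({lower_fence:.2f}, {upper_fence:.2f}), outliers={outliers}")
```

For this sample every value falls inside the fences, so a box plot of it would show no outlier markers. The whiskers would end at the smallest and largest values.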

---

#### Summary
- **Histogram** → usually the first step in understanding the data.  
- **KDE Plot** → smooth distribution view.  
- **Box Plot** → spread + outliers (very common).  
- **Violin Plot** → distribution per category.  
- **Pair Plot** → quick multi-variable insights.  

👉 In practice, these **5 plots cover most everyday industry use cases** in data analysis & visualization.
