# 📘 Seaborn Introduction

## 🔧 1. Setup and Imports

In [None]:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# Load sample dataset
tips = sns.load_dataset("tips")
tips.head()


**Explanation:**
- Imports the essential libraries: Seaborn, Pandas, and Matplotlib.
- Loads Seaborn's built-in `tips` dataset for demonstration.
- Displays the first few rows to show the structure of the data.

## 📈 2A. Scatter Plot

In [None]:

sns.scatterplot(x="total_bill", y="tip", data=tips)
plt.title("Total Bill vs Tip")
plt.show()


**Explanation:**
- Visualizes the relationship between total bill and tip using a scatter plot.
- Each dot represents a record from the dataset.
- The points trend upward, which suggests a positive association: larger bills tend to come with larger tips.
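
The strength of that upward trend can be put into a single number with the Pearson correlation coefficient. The following is a minimal sketch using only the standard library; `pearson_r` is a hypothetical helper and the bill/tip pairs are made up for illustration, not taken from `tips`.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (bill, tip) pairs, made up for illustration.
bills = [10.0, 20.0, 30.0, 40.0]
tip_amounts = [1.5, 3.5, 4.0, 6.5]

r = pearson_r(bills, tip_amounts)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship. On the real data, `tips["total_bill"].corr(tips["tip"])` should give the same kind of summary.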

## 📈 2B. Histogram with KDE

In [None]:

sns.histplot(tips["total_bill"], kde=True, bins=20)
plt.title("Distribution of Total Bill")
plt.show()


**Explanation:**
- Plots the distribution of the `total_bill` variable using a histogram.
- Includes a KDE (Kernel Density Estimate) curve to show the data's density.
- Sets `bins=20` explicitly instead of relying on Seaborn's automatic bin selection, which gives a moderately fine-grained view of the distribution.
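
To make the KDE curve less of a black box, here is a minimal sketch of a Gaussian kernel density estimate written by hand. The helper `make_kde`, the fixed bandwidth, and the sample values are all assumptions chosen for illustration; Seaborn selects a bandwidth automatically, so its curve will not match this one exactly.

```python
from math import exp, pi, sqrt


def make_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` at a point."""
    n = len(sample)

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        total = sum(exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)
        return total / (n * bandwidth * sqrt(2 * pi))

    return density


# Hypothetical bill amounts, made up for illustration.
sample = [10.0, 12.0, 13.0, 15.0, 30.0]
density = make_kde(sample, bandwidth=2.0)  # bandwidth picked by hand

# The estimate is higher near the cluster of values than in the gap.
print(density(12.5) > density(22.0))
```

A smaller bandwidth produces a bumpier curve and a larger one smooths it out, which is the same trade-off that the number of bins controls for the histogram.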

## 📈 2C. Box Plot

In [None]:

sns.boxplot(x="day", y="total_bill", data=tips)
plt.title("Total Bill by Day")
plt.show()


**Explanation:**
- Compares the distribution of total bill amounts across the days recorded in the dataset (Thur, Fri, Sat, Sun) using a box plot.
- The box shows the interquartile range (IQR), which contains the middle 50% of the data.
- The line inside the box represents the median value for each day.
- Whiskers extend to the most extreme data points that lie within 1.5 × IQR of the box edges (the default `whis=1.5`).
- Dots outside the whiskers indicate potential outliers in the dataset.
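
The pieces of a box plot can be reproduced by hand. The following is a minimal sketch, assuming the default 1.5 × IQR whisker rule and quartiles computed by linear interpolation; `box_stats` is a hypothetical helper and the sample values are made up for illustration.

```python
from statistics import quantiles


def box_stats(values):
    """Compute the quartiles, whisker ends, and outliers of a box plot."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between order statistics.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical bill amounts, made up for illustration.
bill_sample = [3, 10, 12, 14, 15, 16, 18, 20, 22, 40]
summary_stats = box_stats(bill_sample)
print(summary_stats)
```

Here the value 40 falls above the upper fence (19.5 + 1.5 × 7 = 30), so it would be drawn as a separate dot, while the upper whisker stops at 22.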

## 📈 2D. Violin Plot

In [None]:

sns.violinplot(x="day", y="total_bill", data=tips)
plt.title("Violin Plot of Total Bill by Day")
plt.show()


**Explanation:**
- Displays the distribution of total bill amounts for each day using a violin plot.
- Combines aspects of box plots and KDE (Kernel Density Estimate) plots for richer visualization.
- The width of each violin at a given value shows the density of data points at that value.
- The central box and line indicate the interquartile range and median, similar to a box plot.
- Useful for comparing both the spread and shape of the data across categories.

## 📈 2E. Bar Plot

In [None]:

# errorbar="sd" requires seaborn >= 0.12 (older versions used ci="sd")
sns.barplot(x="day", y="tip", data=tips, errorbar="sd")
plt.title("Average Tip by Day")
plt.show()


**Explanation:**
- Shows the average tip amount for each day in the dataset using a bar plot.
- The height of each bar represents the mean tip value for that day.
- Error bars indicate the standard deviation, providing a sense of variability in tips.
- Useful for comparing average tips across different days and understanding consistency.
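
The bar heights and error bars correspond to a per-category mean and standard deviation. The following is a minimal sketch of that aggregation using only the standard library. The records are made up for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (day, tip) records, made up for illustration.
records = [
    ("Thur", 2.0), ("Thur", 4.0),
    ("Fri", 3.0), ("Fri", 3.0), ("Fri", 6.0),
]

# Group the tip amounts by day.
tips_by_day = defaultdict(list)
for day, tip in records:
    tips_by_day[day].append(tip)

# Bar height = mean; error bar half-length = sample standard deviation.
summary = {day: (mean(vals), stdev(vals)) for day, vals in tips_by_day.items()}
for day, (avg, sd) in summary.items():
    print(f"{day}: mean={avg:.2f}, sd={sd:.2f}")
```

On the real data, `tips.groupby("day")["tip"].agg(["mean", "std"])` gives a comparable table.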

## 📈 2F. Count Plot

In [None]:

sns.countplot(x="day", data=tips)
plt.title("Number of Records per Day")
plt.show()


**Explanation:**
- Counts the number of records (visits) for each day using a count plot.
- The height of each bar shows how many times each day appears in the dataset.
- Useful for visualizing the frequency of categorical variables and identifying imbalances in the data.
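
A count plot is a bar chart of category frequencies. As a minimal sketch, the same tally can be computed with `collections.Counter`; the list of days below is made up for illustration.

```python
from collections import Counter

# Hypothetical values of a categorical column, made up for illustration.
days = ["Sat", "Sun", "Sat", "Thur", "Sat", "Fri", "Sun"]

# Each bar's height is the number of times the category appears.
counts = Counter(days)
for day, count in counts.most_common():
    print(f"{day}: {count}")
```

On the real data, `tips["day"].value_counts()` returns the same kind of frequency table.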

## 📈 2G. Pair Plot

In [None]:

sns.pairplot(tips, hue="sex")
plt.suptitle("Pairwise Plots by Sex", y=1.02)
plt.show()


**Explanation:**
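- Generates pairwise scatterplots for all numerical columns in the dataset, with each variable's own distribution on the diagonal (KDE curves by default when `hue` is set, histograms otherwise).
- Colors the points by the `sex` column so the two groups can be compared in every panel.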
- Helps identify relationships, trends, and potential outliers across multiple variables.
- Useful for exploratory data analysis (EDA) and understanding overall data structure.
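
The layout of the pair plot grid follows directly from the list of numerical columns. The following is a minimal sketch that enumerates the panels for the three numerical columns of `tips` (`total_bill`, `tip`, `size`); the `panels` dictionary and its labels are hypothetical names used only for illustration.

```python
from itertools import product

# Numerical columns of the tips dataset.
numeric_cols = ["total_bill", "tip", "size"]

# One panel per (row variable, column variable) combination.
# The row variable goes on the y-axis and the column variable on the x-axis.
panels = {}
for row_var, col_var in product(numeric_cols, repeat=2):
    kind = "distribution" if row_var == col_var else "scatter"
    panels[(row_var, col_var)] = kind

n_scatter = sum(1 for kind in panels.values() if kind == "scatter")
print(f"{len(panels)} panels, {n_scatter} of them scatterplots")
```

With k numerical columns the grid has k × k panels, so the figure grows quickly on wide datasets. Passing a subset of columns through the `vars` parameter of `pairplot` keeps it manageable.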

## 🎨 3. Set Seaborn Theme

In [None]:

sns.set_theme(style="darkgrid")


**Explanation:**
- Sets the default visual theme to `darkgrid` for all plots drawn after this call, including plain Matplotlib ones; figures created earlier are not restyled.
- Improves readability and aesthetics of charts by applying consistent styling.
- Makes it easier to interpret data visualizations, especially in presentations and reports.

## 🧩 4. Box Plot by Gender and Smoker

In [None]:

sns.boxplot(x="sex", y="tip", hue="smoker", data=tips)
plt.title("Tip by Gender and Smoking Status")
plt.show()


**Explanation:**
- Compares tipping behavior across gender and smoking status using a grouped box plot.
- Each box shows the distribution of tips for a specific gender and smoker/non-smoker group.
- Highlights differences in median, spread, and outliers between groups.
- Useful for visualizing how multiple categorical variables affect a numerical outcome.
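
Each box in the grouped plot summarizes one combination of the two categorical variables. The following is a minimal sketch of that grouping, computing only the median per group; the rows are made up for illustration and are not taken from `tips`.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (sex, smoker, tip) rows, made up for illustration.
rows = [
    ("Male", "Yes", 3.0), ("Male", "Yes", 5.0),
    ("Male", "No", 2.0), ("Male", "No", 3.0), ("Male", "No", 4.0),
    ("Female", "Yes", 2.5),
    ("Female", "No", 2.0), ("Female", "No", 4.0),
]

# Group tips by the (x variable, hue variable) pair: one group per box.
groups = defaultdict(list)
for sex, smoker, tip in rows:
    groups[(sex, smoker)].append(tip)

# The line inside each box is the group's median.
group_medians = {key: median(vals) for key, vals in groups.items()}
for (sex, smoker), value in group_medians.items():
    print(f"sex={sex}, smoker={smoker}: median tip = {value}")
```

On the real data, `tips.groupby(["sex", "smoker"])["tip"].median()` produces the equivalent table.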

## 🧠 Summary Table


| Function         | Purpose                                       |
|------------------|-----------------------------------------------|
| `scatterplot`    | Relationship between two numerical variables  |
| `histplot`       | Distribution of a single numerical variable   |
| `boxplot`        | Summary stats and outliers by category        |
| `violinplot`     | Distribution shape (KDE) by category          |
| `barplot`        | Mean and error bars for categories            |
| `countplot`      | Frequency counts for categories               |
| `pairplot`       | Pairwise scatterplots plus distributions      |
