# 🧑‍🏫 Data Storytelling Projects with Titanic & Tips Datasets


## 📌 Objective:
Explore and analyze two popular datasets – Titanic and Tips – as **two separate data storytelling projects**.  
Each project must end with a **summary dashboard** created using **matplotlib** and **seaborn**. `plotly` may optionally be used to add interactivity.


## 📁 PROJECT 1: Titanic Dataset Analysis

### 🔹 Step 1: Load & Explore Data
- Load the Titanic dataset with seaborn's `load_dataset`
- Check missing values, data types, and basic statistics

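```python
import matplotlib.pyplot as plt
import seaborn as sns

# Load the Titanic dataset bundled with seaborn (downloaded on first use).
df_titanic = sns.load_dataset("titanic")

# Column dtypes and non-null counts reveal which columns have missing values.
df_titanic.info()

# Summary statistics for the numeric columns.
df_titanic.describe()
```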

```text
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 15 columns):
 #   Column       Non-Null Count  Dtype   
---  ------       --------------  -----   
 0   survived     891 non-null    int64   
 1   pclass       891 non-null    int64   
 2   sex          891 non-null    object  
 3   age          714 non-null    float64 
 4   sibsp        891 non-null    int64   
 5   parch        891 non-null    int64   
 6   fare         891 non-null    float64 
 7   embarked     889 non-null    object  
 8   class        891 non-null    category
 9   who          891 non-null    object  
 10  adult_male   891 non-null    bool    
 11  deck         203 non-null    category
 12  embark_town  889 non-null    object  
 13  alive        891 non-null    object  
 14  alone        891 non-null    bool    
dtypes: bool(2), category(2), float64(2), int64(4), object(5)
memory usage: 80.7+ KB
```


| | survived | pclass | age | sibsp | parch | fare |
|---|---|---|---|---|---|---|
| count | 891.0 | 891.0 | 714.0 | 891.0 | 891.0 | 891.0 |
| mean | 0.383838 | 2.308642 | 29.699118 | 0.523008 | 0.381594 | 32.204208 |
| std | 0.486592 | 0.836071 | 14.526497 | 1.102743 | 0.806057 | 49.693429 |
| min | 0.0 | 1.0 | 0.42 | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 2.0 | 20.125 | 0.0 | 0.0 | 7.9104 |
| 50% | 0.0 | 3.0 | 28.0 | 0.0 | 0.0 | 14.4542 |
| 75% | 1.0 | 3.0 | 38.0 | 1.0 | 0.0 | 31.0 |
| max | 1.0 | 3.0 | 80.0 | 8.0 | 6.0 | 512.3292 |
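
The `info()` output already hints at gaps in `age`, `embarked`, `deck`, and `embark_town`. A compact table of missing counts makes that check explicit. The sketch below is one possible way to do it; `missing_report` is a hypothetical helper name, not part of pandas or seaborn.

```python
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    report = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": df.isna().mean() * 100,
    })
    # Keep only the columns that actually have gaps.
    report = report[report["n_missing"] > 0]
    return report.sort_values("n_missing", ascending=False)


# Usage with the frame loaded in Step 1:
# missing_report(df_titanic)
```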


### 🔹 Step 2: Data Cleaning
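- Handle missing values: `age` (177 missing), `embarked` (2 missing), and `deck` (688 of 891 missing, so dropping the column is worth considering)
- Convert text columns such as `sex` and `embarked` to the `category` dtype if necessary
- Filter rows or apply further cleaning as needed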
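
The cleaning steps above could be sketched roughly as follows. The choices here (median imputation for `age`, most frequent value for the port columns, dropping `deck`) are assumptions made for illustration, and `clean_titanic` is a hypothetical helper name. Other strategies may suit your story better.

```python
import pandas as pd


def clean_titanic(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the seaborn Titanic frame (one possible strategy)."""
    out = df.copy()
    # Assumption: impute missing ages with the median age.
    out["age"] = out["age"].fillna(out["age"].median())
    # Assumption: fill the few missing ports with the most frequent value.
    for col in ("embarked", "embark_town"):
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    # deck is mostly missing, so this sketch drops it rather than imputing it.
    out = out.drop(columns=["deck"])
    # Store low-cardinality text columns as categoricals.
    for col in ("sex", "embarked", "who", "embark_town", "alive"):
        out[col] = out[col].astype("category")
    return out


# Usage with the frame loaded in Step 1:
# df_titanic = clean_titanic(df_titanic)
```

Working on a copy keeps the raw frame available, which helps if you later want to compare imputed and original age distributions.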

### 🔹 Step 3: Thematic Analysis
1. **Survival rate by gender** → Bar plot `sex` vs `survived`  
2. **Survival rate by passenger class** → Stacked bar or countplot  
3. **Age distribution by survival** → KDE or histogram  
4. **Embarked location vs survival** → Countplot with hue  


### 🔹 Step 4: Write Insights
- Provide 2–4 sentences explaining each chart.
- Ask: Who had higher survival chances? Which features mattered most?
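
Insights are easier to write when each claim is backed by a number. Because `survived` is coded 0/1, its group mean is the survival rate. The sketch below computes that rate and the group size for any grouping; `survival_rates` is a hypothetical helper name.

```python
import pandas as pd


def survival_rates(df: pd.DataFrame, by) -> pd.DataFrame:
    """Survival rate and group size per group, highest rate first."""
    return (
        df.groupby(by, observed=True)["survived"]
        .agg(rate="mean", n="size")  # mean of a 0/1 column is a proportion
        .sort_values("rate", ascending=False)
    )


# Usage with the frame loaded in Step 1:
# survival_rates(df_titanic, "sex")
# survival_rates(df_titanic, ["sex", "pclass"])
```

Reporting `n` next to the rate helps avoid over-interpreting small groups.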

### 🔹 Step 5: Create Summary Dashboard
- Combine the four charts into one dashboard layout using `matplotlib` subplots.
- Annotate key insights on the charts.

📌 **Tools:** `matplotlib`, `seaborn`  
🛠️ **Optional:** `plotly` for enhanced interactivity


## 📁 PROJECT 2: Tips Dataset Analysis

### 🔹 Step 1: Load & Explore Data

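```python
import seaborn as sns  # repeated so this project runs on its own

# Load the Tips dataset bundled with seaborn.
df_tips = sns.load_dataset("tips")
df_tips.info()
df_tips.describe()
```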

```text
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(4), float64(2), int64(1)
memory usage: 7.4 KB
```


| | total_bill | tip | size |
|---|---|---|---|
| count | 244.0 | 244.0 | 244.0 |
| mean | 19.785943 | 2.998279 | 2.569672 |
| std | 8.902412 | 1.383638 | 0.9511 |
| min | 3.07 | 1.0 | 1.0 |
| 25% | 13.3475 | 2.0 | 2.0 |
| 50% | 17.795 | 2.9 | 2.0 |
| 75% | 24.1275 | 3.5625 | 3.0 |
| max | 50.81 | 10.0 | 6.0 |


### 🔹 Step 2: Thematic Analysis
1. **Total bill vs tip** → Scatterplot with trendline  
2. **Tip by gender and day** → Grouped barplot  
3. **Tip ratio (tip/total_bill) by party size** → Boxplot or barplot  
4. **Tips by time (Lunch vs Dinner)** → Violinplot or boxplot  
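
The tip ratio is not a column in the dataset, so it has to be derived first. The sketch below adds it as `tip_pct` (a name chosen here for illustration) and draws the first and third charts. `sns.regplot` supplies the scatter plus a fitted linear trendline. The function names are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.figure import Figure


def add_tip_pct(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with tip_pct = tip / total_bill."""
    return df.assign(tip_pct=df["tip"] / df["total_bill"])


def plot_tip_relationships(df: pd.DataFrame) -> Figure:
    """Scatter with trendline (bill vs tip) next to tip ratio by party size."""
    work = add_tip_pct(df)
    fig, (ax_bill, ax_size) = plt.subplots(1, 2, figsize=(11, 4))
    # regplot fits and draws a linear regression line over the scatter.
    sns.regplot(data=work, x="total_bill", y="tip",
                scatter_kws={"alpha": 0.5}, ax=ax_bill)
    ax_bill.set(xlabel="Total bill", ylabel="Tip", title="Total bill vs tip")
    sns.boxplot(data=work, x="size", y="tip_pct", ax=ax_size)
    ax_size.set(xlabel="Party size", ylabel="Tip / total bill",
                title="Tip ratio by party size")
    return fig


# Usage with the frame loaded in Step 1:
# fig = plot_tip_relationships(df_tips)
```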


### 🔹 Step 3: Write Insights
- Explain which factors affect tip amount.
- Are there differences by gender, time, or day?
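
To answer these questions with numbers rather than impressions, a grouped summary of both the absolute tip and the tip ratio can help. This is a minimal sketch; `tip_summary` and the output column names are hypothetical.

```python
import pandas as pd


def tip_summary(df: pd.DataFrame, by) -> pd.DataFrame:
    """Mean tip, mean tip ratio, and group size per group."""
    work = df.assign(tip_pct=df["tip"] / df["total_bill"])
    return work.groupby(by, observed=True).agg(
        mean_tip=("tip", "mean"),
        mean_tip_pct=("tip_pct", "mean"),
        n=("tip", "size"),
    )


# Usage with the frame loaded in Step 1:
# tip_summary(df_tips, "time")
# tip_summary(df_tips, ["day", "sex"])
```

Comparing `mean_tip` with `mean_tip_pct` shows whether a group tips more in absolute terms simply because its bills are larger.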

### 🔹 Step 4: Create Summary Dashboard
- Combine charts into a single dashboard using `matplotlib` subplots.
- Add annotations and highlights of key findings.
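
A possible layout for the Tips dashboard is sketched below, with an arrow annotation from `ax.annotate` highlighting one finding (here, the largest tip). The panel arrangement, the highlighted point, and the name `tips_dashboard` are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.figure import Figure


def tips_dashboard(df: pd.DataFrame) -> Figure:
    """Draw the four Step 2 charts on one 2x2 figure with one highlight."""
    work = df.assign(tip_pct=df["tip"] / df["total_bill"])
    fig, axes = plt.subplots(2, 2, figsize=(12, 8), constrained_layout=True)
    (ax_bill, ax_day), (ax_size, ax_time) = axes

    # Panel 1: scatter with linear trendline (ci=None skips the bootstrap band).
    sns.regplot(data=work, x="total_bill", y="tip", ci=None,
                scatter_kws={"alpha": 0.5}, ax=ax_bill)
    # Highlight the row with the largest tip using an arrow annotation.
    top = work.loc[work["tip"].idxmax()]
    ax_bill.annotate(f"largest tip: {top['tip']:.2f}",
                     xy=(top["total_bill"], top["tip"]),
                     xytext=(0.05, 0.9), textcoords="axes fraction",
                     arrowprops={"arrowstyle": "->"})
    ax_bill.set_title("Total bill vs tip")

    # Panel 2: mean tip per day, grouped by sex.
    sns.barplot(data=work, x="day", y="tip", hue="sex", ax=ax_day)
    ax_day.set_title("Mean tip by day and sex")

    # Panel 3: tip ratio by party size.
    sns.boxplot(data=work, x="size", y="tip_pct", ax=ax_size)
    ax_size.set_title("Tip ratio by party size")

    # Panel 4: tip distribution at lunch vs dinner.
    sns.violinplot(data=work, x="time", y="tip", ax=ax_time)
    ax_time.set_title("Tips by time of day")

    fig.suptitle("Tips: what drives the tip?")
    return fig


# Usage with the frame loaded in Step 1 (the file name is only an example):
# fig = tips_dashboard(df_tips)
# fig.savefig("tips_dashboard.png", dpi=150)
```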

📌 **Tools:** `matplotlib`, `seaborn`  
🛠️ **Optional:** `plotly` for interactive visualizations


## 📦 SUBMISSION REQUIREMENTS
Each project should be submitted in a separate folder:
- 📄 `project_report.md` or PDF: Summary of analysis and insights
- 📊 Code file (`.py` or `.ipynb`)
- 📷 Dashboard image (PNG/JPG)

**Evaluation Criteria:**
- Correct data analysis process  
- Clear and well-labeled visualizations  
- Meaningful insights  
- Organized and readable dashboard
