# Day 3: Python for Data Analysis - Starter Notebook

Welcome to Day 3 of the Data Analysis path! This notebook provides a starting point for your exercises.

## Learning Objectives
- Work with Jupyter notebooks for interactive analysis
- Manipulate data using pandas and numpy
- Clean and preprocess data
- Visualize data using matplotlib or seaborn

## Instructions
Complete each exercise section below. Refer to the documentation in `docs/day_3_python_data_analysis.md` for detailed guidance and resources.

---
## Setup
Run the cell below to import the required libraries.

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Display settings
%matplotlib inline
pd.set_option('display.max_columns', None)

print("Libraries imported successfully!")

---
## Exercise 1: Jupyter Notebook Setup

**Deliverables:**
1. Launch a Jupyter notebook and run basic Python code.

**Success Criteria:**
- Jupyter notebook runs without errors
- Basic Python code executes successfully
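
As a warm-up, here is a minimal sketch of the kind of code this exercise has in mind. The variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical values, used only to demonstrate variables and arithmetic
passengers = 3
ticket_price = 7.25

# A simple calculation stored in a new variable
total = passengers * ticket_price

# An f-string embeds variable values in the printed text
print(f"Total fare for {passengers} passengers: {total}")
```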

Try running some basic Python code below:

In [None]:
# TODO: Write your code here
# Try basic Python operations: print statements, variables, simple calculations


---
## Exercise 2: pandas DataFrames

**Deliverables:**
1. Load a CSV file into a pandas DataFrame and perform basic analysis.

**Success Criteria:**
- DataFrame loads correctly
- `head()`, `describe()`, and `info()` run without errors and produce output for the loaded data

**Dataset:** Use the Titanic dataset located at `../data/titanic.csv`
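
Before working with the real file, it may help to see the loading and inspection calls on a tiny stand-in. The sketch below assumes a small inline CSV (the `csv_text` string and its columns are hypothetical, not the Titanic data) and reads it through `io.StringIO` so it runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a file on disk; Ben's age is missing
csv_text = """name,age,fare
Ann,29,7.25
Ben,,71.28
Cara,41,8.05
"""

# pd.read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head())      # first rows (up to five by default)
print(sample.describe())  # summary statistics for the numeric columns
sample.info()             # column dtypes and non-null counts (prints directly)
```

For the exercise itself, the same calls apply once a path string is passed to `pd.read_csv()` instead of the `StringIO` object.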

In [None]:
# TODO: Load the Titanic dataset
# Hint: Use pd.read_csv()

df = None  # Replace with your code

In [None]:
# TODO: Display the first few rows
# Hint: Use df.head()


In [None]:
# TODO: Get summary statistics
# Hint: Use df.describe()


In [None]:
# TODO: Get DataFrame info
# Hint: Use df.info()


---
## Exercise 3: numpy Arrays

**Deliverables:**
1. Use numpy to perform array operations.

**Success Criteria:**
- Arrays are created and manipulated as expected
- Basic operations work correctly
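
The following is a minimal sketch of array creation and a few basic operations. The array contents are arbitrary example values.

```python
import numpy as np

# Create a one-dimensional array from a Python list (arbitrary example values)
values = np.array([2, 4, 6, 8])

# Aggregations reduce the whole array to a single number
total = np.sum(values)
average = np.mean(values)

# Arithmetic is applied element by element, with no explicit loop
doubled = values * 2

print(total, average, doubled)
```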

In [None]:
# TODO: Create a numpy array
# Hint: Use np.array()


In [None]:
# TODO: Perform array operations (sum, mean, etc.)
# Hint: Use np.sum(), np.mean()


---
## Exercise 4: Data Cleaning

**Deliverables:**
1. Clean the dataset by handling missing values and filtering rows.

**Success Criteria:**
- Missing values are handled
- Data is filtered based on conditions
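
The cleaning steps can be sketched on a small hand-made DataFrame. Everything in `sample` is hypothetical data, and filling `age` with the median is just one possible strategy, not a recommendation for the Titanic dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one missing fare
sample = pd.DataFrame({
    "name": ["Ann", "Ben", "Cara", "Dev"],
    "age": [29, np.nan, 41, 17],
    "fare": [7.25, 71.28, 8.05, np.nan],
})

# Count missing values per column
missing_counts = sample.isnull().sum()

# Fill missing ages with the median age; fillna returns a new DataFrame
filled = sample.fillna({"age": sample["age"].median()})

# Drop any rows that still contain a missing value (here: the missing fare)
complete = filled.dropna()

# Boolean indexing keeps only the rows where the condition is True
over_30 = complete[complete["age"] > 30]

print(missing_counts)
print(over_30)
```

Note that `sample` itself is unchanged after these calls, since each step assigns its result to a new variable.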

In [None]:
# TODO: Check for missing values
# Hint: Use df.isnull().sum()


In [None]:
# TODO: Handle missing values
# Hint: Use df.dropna() or df.fillna(); both return a new DataFrame by default, so assign the result


In [None]:
# TODO: Filter data based on conditions
# Hint: Use boolean indexing like df[df['column'] > value]


---
## Exercise 5: Data Visualization

**Deliverables:**
1. Create a simple plot using matplotlib or seaborn.

**Success Criteria:**
- Plot is generated and displayed
- Visualization is readable and has a title and axis labels
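
One way to produce a labeled plot is sketched below with matplotlib only. The `sample` ages are hypothetical values, and the bin count and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical ages, used only to have something to plot
sample = pd.DataFrame({"age": [22, 38, 26, 35, 54, 2, 27, 14]})

# Create a figure and a single set of axes
fig, ax = plt.subplots(figsize=(6, 4))

# Histogram of the age column, split into 5 bins
ax.hist(sample["age"], bins=5, edgecolor="black")

# A title and axis labels make the plot self-explanatory
ax.set_title("Age distribution (sample data)")
ax.set_xlabel("Age (years)")
ax.set_ylabel("Count")
```

With `%matplotlib inline` active, the figure is rendered when the cell finishes; outside a notebook, a call to `plt.show()` would be needed.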

In [None]:
# TODO: Create a visualization
# Hint: Use plt.plot(), sns.histplot(), or df.plot()


---
## Validation Checklist

Before proceeding to Day 4, verify:
- [ ] Jupyter notebook runs without errors
- [ ] DataFrame loaded and explored
- [ ] numpy arrays created and manipulated
- [ ] Missing values handled
- [ ] Visualization created