# 🛠️ Setup Instructions
To run this notebook locally, you'll need Python and Jupyter Notebook installed.

**Step 1: Install Python**  
Download and install Python from the official site: https://www.python.org/downloads/

**Step 2: Install Jupyter Notebook**  
After Python is installed, run these commands in a terminal. The first installs Jupyter Notebook and the second starts it in your browser:
```bash
pip install notebook
jupyter notebook
```
Or use [Google Colab](https://colab.research.google.com/) to run notebooks online without installing anything!

## 📦 Python Data Structures
**Common built-in data structures:**
- `list`: Ordered, mutable collection: `[1, 2, 3]`
- `tuple`: Ordered, immutable collection: `(1, 2, 3)`
- `set`: Unordered, unique elements: `{1, 2, 3}`
- `dict`: Key-value pairs: `{'name': 'Alice', 'age': 25}`

In [None]:
# Examples of basic data structures
my_list = [10, 20, 30]
my_tuple = (10, 20, 30)
my_set = {10, 20, 30}
my_dict = {'name': 'Alice', 'age': 25}

print("List:", my_list)
print("Tuple:", my_tuple)
print("Set:", my_set)
print("Dictionary:", my_dict)
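
The differences between these structures show up as soon as you try to modify them. Here is a minimal sketch; the variable names and values are illustrative only and are not part of the workshop dataset:

```python
# Lists are mutable: elements can be added or changed in place
scores = [10, 20, 30]
scores.append(40)

# Tuples are immutable: assigning to an element raises TypeError
point = (10, 20, 30)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only one copy of each value
unique_values = set([10, 20, 20, 30, 30])

# Dicts map keys to values; assigning to a new key adds an entry
person = {'name': 'Alice', 'age': 25}
person['city'] = 'Paris'  # illustrative value

print("List after append:", scores)
print("Tuple changed?", tuple_changed)
print("Unique values:", sorted(unique_values))
print("Dict after adding a key:", person)
```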

# Day 1: Python for Data Science – The Groundwork
Welcome to Day 1 of the Data Science with Python workshop!

**Objectives:**
- Understand Python basics
- Learn about NumPy and Pandas
- Load and explore a dataset
- Perform basic data analysis


In [None]:
# Basic Python: Variables, Lists, Loops, and Functions
my_list = [1, 2, 3, 4, 5]
print("List elements:", my_list)

def square(x):
    return x * x

squared_list = [square(i) for i in my_list]
print("Squared List:", squared_list)
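
The cell above builds the squared list with a list comprehension. For comparison, the same result can be written as an explicit `for` loop. This is a small sketch; `squared_loop` and `squared_comprehension` are names introduced here for illustration:

```python
def square(x):
    return x * x

my_list = [1, 2, 3, 4, 5]

# Explicit loop: start with an empty list and append one result at a time
squared_loop = []
for value in my_list:
    squared_loop.append(square(value))

# List comprehension: the same logic in a single expression
squared_comprehension = [square(value) for value in my_list]

print("Loop result:", squared_loop)
print("Same as comprehension?", squared_loop == squared_comprehension)
```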

## NumPy Basics
NumPy provides N-dimensional arrays with a single data type, plus the numerical operations that work on them.

**Why NumPy?**
- Faster and more efficient than Python lists for numerical computations
- Supports powerful operations like broadcasting and matrix manipulation

In [None]:
import numpy as np

# NumPy arrays are more efficient than Python lists for numerical operations
arr = np.array([1, 2, 3, 4, 5])
print("NumPy Array:", arr)
print("Mean:", np.mean(arr))
# np.std returns the population standard deviation (ddof=0) by default
print("Standard Deviation:", np.std(arr))

# Element-wise operations
arr2 = arr * 2
print("Doubled Array:", arr2)

# Matrix operations
matrix = np.array([[1, 2], [3, 4]])
print("Matrix:", matrix, sep="\n")
print("Transpose:", matrix.T)
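
Broadcasting and matrix multiplication are mentioned above but not shown. The sketch below reuses the same 2x2 matrix; `row`, `shifted`, `product`, and `elementwise` are names introduced here for illustration:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])
row = np.array([10, 20])

# Broadcasting: the 1-D array (shape (2,)) is applied to every row of the 2x2 matrix
shifted = matrix + row

# Matrix multiplication uses the @ operator
product = matrix @ matrix.T

# The * operator multiplies element by element instead
elementwise = matrix * matrix

print("Broadcast sum:", shifted, sep="\n")
print("Matrix product:", product, sep="\n")
print("Element-wise product:", elementwise, sep="\n")
```

Note the difference between `@` and `*`: mixing them up is a common source of silent errors, because both run without complaint on square matrices.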

## Pandas Basics
Pandas helps us work with tabular data like CSV files.

**Why Pandas?**
- Handles structured data efficiently (like spreadsheets)
- Makes data cleaning, transformation, and analysis easy
- Integrates well with visualization and ML libraries

In [None]:
import pandas as pd

# Load the Titanic dataset
url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)
df.head()

## Exploring the Data
Let's look at the basic structure and summary statistics.

In [None]:
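# A notebook cell only auto-displays its last expression, so print each result
df.info()  # prints column types and non-null counts itself
print(df.describe())  # summary statistics for numeric columns
print(df.isnull().sum())  # number of missing values per column
print(df.groupby('Pclass')['Survived'].mean())  # survival rate per class
print(df.sort_values(by='Age').head())  # five youngest passengers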
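
To see what these calls do, it can help to run them on a table small enough to check by hand. The sketch below uses a made-up four-row DataFrame (`toy`) with the same column names as the Titanic data; the values are invented for illustration and are not taken from the dataset:

```python
import pandas as pd

# Made-up data, small enough to verify by hand
toy = pd.DataFrame({
    'Pclass': [1, 1, 3, 3],
    'Survived': [1, 0, 0, 0],
    'Age': [38.0, None, 22.0, 4.0],
})

# One missing value in Age, none elsewhere
missing_per_column = toy.isnull().sum()

# Mean of a 0/1 column is the share of 1s in each group
survival_by_class = toy.groupby('Pclass')['Survived'].mean()

# Ascending sort; rows with a missing Age are placed last by default
youngest_first = toy.sort_values(by='Age')

print(missing_per_column)
print(survival_by_class)
print(youngest_first)
```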

## Basic Analysis with Pandas
How many people survived? What is the average age?

In [None]:
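# Print each result; a notebook cell only auto-displays its last expression
print(df['Survived'].value_counts())  # 0 = did not survive, 1 = survived
print("Average age:", df['Age'].mean())  # missing ages are skipped
print(df.groupby('Sex')['Survived'].mean())  # survival rate by sex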
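
Two details of these calls are worth knowing: `value_counts(normalize=True)` returns proportions instead of counts, and `mean()` skips missing values by default. The sketch below shows both on a made-up DataFrame (`toy`); the values are invented for illustration:

```python
import pandas as pd

# Made-up data with one missing age
toy = pd.DataFrame({
    'Sex': ['male', 'female', 'female', 'male'],
    'Survived': [0, 1, 1, 0],
    'Age': [22.0, 38.0, None, 35.0],
})

# Counts per value, then the same as proportions
survived_counts = toy['Survived'].value_counts()
survived_share = toy['Survived'].value_counts(normalize=True)

# The missing age is ignored: (22 + 38 + 35) / 3
average_age = toy['Age'].mean()

print(survived_counts)
print(survived_share)
print("Average age:", average_age)
```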

## ✏️ Exercises
Try solving these on your own:
1. Count the number of male and female passengers.
2. Find the average fare paid by each passenger class.
3. Plot the age distribution of passengers using a histogram.
4. Calculate the survival rate for passengers under 18 years old.
5. Find the top 5 oldest passengers and their survival status.
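
As a hint for the filtering and ranking exercises, here is a sketch of boolean masking and `nlargest` on a made-up DataFrame (`toy`). The values are invented, so you will still need to apply the same ideas to `df` yourself:

```python
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({
    'Age': [4.0, 17.0, 30.0, 62.0],
    'Survived': [1, 0, 1, 0],
})

# A comparison on a column gives a True/False mask, one entry per row
is_minor = toy['Age'] < 18

# Indexing with the mask keeps only the rows where it is True
minors = toy[is_minor]
minor_survival_rate = minors['Survived'].mean()

# nlargest returns the rows with the largest values in the given column
two_oldest = toy.nlargest(2, 'Age')

print("Survival rate under 18:", minor_survival_rate)
print(two_oldest)
```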