# DAY 6: Jupyter Notebooks - Interactive Python

Jupyter notebooks are well suited to:
- Data analysis
- Learning Python
- Creating reports
- Visualizing data
- Running code step-by-step

## What is a Jupyter Notebook?
A Jupyter notebook combines:
- **Code cells** - Run Python code
- **Markdown cells** - Write formatted text, notes, explanations
- **Output** - See results immediately below each cell

Unlike running an entire .py file, you can execute and re-run one cell at a time, which makes it easy to inspect intermediate results.
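
To make the cell idea concrete, here is a minimal sketch of how a notebook is stored on disk. A `.ipynb` file is JSON containing a list of cells, each tagged with a `cell_type`. The cell contents below are made up for illustration, and a real notebook usually carries more metadata than this.

```python
import json

# A minimal notebook in the nbformat 4 layout (illustrative content only)
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        # Markdown cell: formatted text, notes, explanations
        {"cell_type": "markdown", "metadata": {}, "source": ["# My notes"]},
        # Code cell: source plus the outputs produced when it runs
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # stays null until the cell is run
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# Round-trip through JSON text, as Jupyter does when saving and opening a file
text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(text)["cells"]]
print(cell_types)
```

Because the format is plain JSON, notebooks can be inspected with ordinary tools, although editing them through Jupyter itself is the safer route.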

## Running This Notebook

### Step 1: Install Jupyter
```bash
pip install jupyter
```

### Step 2: Start Jupyter
```bash
jupyter notebook
```

### Step 3: Open this file
Navigate to `day6_jupyter/python_basics.ipynb`

### Step 4: Run cells
- Click a cell and press `Shift + Enter` to run it
- Or click the play button ▶

### Step 5: See the output
Output appears directly below the cell!

---
# Let's Start: Loading and Displaying Data

In [None]:
# First, let's import the libraries we'll use
import pandas as pd
import csv

print("✓ Libraries imported successfully")

## Create Sample Data

Let's create a sample CSV of WindshieldHub orders, then load it into a pandas DataFrame.

In [None]:
# Create sample data if it doesn't exist
import os

# Check if CSV exists, if not create it
csv_file = "orders_sample.csv"

if not os.path.exists(csv_file):
    # Create sample CSV
    orders_data = [
        ["order_id", "customer_name", "service_type", "city", "amount", "status", "date"],
        [1, "Ali Hassan", "windshield_replacement", "Lahore", 3500, "completed", "2024-01-15"],
        [2, "Fatima Khan", "windshield_repair", "Karachi", 1500, "pending", "2024-01-20"],
        [3, "Hassan Ali", "windshield_replacement", "Islamabad", 3500, "completed", "2024-01-18"],
        [4, "Ayesha Malik", "windshield_repair", "Lahore", 1500, "in_progress", "2024-01-22"],
        [5, "Muhammad Karim", "windshield_replacement", "Rawalpindi", 3500, "pending", "2024-01-25"],
        [6, "Sara Ahmed", "windshield_replacement", "Karachi", 3500, "completed", "2024-01-19"],
        [7, "Usman Khan", "windshield_repair", "Islamabad", 1500, "completed", "2024-01-21"],
        [8, "Zainab Ali", "windshield_replacement", "Lahore", 3500, "pending", "2024-01-24"],
        [9, "Omar Hassan", "windshield_repair", "Rawalpindi", 1500, "in_progress", "2024-01-23"],
    ]
    
    with open(csv_file, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerows(orders_data)
    
    print(f"✓ Created {csv_file}")
else:
    print(f"✓ {csv_file} already exists")
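
Before moving to pandas, it can help to see what the `csv` module gives back when the file is read again. The sketch below writes a shortened copy of the data to a temporary file (the file name `orders_check.csv` is a placeholder, chosen so that `orders_sample.csv` is left untouched) and reads it with `csv.DictReader`.

```python
import csv
import os
import tempfile

rows = [
    ["order_id", "customer_name", "amount"],
    [1, "Ali Hassan", 3500],
    [2, "Fatima Khan", 1500],
]

# Write to a temporary folder so the real sample file is not overwritten
path = os.path.join(tempfile.mkdtemp(), "orders_check.csv")
with open(path, "w", newline="") as file:
    csv.writer(file).writerows(rows)

# Read it back: DictReader uses the header row as dictionary keys
with open(path, newline="") as file:
    records = list(csv.DictReader(file))

print(records[0])
print(type(records[0]["amount"]))  # the csv module returns every value as a string
```

Every value comes back as text, so numeric work would need manual conversion. This is one reason the notebook switches to pandas for loading.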

## Load Data into Pandas DataFrame

In [None]:
# Load CSV into a pandas DataFrame
df = pd.read_csv(csv_file)

print(f"✓ Loaded {len(df)} orders into DataFrame")
print("\nDataFrame info:")
print(f"  Rows: {len(df)}")
print(f"  Columns: {len(df.columns)}")
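
By default `pd.read_csv` loads the `date` column as plain text. As a sketch of one option, the `parse_dates` parameter asks pandas to convert it to real dates while loading. The example uses a two-row inline CSV through `io.StringIO` so it runs without the sample file.

```python
import io

import pandas as pd

# Inline stand-in for the first rows of the sample CSV
csv_text = """order_id,customer_name,amount,date
1,Ali Hassan,3500,2024-01-15
2,Fatima Khan,1500,2024-01-20
"""

# parse_dates converts the listed columns to datetime values
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(sample.dtypes)

# Datetime columns expose helpers through the .dt accessor
weekdays = sample["date"].dt.day_name()
print(weekdays)
```

With a datetime column, later steps such as sorting by date or grouping by weekday work on real dates instead of strings.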

## Display Data

### First 10 records

In [None]:
# Display up to the first 10 rows (the sample has only 9, so all are shown)
df.head(10)

### Display last 5 records

In [None]:
# Display last 5 rows
df.tail(5)

### Info about DataFrame

In [None]:
# Get info about DataFrame
df.info()
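
Two quick checks complement `df.info()`: `shape` for the size and `isna().sum()` for missing values per column. The small DataFrame below is invented for illustration and deliberately includes one missing city. The notebook's sample data has no gaps.

```python
import pandas as pd

# Hypothetical mini dataset with one missing value
sample = pd.DataFrame({
    "order_id": [1, 2, 3],
    "city": ["Lahore", None, "Karachi"],
    "amount": [3500, 1500, 3500],
})

# (rows, columns)
print(sample.shape)

# Number of missing values in each column
missing = sample.isna().sum()
print(missing)
```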

---
# Data Analysis Examples

## Basic Statistics

In [None]:
# Get statistics about numeric columns
df.describe()

## Filter Data

In [None]:
# Get only completed orders
completed_orders = df[df['status'] == 'completed']

print(f"üìä Completed Orders: {len(completed_orders)}")
print(f"\nCompleted orders:")
print(completed_orders[['order_id', 'customer_name', 'amount']])
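
The same boolean-mask pattern extends to several conditions. This sketch uses a four-row subset of the orders and shows two common forms: combining conditions with `&`, and matching a list of values with `isin()`.

```python
import pandas as pd

# Four rows taken from the sample orders
orders = pd.DataFrame({
    "order_id": [1, 2, 4, 8],
    "city": ["Lahore", "Karachi", "Lahore", "Lahore"],
    "amount": [3500, 1500, 1500, 3500],
    "status": ["completed", "pending", "in_progress", "pending"],
})

# Each condition needs its own parentheses; & means AND, | means OR
lahore_open = orders[(orders["city"] == "Lahore") & (orders["status"] != "completed")]

# isin() keeps rows whose value appears in the given list
unfinished = orders[orders["status"].isin(["pending", "in_progress"])]

print(lahore_open)
print(unfinished)
```

Python's `and` and `or` keywords do not work on whole columns, which is why pandas uses `&` and `|` here.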

## Get Specific Columns

In [None]:
# Get only some columns
customer_info = df[['customer_name', 'city', 'amount']]
customer_info

## Aggregations (Totals, Counts)

In [None]:
# Calculate totals
print("üìä Revenue Statistics:")
print(f"  Total Revenue: Rs.{df['amount'].sum()}")
print(f"  Average Order: Rs.{df['amount'].mean():.0f}")
print(f"  Min Order: Rs.{df['amount'].min()}")
print(f"  Max Order: Rs.{df['amount'].max()}")
print(f"\n  Total Orders: {len(df)}")
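
As an alternative to calling each method separately, `agg()` can compute several statistics in one call. The sketch below repeats the nine order amounts from the sample data so it runs on its own.

```python
import pandas as pd

# The nine order amounts from the sample data
amounts = pd.Series(
    [3500, 1500, 3500, 1500, 3500, 3500, 1500, 3500, 1500], name="amount"
)

# One call, several statistics; the result is indexed by statistic name
summary = amounts.agg(["sum", "mean", "min", "max"])
print(summary)

print(f"Total Revenue: Rs.{summary['sum']:.0f}")
```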

## Group By City

In [None]:
# Group by city and calculate totals
city_stats = df.groupby('city')['amount'].sum().sort_values(ascending=False)

print("üìä Revenue by City:")
for city, revenue in city_stats.items():
    print(f"  {city}: Rs.{revenue}")
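
A group can report more than one number at a time. This sketch uses named aggregation, where each keyword passed to `agg()` becomes an output column. The column names `total`, `orders`, and `average` are chosen here for illustration.

```python
import pandas as pd

# City and amount columns from the nine sample orders
orders = pd.DataFrame({
    "city": ["Lahore", "Karachi", "Islamabad", "Lahore", "Rawalpindi",
             "Karachi", "Islamabad", "Lahore", "Rawalpindi"],
    "amount": [3500, 1500, 3500, 1500, 3500, 3500, 1500, 3500, 1500],
})

# keyword = output column name, value = aggregation to apply
city_summary = (
    orders.groupby("city")["amount"]
    .agg(total="sum", orders="count", average="mean")
    .sort_values("total", ascending=False)
)

print(city_summary)
```

The result is a DataFrame with one row per city, which displays as a table when left as the last line of a cell.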

## Group By Status

In [None]:
# Count orders by status
status_counts = df['status'].value_counts()

print("üìä Orders by Status:")
for status, count in status_counts.items():
    print(f"  {status}: {count} orders")
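
Counts are often easier to read as shares. Passing `normalize=True` to `value_counts()` returns fractions instead of counts. The sketch repeats the nine status values from the sample data.

```python
import pandas as pd

# Status column from the nine sample orders
status = pd.Series([
    "completed", "pending", "completed", "in_progress", "pending",
    "completed", "completed", "pending", "in_progress",
])

# normalize=True divides each count by the total number of rows
share = status.value_counts(normalize=True)

for name, fraction in share.items():
    print(f"  {name}: {fraction:.0%}")
```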

---
# Practice: Data Visualization

Jupyter is great for inline charts! Let's try some visualizations.

In [None]:
import matplotlib.pyplot as plt

# Simple bar chart of orders by city
city_counts = df['city'].value_counts()

plt.figure(figsize=(10, 5))
city_counts.plot(kind='bar', color='steelblue')
plt.title('Orders by City')
plt.xlabel('City')
plt.ylabel('Number of Orders')
plt.tight_layout()
plt.show()

print("✓ Chart displayed above!")
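
A chart can also be written to an image file for use in a report. The sketch below plots revenue per city (totals taken from the sample data) and saves it with `savefig`. The file name `revenue_by_city.png` and the temporary folder are placeholders.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import pandas as pd

# Revenue totals per city, computed from the sample orders
revenue = pd.Series(
    {"Lahore": 8500, "Karachi": 5000, "Islamabad": 5000, "Rawalpindi": 5000}
)

# Build the chart on an explicit figure and axes
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(revenue.index, revenue.values, color="steelblue")
ax.set_title("Revenue by City")
ax.set_xlabel("City")
ax.set_ylabel("Revenue (Rs.)")
fig.tight_layout()

# Save as PNG; placeholder location in the system temp folder
chart_path = os.path.join(tempfile.gettempdir(), "revenue_by_city.png")
fig.savefig(chart_path, dpi=100)
print(f"Saved chart to {chart_path}")
```

In a notebook the figure typically still renders below the cell, so saving does not replace the inline view.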

---
# Key Takeaways

- ✅ Jupyter notebooks are **interactive**: run code cell by cell
- ✅ You can see **results immediately** below each cell
- ✅ Perfect for **data analysis** and **learning**
- ✅ Can combine **code** + **text** + **charts** in one document
- ✅ Easy to **document** your work

## Next Steps

1. Run this notebook: `jupyter notebook day6_jupyter/python_basics.ipynb`
2. Execute each cell (Shift + Enter)
3. Modify the code and experiment
4. Add your own cells with your own analysis

**Day 7 is the mini-project where you'll use everything you've learned!**