# Lesson 04 Walkthrough: SQL Basics Examples

## Database Applications Development

**Purpose:** This notebook shows a few example queries to get you started.

**Most of your work will be in the TASK file!** This is just to see how it's done.

### What You'll See:
- SELECT examples
- WHERE examples  
- ORDER BY examples
- Combined queries
- pandas comparisons

### Then You'll Do:
- Open `dbApps04_Task_TrackA.ipynb` or `dbApps04_Task_TrackB.ipynb`
- Complete the exercises
- Use the SQL Reference Guide as needed

Let's see some examples! 

---

## Setup: Import, Connect, Load Dataset, Convert Dataset to Database

In [None]:
import pandas as pd
import sqlite3

# Connect to the database created in Lesson 03 (sqlite3 creates the file if it does not exist yet)
conn = sqlite3.connect('titanic.db')

print("✅ Connected to titanic.db")

In [None]:
# Load the Titanic dataset from CSV
titanic_df = pd.read_csv('Titanic Dataset.csv')

# Display basic information about the dataset
print(f"Dataset shape: {titanic_df.shape[0]} rows, {titanic_df.shape[1]} columns")
print(f"\nColumn names: {list(titanic_df.columns)}")

In [None]:
# Let's look at the first few rows
titanic_df.head()

In [None]:
# Convert the DataFrame to a SQL table
titanic_df.to_sql(
    name='passengers',           # The name of the SQL table we want to create in the database
    con=conn,                    # The active database connection (created with sqlite3.connect)
    if_exists='replace',         # If a table named 'passengers' already exists, delete it and write a new one
    index=False                  # Don't store the DataFrame's index as its own column in the database
)


print("Data written to database (no errors reported).")
print(f"Expected rows written: {len(titanic_df)}")
print("Next steps will validate/verify the new db using SQL queries!")
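
A quick way to verify the write is to count the rows in the new table and compare that number with the DataFrame length. The block below is a minimal sketch of that check, not the notebook's own verification step: it uses a hypothetical in-memory database with three made-up rows (`conn_check` and `sample_rows` are illustration names) so it runs without `titanic.db`.

```python
import sqlite3

# Hypothetical stand-in for titanic.db: an in-memory database with made-up rows.
conn_check = sqlite3.connect(":memory:")
conn_check.execute("CREATE TABLE passengers (name TEXT, age REAL, sex TEXT)")
sample_rows = [
    ("Passenger A", 22.0, "male"),
    ("Passenger B", 38.0, "female"),
    ("Passenger C", None, "male"),   # missing age is stored as NULL
]
conn_check.executemany("INSERT INTO passengers VALUES (?, ?, ?)", sample_rows)

# COUNT(*) counts every row, including rows that contain NULL values.
row_count = conn_check.execute("SELECT COUNT(*) FROM passengers").fetchone()[0]
print(f"Rows in table: {row_count} (expected {len(sample_rows)})")
conn_check.close()
```

In this notebook the same idea would be `pd.read_sql("SELECT COUNT(*) AS n FROM passengers", conn)` compared against `len(titanic_df)`.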

---

## Example 1: SELECT Specific Columns

**Goal:** Get just the Name and Age columns

In [None]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
LIMIT 5
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

In [None]:
# pandas comparison
titanic = pd.read_csv('Titanic Dataset.csv')
# pandas column labels are case-sensitive ('name', not 'Name'); SQLite column names are not
result_pandas = titanic[['name', 'age']].head(5)
print("pandas Result (same thing!):")
display(result_pandas)

**Notice:** Same result! Just different syntax.
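
The SQL query spells the columns `Name` and `Age`, while the pandas code uses `name` and `age`. That works because SQLite matches column names case-insensitively. A minimal sketch, assuming a hypothetical one-row in-memory table with lowercase column names:

```python
import sqlite3

# Hypothetical table whose columns are declared in lowercase.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, age REAL)")
demo.execute("INSERT INTO passengers VALUES ('Passenger A', 22.0)")

# SQLite resolves column names without regard to case,
# so both spellings select the same columns.
lower = demo.execute("SELECT name, age FROM passengers").fetchall()
mixed = demo.execute("SELECT Name, AGE FROM passengers").fetchall()
print(lower == mixed)
demo.close()
```

pandas offers no such leniency: a label that differs only in case raises a `KeyError`.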

---

## Example 2: WHERE Clause - Filtering

**Goal:** Find passengers older than 30

In [None]:
# SQL way
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Age > 30
LIMIT 10
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

In [None]:
# pandas comparison
result_pandas = titanic[titanic['age'] > 30][['name', 'age', 'sex']].head(10)
print("pandas Result (same thing!):")
display(result_pandas)

**Key Point:**
- pandas: `df[df['age'] > 30]`
- SQL: `WHERE Age > 30`
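
When the threshold comes from a Python variable, it is safer to pass it as a query parameter than to paste it into the SQL string. The sketch below shows the `?` placeholder style of the standard `sqlite3` module on a hypothetical in-memory table; `min_age` and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data, not the real Titanic table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, age REAL)")
demo.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [("Passenger A", 22.0), ("Passenger B", 38.0), ("Passenger C", 54.0)],
)

min_age = 30  # hypothetical threshold supplied from Python

# The ? placeholder is filled in by sqlite3 from the tuple of parameters.
older = demo.execute(
    "SELECT name, age FROM passengers WHERE age > ? ORDER BY age",
    (min_age,),
).fetchall()
print(older)
demo.close()
```

`pd.read_sql` accepts the same values through its `params` argument, for example `pd.read_sql(query, conn, params=(min_age,))`.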

---

## Example 3: Text Filtering (Remember the Quotes!)

**Goal:** Find all male passengers

In [None]:
# SQL way - notice the single quotes around 'male'
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Sex = 'male'
LIMIT 5
"""

result = pd.read_sql(query, conn)
print("Male passengers:")
display(result)

**Critical:** Text values MUST be wrapped in single quotes!
- ✅ `WHERE Sex = 'male'`
- ❌ `WHERE Sex = male` (Error! Without quotes, SQLite looks for a column named `male`.)
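
To see what the error looks like without breaking the real notebook, the sketch below runs both versions against a hypothetical one-row in-memory table and captures the exception text.

```python
import sqlite3

# Hypothetical one-row table for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, sex TEXT)")
demo.execute("INSERT INTO passengers VALUES ('Passenger A', 'male')")

# Without quotes, male is parsed as a column name, which does not exist.
try:
    demo.execute("SELECT name FROM passengers WHERE sex = male")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print("Unquoted:", error_message)

# With single quotes, 'male' is a text value and the query succeeds.
quoted = demo.execute("SELECT name FROM passengers WHERE sex = 'male'").fetchall()
print("Quoted:", quoted)
demo.close()
```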

---

## Example 4: Multiple Conditions with AND

**Goal:** Find female passengers over age 30

In [None]:
# SQL way
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Age > 30 AND Sex = 'female'
LIMIT 10
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

In [None]:
# pandas comparison
result_pandas = titanic[(titanic['age'] > 30) & (titanic['sex'] == 'female')][['name', 'age', 'sex']].head(10)
print("pandas Result:")
display(result_pandas)

**Key Point:**
- pandas: `(condition1) & (condition2)`
- SQL: `condition1 AND condition2`

---

## Example 5: ORDER BY - Sorting

**Goal:** Sort passengers by age (youngest first)

In [None]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
WHERE Age IS NOT NULL
ORDER BY Age
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Youngest passengers:")
display(result)

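**Note:** `WHERE Age IS NOT NULL` filters out passengers with a missing age. Without it, SQLite would list the missing (`NULL`) ages first, because it sorts `NULL` before any other value in ascending order.

**Key Point:**
- pandas: `df.sort_values('age')`
- SQL: `ORDER BY Age`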
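
The effect of the filter can be checked on a tiny table. This is a minimal sketch, assuming a hypothetical in-memory table with three made-up passengers, one of whom has no recorded age.

```python
import sqlite3

# Hypothetical sample data; Passenger B has a missing (NULL) age.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, age REAL)")
demo.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [("Passenger A", 22.0), ("Passenger B", None), ("Passenger C", 4.0)],
)

# Ascending sort without a filter: the NULL age comes first.
unfiltered = demo.execute(
    "SELECT name, age FROM passengers ORDER BY age"
).fetchall()

# Same sort with the filter: only rows with a known age remain.
filtered = demo.execute(
    "SELECT name, age FROM passengers WHERE age IS NOT NULL ORDER BY age"
).fetchall()

print(unfiltered)
print(filtered)
demo.close()
```

By default pandas does the opposite: `sort_values` places missing values last.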

---

## Example 6: ORDER BY DESC - Reverse Sort

**Goal:** Sort by age (oldest first)

In [None]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
WHERE Age IS NOT NULL
ORDER BY Age DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Oldest passengers:")
display(result)

**Key Point:** Add `DESC` for descending order (high to low)

---

## Example 7: Putting It All Together

**Goal:** Find male passengers under 18, sorted by age (oldest first)

In [None]:
# SQL way - using all clauses!
query = """
SELECT Name, Age, Sex, Pclass
FROM passengers
WHERE Sex = 'male' AND Age < 18
ORDER BY Age DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Young male passengers (oldest first):")
display(result)

**Notice the query structure:**
1. SELECT - which columns
2. FROM - which table
3. WHERE - filter rows
4. ORDER BY - sort results
5. LIMIT - control size
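
The clause order above is required by the SQL grammar, not just a convention. As a rough illustration, the sketch below runs a correctly ordered query and a misordered one against a hypothetical in-memory table and records the syntax error the second one raises.

```python
import sqlite3

# Hypothetical sample data for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, age REAL)")
demo.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [("Passenger A", 4.0), ("Passenger B", 16.0),
     ("Passenger C", 30.0), ("Passenger D", 11.0)],
)

# Clauses in the required order: SELECT, FROM, WHERE, ORDER BY, LIMIT.
valid = "SELECT name, age FROM passengers WHERE age < 18 ORDER BY age DESC LIMIT 2"
rows = demo.execute(valid).fetchall()
print(rows)

# WHERE placed after ORDER BY is rejected by the SQL parser.
misordered = "SELECT name, age FROM passengers ORDER BY age DESC WHERE age < 18 LIMIT 2"
try:
    demo.execute(misordered)
    order_error = None
except sqlite3.OperationalError as exc:
    order_error = str(exc)
print(order_error)
demo.close()
```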

---

## Example 8: OR Condition

**Goal:** Find passengers in first OR second class

In [None]:
# SQL way
query = """
SELECT Name, Pclass, Fare
FROM passengers
WHERE Pclass = 1 OR Pclass = 2
ORDER BY Pclass, Fare DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Upper class passengers:")
display(result)

**Key Point:** OR means "either condition can be true". Also note that `DESC` applies only to `Fare` here; `Pclass` still sorts ascending.
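
Once `OR` is mixed with `AND` in the same `WHERE` clause, parentheses matter, because SQL evaluates `AND` before `OR`. The following sketch compares the two readings on a hypothetical in-memory table of four made-up passengers.

```python
import sqlite3

# Hypothetical sample data for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, pclass INTEGER, sex TEXT)")
demo.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Passenger A", 1, "male"), ("Passenger B", 2, "female"),
     ("Passenger C", 2, "male"), ("Passenger D", 3, "female")],
)

# Without parentheses this reads as: pclass = 1 OR (pclass = 2 AND sex = 'female')
no_parens = demo.execute(
    "SELECT name FROM passengers "
    "WHERE pclass = 1 OR pclass = 2 AND sex = 'female' ORDER BY name"
).fetchall()

# Parentheses force the OR to be evaluated first.
with_parens = demo.execute(
    "SELECT name FROM passengers "
    "WHERE (pclass = 1 OR pclass = 2) AND sex = 'female' ORDER BY name"
).fetchall()

print(no_parens)
print(with_parens)
demo.close()
```

The first query also returns the first-class male passenger, which is probably not what "female passengers in first or second class" was meant to select.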

---

## Your Turn in the Task File!

In [None]:
# Always close your connection when done
conn.close()
print("✅ Connection closed")
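
An alternative to calling `close()` by hand is to let Python close the connection automatically. The sketch below uses `contextlib.closing` from the standard library on a hypothetical in-memory database. Note that a plain `with conn:` block on a `sqlite3` connection only commits or rolls back the transaction; it does not close the connection.

```python
import sqlite3
from contextlib import closing

# closing() calls demo.close() when the with-block ends, even after an error.
with closing(sqlite3.connect(":memory:")) as demo:
    demo.execute("CREATE TABLE passengers (name TEXT)")
    demo.execute("INSERT INTO passengers VALUES ('Passenger A')")
    names = demo.execute("SELECT name FROM passengers").fetchall()
print(names)

# Any further use of the closed connection raises an error.
try:
    demo.execute("SELECT 1")
    closed_error = None
except sqlite3.ProgrammingError as exc:
    closed_error = str(exc)
print(closed_error)
```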

---

## Summary

You just saw examples of:
- ✅ SELECT specific columns
- ✅ WHERE filtering (numbers and text)
- ✅ Multiple conditions (AND, OR)
- ✅ ORDER BY sorting (ascending by default, `DESC` for descending)
- ✅ LIMIT controlling results
- ✅ Combining all clauses together

### Next Steps:

1. **Keep the SQL Reference Guide open** - you'll need it!
2. **Open your task file:**
   - Track A: `dbApps04_Task_TrackA.ipynb`
   - Track B: `dbApps04_Task_TrackB.ipynb`
3. **Complete the exercises** - write actual queries!
4. **Submit via GitHub** when done

### Remember:
- You already know these concepts from pandas
- You're just learning SQL syntax
- It's okay to look things up - that's what professionals do!
- Build queries step by step
- Test after each clause
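
One possible way to practice the last two points is to grow a query one clause at a time and check the row count after each step. This is only a sketch on a hypothetical in-memory table; the `steps` list and sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE passengers (name TEXT, age REAL, sex TEXT)")
demo.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Passenger A", 22.0, "male"), ("Passenger B", 38.0, "female"),
     ("Passenger C", 4.0, "male")],
)

# Each step adds exactly one clause to the previous query.
base = "SELECT name, age, sex FROM passengers"
steps = [
    base,
    base + " WHERE sex = 'male'",
    base + " WHERE sex = 'male' ORDER BY age",
    base + " WHERE sex = 'male' ORDER BY age LIMIT 1",
]

# Run every step and record how many rows it returns.
counts = []
for query in steps:
    rows = demo.execute(query).fetchall()
    counts.append(len(rows))
    print(f"{len(rows)} row(s): {query}")
demo.close()
```

If a step returns an unexpected count or an error, the clause added in that step is the place to look.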

