# Lesson 04 Walkthrough: SQL Basics Examples

## Database Applications Development

**Purpose:** This notebook shows a few example queries to get you started.

**Most of your work will be in the TASK file!** This is just to see how it's done.

### What You'll See:
- SELECT examples
- WHERE examples  
- ORDER BY examples
- Combined queries
- pandas comparisons

### Then You'll Do:
- Open `dbApps04_Task_TrackA.ipynb` or `dbApps04_Task_TrackB.ipynb`
- Complete the exercises
- Use the SQL Reference Guide as needed

Let's see some examples! 

---

## Setup: Import, Connect, Load Dataset, Convert Dataset to Database

In [1]:
import pandas as pd
import sqlite3

# Connect to the database created in Lesson 03 (sqlite3 creates the file if it does not exist yet)
conn = sqlite3.connect('titanic.db')

print("✅ Connected to titanic.db")

✅ Connected to titanic.db


In [2]:
# Load the Titanic dataset from CSV
titanic_df = pd.read_csv('Titanic Dataset.csv')

# Display basic information about the dataset
print(f"Dataset shape: {titanic_df.shape[0]} rows, {titanic_df.shape[1]} columns")
print(f"\nColumn names: {list(titanic_df.columns)}")

Dataset shape: 1309 rows, 14 columns

Column names: ['pclass', 'survived', 'name', 'sex', 'age', 'sibsp', 'parch', 'ticket', 'fare', 'cabin', 'embarked', 'boat', 'body', 'home.dest']


In [3]:
# Let's look at the first few rows
titanic_df.head()

Unnamed: 0,pclass,survived,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked,boat,body,home.dest
0,1,1,"Allen, Miss. Elisabeth Walton",female,29.0,0,0,24160,211.3375,B5,S,2.0,,"St Louis, MO"
1,1,1,"Allison, Master. Hudson Trevor",male,0.92,1,2,113781,151.55,C22 C26,S,11.0,,"Montreal, PQ / Chesterville, ON"
2,1,0,"Allison, Miss. Helen Loraine",female,2.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON"
3,1,0,"Allison, Mr. Hudson Joshua Creighton",male,30.0,1,2,113781,151.55,C22 C26,S,,135.0,"Montreal, PQ / Chesterville, ON"
4,1,0,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",female,25.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON"


In [4]:
# Convert the DataFrame to a SQL table
titanic_df.to_sql(
    name='passengers',           # The name of the SQL table we want to create in the database
    con=conn,                    # The active database connection (created with sqlite3.connect)
    if_exists='replace',         # If a table named 'passengers' already exists, delete it and write a new one
    index=False                  # Don't store the DataFrame's index as its own column in the database
)


print("Data written to database (no errors reported).")
print(f"Expected rows written: {len(titanic_df)}")
print("Next steps will validate/verify the new db using SQL queries!")

Data written to database (no errors reported).
Expected rows written: 1309
Next steps will validate/verify the new db using SQL queries!
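
One quick way to verify the write is to count the rows through SQL and compare with the DataFrame length. This is a minimal sketch, not part of the original notebook: it uses an in-memory database and a made-up three-row `demo_df` so it runs without the CSV file. `COUNT(*)` is not covered in this walkthrough and appears here only as a sanity check.

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for the Titanic data (three made-up rows)
demo_df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "age": [29.0, None, 2.0],
})

conn_demo = sqlite3.connect(":memory:")   # in-memory database, nothing is written to disk
demo_df.to_sql(name="passengers", con=conn_demo, if_exists="replace", index=False)

# COUNT(*) counts every row, including rows with missing values
row_count = pd.read_sql("SELECT COUNT(*) AS n FROM passengers", conn_demo)["n"].iloc[0]
print(f"Rows in table: {row_count} (expected {len(demo_df)})")

conn_demo.close()
```

Against the real database, the same query should report the row count printed by the cell above.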


---

## Example 1: SELECT Specific Columns

**Goal:** Get just Name and Age columns

In [5]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
LIMIT 5
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

SQL Result:


Unnamed: 0,name,age
0,"Allen, Miss. Elisabeth Walton",29.0
1,"Allison, Master. Hudson Trevor",0.92
2,"Allison, Miss. Helen Loraine",2.0
3,"Allison, Mr. Hudson Joshua Creighton",30.0
4,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",25.0


In [6]:
# pandas comparison
titanic = pd.read_csv('Titanic Dataset.csv')
result_pandas = titanic[['name', 'age']].head(5)   # pandas column labels are case-sensitive: use the lowercase names from the CSV
print("pandas Result (same thing!):")
display(result_pandas)

pandas Result (same thing!):


Unnamed: 0,name,age
0,"Allen, Miss. Elisabeth Walton",29.0
1,"Allison, Master. Hudson Trevor",0.92
2,"Allison, Miss. Helen Loraine",2.0
3,"Allison, Mr. Hudson Joshua Creighton",30.0
4,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",25.0


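**Notice:** Same result! Just different syntax. SQLite ignores case in column names, so `SELECT Name, Age` matches the lowercase `name` and `age` columns. pandas does not: `titanic['Name']` would raise a `KeyError`.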
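
The case-sensitivity difference between the two tools can be tried on a toy table. The sketch below assumes a made-up two-row `demo_df` and an in-memory database; neither is part of the lesson files.

```python
import sqlite3
import pandas as pd

demo_df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [29.0, 41.0]})  # made-up rows

conn_demo = sqlite3.connect(":memory:")
demo_df.to_sql(name="passengers", con=conn_demo, index=False)

# SQLite matches column names case-insensitively, so Name finds the 'name' column
sql_result = pd.read_sql("SELECT Name, Age FROM passengers", conn_demo)
print(sql_result)

# pandas labels are case-sensitive: 'Name' is not the same label as 'name'
try:
    demo_df[["Name", "Age"]]
    pandas_accepts_capitals = True
except KeyError:
    pandas_accepts_capitals = False
print("pandas accepts 'Name':", pandas_accepts_capitals)

conn_demo.close()
```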

---

## Example 2: WHERE Clause - Filtering

**Goal:** Find passengers older than 30

In [7]:
# SQL way
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Age > 30
LIMIT 10
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

SQL Result:


Unnamed: 0,name,age,sex
0,"Anderson, Mr. Harry",48.0,male
1,"Andrews, Miss. Kornelia Theodosia",63.0,female
2,"Andrews, Mr. Thomas Jr",39.0,male
3,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",53.0,female
4,"Artagaveytia, Mr. Ramon",71.0,male
5,"Astor, Col. John Jacob",47.0,male
6,"Barkworth, Mr. Algernon Henry Wilson",80.0,male
7,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",50.0,female
8,"Bazzani, Miss. Albina",32.0,female
9,"Beattie, Mr. Thomson",36.0,male


In [8]:
# pandas comparison
result_pandas = titanic[titanic['age'] > 30][['name', 'age', 'sex']].head(10)
print("pandas Result (same thing!):")
display(result_pandas)

pandas Result (same thing!):


Unnamed: 0,name,age,sex
5,"Anderson, Mr. Harry",48.0,male
6,"Andrews, Miss. Kornelia Theodosia",63.0,female
7,"Andrews, Mr. Thomas Jr",39.0,male
8,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",53.0,female
9,"Artagaveytia, Mr. Ramon",71.0,male
10,"Astor, Col. John Jacob",47.0,male
14,"Barkworth, Mr. Algernon Henry Wilson",80.0,male
17,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",50.0,female
18,"Bazzani, Miss. Albina",32.0,female
19,"Beattie, Mr. Thomson",36.0,male


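**Key Point:**
- pandas: `df[df['age'] > 30]`
- SQL: `WHERE Age > 30`

The rows match; only the index column differs. pandas keeps the original row labels (5, 6, 7, ...), while the SQL result is numbered from 0.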
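
To make the pandas result line up with the SQL result row for row, the index can be renumbered. A small sketch under stated assumptions: the four rows in `demo_df` are invented, and `reset_index(drop=True)` is one reasonable way to renumber, not the only one.

```python
import sqlite3
import pandas as pd

demo_df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee"],   # made-up rows
    "age": [29.0, 41.0, None, 63.0],
})
conn_demo = sqlite3.connect(":memory:")
demo_df.to_sql(name="passengers", con=conn_demo, index=False)

sql_result = pd.read_sql("SELECT name, age FROM passengers WHERE age > 30", conn_demo)

# The boolean mask keeps the original row labels of the matching rows
pandas_result = demo_df[demo_df["age"] > 30][["name", "age"]]
original_labels = list(pandas_result.index)
print("Labels kept by pandas:", original_labels)

# reset_index(drop=True) renumbers from 0, like the SQL result
pandas_result = pandas_result.reset_index(drop=True)
print(sql_result["name"].tolist() == pandas_result["name"].tolist())

conn_demo.close()
```

Note that the row with a missing age is excluded by both tools: a comparison against NULL (SQL) or NaN (pandas) is never true.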

---

## Example 3: Text Filtering (Remember the Quotes!)

**Goal:** Find all male passengers

In [9]:
# SQL way - notice the single quotes around 'male'
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Sex = 'male'
LIMIT 5
"""

result = pd.read_sql(query, conn)
print("Male passengers:")
display(result)

Male passengers:


Unnamed: 0,name,age,sex
0,"Allison, Master. Hudson Trevor",0.92,male
1,"Allison, Mr. Hudson Joshua Creighton",30.0,male
2,"Anderson, Mr. Harry",48.0,male
3,"Andrews, Mr. Thomas Jr",39.0,male
4,"Artagaveytia, Mr. Ramon",71.0,male


**Critical:** Text values MUST have single quotes!
- ✅ `WHERE Sex = 'male'`
- ❌ `WHERE Sex = male` (Error! Without quotes, SQLite looks for a column named `male`)
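
The error from a missing quote is easy to reproduce safely. This sketch uses the standard `sqlite3` module directly on a made-up two-row table, so the exception type is the raw `sqlite3.OperationalError` (when the query goes through `pd.read_sql`, pandas wraps the error in its own exception).

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE passengers (name TEXT, sex TEXT)")
conn_demo.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [("Ann", "female"), ("Bob", "male")],   # made-up rows
)

# Quoted: 'male' is a text value
quoted = conn_demo.execute("SELECT name FROM passengers WHERE sex = 'male'").fetchall()
print(quoted)

# Unquoted: male is read as a column name, which this table does not have
try:
    conn_demo.execute("SELECT name FROM passengers WHERE sex = male")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print("Error:", error_message)

conn_demo.close()
```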

---

## Example 4: Multiple Conditions with AND

**Goal:** Find female passengers over age 30

In [10]:
# SQL way
query = """
SELECT Name, Age, Sex
FROM passengers
WHERE Age > 30 AND Sex = 'female'
LIMIT 10
"""

result_sql = pd.read_sql(query, conn)
print("SQL Result:")
display(result_sql)

SQL Result:


Unnamed: 0,name,age,sex
0,"Andrews, Miss. Kornelia Theodosia",63.0,female
1,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",53.0,female
2,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",50.0,female
3,"Bazzani, Miss. Albina",32.0,female
4,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",47.0,female
5,"Bidois, Miss. Rosalie",42.0,female
6,"Bissette, Miss. Amelia",35.0,female
7,"Bonnell, Miss. Elizabeth",58.0,female
8,"Bowen, Miss. Grace Scott",45.0,female
9,"Brown, Mrs. James Joseph (Margaret Tobin)",44.0,female


In [11]:
# pandas comparison
result_pandas = titanic[(titanic['age'] > 30) & (titanic['sex'] == 'female')][['name', 'age', 'sex']].head(10)
print("pandas Result:")
display(result_pandas)

pandas Result:


Unnamed: 0,name,age,sex
6,"Andrews, Miss. Kornelia Theodosia",63.0,female
8,"Appleton, Mrs. Edward Dale (Charlotte Lamson)",53.0,female
17,"Baxter, Mrs. James (Helene DeLaudeniere Chaput)",50.0,female
18,"Bazzani, Miss. Albina",32.0,female
21,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",47.0,female
23,"Bidois, Miss. Rosalie",42.0,female
28,"Bissette, Miss. Amelia",35.0,female
33,"Bonnell, Miss. Elizabeth",58.0,female
35,"Bowen, Miss. Grace Scott",45.0,female
41,"Brown, Mrs. James Joseph (Margaret Tobin)",44.0,female


**Key Point:**
- pandas: `(condition1) & (condition2)`
- SQL: `condition1 AND condition2`
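
The parentheses in the pandas version matter because Python evaluates `&` before `>` and `==`. One way to sidestep the issue, sketched here on a made-up `demo_df`, is to name each condition first and then combine them. The variable names `is_over_30` and `is_female` are illustrative only.

```python
import sqlite3
import pandas as pd

demo_df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cat", "Dee"],          # made-up rows
    "age": [29.0, 41.0, 35.0, 63.0],
    "sex": ["female", "male", "female", "female"],
})
conn_demo = sqlite3.connect(":memory:")
demo_df.to_sql(name="passengers", con=conn_demo, index=False)

sql_names = pd.read_sql(
    "SELECT name FROM passengers WHERE age > 30 AND sex = 'female' ORDER BY name",
    conn_demo,
)["name"].tolist()

# Build each condition separately, then combine them with &
is_over_30 = demo_df["age"] > 30
is_female = demo_df["sex"] == "female"
pandas_names = sorted(demo_df[is_over_30 & is_female]["name"].tolist())

print(sql_names, pandas_names)
conn_demo.close()
```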

---

## Example 5: ORDER BY - Sorting

**Goal:** Sort passengers by age (youngest first)

In [12]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
WHERE Age IS NOT NULL
ORDER BY Age
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Youngest passengers:")
display(result)

Youngest passengers:


Unnamed: 0,name,age
0,"Dean, Miss. Elizabeth Gladys ""Millvina""",0.17
1,"Danbom, Master. Gilbert Sigvard Emanuel",0.33
2,"Thomas, Master. Assad Alexander",0.42
3,"Hamalainen, Master. Viljo",0.67
4,"Baclini, Miss. Eugenie",0.75
5,"Baclini, Miss. Helene Barbara",0.75
6,"Peacock, Master. Alfred Edward",0.75
7,"Caldwell, Master. Alden Gates",0.83
8,"Richards, Master. George Sibley",0.83
9,"Aks, Master. Philip Frank",0.83


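**Note:** `WHERE Age IS NOT NULL` filters out passengers with missing age. Without it, SQLite sorts NULL values first in ascending order, so the top of the list would be passengers with no recorded age.

**Key Point:**
- pandas: `df.sort_values('age')`
- SQL: `ORDER BY Age`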
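
The effect of missing values on sorting can be seen on a toy table. In this sketch (made-up rows, in-memory database), the same `ORDER BY` is run with and without the NULL filter, followed by a pandas version that uses `dropna` as the counterpart of `IS NOT NULL`.

```python
import sqlite3
import pandas as pd

demo_df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [29.0, None, 2.0]})  # made-up rows
conn_demo = sqlite3.connect(":memory:")
demo_df.to_sql(name="passengers", con=conn_demo, index=False)

# Without the filter, SQLite sorts NULL ages before every number
with_nulls = pd.read_sql(
    "SELECT name, age FROM passengers ORDER BY age", conn_demo
)["name"].tolist()

# With the filter, only rows that have an age remain
without_nulls = pd.read_sql(
    "SELECT name, age FROM passengers WHERE age IS NOT NULL ORDER BY age", conn_demo
)["name"].tolist()

# pandas puts missing values last by default; dropna() mirrors the SQL filter
pandas_sorted = demo_df.dropna(subset=["age"]).sort_values("age")["name"].tolist()

print(with_nulls)
print(without_nulls)
print(pandas_sorted)
conn_demo.close()
```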

---

## Example 6: ORDER BY DESC - Reverse Sort

**Goal:** Sort by age (oldest first)

In [13]:
# SQL way
query = """
SELECT Name, Age
FROM passengers
WHERE Age IS NOT NULL
ORDER BY Age DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Oldest passengers:")
display(result)

Oldest passengers:


Unnamed: 0,name,age
0,"Barkworth, Mr. Algernon Henry Wilson",80.0
1,"Cavendish, Mrs. Tyrell William (Julia Florence...",76.0
2,"Svensson, Mr. Johan",74.0
3,"Artagaveytia, Mr. Ramon",71.0
4,"Goldschmidt, Mr. George B",71.0
5,"Connors, Mr. Patrick",70.5
6,"Crosby, Capt. Edward Gifford",70.0
7,"Mitchell, Mr. Henry Michael",70.0
8,"Straus, Mr. Isidor",67.0
9,"Wheadon, Mr. Edward H",66.0


**Key Point:** Add `DESC` for descending order (high to low). Ascending order (`ASC`, low to high) is the default when neither keyword is given.
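
For comparison, the pandas counterpart of `DESC` is `ascending=False`, and `head(n)` plays the role of `LIMIT n`. A short sketch on made-up rows:

```python
import sqlite3
import pandas as pd

demo_df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [29.0, 80.0, 2.0]})  # made-up rows
conn_demo = sqlite3.connect(":memory:")
demo_df.to_sql(name="passengers", con=conn_demo, index=False)

sql_oldest = pd.read_sql(
    "SELECT name, age FROM passengers ORDER BY age DESC LIMIT 2", conn_demo
)["name"].tolist()

# ascending=False sorts high to low; head(2) keeps the first two rows
pandas_oldest = demo_df.sort_values("age", ascending=False).head(2)["name"].tolist()

print(sql_oldest, pandas_oldest)
conn_demo.close()
```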

---

## Example 7: Putting It All Together

**Goal:** Find male passengers under 18, sorted by age (oldest first)

In [14]:
# SQL way - using all clauses!
query = """
SELECT Name, Age, Sex, Pclass
FROM passengers
WHERE Sex = 'male' AND Age < 18
ORDER BY Age DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Young male passengers (oldest first):")
display(result)

Young male passengers (oldest first):


Unnamed: 0,name,age,sex,pclass
0,"Carrau, Mr. Jose Pedro",17.0,male,1
1,"Thayer, Mr. John Borland Jr",17.0,male,1
2,"Deacon, Mr. Percy William",17.0,male,2
3,"Calic, Mr. Jovo",17.0,male,3
4,"Calic, Mr. Petar",17.0,male,3
5,"Culumovic, Mr. Jeso",17.0,male,3
6,"Davies, Mr. Joseph",17.0,male,3
7,"Dika, Mr. Mirko",17.0,male,3
8,"Elias, Mr. Joseph Jr",17.0,male,3
9,"Jensen, Mr. Svend Lauritz",17.0,male,3


**Notice the query structure.** SQL expects the clauses in this order:
1. SELECT - which columns
2. FROM - which table
3. WHERE - filter rows
4. ORDER BY - sort results
5. LIMIT - control size
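
Each clause in the list above has a pandas counterpart, which can be chained in a similar order. This is a sketch on a made-up five-row `demo_df`, mirroring Example 7 with a limit of 2; the name `is_young_male` is illustrative.

```python
import pandas as pd

demo_df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dan", "Eli"],     # made-up rows
    "age": [16.0, 17.0, 40.0, 9.0, 12.0],
    "sex": ["female", "male", "male", "male", "male"],
    "pclass": [1, 3, 1, 2, 3],
})

# WHERE sex = 'male' AND age < 18
is_young_male = (demo_df["sex"] == "male") & (demo_df["age"] < 18)

result = (
    demo_df.loc[is_young_male, ["name", "age", "sex", "pclass"]]  # WHERE + SELECT
    .sort_values("age", ascending=False)                          # ORDER BY age DESC
    .head(2)                                                      # LIMIT 2
)
print(result)
```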

---

## Example 8: OR Condition

**Goal:** Find passengers in first OR second class

In [15]:
# SQL way
query = """
SELECT Name, Pclass, Fare
FROM passengers
WHERE Pclass = 1 OR Pclass = 2
ORDER BY Pclass, Fare DESC
LIMIT 10
"""

result = pd.read_sql(query, conn)
print("Upper class passengers:")
display(result)

Upper class passengers:


Unnamed: 0,name,pclass,fare
0,"Cardeza, Mr. Thomas Drake Martinez",1,512.3292
1,"Cardeza, Mrs. James Warburton Martinez (Charlo...",1,512.3292
2,"Lesurer, Mr. Gustave J",1,512.3292
3,"Ward, Miss. Anna",1,512.3292
4,"Fortune, Miss. Alice Elizabeth",1,263.0
5,"Fortune, Miss. Ethel Flora",1,263.0
6,"Fortune, Miss. Mabel Helen",1,263.0
7,"Fortune, Mr. Charles Alexander",1,263.0
8,"Fortune, Mr. Mark",1,263.0
9,"Fortune, Mrs. Mark (Mary McDougald)",1,263.0


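**Key Point:** OR means "either condition can be true": a row is returned when at least one condition holds. Every row shown is first class because `ORDER BY Pclass, Fare DESC` sorts by class first (ascending) and then by fare (highest first), and `LIMIT 10` cuts the list off before any second-class passenger appears.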
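
When several OR conditions test the same column, SQL's `IN` operator is a common shorthand. It is not covered in this walkthrough, so treat the sketch below as an optional aside; the table contents are made up.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE passengers (name TEXT, pclass INTEGER, fare REAL)")
conn_demo.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Ann", 1, 200.0), ("Bob", 2, 30.0), ("Cy", 3, 8.0), ("Dee", 2, 45.0)],  # made-up rows
)

# Two conditions joined with OR
with_or = conn_demo.execute(
    "SELECT name FROM passengers WHERE pclass = 1 OR pclass = 2 "
    "ORDER BY pclass, fare DESC"
).fetchall()

# The same filter written with IN
with_in = conn_demo.execute(
    "SELECT name FROM passengers WHERE pclass IN (1, 2) "
    "ORDER BY pclass, fare DESC"
).fetchall()

print(with_or)
print("Same rows:", with_or == with_in)
conn_demo.close()
```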

---

## Your Turn in the Task File!

In [16]:
# Remember to close your connection!

In [17]:
# Always close your connection when done
conn.close()
print("✅ Connection closed")

✅ Connection closed
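
If remembering `conn.close()` is a concern, the standard library's `contextlib.closing` can close the connection automatically. This is an alternative pattern, not the one used in this notebook. (A plain `with sqlite3.connect(...)` block manages transactions but does not close the connection, which is why `closing` is used here.)

```python
import sqlite3
from contextlib import closing

# closing() calls conn_demo.close() when the block ends, even after an error
with closing(sqlite3.connect(":memory:")) as conn_demo:
    conn_demo.execute("CREATE TABLE passengers (name TEXT)")
    conn_demo.execute("INSERT INTO passengers VALUES ('Ann')")   # made-up row
    rows = conn_demo.execute("SELECT name FROM passengers").fetchall()
print(rows)

# Using the connection after it is closed raises sqlite3.ProgrammingError
try:
    conn_demo.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False
print("Connection still open:", still_open)
```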


---

## Summary

You just saw examples of:
- ✅ SELECT specific columns
- ✅ WHERE filtering (numbers and text)
- ✅ Multiple conditions (AND, OR)
- ✅ ORDER BY sorting (ascending by default, `DESC` for descending)
- ✅ LIMIT controlling results
- ✅ Combining all clauses together

### Next Steps:

1. **Keep the SQL Reference Guide open** - you'll need it!
2. **Open your task file:**
   - Track A: `dbApps04_Task_TrackA.ipynb`
   - Track B: `dbApps04_Task_TrackB.ipynb`
3. **Complete the exercises** - write actual queries!
4. **Submit via GitHub** when done

### Remember:
- You already know these concepts from pandas
- You're just learning SQL syntax
- It's okay to look things up - that's what professionals do!
- Build queries step by step
- Test after each clause
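
"Build step by step, test after each clause" might look like the sketch below: the query grows one clause at a time and is run after every addition, so a mistake shows up next to the clause that caused it. The table and its four rows are made up for illustration.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE passengers (name TEXT, age REAL, sex TEXT)")
conn_demo.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Ann", 29.0, "female"), ("Bob", 41.0, "male"),
     ("Cy", None, "male"), ("Dan", 9.0, "male")],   # made-up rows
)

# Grow the query one clause at a time and run it after each addition
base = "SELECT name, age FROM passengers"
steps = [
    base,
    base + " WHERE sex = 'male' AND age IS NOT NULL",
    base + " WHERE sex = 'male' AND age IS NOT NULL ORDER BY age",
    base + " WHERE sex = 'male' AND age IS NOT NULL ORDER BY age LIMIT 1",
]

row_counts = []
for query in steps:
    rows = conn_demo.execute(query).fetchall()
    row_counts.append(len(rows))
    print(f"{len(rows)} row(s): {query}")

youngest_male = rows   # result of the final, complete query
conn_demo.close()
```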

