<a href="https://colab.research.google.com/github/sethkipsangmutuba/SQL/blob/main/3c_Using_value_based_window_functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Objective

We'll use the **value-based window function** `LAG()` to compute **row-to-row changes**  
with **SQLite** on the Titanic dataset in Google Colab.

This will help us:

- Access previous row values (e.g., previous fare or age)
- Calculate change or growth over sequential records

---

## Step-by-Step Implementation in Colab (with `sqlite3`)

---

### Load Titanic Dataset & Save to SQLite DB

Begin by loading the Titanic dataset into a Pandas DataFrame, then save it into a local SQLite database for SQL queries.

- Ensure columns like `fare`, `age`, and `pclass` are included
- This sets up the environment for using `LAG()` and performing calculations
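
`LAG()` is a window function, and SQLite supports window functions from version 3.25.0 onward. The check below is a minimal sketch (the helper name `supports_window_functions` is made up for illustration) that confirms the SQLite library bundled with the Python runtime is recent enough before any window query is run:

```python
import sqlite3


def supports_window_functions() -> bool:
    """Return True if the linked SQLite library is 3.25.0 or newer."""
    return sqlite3.sqlite_version_info >= (3, 25, 0)


print("SQLite version:", sqlite3.sqlite_version)
print("Window functions available:", supports_window_functions())
```

If this prints `False`, the `LAG()` queries in this notebook will fail with a syntax error rather than return wrong results.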


In [13]:
import sqlite3
import pandas as pd
import seaborn as sns

# Load Titanic dataset
df = sns.load_dataset("titanic")

# Connect to a file-based SQLite database (pass ":memory:" for an in-memory one)
conn = sqlite3.connect("titanic.db")

# Write DataFrame to SQL table
df.to_sql("titanic", conn, if_exists="replace", index=False)


891

### Check Basic Table Structure

In [14]:
pd.read_sql("SELECT pclass, fare FROM titanic LIMIT 5;", conn)


|   | pclass | fare    |
|---|--------|---------|
| 0 | 3      | 7.25    |
| 1 | 1      | 71.2833 |
| 2 | 3      | 7.925   |
| 3 | 1      | 53.1    |
| 4 | 3      | 8.05    |
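
Selecting a few rows shows sample values but not the column types. As a hedged sketch, SQLite's `PRAGMA table_info` lists each column's name and declared type. The example uses a small in-memory table named `tickets` (an assumption for illustration, not the notebook's `titanic` table) so that it runs on its own:

```python
import sqlite3

# Illustrative in-memory table; in the notebook you would query "titanic" instead.
schema_demo = sqlite3.connect(":memory:")
schema_demo.execute("CREATE TABLE tickets (pclass INTEGER, fare REAL)")

# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = schema_demo.execute("PRAGMA table_info(tickets);").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type)
```

Against the notebook's database, the same statement can be run as `pd.read_sql("PRAGMA table_info(titanic);", conn)`.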


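### Use `LAG()` Window Function to Add Previous Fare

The Titanic table has no natural time order, so the query orders rows by `fare` within each `pclass`. `prev_fare` is therefore the next-lower (or equal) fare in the same class, and it is NULL for the first row of each class.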

In [15]:
query_lag = """
SELECT
    pclass,
    fare,
    LAG(fare) OVER (PARTITION BY pclass ORDER BY fare ASC) AS prev_fare
FROM titanic
ORDER BY pclass, fare  -- make the rows returned by LIMIT deterministic
LIMIT 10;
"""

pd.read_sql(query_lag, conn)


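|   | pclass | fare    | prev_fare |
|---|--------|---------|-----------|
| 0 | 1      | 0.0     | NaN       |
| 1 | 1      | 0.0     | 0.0       |
| 2 | 1      | 0.0     | 0.0       |
| 3 | 1      | 0.0     | 0.0       |
| 4 | 1      | 0.0     | 0.0       |
| 5 | 1      | 5.0     | 0.0       |
| 6 | 1      | 25.5875 | 5.0       |
| 7 | 1      | 25.925  | 25.5875   |
| 8 | 1      | 25.9292 | 25.925    |
| 9 | 1      | 25.9292 | 25.9292   |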
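
To make the partition behaviour easier to see, here is a small self-contained sketch on a toy table (the table name `tickets` and its five rows are invented for illustration). It also shows the optional second and third arguments of `LAG()`, the offset and a default value, which replace the NULL at the start of each partition:

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE tickets (pclass INTEGER, fare REAL)")
conn_demo.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(1, 30.0), (1, 50.0), (1, 80.0), (2, 10.0), (2, 13.0)],
)

rows = conn_demo.execute("""
SELECT
    pclass,
    fare,
    -- NULL for the first row of each pclass partition
    LAG(fare) OVER (PARTITION BY pclass ORDER BY fare) AS prev_fare,
    -- offset 1, default 0.0 instead of NULL
    LAG(fare, 1, 0.0) OVER (PARTITION BY pclass ORDER BY fare) AS prev_fare_or_zero
FROM tickets
ORDER BY pclass, fare;
""").fetchall()

for row in rows:
    print(row)
# (1, 30.0, None, 0.0)
# (1, 50.0, 30.0, 30.0)
# (1, 80.0, 50.0, 50.0)
# (2, 10.0, None, 0.0)
# (2, 13.0, 10.0, 10.0)
```

Note that the lag restarts at `pclass = 2`: the first second-class row does not see the last first-class fare.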


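### Add a Fare Change Column (`fare - prev_fare`)

`fare_change` is the absolute difference between a fare and the previous fare in the same class. It is NULL wherever `prev_fare` is NULL.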

In [16]:
query_arc = """
SELECT
    pclass,
    fare,
    LAG(fare) OVER (PARTITION BY pclass ORDER BY fare ASC) AS prev_fare,
    fare - LAG(fare) OVER (PARTITION BY pclass ORDER BY fare ASC) AS fare_change
FROM titanic
ORDER BY pclass, fare  -- make the rows returned by LIMIT deterministic
LIMIT 10;
"""

pd.read_sql(query_arc, conn)


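|   | pclass | fare    | prev_fare | fare_change |
|---|--------|---------|-----------|-------------|
| 0 | 1      | 0.0     | NaN       | NaN         |
| 1 | 1      | 0.0     | 0.0       | 0.0         |
| 2 | 1      | 0.0     | 0.0       | 0.0         |
| 3 | 1      | 0.0     | 0.0       | 0.0         |
| 4 | 1      | 0.0     | 0.0       | 0.0         |
| 5 | 1      | 5.0     | 0.0       | 5.0         |
| 6 | 1      | 25.5875 | 5.0       | 20.5875     |
| 7 | 1      | 25.925  | 25.5875   | 0.3375      |
| 8 | 1      | 25.9292 | 25.925    | 0.0042      |
| 9 | 1      | 25.9292 | 25.9292   | 0.0         |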
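
The `fare_change` column is an absolute difference. A relative rate of change divides that difference by the previous value, which needs a guard because some fares are 0. The sketch below is one possible way to do this, not part of the original notebook: it uses a toy `tickets` table (invented rows), a named `WINDOW` clause to avoid repeating the window definition, and `NULLIF` so that a zero `prev_fare` yields NULL instead of a division by zero.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE tickets (pclass INTEGER, fare REAL)")
demo.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(1, 0.0), (1, 5.0), (1, 25.0), (1, 30.0)],
)

query_pct = """
SELECT
    pclass,
    fare,
    LAG(fare) OVER w AS prev_fare,
    fare - LAG(fare) OVER w AS fare_change,
    -- percentage change; NULL when prev_fare is NULL or 0
    ROUND(100.0 * (fare - LAG(fare) OVER w) / NULLIF(LAG(fare) OVER w, 0), 2) AS pct_change
FROM tickets
WINDOW w AS (PARTITION BY pclass ORDER BY fare)
ORDER BY pclass, fare;
"""

pct_rows = demo.execute(query_pct).fetchall()
for row in pct_rows:
    print(row)
# (1, 0.0, None, None, None)
# (1, 5.0, 0.0, 5.0, None)
# (1, 25.0, 5.0, 20.0, 400.0)
# (1, 30.0, 25.0, 5.0, 20.0)
```

The same query shape should work against the notebook's `titanic` table by replacing `tickets` with `titanic` and running it through `pd.read_sql`.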
