<a href="https://colab.research.google.com/github/sethkipsangmutuba/SQL/blob/main/2b_Transform_Columns_Using_Numeric_Functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Titanic Dataset — Transform Columns Using Numeric Functions

This notebook demonstrates how to transform a table column with the SQL numeric functions `ROUND`, `LOG`, and `SQRT`, using the Titanic dataset loaded into an in-memory SQLite database in Google Colab.

---

## Step 1: Setup in Google Colab


In [22]:
import pandas as pd
import sqlite3
import seaborn as sns

# Load Titanic dataset
df = sns.load_dataset('titanic')

# Drop rows with a missing 'fare' so LOG and SQRT only see real values
df = df.dropna(subset=['fare'])

# Create in-memory SQLite database
conn = sqlite3.connect(':memory:')
df.to_sql('titanic', conn, index=False, if_exists='replace')


891
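
The `LOG` and `SQRT` queries in this notebook depend on SQLite's built-in math functions, which are an optional compile-time feature and may be missing from some SQLite builds. The following is a minimal sketch, not part of the original notebook, that probes for `SQRT` and registers Python fallbacks with `create_function` only when the probe fails. The names `has_math_functions`, `safe_log10`, `safe_sqrt`, and `demo_conn` are hypothetical.

```python
import math
import sqlite3

def has_math_functions(conn):
    """Return True if this SQLite build accepts a call to SQRT."""
    try:
        conn.execute("SELECT SQRT(4.0)").fetchone()
        return True
    except sqlite3.OperationalError:
        # Raised with "no such function" when math functions are not compiled in
        return False

def safe_log10(x):
    # Hypothetical fallback: NULL, zero, or negative input maps to NULL
    return math.log10(x) if x is not None and x > 0 else None

def safe_sqrt(x):
    # Hypothetical fallback: NULL or negative input maps to NULL
    return math.sqrt(x) if x is not None and x >= 0 else None

demo_conn = sqlite3.connect(":memory:")
if not has_math_functions(demo_conn):
    demo_conn.create_function("LOG", 1, safe_log10)
    demo_conn.create_function("SQRT", 1, safe_sqrt)

sqrt_value, log_value = demo_conn.execute(
    "SELECT SQRT(16.0), LOG(100.0)"
).fetchone()
print(sqrt_value, log_value)
```

Registering the fallbacks only when the probe fails leaves the native functions in place on builds that already have them.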

---

## Learning Objectives

By the end of this notebook, you should be able to:

- Apply `ROUND`, `LOG`, and `SQRT` functions on columns  
- Analyze transformations of continuous variables  
- Combine numeric function transformations into a single SQL query  
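
To make the second objective concrete, here is a small sketch in plain Python that compares how far the largest value sits from the smallest before and after each transformation. It uses the ten fares shown in this notebook's query outputs. The `spread` helper is a hypothetical, deliberately rough measure chosen for illustration.

```python
import math

# The ten fares displayed in this notebook's query outputs
fares = [7.25, 71.2833, 7.925, 53.1, 8.05, 8.4583, 51.8625, 21.075, 11.1333, 30.0708]

def spread(values):
    """Ratio of the largest to the smallest value: a rough measure of scale stretch."""
    return max(values) / min(values)

raw_spread = spread(fares)
sqrt_spread = spread([math.sqrt(f) for f in fares])
log_spread = spread([math.log10(f) for f in fares])

print(f"raw:  {raw_spread:.2f}")
print(f"sqrt: {sqrt_spread:.2f}")
print(f"log:  {log_spread:.2f}")
```

On these sample values, both transformations shrink the ratio, with the logarithm compressing more strongly than the square root.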

---

## Exercise: Column Transformations in SQL

**1. What is the fare paid by each passenger (raw)?**


In [23]:
query = """
SELECT
    survived,
    class,
    fare
FROM titanic
LIMIT 10;
"""
pd.read_sql_query(query, conn)


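|   | survived | class | fare |
|---|---|---|---|
| 0 | 0 | Third | 7.25 |
| 1 | 1 | First | 71.2833 |
| 2 | 1 | Third | 7.925 |
| 3 | 1 | First | 53.1 |
| 4 | 0 | Third | 8.05 |
| 5 | 0 | Third | 8.4583 |
| 6 | 0 | First | 51.8625 |
| 7 | 0 | Third | 21.075 |
| 8 | 1 | Third | 11.1333 |
| 9 | 1 | Second | 30.0708 |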


**2. What are the fares rounded to two decimal places (`ROUND`)?**


In [24]:
query = """
SELECT
    survived,
    class,
    fare,
    ROUND(fare, 2) AS rounded_fare
FROM titanic
LIMIT 10;
"""
pd.read_sql_query(query, conn)


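|   | survived | class | fare | rounded_fare |
|---|---|---|---|---|
| 0 | 0 | Third | 7.25 | 7.25 |
| 1 | 1 | First | 71.2833 | 71.28 |
| 2 | 1 | Third | 7.925 | 7.93 |
| 3 | 1 | First | 53.1 | 53.1 |
| 4 | 0 | Third | 8.05 | 8.05 |
| 5 | 0 | Third | 8.4583 | 8.46 |
| 6 | 0 | First | 51.8625 | 51.86 |
| 7 | 0 | Third | 21.075 | 21.08 |
| 8 | 1 | Third | 11.1333 | 11.13 |
| 9 | 1 | Second | 30.0708 | 30.07 |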
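
The second argument of `ROUND` sets how many decimal places to keep, and omitting it rounds to zero places. A minimal sketch, using a throwaway table named `fares` (hypothetical) with three fares taken from the output above:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE fares (fare REAL)")
demo_conn.executemany(
    "INSERT INTO fares VALUES (?)", [(71.2833,), (53.1,), (30.0708,)]
)

rounding_query = """
SELECT
    fare,
    ROUND(fare)    AS nearest_whole,  -- digits argument omitted: zero decimals
    ROUND(fare, 1) AS one_decimal,
    ROUND(fare, 2) AS two_decimals
FROM fares
ORDER BY fare DESC;
"""
rounding_rows = demo_conn.execute(rounding_query).fetchall()
for row in rounding_rows:
    print(row)
```

Note that `ROUND` returns a floating-point value even when rounding to zero decimals, so whole numbers appear with a trailing `.0`.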


**3. What is the base-10 logarithm of the fare paid (`LOG`)?**  
⚠️ *In SQLite, one-argument `LOG` is the base-10 logarithm. A logarithm is only defined for positive values, so the query filters on `fare > 0`.*


In [25]:
query = """
SELECT
    survived,
    class,
    fare,
    LOG(fare) AS log_fare
FROM titanic
WHERE fare > 0
LIMIT 10;
"""
pd.read_sql_query(query, conn)


|   | survived | class | fare | log_fare |
|---|---|---|---|---|
| 0 | 0 | Third | 7.25 | 0.860338 |
| 1 | 1 | First | 71.2833 | 1.852988 |
| 2 | 1 | Third | 7.925 | 0.898999 |
| 3 | 1 | First | 53.1 | 1.725095 |
| 4 | 0 | Third | 8.05 | 0.905796 |
| 5 | 0 | Third | 8.4583 | 0.927283 |
| 6 | 0 | First | 51.8625 | 1.714853 |
| 7 | 0 | Third | 21.075 | 1.323768 |
| 8 | 1 | Third | 11.1333 | 1.046624 |
| 9 | 1 | Second | 30.0708 | 1.478145 |
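
The `log_fare` values above can be checked against Python's `math` module to see which base `LOG` used. This sketch copies three `(fare, log_fare)` pairs from the output and compares them with `math.log10` and `math.log`. The variable names are hypothetical.

```python
import math

# (fare, log_fare) pairs copied from the query output above
samples = [(7.25, 0.860338), (71.2833, 1.852988), (53.1, 1.725095)]

# The displayed values are rounded to six decimals, so allow a small tolerance
matches_base10 = all(
    math.isclose(math.log10(fare), log_fare, abs_tol=1e-5)
    for fare, log_fare in samples
)
matches_natural = any(
    math.isclose(math.log(fare), log_fare, abs_tol=1e-5)
    for fare, log_fare in samples
)
print("base 10:", matches_base10, "| natural log:", matches_natural)

# Change of base: ln(x) = log10(x) * ln(10)
natural_logs = [log_fare * math.log(10) for _, log_fare in samples]
print(natural_logs)
```

If a natural logarithm is needed instead, the change-of-base line shows how to derive it from the base-10 result.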


**4. What is the square root of the fare paid (`SQRT`)?**


In [26]:
query = """
SELECT
    survived,
    class,
    fare,
    SQRT(fare) AS sqrt_fare
FROM titanic
WHERE fare >= 0
LIMIT 10;
"""
pd.read_sql_query(query, conn)


|   | survived | class | fare | sqrt_fare |
|---|---|---|---|---|
| 0 | 0 | Third | 7.25 | 2.692582 |
| 1 | 1 | First | 71.2833 | 8.442944 |
| 2 | 1 | Third | 7.925 | 2.815138 |
| 3 | 1 | First | 53.1 | 7.286975 |
| 4 | 0 | Third | 8.05 | 2.837252 |
| 5 | 0 | Third | 8.4583 | 2.908316 |
| 6 | 0 | First | 51.8625 | 7.201562 |
| 7 | 0 | Third | 21.075 | 4.590752 |
| 8 | 1 | Third | 11.1333 | 3.33666 |
| 9 | 1 | Second | 30.0708 | 5.483685 |
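
The `WHERE fare >= 0` filter matters because a square root is undefined for negative numbers. As an illustration, this sketch registers a hypothetical user-defined function `py_sqrt` that returns NULL for NULL or negative input, then runs it on a throwaway `fares` table that includes an invalid negative value. Neither the function nor the test rows come from the Titanic data.

```python
import math
import sqlite3

def py_sqrt(x):
    """Hypothetical helper: square root that maps NULL or negative input to NULL."""
    if x is None or x < 0:
        return None
    return math.sqrt(x)

demo_conn = sqlite3.connect(":memory:")
demo_conn.create_function("py_sqrt", 1, py_sqrt)
demo_conn.execute("CREATE TABLE fares (fare REAL)")
demo_conn.executemany(
    "INSERT INTO fares VALUES (?)", [(53.1,), (0.0,), (None,), (-1.0,)]
)

sqrt_rows = demo_conn.execute(
    "SELECT fare, py_sqrt(fare) AS sqrt_fare FROM fares ORDER BY rowid"
).fetchall()
for row in sqrt_rows:
    print(row)
```

Returning NULL keeps the row visible in the result, whereas a `WHERE` filter removes it. Which behavior is preferable depends on the analysis.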


---

## Combine All Transformations into One Query

Write a single SQL query that returns the following columns:

- Raw fare
- Rounded fare using `ROUND`
- Logarithm of fare using `LOG` (only for `fare > 0`)
- Square root of fare using `SQRT`

The query below filters on `fare > 0`, so passengers with a zero fare are left out of every column, not only the logarithm.


In [27]:
query = """
SELECT
    survived,
    class,
    fare,
    ROUND(fare, 2) AS rounded_fare,
    LOG(fare) AS log_fare,
    SQRT(fare) AS sqrt_fare
FROM titanic
WHERE fare > 0
LIMIT 10;
"""
pd.read_sql_query(query, conn)


|   | survived | class | fare | rounded_fare | log_fare | sqrt_fare |
|---|---|---|---|---|---|---|
| 0 | 0 | Third | 7.25 | 7.25 | 0.860338 | 2.692582 |
| 1 | 1 | First | 71.2833 | 71.28 | 1.852988 | 8.442944 |
| 2 | 1 | Third | 7.925 | 7.93 | 0.898999 | 2.815138 |
| 3 | 1 | First | 53.1 | 53.1 | 1.725095 | 7.286975 |
| 4 | 0 | Third | 8.05 | 8.05 | 0.905796 | 2.837252 |
| 5 | 0 | Third | 8.4583 | 8.46 | 0.927283 | 2.908316 |
| 6 | 0 | First | 51.8625 | 51.86 | 1.714853 | 7.201562 |
| 7 | 0 | Third | 21.075 | 21.08 | 1.323768 | 4.590752 |
| 8 | 1 | Third | 11.1333 | 11.13 | 1.046624 | 3.33666 |
| 9 | 1 | Second | 30.0708 | 30.07 | 1.478145 | 5.483685 |
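
Filtering on `fare > 0` drops zero-fare rows from the whole result. One alternative, sketched here under assumptions, is to guard only the logarithm with a `CASE` expression so that those rows stay in the result with a NULL `log_fare`. To keep the sketch runnable on any SQLite build, it registers `py_log10` and `py_sqrt` as hypothetical stand-ins for `LOG` and `SQRT`, and uses a throwaway `fares` table instead of the Titanic data.

```python
import math
import sqlite3

demo_conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for LOG and SQRT; math.log10 raises on 0,
# so the CASE guard below is what protects the zero-fare row.
demo_conn.create_function("py_log10", 1, math.log10)
demo_conn.create_function("py_sqrt", 1, math.sqrt)

demo_conn.execute("CREATE TABLE fares (fare REAL)")
demo_conn.executemany(
    "INSERT INTO fares VALUES (?)", [(7.25,), (0.0,), (71.2833,)]
)

combined_query = """
SELECT
    fare,
    ROUND(fare, 2) AS rounded_fare,
    CASE WHEN fare > 0 THEN py_log10(fare) END AS log_fare,  -- NULL when fare is 0
    py_sqrt(fare) AS sqrt_fare
FROM fares
ORDER BY rowid;
"""
combined_rows = demo_conn.execute(combined_query).fetchall()
for row in combined_rows:
    print(row)
```

A `CASE` without an `ELSE` branch yields NULL when no condition matches, which is why the zero-fare row keeps its rounded and square-root values while `log_fare` is empty.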
