### Tutorial 04: Joining Tables

In this tutorial, we will learn how to join tables in SQL using the participants and status tables. Joining tables allows you to combine rows from two or more tables based on a related column, which is essential for querying normalized databases.
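
**NOTE**: `RIGHT JOIN` and `FULL OUTER JOIN` are supported only in SQLite 3.39.0 and later. Older versions reject both, so check the SQLite version bundled with your Python installation before relying on them.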
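
A `RIGHT JOIN` can be expressed as a `LEFT JOIN` with the two tables swapped, which runs on any SQLite version. The sketch below illustrates this on a throwaway in-memory database. The table contents are made up for illustration and are not the rows or full schema of `mmdt.db3`.

```python
import sqlite3

# Hypothetical toy data in an in-memory database (not the tutorial database).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (ID INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE status (PARTICIPANT_ID INTEGER, status TEXT);
    INSERT INTO participants VALUES (1, 'Myanmar'), (2, 'Bhutan');
    INSERT INTO status VALUES (1, 'In Progress'), (3, 'Completed');
""")

# Version of the SQLite library bundled with this Python build.
print(sqlite3.sqlite_version)

# Equivalent of "participants RIGHT JOIN status": status is placed on the left,
# so every status row is kept and unmatched participant columns are NULL.
right_join_rows = conn.execute("""
    SELECT s.PARTICIPANT_ID, s.status, p.Country
    FROM status AS s
    LEFT JOIN participants AS p
    ON p.ID = s.PARTICIPANT_ID
    ORDER BY s.PARTICIPANT_ID;
""").fetchall()
print(right_join_rows)
conn.close()
```

The status row for participant 3 has no matching participant, so its `Country` comes back as `None` in Python.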


**INNER JOIN**: Combining Rows with Matching Values
An INNER JOIN returns only the rows where there is a match in both tables. If a participant has a status in the status table, the query will return that participant’s details along with their status.
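
As a minimal sketch of this behavior, the snippet below runs an `INNER JOIN` on a small in-memory database. The rows are hypothetical and only mimic the `ID` and `PARTICIPANT_ID` columns used later in this tutorial.

```python
import sqlite3

# Hypothetical toy data: participant 2 has no row in the status table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (ID INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE status (PARTICIPANT_ID INTEGER, status TEXT);
    INSERT INTO participants VALUES (1, 'Myanmar'), (2, 'Bhutan'), (3, 'Myanmar');
    INSERT INTO status VALUES (1, 'In Progress'), (3, 'Completed');
""")

# Only participants with a matching status row are returned.
inner_rows = conn.execute("""
    SELECT p.ID, p.Country, s.status
    FROM participants AS p
    INNER JOIN status AS s
    ON p.ID = s.PARTICIPANT_ID
    ORDER BY p.ID;
""").fetchall()
print(inner_rows)
conn.close()
```

Participant 2 is absent from the result because there is no status row to match it.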

**LEFT JOIN**: Including All Rows from the Left Table
A LEFT JOIN returns all rows from the left table, and the matching rows from the right table. If there is no match, NULL values are returned for columns from the right table.
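
The sketch below shows the same idea on hypothetical in-memory data. It also shows a common pitfall: filtering on a right-table column in `WHERE` discards the NULL-filled rows, so the result looks like an `INNER JOIN`.

```python
import sqlite3

# Hypothetical toy data: participant 2 has no row in the status table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (ID INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE status (PARTICIPANT_ID INTEGER, status TEXT);
    INSERT INTO participants VALUES (1, 'Myanmar'), (2, 'Bhutan'), (3, 'Myanmar');
    INSERT INTO status VALUES (1, 'In Progress'), (3, 'Completed');
""")

# Every participant is kept; the missing status becomes NULL (None in Python).
left_rows = conn.execute("""
    SELECT p.ID, p.Country, s.status
    FROM participants AS p
    LEFT JOIN status AS s
    ON p.ID = s.PARTICIPANT_ID
    ORDER BY p.ID;
""").fetchall()
print(left_rows)

# A WHERE filter on the right table removes rows whose status is NULL.
where_rows = conn.execute("""
    SELECT p.ID, p.Country, s.status
    FROM participants AS p
    LEFT JOIN status AS s
    ON p.ID = s.PARTICIPANT_ID
    WHERE s.status = 'In Progress'
    ORDER BY p.ID;
""").fetchall()
print(where_rows)
conn.close()
```

The first query returns three rows, including participant 2 with a `None` status. The second returns only participant 1.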

**SELF JOIN**: Joining a Table with Itself
Sometimes, you may want to join a table with itself. This is known as a SELF JOIN. For example, to compare the ages of two participants who share the same status, we can join the participant data with itself on the status column, giving each copy its own alias.
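
A minimal sketch of a self-join follows. For brevity it assumes a single hypothetical table that already holds `age` and `status` columns, which is simpler than the tutorial database. The two aliases `p1` and `p2` refer to the same table.

```python
import sqlite3

# Hypothetical single-table layout used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (ID INTEGER PRIMARY KEY, age INTEGER, status TEXT);
    INSERT INTO participants VALUES
        (1, 25, 'In Progress'),
        (2, 31, 'In Progress'),
        (3, 28, 'Completed'),
        (4, 40, 'Completed');
""")

# Pair up participants that share a status.
# p1.ID < p2.ID keeps each pair once; using != instead would return
# every pair twice, once in each order.
pair_rows = conn.execute("""
    SELECT p1.ID, p2.ID, p1.age, p2.age
    FROM participants AS p1
    INNER JOIN participants AS p2
    ON p1.status = p2.status
    WHERE p1.ID < p2.ID
    ORDER BY p1.ID;
""").fetchall()
print(pair_rows)
conn.close()
```

Each returned row holds the two IDs followed by the two ages, so the ages can be compared side by side.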

----
**Example 1**: Get the ID, age, and status of participants from Bhutan.

**Example 2**: Get the ID, age, and status of all participants who currently live in Myanmar.

**Example 3**: Get the ID, age, and gender of participants whose status is In Progress. All participants (both Bhutan and Myanmar) should be included.

**Example 4**: Get the number of participants with each status.

**Example 5**: Compare the ages of two participants with the same status.


In [None]:
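import sqlite3
import pandas as pd

# Path to the SQLite database file used throughout this tutorial.
# Passing a "sqlite:///..." URI string to pandas requires SQLAlchemy to be installed.
db_path = './database/mmdt.db3'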



In [None]:
query = """
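        -- Example 1: bhutan.BOD appears to be a date string; characters 7-10 hold the birth year.
        SELECT b.ID, 2024 - substr(b.BOD, 7, 4) AS age, s.status
        FROM bhutan AS b
        INNER JOIN status AS s
        ON b.ID = s.PARTICIPANT_ID;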
"""
df = pd.read_sql_query(query, f"sqlite:///{db_path}")
df

In [None]:
query = """
        -- Example 2: participants.BOD is used directly as the birth year.
        SELECT p.ID, 2024 - p.BOD AS age, s.status, Country
        FROM participants AS p
        INNER JOIN status AS s
        ON p.ID = s.PARTICIPANT_ID
        WHERE Country LIKE 'Myanmar%';
"""
df = pd.read_sql_query(query, f"sqlite:///{db_path}")
df

In [None]:
query = """
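        -- Example 3: COALESCE returns the first non-NULL value, so age and gender
        -- fall back to the bhutan table when the participants columns are NULL.
        SELECT p.ID, COALESCE(2024 - p.BOD, 2024 - substr(b.BOD, 7, 4)) AS age,
        COALESCE(p.Gender, b.Gender) AS gender, s.status
        FROM participants AS p
        LEFT JOIN bhutan AS b
        ON p.ID = b.ID
        LEFT JOIN status AS s
        ON p.ID = s.PARTICIPANT_ID
        WHERE s.status LIKE '%In Progress%'
        LIMIT 20 OFFSET 0;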
"""
df = pd.read_sql_query(query, f"sqlite:///{db_path}")
df

In [None]:
query = """
        -- Example 4: one row per status, most common status first.
        SELECT status, COUNT(*) AS Number
        FROM status
        GROUP BY status
        ORDER BY COUNT(*) DESC;
        """
df = pd.read_sql_query(query, f"sqlite:///{db_path}")
df

In [None]:
query = """
        -- Example 5: build the same (ID, age, status) subquery twice and join it to itself.
        SELECT
        p1.ID AS person1_ID,
        p2.ID AS person2_ID,
        p1.age AS person1_age,
        p2.age AS person2_age
        FROM
        (SELECT p.ID,
        COALESCE(2024 - p.BOD, 2024 - substr(b.BOD, 7, 4)) AS age,
        s.status
        FROM participants AS p
        LEFT JOIN bhutan AS b
        ON p.ID = b.ID
        LEFT JOIN status AS s
        ON p.ID = s.PARTICIPANT_ID) AS p1
        INNER JOIN
        (SELECT p.ID,
        COALESCE(2024 - p.BOD, 2024 - substr(b.BOD, 7, 4)) AS age,
        s.status
        FROM participants AS p
        LEFT JOIN bhutan AS b
        ON p.ID = b.ID
        LEFT JOIN status AS s
        ON p.ID = s.PARTICIPANT_ID) AS p2
        ON p1.status = p2.status
        -- Using < instead of != lists each pair once rather than in both orders.
        WHERE p1.ID < p2.ID
        LIMIT 10;
        """
df = pd.read_sql_query(query, f"sqlite:///{db_path}")
df