# SQL JOIN Queries

**SQL JOIN** queries combine rows from two or more tables based on a related column between them. There are several types of **JOINs**, including:

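- **INNER JOIN**: Returns only the rows that have a match in both tables.
- **LEFT JOIN**: Returns all rows from the left table plus the matching rows from the right; unmatched right-side columns are `NULL`.
- **RIGHT JOIN**: The mirror image of LEFT JOIN: returns all rows from the right table plus the matching rows from the left.
- **FULL JOIN**: Returns all rows from both tables, matched where possible and filled with `NULL` where there is no match.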

**JOINs** are fundamental for working with relational databases, allowing users to:
- Retrieve data from multiple tables in a single query.
- Establish relationships between tables.
- Perform complex data analysis across related datasets.
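
To make the difference between the join types concrete, here is a minimal sketch on an in-memory SQLite database. The `authors` and `books` tables and their rows are hypothetical toy data, not part of the notebook's database; the sketch only contrasts `INNER JOIN` with `LEFT JOIN`.

```python
import sqlite3

# Hypothetical toy tables, unrelated to the notebook's db.sqlite3 file.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
INSERT INTO authors VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO books VALUES (10, 'First Book', 1), (11, 'Orphan Book', NULL);
""")

# INNER JOIN: only authors that have a matching book.
inner_rows = demo.execute("""
SELECT a.name, b.title
FROM authors a
INNER JOIN books b ON a.author_id = b.author_id
ORDER BY a.author_id;
""").fetchall()

# LEFT JOIN: every author, with None where no book matches.
left_rows = demo.execute("""
SELECT a.name, b.title
FROM authors a
LEFT JOIN books b ON a.author_id = b.author_id
ORDER BY a.author_id;
""").fetchall()

print(inner_rows)  # [('Ann', 'First Book')]
print(left_rows)   # [('Ann', 'First Book'), ('Ben', None)]
demo.close()
```

Ben has no book, so he disappears from the inner join but survives the left join with a `None` title.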

In [14]:
import sqlite3
import pandas as pd

In [15]:
# Connect to the SQLite database
conn = sqlite3.connect("../database/db.sqlite3")

# Create a cursor object for executing queries
cursor = conn.cursor()

In [16]:

# List the tables in the database
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
tables = cursor.fetchall()

# Print the table names in tabular format
from prettytable import PrettyTable
x = PrettyTable()
x.field_names = ["Table Name"]
for table in tables:
    x.add_row([table[0]])
print(x)


+--------------------+
|     Table Name     |
+--------------------+
|     department     |
|  sqlite_sequence   |
|        role        |
|      employee      |
|      students      |
|   student_phones   |
|      courses       |
|    enrollments     |
| course_assignments |
|      EMP1_NY       |
|      EMP2_ATL      |
|      EMP3_MIA      |
|    DEPT1_SMALL     |
|    DEPT2_LARGE     |
+--------------------+


## INNER JOIN 
> Returns only the rows that have matching values in both tables.




In [None]:
 
# Inner join between students and enrollments tables
cursor.execute("""
SELECT 
    s.name as student_name,
    s.email as student_email,
    e.enrollment_date,
    c.course_name
FROM students s
INNER JOIN enrollments e ON s.student_id = e.student_id
INNER JOIN courses c ON e.course_id = c.course_id;
""")

# Fetch the results
results = cursor.fetchall()

# Create a DataFrame from the results
df = pd.DataFrame(results, columns=['Student Name', 'Student Email', 'Enrollment Date', 'Course Name'])

# Display the results as a formatted table using PrettyTable
print("\nStudents and their enrolled courses:")
from prettytable import PrettyTable
x = PrettyTable()
x.field_names = ["Student Name", "Student Email", "Enrollment Date", "Course Name"]
for row in results:
    x.add_row(row)
print(x)





Students and their enrolled courses:
+------------------+------------------------------+-----------------+----------------------------------+
|   Student Name   |        Student Email         | Enrollment Date |           Course Name            |
+------------------+------------------------------+-----------------+----------------------------------+
| Alice Wonderland |      alice@example.com       |    2023-10-25   | Introduction to Computer Science |
| Alice Wonderland |      alice@example.com       |    2024-02-01   |       Advanced Mathematics       |
| Aishwarya Sharma | aishwarya.sharma@example.com |    2023-10-26   |       Physics I: Mechanics       |
|   Rahul Kumar    |   rahul.kumar@example.com    |    2023-10-27   | Introduction to Computer Science |
|   Priya Patel    |   priya.patel@example.com    |    2024-04-10   |   Chemistry: Organic Chemistry   |
|   Vikram Singh   |   vikram.singh@example.com   |    2023-10-28   |      Biology: Cell Biology       |
|   Sneha Reddy  

> **This query:**

✨ Joins the `students` table with `enrollments` using `student_id`
🔗 Further joins with `courses` table using `course_id`
📊 Shows each student's:
  - name
  - email
  - enrollment date
  - course name

🔍 Uses `INNER JOIN`, so the result only includes:
  - students who have at least one enrollment
  - courses that have at least one enrolled student

💡 **Understanding the Query:**
- In the `SELECT` clause, we list the columns we want: the student's `name` and `email`, the `enrollment_date`, and the `course_name`.
- In the `FROM` clause, we specify the first table to join (also called the left table). We start with the `students` table (aliased as `s`).
- With each `INNER JOIN`, we specify the next table to join (also called the right table). We connect:
  - `enrollments` (aliased as `e`) with `students` using `student_id`
  - `courses` (aliased as `c`) with `enrollments` using `course_id`

📝 Table aliases (s, e, c) make the query more readable and reduce typing
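
As a small illustration of what the aliases buy you, the sketch below runs the same three-table inner join twice, once with aliases and once with full table names, and checks that the results are identical. The tables reuse the notebook's table and key names, but the rows are hypothetical toy data in an in-memory database.

```python
import sqlite3

# Toy data only; the real notebook database is not touched.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_name TEXT);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id INTEGER,
    course_id INTEGER
);
INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO courses VALUES (100, 'Algebra'), (101, 'Botany');
INSERT INTO enrollments VALUES (1, 1, 100);
""")

# Same join chain as in the notebook, written with aliases.
with_aliases = demo.execute("""
SELECT s.name, c.course_name
FROM students s
INNER JOIN enrollments e ON s.student_id = e.student_id
INNER JOIN courses c ON e.course_id = c.course_id;
""").fetchall()

# Identical query with full table names instead of aliases.
without_aliases = demo.execute("""
SELECT students.name, courses.course_name
FROM students
INNER JOIN enrollments ON students.student_id = enrollments.student_id
INNER JOIN courses ON enrollments.course_id = courses.course_id;
""").fetchall()

print(with_aliases)                     # [('Asha', 'Algebra')]
print(with_aliases == without_aliases)  # True
demo.close()
```

Ben (no enrollment) and Botany (no enrolled student) are both absent, which is the inner-join behaviour described above.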

## LEFT JOIN

We’ll start our overview of outer joins with the **LEFT JOIN**. Use this join type when you want to keep every record from the left table and only the matched records from the right table; where no match exists, the right table's columns come back as `NULL`.


💡 **Key Differences from INNER JOIN:**
- INNER JOIN only shows students who are enrolled in courses
- LEFT JOIN shows ALL students, including those not enrolled


In [22]:
from prettytable import PrettyTable

# LEFT JOIN between students and enrollments tables
cursor.execute("""
SELECT 
    s.name as student_name,
    s.email as student_email,
    e.enrollment_date,
    c.course_name
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
LEFT JOIN courses c ON e.course_id = c.course_id;
""")

# Fetch the results
left_join_results = cursor.fetchall()

# Create a DataFrame from the results
left_join_df = pd.DataFrame(left_join_results, columns=['Student Name', 'Student Email', 'Enrollment Date', 'Course Name'])

# Display the DataFrame
print("\nAll students and their enrolled courses (if any):")
x = PrettyTable()
x.field_names = ["Student Name", "Student Email", "Enrollment Date", "Course Name"]
for row in left_join_results:
    x.add_row(row)
print(x)


All students and their enrolled courses (if any):
+--------------------+--------------------------------+-----------------+----------------------------------+
|    Student Name    |         Student Email          | Enrollment Date |           Course Name            |
+--------------------+--------------------------------+-----------------+----------------------------------+
|  Alice Wonderland  |       alice@example.com        |    2023-10-25   | Introduction to Computer Science |
|  Alice Wonderland  |       alice@example.com        |    2024-02-01   |       Advanced Mathematics       |
|  Aishwarya Sharma  |  aishwarya.sharma@example.com  |    2025-01-04   | Introduction to Computer Science |
|  Aishwarya Sharma  |  aishwarya.sharma@example.com  |    2023-10-26   |       Physics I: Mechanics       |
|    Rahul Kumar     |    rahul.kumar@example.com     |    2023-10-27   | Introduction to Computer Science |
|    Priya Patel     |    priya.patel@example.com     |    2024-04-10   |   C

📊 **Use Cases of LEFT JOIN**
- Finding students who haven't enrolled in any courses


In [29]:
# Execute the SQL query to find students without enrollments

cursor.execute("""
SELECT s.name as student_name, s.email as student_email
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
WHERE e.enrollment_id IS NULL;
""")

# Fetch the results
no_enrollment_results = cursor.fetchall()

# Create a DataFrame from the results
no_enrollment_df = pd.DataFrame(no_enrollment_results, columns=['Student Name', 'Student Email'])

# Display the DataFrame
print("\nStudents without enrollments:\n")
print(no_enrollment_df)



Students without enrollments:

         Student Name                   Student Email
0  Deepika Chatterjee  deepika.chatterjee@example.com
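
The `LEFT JOIN ... IS NULL` pattern is often called an anti-join. A common alternative phrasing uses `NOT EXISTS`; the sketch below, on hypothetical toy rows in an in-memory database, suggests that both forms return the same students.

```python
import sqlite3

# Toy data only: Ben is the one student without an enrollment.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id INTEGER
);
INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO enrollments VALUES (1, 1), (2, 1), (3, 3);
""")

# Anti-join via LEFT JOIN: keep rows where the right side found no match.
via_left_join = demo.execute("""
SELECT s.name
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
WHERE e.enrollment_id IS NULL;
""").fetchall()

# Same question phrased with a correlated NOT EXISTS subquery.
via_not_exists = demo.execute("""
SELECT s.name
FROM students s
WHERE NOT EXISTS (
    SELECT 1 FROM enrollments e WHERE e.student_id = s.student_id
);
""").fetchall()

print(via_left_join)   # [('Ben',)]
print(via_not_exists)  # [('Ben',)]
demo.close()
```

Filtering on the right table's primary key (`enrollment_id`) is the safer choice for the `IS NULL` test, because a primary key is only `NULL` when the join found no row.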


## RIGHT JOIN

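`RIGHT JOIN` is very similar to `LEFT JOIN`. As you may have guessed, the only difference is that `RIGHT JOIN` keeps all of the records from the right table, even if they cannot be matched to the left table. SQLite only added native `RIGHT JOIN` support in version 3.39.0, so the cell below gets the same result by swapping the table order and using `LEFT JOIN`.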
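
Swapping the table order turns a `RIGHT JOIN` into an equivalent `LEFT JOIN`. The sketch below checks this on hypothetical toy rows in an in-memory database. The native `RIGHT JOIN` branch only runs when `sqlite3.sqlite_version_info` reports SQLite 3.39.0 or later, the release that introduced `RIGHT JOIN`.

```python
import sqlite3

# Toy data only: Botany has no enrollments.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_name TEXT);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    course_id INTEGER
);
INSERT INTO courses VALUES (100, 'Algebra'), (101, 'Botany');
INSERT INTO enrollments VALUES (1, 100);
""")

# Portable form: courses on the left, so every course is kept.
swapped_left = demo.execute("""
SELECT e.enrollment_id, c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
ORDER BY c.course_id;
""").fetchall()
print(swapped_left)  # [(1, 'Algebra'), (None, 'Botany')]

# Native form, attempted only on SQLite builds that support RIGHT JOIN.
if sqlite3.sqlite_version_info >= (3, 39, 0):
    native_right = demo.execute("""
    SELECT e.enrollment_id, c.course_name
    FROM enrollments e
    RIGHT JOIN courses c ON c.course_id = e.course_id
    ORDER BY c.course_id;
    """).fetchall()
    print(native_right == swapped_left)  # True

demo.close()
```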


In [23]:
# RIGHT JOIN equivalent: Swap the tables in a LEFT JOIN
cursor.execute("""
SELECT 
    s.name as student_name,
    s.email as student_email,
    e.enrollment_date,
    c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
LEFT JOIN students s ON e.student_id = s.student_id;
""")

# Fetch the results
right_join_results = cursor.fetchall()

# Create a DataFrame from the results
right_join_df = pd.DataFrame(right_join_results, columns=['Student Name', 'Student Email', 'Enrollment Date', 'Course Name'])

# Display the DataFrame
print("\nAll courses and their enrolled students (if any):")
x = PrettyTable()
x.field_names = ["Student Name", "Student Email", "Enrollment Date", "Course Name"]
for row in right_join_results:
    x.add_row(row)
print(x)


All courses and their enrolled students (if any):
+------------------+------------------------------+-----------------+----------------------------------+
|   Student Name   |        Student Email         | Enrollment Date |           Course Name            |
+------------------+------------------------------+-----------------+----------------------------------+
| Alice Wonderland |      alice@example.com       |    2023-10-25   | Introduction to Computer Science |
| Aishwarya Sharma | aishwarya.sharma@example.com |    2025-01-04   | Introduction to Computer Science |
|   Rahul Kumar    |   rahul.kumar@example.com    |    2023-10-27   | Introduction to Computer Science |
| Alice Wonderland |      alice@example.com       |    2024-02-01   |       Advanced Mathematics       |
| Aishwarya Sharma | aishwarya.sharma@example.com |    2023-10-26   |       Physics I: Mechanics       |
|   Priya Patel    |   priya.patel@example.com    |    2024-04-10   |   Chemistry: Organic Chemistry   |
|   

In [30]:
from prettytable import PrettyTable

# Query to find courses with no enrollments
cursor.execute("""
SELECT 
    c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
WHERE e.enrollment_id IS NULL;
""")

# Fetch the results
no_enrollment_courses = cursor.fetchall()

# Display the results in a PrettyTable
x = PrettyTable()
x.field_names = ["Course Name"]
for course in no_enrollment_courses:
    x.add_row(course)
print("\nCourses with no enrollments:")
print(x)


Courses with no enrollments:
+--------------------------------+
|          Course Name           |
+--------------------------------+
| Data Structures and Algorithms |
|        Web Development         |
+--------------------------------+


## FULL OUTER JOIN

A `FULL OUTER JOIN` in SQL combines the results of both `LEFT` and `RIGHT OUTER JOIN`s. It returns all rows from both tables, matching records where the join condition is met and including unmatched rows from both tables with `NULL` values in place of missing data. This join type is useful when you need to see all data from both tables, regardless of whether there are matching rows, and is particularly valuable for identifying missing relationships or performing data reconciliation between two tables.
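
One way to emulate a full outer join without native support is to combine a left join with the rows the left join cannot reach. The sketch below, on hypothetical toy rows in an in-memory database, compares two variants: a `UNION` of two table-swapped left joins, and a `UNION ALL` whose second half keeps only courses with no enrollment. Note that `UNION` removes duplicate rows, while `UNION ALL` does not.

```python
import sqlite3

# Toy data only: Ben has no enrollment and Botany has no students.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_name TEXT);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id INTEGER,
    course_id INTEGER
);
INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO courses VALUES (100, 'Algebra'), (101, 'Botany');
INSERT INTO enrollments VALUES (1, 1, 100);
""")

# Variant 1: UNION of two LEFT JOINs with the table order swapped.
union_rows = demo.execute("""
SELECT s.name, c.course_name
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
LEFT JOIN courses c ON e.course_id = c.course_id
UNION
SELECT s.name, c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
LEFT JOIN students s ON e.student_id = s.student_id
ORDER BY 1, 2;
""").fetchall()

# Variant 2: UNION ALL, adding only the courses nobody is enrolled in.
union_all_rows = demo.execute("""
SELECT s.name, c.course_name
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
LEFT JOIN courses c ON e.course_id = c.course_id
UNION ALL
SELECT NULL, c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
WHERE e.enrollment_id IS NULL
ORDER BY 1, 2;
""").fetchall()

print(union_rows)  # [(None, 'Botany'), ('Asha', 'Algebra'), ('Ben', None)]
print(union_rows == union_all_rows)  # True
demo.close()
```

The two variants agree here, but they can differ when the matched rows themselves contain genuine duplicates, since only `UNION` collapses them.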

In [31]:
from prettytable import PrettyTable

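# Simulate a FULL OUTER JOIN: UNION a LEFT JOIN with its table-swapped counterpart
# (the swapped LEFT JOIN stands in for a RIGHT JOIN; UNION also removes duplicate rows)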
cursor.execute("""
SELECT 
    s.name AS student_name,
    s.email AS student_email,
    e.enrollment_date,
    c.course_name
FROM students s
LEFT JOIN enrollments e ON s.student_id = e.student_id
LEFT JOIN courses c ON e.course_id = c.course_id

UNION

SELECT 
    s.name AS student_name,
    s.email AS student_email,
    e.enrollment_date,
    c.course_name
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
LEFT JOIN students s ON e.student_id = s.student_id;
""")

# Fetch the results
full_outer_join_results = cursor.fetchall()

# Display the results using PrettyTable
x = PrettyTable()
x.field_names = ["Student Name", "Student Email", "Enrollment Date", "Course Name"]
for row in full_outer_join_results:
    x.add_row(row)
print("\nFull Outer Join Results:")
print(x)


Full Outer Join Results:
+--------------------+--------------------------------+-----------------+----------------------------------+
|    Student Name    |         Student Email          | Enrollment Date |           Course Name            |
+--------------------+--------------------------------+-----------------+----------------------------------+
|        None        |              None              |       None      |  Data Structures and Algorithms  |
|        None        |              None              |       None      |         Web Development          |
|  Aishwarya Sharma  |  aishwarya.sharma@example.com  |    2023-10-26   |       Physics I: Mechanics       |
|  Aishwarya Sharma  |  aishwarya.sharma@example.com  |    2025-01-04   | Introduction to Computer Science |
|  Alice Wonderland  |       alice@example.com        |    2023-10-25   | Introduction to Computer Science |
|  Alice Wonderland  |       alice@example.com        |    2024-02-01   |       Advanced Mathematics  

# 🔄 Self Join in SQL

> **A self join is when a table is joined with itself. It's particularly useful when a table contains hierarchical or recursive data.**

## 💡 Key Points
- Requires unique aliases for each instance of the table
- Can use any type of join (INNER, LEFT, RIGHT)
- Helps establish relationships within the same table
- Useful for:
  - Organizational hierarchies
  - Sequential relationships
  - Peer-to-peer relationships
  - Recursive queries

## 📌 Example
> **Suppose we have a table called `employees` with the following structure:**

| employee_id | name | manager_id |
| --- | --- | --- |
| 1 | Alice | NULL |
| 2 | Bob | 1 |
| 3 | Charlie | 1 |
| 4 | David | 2 |
| 5 | Eve | 2 |


> **We want to find out who manages whom.**

## 📝 SQL Query
```sql
SELECT 
    e1.name AS employee_name,
    e2.name AS manager_name
FROM 
    employees e1
JOIN 
    employees e2 ON e1.manager_id = e2.employee_id;
```
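
The query above can be tried out on the example table with an in-memory SQLite database. This is a minimal sketch: the rows are taken from the table shown earlier, and the `LEFT JOIN` variant is an added illustration of how to keep employees who have no manager.

```python
import sqlite3

# Build the example employees table in memory.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name TEXT,
    manager_id INTEGER
);
INSERT INTO employees VALUES
    (1, 'Alice', NULL),
    (2, 'Bob', 1),
    (3, 'Charlie', 1),
    (4, 'David', 2),
    (5, 'Eve', 2);
""")

# Inner self join: only employees whose manager_id matches another row.
inner_pairs = demo.execute("""
SELECT e1.name, e2.name
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.employee_id
ORDER BY e1.employee_id;
""").fetchall()

# Left self join: also keeps Alice, whose manager_id is NULL.
left_pairs = demo.execute("""
SELECT e1.name, e2.name
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.employee_id
ORDER BY e1.employee_id;
""").fetchall()

print(inner_pairs)
# [('Bob', 'Alice'), ('Charlie', 'Alice'), ('David', 'Bob'), ('Eve', 'Bob')]
print(left_pairs[0])  # ('Alice', None)
demo.close()
```

Because the plain `JOIN` is an inner join, Alice does not appear as an employee in its result; she only shows up as a manager.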


### 🔍 Example Scenarios

In [33]:
from prettytable import PrettyTable
# Query to find students in the same department
cursor.execute("""
SELECT 
    s1.name as student1,
    s2.name as student2
FROM students s1
JOIN students s2 ON s1.department_id = s2.department_id
WHERE s1.student_id < s2.student_id;
""")

# Fetch the results
same_department_results = cursor.fetchall()

# Display the results using PrettyTable

x = PrettyTable()
x.field_names = ["Student 1", "Student 2"]
for row in same_department_results:
    x.add_row(row)
print("\nStudents in the same department:")
print(x)


Students in the same department:
+------------------+--------------------+
|    Student 1     |     Student 2      |
+------------------+--------------------+
| Alice Wonderland |  Aishwarya Sharma  |
| Alice Wonderland | Deepika Chatterjee |
| Alice Wonderland |    Vikram Singh    |
| Aishwarya Sharma | Deepika Chatterjee |
| Aishwarya Sharma |    Vikram Singh    |
|   Rahul Kumar    |    Sneha Reddy     |
|   Priya Patel    |    Arjun Menon     |
|   Vikram Singh   | Deepika Chatterjee |
+------------------+--------------------+


### 🔍 Explanation of the Query

The query performs a **Self Join** on the `students` table to find pairs of students who belong to the same department. Here's a breakdown of what happens:

1. **Self Join**:
    - The `students` table is joined with itself using the condition `s1.department_id = s2.department_id`.
    - This ensures that only students from the same department are paired.

2. **Aliasing**:
    - The table is aliased as `s1` and `s2` to differentiate between the two instances of the same table.

3. **Condition to Avoid Duplicate Pairs**:
    - The condition `s1.student_id < s2.student_id` ensures that each pair is listed only once and that no student is paired with themselves (e.g., if Alice and Bob are in the same department, only one row is returned for this pair).

4. **Columns Selected**:
    - The query selects the names of the two students (`s1.name` as `student1` and `s2.name` as `student2`).

5. **Result**:
    - The query returns a list of student pairs who belong to the same department.
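
The effect of the `<` condition in step 3 can be seen by swapping it for `<>`. The sketch below uses hypothetical toy rows in an in-memory database; the `PAIR_QUERY` template and its `{op}` placeholder are illustration-only helpers, filled with fixed operators rather than user input.

```python
import sqlite3

# Toy data only: Asha and Ben share department 1, Chen is alone in 2.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    name TEXT,
    department_id INTEGER
);
INSERT INTO students VALUES (1, 'Asha', 1), (2, 'Ben', 1), (3, 'Chen', 2);
""")

# Hypothetical template; {op} is replaced by a hard-coded comparison operator.
PAIR_QUERY = """
SELECT s1.name, s2.name
FROM students s1
JOIN students s2 ON s1.department_id = s2.department_id
WHERE s1.student_id {op} s2.student_id
ORDER BY s1.student_id, s2.student_id;
"""

# '<' keeps one row per pair; '<>' keeps both orderings of each pair.
unique_pairs = demo.execute(PAIR_QUERY.format(op="<")).fetchall()
mirrored_pairs = demo.execute(PAIR_QUERY.format(op="<>")).fetchall()

print(unique_pairs)    # [('Asha', 'Ben')]
print(mirrored_pairs)  # [('Asha', 'Ben'), ('Ben', 'Asha')]
demo.close()
```

Dropping the `WHERE` clause entirely would additionally pair every student with themselves, which is rarely what a "same department" report needs.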

### 📊 Use Case
This query is useful for:
- Identifying relationships or collaborations between students in the same department.
- Analyzing departmental groupings or connections.

### 📝 Example Output
| Student 1        | Student 2          |
|------------------|--------------------|
| Alice Wonderland | Aishwarya Sharma   |
| Alice Wonderland | Deepika Chatterjee |
| Alice Wonderland | Vikram Singh       |
| Aishwarya Sharma | Deepika Chatterjee |
| Aishwarya Sharma | Vikram Singh       |
| Rahul Kumar      | Sneha Reddy        |
| Priya Patel      | Arjun Menon        |
| Vikram Singh     | Deepika Chatterjee |