# Class 10

Materials prepared by dr Maciej Świtała and Ewa Weychert

### 1. Introduction

**SQL** (Structured Query Language) is a query language used to manage relational databases. Python offers libraries for working with many SQL database systems, the most popular of which are:
* SQLite (a lightweight, file-based database supported by Python's built-in `sqlite3` module),
* MySQL,
* PostgreSQL,

although of course there are other database systems and Python libraries for working with them.

These libraries provide the ability to create, read, update, and delete data directly from within Python. Most importantly, they enable interoperability — the ability to combine SQL data with Python-based analyses.

### 2. SQLite

In this section, we will focus on using SQLite, because it does not require installing additional components. If you are using a different database, most steps will be similar.

Please install DB Browser for SQLite: https://sqlitebrowser.org/dl/


In [2]:
import sqlite3

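The `sqlite3` module is part of Python's standard library, so the import above needs no installation. To use MySQL in Python instead, you need to install the `mysql-connector-python` or `pymysql` library.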

### 2.1. Creating databases and tables

In [4]:
# Connecting to the database (creates the file if it doesn't exist)
conn = sqlite3.connect('00_outputs/07/my_database.db')

# Creating a cursor
cur = conn.cursor()

# Creating a table
cur.execute('''CREATE TABLE IF NOT EXISTS employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL
)''')

# Saving the changes
conn.commit()

# Closing the connection
conn.close()
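
To check that a table was really created, you can query SQLite's own catalog. The sketch below is a minimal illustration, assuming an in-memory database (`":memory:"`) instead of the file used above, so that it can be run repeatedly without leaving anything on disk. The `tables` and `columns` names are only illustrative.

```python
import sqlite3
from contextlib import closing

# ":memory:" keeps the database in RAM; closing() guarantees conn.close() is called
with closing(sqlite3.connect(":memory:")) as conn:
    cur = conn.cursor()
    cur.execute('''CREATE TABLE IF NOT EXISTS employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department TEXT NOT NULL,
        salary REAL
    )''')
    conn.commit()

    # sqlite_master is SQLite's built-in catalog of schema objects
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    tables = [row[0] for row in cur.fetchall()]

    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    cur.execute("PRAGMA table_info(employees)")
    columns = [(row[1], row[2]) for row in cur.fetchall()]

print(tables)
print(columns)
```

The same two queries work on a file-based database such as the one created above.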


### 2.2. CRUD Operations

CRUD refers to the operations of creating (Create), reading (Read), updating (Update), and deleting (Delete) data. We will discuss these operations below.

### 2.2.1. CREATE (adding data)


In [5]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# Inserting data without parameterization
cur.execute("INSERT INTO employees (name, department, salary) VALUES ('Jessy James', 'HR', 50000)")

# Adding another record
cur.execute("INSERT INTO employees (name, department, salary) VALUES ('Robert Ford', 'HR', 75000)")

# Saving the changes
conn.commit()

# Closing the connection
conn.close()


You can also add several records at once:

In [6]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# Inserting multiple records without parameterization
cur.execute("""
    INSERT INTO employees (name, department, salary)
    VALUES 
    ('Clint Eastwood', 'HR', 65000),
    ('Billy Kid', 'Engineering', 85000)
""")

# Saving the changes
conn.commit()

# Closing the connection
conn.close()


It is good practice to insert data using parameterization.

Parameterization in SQL means separating the input data (values) from the SQL query itself. The main reasons to use parameterization are security, readability, and performance.

SQL Injection is an attack in which a malicious user can provide input in a way that alters the structure of an SQL query and performs unwanted operations on the database (e.g., deleting data, stealing data). Using parameterization helps prevent this, because input values are separated from the SQL query and treated as data, not code.

Example of parameterization:


In [7]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# List of data to insert
data = [
    ('Forest Gump', 'Engineering', 115000),
    ('Tom Hanks', 'HR', 85000)
]

# Inserting multiple records using executemany
cur.executemany("INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)", data)

# Saving the changes
conn.commit()

# Closing the connection
conn.close()
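
To make the SQL Injection risk described above concrete, the following sketch contrasts a query built by string formatting with a parameterized one. It is only an illustration: the in-memory database, the two sample rows, and the `user_input` value are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('''CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL
)''')
cur.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [("Jessy James", "HR", 50000), ("Billy Kid", "Engineering", 85000)],
)

# Hypothetical malicious input typed by a user
user_input = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and rewrites the WHERE clause
unsafe_query = f"SELECT name FROM employees WHERE name = '{user_input}'"
unsafe_rows = cur.execute(unsafe_query).fetchall()

# Safe: the input is bound to the ? placeholder and compared as a plain string
safe_rows = cur.execute(
    "SELECT name FROM employees WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # 2: every row matches because of OR '1'='1'
print(len(safe_rows))    # 0: no employee has that literal name

conn.close()
```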


### 2.2.2. READ (reading data)

In [8]:
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# Reading all data
cur.execute("SELECT * FROM employees")

# Fetching and displaying results
rows = cur.fetchall()
for row in rows:
    print(row)

conn.close()


(2, 'Robert Ford', 'HR', 75000.0)
(3, 'Clint Eastwood', 'HR', 65000.0)
(4, 'Billy Kid', 'Engineering', 85000.0)
(5, 'Forest Gump', 'Engineering', 115000.0)
(6, 'Tom Hanks', 'HR', 85000.0)
(7, 'Jessy James', 'HR', 50000.0)
(8, 'Robert Ford', 'HR', 75000.0)
(9, 'Clint Eastwood', 'HR', 65000.0)
(10, 'Billy Kid', 'Engineering', 85000.0)
(11, 'Forest Gump', 'Engineering', 115000.0)
(12, 'Tom Hanks', 'HR', 85000.0)
(13, 'Jessy James', 'HR', 50000.0)
(14, 'Robert Ford', 'HR', 75000.0)
(15, 'Clint Eastwood', 'HR', 65000.0)
(16, 'Billy Kid', 'Engineering', 85000.0)
(17, 'Forest Gump', 'Engineering', 115000.0)
(18, 'Tom Hanks', 'HR', 85000.0)
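
Reading does not have to return the whole table. The sketch below, which assumes a small in-memory copy of the `employees` table with three illustrative rows, filters with a parameterized `WHERE` clause, fetches the first row with `fetchone()`, and uses `sqlite3.Row` so that columns can be accessed by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows support access by column name
cur = conn.cursor()
cur.execute('''CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL
)''')
cur.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [("Robert Ford", "HR", 75000),
     ("Billy Kid", "Engineering", 85000),
     ("Tom Hanks", "HR", 85000)],
)

# Only HR employees, highest salary first
cur.execute(
    "SELECT name, salary FROM employees WHERE department = ? ORDER BY salary DESC",
    ("HR",),
)

top = cur.fetchone()   # first matching row
rest = cur.fetchall()  # all remaining rows

print(top["name"], top["salary"])
print([row["name"] for row in rest])

conn.close()
```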


### 2.2.3. UPDATE (updating data)

In [9]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# Updating data without parameterization
cur.execute("UPDATE employees SET salary = 75000 WHERE id = 1")

# Retrieving results
cur.execute("SELECT * FROM employees")  # Fetch all records from the employees table
rows = cur.fetchall()

# Displaying results
for row in rows:
    print(row)

# Saving the changes
conn.commit()

# Closing the connection
conn.close()


(2, 'Robert Ford', 'HR', 75000.0)
(3, 'Clint Eastwood', 'HR', 65000.0)
(4, 'Billy Kid', 'Engineering', 85000.0)
(5, 'Forest Gump', 'Engineering', 115000.0)
(6, 'Tom Hanks', 'HR', 85000.0)
(7, 'Jessy James', 'HR', 50000.0)
(8, 'Robert Ford', 'HR', 75000.0)
(9, 'Clint Eastwood', 'HR', 65000.0)
(10, 'Billy Kid', 'Engineering', 85000.0)
(11, 'Forest Gump', 'Engineering', 115000.0)
(12, 'Tom Hanks', 'HR', 85000.0)
(13, 'Jessy James', 'HR', 50000.0)
(14, 'Robert Ford', 'HR', 75000.0)
(15, 'Clint Eastwood', 'HR', 65000.0)
(16, 'Billy Kid', 'Engineering', 85000.0)
(17, 'Forest Gump', 'Engineering', 115000.0)
(18, 'Tom Hanks', 'HR', 85000.0)
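
The output above contains no row with `id = 1`, so the `UPDATE` did not change anything, and it did so silently. A possible way to notice this is to check `cursor.rowcount` after the statement. The sketch below assumes an in-memory table with a single illustrative row; `set_salary` is a hypothetical helper, not part of `sqlite3`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('''CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL
)''')
cur.execute(
    "INSERT INTO employees (id, name, department, salary) VALUES (?, ?, ?, ?)",
    (2, "Robert Ford", "HR", 75000),
)

def set_salary(cursor, employee_id, new_salary):
    """Hypothetical helper: parameterized UPDATE that returns the number of changed rows."""
    cursor.execute(
        "UPDATE employees SET salary = ? WHERE id = ?",
        (new_salary, employee_id),
    )
    return cursor.rowcount

missing = set_salary(cur, 1, 75000)  # no row with id 1, nothing is updated
updated = set_salary(cur, 2, 80000)  # one row is updated
conn.commit()

new_salary = cur.execute("SELECT salary FROM employees WHERE id = 2").fetchone()[0]
print(missing, updated, new_salary)

conn.close()
```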


### 2.2.4. DELETE (deleting data)

In [10]:
conn = sqlite3.connect('00_outputs/07/my_database.db')
cur = conn.cursor()

# Deleting data
cur.execute("DELETE FROM employees WHERE id = 1")

# Retrieving results
cur.execute("SELECT * FROM employees")  # Fetch all records from the employees table
rows = cur.fetchall()

# Displaying results
for row in rows:
    print(row)

conn.commit()
conn.close()


(2, 'Robert Ford', 'HR', 75000.0)
(3, 'Clint Eastwood', 'HR', 65000.0)
(4, 'Billy Kid', 'Engineering', 85000.0)
(5, 'Forest Gump', 'Engineering', 115000.0)
(6, 'Tom Hanks', 'HR', 85000.0)
(7, 'Jessy James', 'HR', 50000.0)
(8, 'Robert Ford', 'HR', 75000.0)
(9, 'Clint Eastwood', 'HR', 65000.0)
(10, 'Billy Kid', 'Engineering', 85000.0)
(11, 'Forest Gump', 'Engineering', 115000.0)
(12, 'Tom Hanks', 'HR', 85000.0)
(13, 'Jessy James', 'HR', 50000.0)
(14, 'Robert Ford', 'HR', 75000.0)
(15, 'Clint Eastwood', 'HR', 65000.0)
(16, 'Billy Kid', 'Engineering', 85000.0)
(17, 'Forest Gump', 'Engineering', 115000.0)
(18, 'Tom Hanks', 'HR', 85000.0)
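
Because deletions are hard to undo, it can help to run them inside a transaction. The sketch below is one possible approach, assuming an in-memory table with three illustrative rows: using the connection as a context manager (`with conn:`) commits when the block succeeds and rolls back when it raises an exception. The `RuntimeError` only simulates a failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('''CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL
)''')
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [("Jessy James", "HR", 50000),
     ("Robert Ford", "HR", 75000),
     ("Billy Kid", "Engineering", 85000)],
)
conn.commit()

# A failure inside the block rolls the DELETE back
try:
    with conn:
        conn.execute("DELETE FROM employees WHERE department = ?", ("HR",))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

remaining_after_failure = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

# Without an error, the DELETE is committed when the block ends
with conn:
    deleted = conn.execute(
        "DELETE FROM employees WHERE department = ?", ("HR",)
    ).rowcount

remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(remaining_after_failure, deleted, remaining)

conn.close()
```

Note that `with conn:` manages the transaction only; the connection still has to be closed explicitly.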


### 2.3. JOIN (joining tables)

Let's start by creating two tables in one database.


In [11]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/company.db')
cur = conn.cursor()

# Creating the departments table
cur.execute('''CREATE TABLE IF NOT EXISTS departments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    budget INTEGER NOT NULL
)''')

# Creating the employees table
cur.execute('''CREATE TABLE IF NOT EXISTS employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL,
    FOREIGN KEY(department_id) REFERENCES departments(id)
)''')

# Adding data without parameterization
cur.execute("INSERT INTO departments (name, budget) VALUES ('HR', 100000)")
cur.execute("INSERT INTO departments (name, budget) VALUES ('Engineering', 250000)")
cur.execute("INSERT INTO departments (name, budget) VALUES ('Marketing', 150000)")

cur.execute("INSERT INTO employees (name, department_id, salary) VALUES ('John Doe', 1, 50000)")
cur.execute("INSERT INTO employees (name, department_id, salary) VALUES ('Jane Smith', 2, 60000)")
cur.execute("INSERT INTO employees (name, department_id, salary) VALUES ('Mike Johnson', 3, 55000)")

# Saving the changes
conn.commit()

# Closing the connection
conn.close()
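
One detail worth knowing about the `FOREIGN KEY` clause used above: in a typical SQLite build the constraint is parsed but not enforced unless foreign key support is switched on for the connection with `PRAGMA foreign_keys = ON`. The sketch below illustrates this under the assumption of an in-memory database with the same two tables; the employee "Ghost Worker" and department id 99 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Must be issued per connection, before any transaction is open
cur.execute("PRAGMA foreign_keys = ON")

cur.execute('''CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    budget INTEGER NOT NULL
)''')
cur.execute('''CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL,
    FOREIGN KEY(department_id) REFERENCES departments(id)
)''')

cur.execute("INSERT INTO departments (name, budget) VALUES (?, ?)", ("HR", 100000))
cur.execute(
    "INSERT INTO employees (name, department_id, salary) VALUES (?, ?, ?)",
    ("John Doe", 1, 50000),
)

# Department 99 does not exist, so this insert should be rejected
try:
    cur.execute(
        "INSERT INTO employees (name, department_id, salary) VALUES (?, ?, ?)",
        ("Ghost Worker", 99, 40000),
    )
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Rejected:", exc)

conn.commit()
employee_count = cur.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, employee_count)

conn.close()
```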


Let's take a look at what these tables look like.

In [12]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/company.db')
cur = conn.cursor()

# First table
cur.execute('''SELECT * FROM departments''')

# Fetching results
rows = cur.fetchall()

# Displaying results
for row in rows:
    print(row)

# Spacing for readability
print('\n')

# Second table
cur.execute('''SELECT * FROM employees''')

# Fetching results
rows = cur.fetchall()

# Displaying results
for row in rows:
    print(row)

# Closing the connection
conn.close()


(1, 'HR', 100000)
(2, 'Engineering', 250000)
(3, 'Marketing', 150000)
(4, 'HR', 100000)
(5, 'Engineering', 250000)
(6, 'Marketing', 150000)
(7, 'HR', 100000)
(8, 'Engineering', 250000)
(9, 'Marketing', 150000)


(1, 'John Doe', 1, 50000.0)
(2, 'Jane Smith', 2, 60000.0)
(3, 'Mike Johnson', 3, 55000.0)
(4, 'John Doe', 1, 50000.0)
(5, 'Jane Smith', 2, 60000.0)
(6, 'Mike Johnson', 3, 55000.0)
(7, 'John Doe', 1, 50000.0)
(8, 'Jane Smith', 2, 60000.0)
(9, 'Mike Johnson', 3, 55000.0)


Now let's join the tables using INNER JOIN.

In [13]:
# Connecting to the database
conn = sqlite3.connect('00_outputs/07/company.db')
cur = conn.cursor()

# Joining tables using INNER JOIN
cur.execute('''SELECT employees.name, departments.name, employees.salary, departments.budget 
               FROM employees
               JOIN departments ON employees.department_id = departments.id''')

# Fetching results
rows = cur.fetchall()

# Displaying results
for row in rows:
    print(row)

# Closing the connection
conn.close()


('John Doe', 'HR', 50000.0, 100000)
('Jane Smith', 'Engineering', 60000.0, 250000)
('Mike Johnson', 'Marketing', 55000.0, 150000)
('John Doe', 'HR', 50000.0, 100000)
('Jane Smith', 'Engineering', 60000.0, 250000)
('Mike Johnson', 'Marketing', 55000.0, 150000)
('John Doe', 'HR', 50000.0, 100000)
('Jane Smith', 'Engineering', 60000.0, 250000)
('Mike Johnson', 'Marketing', 55000.0, 150000)


Documentation for SQLite: https://www.sqlite.org/. Documentation for Python's `sqlite3` module: https://docs.python.org/3/library/sqlite3.html.

In [14]:
import sqlite3
import pandas as pd

# ==========================================================================
# 1) CREATE IN-MEMORY SQLITE DB AND LOAD SAMPLE TABLES
# ==========================================================================
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
DROP TABLE IF EXISTS A;
DROP TABLE IF EXISTS B;

CREATE TABLE A (
    key INTEGER,
    val_A TEXT
);

CREATE TABLE B (
    key INTEGER,
    val_B TEXT
);

INSERT INTO A (key, val_A) VALUES
(1, 'A1'),
(2, 'A2'),
(3, 'A3'),
(4, 'A4');

INSERT INTO B (key, val_B) VALUES
(3, 'B3'),
(4, 'B4a'),
(4, 'B4b'),
(5, 'B5');
""")

def sql(q):
    return pd.read_sql_query(q, conn)

print("SQL Table A:\n", sql("SELECT * FROM A"), "\n")
print("SQL Table B:\n", sql("SELECT * FROM B"), "\n")

# ==========================================================================
# Helper to display SQL + pandas side by side
# ==========================================================================
def explain(title, sql_query, pandas_code, result):
    print("=" * 80)
    print(title)
    print("- SQL")
    print(sql_query)
    print("- pandas")
    print(pandas_code)
    print("- Result")
    print(result, "\n")


# ==========================================================================
# 2) PERFORM SQL JOINS + MATCHING PANDAS MERGE
# ==========================================================================

# LEFT JOIN (A left of B, red left circle)
sql_left = """
SELECT A.*, B.val_B
FROM A
LEFT JOIN B ON A.key = B.key;
"""
left_sql = sql(sql_left)
left_pd = left_sql.copy()  # Compare directly with SQL result
left_pd2 = pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"), on="key", how="left")

explain("LEFT JOIN",
        sql_left,
        'pd.merge(A, B, on="key", how="left")',
        left_pd2)


# RIGHT JOIN (A right join B, red right circle)
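# SQLite versions before 3.39.0 have no RIGHT JOIN, so we simulate it by swapping the tables in a LEFT JOIN.
# Note: A.key is NULL for keys found only in B; selecting B.key instead would keep them.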
sql_right = """
SELECT A.key, A.val_A, B.val_B
FROM B
LEFT JOIN A ON A.key = B.key;
"""
right_sql = sql(sql_right)
right_pd2 = pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"), on="key", how="right")

explain("RIGHT JOIN",
        sql_right,
        'pd.merge(A, B, on="key", how="right")',
        right_pd2)


# INNER JOIN (middle intersection)
sql_inner = """
SELECT A.*, B.val_B
FROM A
INNER JOIN B ON A.key = B.key;
"""
inner_sql = sql(sql_inner)
inner_pd2 = pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"), on="key", how="inner")

explain("INNER JOIN",
        sql_inner,
        'pd.merge(A, B, on="key", how="inner")',
        inner_pd2)


# LEFT ANTI JOIN (A only, not in B)
sql_left_anti = """
SELECT A.*
FROM A
LEFT JOIN B ON A.key = B.key
WHERE B.key IS NULL;
"""
left_anti_sql = sql(sql_left_anti)

left_anti_pd = (
    pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"),
             on="key", how="left", indicator=True)
      .query("_merge == 'left_only'")
      .drop(columns="_merge")
)

explain("LEFT ANTI JOIN",
        sql_left_anti,
        """A.merge(B, how="left", on="key", indicator=True)
   .query("_merge == 'left_only'")""",
        left_anti_pd)


# RIGHT ANTI JOIN (B only, not in A)
sql_right_anti = """
SELECT B.*
FROM B
LEFT JOIN A ON A.key = B.key
WHERE A.key IS NULL;
"""
right_anti_sql = sql(sql_right_anti)

right_anti_pd = (
    pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"),
             on="key", how="right", indicator=True)
      .query("_merge == 'right_only'")
      .drop(columns="_merge")
)

explain("RIGHT ANTI JOIN",
        sql_right_anti,
        """A.merge(B, how="right", on="key", indicator=True)
   .query("_merge == 'right_only'")""",
        right_anti_pd)


# FULL OUTER JOIN (all rows from both)
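# SQLite versions before 3.39.0 do not support FULL OUTER JOIN, so we simulate it using UNION.
# The second branch selects B.key so that keys found only in B are not returned as NULL.
sql_full_outer = """
SELECT A.key, A.val_A, B.val_B
FROM A
LEFT JOIN B ON A.key = B.key
UNION
SELECT B.key, A.val_A, B.val_B
FROM B
LEFT JOIN A ON A.key = B.key;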
"""
full_outer_sql = sql(sql_full_outer)

full_outer_pd = pd.merge(
    sql("SELECT * FROM A"),
    sql("SELECT * FROM B"),
    on="key",
    how="outer"
)

explain("FULL OUTER JOIN",
        sql_full_outer,
        'pd.merge(A, B, on="key", how="outer")',
        full_outer_pd)


# FULL OUTER ANTI JOIN (rows only in A OR only in B)
sql_full_anti = """
SELECT *
FROM (
    SELECT A.key, A.val_A, NULL AS val_B
    FROM A LEFT JOIN B ON A.key = B.key
    WHERE B.key IS NULL

    UNION

    SELECT B.key, NULL AS val_A, B.val_B
    FROM B LEFT JOIN A ON A.key = B.key
    WHERE A.key IS NULL
);
"""
full_anti_sql = sql(sql_full_anti)

full_anti_pd = (
    pd.merge(sql("SELECT * FROM A"), sql("SELECT * FROM B"),
             on="key", how="outer", indicator=True)
      .query("_merge != 'both'")
      .drop(columns="_merge")
)

explain("FULL OUTER ANTI JOIN",
        sql_full_anti,
        """A.merge(B, how="outer", on="key", indicator=True)
   .query("_merge != 'both'")""",
        full_anti_pd)

print("Done.")


SQL Table A:
    key val_A
0    1    A1
1    2    A2
2    3    A3
3    4    A4 

SQL Table B:
    key val_B
0    3    B3
1    4   B4a
2    4   B4b
3    5    B5 

LEFT JOIN
- SQL

SELECT A.*, B.val_B
FROM A
LEFT JOIN B ON A.key = B.key;

- pandas
pd.merge(A, B, on="key", how="left")
- Result
   key val_A val_B
0    1    A1   NaN
1    2    A2   NaN
2    3    A3    B3
3    4    A4   B4a
4    4    A4   B4b 

RIGHT JOIN
- SQL

SELECT A.key, A.val_A, B.val_B
FROM B
LEFT JOIN A ON A.key = B.key;

- pandas
pd.merge(A, B, on="key", how="right")
- Result
   key val_A val_B
0    3    A3    B3
1    4    A4   B4a
2    4    A4   B4b
3    5   NaN    B5 

INNER JOIN
- SQL

SELECT A.*, B.val_B
FROM A
INNER JOIN B ON A.key = B.key;

- pandas
pd.merge(A, B, on="key", how="inner")
- Result
   key val_A val_B
0    3    A3    B3
1    4    A4   B4a
2    4    A4   B4b 

LEFT ANTI JOIN
- SQL

SELECT A.*
FROM A
LEFT JOIN B ON A.key = B.key
WHERE B.key IS NULL;

- pandas
A.merge(B, how="left", on="key", indicato

# Joining in Pandas vs SQL

### Explanation

The two code cells below demonstrate how relational joins can be performed using two different technologies:

1. **pandas** — for in-memory DataFrame operations  
2. **SQLite** — for SQL-style relational queries executed inside Python

Both cells operate on the **same pair of tables (`A` and `B`)**, making it easy to compare how identical join logic is expressed in pandas versus SQL.

In **pandas**, joins are performed with the `DataFrame.merge()` method, which provides a high-level and Pythonic interface for relational operations such as **LEFT JOIN**, **RIGHT JOIN**, **INNER JOIN**, and **FULL OUTER JOIN**. Pandas also makes analytical patterns like **anti-joins** straightforward using the `indicator=True` option, which helps identify rows that appear only in one of the two datasets.

In **SQL**, joins are expressed declaratively using `JOIN` clauses. SQLite has long supported **LEFT JOIN** and **INNER JOIN**, while **RIGHT JOIN** and **FULL OUTER JOIN** are available natively only from SQLite 3.39.0 onward. On older versions they must be emulated with techniques like swapping tables or combining queries with `UNION`, which is the approach used here. Regardless of syntax, the underlying relational logic remains the same: determining how rows from two tables match based on shared key values.

Together, the two cells provide a clear, side-by-side comparison of how relational concepts translate into both a **programmatic workflow (pandas)** and a **declarative workflow (SQL)**. This makes the notebook a useful reference for learning, teaching, or debugging join behavior across different data tools.
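
As a side note on the emulation of **RIGHT JOIN**: whether it is needed depends on the SQLite library bundled with your Python, since native `RIGHT JOIN` and `FULL OUTER JOIN` were added in SQLite 3.39.0. The sketch below is a hedged illustration that checks `sqlite3.sqlite_version_info` and picks either the native syntax or the swapped `LEFT JOIN`. The `native` flag and the sorting step are choices made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE A (key INTEGER, val_A TEXT)")
cur.execute("CREATE TABLE B (key INTEGER, val_B TEXT)")
cur.executemany("INSERT INTO A VALUES (?, ?)",
                [(1, "A1"), (2, "A2"), (3, "A3"), (4, "A4")])
cur.executemany("INSERT INTO B VALUES (?, ?)",
                [(3, "B3"), (4, "B4a"), (4, "B4b"), (5, "B5")])

# Version of the underlying SQLite library (not of the Python module)
print("SQLite version:", sqlite3.sqlite_version)
native = sqlite3.sqlite_version_info >= (3, 39, 0)

if native:
    # Native RIGHT JOIN, available from SQLite 3.39.0
    sql_right = """
    SELECT B.key, A.val_A, B.val_B
    FROM A
    RIGHT JOIN B ON A.key = B.key
    """
else:
    # Emulation for older versions: swap the tables and use LEFT JOIN
    sql_right = """
    SELECT B.key, A.val_A, B.val_B
    FROM B
    LEFT JOIN A ON A.key = B.key
    """

# Sort in Python so the comparison does not depend on the row order SQLite returns
rows = sorted(cur.execute(sql_right).fetchall(), key=lambda r: (r[0], r[2]))
for row in rows:
    print(row)

conn.close()
```

Both branches select `B.key` rather than `A.key`, so that key 5, which exists only in `B`, is kept in the result.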


In [17]:
import pandas as pd

# --------------------------------------------------------------------
# Sample data
# --------------------------------------------------------------------
A = pd.DataFrame({
    "key": [1, 2, 3, 4],
    "val_A": ["A1", "A2", "A3", "A4"]
})

B = pd.DataFrame({
    "key": [3, 4, 4, 5],
    "val_B": ["B3", "B4a", "B4b", "B5"]
})

print("Table A:")
print(A, "\n")
print("Table B:")
print(B, "\n")

def show(title, result_df):
    print("=" * 60)
    print(title)
    print(result_df, "\n")

# LEFT JOIN
left_join = A.merge(B, on="key", how="left")
show("LEFT JOIN", left_join)

# RIGHT JOIN
right_join = A.merge(B, on="key", how="right")
show("RIGHT JOIN", right_join)

# INNER JOIN
inner_join = A.merge(B, on="key", how="inner")
show("INNER JOIN", inner_join)

# LEFT ANTI JOIN
left_anti = (
    A.merge(B, on="key", how="left", indicator=True)
      .query("_merge == 'left_only'")
      .drop(columns="_merge")
)
show("LEFT ANTI JOIN", left_anti)

# RIGHT ANTI JOIN
right_anti = (
    A.merge(B, on="key", how="right", indicator=True)
      .query("_merge == 'right_only'")
      .drop(columns="_merge")
)
show("RIGHT ANTI JOIN", right_anti)

# FULL OUTER JOIN
full_outer = A.merge(B, on="key", how="outer")
show("FULL OUTER JOIN", full_outer)

# FULL OUTER ANTI JOIN / SYMMETRIC DIFFERENCE
full_outer_anti = (
    A.merge(B, on="key", how="outer", indicator=True)
      .query("_merge != 'both'")
      .drop(columns="_merge")
)
show("FULL OUTER ANTI JOIN", full_outer_anti)

print("Done.")


Table A:
   key val_A
0    1    A1
1    2    A2
2    3    A3
3    4    A4 

Table B:
   key val_B
0    3    B3
1    4   B4a
2    4   B4b
3    5    B5 

LEFT JOIN
   key val_A val_B
0    1    A1   NaN
1    2    A2   NaN
2    3    A3    B3
3    4    A4   B4a
4    4    A4   B4b 

RIGHT JOIN
   key val_A val_B
0    3    A3    B3
1    4    A4   B4a
2    4    A4   B4b
3    5   NaN    B5 

INNER JOIN
   key val_A val_B
0    3    A3    B3
1    4    A4   B4a
2    4    A4   B4b 

LEFT ANTI JOIN
   key val_A val_B
0    1    A1   NaN
1    2    A2   NaN 

RIGHT ANTI JOIN
   key val_A val_B
3    5   NaN    B5 

FULL OUTER JOIN
   key val_A val_B
0    1    A1   NaN
1    2    A2   NaN
2    3    A3    B3
3    4    A4   B4a
4    4    A4   B4b
5    5   NaN    B5 

FULL OUTER ANTI JOIN
   key val_A val_B
0    1    A1   NaN
1    2    A2   NaN
5    5   NaN    B5 

Done.


In [16]:
import sqlite3
import pandas as pd

# create in-memory DB
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# create tables
cur.execute("CREATE TABLE A (key INTEGER, val_A TEXT);")
cur.execute("CREATE TABLE B (key INTEGER, val_B TEXT);")

# insert data
cur.executemany("INSERT INTO A VALUES (?, ?);", [
    (1, "A1"), (2, "A2"), (3, "A3"), (4, "A4")
])

cur.executemany("INSERT INTO B VALUES (?, ?);", [
    (3, "B3"), (4, "B4a"), (4, "B4b"), (5, "B5")
])

def query(sql):
    return pd.read_sql_query(sql, conn)

# LEFT JOIN
left_join_sql = query("""
SELECT A.*, B.val_B
FROM A
LEFT JOIN B ON A.key = B.key;
""")

# RIGHT JOIN (SQLite emulation using LEFT JOIN swap)
right_join_sql = query("""
SELECT B.*, A.val_A
FROM B
LEFT JOIN A ON B.key = A.key;
""")

# INNER JOIN
inner_join_sql = query("""
SELECT A.*, B.val_B
FROM A
INNER JOIN B ON A.key = B.key;
""")

# LEFT ANTI JOIN
left_anti_sql = query("""
SELECT A.*
FROM A
LEFT JOIN B ON A.key = B.key
WHERE B.key IS NULL;
""")

# RIGHT ANTI JOIN
right_anti_sql = query("""
SELECT B.*
FROM B
LEFT JOIN A ON B.key = A.key
WHERE A.key IS NULL;
""")

# FULL OUTER JOIN (not supported before SQLite 3.39.0, so we use the UNION ALL trick)
full_outer_sql = query("""
SELECT * FROM (
    SELECT A.key, A.val_A, B.val_B
    FROM A LEFT JOIN B ON A.key = B.key
    UNION ALL
    SELECT B.key, A.val_A, B.val_B
    FROM B LEFT JOIN A ON B.key = A.key
    WHERE A.key IS NULL
);
""")

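# FULL OUTER ANTI JOIN
# Note: both branches return only two columns, so val_B values from B appear under the val_A header.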
full_outer_anti_sql = query("""
SELECT * FROM (
    SELECT A.key, A.val_A
    FROM A LEFT JOIN B ON A.key = B.key
    WHERE B.key IS NULL
    UNION ALL
    SELECT B.key, B.val_B
    FROM B LEFT JOIN A ON B.key = A.key
    WHERE A.key IS NULL
);
""")

print("LEFT JOIN:\n", left_join_sql, "\n")
print("RIGHT JOIN:\n", right_join_sql, "\n")
print("INNER JOIN:\n", inner_join_sql, "\n")
print("LEFT ANTI JOIN:\n", left_anti_sql, "\n")
print("RIGHT ANTI JOIN:\n", right_anti_sql, "\n")
print("FULL OUTER JOIN:\n", full_outer_sql, "\n")
print("FULL OUTER ANTI JOIN:\n", full_outer_anti_sql, "\n")

conn.close()


LEFT JOIN:
    key val_A val_B
0    1    A1  None
1    2    A2  None
2    3    A3    B3
3    4    A4   B4a
4    4    A4   B4b 

RIGHT JOIN:
    key val_B val_A
0    3    B3    A3
1    4   B4a    A4
2    4   B4b    A4
3    5    B5  None 

INNER JOIN:
    key val_A val_B
0    3    A3    B3
1    4    A4   B4a
2    4    A4   B4b 

LEFT ANTI JOIN:
    key val_A
0    1    A1
1    2    A2 

RIGHT ANTI JOIN:
    key val_B
0    5    B5 

FULL OUTER JOIN:
    key val_A val_B
0    1    A1  None
1    2    A2  None
2    3    A3    B3
3    4    A4   B4a
4    4    A4   B4b
5    5  None    B5 

FULL OUTER ANTI JOIN:
    key val_A
0    1    A1
1    2    A2
2    5    B5 
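
In the outputs above, missing values appear as `NaN` in the pandas results and as `None` in the SQL results, although they represent the same thing. The sketch below shows one possible way to check that a SQL join and a pandas merge agree despite this difference. The helper `as_records` is hypothetical and written only for this example; it maps every missing value to `None` and sorts the rows before comparing.

```python
import sqlite3
import pandas as pd

rows_a = [(1, "A1"), (2, "A2"), (3, "A3"), (4, "A4")]
rows_b = [(3, "B3"), (4, "B4a"), (4, "B4b"), (5, "B5")]

# Same data as pandas DataFrames...
A = pd.DataFrame(rows_a, columns=["key", "val_A"])
B = pd.DataFrame(rows_b, columns=["key", "val_B"])

# ...and as SQLite tables
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE A (key INTEGER, val_A TEXT)")
cur.execute("CREATE TABLE B (key INTEGER, val_B TEXT)")
cur.executemany("INSERT INTO A VALUES (?, ?)", rows_a)
cur.executemany("INSERT INTO B VALUES (?, ?)", rows_b)

def as_records(df):
    """Hypothetical helper: sorted list of row tuples with NaN/None unified to None."""
    records = [
        tuple(None if pd.isna(value) else value for value in row)
        for row in df.itertuples(index=False, name=None)
    ]
    return sorted(records, key=repr)

sql_left = pd.read_sql_query(
    "SELECT A.key, A.val_A, B.val_B FROM A LEFT JOIN B ON A.key = B.key", conn
)
pd_left = A.merge(B, on="key", how="left")

same = as_records(sql_left) == as_records(pd_left)
print("LEFT JOIN results match:", same)

conn.close()
```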



### 4. Exercises

Using the `sqlite3` library:

* Create a new database called **`district_court.db`**, and inside it create two tables:  
  - the first one should be called **`judges`** and contain the columns: `id`, `first_name`, `last_name`, `division`, `years_experience`;  
  - the second one should be called **`open_cases`** and contain the columns: `case_number`, `division`, `judge` (this last one is a **FOREIGN KEY** referring to `id` in the first table);

* Fill both tables, **`judges`** and **`open_cases`**, with the following records, respectively:  
  - **judges:**  
    (Kazimierz, Sprawiedliwy, I Civil, 15),  
    (Antoni, Rozjemca, I Civil, 12),  
    (Remigiusz, Surowy, I Criminal, 20),  
    (Janina, Sroga, I Criminal, 20);  
  - **open_cases:**  
    (I C 187/24, I Civil, 1),  
    (I C 116/24, I Civil, 2),  
    (I K 17/24, I Criminal, 3),  
    (I K 29/24, I Criminal, 4);  
  Use **parameterized queries** when inserting the records.

* Join the two tables using **INNER JOIN**; the `judge` column from the second table refers to the judge’s `id` from the first table. Display selected columns from the resulting joined table.
