# Week 9 Class 3


Alteryx

<img src="alteryx.png" alt="Alteryx Logo" width="600">


# SQL Review

## Creating a Database
To create a database:
```sql
CREATE DATABASE database_name;
```

## Altering a Database
To rename a database (this is SQL Server syntax; other database systems use different statements):
```sql
ALTER DATABASE old_database_name MODIFY NAME = new_database_name;
```

## Dropping a Database
To delete a database:
```sql
DROP DATABASE database_name;
```

---

## Creating a Table
To create a table:
```sql
CREATE TABLE table_name (
    column1 datatype constraints,
    column2 datatype constraints,
    ...
);
```

## Altering a Table
To add, modify, or drop columns in a table:
```sql
-- Add a new column
ALTER TABLE table_name ADD column_name datatype;

-- Modify an existing column (MySQL/Oracle syntax; SQL Server and PostgreSQL use ALTER COLUMN instead)
ALTER TABLE table_name MODIFY column_name new_datatype;

-- Drop a column
ALTER TABLE table_name DROP COLUMN column_name;
```

## Dropping a Table
To delete a table:
```sql
DROP TABLE table_name;
```
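
The notebook cells later in these notes run on SQLite, so here is a minimal sketch of the table statements above using Python's built-in `sqlite3` module. The `Pets` table and its columns are made up for illustration. SQLite's `ALTER TABLE` has no `MODIFY` form, so only `ADD COLUMN` is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE: "Pets" is a hypothetical table used only for this sketch
cur.execute("CREATE TABLE Pets (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# ALTER TABLE ... ADD COLUMN appends a new column to the existing table
cur.execute("ALTER TABLE Pets ADD COLUMN species TEXT")

# PRAGMA table_info returns one row per column; index 1 holds the column name
columns = [row[1] for row in cur.execute("PRAGMA table_info(Pets)")]
print(columns)

# DROP TABLE removes the table and all of its rows
cur.execute("DROP TABLE Pets")
remaining = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'Pets'"
).fetchall()
print(remaining)

conn.close()
```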

---

## Inserting Rows
To insert data into a table:
```sql
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
```

## Updating Rows
To update existing data:
```sql
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
```

## Deleting Rows
To delete specific rows:
```sql
DELETE FROM table_name
WHERE condition;
```
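
As a rough illustration of the three row-level statements, the sketch below runs them against an in-memory SQLite database. The `Items` table is hypothetical. It also uses `?` placeholders, which let `sqlite3` bind values instead of pasting them into the SQL string, and reads `cursor.rowcount` to see how many rows each statement touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# "Items" is a made-up table for this sketch
cur.execute("CREATE TABLE Items (id INTEGER PRIMARY KEY, label TEXT, qty INTEGER)")

# INSERT: one row per tuple, values bound through "?" placeholders
cur.executemany(
    "INSERT INTO Items (label, qty) VALUES (?, ?)",
    [("pen", 10), ("cup", 3), ("bag", 0)],
)

# UPDATE: the WHERE clause limits the change to matching rows
cur.execute("UPDATE Items SET qty = ? WHERE label = ?", (12, "pen"))
updated = cur.rowcount

# DELETE: again restricted by WHERE
cur.execute("DELETE FROM Items WHERE qty = ?", (0,))
deleted = cur.rowcount

conn.commit()
remaining = cur.execute("SELECT label, qty FROM Items ORDER BY id").fetchall()
print(updated, deleted, remaining)
conn.close()
```

Leaving out the `WHERE` clause in an `UPDATE` or `DELETE` applies the statement to every row, so it is worth double-checking the condition before running either.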

---

## Selecting Data
### Basic Select
To retrieve all rows and columns:
```sql
SELECT * FROM table_name;
```

### Using WHERE Clause
To filter rows based on a condition:
```sql
SELECT column1, column2
FROM table_name
WHERE condition;
```

### Grouping Data
To group rows with common values:
```sql
SELECT column1, COUNT(*) -- or another aggregate function such as SUM or AVG
FROM table_name
GROUP BY column1;
```

### Ordering Data
To sort rows in ascending or descending order:
```sql
SELECT column1, column2
FROM table_name
ORDER BY column1 ASC; -- or DESC
```

### Using HAVING Clause
To filter groups after grouping:
```sql
SELECT column1, COUNT(*) as count
FROM table_name
GROUP BY column1
HAVING COUNT(*) > 1; -- standard SQL requires repeating COUNT(*) here; some dialects (e.g. MySQL, SQLite) also accept the alias
```
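
To make the difference between grouping and filtering groups concrete, here is a small sketch on an in-memory SQLite database. The `Orders` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Orders" is a hypothetical table: one row per order
conn.execute("CREATE TABLE Orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("ann", 20), ("ann", 35), ("ben", 15), ("cat", 50), ("cat", 5), ("cat", 10)],
)

# GROUP BY builds one summary row per customer;
# HAVING then keeps only the groups with more than one order
rows = conn.execute("""
SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total
FROM Orders
GROUP BY customer
HAVING COUNT(*) > 1
ORDER BY customer
""").fetchall()
print(rows)
conn.close()
```

Here `ben` has a single order, so his group is dropped by `HAVING` even though his row passed through `GROUP BY`.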

## SQL Order of Operations (Logical Query Processing Phases)

A SQL query is evaluated in a fixed logical order, regardless of the order in which its clauses are written. The phases are listed below using a `Users` table; the sample query has no `GROUP BY`, `HAVING`, or `LIMIT`, so those phases are simply skipped for it:

```sql
SELECT name, age
FROM Users
WHERE age > 25
ORDER BY age DESC;
```

1. **FROM**  
    - Identify the source table and perform any joins, subqueries, or table expressions.
    - Example: `FROM Users`

2. **WHERE**  
    - Filter rows based on conditions before grouping or aggregation.
    - Example: `WHERE age > 25`

3. **GROUP BY**  
    - Group rows that share a common value into summary rows.
    - Example: `GROUP BY age`

4. **HAVING**  
    - Filter groups created by `GROUP BY` based on aggregate conditions.
    - Example: `HAVING COUNT(*) > 1`

5. **SELECT**  
    - Select the columns or expressions to include in the result set.
    - Example: `SELECT name, age`

6. **ORDER BY**  
    - Sort the result set based on one or more columns or expressions.
    - Example: `ORDER BY age DESC`

7. **LIMIT/OFFSET**  
    - Limit the number of rows returned or skip a specific number of rows.
    - Example: `LIMIT 10`
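
The seven phases can be seen together in one query. The sketch below is only an illustration with made-up rows in a `Users`-style table: two users aged 25 form a pair, but `WHERE` removes them before grouping ever happens, so that group never reaches `HAVING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (name TEXT, age INTEGER)")
# Sample rows invented for this sketch
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Alice", 25), ("Bob", 30), ("Carol", 30), ("Dan", 25),
     ("Eve", 40), ("Frank", 40), ("Grace", 40), ("Heidi", 35)],
)

rows = conn.execute("""
SELECT age, COUNT(*) AS n      -- 5. SELECT
FROM Users                     -- 1. FROM
WHERE age > 25                 -- 2. WHERE drops both 25-year-olds
GROUP BY age                   -- 3. GROUP BY
HAVING COUNT(*) > 1            -- 4. HAVING drops the single 35-year-old
ORDER BY n DESC                -- 6. ORDER BY can use the alias from SELECT
LIMIT 10                       -- 7. LIMIT
""").fetchall()
print(rows)
conn.close()
```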

---

### Example Query Breakdown

```sql
SELECT name, age
FROM Users
WHERE age > 25
ORDER BY age DESC;
```

1. **FROM**: Identify the `Users` table.
2. **WHERE**: Filter rows where `age > 25`.
3. **SELECT**: Retrieve the `name` and `age` columns.
4. **ORDER BY**: Sort the result by `age` in descending order.

### Subquery Example: Find Users with Above-Average Age

```sql
SELECT name, age
FROM Users
WHERE age > (
    SELECT AVG(age)
    FROM Users
)
ORDER BY age DESC;
```

When a query contains a subquery, the logical order of operations is as follows:

1. **Subquery**:  
    The subquery in the `WHERE` clause is evaluated first to calculate the average age from the `Users` table. It does not depend on the outer query's rows, so its result is a single value.  
    Example:  
    ```sql
    SELECT AVG(age)
    FROM Users
    ```

2. **FROM Clause (Outer Query)**:  
    The `FROM` clause in the outer query identifies the `Users` table as the source.

3. **WHERE Clause**:  
    The `WHERE` clause in the outer query is evaluated to filter rows where `age` is greater than the result of the subquery.  
    Example:  
    ```sql
    WHERE age > (subquery result)
    ```

4. **SELECT Clause**:  
    The `SELECT` clause determines which columns (`name` and `age`) to include in the result set.  
    Example:  
    ```sql
    SELECT name, age
    ```

5. **ORDER BY Clause**:  
    Finally, the `ORDER BY` clause is applied to sort the result set by `age` in descending order.  
    Example:  
    ```sql
    ORDER BY age DESC
    ```
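
The same steps can be traced by running the subquery on its own and then the full query. This is a sketch with four invented users, not data from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (name TEXT, age INTEGER)")
# Sample rows invented for this sketch
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Alice", 25), ("Bob", 30), ("Carol", 22), ("Dave", 35)],
)

# Step 1: the subquery alone yields a single value
avg_age = conn.execute("SELECT AVG(age) FROM Users").fetchone()[0]
print(avg_age)

# Steps 2-5: the outer query compares each row against that value
rows = conn.execute("""
SELECT name, age
FROM Users
WHERE age > (SELECT AVG(age) FROM Users)
ORDER BY age DESC
""").fetchall()
print(rows)
conn.close()
```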


In [1]:
# imports we'll need for the rest of the notebook
import sqlite3
import pandas as pd
from IPython.display import display

In [2]:

# Step 1: Connect to SQLite database (or create it if it doesn't exist)
# connection = sqlite3.connect("example.db")
connection = sqlite3.connect(":memory:")  # for in-memory database, use ":memory:"

# Step 2: Create a cursor object to execute SQL commands
# cursors are used to interact with the database
cursor = connection.cursor()

# Step 3: Create a table
cursor.execute('''
CREATE TABLE IF NOT EXISTS Users (
    id INTEGER PRIMARY KEY AUTOINCREMENT, -- note we don't need to say NOT NULL and UNIQUE
    name TEXT NOT NULL,
    age INTEGER NOT NULL,
    email TEXT UNIQUE NOT NULL
)
''')

# Step 4: Insert data into the table
cursor.execute('''
INSERT INTO Users (name, age, email)
VALUES ('Alice', 25, 'alice@example.com')
''')
cursor.execute('''
INSERT INTO Users (name, age, email)
VALUES ('Bob', 30, 'bob@example.com')
''')

# Commit the changes
# (loosely similar to a git commit: it makes the pending changes permanent)
# Uncommitted changes are discarded if the connection is closed
# or if we call connection.rollback()
connection.commit()

# Step 5: Query data from the table
cursor.execute('SELECT * FROM Users')

# this will fetch all rows from the users table
# we could also use fetchone() to get one row at a time
# or fetchmany(size) to get a specific number of rows
# fetchall() returns a list of tuples, where each tuple is a row
rows = cursor.fetchall()
print("Users in the database:")
for row in rows:
    print(row)

# Step 6: Update a record
cursor.execute('''
UPDATE Users
SET age = 26
WHERE name = 'Alice'
''')
connection.commit()

# Step 7: Delete a record
cursor.execute('''
DELETE FROM Users
WHERE name = 'Bob'
''')
connection.commit()

# Step 8: Query data again to see changes
cursor.execute('SELECT * FROM Users')
rows = cursor.fetchall()
print("\nUpdated users in the database:")
for row in rows:
    print(row)

# Step 9: Close the connection
connection.close()

Users in the database:
(1, 'Alice', 25, 'alice@example.com')
(2, 'Bob', 30, 'bob@example.com')

Updated users in the database:
(1, 'Alice', 26, 'alice@example.com')


In [14]:
# Example data with additional rows
left = pd.DataFrame({
    'id': [1, 2, 3, 5],
    'name': ['Alice', 'Bob', 'Charlie', 'Eve']
})
right = pd.DataFrame({
    'id': [2, 3, 4, 6],
    'age': [30, 35, 40, 50]
})
print("Left DataFrame:")
display(left)
print("Right DataFrame:")
display(right)

# Inner join
inner_result = pd.merge(left, right, on='id', how='inner')
print("Inner Join Result:")
display(inner_result)

# Left join
left_result = pd.merge(left, right, on='id', how='left')
print("Left Join Result:")
display(left_result)

# Right join
right_result = pd.merge(left, right, on='id', how='right')
print("Right Join Result:")
display(right_result)

# Outer join
outer_result = pd.merge(left, right, on='id', how='outer')
print("Outer Join Result:")
display(outer_result)

Left DataFrame:


Unnamed: 0,id,name
0,1,Alice
1,2,Bob
2,3,Charlie
3,5,Eve


Right DataFrame:


Unnamed: 0,id,age
0,2,30
1,3,35
2,4,40
3,6,50


Inner Join Result:


Unnamed: 0,id,name,age
0,2,Bob,30
1,3,Charlie,35


Left Join Result:


Unnamed: 0,id,name,age
0,1,Alice,
1,2,Bob,30.0
2,3,Charlie,35.0
3,5,Eve,


Right Join Result:


Unnamed: 0,id,name,age
0,2,Bob,30
1,3,Charlie,35
2,4,,40
3,6,,50


Outer Join Result:


Unnamed: 0,id,name,age
0,1,Alice,
1,2,Bob,30.0
2,3,Charlie,35.0
3,4,,40.0
4,5,Eve,
5,6,,50.0


In [16]:
# Create in-memory SQLite DB
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Create tables
cursor.execute("""
CREATE TABLE IF NOT EXISTS LeftTable (
    id INTEGER,
    name TEXT
)
""")
cursor.execute("""
CREATE TABLE IF NOT EXISTS RightTable (
    id INTEGER,
    age INTEGER
)
""")

# Insert data into LeftTable
left_data = [
    (1, 'Alice'),
    (2, 'Bob'),
    (3, 'Charlie'),
    (5, 'Eve')
]
cursor.executemany("INSERT INTO LeftTable VALUES (?, ?)", left_data)

# Insert data into RightTable
right_data = [
    (2, 30),
    (3, 35),
    (4, 40),
    (6, 50)
]
cursor.executemany("INSERT INTO RightTable VALUES (?, ?)", right_data)

# Perform joins and display results
# (RIGHT JOIN and FULL OUTER JOIN need SQLite 3.39.0 or newer; older versions raise an error)
join_queries = {
    "Inner Join": "SELECT * FROM LeftTable INNER JOIN RightTable ON LeftTable.id = RightTable.id",
    "Left Join": "SELECT * FROM LeftTable LEFT JOIN RightTable ON LeftTable.id = RightTable.id",
    "Right Join": "SELECT * FROM LeftTable RIGHT JOIN RightTable ON LeftTable.id = RightTable.id",
    "Outer Join": """
        SELECT * FROM LeftTable
        FULL OUTER JOIN RightTable ON LeftTable.id = RightTable.id
    """
}

for join_type, query in join_queries.items():
    try:
        print(f"{join_type} Result:")
        df_result = pd.read_sql_query(query, conn)
        display(df_result)
    except Exception as e:
        print(f"Error with {join_type}: {e}")

# Close the connection
conn.close()

Inner Join Result:


Unnamed: 0,id,name,id.1,age
0,2,Bob,2,30
1,3,Charlie,3,35


Left Join Result:


Unnamed: 0,id,name,id.1,age
0,1,Alice,,
1,2,Bob,2.0,30.0
2,3,Charlie,3.0,35.0
3,5,Eve,,


Right Join Result:


Unnamed: 0,id,name,id.1,age
0,2.0,Bob,2,30
1,3.0,Charlie,3,35
2,,,4,40
3,,,6,50


Outer Join Result:


Unnamed: 0,id,name,id.1,age
0,1.0,Alice,,
1,2.0,Bob,2.0,30.0
2,3.0,Charlie,3.0,35.0
3,5.0,Eve,,
4,,,4.0,40.0
5,,,6.0,50.0
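
On a SQLite build that does not support `FULL OUTER JOIN`, one common workaround is to combine a `LEFT JOIN` with the unmatched rows from the other side. The sketch below rebuilds the same two tables and is meant as an illustration of that idea, not as the only way to do it; the `left_id` and `right_id` aliases are introduced here for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LeftTable (id INTEGER, name TEXT);
CREATE TABLE RightTable (id INTEGER, age INTEGER);
INSERT INTO LeftTable VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'), (5, 'Eve');
INSERT INTO RightTable VALUES (2, 30), (3, 35), (4, 40), (6, 50);
""")

rows = conn.execute("""
-- every left row, with its match if one exists
SELECT l.id AS left_id, l.name, r.id AS right_id, r.age
FROM LeftTable l
LEFT JOIN RightTable r ON l.id = r.id
UNION ALL
-- plus the right rows that had no match on the left
SELECT l.id, l.name, r.id, r.age
FROM RightTable r
LEFT JOIN LeftTable l ON l.id = r.id
WHERE l.id IS NULL
""").fetchall()

# Sort by whichever id is present so the output order is predictable
rows = sorted(rows, key=lambda r: r[0] if r[0] is not None else r[2])
for row in rows:
    print(row)
conn.close()
```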


## LeetCode Problems

## 🧮 Problem: Users Table

**Table: `users`**

```
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| name        | varchar |
| age         | int     |
| email       | varchar |
+-------------+---------+
```

`id` is the primary key column for this table.

Each row of this table contains information about a user's ID, name, age, and email address.

---

### ❓ Question

Write a SQL query to **find all users who are older than 25 years**.

Return the result table **in any order**.

---

### 🧪 Example 1:

**Input:**

Table: `users`

```
+----+-------+-----+-------------------+
| id | name  | age | email             |
+----+-------+-----+-------------------+
| 1  | Alice | 26  | alice@example.com |
| 2  | Bob   | 30  | bob@example.com   |
| 3  | Carol | 22  | carol@example.com |
+----+-------+-----+-------------------+
```

**Output:**

```
+----+-------+-----+-------------------+
| id | name  | age | email             |
+----+-------+-----+-------------------+
| 1  | Alice | 26  | alice@example.com |
| 2  | Bob   | 30  | bob@example.com   |
+----+-------+-----+-------------------+
```
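
One possible query for this problem, sketched with the built-in `sqlite3` module and the rows from Example 1. The column types are an assumption based on the schema table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", 26, "alice@example.com"),
        (2, "Bob", 30, "bob@example.com"),
        (3, "Carol", 22, "carol@example.com"),
    ],
)

# Keep only rows whose age is strictly greater than 25;
# ORDER BY is added just to make the printed order predictable
rows = conn.execute("SELECT * FROM users WHERE age > 25 ORDER BY id").fetchall()
for row in rows:
    print(row)
conn.close()
```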

---

## 🧮 Problem: Big Countries

**Table: `World`** with columns `name`, `continent`, `area`, `population`, and `gdp`.

### ❓ Question

Write a SQL query to find countries with a population greater than 25 million and an area greater than 500,000 km², sorted by population in descending order. Return the `name`, `population`, and `area` columns.


In [5]:
# Create in-memory SQLite DB
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Create table
cursor.execute("""
CREATE TABLE IF NOT EXISTS World (
    name TEXT,
    continent TEXT,
    area INTEGER,
    population INTEGER,
    gdp INTEGER
)
""")

# Clear table just in case
cursor.execute("DELETE FROM World")

# Insert sample data
data = [
    ('Afghanistan', 'Asia', 652230, 25500100, 20343000000),
    ('Albania', 'Europe', 28748, 2831741, 12960000000),
    ('Algeria', 'Africa', 2381741, 37100000, 188681000000),
    ('Andorra', 'Europe', 468, 78115, 3712000000),
    ('Angola', 'Africa', 1246700, 20609294, 100990000000)
]
cursor.executemany("INSERT INTO World VALUES (?, ?, ?, ?, ?)", data)
query = """
SELECT * FROM World
"""
df_result = pd.read_sql_query(query, conn)
display(df_result)


Unnamed: 0,name,continent,area,population,gdp
0,Afghanistan,Asia,652230,25500100,20343000000
1,Albania,Europe,28748,2831741,12960000000
2,Algeria,Africa,2381741,37100000,188681000000
3,Andorra,Europe,468,78115,3712000000
4,Angola,Africa,1246700,20609294,100990000000


In [6]:
try:
    # Query: Find big countries
    # Find countries with a population greater than 25 million and area greater than 500,000 km²
    # and sort them by population in descending order.
    query = """
    -- ENTER YOUR SQL QUERY HERE
    """
    # Run query and load results into a DataFrame
    df_result = pd.read_sql_query(query, conn)

    # Show result
    print("Query Result:")
    display(df_result)

    # ✅ Test the result
    
    expected = pd.DataFrame({
        'name': ['Afghanistan', 'Algeria'],
        'population': [25500100, 37100000],
        'area': [652230, 2381741]
    })

    # Sort for comparison
    df_result_sorted = df_result.sort_values(by='name').reset_index(drop=True)
    expected_sorted = expected.sort_values(by='name').reset_index(drop=True)

    assert df_result_sorted.equals(expected_sorted), "Test failed: Output does not match expected result."
    print("\n✅ Test passed: Output matches expected result.")

except Exception as e:
    print("\n❌ An error occurred:")
    print(e)


❌ An error occurred:
'NoneType' object is not iterable
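
The placeholder query above holds only a SQL comment, so there is no statement to run, which is the likely cause of the error shown. One possible answer is sketched below with plain `sqlite3`; it rebuilds the same `World` rows so it can run on its own, and it selects the three columns the test expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE World (name TEXT, continent TEXT, area INTEGER, population INTEGER, gdp INTEGER)"
)
conn.executemany(
    "INSERT INTO World VALUES (?, ?, ?, ?, ?)",
    [
        ("Afghanistan", "Asia", 652230, 25500100, 20343000000),
        ("Albania", "Europe", 28748, 2831741, 12960000000),
        ("Algeria", "Africa", 2381741, 37100000, 188681000000),
        ("Andorra", "Europe", 468, 78115, 3712000000),
        ("Angola", "Africa", 1246700, 20609294, 100990000000),
    ],
)

# Both conditions must hold, so they are combined with AND
rows = conn.execute("""
SELECT name, population, area
FROM World
WHERE population > 25000000 AND area > 500000
ORDER BY population DESC
""").fetchall()
print(rows)
conn.close()
```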


## 🧾 Problem: Product Sales Analysis I

**Table: `Sales`**

```
+-------------+-------+
| Column Name | Type  |
+-------------+-------+
| sale_id     | int   |
| product_id  | int   |
| year        | int   |
| quantity    | int   |
| price       | int   |
+-------------+-------+
```

- `(sale_id, year)` is the primary key.
- `product_id` is a foreign key referencing the `Product` table.
- Each row represents a sale of a product in a specific year.
- `price` is per unit.

**Table: `Product`**

```
+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| product_id   | int     |
| product_name | varchar |
+--------------+---------+
```

- `product_id` is the primary key.
- Each row represents the name of a product.

---

### ❓ Question

Write a SQL query to report the `product_name`, `year`, and `price` for each `sale_id` in the `Sales` table.

Return the result table **in any order**.

---

### 🧪 Example 1:

**Input:**

Table: `Sales`

```
+---------+------------+------+----------+-------+
| sale_id | product_id | year | quantity | price |
+---------+------------+------+----------+-------+
| 1       | 100        | 2008 | 10       | 5000  |
| 2       | 100        | 2009 | 12       | 5000  |
| 7       | 200        | 2011 | 15       | 9000  |
+---------+------------+------+----------+-------+
```

Table: `Product`

```
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 100        | Nokia        |
| 200        | Apple        |
| 300        | Samsung      |
+------------+--------------+
```

**Output:**

```
+--------------+-------+-------+
| product_name | year  | price |
+--------------+-------+-------+
| Nokia        | 2008  | 5000  |
| Nokia        | 2009  | 5000  |
| Apple        | 2011  | 9000  |
+--------------+-------+-------+
```

**Explanation:**

- From `sale_id = 1`, we know Nokia was sold for 5000 in 2008.
- From `sale_id = 2`, Nokia was again sold for 5000 in 2009.
- From `sale_id = 7`, Apple was sold for 9000 in 2011.


In [7]:
# Create in-memory SQLite DB
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Create tables
cursor.execute("""
CREATE TABLE IF NOT EXISTS Sales (
    sale_id INTEGER,
    product_id INTEGER,
    year INTEGER,
    quantity INTEGER,
    price INTEGER,
    PRIMARY KEY (sale_id, year)
)
""")

cursor.execute("""
CREATE TABLE IF NOT EXISTS Product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT
)
""")

# Clear tables
cursor.execute("DELETE FROM Sales")
cursor.execute("DELETE FROM Product")

# Insert data into Sales
sales_data = [
    (1, 100, 2008, 10, 5000),
    (2, 100, 2009, 12, 5000),
    (7, 200, 2011, 15, 9000)
]
cursor.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)", sales_data)

# Insert data into Product
product_data = [
    (100, 'Nokia'),
    (200, 'Apple'),
    (300, 'Samsung')
]
cursor.executemany("INSERT INTO Product VALUES (?, ?)", product_data)

query = """
SELECT * FROM Sales
"""
df_result = pd.read_sql_query(query, conn)
display(df_result) # display what the original data looks like


Unnamed: 0,sale_id,product_id,year,quantity,price
0,1,100,2008,10,5000
1,2,100,2009,12,5000
2,7,200,2011,15,9000


In [8]:
try:
    # Query: Join Sales and Product to get product_name, year, price
    query = """
    -- ENTER YOUR SQL QUERY HERE
    """
    # Run query and load results into a DataFrame
    df_result = pd.read_sql_query(query, conn)
    print("Query Result:")
    display(df_result)

    # ✅ Test the result
    expected = pd.DataFrame({
        'product_name': ['Nokia', 'Nokia', 'Apple'],
        'year': [2008, 2009, 2011],
        'price': [5000, 5000, 9000]
    })

    df_result_sorted = df_result.sort_values(by=['product_name', 'year']).reset_index(drop=True)
    expected_sorted = expected.sort_values(by=['product_name', 'year']).reset_index(drop=True)

    assert df_result_sorted.equals(expected_sorted), "Test failed: Output does not match expected result."
    print("\n✅ Test passed: Output matches expected result.")
except Exception as e:
    print("\n❌ An error occurred:")
    print(e)


❌ An error occurred:
'NoneType' object is not iterable
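
One possible answer for this problem is an inner join on `product_id`. The sketch below rebuilds the example tables with plain `sqlite3` so it runs on its own; the `s` and `p` table aliases are just shorthand chosen here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (
    sale_id INTEGER, product_id INTEGER, year INTEGER,
    quantity INTEGER, price INTEGER,
    PRIMARY KEY (sale_id, year)
);
CREATE TABLE Product (product_id INTEGER PRIMARY KEY, product_name TEXT);
INSERT INTO Sales VALUES (1, 100, 2008, 10, 5000), (2, 100, 2009, 12, 5000), (7, 200, 2011, 15, 9000);
INSERT INTO Product VALUES (100, 'Nokia'), (200, 'Apple'), (300, 'Samsung');
""")

# Each sale picks up its product name through the shared product_id.
# Samsung has no sales, so an inner join leaves it out.
rows = conn.execute("""
SELECT p.product_name, s.year, s.price
FROM Sales s
JOIN Product p ON s.product_id = p.product_id
ORDER BY s.sale_id
""").fetchall()
print(rows)
conn.close()
```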


## 🧾 Problem: Classes More Than 5 Students

**Table: `Courses`**

```
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| student     | varchar |
| class       | varchar |
+-------------+---------+
```

- `(student, class)` is the primary key.
- Each row indicates a student and the class they are enrolled in.

---

### ❓ Question

Write a SQL query to **find all classes that have at least five students**.

Return the result table in **any order**.

---

### 🧪 Example 1:

**Input:**

Table: `Courses`

```
+---------+----------+
| student | class    |
+---------+----------+
| A       | Math     |
| B       | English  |
| C       | Math     |
| D       | Biology  |
| E       | Math     |
| F       | Computer |
| G       | Math     |
| H       | Math     |
| I       | Math     |
+---------+----------+
```

**Output:**

```
+--------+
| class  |
+--------+
| Math   |
+--------+
```

**Explanation:**

- Math has 6 students → included.
- English, Biology, Computer each have < 5 students → excluded.


In [9]:
# Create in-memory SQLite DB
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Create Courses table
cursor.execute("""
CREATE TABLE IF NOT EXISTS Courses (
    student TEXT,
    class TEXT,
    PRIMARY KEY (student, class)
)
""")

# Clear table just in case
cursor.execute("DELETE FROM Courses")

# Insert test data
courses_data = [
    ('A', 'Math'),
    ('B', 'English'),
    ('C', 'Math'),
    ('D', 'Biology'),
    ('E', 'Math'),
    ('F', 'Computer'),
    ('G', 'Math'),
    ('H', 'Math'),
    ('I', 'Math')
]
cursor.executemany("INSERT INTO Courses VALUES (?, ?)", courses_data)
query = """
SELECT * FROM Courses
"""
df_result = pd.read_sql_query(query, conn)
df_result # display what the original data looks like

Unnamed: 0,student,class
0,A,Math
1,B,English
2,C,Math
3,D,Biology
4,E,Math
5,F,Computer
6,G,Math
7,H,Math
8,I,Math


In [10]:
try:
    # Query: Classes with at least 5 students
    query = """
    -- ENTER YOUR SQL QUERY HERE
    """

    # Execute and display
    df_result = pd.read_sql_query(query, conn)
    print("Query Result:")
    display(df_result)

    # ✅ Test the result
    expected = pd.DataFrame({'class': ['Math']})

    df_result_sorted = df_result.sort_values(by='class').reset_index(drop=True)
    expected_sorted = expected.sort_values(by='class').reset_index(drop=True)

    assert df_result_sorted.equals(expected_sorted), "❌ Test failed: Output does not match expected result."
    print("\n✅ Test passed: Output matches expected result.")
except Exception as e:
    print("\n❌ An error occurred:")
    print(e)



❌ An error occurred:
'NoneType' object is not iterable
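
One possible answer groups the rows by class and keeps the groups with five or more students. The sketch below is self-contained and uses the example data from the problem statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (student TEXT, class TEXT, PRIMARY KEY (student, class))")
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("A", "Math"), ("B", "English"), ("C", "Math"), ("D", "Biology"), ("E", "Math"),
     ("F", "Computer"), ("G", "Math"), ("H", "Math"), ("I", "Math")],
)

# "At least five" means >= 5, not > 5
rows = conn.execute("""
SELECT class
FROM Courses
GROUP BY class
HAVING COUNT(student) >= 5
""").fetchall()
print(rows)
conn.close()
```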


## 🧾 Problem: Customer Who Visited but Did Not Make Any Transactions

**Table: `Visits`**

```
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| visit_id    | int     |
| customer_id | int     |
+-------------+---------+
```

- `visit_id` is the primary key.
- Each row represents a customer visit to the mall.

**Table: `Transactions`**

```
+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| transaction_id | int     |
| visit_id       | int     |
| amount         | int     |
+----------------+---------+

- `transaction_id` is the primary key.
- Each row represents a transaction tied to a `visit_id`.

---

### ❓ Question

Write a SQL query to find the **customer IDs who visited the mall but did not make any transactions**, and the **number of such visits** for each customer.

Return the result table in any order.

---

### 🧪 Example 1:

**Input:**

Table: `Visits`

```
+----------+-------------+
| visit_id | customer_id |
+----------+-------------+
| 1        | 23          |
| 2        | 9           |
| 4        | 30          |
| 5        | 54          |
| 6        | 96          |
| 7        | 54          |
| 8        | 54          |
+----------+-------------+
```

Table: `Transactions`

```
+----------------+----------+--------+
| transaction_id | visit_id | amount |
+----------------+----------+--------+
| 2              | 5        | 310    |
| 3              | 5        | 300    |
| 9              | 5        | 200    |
| 12             | 1        | 910    |
| 13             | 2        | 970    |
+----------------+----------+--------+
```

**Output:**

```
+-------------+----------------+
| customer_id | count_no_trans |
+-------------+----------------+
| 54          | 2              |
| 30          | 1              |
| 96          | 1              |
+-------------+----------------+
```

**Explanation:**

- Customers 30 and 96 had 1 visit each with no transactions.
- Customer 54 visited 3 times, but only 1 visit had transactions, so the other 2 visits are counted.


In [11]:
import sqlite3
import pandas as pd

# Set up in-memory SQLite DB
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Create tables
cursor.execute("CREATE TABLE IF NOT EXISTS Visits (visit_id INT, customer_id INT)")
cursor.execute("CREATE TABLE IF NOT EXISTS Transactions (transaction_id INT, visit_id INT, amount INT)")

# Clear any data
cursor.execute("DELETE FROM Visits")
cursor.execute("DELETE FROM Transactions")

# Insert data into Visits
visits_data = [
    (1, 23), (2, 9), (4, 30), (5, 54), (6, 96), (7, 54), (8, 54)
]
cursor.executemany("INSERT INTO Visits VALUES (?, ?)", visits_data)

# Insert data into Transactions
transactions_data = [
    (2, 5, 310), (3, 5, 300), (9, 5, 200), (12, 1, 910), (13, 2, 970)
]
cursor.executemany("INSERT INTO Transactions VALUES (?, ?, ?)", transactions_data)

cursor.execute("SELECT * FROM Visits")
visits_df = pd.DataFrame(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])
cursor.execute("SELECT * FROM Transactions")
transactions_df = pd.DataFrame(cursor.fetchall(), columns=[desc[0] for desc in cursor.description])

display(visits_df, transactions_df)

Unnamed: 0,visit_id,customer_id
0,1,23
1,2,9
2,4,30
3,5,54
4,6,96
5,7,54
6,8,54


Unnamed: 0,transaction_id,visit_id,amount
0,2,5,310
1,3,5,300
2,9,5,200
3,12,1,910
4,13,2,970


In [12]:

try:
    # Query: Customers who visited but made no transactions
    query = """
    -- ENTER YOUR SQL QUERY HERE
    """

    df_result = pd.read_sql_query(query, conn)
    print("Query Result:")
    display(df_result)

    # ✅ Test
    expected = pd.DataFrame({
        'customer_id': [30, 54, 96],
        'count_no_trans': [1, 2, 1]
    })

    df_result_sorted = df_result.sort_values(by='customer_id').reset_index(drop=True)
    expected_sorted = expected.sort_values(by='customer_id').reset_index(drop=True)

    assert df_result_sorted.equals(expected_sorted), "❌ Test failed!"
    print("\n✅ Test passed: Output matches expected result.")

except Exception as e:
    print("\n❌ An error occurred:")
    print(e)



❌ An error occurred:
'NoneType' object is not iterable
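
One possible answer uses a `LEFT JOIN` and keeps only the visits that found no matching transaction (sometimes called an anti-join). The sketch below rebuilds the example tables with plain `sqlite3`; the `v` and `t` aliases are shorthand chosen here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Visits (visit_id INT, customer_id INT)")
conn.execute("CREATE TABLE Transactions (transaction_id INT, visit_id INT, amount INT)")
conn.executemany(
    "INSERT INTO Visits VALUES (?, ?)",
    [(1, 23), (2, 9), (4, 30), (5, 54), (6, 96), (7, 54), (8, 54)],
)
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?, ?)",
    [(2, 5, 310), (3, 5, 300), (9, 5, 200), (12, 1, 910), (13, 2, 970)],
)

# After the LEFT JOIN, visits without a transaction have NULL in t.transaction_id.
# Filtering on that NULL leaves only the transaction-free visits, which are then counted per customer.
rows = conn.execute("""
SELECT v.customer_id, COUNT(*) AS count_no_trans
FROM Visits v
LEFT JOIN Transactions t ON v.visit_id = t.visit_id
WHERE t.transaction_id IS NULL
GROUP BY v.customer_id
ORDER BY v.customer_id
""").fetchall()
print(rows)
conn.close()
```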


In [13]:
# Clean up once we're done
conn.close()
