# LeetCode 183: Customers Who Never Order

### Problem Statement

**Table: Customers**

| Column Name | Type    |
|-------------|---------|
| id          | int     |
| name        | varchar |

`id` is the primary key (column with unique values) for this table.
Each row of this table indicates the ID and name of a customer.

**Table: Orders**

| Column Name | Type |
|-------------|------|
| id          | int  |
| customerId  | int  |

`id` is the primary key (column with unique values) for this table.
`customerId` is a foreign key (reference columns) of the ID from the Customers table.
Each row of this table indicates the ID of an order and the ID of the customer who ordered it.

**Task:**
Write a solution to find all customers who never order anything.

**Example 1:**

**Input:**
Customers table:
| id | name  |
|----|-------|
| 1  | Joe   |
| 2  | Henry |
| 3  | Sam   |
| 4  | Max   |

Orders table:
| id | customerId |
|----|------------|
| 1  | 3          |
| 2  | 1          |

**Output:**
| Customers |
|-----------|
| Henry     |
| Max       |

In [1]:
import sqlite3
import pandas as pd

# 1. Setup SQLite in-memory database
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# 2. Helper function to display query results as a Pandas DataFrame
def show(query):
    return pd.read_sql_query(query, conn)

print("Environment setup complete. Database ready.")

Environment setup complete. Database ready.


### Schema Description

We have a classic **One-to-Many** relational schema.

1.  **Customers (Parent Table):** Holds the master list of all users.
2.  **Orders (Child Table):** Holds transactional events.

**Entity-Relationship (ER) Diagram:**

    +-------------------+           +-------------------+
    |    Customers      |           |      Orders       |
    +-------------------+           +-------------------+
    | PK  id            | <---------| PK  id            |
    |     name          |   (1:N)   | FK  customerId    |
    +-------------------+           +-------------------+
            |                                 ^
            |                                 |
            | Has 0 or More Orders            | Belongs to 1 Customer
            +---------------------------------+

*   **Cardinality:** One customer can place *zero* to *many* orders.
*   **Constraint:** The problem asks specifically for the "Zero" case (Cardinality = 0).

In [2]:
# Create the tables
create_customers_sql = """
CREATE TABLE Customers (
    id INTEGER PRIMARY KEY,
    name VARCHAR(255)
);
"""

create_orders_sql = """
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customerId INTEGER,
    FOREIGN KEY (customerId) REFERENCES Customers(id)
);
"""

cursor.execute(create_customers_sql)
cursor.execute(create_orders_sql)
conn.commit()
print("Tables 'Customers' and 'Orders' created.")

Tables 'Customers' and 'Orders' created.


### Sample Data

We will populate the database with the example data provided in the LeetCode problem.

**Data Logic:**
*   **Joe (id:1):** Exists in `Orders` (Order id:2).
*   **Henry (id:2):** DOES NOT exist in `Orders`.
*   **Sam (id:3):** Exists in `Orders` (Order id:1).
*   **Max (id:4):** DOES NOT exist in `Orders`.

**Target:** We need to return Henry and Max.

In [3]:
# Insert sample data
insert_customers = """
INSERT INTO Customers (id, name) VALUES
(1, 'Joe'),
(2, 'Henry'),
(3, 'Sam'),
(4, 'Max');
"""

insert_orders = """
INSERT INTO Orders (id, customerId) VALUES
(1, 3),
(2, 1);
"""

cursor.execute(insert_customers)
cursor.execute(insert_orders)
conn.commit()
print("Sample data inserted.")

Sample data inserted.


### 🎓 Lecture: The Exclusion Join (Anti-Join)

"Finding records in A that are NOT in B" is a daily task for data professionals. The operation is called an **Anti-Join**.

There are three main ways to solve this in SQL:
1.  **LEFT JOIN + IS NULL** (The most visual method).
2.  **NOT EXISTS** (Often the most performant method).
3.  **NOT IN** (Dangerous if NULLs are involved, generally discouraged).

We will focus on the **LEFT JOIN** approach as it best explains the relational algebra.
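For comparison, here is a minimal sketch of the `NOT EXISTS` form from the list above. It is self-contained: it rebuilds the sample tables in its own in-memory database, and the names `exists_conn`, `not_exists_query`, and `not_exists_names` are illustrative rather than part of the notebook. The `ORDER BY` is only there to make the printed order deterministic.

```python
import sqlite3

# Separate in-memory database so this sketch does not touch the notebook's `conn`.
exists_conn = sqlite3.connect(":memory:")
exists_conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customerId INTEGER,
    FOREIGN KEY (customerId) REFERENCES Customers(id)
);
INSERT INTO Customers (id, name) VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
INSERT INTO Orders (id, customerId) VALUES (1, 3), (2, 1);
""")

# Keep a customer only if the correlated subquery finds no order for them.
not_exists_query = """
SELECT C.name AS Customers
FROM Customers AS C
WHERE NOT EXISTS (
    SELECT 1 FROM Orders AS O WHERE O.customerId = C.id
)
ORDER BY C.id;
"""

not_exists_names = [row[0] for row in exists_conn.execute(not_exists_query)]
print(not_exists_names)  # ['Henry', 'Max']
```

On the sample data this returns the same two customers as the `LEFT JOIN` approach developed below.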

#### 1. Visualizing the LEFT JOIN
A `LEFT JOIN` returns **ALL** rows from the Left Table (`Customers`), regardless of whether a match exists in the Right Table (`Orders`).

If a match is found, the columns from `Orders` are populated.
If NO match is found, the columns from `Orders` are filled with `NULL`.

**ASCII Join Visualization:**

    Query: FROM Customers C LEFT JOIN Orders O ON C.id = O.customerId

    [ Left Table: Customers ]         [ Right Table: Orders ]
    +----+-------+                    +------+------------+
    | id | name  |                    | id   | customerId |
    +----+-------+                    +------+------------+
    | 1  | Joe   | -- MATCH ----->    | 2    | 1          |  (Ordered)
    | 2  | Henry | -- NO MATCH -->    | NULL | NULL       |  <-- TARGET!
    | 3  | Sam   | -- MATCH ----->    | 1    | 3          |  (Ordered)
    | 4  | Max   | -- NO MATCH -->    | NULL | NULL       |  <-- TARGET!
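
The picture above can be reproduced by running the `LEFT JOIN` without any filter. The sketch below is self-contained and rebuilds the sample data in its own in-memory database. `join_conn`, `join_rows`, and the `order_id` alias are illustrative names, not part of the notebook. Note that Python's `sqlite3` module reports SQL `NULL` as `None`.

```python
import sqlite3

# Separate in-memory database so this sketch does not touch the notebook's `conn`.
join_conn = sqlite3.connect(":memory:")
join_conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customerId INTEGER,
    FOREIGN KEY (customerId) REFERENCES Customers(id)
);
INSERT INTO Customers (id, name) VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
INSERT INTO Orders (id, customerId) VALUES (1, 3), (2, 1);
""")

# Unfiltered LEFT JOIN: every customer appears, matched or not.
join_rows = join_conn.execute("""
SELECT C.id, C.name, O.id AS order_id, O.customerId
FROM Customers AS C
LEFT JOIN Orders AS O ON C.id = O.customerId
ORDER BY C.id;
""").fetchall()

for row in join_rows:
    print(row)
# (1, 'Joe', 2, 1)
# (2, 'Henry', None, None)
# (3, 'Sam', 1, 3)
# (4, 'Max', None, None)
```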

#### 2. The Filter Logic
Once we have the result of the `LEFT JOIN`, we can simply ask the database:
*"Show me the rows where the Order information is missing."*

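*   Condition: `WHERE Orders.customerId IS NULL`
    *   (Or `Orders.id IS NULL`. Checking the Primary Key of the right table is the safer habit: a stored row never has a NULL primary key in this schema, so a NULL there can only mean "no match".)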

#### 3. Why not `NOT IN`?
A common junior mistake is writing:
`WHERE id NOT IN (SELECT customerId FROM Orders)`

While this works for this specific LeetCode problem, in production a single `NULL` value in `Orders.customerId` would make `NOT IN` return an empty result, no matter how many customers actually have no orders.
*   Logic: `x NOT IN (1, 2, NULL)` is `FALSE` when `x` is 1 or 2 and `UNKNOWN` otherwise, so it is never `TRUE`.
*   **Best Practice:** Stick to `LEFT JOIN` or `NOT EXISTS` for safety.
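
The pitfall is easy to reproduce. The sketch below is self-contained and uses its own in-memory database. It assumes a hypothetical third order (id 3) whose `customerId` is `NULL`, which is not part of the LeetCode data, and the names `null_conn`, `not_in_names`, and `left_join_names` are illustrative.

```python
import sqlite3

# Separate in-memory database so this sketch does not touch the notebook's `conn`.
null_conn = sqlite3.connect(":memory:")
null_conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customerId INTEGER,
    FOREIGN KEY (customerId) REFERENCES Customers(id)
);
INSERT INTO Customers (id, name) VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
-- Order 3 is a hypothetical row with a missing customer reference.
INSERT INTO Orders (id, customerId) VALUES (1, 3), (2, 1), (3, NULL);
""")

# NOT IN: the NULL in the subquery makes the predicate UNKNOWN for Henry and Max.
not_in_names = [row[0] for row in null_conn.execute("""
SELECT name FROM Customers
WHERE id NOT IN (SELECT customerId FROM Orders)
ORDER BY id;
""")]

# LEFT JOIN anti-join: the NULL order matches nobody, so the answer is unaffected.
left_join_names = [row[0] for row in null_conn.execute("""
SELECT C.name
FROM Customers AS C
LEFT JOIN Orders AS O ON C.id = O.customerId
WHERE O.id IS NULL
ORDER BY C.id;
""")]

print(not_in_names)     # []
print(left_join_names)  # ['Henry', 'Max']
```

If `NOT IN` must be used, filtering the subquery with `WHERE customerId IS NOT NULL` avoids the problem.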

#### 4. Performance & Indexing
In a database with 10 Million customers and 100 Million orders:
*   **Full Table Scan:** Without an index, a naive nested-loop plan compares every customer to every order, which is $O(M \times N)$ for $M$ customers and $N$ orders.
*   **Index Scan:** With an index on `Orders(customerId)`, the database can iterate through `Customers` and do a quick B-Tree lookup in `Orders` for each one, roughly $O(M \log N)$ overall. If the lookup finds nothing, that customer is kept.
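
As a rough sketch of the indexing idea, the block below creates an index on `Orders(customerId)` and asks SQLite for its query plan. The index name `idx_orders_customerId` and the variable names are assumptions made for illustration. The plan text depends on the SQLite version, so it is printed rather than quoted here, and other engines expose plans through their own commands.

```python
import sqlite3

# Separate in-memory database so this sketch does not touch the notebook's `conn`.
idx_conn = sqlite3.connect(":memory:")
idx_conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customerId INTEGER,
    FOREIGN KEY (customerId) REFERENCES Customers(id)
);
INSERT INTO Customers (id, name) VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
INSERT INTO Orders (id, customerId) VALUES (1, 3), (2, 1);
-- Hypothetical index name; any name works.
CREATE INDEX idx_orders_customerId ON Orders(customerId);
""")

anti_join_query = """
SELECT C.name AS Customers
FROM Customers AS C
LEFT JOIN Orders AS O ON C.id = O.customerId
WHERE O.id IS NULL
ORDER BY C.id
"""

# EXPLAIN QUERY PLAN describes how SQLite intends to run the query.
# The last column of each plan row is the human-readable detail text.
plan_rows = idx_conn.execute("EXPLAIN QUERY PLAN " + anti_join_query).fetchall()
for plan_row in plan_rows:
    print(plan_row[-1])

# Confirm the index exists and that the query result is unchanged.
index_names = [row[0] for row in idx_conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Orders'"
)]
indexed_result = [row[0] for row in idx_conn.execute(anti_join_query)]
print(index_names)
print(indexed_result)  # ['Henry', 'Max']
```

On tables this small the index makes no measurable difference. The point is only to show where to look in the plan for whether `Orders` is scanned in full or searched through the index.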

### Step-by-Step Reasoning for the Solution

We want to find customers who are "missing" from the orders list.

**Logical Steps:**

1.  **Select Source:** Start with the `Customers` table (Alias `C`).
2.  **Join Strategy:** Use `LEFT JOIN` to bring in `Orders` data (Alias `O`). We use LEFT because we want to keep customers even if they have *no* orders.
3.  **Join Condition:** `C.id = O.customerId`.
4.  **Filter (The Anti-Join):** We want rows where the join *failed*.
    *   Condition: `O.id IS NULL` (This confirms no order record was attached).
5.  **Projection:** Select `C.name` and rename the column to `Customers`.

**Drafting the Query:**

    SELECT
        C.name AS Customers
    FROM
        Customers AS C
    LEFT JOIN
        Orders AS O
    ON
        C.id = O.customerId
    WHERE
        O.id IS NULL;

In [4]:
# Final SQL Solution
final_query = """
SELECT
    C.name AS Customers
FROM
    Customers AS C
LEFT JOIN
    Orders AS O ON C.id = O.customerId
WHERE
    O.id IS NULL;
"""

### Output Verification

Let's trace the logic:

1.  **Joe (id=1):** Matches Order(2). `O.id` is 2 (Not NULL). -> **Drop**.
2.  **Henry (id=2):** No match. `O.id` is NULL. -> **Keep**.
3.  **Sam (id=3):** Matches Order(1). `O.id` is 1 (Not NULL). -> **Drop**.
4.  **Max (id=4):** No match. `O.id` is NULL. -> **Keep**.

**Expected Output:**
| Customers |
|-----------|
| Henry     |
| Max       |

In [5]:
# Execute and show final results
show(final_query)

|   | Customers |
|---|-----------|
| 0 | Henry     |
| 1 | Max       |


### Summary and Key Takeaways

1.  **The Anti-Join Pattern:** To find "A without B", standard practice is `LEFT JOIN ... WHERE B.key IS NULL`.
2.  **Check the Primary Key:** We used `O.id` for the `IS NULL` check. `O.customerId` would also work here, but the **Primary Key** of the right table (`O.id`) is never NULL in a stored row, so a NULL there is the most robust proof that no row matched.
3.  **Renaming Output:** LeetCode (and business stakeholders) often ask for specific column headers. We used `AS Customers` to rename `name` to the required format.
4.  **Business Context:** This query is essentially a "Churn Risk" report. Customers who exist but haven't ordered are prime targets for marketing re-engagement campaigns.