# 1045. Customers Who Bought All Products

Table: `Customer`

| Column Name | Type |
|-------------|------|
| customer_id | int  |
| product_key | int  |

This table may contain duplicate rows. `customer_id` is not NULL. `product_key` is a foreign key (reference column) to the `Product` table.

Table: `Product`

| Column Name | Type |
|-------------|------|
| product_key | int  |

`product_key` is the primary key (column with unique values) for this table.

Write a solution to report the customer ids from the `Customer` table that bought all the products in the `Product` table.

Return the result table in any order.

The result format is in the following example.

**Example 1:**

Input:

Customer table:

| customer_id | product_key |
|-------------|-------------|
| 1           | 5           |
| 2           | 6           |
| 3           | 5           |
| 3           | 6           |
| 1           | 6           |

Product table:

| product_key |
|-------------|
| 5           |
| 6           |

Output:

| customer_id |
|-------------|
| 1           |
| 3           |

Explanation: The customers who bought all the products (5 and 6) are customers with IDs 1 and 3.

## Solution Explanation
This problem asks us to find customers who have purchased all products available in the Product table. To solve this, we need to:

1. Find the total number of distinct products in the Product table.
2. For each customer, count how many distinct products they've purchased.
3. Select only those customers who have purchased all products (i.e., their distinct product count equals the total product count).

The key insight is to use GROUP BY on customer_id and then compare the count of distinct products each customer bought with the total number of products. We can use a subquery to get the total product count and then filter customers based on this condition.
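To make the three steps concrete, here is a minimal sketch that traces them on the Example 1 data using only plain Python sets and dictionaries. It is an illustration of the counting idea, not the pandas solution itself; the names `customer_rows`, `product_keys`, `bought`, `distinct_counts`, and `all_buyers` are made up for this sketch.

```python
from collections import defaultdict

# Example 1 data from the problem statement, as (customer_id, product_key) rows.
customer_rows = [(1, 5), (2, 6), (3, 5), (3, 6), (1, 6)]
product_keys = [5, 6]

# Step 1: total number of distinct products.
total_products = len(set(product_keys))

# Step 2: distinct products per customer (a set ignores duplicate rows).
bought = defaultdict(set)
for customer_id, product_key in customer_rows:
    bought[customer_id].add(product_key)

distinct_counts = {cid: len(keys) for cid, keys in bought.items()}

# Step 3: keep customers whose distinct count equals the total.
all_buyers = sorted(cid for cid, n in distinct_counts.items() if n == total_products)

print(distinct_counts)  # {1: 2, 2: 1, 3: 2}
print(all_buyers)       # [1, 3]
```

Customer 2 bought only one of the two products, so it is dropped in step 3, which matches the expected output of the example.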

```python
import pandas as pd


def find_customers(customer: pd.DataFrame, product: pd.DataFrame) -> pd.DataFrame:
    # Get the total number of distinct products
    total_products = product['product_key'].nunique()

    # Keep only purchases of products listed in Product, so that a stray key
    # cannot inflate a customer's distinct count
    valid_purchases = customer[customer['product_key'].isin(product['product_key'])]

    # Group by customer_id and count distinct products each customer bought
    customer_product_counts = (
        valid_purchases.groupby('customer_id')['product_key'].nunique().reset_index()
    )

    # Filter customers who bought all products
    result = customer_product_counts[customer_product_counts['product_key'] == total_products]

    # Return only the customer_id column
    return result[['customer_id']]
```
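The explanation describes the approach in SQL terms (GROUP BY plus a subquery for the total). As a rough sketch of what that query could look like, the block below runs it against an in-memory SQLite database loaded with the Example 1 rows. Using the standard-library `sqlite3` module is an assumption made so the example is runnable; the `ORDER BY` is only there to make the printed output deterministic, since the problem accepts any order.

```python
import sqlite3

# In-memory database loaded with the Example 1 rows (illustrative setup).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (product_key INTEGER PRIMARY KEY);
    CREATE TABLE Customer (customer_id INTEGER NOT NULL, product_key INTEGER);
    INSERT INTO Product VALUES (5), (6);
    INSERT INTO Customer VALUES (1, 5), (2, 6), (3, 5), (3, 6), (1, 6);
""")

# Group purchases per customer and compare the distinct product count
# with the total number of products obtained from a subquery.
query = """
    SELECT customer_id
    FROM Customer
    GROUP BY customer_id
    HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product)
    ORDER BY customer_id
"""
sql_result = [row[0] for row in conn.execute(query)]
conn.close()

print(sql_result)  # [1, 3]
```

`COUNT(DISTINCT product_key)` plays the same role as `nunique()` in the pandas version: duplicate rows in Customer do not count twice.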

## Time and Space Complexity
* **Time Complexity**: O(n + m), where n is the number of rows in the Customer table and m is the number of rows in the Product table. We need to scan both tables once to count the distinct products and group by customer.
* **Space Complexity**: O(c + p), where c is the number of distinct customers and p is the number of distinct products. We need space to store:
  * The count of distinct products (constant space)
  * The grouped customer data with their product counts (proportional to the number of distinct customers)
  * The final result dataframe (proportional to the number of customers who bought all products)

## Test Cases


```python
# Test Case 1: Basic case from the problem statement
def test_basic_case():
    customer_data = {
        'customer_id': [1, 2, 3, 3, 1],
        'product_key': [5, 6, 5, 6, 6]
    }
    product_data = {
        'product_key': [5, 6]
    }

    customer_df = pd.DataFrame(customer_data)
    product_df = pd.DataFrame(product_data)

    result = find_customers(customer_df, product_df)
    expected = pd.DataFrame({'customer_id': [1, 3]})

    # Sort both dataframes to ensure consistent comparison
    result = result.sort_values('customer_id').reset_index(drop=True)
    expected = expected.sort_values('customer_id').reset_index(drop=True)

    pd.testing.assert_frame_equal(result, expected)
    print("Test Case 1 passed!")


# Test Case 2: No customer bought all products
def test_no_matching_customers():
    customer_data = {
        'customer_id': [1, 2, 3],
        'product_key': [5, 6, 7]
    }
    product_data = {
        'product_key': [5, 6, 7, 8]
    }

    customer_df = pd.DataFrame(customer_data)
    product_df = pd.DataFrame(product_data)

    result = find_customers(customer_df, product_df)
    expected = pd.DataFrame({'customer_id': []})

    # Normalize the index so only the (empty) contents are compared
    result = result.reset_index(drop=True)
    expected = expected.reset_index(drop=True)

    pd.testing.assert_frame_equal(result, expected, check_dtype=False)
    print("Test Case 2 passed!")


# Test Case 3: Only one product exists
def test_single_product():
    # Customer 4 bought product 6, which is not listed in Product,
    # so customer 4 must not appear in the result
    customer_data = {
        'customer_id': [1, 2, 3, 4],
        'product_key': [5, 5, 5, 6]
    }
    product_data = {
        'product_key': [5]
    }

    customer_df = pd.DataFrame(customer_data)
    product_df = pd.DataFrame(product_data)

    result = find_customers(customer_df, product_df)
    expected = pd.DataFrame({'customer_id': [1, 2, 3]})

    # Sort both dataframes to ensure consistent comparison
    result = result.sort_values('customer_id').reset_index(drop=True)
    expected = expected.sort_values('customer_id').reset_index(drop=True)

    pd.testing.assert_frame_equal(result, expected)
    print("Test Case 3 passed!")


# Run the tests
test_basic_case()
test_no_matching_customers()
test_single_product()
```