### IMPORTANT!

First, import the libraries needed to work with SQL in a Jupyter Notebook.

In [2]:
import sqlite3
import prettytable          # Useful for displaying tables when querying SQL in a Jupyter Notebook

con = sqlite3.connect("e-commerce.db")
cur = con.cursor()

prettytable.DEFAULT = 'DEFAULT'

%load_ext sql
%sql sqlite:///e-commerce.db

My database is empty, so next I will create the tables and insert some sample data.

In [3]:
%%sql

CREATE TABLE Customers (
    customer_id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    date_joined DATE NOT NULL,
    city VARCHAR(50),
    is_premium BOOLEAN DEFAULT FALSE
);

CREATE TABLE Orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    total_amount DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);

CREATE TABLE Products (
    product_id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    category VARCHAR(50),
    in_stock BOOLEAN DEFAULT TRUE
);

CREATE TABLE OrderItems (
    order_item_id INTEGER PRIMARY KEY,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL,
    FOREIGN KEY (order_id) REFERENCES Orders(order_id),
    FOREIGN KEY (product_id) REFERENCES Products(product_id)
);

INSERT INTO Customers (name, email, date_joined, city, is_premium) VALUES
('Emma Wilson', 'emma.w@mail.com', '2021-11-05', 'Chicago', TRUE),
('Michael Brown', 'mike.b@mail.com', '2022-07-12', 'Houston', FALSE),
('Sophia Lee', 'sophia.lee@mail.com', '2023-01-20', 'Seattle', TRUE),
('Daniel Kim', 'dan.k@mail.com', '2023-03-15', 'Boston', FALSE),
('Olivia Davis', 'olivia.d@mail.com', '2022-09-01', 'Miami', TRUE),
('Liam Johnson', 'liam.j@mail.com', '2023-04-10', 'Denver', FALSE),
('Ava Martinez', 'ava.m@mail.com', '2022-12-25', 'Austin', TRUE),
('Noah Garcia', 'noah.g@mail.com', '2023-02-14', 'San Francisco', FALSE);

INSERT INTO Products (name, price, category, in_stock) VALUES
('Smartwatch Pro', 299.99, 'Electronics', TRUE),
('Bluetooth Speaker', 89.99, 'Electronics', TRUE),
('Organic Green Tea', 8.99, 'Groceries', FALSE),
('Yoga Block Set', 19.99, 'Fitness', TRUE),
('Wireless Keyboard', 59.95, 'Electronics', TRUE),
('Stainless Steel Cookware Set', 199.00, 'Home', FALSE),
('Running Shoes', 129.99, 'Fitness', TRUE),
('Digital Camera', 449.00, 'Electronics', TRUE),
('Air Purifier', 159.00, 'Home', TRUE),
('Graphic Novel Collection', 49.99, 'Books', TRUE);

INSERT INTO Orders (customer_id, order_date, total_amount) VALUES
(1, '2023-03-05', 359.98),
(2, '2023-04-02', 168.93),
(3, '2023-04-15', 228.94),
(4, '2023-05-01', 89.99),
(5, '2023-05-10', 259.98),
(6, '2023-05-12', 199.00),
(7, '2023-05-15', 129.99),
(8, '2023-05-20', 508.94),
(1, '2023-05-25', 79.96),
(2, '2023-06-01', 449.00),
(3, '2023-06-05', 199.99),
(4, '2023-06-10', 299.95),
(5, '2023-06-15', 159.00);

INSERT INTO OrderItems (order_id, product_id, quantity) VALUES
(4, 2, 2),
(4, 5, 1),
(5, 1, 1),
(5, 10, 2),
(6, 6, 1),
(7, 7, 1),
(8, 8, 1),
(8, 9, 1),
(9, 3, 4),
(9, 4, 2),
(10, 8, 1),
(11, 5, 2),
(12, 9, 1),
(13, 2, 3);

 * sqlite:///e-commerce.db
Done.
Done.
Done.
Done.
8 rows affected.
10 rows affected.
13 rows affected.
14 rows affected.


[]
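
One caveat about the schema above: by default SQLite parses `FOREIGN KEY` clauses but does not enforce them unless `PRAGMA foreign_keys = ON` is set on the connection. The following is a minimal sketch of that behaviour, assuming a cut-down version of the `Customers` and `Orders` tables in an in-memory database; the helper name `try_orphan_insert` is made up for illustration.

```python
import sqlite3

# Cut-down version of the notebook schema (only the columns needed here).
SCHEMA = """
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INT NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);
"""

def try_orphan_insert(enforce: bool) -> bool:
    """Return True if an order for a non-existent customer is accepted."""
    con = sqlite3.connect(":memory:")
    # Set the pragma explicitly so the result does not depend on build defaults.
    con.execute("PRAGMA foreign_keys = ON" if enforce else "PRAGMA foreign_keys = OFF")
    con.executescript(SCHEMA)
    try:
        con.execute("INSERT INTO Orders (customer_id) VALUES (999)")
        con.commit()
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        con.close()

accepted_without_pragma = try_orphan_insert(enforce=False)
accepted_with_pragma = try_orphan_insert(enforce=True)
print(accepted_without_pragma, accepted_with_pragma)
```

The pragma applies per connection, so it would need to be set again each time the database is opened.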

---

### **Task 1: Customers Who Ordered Products from Multiple Categories**
**Goal:** Find customers who have ordered products from at least 3 different categories.  

---

In [None]:
%%sql

SELECT
    c.customer_id,
    c.name
FROM
    Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    JOIN OrderItems oi ON o.order_id = oi.order_id
    JOIN Products p ON oi.product_id = p.product_id
GROUP BY
    c.customer_id, c.name
HAVING
    COUNT(DISTINCT p.category) >= 3

 * sqlite:///e-commerce.db
Done.


customer_id,name


### **Task 2: Average Order Value by Month**
**Goal:** Calculate the average order value for each month in 2023.  
**Hint:**  
- Extract the month from `order_date` using `EXTRACT(MONTH FROM order_date)` or equivalent.  
- Group by month and calculate `AVG(total_amount)`.  

---

In [13]:
%%sql

SELECT
    strftime('%m', order_date) AS month_number,
    AVG(total_amount) AS average_amount
FROM
    Orders
WHERE
    strftime('%Y', order_date) = '2023'
GROUP BY
    strftime('%m', order_date)

 * sqlite:///e-commerce.db
Done.


month_number,average_amount
3,359.98
4,198.935
5,211.31000000000003
6,276.985
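
Values such as `211.31000000000003` are binary floating-point artifacts of `AVG`. One way to tidy them up is to wrap the aggregate in `ROUND(..., 2)`. The sketch below assumes an in-memory table holding only the May 2023 orders from this notebook.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Orders (order_date DATE, total_amount DECIMAL(10,2))")

# The six May 2023 orders inserted earlier in the notebook.
may_orders = [
    ("2023-05-01", 89.99), ("2023-05-10", 259.98), ("2023-05-12", 199.00),
    ("2023-05-15", 129.99), ("2023-05-20", 508.94), ("2023-05-25", 79.96),
]
con.executemany("INSERT INTO Orders VALUES (?, ?)", may_orders)

# Same query as above, with the average rounded to two decimals.
rows = con.execute("""
    SELECT strftime('%m', order_date) AS month_number,
           ROUND(AVG(total_amount), 2) AS average_amount
    FROM Orders
    WHERE strftime('%Y', order_date) = '2023'
    GROUP BY month_number
""").fetchall()
print(rows)
```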


### **Task 3: Customers Who Ordered Both Electronics and Fitness Products**
**Goal:** List customers who bought products from **both** "Electronics" and "Fitness" categories.

---

In [43]:
%%sql

SELECT
    c.customer_id,
    c.name,
    c.email
FROM
    Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    JOIN OrderItems oi ON o.order_id = oi.order_id
    JOIN Products p ON oi.product_id = p.product_id
WHERE
    p.category IN ('Fitness', 'Electronics')
GROUP BY
    c.customer_id,
    c.name,
    c.email
HAVING
    COUNT(DISTINCT p.category) = 2;

 * sqlite:///e-commerce.db
Done.


customer_id,name,email


### **Task 4: Products Only Ordered Once**
**Goal:** Find products that were included in **exactly one order** (no repeats).  

---

In [55]:
%%sql

SELECT
    p.product_id,
    p.name AS product_name,
    COUNT(DISTINCT oi.order_id) AS orders_included
FROM
    Products p
    JOIN OrderItems oi ON p.product_id = oi.product_id
GROUP BY
    p.product_id,
    p.name
HAVING
    COUNT(DISTINCT oi.order_id) = 1

 * sqlite:///e-commerce.db
Done.


product_id,product_name,orders_included
1,Smartwatch Pro,1
3,Organic Green Tea,1
4,Yoga Block Set,1
6,Stainless Steel Cookware Set,1
7,Running Shoes,1
10,Graphic Novel Collection,1
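
For this task, `COUNT(order_id)` and `COUNT(DISTINCT order_id)` give the same answer on this data, because no product appears twice within the same order. They diverge as soon as one does. A small sketch with hypothetical rows (not the notebook's data) illustrates the difference.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE OrderItems (order_id INT, product_id INT, quantity INT)")

# Hypothetical rows: product 1 appears on two lines of order 100 only,
# product 2 appears once in each of two different orders.
con.executemany(
    "INSERT INTO OrderItems VALUES (?, ?, ?)",
    [(100, 1, 1), (100, 1, 2), (101, 2, 1), (102, 2, 1)],
)

counts = con.execute("""
    SELECT product_id,
           COUNT(order_id) AS item_rows,
           COUNT(DISTINCT order_id) AS distinct_orders
    FROM OrderItems
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()
print(counts)
```

Here product 1 has two item rows but only one distinct order, so only the `DISTINCT` count would report it as "ordered once".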


### **Task 5: Customers Who Ordered Every Month in 2023**
**Goal:** Identify customers who placed orders in **all 12 months** of 2023.  

---

In [17]:
%%sql

SELECT
    c.customer_id,
    c.name,
    c.email
FROM
    Customers c 
    JOIN Orders o ON c.customer_id = o.customer_id
WHERE
    strftime('%Y', o.order_date) = '2023'
GROUP BY
    c.customer_id
HAVING
    COUNT(DISTINCT strftime('%m', o.order_date)) = 12


 * sqlite:///e-commerce.db
Done.


customer_id,name,email


In MySQL, date parts are extracted differently, with functions such as:

```sql
MONTH(o.order_date)
-- or
YEAR(o.order_date)
```
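
One difference worth keeping in mind: SQLite's `strftime` returns text (for example `'03'`), whereas MySQL's `MONTH()` returns an integer. If a number is needed on the SQLite side, a `CAST` is one option, as this small sketch shows using a literal date.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# strftime yields zero-padded text; CAST converts it to an integer.
row = con.execute("""
    SELECT strftime('%m', '2023-03-05'),
           CAST(strftime('%m', '2023-03-05') AS INTEGER),
           CAST(strftime('%Y', '2023-03-05') AS INTEGER)
""").fetchone()
print(row)
```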

### **Task 6: Rank Customers by Total Spending**
**Goal:** Rank customers by their total spending, showing their rank and total amount.  

---

In [22]:
%%sql

SELECT
    RANK() OVER (ORDER BY SUM(o.total_amount) DESC) AS rank_number,
    c.customer_id,
    c.name,
    SUM(o.total_amount) AS total_spending
FROM
    Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY
    c.customer_id,
    c.name
ORDER BY
    rank_number

 * sqlite:///e-commerce.db
Done.


rank_number,customer_id,name,total_spending
1,2,Michael Brown,617.9300000000001
2,8,Noah Garcia,508.94
3,1,Emma Wilson,439.94
4,3,Sophia Lee,428.93
5,5,Olivia Davis,418.98
6,4,Daniel Kim,389.94
7,6,Liam Johnson,199.0
8,7,Ava Martinez,129.99


### **Task 7: Customers Who Ordered a Now-Out-of-Stock Product**
**Goal:** Find customers who ordered products that are **currently out of stock**.  

---

In [26]:
%%sql

SELECT
    c.*
FROM
    Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    JOIN OrderItems oi ON o.order_id = oi.order_id
    JOIN Products p ON oi.product_id = p.product_id
WHERE
    p.in_stock IS FALSE


 * sqlite:///e-commerce.db
Done.


customer_id,name,email,date_joined,city,is_premium
6,Liam Johnson,liam.j@mail.com,2023-04-10,Denver,0
1,Emma Wilson,emma.w@mail.com,2021-11-05,Chicago,1
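
The join returns one row per matching order item, so a customer who ordered two out-of-stock products would be listed twice. That does not happen with this data, but `SELECT DISTINCT` would guard against it. The sketch below assumes hypothetical rows (one customer, one order, two out-of-stock products) and compares `in_stock` with `0`, since SQLite stores booleans as integers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INT);
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, in_stock BOOLEAN);
CREATE TABLE OrderItems (order_id INT, product_id INT);
-- Hypothetical data: one customer, one order, two out-of-stock products.
INSERT INTO Customers VALUES (1, 'Test Customer');
INSERT INTO Orders VALUES (10, 1);
INSERT INTO Products VALUES (1, 0), (2, 0);
INSERT INTO OrderItems VALUES (10, 1), (10, 2);
""")

# Shared FROM/WHERE clause, matching the joins used in the task.
JOINS = """
    FROM Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    JOIN OrderItems oi ON o.order_id = oi.order_id
    JOIN Products p ON oi.product_id = p.product_id
    WHERE p.in_stock = 0
"""
plain = con.execute("SELECT c.customer_id, c.name" + JOINS).fetchall()
deduped = con.execute("SELECT DISTINCT c.customer_id, c.name" + JOINS).fetchall()
print(len(plain), len(deduped))
```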


### **Task 8: Most Popular Product in Each Category**
**Goal:** For each category, find the product with the highest total sales.  

---

In [3]:
%%sql

WITH ProductSales AS (
    SELECT
        p.product_id,
        p.name,
        p.category,
        SUM(oi.quantity) AS total_sales
    FROM
        Products p
        JOIN OrderItems oi ON p.product_id = oi.product_id
    GROUP BY
        p.product_id,
        p.name,
        p.category

), RankProduct AS (
    SELECT
        product_id,
        name,
        category,
        total_sales,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY total_sales DESC) AS rank_number
    FROM
        ProductSales
)
SELECT
    product_id,
    name,
    category,
    total_sales,
    rank_number
FROM
    RankProduct
WHERE
    rank_number = 1

 * sqlite:///e-commerce.db
Done.


product_id,name,category,total_sales,rank_number
10,Graphic Novel Collection,Books,2,1
2,Bluetooth Speaker,Electronics,5,1
4,Yoga Block Set,Fitness,2,1
3,Organic Green Tea,Groceries,4,1
9,Air Purifier,Home,2,1
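
`ROW_NUMBER()` always yields exactly one rank-1 row per category, so if two products tied for the top spot, one of them would be dropped arbitrarily. `RANK()` would keep both. There are no ties in this data, so the sketch below uses hypothetical totals to show the difference; it assumes a SQLite build with window-function support (3.25 or later), and the helper `top_products` is made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ProductSales (name TEXT, category TEXT, total_sales INT)")

# Hypothetical totals with a tie in the 'Fitness' category.
con.executemany(
    "INSERT INTO ProductSales VALUES (?, ?, ?)",
    [("Mat", "Fitness", 3), ("Bands", "Fitness", 3), ("Lamp", "Home", 2)],
)

def top_products(rank_function: str) -> list:
    """Return the rank-1 rows per category using the given window function."""
    query = f"""
        SELECT name, category FROM (
            SELECT name, category,
                   {rank_function}() OVER (
                       PARTITION BY category ORDER BY total_sales DESC
                   ) AS rank_number
            FROM ProductSales
        )
        WHERE rank_number = 1
        ORDER BY category, name
    """
    return con.execute(query).fetchall()

with_row_number = top_products("ROW_NUMBER")
with_rank = top_products("RANK")
print(len(with_row_number), len(with_rank))
```

With `ROW_NUMBER` only one of the two tied Fitness products survives; with `RANK` both do.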


### **Task 9: Customers Who Never Ordered Electronics**
**Goal:** List customers who have **never** ordered an "Electronics" product.  

---

In [11]:
%%sql

SELECT
    customer_id,
    name,
    email
FROM
    Customers
WHERE
    customer_id NOT IN (
        SELECT DISTINCT
            o.customer_id
        FROM
            Orders o
            JOIN OrderItems oi ON o.order_id = oi.order_id
            JOIN Products p ON oi.product_id = p.product_id
        WHERE
            p.category = 'Electronics'
    )

 * sqlite:///e-commerce.db
Done.


customer_id,name,email
1,Emma Wilson,emma.w@mail.com
6,Liam Johnson,liam.j@mail.com
7,Ava Martinez,ava.m@mail.com
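
`NOT IN` works here because `Orders.customer_id` is declared `NOT NULL`. If the subquery could return a `NULL`, `NOT IN` would match nothing at all, and `NOT EXISTS` would be the safer pattern. The following sketch assumes a hypothetical variant of `Orders` where `customer_id` is nullable.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT);
-- Hypothetical variant of Orders in which customer_id may be NULL.
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INT);
INSERT INTO Customers VALUES (1, 'A'), (2, 'B');
INSERT INTO Orders VALUES (10, 1), (11, NULL);
""")

# NOT IN: the NULL in the subquery makes the comparison unknown for customer 2.
not_in = con.execute("""
    SELECT customer_id FROM Customers
    WHERE customer_id NOT IN (SELECT customer_id FROM Orders)
""").fetchall()

# NOT EXISTS: NULL rows simply never match the correlated condition.
not_exists = con.execute("""
    SELECT c.customer_id FROM Customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM Orders o WHERE o.customer_id = c.customer_id
    )
""").fetchall()
print(not_in, not_exists)
```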


### **Task 10: Customer Order Frequency**
**Goal:** For each customer, show the number of days between their first and last order.

---


In [16]:
%%sql


WITH OrderDates AS (
    SELECT
        c.customer_id,
        c.name,
        c.email,
        MIN(o.order_date) AS first_order,
        MAX(o.order_date) AS last_order
    FROM
        Customers c
        JOIN Orders o ON c.customer_id = o.customer_id
    GROUP BY
        c.customer_id,
        c.name,
        c.email
)
SELECT
    customer_id,
    name,
    email,
    julianday(last_order) - julianday(first_order) AS date_difference
FROM
    OrderDates


 * sqlite:///e-commerce.db
Done.


customer_id,name,email,date_difference
1,Emma Wilson,emma.w@mail.com,81.0
2,Michael Brown,mike.b@mail.com,60.0
3,Sophia Lee,sophia.lee@mail.com,51.0
4,Daniel Kim,dan.k@mail.com,40.0
5,Olivia Davis,olivia.d@mail.com,36.0
6,Liam Johnson,liam.j@mail.com,0.0
7,Ava Martinez,ava.m@mail.com,0.0
8,Noah Garcia,noah.g@mail.com,0.0


`julianday` is specific to SQLite; in MySQL the equivalent is `DATEDIFF()`:

```sql
    DATEDIFF(last_order, first_order) AS date_difference
```
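
Unlike `DATEDIFF()`, subtracting two `julianday` values gives a floating-point number (hence `81.0` above). A `CAST` to `INTEGER` is one way to get whole days, sketched here with Emma Wilson's first and last order dates from this notebook.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Difference between two dates as a float, and as an integer number of days.
row = con.execute("""
    SELECT julianday('2023-05-25') - julianday('2023-03-05'),
           CAST(julianday('2023-05-25') - julianday('2023-03-05') AS INTEGER)
""").fetchone()
print(row)
```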

I prefer MySQL; I'm using SQLite here only because it makes the tasks easy to display in Jupyter.

### About the Author

**Name:** Sebastian Mondragon  

- **Email:** basmondragon@proton.me
- **Telegram:** [https://t.me/basmondragon](https://t.me/basmondragon)
- **LinkedIn:** [https://www.linkedin.com/in/basmondragon/](https://www.linkedin.com/in/basmondragon/)

#### Skills

- **Programming Languages:** Python, SQL  
- **Libraries & Frameworks:** Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, XGBoost  
- **Methodologies:** Data Cleaning, Feature Engineering, Machine Learning, Model Evaluation  
- **Soft Skills:** Problem-Solving, Analytical Thinking, Communication

#### Next Steps

If you have any feedback or suggestions for improving this project, feel free to reach out to me via email or LinkedIn. I’m always open to learning and collaborating on new ideas!  

Feel free to explore my other projects on GitHub: [https://github.com/basmondragon](https://github.com/basmondragon)