Understanding sales trends is crucial for any business striving to maximize its performance and cater to customer needs effectively. One of the most valuable insights comes from analyzing how sales vary across different categories and days of the week.

In this post, we will walk through a practical example of generating a sales report from raw data. The task is a classic SQL-style reporting problem, and we solve it here with pandas. The report breaks down sales by product category, such as books, phones, and glasses, and maps each order to the day of the week on which it was placed. The aim is to uncover patterns in purchasing behavior that help businesses make informed decisions, such as optimizing inventory, launching day-specific promotions, or identifying underperforming product lines.

#### Table: Orders
| Column Name  | Type    |
|--------------|---------|
| order_id     | int     |
| customer_id  | int     |
| order_date   | date    |
| item_id      | varchar |
| quantity     | int     |

- (order_id, item_id) is the primary key for this table.
- This table contains information about orders placed, where `order_date` is the date `item_id` was ordered by the customer with `customer_id`.

#### Table: Items
| Column Name   | Type    |
|---------------|---------|
| item_id       | varchar |
| item_name     | varchar |
| item_category | varchar |

- `item_id` is the primary key for this table.
- This table contains details about items, including their name and category.

---

### Problem Statement

As a business owner, you would like to generate a sales report that summarizes how many units in each category were ordered on each day of the week.

#### Expected Output

The output should show the total quantity ordered for each item category, with one column per day of the week from Monday to Sunday. A category with no orders on a given day should show 0 for that day. The result table should be sorted by category.

##### Input:

**Orders Table:**
| order_id | customer_id | order_date | item_id | quantity |
|----------|-------------|------------|---------|----------|
| 1        | 1           | 2020-06-01 | 1       | 10       |
| 2        | 1           | 2020-06-08 | 2       | 10       |
| 3        | 2           | 2020-06-02 | 1       | 5        |
| 4        | 3           | 2020-06-03 | 3       | 5        |
| 5        | 4           | 2020-06-04 | 4       | 1        |
| 6        | 4           | 2020-06-05 | 5       | 5        |
| 7        | 5           | 2020-06-05 | 1       | 10       |
| 8        | 5           | 2020-06-14 | 4       | 5        |
| 9        | 5           | 2020-06-21 | 3       | 5        |

**Items Table:**
| item_id | item_name       | item_category |
|---------|-----------------|---------------|
| 1       | LC Alg. Book    | Book          |
| 2       | LC DB. Book     | Book          |
| 3       | LC Smartphone   | Phone         |
| 4       | LC Phone 2020   | Phone         |
| 5       | LC SmartGlass   | Glasses       |
| 6       | LC T-Shirt XL   | T-Shirt       |

##### Output:

| Category | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|----------|--------|---------|-----------|----------|--------|----------|--------|
| Book     | 20     | 5       | 0         | 0        | 10     | 0        | 0      |
| Glasses  | 0      | 0       | 0         | 0        | 5      | 0        | 0      |
| Phone    | 0      | 0       | 5         | 1        | 0      | 0        | 10     |
| T-Shirt  | 0      | 0       | 0         | 0        | 0      | 0        | 0      |

---

#### Explanation:

1. **Monday**:
   - On 2020-06-01 and 2020-06-08, 20 units of category **Book** (10 + 10) were sold.

2. **Tuesday**:
   - On 2020-06-02, 5 units of category **Book** were sold.

3. **Wednesday**:
   - On 2020-06-03, 5 units of category **Phone** were sold.

4. **Thursday**:
   - On 2020-06-04, 1 unit of category **Phone** was sold.

5. **Friday**:
   - On 2020-06-05, 10 units of category **Book** and 5 units of category **Glasses** were sold.

6. **Saturday**:
   - No units were sold.

7. **Sunday**:
   - On 2020-06-14 and 2020-06-21, 10 units of category **Phone** (5 + 5) were sold.

8. No sales occurred for the **T-Shirt** category, so its row contains only zeros.
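
The walkthrough above can be checked with a few lines of plain Python before reaching for pandas. This is only a minimal sketch using the sample data from this post; the helper name `weekly_report` and the tuple layout of `orders` are assumptions made for illustration, not part of the original problem.

```python
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Sample rows from the Orders table: (order_date, item_id, quantity)
orders = [
    ("2020-06-01", "1", 10), ("2020-06-08", "2", 10), ("2020-06-02", "1", 5),
    ("2020-06-03", "3", 5), ("2020-06-04", "4", 1), ("2020-06-05", "5", 5),
    ("2020-06-05", "1", 10), ("2020-06-14", "4", 5), ("2020-06-21", "3", 5),
]

# item_id -> item_category, taken from the Items table
categories = {"1": "Book", "2": "Book", "3": "Phone",
              "4": "Phone", "5": "Glasses", "6": "T-Shirt"}


def weekly_report(orders, categories):
    """Hypothetical helper: total quantity per category and weekday."""
    # Start every category at zero so categories without orders still appear
    report = {cat: dict.fromkeys(DAYS, 0)
              for cat in sorted(set(categories.values()))}
    for order_date, item_id, quantity in orders:
        # date.weekday() returns 0 for Monday through 6 for Sunday
        day = DAYS[date.fromisoformat(order_date).weekday()]
        report[categories[item_id]][day] += quantity
    return report


report = weekly_report(orders, categories)
for category, row in report.items():
    print(category, [row[day] for day in DAYS])
```

Seeding the dictionary from the Items data, rather than from the orders, is what keeps the **T-Shirt** row in the report even though nothing was ordered in that category.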


In [31]:
import pandas as pd

data = [[1, 1, '2020-06-01', 1, 10], 
        [2, 1, '2020-06-08', 2, 10], 
        [3, 2, '2020-06-02', 1, 5], 
        [4, 3, '2020-06-03', 3, 5], 
        [5, 4, '2020-06-04', 4, 1], 
        [6, 4, '2020-06-05', 5, 5], 
        [7, 5, '2020-06-05', 1, 10], 
        [8, 5, '2020-06-14', 4, 5], 
        [9, 5, '2020-06-21', 3, 5]]
orders = pd.DataFrame(data, 
          columns=['order_id', 
                   'customer_id', 
                   'order_date', 
                   'item_id', 
                   'quantity']).astype({'order_id':'Int64', 
                   'customer_id':'Int64', 
                   'order_date':'datetime64[ns]', 
                   'item_id':'object', 'quantity':'Int64'})
display(orders)

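|   | order_id | customer_id | order_date | item_id | quantity |
|---|----------|-------------|------------|---------|----------|
| 0 | 1        | 1           | 2020-06-01 | 1       | 10       |
| 1 | 2        | 1           | 2020-06-08 | 2       | 10       |
| 2 | 3        | 2           | 2020-06-02 | 1       | 5        |
| 3 | 4        | 3           | 2020-06-03 | 3       | 5        |
| 4 | 5        | 4           | 2020-06-04 | 4       | 1        |
| 5 | 6        | 4           | 2020-06-05 | 5       | 5        |
| 6 | 7        | 5           | 2020-06-05 | 1       | 10       |
| 7 | 8        | 5           | 2020-06-14 | 4       | 5        |
| 8 | 9        | 5           | 2020-06-21 | 3       | 5        |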


In [None]:
# Items lookup table: one row per item with its name and category
data = [[1, 'LC Alg. Book', 'Book'],
        [2, 'LC DB. Book', 'Book'],
        [3, 'LC Smartphone', 'Phone'],
        [4, 'LC Phone 2020', 'Phone'],
        [5, 'LC SmartGlass', 'Glasses'],
        [6, 'LC T-Shirt XL', 'T-Shirt']]
items = pd.DataFrame(data,
        columns=['item_id',
                 'item_name',
                 'item_category']).astype({'item_id':'object',
                 'item_name':'object',
                 'item_category':'object'})
display(items)

|   | item_id | item_name     | item_category |
|---|---------|---------------|---------------|
| 0 | 1       | LC Alg. Book  | Book          |
| 1 | 2       | LC DB. Book   | Book          |
| 2 | 3       | LC Smartphone | Phone         |
| 3 | 4       | LC Phone 2020 | Phone         |
| 4 | 5       | LC SmartGlass | Glasses       |
| 5 | 6       | LC T-Shirt XL | T-Shirt       |

In [32]:
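# Attach each order's category; both frames share the item_id column
orders = orders.merge(items, on="item_id", how="left")
orders["day"] = orders["order_date"].dt.day_name()

# Declare every category and weekday as a categorical level so that
# categories without orders and days without sales still appear in the pivot
unique_categories = sorted(items["item_category"].unique().tolist())
orders["item_category"] = orders["item_category"].astype("category").cat.set_categories(unique_categories)
weekdays = ['Monday', 'Tuesday', 'Wednesday',
            'Thursday', 'Friday', 'Saturday', 'Sunday']
orders["day"] = orders["day"].astype("category").cat.set_categories(weekdays)

# Sum quantities per (category, weekday); observed=False keeps unused levels
# and fill_value=0 guards against missing combinations
report = pd.pivot_table(orders,
                        index="item_category",
                        columns="day",
                        values="quantity",
                        aggfunc="sum",
                        fill_value=0,
                        observed=False)
report = report.reset_index(names="Category").sort_values(by=["Category"])

report

| day | Category | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|-----|----------|--------|---------|-----------|----------|--------|----------|--------|
| 0   | Book     | 20     | 5       | 0         | 0        | 10     | 0        | 0      |
| 1   | Glasses  | 0      | 0       | 0         | 0        | 5      | 0        | 0      |
| 2   | Phone    | 0      | 0       | 5         | 1        | 0      | 0        | 10     |
| 3   | T-Shirt  | 0      | 0       | 0         | 0        | 0      | 0        | 0      |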
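
For readers who prefer to see the same report as a SQL query, here is a minimal sketch that runs it through Python's built-in `sqlite3` module against an in-memory database. It assumes the SQLite dialect, where `strftime('%w', ...)` returns `'0'` for Sunday through `'6'` for Saturday; other database engines expose weekday functions differently, so the `CASE` conditions would need adapting. Storing `order_date` as ISO-formatted text is also an assumption of this sketch.

```python
import sqlite3

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (item_id TEXT PRIMARY KEY, item_name TEXT, item_category TEXT);
CREATE TABLE Orders (order_id INTEGER, customer_id INTEGER, order_date TEXT,
                     item_id TEXT, quantity INTEGER,
                     PRIMARY KEY (order_id, item_id));
""")
conn.executemany("INSERT INTO Items VALUES (?, ?, ?)", [
    ("1", "LC Alg. Book", "Book"), ("2", "LC DB. Book", "Book"),
    ("3", "LC Smartphone", "Phone"), ("4", "LC Phone 2020", "Phone"),
    ("5", "LC SmartGlass", "Glasses"), ("6", "LC T-Shirt XL", "T-Shirt"),
])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2020-06-01", "1", 10), (2, 1, "2020-06-08", "2", 10),
    (3, 2, "2020-06-02", "1", 5), (4, 3, "2020-06-03", "3", 5),
    (5, 4, "2020-06-04", "4", 1), (6, 4, "2020-06-05", "5", 5),
    (7, 5, "2020-06-05", "1", 10), (8, 5, "2020-06-14", "4", 5),
    (9, 5, "2020-06-21", "3", 5),
])

# One conditional SUM per weekday. SQLite numbers Sunday as 0, so Monday..Sunday
# map to 1, 2, ..., 6, 0. COALESCE turns "no matching rows" into 0.
day_columns = ",\n".join(
    f"COALESCE(SUM(CASE WHEN strftime('%w', o.order_date) = '{(i + 1) % 7}' "
    f"THEN o.quantity END), 0) AS {day}"
    for i, day in enumerate(DAYS)
)

# LEFT JOIN from Items keeps categories that have no orders at all
query = f"""
SELECT i.item_category AS Category,
{day_columns}
FROM Items AS i
LEFT JOIN Orders AS o ON o.item_id = i.item_id
GROUP BY i.item_category
ORDER BY i.item_category
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The query starts from `Items` and left-joins `Orders` for the same reason the pandas version declares categorical levels: a category without any orders should still produce a row of zeros.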
