PHW#3

Xinyun Wang

### **Proposition 1**

#### **Query Name**

Top 3 Selling Products Per Category

#### **Description**

This query retrieves the top 3 products with the highest total sales amount in each product category. It uses a common table expression (CTE) and the `ROW_NUMBER()` window function to rank products within their categories.

#### **Inputs**

- **ProductID**: The unique identifier for each product.
- **Quantity**: The number of units sold.
- **UnitPrice**: The price per unit.
- **ProductCategoryID**: The category identifier for each product.

#### **Outputs**

- **ProductCategoryID**: The category identifier.
- **ProductCategoryName**: The name of the product category.
- **ProductID**: The unique identifier for the product.
- **ProductName**: The name of the product.
- **TotalSalesAmount**: The total sales amount for the product.

#### **Steps**

1. **Calculate Total Sales Amount for Each Product**
    
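    - Join `Sales.OrderLines` with `Warehouse.StockItems`, then with `Warehouse.StockItemStockGroups` and `Warehouse.StockGroups` to attach each product's category.
    - Calculate `TotalSalesAmount` as `SUM(Quantity * UnitPrice)` per product and category.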
2. **Rank Products Within Each Category**
    
    - Use a CTE with `ROW_NUMBER()` partitioned by `ProductCategoryID` and ordered by `TotalSalesAmount` descending.
3. **Select Top 3 Products Per Category**
    
    - Filter the results where `RowNum` ≤ 3.
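
The ranking step can be tried on a handful of rows before running it against the full database. The following is a minimal sketch on made-up data: the category IDs, product names, and amounts are hypothetical and do not come from WideWorldImporters. It also adds `ProductName` as a tiebreaker, which is an assumption rather than part of the query in this proposition; without a tiebreaker, `ROW_NUMBER()` numbers tied rows in an arbitrary order.

```sql
-- Hypothetical toy rows; not taken from WideWorldImporters.
WITH ToySales AS (
    SELECT v.CategoryID, v.ProductName, v.TotalSalesAmount
    FROM (VALUES
        (1, 'A', 500.00),
        (1, 'B', 300.00),
        (1, 'C', 300.00),
        (1, 'D', 100.00),
        (2, 'E', 250.00)
    ) AS v (CategoryID, ProductName, TotalSalesAmount)
), Ranked AS (
    SELECT
        CategoryID,
        ProductName,
        TotalSalesAmount,
        -- Numbering restarts at 1 for every category.
        ROW_NUMBER() OVER (
            PARTITION BY CategoryID
            ORDER BY TotalSalesAmount DESC, ProductName
        ) AS RowNum
    FROM ToySales
)
SELECT CategoryID, ProductName, TotalSalesAmount, RowNum
FROM Ranked
WHERE RowNum <= 3
ORDER BY CategoryID, RowNum;
```

Category 1 keeps A, B, and C and drops D; category 2 keeps its single product E.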

#### **Assumptions**

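- The data is available in the `Sales.OrderLines`, `Warehouse.StockItems`, `Warehouse.StockItemStockGroups`, and `Warehouse.StockGroups` tables.
- Each product belongs to only one category; a product linked to several stock groups would be ranked once in each of them.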

#### **Example Output**

| ProductCategoryID | ProductCategoryName | ProductID | ProductName | TotalSalesAmount |
| --- | --- | --- | --- | --- |
| 1 | Beverages | 101 | Green Tea | $50,000 |
| 1 | Beverages | 102 | Black Coffee | $45,000 |
| 1 | Beverages | 103 | Herbal Tea | $40,000 |

In [None]:
use WideWorldImporters;

WITH ProductSales AS (
    SELECT 
        si.StockItemID,
        si.StockItemName,
        ssg.StockGroupID AS ProductCategoryID,
        ssg.StockGroupName AS ProductCategoryName,
        SUM(ol.Quantity * ol.UnitPrice) AS TotalSalesAmount,
        ROW_NUMBER() OVER (
            PARTITION BY ssg.StockGroupID 
            ORDER BY SUM(ol.Quantity * ol.UnitPrice) DESC
        ) AS RowNum
    FROM Sales.OrderLines AS ol
    INNER JOIN Warehouse.StockItems AS si
        ON ol.StockItemID = si.StockItemID
    INNER JOIN Warehouse.StockItemStockGroups AS sisg
        ON si.StockItemID = sisg.StockItemID
    INNER JOIN Warehouse.StockGroups AS ssg
        ON sisg.StockGroupID = ssg.StockGroupID
    GROUP BY 
        si.StockItemID,
        si.StockItemName,
        ssg.StockGroupID,
        ssg.StockGroupName
)
SELECT 
    ProductCategoryID,
    ProductCategoryName,
    StockItemID AS ProductID,
    StockItemName AS ProductName,
    TotalSalesAmount
FROM ProductSales
WHERE RowNum <= 3
ORDER BY ProductCategoryID, RowNum;


## **Proposition 2**

### **Query Name**

**Top Customers by Total Order Value**

### **Description**

This query retrieves the top 10 customers based on the total value of their orders. It uses two chained CTEs and aggregate functions: the first totals each order, and the second sums those order totals per customer.

### **Inputs**

- **CustomerID**: The unique identifier for each customer.
- **OrderID**: The unique identifier for each order.
- **Quantity**: The number of units sold.
- **UnitPrice**: The price per unit.

### **Outputs**

- **CustomerID**: The unique identifier for the customer.
- **CustomerName**: The name of the customer.
- **TotalOrderValue**: The total value of all orders placed by the customer.

### **Steps**

1. **Calculate Total Order Value per Order**
    - Sum `Quantity * UnitPrice` for each `OrderID`.
2. **Calculate Total Order Value per Customer**
    - Sum total order values for each `CustomerID`.
3. **Select Top 10 Customers**
    - Order the results by `TotalOrderValue` in descending order and select the top 10.

### **Assumptions**

- All order and customer data is accurate and complete.
- There are sufficient customers to retrieve a top 10 list.

### **Example Output**

| CustomerID | CustomerName | TotalOrderValue |
| --- | --- | --- |
| 101 | Alpha Corp | $500,000 |
| 102 | Beta LLC | $450,000 |

In [None]:
use WideWorldImporters;

WITH OrderTotals AS (
    SELECT 
        o.CustomerID,
        SUM(ol.Quantity * ol.UnitPrice) AS OrderValue
    FROM Sales.Orders AS o
    INNER JOIN Sales.OrderLines AS ol
        ON o.OrderID = ol.OrderID
    GROUP BY 
        o.OrderID,
        o.CustomerID
), CustomerTotals AS (
    SELECT 
        CustomerID,
        SUM(OrderValue) AS TotalOrderValue
    FROM OrderTotals
    GROUP BY CustomerID
)
SELECT TOP 10
    ct.CustomerID,
    c.CustomerName,
    ct.TotalOrderValue
FROM CustomerTotals AS ct
INNER JOIN Sales.Customers AS c
    ON ct.CustomerID = c.CustomerID
ORDER BY ct.TotalOrderValue DESC;


### **Proposition 3**

#### **Query Name**

Average Order Value by Employee

#### **Description**

This query calculates the average order value for each employee who has processed orders. It uses a CTE to compute total sales per order.

#### **Inputs**

- **EmployeeID**: The unique identifier for each employee.
- **OrderID**: The unique identifier for each order.
- **Quantity**: The number of units sold.
- **UnitPrice**: The price per unit.

#### **Outputs**

- **EmployeeID**: The unique identifier for the employee.
- **EmployeeName**: The name of the employee.
- **AverageOrderValue**: The average total value of orders processed by the employee.

#### **Steps**

1. **Calculate Total Value per Order**
    
    - Create a CTE that joins `Sales.Orders` with `Sales.OrderLines` and sums `Quantity * UnitPrice` for each `OrderID`, keeping the order's `SalespersonPersonID` as `EmployeeID`.
2. **Link Orders to Employees**
    
    - Join the CTE with `Application.People` on `PersonID` to get employee names.
3. **Compute Average Order Value per Employee**
    
    - Group by `EmployeeID` and calculate the average of the total order values.

#### **Assumptions**

- Employee information is available in `Application.People`, and the employee for an order is its `SalespersonPersonID`.
- Order and order line details are accurate and complete.

#### **Example Output**

| EmployeeID | EmployeeName | AverageOrderValue |
| --- | --- | --- |
| 10 | Alice Johnson | $1,500 |
| 20 | Bob Smith | $2,000 |
| 30 | Carol Davis | $1,750 |

In [None]:
use WideWorldImporters;

WITH OrderTotals AS (
    SELECT 
        o.OrderID,
        o.SalespersonPersonID AS EmployeeID,
        SUM(ol.Quantity * ol.UnitPrice) AS TotalOrderValue
    FROM Sales.Orders AS o
    INNER JOIN Sales.OrderLines AS ol
        ON o.OrderID = ol.OrderID
    GROUP BY 
        o.OrderID,
        o.SalespersonPersonID
)
SELECT 
    e.PersonID AS EmployeeID,
    e.FullName AS EmployeeName,
    AVG(ot.TotalOrderValue) AS AverageOrderValue
FROM OrderTotals AS ot
INNER JOIN Application.People AS e
    ON ot.EmployeeID = e.PersonID
GROUP BY 
    e.PersonID,
    e.FullName,
    e.PreferredName
ORDER BY 
    AverageOrderValue DESC;


## **Proposition 4**

### **Query Name**

**Average Quantity Ordered per Product**

### **Description**

This query calculates the average quantity ordered for each product. It uses aggregate functions and grouping to compute the averages.

### **Inputs**

- **StockItemID**: The unique identifier for each stock item.
- **Quantity**: The number of units ordered.

### **Outputs**

- **StockItemID**: The unique identifier for the product.
- **StockItemName**: The name of the product.
- **AverageQuantity**: The average quantity ordered per order line.

### **Steps**

1. **Calculate Average Quantity per Product**
    - Group order lines by `StockItemID` and calculate the average `Quantity`.
2. **Retrieve Product Names**
    - Join with `Warehouse.StockItems` to get product names.
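
One detail worth checking for step 1: in SQL Server, `AVG` over an integer expression returns an integer, so any fractional part is truncated. The sketch below uses two made-up quantities to show the difference, assuming `Quantity` is an integer column. Multiplying by `1.0` is one common way to force a decimal result.

```sql
-- Hypothetical quantities for a single product.
SELECT
    AVG(v.Quantity)       AS IntegerAverage,  -- integer result: 12
    AVG(v.Quantity * 1.0) AS DecimalAverage   -- decimal result: 12.5 (displayed with trailing zeros)
FROM (VALUES (10), (15)) AS v (Quantity);
```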

### **Assumptions**

- All order lines are recorded in `Sales.OrderLines`.
- Product information is accurate and up-to-date.

### **Example Output**

| StockItemID | StockItemName | AverageQuantity |
| --- | --- | --- |
| 101 | Widget A | 15 |
| 102 | Widget B | 20 |

In [None]:
use WideWorldImporters;
SELECT 
    si.StockItemID,
    si.StockItemName,
    AVG(ol.Quantity * 1.0) AS AverageQuantity  -- * 1.0 avoids integer truncation of the average
FROM Sales.OrderLines AS ol
INNER JOIN Warehouse.StockItems AS si
    ON ol.StockItemID = si.StockItemID
GROUP BY 
    si.StockItemID,
    si.StockItemName
ORDER BY AverageQuantity DESC;



### **Proposition 5**

#### **Query Name**

Monthly Sales Growth Percentage

#### **Description**

This query calculates the month-over-month sales growth percentage. It uses window functions and CTEs to compare sales of consecutive months.

#### **Inputs**

- **OrderDate**: The date when the order was placed.
- **Quantity**: The number of units sold.
- **UnitPrice**: The price per unit.

#### **Outputs**

- **YearMonth**: The year and month of the sales data.
- **TotalSales**: The total sales amount for the month.
- **GrowthPercentage**: The percentage change compared to the previous month.

#### **Steps**

1. **Calculate Monthly Sales**
    
    - Group orders by `YEAR(OrderDate)` and `MONTH(OrderDate)`.
    - Sum `Quantity * UnitPrice` to get `TotalSales` per month.
2. **Calculate Growth Percentage**
    
    - Use `LAG()` window function to access `TotalSales` from the previous month.
    - Compute `(CurrentMonthSales - PreviousMonthSales) / PreviousMonthSales * 100`.
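
The two steps above can be sketched on a few rows of made-up data. The monthly totals below are hypothetical and do not come from WideWorldImporters. The `NULLIF` guard is an addition for illustration: it returns NULL instead of raising a divide-by-zero error if a previous month had no sales.

```sql
-- Hypothetical monthly totals; not taken from WideWorldImporters.
WITH ToyMonthlySales AS (
    SELECT v.YearMonth, v.TotalSales
    FROM (VALUES
        ('2020-01', 100000.00),
        ('2020-02', 110000.00),
        ('2020-03', 121000.00)
    ) AS v (YearMonth, TotalSales)
), WithPrevious AS (
    SELECT
        YearMonth,
        TotalSales,
        -- Previous row's total; NULL for the first month.
        LAG(TotalSales) OVER (ORDER BY YearMonth) AS PreviousMonthSales
    FROM ToyMonthlySales
)
SELECT
    YearMonth,
    TotalSales,
    PreviousMonthSales,
    -- NULLIF turns a zero denominator into NULL, so the result is NULL rather than an error.
    (TotalSales - PreviousMonthSales) / NULLIF(PreviousMonthSales, 0) * 100
        AS GrowthPercentage
FROM WithPrevious
ORDER BY YearMonth;
```

The growth is NULL for the first month and 10 percent for each of the following two months.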

#### **Assumptions**

- Sales data spans multiple months and years.
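- There are no gaps in monthly data; `LAG()` compares each month with the previous row, so a missing month would silently be skipped over.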

#### **Example Output**

| YearMonth | TotalSales | GrowthPercentage |
| --- | --- | --- |
| 2020-01 | $100,000 | NULL |
| 2020-02 | $110,000 | 10% |
| 2020-03 | $121,000 | 10% |

In [None]:
use WideWorldImporters;

WITH MonthlySales AS (
    SELECT 
        CAST(DATEFROMPARTS(YEAR(OrderDate), MONTH(OrderDate), 1) AS DATE) AS YearMonth,
        SUM(ol.Quantity * ol.UnitPrice) AS TotalSales
    FROM Sales.Orders AS o
    INNER JOIN Sales.OrderLines AS ol
        ON o.OrderID = ol.OrderID
    GROUP BY 
        CAST(DATEFROMPARTS(YEAR(OrderDate), MONTH(OrderDate), 1) AS DATE)
)
SELECT 
    ms.YearMonth,
    ms.TotalSales,
    LAG(ms.TotalSales) OVER (ORDER BY ms.YearMonth) AS PreviousMonthSales,
    CASE 
        WHEN LAG(ms.TotalSales) OVER (ORDER BY ms.YearMonth) IS NULL THEN NULL
        ELSE 
            ((ms.TotalSales - LAG(ms.TotalSales) OVER (ORDER BY ms.YearMonth)) 
            / LAG(ms.TotalSales) OVER (ORDER BY ms.YearMonth)) * 100
    END AS GrowthPercentage
FROM MonthlySales AS ms
ORDER BY ms.YearMonth;


## **Proposition 6**

### **Query Name**

**Products Sold Below Average Price**

### **Description**

This query finds products that have been sold below their average selling price. It uses subqueries to calculate the average unit price per product and identifies sales where the unit price was less than this average.

### **Inputs**

- **StockItemID**: Unique identifier for each product.
- **UnitPrice**: Price at which the product was sold.
- **Quantity**: Number of units sold.

### **Outputs**

- **StockItemID**: Unique identifier for the product.
- **StockItemName**: Name of the product.
- **OrderID**: Order in which the product was sold below average price.
- **UnitPrice**: Unit price at which the product was sold.
- **AverageUnitPrice**: Average unit price of the product.

### **Steps**

1. **Calculate Average Unit Price per Product**
    - Use a subquery to compute the average unit price for each product.
2. **Identify Sales Below Average Price**
    - Select order lines where `UnitPrice` is less than the product's average unit price.
3. **Retrieve Product and Order Details**
    - Join with `Warehouse.StockItems` to get product names.

### **Assumptions**

- All sales data is stored in `Sales.OrderLines`.
- Product information is available in `Warehouse.StockItems`.
- Average prices are calculated based on historical sales data.

### **Example Output**

| StockItemID | StockItemName | OrderID | UnitPrice | AverageUnitPrice |
| --- | --- | --- | --- | --- |
| 101 | Widget A | 1001 | $9.00 | $10.00 |
| 102 | Widget B | 1002 | $8.50 | $9.00 |

In [None]:
use WideWorldImporters;

SELECT 
    ol.StockItemID,
    si.StockItemName,
    ol.OrderID,
    ol.UnitPrice,
    avg_price.AverageUnitPrice
FROM Sales.OrderLines AS ol
INNER JOIN Warehouse.StockItems AS si
    ON ol.StockItemID = si.StockItemID
INNER JOIN (
    SELECT 
        StockItemID,
        AVG(UnitPrice) AS AverageUnitPrice
    FROM Sales.OrderLines
    GROUP BY StockItemID
) AS avg_price
    ON ol.StockItemID = avg_price.StockItemID
WHERE ol.UnitPrice < avg_price.AverageUnitPrice
ORDER BY ol.StockItemID, ol.OrderID;



## **Proposition 7**

### **Query Name**

**Employees with Above-Average Sales**

### **Description**

This query identifies employees whose total sales exceed the company's average sales per employee. It uses subqueries to calculate the average sales and compares each employee's sales against it.

### **Inputs**

- **EmployeeID**: Unique identifier for each employee.
- **OrderID**: Unique identifier for each order.
- **Quantity**: Number of units sold.
- **UnitPrice**: Price per unit.

### **Outputs**

- **EmployeeID**: Unique identifier for the employee.
- **EmployeeName**: Name of the employee.
- **TotalSales**: Total sales amount by the employee.
- **AverageSales**: Company's average sales per employee.

### **Steps**

1. **Calculate Total Sales per Employee**
    - Sum `Quantity * UnitPrice` for each employee.
2. **Calculate Company Average Sales per Employee**
    - Use a subquery to find the average of total sales per employee.
3. **Compare Employee Sales to Average**
    - Select employees where their total sales exceed the average.
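
The comparison in steps 2 and 3 can be sketched on made-up totals. The employee IDs and amounts below are hypothetical and do not come from WideWorldImporters. The same scalar subquery appears twice: once to display the average and once to filter on it.

```sql
-- Hypothetical per-employee totals; not taken from WideWorldImporters.
WITH ToyEmployeeSales AS (
    SELECT v.EmployeeID, v.TotalSales
    FROM (VALUES
        (10, 500000.00),
        (20, 450000.00),
        (30, 100000.00)
    ) AS v (EmployeeID, TotalSales)
)
SELECT
    es.EmployeeID,
    es.TotalSales,
    -- Scalar subquery: a single value repeated on every output row.
    (SELECT AVG(TotalSales) FROM ToyEmployeeSales) AS AverageSales
FROM ToyEmployeeSales AS es
WHERE es.TotalSales > (SELECT AVG(TotalSales) FROM ToyEmployeeSales)
ORDER BY es.TotalSales DESC;
```

The average of the three totals is 350,000, so employees 10 and 20 are returned and employee 30 is filtered out.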

### **Assumptions**

- Employee data is available in `Application.People`.
- Sales data is accurate and up-to-date.
- Each employee is uniquely identified by `PersonID`.

### **Example Output**

| EmployeeID | EmployeeName | TotalSales | AverageSales |
| --- | --- | --- | --- |
| 10 | Alice Johnson | $500,000 | $350,000 |
| 20 | Bob Smith | $450,000 | $350,000 |

In [None]:
use WideWorldImporters;

WITH EmployeeSales AS (
    SELECT 
        o.SalespersonPersonID AS EmployeeID,
        SUM(ol.Quantity * ol.UnitPrice) AS TotalSales
    FROM Sales.Orders AS o
    INNER JOIN Sales.OrderLines AS ol
        ON o.OrderID = ol.OrderID
    GROUP BY o.SalespersonPersonID
)
SELECT 
    es.EmployeeID,
    p.FullName AS EmployeeName,
    es.TotalSales,
    (SELECT AVG(TotalSales) FROM EmployeeSales) AS AverageSales
FROM EmployeeSales AS es
INNER JOIN Application.People AS p
    ON es.EmployeeID = p.PersonID
WHERE es.TotalSales > (SELECT AVG(TotalSales) FROM EmployeeSales)
ORDER BY es.TotalSales DESC;



### **Proposition 8:**

**Retrieve recent orders using derived tables.**

#### **Functional Specification**

**Query Name**

Customers' Most Recent Orders

**Description**

This query lists the most recent order for each customer by using a derived table that contains the maximum order date per customer. It then joins this derived table back to the `Sales.Orders` table to get full order details.

**Inputs**

- **CustomerID**: The unique identifier for each customer.
- **OrderID**: The unique identifier for each order.
- **OrderDate**: The date when the order was placed.

**Outputs**

- **CustomerID**: The unique identifier for the customer.
- **OrderID**: The unique identifier for the order.
- **OrderDate**: The date of the order.

**Steps**

1. Create a derived table that selects `CustomerID` and their maximum `OrderDate`.
2. Join this derived table with the `Sales.Orders` table on `CustomerID` and `OrderDate`.

**Assumptions**

- The data is available in the `Sales.Orders` table.
- Customers may have multiple orders on the same maximum date; in that case the query returns every one of them.
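
How the join behaves when a customer has more than one order on the latest date can be seen on a few made-up rows. The customer IDs, order IDs, and dates below are hypothetical and do not come from WideWorldImporters.

```sql
-- Hypothetical orders; customer 201 has two orders on its latest date.
WITH ToyOrders AS (
    SELECT v.CustomerID, v.OrderID, CAST(v.OrderDate AS date) AS OrderDate
    FROM (VALUES
        (201, 1400, '2023-09-15'),
        (201, 1500, '2023-10-01'),
        (201, 1501, '2023-10-01'),
        (202, 1505, '2023-10-02')
    ) AS v (CustomerID, OrderID, OrderDate)
)
SELECT o.CustomerID, o.OrderID, o.OrderDate
FROM ToyOrders AS o
INNER JOIN (
    -- Derived table: one row per customer with that customer's latest date.
    SELECT CustomerID, MAX(OrderDate) AS MaxOrderDate
    FROM ToyOrders
    GROUP BY CustomerID
) AS RecentOrders
    ON o.CustomerID = RecentOrders.CustomerID
    AND o.OrderDate = RecentOrders.MaxOrderDate
ORDER BY o.CustomerID, o.OrderID;
```

Customer 201 appears twice (orders 1500 and 1501) and customer 202 once (order 1505); the older order 1400 is dropped.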

**Example Output**

| CustomerID | OrderID | OrderDate |
| --- | --- | --- |
| 201 | 1500 | 2023-10-01 |
| 202 | 1505 | 2023-10-02 |

In [None]:
use WideWorldImporters;

SELECT 
    o.CustomerID,
    o.OrderID,
    o.OrderDate
FROM Sales.Orders AS o
INNER JOIN (
    SELECT 
        CustomerID, 
        MAX(OrderDate) AS MaxOrderDate
    FROM Sales.Orders
    GROUP BY CustomerID
) AS RecentOrders
    ON o.CustomerID = RecentOrders.CustomerID 
    AND o.OrderDate = RecentOrders.MaxOrderDate
ORDER BY o.CustomerID;


## **Proposition 9:**

### **Query Name**

**Customers with Above Average Order Frequency**

### **Description**

This query identifies customers who have placed orders more frequently than the average customer within the last year. It uses a subquery to calculate the average order count and compares each customer's order count against it.

### **Inputs**

- **CustomerID**: Unique identifier for each customer.
- **OrderDate**: Date when the order was placed.

### **Outputs**

- **CustomerID**: Unique identifier for the customer.
- **CustomerName**: Name of the customer.
- **OrderCount**: Total number of orders placed by the customer.
- **AverageOrderCount**: Average number of orders per customer.

### **Steps**

1. **Define Time Frame**
    
    - Consider orders placed in the year leading up to a reference date. The query uses the fixed date `2016-06-01` rather than the current date.
2. **Calculate Order Count per Customer**
    
    - Count the number of orders for each customer within the time frame.
3. **Calculate Average Order Count**
    
    - Compute the average number of orders per customer.
4. **Identify Customers Above Average**
    
    - Select customers whose order count exceeds the average.
5. **Retrieve Customer Details**
    
    - Join with `Sales.Customers` to get customer names.
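
When the time frame in step 1 is anchored to a fixed date, that date has to be written as a quoted literal. Unquoted, T-SQL parses `2016-06-01` as integer subtraction. A minimal sketch of the difference follows; the explicit `CAST` to `date` is one way to keep the literal independent of the session's date format settings.

```sql
SELECT
    2016-06-01 AS UnquotedValue,  -- integer arithmetic: 2016 - 6 - 1 = 2009
    DATEADD(YEAR, -1, CAST('2016-06-01' AS date)) AS QuotedCutoff;  -- one year before 1 June 2016
```

An integer used where a date is expected is implicitly converted to a `datetime` counted in days from 1900-01-01, so an unquoted cutoff would land in the early 1900s and the filter would match every order.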

### **Assumptions**

- Customer data is stored in `Sales.Customers`.
- Order data is up-to-date.

### **Example Output**

| CustomerID | CustomerName | OrderCount | AverageOrderCount |
| --- | --- | --- | --- |
| 301 | Acme Corp | 15 | 8 |
| 402 | Beta Inc | 12 | 8 |

In [None]:
use WideWorldImporters;

WITH CustomerOrders AS (
    SELECT 
        o.CustomerID,
        COUNT(*) AS OrderCount
    FROM Sales.Orders AS o
    WHERE o.OrderDate >= DATEADD(YEAR, -1, CAST('2016-06-01' AS date))  -- quoted literal; unquoted it is integer arithmetic
    GROUP BY o.CustomerID
)
SELECT 
    co.CustomerID,
    c.CustomerName,
    co.OrderCount,
    (SELECT AVG(OrderCount * 1.0) FROM CustomerOrders) AS AverageOrderCount
FROM CustomerOrders AS co
INNER JOIN Sales.Customers AS c
    ON co.CustomerID = c.CustomerID
WHERE co.OrderCount > (SELECT AVG(OrderCount * 1.0) FROM CustomerOrders)
ORDER BY co.OrderCount DESC;


## **Proposition 10:**

### **Query Name**

**Orders with the Largest Number of Line Items**

### **Description**

This query finds orders that have the highest number of line items, indicating complex or large orders. It uses grouping and ordering to identify these orders.

### **Inputs**

- **OrderID**: Unique identifier for each order.
- **OrderLineID**: Unique identifier for each order line.

### **Outputs**

- **OrderID**: Unique identifier for the order.
- **CustomerID**: Customer who placed the order.
- **OrderDate**: Date when the order was placed.
- **LineItemCount**: Number of line items in the order.

### **Steps**

1. **Calculate Line Item Count per Order**
    
    - Count the number of `OrderLineID`s for each `OrderID`.
2. **Retrieve Order Details**
    
    - Join with `Sales.Orders` to get `CustomerID` and `OrderDate`.
3. **Filter and Order Results**
    
    - Keep orders with at least 5 line items and sort them by `LineItemCount` in descending order, then by `OrderDate` in descending order.

### **Assumptions**

- Orders have at least one line item.
- There are orders with varying numbers of line items.

### **Example Output**

| OrderID | CustomerID | OrderDate | LineItemCount |
| --- | --- | --- | --- |
| 1001 | 301 | 2021-07-15 | 15 |
| 1002 | 402 | 2021-07-16 | 12 |
| 1003 | 503 | 2021-07-17 | 10 |

In [None]:
use WideWorldImporters;

WITH OrderLineCounts AS (
    SELECT 
        ol.OrderID,
        COUNT(ol.OrderLineID) AS LineItemCount
    FROM Sales.OrderLines AS ol
    GROUP BY ol.OrderID
)
SELECT 
    o.OrderID,
    o.CustomerID,
    o.OrderDate,
    olc.LineItemCount
FROM OrderLineCounts AS olc
INNER JOIN Sales.Orders AS o
    ON olc.OrderID = o.OrderID
WHERE olc.LineItemCount >= 5
ORDER BY olc.LineItemCount DESC, o.OrderDate DESC;
