# Arbor Foods Trading Co.


## Load SQL extension for IPython and connect to database

The following lines of code will provide the ability to write SQL queries in the Jupyter Notebook:


In [None]:
# This command loads the sql extension for IPython
%load_ext sql

# This command establishes a connection to the Arbor Foods database using the PostgreSQL database system
%sql postgresql://your_username:your_password@localhost:5432/your_database_name

The following cell removes any views that were created during a previous run of this notebook, so that the notebook can be re-run from the top.


In [None]:
%%sql
DROP VIEW IF EXISTS customer_orders;
DROP VIEW IF EXISTS detailed_orders;
DROP VIEW IF EXISTS employee_orders;
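
A plain `DROP VIEW` raises an error when the view does not exist yet (for example, on the very first run), whereas `DROP VIEW IF EXISTS` does not. The following is a minimal sketch of that difference. It assumes Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server, and the table `t` and view `v` are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# A plain DROP VIEW fails when the view has not been created yet.
try:
    conn.execute("DROP VIEW v")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True

# DROP VIEW IF EXISTS is safe on a first run and on every re-run.
conn.execute("DROP VIEW IF EXISTS v")
conn.execute("CREATE TABLE t (x INTEGER)")          # hypothetical table
conn.execute("CREATE VIEW v AS SELECT x FROM t")    # hypothetical view
conn.execute("DROP VIEW IF EXISTS v")

remaining_views = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'"
).fetchall()
print(plain_drop_failed, remaining_views)
```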

## Getting to Know the Data


### List all tables and views

To obtain a list of all tables and views in the PostgreSQL database, the `information_schema.tables` view can be queried:


In [None]:
%%sql
SELECT table_name AS name,
       table_type AS type
  FROM information_schema.tables
 WHERE table_schema = 'public' AND table_type IN ('BASE TABLE', 'VIEW');

## Create Views

I'll be creating views that will help with the rest of the project.


### A view with order and customer information

First, the `orders` and `customers` tables are combined to get more detailed information about each order:


In [None]:
%%sql
CREATE VIEW customer_orders AS
SELECT o.order_id,
       c.company_name,
       c.customer_id, 
       c.contact_name,
       o.order_date
  FROM orders AS o
  JOIN customers AS c
    ON c.customer_id = o.customer_id;

The first 10 rows of the `customer_orders` view:

In [None]:
%%sql
SELECT *
  FROM customer_orders
 LIMIT 10;
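
As a small illustration of how a view built on a join behaves, the sketch below recreates a cut-down `customer_orders` view. It assumes Python's built-in `sqlite3` module in place of PostgreSQL, and the customer and order rows are invented for the example. It shows that a view stores the query rather than the rows, so orders inserted later appear in it automatically.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical rows, not the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, company_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id TEXT, order_date TEXT);
    INSERT INTO customers VALUES ('AAA', 'Alpha Ltd'), ('BBB', 'Beta GmbH');
    INSERT INTO orders VALUES (1, 'AAA', '2024-01-05'), (2, 'BBB', '2024-01-09');

    CREATE VIEW customer_orders AS
    SELECT o.order_id, c.company_name, c.customer_id, o.order_date
      FROM orders AS o
      JOIN customers AS c
        ON c.customer_id = o.customer_id;
""")

# The view is evaluated at query time, so this new order is picked up.
conn.execute("INSERT INTO orders VALUES (3, 'AAA', '2024-02-01')")

view_rows = conn.execute(
    "SELECT order_id, company_name FROM customer_orders ORDER BY order_id"
).fetchall()
print(view_rows)
```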

### A view with detailed order information


The next view will combine the `order_details`, `products`, and `orders` tables to get detailed order information.


In [None]:
%%sql
CREATE VIEW detailed_orders AS 
SELECT o.order_id,
       o.order_date,
       p.product_name,
       p.product_id,
       od.quantity,
       od.unit_price, 
       od.discount
  FROM orders AS o
  JOIN order_details AS od
    ON o.order_id = od.order_id
  JOIN products AS p
    ON od.product_id = p.product_id;

The first 10 rows of the newly created `detailed_orders` view:


In [None]:
%%sql
SELECT *
  FROM detailed_orders
 LIMIT 10;

### A view with employee and order information


Combining the `employees` and `orders` tables will provide information on which employee was responsible for each order.


In [None]:
%%sql
CREATE VIEW employee_orders AS
SELECT e.employee_id,
       e.first_name || ' ' || e.last_name AS employee_name,
       o.order_id, 
       o.order_date
  FROM employees AS e
  JOIN orders AS o
    ON e.employee_id = o.employee_id;

The first 10 rows of the `employee_orders` view:

In [None]:
%%sql
SELECT *
  FROM employee_orders
 LIMIT 10;

## Ranking Employee Sales Performance


Ranking employees by their total sales amount will allow management to recognize and reward top-performing employees and foster a culture of excellence within the organization. It will also help identify employees who might be struggling, so that management can offer the training or resources they need to improve.


The following query creates a Common Table Expression (CTE) that calculates the total sales for each employee using the `employee_orders` view and the `order_details` table. The main query then ranks each employee based on their total sales:


In [None]:
%%sql
WITH total_sales_by_employee AS(
  SELECT e.employee_id, 
         e.employee_name,
         ROUND(SUM(od.quantity * od.unit_price * (1-od.discount))::numeric,2) AS total_sales
    FROM employee_orders AS e
    JOIN order_details AS od
      ON e.order_id = od.order_id
   GROUP BY e.employee_id, e.employee_name
)

SELECT employee_id AS "Emp ID", 
       employee_name AS "Emp Name",
       total_sales AS "Total Sales",
       RANK() OVER(ORDER BY total_sales DESC) AS "Sales Rank"
  FROM total_sales_by_employee;

Based on the table above, `Margaret Peacock` is the top-ranked employee, with $232,890.85 in total sales.

Conversely, `Steven Buchanan` has the lowest total sales of all the employees, at $68,792.28.
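
To make the discounted-revenue formula and the behavior of `RANK()` concrete, here is a minimal sketch on a handful of invented rows. It assumes Python's built-in `sqlite3` module rather than PostgreSQL, and the employees and order lines are hypothetical. Two employees are deliberately tied to show that `RANK()` gives them the same rank and then skips the next one.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical rows, not the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_orders (employee_id INTEGER, employee_name TEXT, order_id INTEGER);
    CREATE TABLE order_details (order_id INTEGER, quantity INTEGER, unit_price REAL, discount REAL);
    INSERT INTO employee_orders VALUES
        (1, 'Ann', 10), (1, 'Ann', 11), (2, 'Bob', 12), (3, 'Cy', 13);
    INSERT INTO order_details VALUES
        (10, 2, 10.0, 0.0),   -- 2 * 10 * (1 - 0.0)  = 20
        (11, 4, 5.0, 0.5),    -- 4 * 5  * (1 - 0.5)  = 10
        (12, 3, 10.0, 0.0),   -- 3 * 10 * (1 - 0.0)  = 30
        (13, 1, 20.0, 0.25);  -- 1 * 20 * (1 - 0.25) = 15
""")

ranking = conn.execute("""
    WITH total_sales_by_employee AS (
        SELECT e.employee_id,
               e.employee_name,
               SUM(od.quantity * od.unit_price * (1 - od.discount)) AS total_sales
          FROM employee_orders AS e
          JOIN order_details AS od
            ON e.order_id = od.order_id
         GROUP BY e.employee_id, e.employee_name
    )
    SELECT employee_name,
           total_sales,
           RANK() OVER (ORDER BY total_sales DESC) AS sales_rank
      FROM total_sales_by_employee
     ORDER BY sales_rank, employee_name
""").fetchall()
print(ranking)
```

Ann and Bob both total 30, so they share rank 1 and Cy receives rank 3. If consecutive ranks were preferred, `DENSE_RANK()` would be the function to use instead.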


## Running Total of Monthly Sales


Creating a running total of sales by month will provide a macro-level perspective on the company's overall sales performance over time, which will help management identify trends that might shape the company's future strategies.


For this analysis task, the `orders` and `order_details` tables will be needed. Luckily, a view combining these tables was created earlier, the `detailed_orders` view.

The following query creates a CTE called `monthly_sales` that calculates the total sales per month using the `quantity`, `unit_price`, `discount`, and `order_date` columns from the `detailed_orders` view.

A second query uses the CTE to calculate a running total of total sales per month.


In [None]:
%%sql
WITH monthly_sales AS(
    SELECT DATE_TRUNC('month', order_date)::DATE AS month,
           ROUND(SUM(unit_price * quantity * (1 - discount))::numeric,2) AS total_sales
      FROM detailed_orders
     GROUP BY DATE_TRUNC('month', order_date)
)

SELECT month AS "Month",
       SUM(total_sales) OVER(ORDER BY month) AS "Running Total"
  FROM monthly_sales
 ORDER BY month;
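
The running-total pattern can be checked by hand on a few rows. The sketch below assumes Python's built-in `sqlite3` module instead of PostgreSQL, so `strftime('%Y-%m', ...)` replaces `DATE_TRUNC('month', ...)`, and the order rows are hypothetical. `SUM(...) OVER (ORDER BY month)` adds up every month up to and including the current one.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical rows, not the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE detailed_orders (order_date TEXT, quantity INTEGER, unit_price REAL, discount REAL);
    INSERT INTO detailed_orders VALUES
        ('2024-01-03', 1, 100.0, 0.0),   -- January:  100
        ('2024-01-20', 2, 50.0, 0.5),    -- January:   50
        ('2024-02-11', 1, 200.0, 0.0),   -- February: 200
        ('2024-03-07', 4, 25.0, 0.0);    -- March:    100
""")

running = conn.execute("""
    WITH monthly_sales AS (
        -- SQLite has no DATE_TRUNC, so the month is taken with strftime.
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(unit_price * quantity * (1 - discount)) AS total_sales
          FROM detailed_orders
         GROUP BY strftime('%Y-%m', order_date)
    )
    SELECT month,
           total_sales,
           SUM(total_sales) OVER (ORDER BY month) AS running_total
      FROM monthly_sales
     ORDER BY month
""").fetchall()
print(running)
```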


## Month-Over-Month Sales Growth

Analyzing the month-over-month sales growth rate will provide a better understanding of the rate at which sales are increasing or decreasing, and will help the management team to identify significant trends.

The following query will compare each month's sales with the previous month's, then calculate the percentage change in sales.

In [None]:
%%sql
WITH monthly_sales AS(
    SELECT EXTRACT(MONTH FROM order_date) AS month,
           EXTRACT(YEAR FROM order_date) AS year,
           ROUND(SUM(unit_price * quantity * (1 - discount))::numeric,2) AS total_sales
      FROM detailed_orders
     GROUP BY EXTRACT(MONTH FROM order_date), EXTRACT(YEAR FROM order_date)
),

previous_sales AS(
    SELECT month,
           year,
           total_sales,
           LAG(total_sales) OVER(ORDER BY year, month) AS previous_month_sales
      FROM monthly_sales
)

SELECT year AS "Year",
       month AS "Month",
       total_sales AS "Monthly Sales",
       previous_month_sales AS "Previous Month Sales",
       ROUND((total_sales / previous_month_sales - 1) * 100, 2) AS "Sales Growth Rate"
  FROM previous_sales
 ORDER BY year, month;
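
The growth-rate calculation is `(current / previous - 1) * 100`, with `LAG` supplying the previous month's value. The sketch below works through it on invented monthly totals, assuming Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The rows are inserted out of order and span a year boundary to show why the window is ordered by both `year` and `month`.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical monthly totals.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly_sales (year INTEGER, month INTEGER, total_sales REAL);
    INSERT INTO monthly_sales VALUES
        (2024, 1, 100.0), (2023, 12, 80.0), (2024, 3, 120.0), (2024, 2, 150.0);
""")

growth = conn.execute("""
    SELECT year,
           month,
           -- LAG returns NULL for the first month, so its growth rate is NULL too.
           ROUND((total_sales / LAG(total_sales) OVER (ORDER BY year, month) - 1) * 100, 2)
               AS growth_rate
      FROM monthly_sales
     ORDER BY year, month
""").fetchall()
print(growth)
```

December 2023 has no predecessor, so its growth rate is empty. January 2024 is compared with December 2023 (100 against 80, a 25% rise), which would not happen if the window were ordered by `month` alone.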

## Identifying High-Value Customers

Offering targeted promotions and special offers to customers with above-average order values could increase sales, improve customer retention, and attract new customers.

The following query joins the `customer_orders` view with the `order_details` table in a CTE called `customer_sales`, which calculates the sale amount for each order line item.

A second CTE called `labled_sales` uses the `customer_sales` CTE to attach each customer's average sale amount to every one of that customer's rows.

A third and final CTE called `above_avg_counts` counts the number of above-average purchases per customer.

The final query uses the `above_avg_counts` CTE to rank all of the customers based on how many above-average orders they have made. The final output has been truncated to show only the top 10 customers.

In [None]:
%%sql
WITH customer_sales AS(
    SELECT c.customer_id,
           (od.quantity * od.unit_price * (1 - od.discount)) AS sale_amount
      FROM customer_orders AS c
      JOIN order_details AS od
        ON c.order_id = od.order_id
),

labled_sales AS (
    SELECT customer_id,
           sale_amount,
           AVG(sale_amount) OVER(PARTITION BY customer_id) AS avg_sale
      FROM customer_sales
),

above_avg_counts AS (
    SELECT customer_id,
           COUNT(*) FILTER(WHERE sale_amount > avg_sale) AS above_avg_count
      FROM labled_sales
     GROUP BY customer_id
)

SELECT customer_id AS "Customer ID",
       above_avg_count AS "Above-Average Orders",
       RANK() OVER(ORDER BY above_avg_count DESC) AS "Rank"
  FROM above_avg_counts
 ORDER BY "Rank"
 LIMIT 10;

The output above shows the 10 customers with the largest number of above-average orders. The customer `SAVEA` is at the top of the list with 40 above-average orders.
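
The three-step logic (attach each customer's average to their rows, count the rows above it, then rank) can be followed on a tiny data set. This sketch assumes Python's built-in `sqlite3` module rather than PostgreSQL, uses hypothetical customers and amounts, and swaps `COUNT(*) FILTER (...)` for an equivalent `SUM(CASE ...)` so that it also runs on older SQLite builds.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical sale amounts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_sales (customer_id TEXT, sale_amount REAL);
    INSERT INTO customer_sales VALUES
        ('A', 10.0), ('A', 20.0), ('A', 60.0),   -- average 30, one sale above it
        ('B', 10.0), ('B', 40.0), ('B', 40.0),   -- average 30, two sales above it
        ('C', 5.0),  ('C', 5.0);                 -- average 5, no sale above it
""")

above_avg = conn.execute("""
    WITH labeled AS (
        SELECT customer_id,
               sale_amount,
               AVG(sale_amount) OVER (PARTITION BY customer_id) AS avg_sale
          FROM customer_sales
    ),
    counts AS (
        SELECT customer_id,
               SUM(CASE WHEN sale_amount > avg_sale THEN 1 ELSE 0 END) AS above_avg_count
          FROM labeled
         GROUP BY customer_id
    )
    SELECT customer_id,
           above_avg_count,
           RANK() OVER (ORDER BY above_avg_count DESC) AS sales_rank
      FROM counts
     ORDER BY sales_rank, customer_id
""").fetchall()
print(above_avg)
```

Note that each customer is compared against their own average, not the company-wide average, which matches the `PARTITION BY customer_id` in the query above.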

## Percentage of Sales for Each Category

By knowing the percentage of total sales for each product category, management will have better insights into which categories drive most of the company's sales. This will help guide decisions about inventory and marketing strategies.

The first query below creates a CTE called `sales_per_category`, which calculates the sale amount for each product category.

The second query outputs the IDs, names, and percentage of sales for each category.

In [None]:
%%sql
WITH sales_per_category AS (
    SELECT c.category_id,
           c.category_name,
           SUM(od.unit_price * od.quantity * (1 - od.discount)) AS sales_amount
      FROM categories AS c
      JOIN products AS p
        ON c.category_id = p.category_id
      JOIN order_details AS od
        ON p.product_id = od.product_id
    GROUP BY c.category_id
)

SELECT category_id AS "Category ID",
       category_name AS "Category Name",
       ROUND(((sales_amount / SUM(sales_amount) OVER()) * 100)::numeric, 1) AS "% Total Sales"
  FROM sales_per_category
 ORDER BY "% Total Sales" DESC;

From the table above, the `Beverages` category accounts for the largest share of the company's total sales, at about 21.2%. `Grains/Cereals` and `Produce` account for the smallest shares, at about 7.6% and 7.9% of all sales, respectively.
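
The percentage-of-total calculation divides each category's sales by a window sum taken over all rows. The sketch below assumes Python's built-in `sqlite3` module in place of PostgreSQL, with invented category totals. It computes the share twice, once with an empty `OVER ()` and once with an explicit `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` frame, to show that both cover the whole result set.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical category totals.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_per_category (category_name TEXT, sales_amount REAL);
    INSERT INTO sales_per_category VALUES
        ('Beverages', 50.0), ('Dairy', 30.0), ('Produce', 20.0);
""")

shares = conn.execute("""
    SELECT category_name,
           ROUND(sales_amount / SUM(sales_amount) OVER () * 100, 1) AS pct_short,
           ROUND(sales_amount / SUM(sales_amount) OVER (
                     ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                 ) * 100, 1) AS pct_explicit
      FROM sales_per_category
     ORDER BY pct_short DESC
""").fetchall()
print(shares)
```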

## Top Products Per Category

The final objective will be to provide management with a list of the top three items sold in each product category, which will allow them to identify top performers and to ensure these products are kept in stock.

The first query below creates a CTE called `product_sales` that calculates the total sales for each product.

The second query outputs the top three products from each category based on their total sales.

In [None]:
%%sql
WITH product_sales AS (
    SELECT p.product_name,
           p.product_id,
           p.category_id,
           ROUND((SUM(od.unit_price * od.quantity * (1 - od.discount)))::numeric, 2) AS total_sales
      FROM products AS p
      JOIN order_details AS od
        ON p.product_id = od.product_id
     GROUP BY p.category_id, p.product_id
)

SELECT category_id AS "Category ID",
       product_id AS "Product ID",
       product_name AS "Product Name",
       total_sales AS "Total Sales"
  FROM (SELECT category_id,
               product_id,
               product_name,
               total_sales,
               ROW_NUMBER() OVER(PARTITION BY category_id ORDER BY total_sales DESC) AS row_num
          FROM product_sales
       ) AS tmp
 WHERE row_num <= 3
 ORDER BY category_id, total_sales DESC;
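
The top-N-per-group pattern used above numbers the rows inside each category and then keeps only the first three. Here is a minimal sketch of it, assuming Python's built-in `sqlite3` module instead of PostgreSQL and a hypothetical `product_sales` table with made-up totals.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical product totals.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_sales (category_id INTEGER, product_name TEXT, total_sales REAL);
    INSERT INTO product_sales VALUES
        (1, 'a', 50.0), (1, 'b', 40.0), (1, 'c', 30.0), (1, 'd', 20.0),
        (2, 'e', 10.0), (2, 'f', 5.0);
""")

top3 = conn.execute("""
    SELECT category_id, product_name, total_sales
      FROM (SELECT category_id,
                   product_name,
                   total_sales,
                   -- Numbering restarts at 1 within each category.
                   ROW_NUMBER() OVER (PARTITION BY category_id
                                      ORDER BY total_sales DESC) AS row_num
              FROM product_sales
           ) AS tmp
     WHERE row_num <= 3
     ORDER BY category_id, row_num
""").fetchall()
print(top3)
```

Product `d` is dropped because it is fourth in category 1, while category 2 returns only two rows because it has only two products. `ROW_NUMBER()` breaks ties arbitrarily; `RANK()` would keep every product tied for third place.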