<img src = "https://8weeksqlchallenge.com/images/case-study-designs/1.png">

## Introduction

Danny seriously loves Japanese food, so at the beginning of 2021 he decides to embark upon a risky venture and opens a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen.

Danny’s Diner needs your help to stay afloat. The restaurant has captured some very basic data from its first few months of operation but has no idea how to use that data to help run the business.

## Problem Statement

Danny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.

He plans on using these insights to help him decide whether he should expand the existing customer loyalty program. Additionally, he needs help generating some basic datasets so his team can easily inspect the data without needing to use SQL.

Due to privacy issues, Danny has provided you with only a sample of his overall customer data, but he hopes that these examples are enough for you to write fully functioning SQL queries to help him answer his questions!

Danny has shared with you 3 key datasets for this case study:

- sales
- menu
- members

## Entity Relationship Diagram

<img src = "https://user-images.githubusercontent.com/81607668/127271130-dca9aedd-4ca9-4ed8-b6ec-1e1920dca4a8.png">

In [1]:
import sqlite3
import pandas as pd

In [2]:
database = './data/DannysDiner.sqlite'

In [3]:
connection = sqlite3.connect(database)

In [4]:
pd.read_sql("SELECT * FROM sales", connection)

Unnamed: 0,customer_id,order_date,product_id
0,A,2019-01-02,2
1,A,2020-05-03,3
2,A,2020-11-06,3
3,A,2021-01-03,1
4,A,2021-09-02,2
5,A,2022-05-03,3
6,B,2019-05-02,3
7,B,2020-01-03,2
8,B,2021-04-02,2
9,B,2022-01-03,2


In [5]:
pd.read_sql("SELECT * FROM sales_adjusted", connection)

Unnamed: 0,customer_id,order_date,product_id,valid_date
0,A,2019-01-02,2,2019-01-08
1,A,2020-05-03,3,2020-05-09
2,A,2020-11-06,3,2020-11-12
3,A,2021-01-03,1,2021-01-09
4,A,2021-09-02,2,2021-09-08
5,A,2022-05-03,3,2022-05-09
6,B,2019-05-02,3,2019-05-08
7,B,2020-01-03,2,2020-01-09
8,B,2021-04-02,2,2021-04-08
9,B,2022-01-03,2,2022-01-09


In [6]:
pd.read_sql("SELECT * FROM menu", connection)

Unnamed: 0,product_id,product_name,price
0,1,sushi,10
1,2,curry,15
2,3,ramen,12


In [7]:
pd.read_sql("SELECT * FROM members", connection)

Unnamed: 0,customer_id,join_date
0,A,2020-01-06
1,B,2020-01-07
2,C,2020-01-08


## Case Study Questions

### 1. What is the total amount each customer spent at the restaurant?

In [8]:
pd.read_sql("""

            SELECT s.customer_id AS Customer, SUM(m.price) AS "Total Amount Spent"
            FROM sales s
            INNER JOIN menu m
            ON s.product_id = m.product_id
            GROUP BY Customer
            
            """,connection)

Unnamed: 0,Customer,Total Amount Spent
0,A,76
1,B,79
2,C,49


### 2. How many days has each customer visited the restaurant?

In [9]:
pd.read_sql("""

            SELECT customer_id AS Customer, COUNT(DISTINCT(order_date)) AS "Total Days"
            FROM sales
            GROUP BY customer_id
            
            """,connection)

Unnamed: 0,Customer,Total Days
0,A,6
1,B,6
2,C,4


### 3. What was the first item from the menu purchased by each customer?

In [10]:
pd.read_sql("""

            WITH first_items_purchased AS
            (SELECT s.customer_id AS Customer, s.order_date AS Order_Date, m.product_name AS Product,
            DENSE_RANK() OVER (PARTITION BY s.customer_id ORDER BY s.order_date) AS rank
            FROM sales s
            INNER JOIN menu m
            ON s.product_id = m.product_id
            GROUP BY customer_id, order_date)
            
            SELECT Customer, Product
            FROM first_items_purchased
            WHERE rank = 1
            
            """,connection)

Unnamed: 0,Customer,Product
0,A,curry
1,B,ramen
2,C,curry
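One caveat about the query above: the `GROUP BY customer_id, order_date` inside the CTE collapses each customer's day into a single row, so if a customer ordered several items on their first day only one of them would be reported (SQLite takes bare columns in an aggregate query from an arbitrarily chosen row of the group). The sketch below uses hypothetical rows, not Danny's data, to show one possible alternative: drop the `GROUP BY` and use `SELECT DISTINCT` so that every first-day item is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu (product_id INTEGER, product_name TEXT, price INTEGER);
    INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);

    CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
    -- Hypothetical rows: customer X orders two different items on day one.
    INSERT INTO sales VALUES
        ('X', '2021-01-01', 1),
        ('X', '2021-01-01', 2),
        ('X', '2021-01-05', 3),
        ('Y', '2021-01-02', 3);
""")

query = """
    WITH ranked AS (
        SELECT s.customer_id AS Customer,
               m.product_name AS Product,
               DENSE_RANK() OVER (
                   PARTITION BY s.customer_id ORDER BY s.order_date
               ) AS rnk
        FROM sales AS s
        INNER JOIN menu AS m ON s.product_id = m.product_id
    )
    -- DISTINCT removes duplicates if the same item was ordered twice that day.
    SELECT DISTINCT Customer, Product
    FROM ranked
    WHERE rnk = 1
    ORDER BY Customer, Product
"""
first_items = conn.execute(query).fetchall()
print(first_items)
```

With these rows, customer X keeps both first-day items instead of one. Whether the same change alters the result on Danny's data depends on whether any customer has more than one item on their first order date.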


### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?

In [11]:
pd.read_sql("""

            SELECT m.product_name AS Item, COUNT(s.product_id) AS "Times Purchased"
            FROM sales s
            INNER JOIN menu m
            ON s.product_id = m.product_id
            GROUP BY s.product_id,Item
            ORDER BY "Times Purchased" DESC
            LIMIT 1
            
            """,connection)

Unnamed: 0,Item,Times Purchased
0,ramen,7


### 5. Which item was the most popular for each customer?

In [12]:
pd.read_sql("""

            WITH most_popular_item AS 
            (SELECT s.customer_id AS Customer, m.product_name AS Product, COUNT(m.product_id) AS Orders,
            DENSE_RANK() OVER(PARTITION BY s.customer_id ORDER BY COUNT(s.customer_id) DESC) AS rank
            FROM menu m
            INNER JOIN sales s
            ON m.product_id = s.product_id
            GROUP BY Customer, Product)
            
            SELECT Customer, Product, Orders
            FROM most_popular_item
            WHERE rank = 1
            
            """,connection)

Unnamed: 0,Customer,Product,Orders
0,A,ramen,3
1,B,curry,3
2,C,ramen,2


### 6. Which item was purchased first by the customer after they became a member?

In [13]:
pd.read_sql("""
            
            WITH first_purchase AS
            (SELECT s.customer_id AS Customer, m.join_date AS Joining_Date, s.order_date AS Order_Date, 
            s.product_id, me.product_name AS Product,
            DENSE_RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date) AS rank
            FROM sales AS s
            INNER JOIN members AS m
            ON s.customer_id = m.customer_id
            INNER JOIN menu AS me
            ON s.product_id = me.product_id
            WHERE s.order_date >= m.join_date)
            
            SELECT Customer, Joining_Date, Order_Date, Product
            FROM first_purchase
            WHERE rank=1
            
            """,connection)

Unnamed: 0,Customer,Joining_Date,Order_Date,Product
0,A,2020-01-06,2020-05-03,ramen
1,B,2020-01-07,2021-04-02,curry
2,C,2020-01-08,2020-02-02,ramen


### 7. Which item was purchased just before the customer became a member?

In [14]:
pd.read_sql("""
            
            WITH purchase_cte AS
            (SELECT s.customer_id AS Customer, m.join_date AS Joining_Date, s.order_date AS Order_Date, 
            s.product_id, me.product_name AS Product,
            DENSE_RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date DESC) AS rank
            FROM sales AS s
            INNER JOIN members AS m
            ON s.customer_id = m.customer_id
            INNER JOIN menu AS me
            ON s.product_id = me.product_id
            WHERE s.order_date < m.join_date)
            
            SELECT Customer, Order_Date, Product
            FROM purchase_cte
            WHERE rank = 1
            
            """,connection)

Unnamed: 0,Customer,Order_Date,Product
0,A,2019-01-02,curry
1,B,2020-01-03,curry
2,C,2019-03-07,curry


### 8. What are the total items and amount spent for each member before they became a member?

In [15]:
pd.read_sql("""
            
            SELECT s.customer_id AS Customer, COUNT(DISTINCT(s.product_id)) AS "Unique Menu Items", 
            SUM(me.price) AS "Total Amount Spent ($)"
            FROM sales s
            INNER JOIN members m 
            ON s.customer_id = m.customer_id
            INNER JOIN menu me
            ON s.product_id = me.product_id
            WHERE s.order_date < m.join_date
            GROUP BY Customer
            
            """,connection)

Unnamed: 0,Customer,Unique Menu Items,Total Amount Spent ($)
0,A,1,15
1,B,2,27
2,C,1,15
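"Total items" can be read in two ways: the number of items bought, or the number of distinct menu items. The query above reports the second. The sketch below, on hypothetical rows rather than Danny's data, puts `COUNT(*)` and `COUNT(DISTINCT ...)` side by side to show where the two readings diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu (product_id INTEGER, product_name TEXT, price INTEGER);
    INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);

    CREATE TABLE members (customer_id TEXT, join_date TEXT);
    INSERT INTO members VALUES ('X', '2021-01-10');

    CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
    -- Hypothetical rows: curry twice and ramen once before joining.
    INSERT INTO sales VALUES
        ('X', '2021-01-01', 2),
        ('X', '2021-01-03', 2),
        ('X', '2021-01-05', 3),
        ('X', '2021-01-12', 1);
""")

row = conn.execute("""
    SELECT s.customer_id,
           COUNT(*)                     AS items_bought,
           COUNT(DISTINCT s.product_id) AS unique_items,
           SUM(me.price)                AS amount_spent
    FROM sales AS s
    INNER JOIN members AS m ON s.customer_id = m.customer_id
    INNER JOIN menu AS me ON s.product_id = me.product_id
    WHERE s.order_date < m.join_date
    GROUP BY s.customer_id
""").fetchone()
print(row)
```

Here three items were bought but only two distinct products, so the choice of count matters whenever a customer repeats an order before joining.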


### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?

In [16]:
pd.read_sql("""

            WITH price_points AS
            (
               SELECT *, 
                  CASE
                     WHEN m.product_id = 1 THEN m.price * 20
                     ELSE m.price * 10
                  END AS Points
               FROM menu m
               INNER JOIN sales AS s
               ON m.product_id = s.product_id
            )
            
            SELECT customer_id AS Customer, SUM(Points) AS "Total Points"
            FROM price_points
            GROUP BY Customer
            
            """,connection)

Unnamed: 0,Customer,Total Points
0,A,860
1,B,890
2,C,590
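The points rule can be checked by hand for customer A. The sketch below assumes that the six rows displayed for customer A in the `sales` output are all of that customer's purchases (which is consistent with the six visit days and the total of 76 reported earlier) and uses the prices from the `menu` output. The helper `points_for` is a hypothetical name introduced only for this check.

```python
# Menu prices keyed by product_id, as shown in the menu table output.
PRICES = {1: 10, 2: 15, 3: 12}
SUSHI_ID = 1

# product_id values of customer A's six rows in the sales output.
customer_a_products = [2, 3, 3, 1, 2, 3]


def points_for(product_id: int) -> int:
    """10 points per $1 spent, doubled to 20 for sushi."""
    multiplier = 20 if product_id == SUSHI_ID else 10
    return PRICES[product_id] * multiplier


total_spent = sum(PRICES[p] for p in customer_a_products)
total_points = sum(points_for(p) for p in customer_a_products)
print(total_spent, total_points)
```

Both figures agree with the query results for customer A: 76 spent and 860 points.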


### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?

In [19]:
pd.read_sql("""

WITH dates AS 
(SELECT *, date('2023-01-31') AS Last_Date
FROM members 
),

    cte AS
    (SELECT d.customer_id, s.order_date, d.join_date, sa.valid_date, d.last_date, m.product_name, m.price,

       SUM(CASE
          WHEN m.product_name = 'sushi' THEN 2 * 10 * m.price
          WHEN s.order_date BETWEEN d.join_date AND sa.valid_date THEN 2 * 10 * m.price
          ELSE 10 * m.price
          END) AS points

    FROM dates AS d
    INNER JOIN sales AS s
       ON d.customer_id = s.customer_id
    INNER JOIN sales_adjusted sa
        ON s.customer_id = sa.customer_id
    JOIN menu AS m
       ON s.product_id = m.product_id
    WHERE s.order_date < d.last_date
    GROUP BY d.customer_id, s.order_date, d.join_date, sa.valid_date, d.last_date, m.product_name, m.price
    )

SELECT customer_id AS Customer, SUM(points) AS "Total Points"
FROM cte
WHERE Customer IN('A', 'B')
GROUP BY Customer

""", connection
)

Unnamed: 0,Customer,Total Points
0,A,6660
1,B,6510
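Two things in the query above are worth double-checking. First, `sales_adjusted` is joined on `customer_id` only, so every sale is paired with every `sales_adjusted` row of the same customer, which multiplies rows before the points are summed. Second, the bonus window runs to `valid_date`, which in the rows shown is the order date plus six days rather than the join date plus six days. The sketch below uses hypothetical rows and an assumed cutoff of `2021-01-31` to show the fan-out and one possible alternative that derives the first week from `join_date` with SQLite's `date()` modifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu (product_id INTEGER, product_name TEXT, price INTEGER);
    INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);

    CREATE TABLE members (customer_id TEXT, join_date TEXT);
    INSERT INTO members VALUES ('X', '2021-01-07');

    CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
    INSERT INTO sales VALUES
        ('X', '2021-01-01', 2),   -- before joining:        15 * 10 = 150
        ('X', '2021-01-07', 3),   -- join day, bonus week:  12 * 20 = 240
        ('X', '2021-01-13', 2),   -- last bonus day:        15 * 20 = 300
        ('X', '2021-01-20', 1),   -- sushi after the week:  10 * 20 = 200
        ('X', '2021-02-02', 3);   -- after the cutoff, ignored

    -- Mirrors the shape of sales_adjusted: valid_date = order_date + 6 days.
    CREATE TABLE sales_adjusted AS
        SELECT *, date(order_date, '+6 days') AS valid_date FROM sales;
""")

# Joining on customer_id alone pairs each sale with every sales_adjusted row.
fanned_out = conn.execute("""
    SELECT COUNT(*)
    FROM sales AS s
    INNER JOIN sales_adjusted AS sa ON s.customer_id = sa.customer_id
""").fetchone()[0]

# Alternative: compute the bonus week from join_date, without sales_adjusted.
points = conn.execute("""
    SELECT s.customer_id,
           SUM(CASE
                   WHEN m.product_name = 'sushi' THEN 20 * m.price
                   WHEN s.order_date BETWEEN me.join_date
                        AND date(me.join_date, '+6 days') THEN 20 * m.price
                   ELSE 10 * m.price
               END) AS total_points
    FROM sales AS s
    INNER JOIN members AS me ON s.customer_id = me.customer_id
    INNER JOIN menu AS m ON s.product_id = m.product_id
    WHERE s.order_date <= '2021-01-31'
    GROUP BY s.customer_id
""").fetchall()
print(fanned_out, points)
```

Five sales become 25 joined rows, which is the kind of inflation that would explain totals in the thousands. The alternative keeps one row per sale; the figures it would produce on Danny's data are not shown here because the full `sales` table is not displayed.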


## Bonus Questions

### Join All The Things - Recreate the table with: customer_id, order_date, product_name, price, member (Y/N)

In [20]:
pd.read_sql("""

SELECT s.customer_id AS Customer, s.order_date "Order Date", m.product_name AS Product, m.price AS Price,
   CASE
      WHEN me.join_date > s.order_date THEN 'N'
      WHEN me.join_date <= s.order_date THEN 'Y'
      ELSE 'N'
      END AS Member
FROM sales AS s
LEFT JOIN menu AS m
   ON s.product_id = m.product_id
LEFT JOIN members AS me
   ON s.customer_id = me.customer_id;
""", connection
)

Unnamed: 0,Customer,Order Date,Product,Price,Member
0,A,2019-01-02,curry,15,N
1,A,2020-05-03,ramen,12,Y
2,A,2020-11-06,ramen,12,Y
3,A,2021-01-03,sushi,10,Y
4,A,2021-09-02,curry,15,Y
5,A,2022-05-03,ramen,12,Y
6,B,2019-05-02,ramen,12,N
7,B,2020-01-03,curry,15,N
8,B,2021-04-02,curry,15,Y
9,B,2022-01-03,curry,15,Y


### Rank All The Things - Danny also requires further information about the `ranking` of customer products, but he purposely does not need the ranking for non-member purchases, so he expects null `ranking` values for records where the customer is not yet part of the loyalty program.

In [24]:
pd.read_sql("""

WITH summary AS 
(
   SELECT s.customer_id AS Customer, s.order_date AS Order_Date, m.product_name AS Product, m.price AS Price,
      CASE
      WHEN me.join_date > s.order_date THEN 'N'
      WHEN me.join_date <= s.order_date THEN 'Y'
      ELSE 'N' END AS Member
   FROM sales AS s
   LEFT JOIN menu AS m
      ON s.product_id = m.product_id
   LEFT JOIN members AS me
      ON s.customer_id = me.customer_id
)

SELECT *, CASE
   WHEN member = 'N' then NULL
   ELSE
      RANK () OVER(PARTITION BY Customer, Member
      ORDER BY Order_Date) END AS Ranking
FROM summary;

""",connection
)

Unnamed: 0,Customer,Order_Date,Product,Price,Member,Ranking
0,A,2019-01-02,curry,15,N,
1,A,2020-05-03,ramen,12,Y,1.0
2,A,2020-11-06,ramen,12,Y,2.0
3,A,2021-01-03,sushi,10,Y,3.0
4,A,2021-09-02,curry,15,Y,4.0
5,A,2022-05-03,ramen,12,Y,5.0
6,B,2019-05-02,ramen,12,N,
7,B,2020-01-03,curry,15,N,
8,B,2021-04-02,curry,15,Y,1.0
9,B,2022-01-03,curry,15,Y,2.0
