# CASE STUDY #1 - DANNY'S DINER 

<p align="center">
<img src="https://8weeksqlchallenge.com/images/case-study-designs/1.png" style="border-radius:10%" width="30%" height="30%">
</p>

## Problem Statement
---
Danny seriously loves Japanese food, so at the beginning of 2021 he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen.
Danny wants to use the data to answer a few simple questions about his customers. He plans on using these insights to help him decide whether he should expand the existing customer loyalty program.


Danny has provided you with a sample of his overall customer data due to privacy issues - but he hopes that these examples are enough for you to write fully functioning SQL queries to help him answer his questions!

## Import Modules and Create the Connection
---

In [1]:
import pandas as pd
import duckdb
from datetime import datetime, timedelta

In [2]:
con = duckdb.connect(database=':memory:', read_only=False) # -> Create the DB connection

## Dataset
---
Danny has shared with you 3 key datasets for this case study:
* sales
* menu
* members

The entity relationship diagram is shown below:
<p align="center">
<img src="../Image/entity_relationship_diagram_case_study_1.png">
</p>

## Table 1: sales

The sales table captures all ```customer_id``` level purchases with a corresponding ```order_date``` and ```product_id```, recording when each order was placed and which menu item was ordered.

In [3]:
SQL1 = '''
CREATE TABLE sales(
  "customer_id" VARCHAR(1),
  "order_date" DATE,
  "product_id" INTEGER
);
INSERT INTO sales
  ("customer_id", "order_date", "product_id")
VALUES
  ('A', '2021-01-01', '1'),
  ('A', '2021-01-01', '2'),
  ('A', '2021-01-07', '2'),
  ('A', '2021-01-10', '3'),
  ('A', '2021-01-11', '3'),
  ('A', '2021-01-11', '3'),
  ('B', '2021-01-01', '2'),
  ('B', '2021-01-02', '2'),
  ('B', '2021-01-04', '1'),
  ('B', '2021-01-11', '1'),
  ('B', '2021-01-16', '3'),
  ('B', '2021-02-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-07', '3');
  '''
con.execute(SQL1).fetch_df()

Unnamed: 0,Count
0,15


In [4]:
SQL2 = '''
SELECT * FROM sales
'''
con.execute(SQL2).fetch_df()

Unnamed: 0,customer_id,order_date,product_id
0,A,2021-01-01,1
1,A,2021-01-01,2
2,A,2021-01-07,2
3,A,2021-01-10,3
4,A,2021-01-11,3
5,A,2021-01-11,3
6,B,2021-01-01,2
7,B,2021-01-02,2
8,B,2021-01-04,1
9,B,2021-01-11,1


## Table 2: menu
The menu table maps the product_id to the actual product_name and price of each menu item.

In [5]:
SQL3 = '''
CREATE TABLE menu (
  "product_id" INTEGER,
  "product_name" VARCHAR(5),
  "price" INTEGER
);

INSERT INTO menu
  ("product_id", "product_name", "price")
VALUES
  ('1', 'sushi', '10'),
  ('2', 'curry', '15'),
  ('3', 'ramen', '12');
'''
con.execute(SQL3).fetch_df()

Unnamed: 0,Count
0,3


In [6]:
SQL4 = '''
SELECT * FROM menu 
'''
con.execute(SQL4).fetch_df()

Unnamed: 0,product_id,product_name,price
0,1,sushi,10
1,2,curry,15
2,3,ramen,12


## Table 3: members
The final members table captures the join_date when a customer_id joined the beta version of the Danny’s Diner loyalty program.

In [7]:
SQL5='''
CREATE TABLE members (
  "customer_id" VARCHAR(1),
  "join_date" DATE
);

INSERT INTO members
  ("customer_id", "join_date")
VALUES
  ('A', '2021-01-07'),
  ('B', '2021-01-09');
'''
con.execute(SQL5).fetch_df()

Unnamed: 0,Count
0,2


In [8]:
SQL6='''
SELECT * FROM members
'''
con.execute(SQL6).fetch_df()

Unnamed: 0,customer_id,join_date
0,A,2021-01-07
1,B,2021-01-09


## Case Study Questions
---
Each of the following case study questions can be answered using a single SQL statement:

1. What is the total amount each customer spent at the restaurant?
2. How many days has each customer visited the restaurant?
3. What was the first item from the menu purchased by each customer?
4. What is the most purchased item on the menu and how many times was it purchased by all customers?
5. Which item was the most popular for each customer?
6. Which item was purchased first by the customer after they became a member?
7. Which item was purchased just before the customer became a member?
8. What is the total items and amount spent for each member before they became a member?
9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?
10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?

## Solution
---

### 1. What is the total amount each customer spent at the restaurant?

In [9]:
SQL7='''
SELECT
    sales.customer_id,
    SUM(menu.price) AS total_spent
FROM sales
JOIN menu
    ON sales.product_id = menu.product_id
GROUP BY sales.customer_id
ORDER BY sales.customer_id
'''
con.execute(SQL7).fetch_df()

Unnamed: 0,customer_id,total_spent
0,A,76.0
1,B,74.0
2,C,36.0


**Answer-01**:

- Customer A spent $76
- Customer B spent $74
- Customer C spent $36
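
As a sanity check, the same totals can be reproduced in plain Python without the database. This is only a sketch: it hard-codes the rows from the `sales` and `menu` tables shown earlier, and the names `prices` and `total_spent` are illustrative, not part of the notebook.

```python
from collections import defaultdict

# Rows copied from the sales table: (customer_id, order_date, product_id)
sales = [
    ("A", "2021-01-01", 1), ("A", "2021-01-01", 2), ("A", "2021-01-07", 2),
    ("A", "2021-01-10", 3), ("A", "2021-01-11", 3), ("A", "2021-01-11", 3),
    ("B", "2021-01-01", 2), ("B", "2021-01-02", 2), ("B", "2021-01-04", 1),
    ("B", "2021-01-11", 1), ("B", "2021-01-16", 3), ("B", "2021-02-01", 3),
    ("C", "2021-01-01", 3), ("C", "2021-01-01", 3), ("C", "2021-01-07", 3),
]
# product_id -> price, copied from the menu table
prices = {1: 10, 2: 15, 3: 12}

# Equivalent of JOIN menu ... SUM(price) ... GROUP BY customer_id
total_spent = defaultdict(int)
for customer_id, _order_date, product_id in sales:
    total_spent[customer_id] += prices[product_id]

for customer_id in sorted(total_spent):
    print(customer_id, total_spent[customer_id])
```

The loop looks up each price (the join) and accumulates it per customer (the grouped sum), which mirrors what the SQL statement does.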

### 2. How many days has each customer visited the restaurant?

In [10]:
SQL8='''
SELECT customer_id, COUNT(DISTINCT(order_date)) AS no_of_visit
FROM sales
GROUP BY customer_id
'''
con.execute(SQL8).fetch_df()

Unnamed: 0,customer_id,no_of_visit
0,A,4
1,B,6
2,C,2


**Answer-02**:

- Customer A has visited 4 days
- Customer B has visited 6 days
- Customer C has visited 2 days

### 3. What was the first item from the menu purchased by each customer?

In [11]:
SQL9='''
WITH rank AS(
    SELECT 
    s.order_date,
    s.customer_id, m.product_name,
    RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date ASC) AS rank
    FROM sales s
    JOIN menu m
    ON s.product_id =m.product_id)
SELECT customer_id, product_name
FROM rank
WHERE rank =1
GROUP BY customer_id, product_name
'''
con.execute(SQL9).fetch_df()

Unnamed: 0,customer_id,product_name
0,A,sushi
1,A,curry
2,B,curry
3,C,ramen


**Answer-03**:

First item purchased by:
- Customer A is sushi and curry (both were ordered on the same day, and the data has no timestamp to break the tie)
- Customer B is curry
- Customer C is ramen
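
`RANK()` gives the same rank to every row that ties on `order_date`, which is why customer A appears twice in the query output. The plain-Python sketch below imitates that behaviour on the hard-coded `sales` rows; the variable names (`names`, `first_items`) are illustrative assumptions, not part of the notebook.

```python
# Rows copied from the sales table: (customer_id, order_date, product_id)
sales = [
    ("A", "2021-01-01", 1), ("A", "2021-01-01", 2), ("A", "2021-01-07", 2),
    ("A", "2021-01-10", 3), ("A", "2021-01-11", 3), ("A", "2021-01-11", 3),
    ("B", "2021-01-01", 2), ("B", "2021-01-02", 2), ("B", "2021-01-04", 1),
    ("B", "2021-01-11", 1), ("B", "2021-01-16", 3), ("B", "2021-02-01", 3),
    ("C", "2021-01-01", 3), ("C", "2021-01-01", 3), ("C", "2021-01-07", 3),
]
# product_id -> product_name, copied from the menu table
names = {1: "sushi", 2: "curry", 3: "ramen"}

first_items = {}
for customer in sorted({c for c, _, _ in sales}):
    rows = [(d, names[p]) for c, d, p in sales if c == customer]
    # ISO-formatted date strings sort chronologically
    earliest = min(d for d, _ in rows)
    # Every row sharing the earliest date would receive RANK() = 1;
    # the set removes duplicates, like the GROUP BY in the query
    first_items[customer] = sorted({name for d, name in rows if d == earliest})

print(first_items)
```

If exactly one row per customer were required, `ROW_NUMBER()` would return one, but the choice between tied rows would be arbitrary without an additional ordering column.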

### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?

In [12]:
SQL10='''
SELECT m.product_name, COUNT(s.product_id) AS  times
FROM sales s
JOIN menu m
ON s.product_id = m.product_id
GROUP BY m.product_name
ORDER BY times DESC
LIMIT 1
'''
con.execute(SQL10).fetch_df()

Unnamed: 0,product_name,times
0,ramen,8


**Answer-04**:

- Most purchased item was ramen and it was ordered 8 times

### 5. Which item was the most popular for each customer?

In [13]:
SQL11='''
WITH rank_items AS
(
    SELECT s.customer_id, m.product_name, COUNT(s.product_id),
	RANK() OVER(PARTITION BY s.customer_id ORDER BY COUNT(s.product_id) DESC) AS rank
	FROM sales s
	JOIN menu m
    ON s.product_id = m.product_id
	GROUP BY s.customer_id, m.product_name
)
SELECT r.customer_id, r.product_name as popular_items
FROM rank_items r
WHERE rank = 1
'''
con.execute(SQL11).fetch_df()

Unnamed: 0,customer_id,popular_items
0,A,ramen
1,B,curry
2,B,sushi
3,B,ramen
4,C,ramen


**Answer-05**:

Most popular item for:
- Customer A was ramen
- Customer B was a tie between all three menu items
- Customer C was ramen
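
The three-way tie for customer B can be confirmed by counting purchases per customer and item. The following is a minimal plain-Python sketch over the hard-coded `sales` rows; `counts` and `favourites` are illustrative names, not part of the notebook.

```python
from collections import Counter

# Rows copied from the sales table: (customer_id, order_date, product_id)
sales = [
    ("A", "2021-01-01", 1), ("A", "2021-01-01", 2), ("A", "2021-01-07", 2),
    ("A", "2021-01-10", 3), ("A", "2021-01-11", 3), ("A", "2021-01-11", 3),
    ("B", "2021-01-01", 2), ("B", "2021-01-02", 2), ("B", "2021-01-04", 1),
    ("B", "2021-01-11", 1), ("B", "2021-01-16", 3), ("B", "2021-02-01", 3),
    ("C", "2021-01-01", 3), ("C", "2021-01-01", 3), ("C", "2021-01-07", 3),
]
names = {1: "sushi", 2: "curry", 3: "ramen"}

# Equivalent of COUNT(product_id) GROUP BY customer_id, product_name
counts = Counter((c, names[p]) for c, _, p in sales)

favourites = {}
for customer in sorted({c for c, _ in counts}):
    per_item = {name: n for (c, name), n in counts.items() if c == customer}
    top = max(per_item.values())
    # Keep every item that reaches the top count, as RANK() = 1 does
    favourites[customer] = sorted(n for n, k in per_item.items() if k == top)

print(favourites)
```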

### 6. Which item was purchased first by the customer after they became a member?

In [14]:
SQL12='''
WITH rank_items AS
(
    SELECT s.customer_id, s.order_date, mem.join_date ,m.product_name,
    RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date ASC) AS rank
    FROM sales s
    JOIN menu m
    ON s.product_id = m.product_id
    JOIN members mem
    ON s.customer_id = mem.customer_id
    WHERE s.order_date > mem.join_date
)
SELECT r.customer_id, r.product_name
FROM rank_items r
WHERE r.rank =1
'''
con.execute(SQL12).fetch_df()

Unnamed: 0,customer_id,product_name
0,A,ramen
1,B,sushi


**Answer-06**:

Only customers A and B are members. The first product ordered after becoming a member for:
- Customer A is ramen
- Customer B is sushi
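
This answer depends on a judgment call. The query uses `s.order_date > mem.join_date`, so purchases made on the join date itself are excluded. Because the data has no timestamps, it cannot tell whether customer A's order on 2021-01-07 happened before or after joining that day. The plain-Python sketch below compares both readings; the helper `first_item_after_join` is hypothetical and exists only for this illustration.

```python
# Rows copied from the sales table: (customer_id, order_date, product_id)
sales = [
    ("A", "2021-01-01", 1), ("A", "2021-01-01", 2), ("A", "2021-01-07", 2),
    ("A", "2021-01-10", 3), ("A", "2021-01-11", 3), ("A", "2021-01-11", 3),
    ("B", "2021-01-01", 2), ("B", "2021-01-02", 2), ("B", "2021-01-04", 1),
    ("B", "2021-01-11", 1), ("B", "2021-01-16", 3), ("B", "2021-02-01", 3),
    ("C", "2021-01-01", 3), ("C", "2021-01-01", 3), ("C", "2021-01-07", 3),
]
names = {1: "sushi", 2: "curry", 3: "ramen"}
members = {"A": "2021-01-07", "B": "2021-01-09"}  # customer_id -> join_date


def first_item_after_join(include_join_date):
    """Return the first item(s) each member bought after joining."""
    result = {}
    for customer, join_date in members.items():
        rows = [
            (d, names[p]) for c, d, p in sales
            if c == customer
            and (d >= join_date if include_join_date else d > join_date)
        ]
        earliest = min(d for d, _ in rows)
        result[customer] = sorted({n for d, n in rows if d == earliest})
    return result


print(first_item_after_join(include_join_date=False))  # matches the query
print(first_item_after_join(include_join_date=True))   # the >= reading
```

Under the inclusive reading, customer A's first item becomes curry, while customer B's result is unchanged.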

### 7. Which item was purchased just before the customer became a member?

In [15]:
SQL13='''
WITH rank_items AS
(
    SELECT s.customer_id, s.order_date, mem.join_date ,m.product_name,
    RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date DESC) AS rank
    FROM sales s
    JOIN menu m
    ON s.product_id = m.product_id
    JOIN members mem
    ON s.customer_id = mem.customer_id
    WHERE s.order_date < mem.join_date
)
SELECT r.customer_id, r.product_name
FROM rank_items r
WHERE r.rank =1
'''
con.execute(SQL13).fetch_df()

Unnamed: 0,customer_id,product_name
0,A,sushi
1,A,curry
2,B,sushi


**Answer-07**:

Last item purchased before becoming a member for:
- Customer A was curry and sushi (both items were purchased on the same day)
- Customer B was sushi


### 8. What is the total items and amount spent for each member before they became a member?

In [16]:
SQL14='''
SELECT s.customer_id, COUNT(m.product_name) AS count_items, SUM(m.price) AS total_price
FROM sales s
JOIN menu m
ON s.product_id = m.product_id
JOIN members mem
ON s.customer_id = mem.customer_id
WHERE s.order_date < mem.join_date
GROUP BY s.customer_id
'''
con.execute(SQL14).fetch_df()

Unnamed: 0,customer_id,count_items,total_price
0,A,2,25.0
1,B,3,40.0

**Answer-08**:

Before becoming a member:
- Customer A bought 2 items and spent $25
- Customer B bought 3 items and spent $40


### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?

In [17]:
SQL15='''
SELECT s.customer_id, 
SUM(CASE 
        WHEN m.product_name <> 'sushi' 
        THEN (m.price*10) 
        ELSE (m.price*20) 
    END ) AS total_points
FROM sales s 
JOIN menu m 
ON s.product_id = m.product_id
GROUP BY s.customer_id
'''
con.execute(SQL15).fetch_df()

Unnamed: 0,customer_id,total_points
0,A,860.0
1,B,940.0
2,C,360.0


In [18]:
SQL16='''
SELECT s.customer_id, SUM(IF(s.product_id = 1, m.price*20, m.price*10)) as total_points
FROM sales s
JOIN menu  m
ON s.product_id=m.product_id
GROUP BY s.customer_id
'''
con.execute(SQL16).fetch_df()

Unnamed: 0,customer_id,total_points
0,A,860.0
1,B,940.0
2,C,360.0


**Answer-09**:

- Customer A has 860 points
- Customer B has 940 points
- Customer C has 360 points
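
The points rule can also be worked through by hand: each item earns `10 * price` points, doubled for sushi. A plain-Python sketch of that arithmetic follows; the function `points_for` is a hypothetical helper, and the purchase lists are copied from the `sales` and `menu` tables.

```python
prices = {"sushi": 10, "curry": 15, "ramen": 12}

# Items bought by each customer, in order, from the sales table
purchases = {
    "A": ["sushi", "curry", "curry", "ramen", "ramen", "ramen"],
    "B": ["curry", "curry", "sushi", "sushi", "ramen", "ramen"],
    "C": ["ramen", "ramen", "ramen"],
}


def points_for(product_name):
    """10 points per $1 spent, with a 2x multiplier for sushi."""
    multiplier = 2 if product_name == "sushi" else 1
    return 10 * prices[product_name] * multiplier


total_points = {
    customer: sum(points_for(item) for item in items)
    for customer, items in purchases.items()
}
print(total_points)
```

For customer A this is 200 (one sushi) + 300 (two curries) + 360 (three ramen) = 860.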

### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?

In [19]:
SQL17 = '''
WITH points AS(
    SELECT me.customer_id, s.order_date, me.join_date, me.join_date + INTERVAL 6 DAY AS valid_date,
    m.product_name, m.price,
    SUM(CASE
        WHEN m.product_name = 'sushi' THEN 2 * 10 * m.price
        WHEN s.order_date BETWEEN me.join_date AND (me.join_date + INTERVAL 6 DAY) THEN 2 * 10 * m.price
        ELSE 10 * m.price
        END
        ) AS points   
    FROM sales s
    JOIN members me
    ON s.customer_id=me.customer_id
    JOIN menu m
    ON s.product_id = m.product_id
    WHERE s.order_date <= '2021-01-31'
    GROUP BY me.customer_id, s.order_date, me.join_date, me.join_date + INTERVAL 6 DAY, m.product_name, m.price
    )
SELECT customer_id, SUM(points) AS total_points
FROM points
GROUP BY customer_id
'''
con.execute(SQL17).fetch_df()

Unnamed: 0,customer_id,total_points
0,A,1370.0
1,B,820.0


**Answer-10**:

- Customer A has 1370 points
- Customer B has 820 points
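
These totals combine three rules: sushi always earns double, every item earns double from the join date through six days later, and only orders dated up to the end of January count. The plain-Python sketch below is a cross-check under those assumptions; the names `join_dates`, `END_OF_JANUARY` and `total_points` are illustrative, not part of the notebook.

```python
from collections import defaultdict
from datetime import date, timedelta

prices = {"sushi": 10, "curry": 15, "ramen": 12}
join_dates = {"A": date(2021, 1, 7), "B": date(2021, 1, 9)}

# Member purchases copied from the sales table, with product names resolved
sales = [
    ("A", date(2021, 1, 1), "sushi"), ("A", date(2021, 1, 1), "curry"),
    ("A", date(2021, 1, 7), "curry"), ("A", date(2021, 1, 10), "ramen"),
    ("A", date(2021, 1, 11), "ramen"), ("A", date(2021, 1, 11), "ramen"),
    ("B", date(2021, 1, 1), "curry"), ("B", date(2021, 1, 2), "curry"),
    ("B", date(2021, 1, 4), "sushi"), ("B", date(2021, 1, 11), "sushi"),
    ("B", date(2021, 1, 16), "ramen"), ("B", date(2021, 2, 1), "ramen"),
]

END_OF_JANUARY = date(2021, 1, 31)

total_points = defaultdict(int)
for customer_id, order_date, product_name in sales:
    if order_date > END_OF_JANUARY:
        continue  # February orders are out of scope
    join_date = join_dates[customer_id]
    # First week = join date plus the following six days, inclusive
    in_first_week = join_date <= order_date <= join_date + timedelta(days=6)
    multiplier = 2 if (product_name == "sushi" or in_first_week) else 1
    total_points[customer_id] += 10 * prices[product_name] * multiplier

print(dict(total_points))
```

Customer B's ramen order on 2021-02-01 is the only row dropped by the date filter, and customer B's only first-week purchase is sushi, which already earns double.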

## Bonus Questions
---

### Join All The Things
This query creates a basic combined table that Danny and his team can use to derive insights quickly, without having to join the underlying tables in SQL each time.

In [20]:
table1='''
SELECT s.customer_id, s.order_date, m.product_name, m.price, IF(s.order_date >=me.join_date, 'Y','N') AS member
FROM sales s
LEFT JOIN members me
ON s.customer_id=me.customer_id
LEFT JOIN menu m
ON s.product_id = m.product_id
'''
con.execute(table1).fetch_df()

Unnamed: 0,customer_id,order_date,product_name,price,member
0,A,2021-01-01,sushi,10,N
1,A,2021-01-01,curry,15,N
2,A,2021-01-07,curry,15,Y
3,A,2021-01-10,ramen,12,Y
4,A,2021-01-11,ramen,12,Y
5,A,2021-01-11,ramen,12,Y
6,B,2021-01-01,curry,15,N
7,B,2021-01-02,curry,15,N
8,B,2021-01-04,sushi,10,N
9,B,2021-01-11,sushi,10,Y


### Rank All The Things
Danny also needs a ranking of customer purchases. He does not need the ranking for non-member purchases, so the ranking is left as NULL for records from before a customer joined the loyalty program.

In [21]:
table2='''
WITH main_table AS(
    SELECT s.customer_id, s.order_date, m.product_name, m.price, IF(s.order_date >=me.join_date, 'Y', 'N') AS member
    FROM sales s
    LEFT JOIN members me
    ON s.customer_id=me.customer_id
    LEFT JOIN menu m
    ON s.product_id = m.product_id
)
SELECT *, 
CASE
WHEN member = 'N' then NULL
ELSE RANK () OVER(PARTITION BY customer_id, member ORDER BY order_date) END AS ranking
FROM main_table
'''
con.execute(table2).fetch_df()

Unnamed: 0,customer_id,order_date,product_name,price,member,ranking
0,B,2021-01-11,sushi,10,Y,1.0
1,B,2021-01-16,ramen,12,Y,2.0
2,B,2021-02-01,ramen,12,Y,3.0
3,B,2021-01-01,curry,15,N,
4,B,2021-01-02,curry,15,N,
5,B,2021-01-04,sushi,10,N,
6,C,2021-01-01,ramen,12,N,
7,C,2021-01-01,ramen,12,N,
8,C,2021-01-07,ramen,12,N,
9,A,2021-01-01,sushi,10,N,
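
The output above lists only the first rows. As a rough cross-check of the logic, the member flag and the `RANK()` over member purchases can be imitated in plain Python. This is a sketch on the hard-coded table rows, not the notebook's method; `member_flag` and `ranked` are hypothetical names.

```python
# Rows copied from the sales table: (customer_id, order_date, product_id)
sales = [
    ("A", "2021-01-01", 1), ("A", "2021-01-01", 2), ("A", "2021-01-07", 2),
    ("A", "2021-01-10", 3), ("A", "2021-01-11", 3), ("A", "2021-01-11", 3),
    ("B", "2021-01-01", 2), ("B", "2021-01-02", 2), ("B", "2021-01-04", 1),
    ("B", "2021-01-11", 1), ("B", "2021-01-16", 3), ("B", "2021-02-01", 3),
    ("C", "2021-01-01", 3), ("C", "2021-01-01", 3), ("C", "2021-01-07", 3),
]
names = {1: "sushi", 2: "curry", 3: "ramen"}
members = {"A": "2021-01-07", "B": "2021-01-09"}  # customer_id -> join_date


def member_flag(customer_id, order_date):
    """'Y' on or after the join date; 'N' otherwise, including non-members."""
    join_date = members.get(customer_id)
    return "Y" if join_date is not None and order_date >= join_date else "N"


rows = [(c, d, names[p], member_flag(c, d)) for c, d, p in sales]

ranked = []
for c, d, name, flag in rows:
    if flag == "N":
        ranking = None  # NULL for purchases made before joining
    else:
        # RANK(): 1 + number of member rows for this customer with an earlier date
        ranking = 1 + sum(
            1 for c2, d2, _, f2 in rows if c2 == c and f2 == "Y" and d2 < d
        )
    ranked.append((c, d, name, flag, ranking))

for row in ranked:
    print(row)
```

Customer A's two ramen orders on 2021-01-11 share rank 3 because `RANK()` does not break ties, and customer C never joins, so every one of C's rows stays unranked.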
