# Case Study 1: Diner Data Analysis
<img src='dinner_shop.png'>

## Problem Statement
### The diner's owner wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they've spent, and which menu items are their favourites.


### Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers. He offers a subscription program to his customers through which they can earn points that can be redeemed later for a discount on meals. In return, subscribed customers are asked to fill in some personal information on a survey form. He plans to use these insights to decide whether he should expand the existing customer loyalty program. He also needs help generating some basic datasets so his team can easily inspect the data without needing to use SQL.


### Assumptions:
### 1. Customers can buy a meal (product) before registering for the subscription program.
### 2. A customer may buy more than one meal on the same day (for example, when inviting a friend over for dinner).

## ERD


### The data set contains the following three tables. Refer to the relationship diagram below to see how they are connected.

#### - sales
#### - members
#### - menu

<img src='ERD.PNG'>

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

import psycopg2

import warnings
warnings.filterwarnings("ignore")

In [2]:
hostname = 'localhost'
database = 'SQL_Challenge'
username = 'postgres'
pwd = 'Seriously!'
port_id = 5432

In [3]:
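# Open the connection. If this fails, conn stays None and the error is printed,
# so later cells fail clearly instead of raising a NameError.
conn = None
try:
    conn = psycopg2.connect(host=hostname,
                            dbname=database,
                            user=username,
                            password=pwd,
                            port=port_id)
except Exception as error:
    print(error)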

### 1. What is the total amount each customer spent at the restaurant?

In [4]:
a1_script = """
SELECT s.customer_id AS customer, SUM(m.price) AS total_payment
FROM sales s
JOIN menu m
    ON s.product_id = m.product_id
GROUP BY 1
ORDER BY 1;
"""
a1_sql = pd.read_sql_query(a1_script,con=conn)

a1_sql

Unnamed: 0,customer,total_payment
0,A,76
1,B,74
2,C,36


### 2. How many days has each customer visited the restaurant?

In [5]:
a2_script = """
SELECT customer_id AS customer, COUNT(DISTINCT order_date) AS num_of_days
FROM sales
GROUP BY 1
ORDER BY 1;
"""
a2_sql = pd.read_sql_query(a2_script,con=conn)

a2_sql

Unnamed: 0,customer,num_of_days
0,A,4
1,B,6
2,C,2
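
Because a customer may buy more than one meal on the same day (Assumption 2), counting rows would overstate the number of visits, which is why the query counts distinct dates. The sketch below illustrates the difference on an in-memory SQLite stand-in. The rows and the customer labels `X` and `Y` are made up for illustration and are not the contents of the real `sales` table.

```python
import sqlite3

# Hypothetical rows for illustration only; not the real sales data.
rows = [
    ("X", "2021-01-01", 1),
    ("X", "2021-01-01", 2),  # second meal on the same day
    ("X", "2021-01-05", 3),
    ("Y", "2021-01-02", 2),
]

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER)")
demo.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# COUNT(*) counts orders; COUNT(DISTINCT order_date) counts visiting days.
visit_counts = demo.execute(
    """
    SELECT customer_id,
           COUNT(*) AS num_of_orders,
           COUNT(DISTINCT order_date) AS num_of_days
    FROM sales
    GROUP BY customer_id
    ORDER BY customer_id;
    """
).fetchall()
demo.close()

for customer, num_of_orders, num_of_days in visit_counts:
    print(customer, num_of_orders, num_of_days)
```

In this made-up data, customer `X` placed three orders but visited on only two days.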


### 3. Which item(s) from the menu did each customer purchase on their first day?

In [6]:
a3_script = """
SELECT DISTINCT customer_id, product_name
FROM (SELECT 
        s.customer_id, m.product_name, 
        s.order_date, 
        RANK() OVER (PARTITION BY s.customer_id ORDER BY s.order_date) AS rnk
      FROM sales s
      JOIN menu m
        ON s.product_id = m.product_id) AS t
WHERE rnk = 1;
"""
a3_sql = pd.read_sql_query(a3_script,con=conn)

a3_sql

Unnamed: 0,customer_id,product_name
0,A,curry
1,A,sushi
2,B,curry
3,C,ramen
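
The query relies on two details: `RANK()` gives every order on the earliest date the same rank, so all first-day items are kept, and `DISTINCT` collapses the same item bought twice on that day. A minimal sketch of this behaviour, assuming an in-memory SQLite build with window-function support (3.25 or later) and a hypothetical pre-joined `orders` table with made-up rows:

```python
import sqlite3

# Hypothetical rows for illustration only; "orders" is an assumed pre-joined table.
rows = [
    ("X", "2021-01-01", "sushi"),
    ("X", "2021-01-01", "curry"),  # same first day, different item
    ("X", "2021-01-03", "ramen"),
    ("Y", "2021-01-02", "ramen"),
    ("Y", "2021-01-02", "ramen"),  # same item twice on the first day
]

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, product_name TEXT)")
demo.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Rows sharing the earliest order_date all get rank 1; DISTINCT removes repeats.
first_day_items = demo.execute(
    """
    SELECT DISTINCT customer_id, product_name
    FROM (SELECT customer_id, product_name,
                 RANK() OVER (PARTITION BY customer_id ORDER BY order_date) AS rnk
          FROM orders) AS t
    WHERE rnk = 1
    ORDER BY customer_id, product_name;
    """
).fetchall()
demo.close()

print(first_day_items)
```

Customer `X` keeps both first-day items, while customer `Y` appears once even though ramen was ordered twice.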


### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?

In [7]:
a4_script = """
SELECT m.product_id, 
        m.product_name, 
        COUNT(*) AS purchase_count
FROM menu m
JOIN sales s
    ON m.product_id = s.product_id
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 1;
"""
a4_sql = pd.read_sql_query(a4_script,con=conn)

a4_sql


Unnamed: 0,product_id,product_name,purchase_count
0,3,ramen,8


### 5. Which item was the most popular for each customer?

In [8]:
a5_script = """
SELECT customer_id, product_name, product_count
FROM (
    SELECT s.customer_id, 
            m.product_name,
            COUNT(s.product_id) AS product_count,
            RANK() OVER(PARTITION BY customer_id ORDER BY COUNT(s.product_id) DESC) AS rnk
    FROM sales s
    JOIN menu m
        ON s.product_id  = m.product_id
    GROUP BY 1,2 ) t
WHERE rnk = 1;
"""
a5_sql = pd.read_sql_query(a5_script,con=conn)

a5_sql

Unnamed: 0,customer_id,product_name,product_count
0,A,ramen,3
1,B,curry,2
2,B,ramen,2
3,B,sushi,2
4,C,ramen,3
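
Customer B appears three times because `RANK()` keeps every item tied for the highest count. For readers who want to check this logic without SQL, the sketch below reproduces the same idea in plain Python. The helper name `most_popular_items` and the sample pairs are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (customer_id, product_name) pairs for illustration only.
purchases = [
    ("X", "ramen"), ("X", "ramen"), ("X", "curry"),
    ("Y", "sushi"), ("Y", "curry"),
]

def most_popular_items(pairs):
    """Return {customer: (sorted top items, count)}, keeping ties as RANK() does."""
    counts = defaultdict(Counter)
    for customer, product in pairs:
        counts[customer][product] += 1
    result = {}
    for customer, counter in counts.items():
        top = max(counter.values())
        tied = sorted(p for p, n in counter.items() if n == top)
        result[customer] = (tied, top)
    return result

favourites = most_popular_items(purchases)
print(favourites)
```

Here customer `Y` has two items tied at one purchase each, so both are reported.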


### 6. Which item was purchased first by the customer after the day they became a member?

In [9]:
a6_script = """
SELECT customer_id, join_date, order_date, product_id
FROM (
    SELECT c.customer_id, 
            c.join_date, 
            s.order_date, 
            s.product_id,
            RANK() OVER (PARTITION BY c.customer_id ORDER BY s.order_date) AS rnk
    FROM sales s
    JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date > c.join_date ) t
WHERE rnk = 1;
"""
a6_sql = pd.read_sql_query(a6_script,con=conn)

a6_sql

Unnamed: 0,customer_id,join_date,order_date,product_id
0,A,2021-01-07,2021-01-10,3
1,B,2021-01-09,2021-01-11,1
2,C,2021-01-01,2021-01-07,3


### 7. Which item was purchased just before the customer became a member?

In [10]:
a7_script = """
SELECT customer_id, join_date, order_date, product_id
FROM (
    SELECT c.customer_id, 
            c.join_date, 
            s.order_date, 
            s.product_id,
            RANK() OVER (PARTITION BY c.customer_id ORDER BY s.order_date) AS rnk
    FROM sales s
    JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date < c.join_date ) t
WHERE rnk = 1;
"""
a7_sql = pd.read_sql_query(a7_script,con=conn)

a7_sql

Unnamed: 0,customer_id,join_date,order_date,product_id
0,A,2021-01-07,2021-01-01,1
1,A,2021-01-07,2021-01-01,2
2,B,2021-01-09,2021-01-01,2
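
The sort direction inside the window decides which pre-join order gets rank 1: ascending `order_date` ranks the earliest order before the join date first, while descending ranks the most recent one first. If "just before" is read as the latest order dated earlier than the join date, the descending variant is the one that matches. The sketch below contrasts the two on an in-memory SQLite stand-in (window functions need SQLite 3.25 or later); all rows are made up for illustration.

```python
import sqlite3

# Hypothetical rows for illustration only.
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
    CREATE TABLE customers (customer_id TEXT, join_date TEXT);
    INSERT INTO customers VALUES ('X', '2021-01-09');
    INSERT INTO sales VALUES
        ('X', '2021-01-01', 2),
        ('X', '2021-01-04', 1),
        ('X', '2021-01-11', 3);
    """
)

# Same shape as the query above; only the sort direction is parameterised.
query = """
SELECT customer_id, order_date, product_id
FROM (SELECT c.customer_id, s.order_date, s.product_id,
             RANK() OVER (PARTITION BY c.customer_id
                          ORDER BY s.order_date {direction}) AS rnk
      FROM sales s
      JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date < c.join_date) t
WHERE rnk = 1;
"""

earliest_before_join = demo.execute(query.format(direction="ASC")).fetchall()
latest_before_join = demo.execute(query.format(direction="DESC")).fetchall()
demo.close()

print("ASC :", earliest_before_join)
print("DESC:", latest_before_join)
```

With these rows, the ascending rank picks the order from 2021-01-01 and the descending rank picks the one from 2021-01-04, the last order before the join date.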


### 8. What are the total items and amount spent by each member before they became a member?

In [11]:
a8_script = """
SELECT c.customer_id,
        COUNT(s.product_id) AS product_count,
        SUM(m.price) AS total_revenue
FROM sales s
JOIN customers c
    ON s.customer_id = c.customer_id AND s.order_date < c.join_date
JOIN menu m
    ON m.product_id = s.product_id
GROUP BY 1;
"""
a8_sql = pd.read_sql_query(a8_script,con=conn)

a8_sql

Unnamed: 0,customer_id,product_count,total_revenue
0,B,3,40
1,A,2,25


### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier, how many points would each customer have?

In [12]:
a9_script = """
SELECT customer_id, SUM(price*product_points) AS customer_point
FROM (
    SELECT  s.customer_id,
        s.order_date,
        s.product_id,
        m.product_name,
        m.price,
        CASE 
            WHEN m.product_name = 'sushi' THEN 20
            ELSE 10
        END AS product_points
    FROM sales s
    JOIN menu m
    ON s.product_id = m.product_id) t
GROUP BY 1;
"""
a9_sql = pd.read_sql_query(a9_script,con=conn)

a9_sql

Unnamed: 0,customer_id,customer_point
0,B,940
1,C,360
2,A,860
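
The `CASE` expression turns the rule into points per dollar: 20 for sushi (10 points doubled) and 10 for everything else. A small worked example of the same arithmetic in Python follows. The function name `points_for` and the basket prices are assumptions made for illustration.

```python
def points_for(product_name, price):
    """Points for one purchase: 10 per $1, doubled for sushi."""
    multiplier = 2 if product_name == "sushi" else 1
    return price * 10 * multiplier

# Hypothetical basket for illustration only; the prices are made up.
basket = [("sushi", 10), ("curry", 15), ("ramen", 12)]
total_points = sum(points_for(name, price) for name, price in basket)

print(total_points)  # 200 + 150 + 120 = 470
```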


### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi. How many points do customers A and B have at the end of January?

In [13]:
a10_script = """
SELECT customer_id, SUM(price*points)
FROM (
    SELECT c.customer_id, 
            c.join_date,
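            CAST((c.join_date + INTERVAL '6 day') AS date) AS end_period_date,
            s.order_date,
            m.product_name, 
            m.price,
            CASE 
                -- first week of membership (join date included): 2x points on every item
                WHEN s.order_date <= CAST((c.join_date + INTERVAL '6 day') AS date) AND s.order_date >= c.join_date THEN 20 
                -- after the first week: sushi keeps its 2x multiplier
                WHEN s.order_date > CAST((c.join_date + INTERVAL '6 day') AS date) AND m.product_name = 'sushi' THEN 20 
                -- after the first week: every other item earns the base rate
                WHEN s.order_date > CAST((c.join_date + INTERVAL '6 day') AS date) AND m.product_name != 'sushi' THEN 10 
                -- orders placed before the join date earn no points in this query
                ELSE 0 
            END AS points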
    FROM customers c
    JOIN sales s
        ON c.customer_id = s.customer_id
    JOIN menu m
        ON s.product_id = m.product_id
    WHERE s.order_date BETWEEN '2021-01-01'::date AND '2021-01-31'::date) t
WHERE customer_id IN ('A','B')
GROUP BY 1;
"""
a10_sql = pd.read_sql_query(a10_script,con=conn)

a10_sql

Unnamed: 0,customer_id,sum
0,A,1020
1,B,320
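
The date logic in the `CASE` expression can be checked outside the database. The sketch below mirrors it in Python: the first week runs from the join date through the join date plus six days, orders before the join date earn nothing (as in the query above), and only January orders are summed. The function name `points_for_order`, the join date, and the orders are hypothetical values chosen for illustration.

```python
from datetime import date, timedelta

def points_for_order(order_date, join_date, product_name, price):
    """Mirror the CASE expression: 0 before joining, 20 per $1 in the first week,
    then 20 per $1 for sushi and 10 per $1 for everything else."""
    end_period = join_date + timedelta(days=6)  # join date counts as day 1
    if order_date < join_date:
        rate = 0
    elif order_date <= end_period:
        rate = 20
    elif product_name == "sushi":
        rate = 20
    else:
        rate = 10
    return price * rate

# Hypothetical member and orders for illustration only.
join = date(2021, 1, 9)
orders = [
    (date(2021, 1, 4), "sushi", 10),   # before joining: 0
    (date(2021, 1, 9), "curry", 15),   # join date, first week: 300
    (date(2021, 1, 15), "ramen", 12),  # last day of first week: 240
    (date(2021, 1, 16), "ramen", 12),  # after first week: 120
    (date(2021, 1, 20), "sushi", 10),  # after first week, sushi: 200
    (date(2021, 2, 1), "ramen", 12),   # outside January: ignored
]

january_points = sum(
    points_for_order(d, join, name, price)
    for d, name, price in orders
    if date(2021, 1, 1) <= d <= date(2021, 1, 31)
)
print(january_points)  # 0 + 300 + 240 + 120 + 200 = 860
```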
