# Case Study 1: Diner Data Analysis
<img src='../assets/dinner_shop.png'>

## Problem Statement
### The owner of a diner wants to use his data to answer a few simple questions about his customers, especially their visiting patterns, how much money they have spent, and which menu items are their favourites.


### Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers. He offers a subscription program through which customers earn points that can later be redeemed for discounts on meals. In return, subscribed customers fill in some personal information in a survey form. He plans to use these insights to decide whether to expand the existing customer loyalty program. He also needs help generating some basic datasets so that his team can inspect the data easily without needing SQL.


### Assumptions:
### 1. Customers can buy a meal (product) before registering for the subscription program.
### 2. A customer may buy more than one meal on the same day (for example, when inviting a friend to the diner).

## ERD


### The data set contains the following 3 tables. Refer to the relationship diagram below to see how they are connected.

#### - sales
#### - members
#### - menu

<img src='../assets/ERD.PNG'>

In [1]:
import pandas as pd

import psycopg2

import warnings
warnings.filterwarnings("ignore")

In [2]:
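import os

hostname = 'localhost'
database = 'SQL_Challenge'
username = 'postgres'
# Read the password from the environment instead of hard-coding it in the notebook
pwd = os.environ.get('PGPASSWORD')
port_id = 5432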

In [3]:
try:
    conn = psycopg2.connect(host=hostname,
                            dbname=database,
                            user=username,
                            password=pwd,
                            port=port_id)
except Exception as error:
    # Report the failure, then re-raise so later cells do not run with conn undefined
    print(error)
    raise

In [4]:
cur = conn.cursor()

In [5]:
drop_customers_tbl = """
DROP TABLE IF EXISTS customers CASCADE
"""

cur.execute(drop_customers_tbl)
conn.commit()

In [6]:
make_customers_tbl = """
CREATE TABLE IF NOT EXISTS customers (
    customer_id SMALLINT PRIMARY KEY,
    join_date DATE,
    first_name VARCHAR(30),
    last_name VARCHAR(30),
    email VARCHAR(50));
"""

cur.execute(make_customers_tbl)
conn.commit()

In [7]:
# fill_customer_tbl = """
# INSERT INTO customers
#   (customer_id, join_date)
# VALUES
#   ('A', '2021-01-07'),
#   ('B', '2021-01-09'),
#   ('C', '2021-01-01');
# """
fill_customer_tbl = """
COPY customers
  (customer_id, join_date, first_name, last_name,email)
FROM 'C:\\Users\\ADMIN\\Desktop\\data\\customers_data.csv'
WITH (FORMAT CSV, HEADER, DELIMITER ',');
"""

cur.execute(fill_customer_tbl)
conn.commit()

In [8]:
drop_menu_tbl = """
DROP TABLE IF EXISTS menu CASCADE
"""

cur.execute(drop_menu_tbl)
conn.commit()

In [9]:
make_menu_tbl = """
CREATE TABLE IF NOT EXISTS menu (
    meal_id SMALLINT PRIMARY KEY,
    meal_name VARCHAR(30),
    price SMALLINT);

"""

cur.execute(make_menu_tbl)
conn.commit()

In [10]:
# fill_menu_tbl = """
# INSERT INTO menu
#   (meal_id, meal_name, price)
# VALUES
#   (1, 'sushi', 10),
#   (2, 'curry', 12),
#   (3, 'non alcoholic ramen', 14);
# """

fill_menu_tbl = """
COPY menu
  (meal_id, meal_name, price)
FROM 'C:\\Users\\ADMIN\\Desktop\\data\\menu_data.csv'
WITH (FORMAT CSV, HEADER, DELIMITER ',');
"""

cur.execute(fill_menu_tbl)
conn.commit()

In [11]:
drop_sales_tbl = """
DROP TABLE IF EXISTS sales CASCADE
"""
cur.execute(drop_sales_tbl)
conn.commit()

In [12]:
make_sales_tbl = """
CREATE TABLE sales (
  order_id SMALLINT,
  order_date DATE,
  customer_id SMALLINT REFERENCES customers(customer_id) ON DELETE CASCADE,
  meal_id SMALLINT REFERENCES menu(meal_id) ON DELETE CASCADE
);

"""

cur.execute(make_sales_tbl)
conn.commit()

In [13]:



fill_sales_tbl = """
COPY sales
  (order_id, order_date, customer_id, meal_id)
FROM 'C:\\Users\\ADMIN\\Desktop\\data\\sales_data.csv'
WITH (FORMAT CSV, HEADER, DELIMITER ',');
"""

cur.execute(fill_sales_tbl)
conn.commit()

In [14]:
customers = """
SELECT *
FROM  customers

"""
customers = pd.read_sql_query(customers,con=conn)

customers

Unnamed: 0,customer_id,join_date,first_name,last_name,email
0,1,2021-02-24,Aimee,Berrisford,aberrisford0@over-blog.com
1,2,2021-01-04,Ninon,Bloyes,nbloyes1@scientificamerican.com
2,3,2021-02-20,Luce,Evershed,levershed2@google.pl
3,4,2021-02-10,Blair,Cosby,bcosby3@dyndns.org
4,5,2021-02-22,Morey,Shatliffe,mshatliffe4@theguardian.com
5,6,2021-01-05,Leonerd,Stickels,lstickels5@europa.eu
6,7,2021-02-24,Imogene,Le Huquet,ilehuquet6@youtu.be
7,8,2021-03-08,Anthony,Thompson,athompson7@wikispaces.com
8,9,2021-01-10,Karie,Pessolt,kpessolt8@go.com
9,10,2021-01-05,Robina,Kesper,rkesper9@taobao.com


In [15]:
menu = """
SELECT *
FROM  menu

"""
menu = pd.read_sql_query(menu,con=conn)

menu


In [16]:
sales = """
SELECT *
FROM  sales

"""
sales = pd.read_sql_query(sales,con=conn)

sales

Unnamed: 0,order_id,order_date,customer_id,meal_id
0,1,2021-03-27,25,8
1,2,2021-03-03,14,7
2,3,2021-01-02,3,5
3,4,2021-02-24,8,4
4,5,2021-03-03,1,6
...,...,...,...,...
995,996,2021-03-11,10,5
996,997,2021-03-09,11,5
997,998,2021-03-03,20,4
998,999,2021-03-02,15,4


### 1. What is the total amount each customer spent at the restaurant?

In [17]:
a1_script = """
SELECT s.customer_id AS customer, SUM(m.price) AS total_payment
FROM  sales s
JOIN menu m
    ON s.meal_id = m.meal_id
GROUP BY 1
ORDER BY 1;
"""
a1_sql = pd.read_sql_query(a1_script,con=conn)

a1_sql

Unnamed: 0,customer,total_payment
0,1,906
1,2,766
2,3,696
3,4,612
4,5,906
5,6,714
6,7,978
7,8,902
8,9,818
9,10,806
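
As a cross-check that does not need the PostgreSQL server, the same join-and-sum can be sketched against an in-memory SQLite database. The rows below are made-up toy data, not the case-study tables, and `total_spent` is a hypothetical helper name.

```python
import sqlite3

# Toy rows (hypothetical, not the real case-study data).
MENU = [(1, "Sushi", 10), (2, "Chicken Curry", 12)]
SALES = [
    (1, "2021-01-01", 1, 1),
    (2, "2021-01-01", 1, 2),
    (3, "2021-01-02", 2, 2),
]


def total_spent(menu_rows, sales_rows):
    """Return {customer_id: summed price of every meal the customer ordered}."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE menu (meal_id INTEGER PRIMARY KEY, meal_name TEXT, price INTEGER)")
    con.execute("CREATE TABLE sales (order_id INTEGER, order_date TEXT, customer_id INTEGER, meal_id INTEGER)")
    con.executemany("INSERT INTO menu VALUES (?, ?, ?)", menu_rows)
    con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", sales_rows)
    rows = con.execute(
        """
        SELECT s.customer_id, SUM(m.price)
        FROM sales s
        JOIN menu m ON s.meal_id = m.meal_id
        GROUP BY s.customer_id
        ORDER BY s.customer_id
        """
    ).fetchall()
    con.close()
    return dict(rows)


totals = total_spent(MENU, SALES)
print(totals)
```

Customer 1 ordered both toy meals (10 + 12) and customer 2 ordered only the curry, so the totals can be verified by hand.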


### 2. How many days has each customer visited the restaurant?

In [18]:
a2_script = """
SELECT customer_id AS customer, COUNT(DISTINCT(order_date)) AS num_of_days
FROM sales
GROUP BY 1;
"""
a2_sql = pd.read_sql_query(a2_script,con=conn)

a2_sql

Unnamed: 0,customer,num_of_days
0,1,31
1,2,29
2,3,27
3,4,23
4,5,35
5,6,25
6,7,34
7,8,31
8,9,33
9,10,35
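
Because a customer may order several meals on the same day, visits have to be counted as distinct dates rather than as rows. A minimal plain-Python sketch of that idea, on made-up rows and with a hypothetical helper name `visit_days`:

```python
from collections import defaultdict


def visit_days(sales_rows):
    """Count distinct order dates per customer.

    sales_rows is an iterable of (customer_id, order_date) pairs.
    """
    days = defaultdict(set)
    for customer_id, order_date in sales_rows:
        days[customer_id].add(order_date)  # a set ignores repeated dates
    return {customer: len(dates) for customer, dates in days.items()}


# Toy rows (hypothetical): customer 1 orders twice on 2021-01-01.
rows = [
    (1, "2021-01-01"),
    (1, "2021-01-01"),
    (1, "2021-01-03"),
    (2, "2021-01-02"),
]
day_counts = visit_days(rows)
print(day_counts)
```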


### 3. Which item(s) from the menu did each customer purchase on their first day?

In [19]:
a3_script = """
SELECT  DISTINCT customer_id, meal_name
FROM (SELECT 
        s.customer_id, m.meal_name, 
        s.order_date, 
        RANK() OVER(PARTITION BY s.customer_id ORDER BY s.order_date) AS rnk
      FROM sales s
      JOIN menu m
        ON s.meal_id = m.meal_id ) AS t
WHERE rnk = 1;
"""
a3_sql = pd.read_sql_query(a3_script,con=conn)

a3_sql

Unnamed: 0,customer_id,meal_name
0,1,Non Alcoholic Ramen
1,2,Sushi
2,3,Honey Mustard Chicken
3,4,Sushi
4,5,Honey Mustard Chicken
5,6,Non Alcoholic Ramen
6,7,Moroccan Apricot Chicken
7,8,Non Alcoholic Ramen
8,9,Chicken Curry
9,9,Chicken Parmesan Pizza
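
Customer 9 appears twice above because `RANK()` gives every order placed on the first day the same rank. The sketch below uses made-up rows and SQLite's window functions (available in SQLite 3.25 and later) as a stand-in for PostgreSQL. It contrasts `RANK()` with `ROW_NUMBER()`, which would keep only one arbitrary row per customer. The helper name `first_day_rows` is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (customer_id INTEGER, order_date TEXT, meal_name TEXT)")
# Toy rows (hypothetical): two meals on the first day, one on a later day.
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (9, "2021-01-04", "Chicken Curry"),
        (9, "2021-01-04", "Chicken Parmesan Pizza"),
        (9, "2021-01-09", "Sushi"),
    ],
)


def first_day_rows(window_fn):
    """Return the rows ranked first per customer using the given window function."""
    query = f"""
        SELECT customer_id, meal_name
        FROM (SELECT customer_id, meal_name,
                     {window_fn}() OVER (PARTITION BY customer_id ORDER BY order_date) AS rnk
              FROM sales) AS t
        WHERE rnk = 1
        ORDER BY meal_name
    """
    return con.execute(query).fetchall()


with_rank = first_day_rows("RANK")              # ties on the first day are all kept
with_row_number = first_day_rows("ROW_NUMBER")  # exactly one row per customer
print(with_rank)
print(len(with_row_number))
```

`RANK()` fits the question here because it asks for the item(s) bought on the first day, so ties should be reported rather than dropped.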


### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?

In [20]:
a4_script = """
SELECT m.meal_id, 
        m.meal_name, 
        COUNT(*) AS purchase_count
FROM menu m
JOIN sales s
    ON m.meal_id = s.meal_id
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 1;
"""
a4_sql = pd.read_sql_query(a4_script,con=conn)

a4_sql


Unnamed: 0,meal_id,meal_name,purchase_count
0,8,Chicken Curry,134
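
Note that `LIMIT 1` returns a single row even when two meals tie for first place. A small plain-Python sketch, on made-up orders and with a hypothetical helper name `most_purchased`, shows one way to keep every tied winner:

```python
from collections import Counter


def most_purchased(meal_names):
    """Return every (meal_name, count) pair that shares the highest count."""
    counts = Counter(meal_names)
    if not counts:
        return []
    top = max(counts.values())
    return sorted((name, n) for name, n in counts.items() if n == top)


# Toy orders (hypothetical): two meals tie with two purchases each.
orders = ["Sushi", "Chicken Curry", "Sushi", "Chicken Curry", "Non Alcoholic Ramen"]
winners = most_purchased(orders)
print(winners)
```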


### 5. Which item was the most popular for each customer?

In [21]:
a5_script = """
SELECT customer_id, meal_name, product_count
FROM (
    SELECT s.customer_id, 
            m.meal_name,
            COUNT(s.meal_id) AS product_count,
            RANK() OVER(PARTITION BY customer_id ORDER BY COUNT(s.meal_id) DESC) AS rnk
    FROM sales s
    JOIN menu m
        ON s.meal_id  = m.meal_id
    GROUP BY 1,2 ) t
WHERE rnk = 1;
"""
a5_sql = pd.read_sql_query(a5_script,con=conn)

a5_sql

Unnamed: 0,customer_id,meal_name,product_count
0,1,Non Alcoholic Ramen,9
1,2,Chicken Curry,7
2,2,Beef & Blue Cheese Tart,7
3,3,Sushi,7
4,4,Chicken Parmesan Pizza,7
5,5,Non Alcoholic Ramen,8
6,6,Non Alcoholic Ramen,7
7,6,Chicken Curry,7
8,7,Chicken Curry,8
9,7,Chicken Parmesan Pizza,8


### 6. Which item was purchased first by the customer after the day they became a member?

In [22]:
a6_script = """
SELECT customer_id, join_date, order_date, meal_id
FROM (
    SELECT c.customer_id, 
            c.join_date, 
            s.order_date, 
            s.meal_id,
            RANK() OVER (PARTITION BY c.customer_id ORDER BY s.order_date) AS rnk
    FROM sales s
    JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date > c.join_date ) t
WHERE rnk =1;
"""
a6_sql = pd.read_sql_query(a6_script,con=conn)

a6_sql

Unnamed: 0,customer_id,join_date,order_date,meal_id
0,1,2021-02-24,2021-02-27,3
1,1,2021-02-24,2021-02-27,6
2,2,2021-01-04,2021-01-06,6
3,3,2021-02-20,2021-02-26,1
4,4,2021-02-10,2021-02-14,4
5,5,2021-02-22,2021-02-23,6
6,5,2021-02-22,2021-02-23,5
7,6,2021-01-05,2021-01-13,8
8,6,2021-01-05,2021-01-13,5
9,7,2021-02-24,2021-02-25,8


### 7. Which item was purchased just before the customer became a member?

In [23]:
a7_script = """
SELECT customer_id, join_date, order_date, meal_id
FROM (
    SELECT c.customer_id, 
            c.join_date, 
            s.order_date, 
            s.meal_id,
            -- rank the most recent pre-membership order first
            RANK() OVER (PARTITION BY c.customer_id ORDER BY s.order_date DESC) AS rnk
    FROM sales s
    JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date < c.join_date ) t
WHERE rnk = 1;
"""
a7_sql = pd.read_sql_query(a7_script,con=conn)

a7_sql
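
"Just before" means the latest order date that is still earlier than the join date, which in SQL corresponds to ranking by `order_date` in descending order. The plain-Python sketch below uses made-up orders and a hypothetical helper name `last_orders_before_join`. Like `RANK()`, it keeps every order placed on that final day.

```python
from datetime import date


def last_orders_before_join(join_date, orders):
    """Return all orders placed on the latest date strictly before join_date.

    orders is a list of (order_date, meal_id) tuples.
    """
    earlier = [order for order in orders if order[0] < join_date]
    if not earlier:
        return []
    latest = max(order_date for order_date, _ in earlier)
    return [order for order in earlier if order[0] == latest]


# Toy orders (hypothetical) for a customer who joined on 2021-02-24.
orders = [
    (date(2021, 1, 1), 6),
    (date(2021, 2, 20), 3),
    (date(2021, 2, 20), 5),
    (date(2021, 2, 27), 3),
]
before = last_orders_before_join(date(2021, 2, 24), orders)
print(before)
```

Ranking in ascending order instead would return the 2021-01-01 order, which is the customer's first purchase rather than the one just before joining.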


### 8. What are the total number of items and the amount spent by each member before they became a member?

In [24]:
a8_script = """
SELECT customer_id, 
        COUNT(meal_id) AS product_count,
        SUM(price) AS total_revenue
FROM (
    SELECT c.customer_id, 
            c.join_date, 
            s.order_date, 
            s.meal_id,
            m.price
    FROM sales s
    JOIN customers c
        ON s.customer_id = c.customer_id AND s.order_date < c.join_date
    JOIN menu m
        ON m.meal_id = s.meal_id) t
GROUP BY 1
"""
a8_sql = pd.read_sql_query(a8_script,con=conn)

a8_sql

Unnamed: 0,customer_id,product_count,total_revenue
0,8,29,648
1,16,28,622
2,28,26,566
3,15,22,492
4,4,11,238
5,23,21,450
6,20,8,172
7,25,24,542
8,1,21,438
9,13,10,216


### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier — how many points would each customer have?

In [25]:
a9_script = """
SELECT customer_id, SUM(price*product_points) AS customer_point
FROM (
    SELECT  s.customer_id,
        s.order_date,
        s.meal_id,
        m.meal_name,
        m.price,
        CASE 
            -- meal names are stored capitalised ('Sushi'), so compare case-insensitively
            WHEN LOWER(m.meal_name) = 'sushi' THEN 20
            ELSE 10
        END AS product_points
    FROM sales s
    JOIN menu m
    ON s.meal_id = m.meal_id) t
GROUP BY 1;
"""
a9_sql = pd.read_sql_query(a9_script,con=conn)

a9_sql
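
The points rule can be worked through by hand on a small example. Meal names in this data set are capitalised (for example `Sushi`), and string comparison in PostgreSQL is case-sensitive by default, so comparing against lowercase `'sushi'` only matches if the name is normalised first. The sketch below uses made-up prices and a hypothetical helper name `points_for`.

```python
def points_for(meal_name, price):
    """10 points per $1 spent, doubled to 20 per $1 for sushi (case-insensitive)."""
    multiplier = 20 if meal_name.strip().lower() == "sushi" else 10
    return price * multiplier


# Toy orders (hypothetical prices): one sushi at $10, one curry at $12.
orders = [("Sushi", 10), ("Chicken Curry", 12)]

# Normalised comparison: 10 * 20 + 12 * 10
total = sum(points_for(name, price) for name, price in orders)

# Case-sensitive comparison never matches "Sushi": 10 * 10 + 12 * 10
naive_total = sum(price * (20 if name == "sushi" else 10) for name, price in orders)

print(total, naive_total)
```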


### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi. How many points does each customer have at the end of February?

In [26]:
a10_script = """
SELECT customer_id, SUM(price*points)
FROM (
    SELECT c.customer_id, 
            c.join_date,
            CAST((c.join_date + INTERVAL '6 day') AS date) AS end_period_date,
            s.order_date,
            m.meal_name, 
            m.price,
            CASE 
                -- first week of membership (join date plus six days): 2x points on every item
                WHEN s.order_date <= CAST((c.join_date + INTERVAL '6 day') AS date) AND s.order_date >= c.join_date THEN 20 
                -- after the first week: 2x points on sushi only
                WHEN s.order_date > CAST((c.join_date + INTERVAL '6 day') AS date) AND LOWER(m.meal_name) = 'sushi' THEN 20 
                WHEN s.order_date > CAST((c.join_date + INTERVAL '6 day') AS date) AND LOWER(m.meal_name) != 'sushi' THEN 10 
                -- orders placed before joining earn no points
                ELSE 0 
            END AS points
    FROM customers c
    JOIN sales s
        ON c.customer_id = s.customer_id
    JOIN menu m
        ON s.meal_id = m.meal_id
    WHERE s.order_date BETWEEN '2021-01-01'::date AND '2021-02-28'::date) t
GROUP BY 1
"""
a10_sql = pd.read_sql_query(a10_script,con=conn)

a10_sql
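
The first-week window is easy to get wrong by one day: a week that includes the join date ends six days after it. A plain-Python sketch of the multiplier logic follows. It assumes, as this notebook's query does, that orders placed before joining earn no points. The helper name `points_multiplier` is hypothetical.

```python
from datetime import date, timedelta


def points_multiplier(order_date, join_date, meal_name):
    """Points earned per $1 for a single order."""
    if order_date < join_date:
        return 0  # assumption: no points before membership starts
    first_week_end = join_date + timedelta(days=6)  # join date counts as day 1
    if order_date <= first_week_end:
        return 20  # 2x on every item during the first week
    if meal_name.strip().lower() == "sushi":
        return 20  # sushi keeps its 2x multiplier afterwards
    return 10


joined = date(2021, 2, 24)
print(points_multiplier(date(2021, 2, 24), joined, "Chicken Curry"))  # join date itself
print(points_multiplier(date(2021, 3, 2), joined, "Chicken Curry"))   # last day of the first week
print(points_multiplier(date(2021, 3, 3), joined, "Chicken Curry"))   # first day after the window
```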


In [27]:
cur.close()
conn.close()