## Case Study #1 - Danny's Diner


#### Problem Statement
Danny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.

He plans to use these insights to decide whether he should expand the existing customer loyalty program. He also needs help generating some basic datasets so his team can easily inspect the data without needing to use SQL.

Because of privacy concerns, Danny has provided only a sample of his overall customer data, but he hopes that these examples are enough for you to write fully functioning SQL queries to help him answer his questions!

Danny has shared three key datasets for this case study: `sales`, `menu`, and `members`.

#### Entity Relationship Diagram

![week1.png](week1.png)

Import modules

In [2]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sqlite3 as sql
from sqlalchemy import text
pd.set_option('display.max_columns', None)

Initialize SQL

In [3]:
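conn = sql.connect("week1.db")
cursor = conn.cursor()
# Read the setup script and run it unchanged. executescript() accepts
# multi-line SQL, and flattening newlines would let any `--` comment
# swallow the rest of the script.
with open('week1-sql.txt', 'r') as file:
    script = file.read()
cursor.executescript(script)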

<sqlite3.Cursor at 0x1fcc65551c0>

Verify tables

In [7]:
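# List the tables registered in the SQLite catalog
query = """SELECT name FROM sqlite_master WHERE type='table';"""
cursor.execute(query)
tables = [table[0] for table in cursor.fetchall()]
# Double quotes outside, single quotes inside: nesting the same quote
# character in an f-string only parses on Python 3.12+
print(f"The tables in the database are: {', '.join(tables)}")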

The tables in the database are: sales, menu, members


Fetch table information

In [50]:
for table in tables:
    print("=================================")
    print(f'Table [{table}]')
    # Table names come from sqlite_master above, not from user input
    df = pd.read_sql_query(f'SELECT * FROM {table}', conn)
    print(f'Dimensions: {df.shape[0]} rows x {df.shape[1]} columns\n')
    print(df.head())
    # Summarise the pandas dtype and the count of missing values per column
    info_df = pd.DataFrame.from_dict({'Datatypes': df.dtypes, 'NULL count': df.isna().sum()})
    print()
    print(info_df)
    print()

Table [sales]
Dimensions: 15 rows x 3 columns

  customer_id  order_date  product_id
0           A  2021-01-01           1
1           A  2021-01-01           2
2           A  2021-01-07           2
3           A  2021-01-10           3
4           A  2021-01-11           3

            Datatypes  NULL count
customer_id    object           0
order_date     object           0
product_id      int64           0

Table [menu]
Dimensions: 3 rows x 3 columns

   product_id product_name  price
0           1        sushi     10
1           2        curry     15
2           3        ramen     12

             Datatypes  NULL count
product_id       int64           0
product_name    object           0
price            int64           0

Table [members]
Dimensions: 2 rows x 2 columns

  customer_id   join_date
0           A  2021-01-07
1           B  2021-01-09

            Datatypes  NULL count
customer_id    object           0
join_date      object           0
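
The `Datatypes` column above reports pandas dtypes, so the date columns show up as `object` rather than as whatever type the tables declare. One way to see the declared types is SQLite's `PRAGMA table_info`. The sketch below is a minimal illustration, not part of the original notebook: it runs against an in-memory database, and the `CREATE TABLE` statements, the `conn_demo` name, and the `declared_schema` helper are all assumptions made for the example, since the contents of `week1-sql.txt` are not shown here.

```python
import sqlite3

# In-memory stand-in for week1.db. The column types below are assumptions
# for illustration; the real declarations live in week1-sql.txt.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE, product_id INTEGER);
    CREATE TABLE menu (product_id INTEGER, product_name VARCHAR(5), price INTEGER);
    CREATE TABLE members (customer_id VARCHAR(1), join_date DATE);
""")


def declared_schema(connection):
    """Map each table name to a list of (column name, declared type) pairs."""
    names = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    schema = {}
    for name in names:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk)
        cols = connection.execute(f'PRAGMA table_info("{name}")').fetchall()
        schema[name] = [(col[1], col[2]) for col in cols]
    return schema


schema = declared_schema(conn_demo)
for table_name, columns in schema.items():
    print(table_name, columns)
```

Passing the notebook's own `conn` to the same helper would report the types actually declared in `week1.db`.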



## Case Study Questions

TASK 1: 

In [None]:
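# Sample query template: it targets a `storeco` table that is not in week1.db,
# so adapt the table and column names before running it against `conn`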
query = """          
    SELECT 
    strftime('%Y-%m', dateordered) AS 'Order Month',
    sum(orders) AS 'Total Orders',
    sum(CASE WHEN orderstatus = 'returned' THEN 0 ELSE orders END) AS 'Completed Orders',
    sum(CASE WHEN orderstatus = 'complete' THEN 0 ELSE orders END) AS 'Returned Orders',
    100*round(sum(CASE WHEN orderstatus = 'complete' THEN 0 ELSE orders END)/((sum(orders))*1.00),4) AS 'Return Percentage'
    FROM storeco 
    GROUP BY 1
    ORDER BY 1 ASC;               
"""
result = pd.read_sql_query(query, conn)
result

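
The sample template combines `GROUP BY` with `CASE` expressions inside `SUM`. As a rough sketch of how that pattern might carry over to the case-study tables, the block below joins `sales` to `menu` on `product_id` and totals spending per customer, one of the things the problem statement asks about. It is an illustration under stated assumptions: it uses an in-memory database seeded only with the five `sales` rows and three `menu` rows printed earlier (not the full 15-row table), `TEXT`/`INTEGER` column types chosen for the example, and hypothetical names (`demo`, `spend_query`, `ramen_orders`).

```python
import sqlite3

# In-memory excerpt: only the rows shown by df.head() above, not the full data.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);
    CREATE TABLE menu (product_id INTEGER, product_name TEXT, price INTEGER);
    INSERT INTO sales VALUES
        ('A', '2021-01-01', 1), ('A', '2021-01-01', 2), ('A', '2021-01-07', 2),
        ('A', '2021-01-10', 3), ('A', '2021-01-11', 3);
    INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
""")

spend_query = """
    SELECT
        s.customer_id AS customer_id,
        COUNT(*) AS items_ordered,
        SUM(m.price) AS total_spent,
        -- Conditional aggregation, as in the sample template
        SUM(CASE WHEN m.product_name = 'ramen' THEN 1 ELSE 0 END) AS ramen_orders
    FROM sales AS s
    JOIN menu AS m ON m.product_id = s.product_id
    GROUP BY s.customer_id
    ORDER BY s.customer_id;
"""

# One tuple per customer: (customer_id, items_ordered, total_spent, ramen_orders)
rows = demo.execute(spend_query).fetchall()
for row in rows:
    print(row)
```

The same `spend_query` string can be passed to `pd.read_sql_query` with the notebook's `conn` to get a DataFrame over the full tables; the figures from this excerpt cover only customer A's first five orders.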
