# DS-SF-34 | 03 | Databases, Scraping, and APIs | Assignment | Starter Code

## The Bistro Meets `SQLite`

In this assignment, we will explore the `bistro` dataset.  The previous assignment used `pandas`; today, we'll answer the same questions using `SQLite`.  In some situations, `pandas` will be the better tool.  In others, `SQL` will make more sense.  As you gain experience, you'll learn which one to reach for.
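To make the comparison concrete, here is a minimal sketch that computes the same per-day total both ways. The `toy` DataFrame, the `toy_bistro` table name, and all values are made up for illustration; they are not the real `bistro` data.

```python
import sqlite3

import pandas as pd

# Hypothetical toy rows, NOT the real bistro data.
toy = pd.DataFrame({
    'day': ['Friday', 'Friday', 'Sunday'],
    'tip': [1.0, 2.0, 4.0],
})

# Load the toy rows into a throwaway in-memory SQLite database.
con = sqlite3.connect(':memory:')
toy.to_sql('toy_bistro', con, index=False)

# pandas version: group and sum in Python.
tips_pandas = toy.groupby('day')['tip'].sum()

# SQL version: let the database group, then read the result into a DataFrame.
tips_sql = pd.read_sql(
    'SELECT day, SUM(tip) AS total_tip FROM toy_bistro GROUP BY day ORDER BY day;',
    con=con,
)
```

Both results hold one total per day; the difference is mainly where the work happens (in Python memory or inside the database).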

> ### Question 1.  Import the `sqlite3` package.

In [2]:
import os

import pandas as pd
pd.set_option('display.max_rows', 10)
pd.set_option('display.notebook_repr_html', True)
pd.set_option('display.max_columns', 10)

# TODO
import sqlite3

> ### Question 2.  Connect to the `dataset-03-bistro.db` database.  The rest of this assignment focuses on the `bistro` table.

In [3]:
# TODO
db = sqlite3.connect(os.path.join('..', 'datasets', 'dataset-03-bistro.db'))
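Before querying, it can help to confirm which tables and columns a database contains. The sketch below does this against a throwaway in-memory database (the `toy_bistro` table is hypothetical); the same two queries should also work on the `db` connection above, with `bistro` in place of `toy_bistro`.

```python
import sqlite3

# Throwaway in-memory database with one hypothetical table.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (day TEXT, tip REAL)')

# sqlite_master lists the schema objects stored in the database.
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;")]

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in con.execute('PRAGMA table_info(toy_bistro);')]
```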

> ### Question 3.  How many samples (i.e., rows) are in this dataset?

In [8]:
# 244
pd.io.sql.read_sql(
'''
SELECT COUNT(*)
    FROM bistro
    
;
''', con = db)

Unnamed: 0,COUNT(*)
0,244


Answer: The dataset has 244 rows.

> ### Question 4.  Print the first two rows of the table to the console.

In [9]:
# TODO
pd.io.sql.read_sql(
'''
SELECT *
    FROM bistro
    LIMIT 2
;
''', con = db)

Unnamed: 0,index,day,time,name,gender,is_smoker,party,check,tip
0,0,Sunday,Dinner,Kimberly,Female,0,2,16.99,1.01
1,1,Sunday,Dinner,Nicholas,Male,0,3,10.34,1.66


> ### Question 5.  For which days of the week does the dataset have data?

In [13]:
# TODO
pd.io.sql.read_sql(
'''
SELECT day
    FROM bistro
    GROUP BY day
;
''', con = db)

Unnamed: 0,day
0,Friday
1,Saturday
2,Sunday
3,Thursday


Answer: Thursday, Friday, Saturday, and Sunday.
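`GROUP BY day` with no aggregate works here, but `SELECT DISTINCT` states the intent more directly. A minimal sketch on made-up rows (the `toy_bistro` table is hypothetical):

```python
import sqlite3

# Throwaway in-memory table with a repeated day value.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (day TEXT)')
con.executemany('INSERT INTO toy_bistro VALUES (?)',
                [('Sunday',), ('Friday',), ('Sunday',)])

# DISTINCT collapses duplicate values; ORDER BY makes the order explicit.
days = [row[0] for row in con.execute(
    'SELECT DISTINCT day FROM toy_bistro ORDER BY day;')]
```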

> ### Question 6.  How often was the bistro patronized for each week day?

In [22]:
# TODO
pd.io.sql.read_sql(
'''
SELECT day, COUNT(day) AS Patronage
    FROM bistro
    GROUP BY day
;
''', con = db)

Unnamed: 0,day,Patronage
0,Friday,19
1,Saturday,87
2,Sunday,76
3,Thursday,62


Answer: Friday 19, Saturday 87, Sunday 76, and Thursday 62 visits.

> ### Question 7.  How much tip did waiters collect for each week day?

In [19]:
# TODO
pd.io.sql.read_sql(
'''
SELECT day, SUM(tip) AS Total_Tips
    FROM bistro
    GROUP BY day
;
''', con = db)

Unnamed: 0,day,Total_Tips
0,Friday,51.96
1,Saturday,260.4
2,Sunday,247.39
3,Thursday,171.83


Answer: Friday \$51.96, Saturday \$260.40, Sunday \$247.39, and Thursday \$171.83.

> ### Question 8.  What is the average tip per check (in absolute \$) for each week day?

In [23]:
# TODO
pd.io.sql.read_sql(
'''
SELECT day, AVG(tip) AS Avg_Tips
    FROM bistro
    GROUP BY day
;
''', con = db)

Unnamed: 0,day,Avg_Tips
0,Friday,2.734737
1,Saturday,2.993103
2,Sunday,3.255132
3,Thursday,2.771452


Answer: About \$2.73 on Friday, \$2.99 on Saturday, \$3.26 on Sunday, and \$2.77 on Thursday.

> ### Question 9.  What is the average tip per check (as a percentage of the check) for each week day?

(`CHECK` is a reserved keyword; use `` `check` `` (put the name between backticks) to reference the `check` column)

In [30]:
# TODO
pd.io.sql.read_sql(
'''
SELECT day, (SUM(tip) / SUM(`check`)) AS Avg_Tips
    FROM bistro
    GROUP BY day
;
''', con = db)

Unnamed: 0,day,Avg_Tips
0,Friday,0.159445
1,Saturday,0.146424
2,Sunday,0.152038
3,Thursday,0.156732


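Answer: Measured as total tips over total checks, about 15.9% on Friday, 14.6% on Saturday, 15.2% on Sunday, and 15.7% on Thursday.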
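The query above divides total tips by total checks, which weights large checks more heavily. Another reading of the question is the mean of each check's own tip percentage, `AVG(tip / check)`. The two can differ, as this sketch on made-up numbers shows (the `toy_bistro` table and its values are hypothetical):

```python
import sqlite3

# Throwaway in-memory table: one small check with a generous tip,
# one large check with a modest tip.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (day TEXT, `check` REAL, tip REAL)')
con.executemany('INSERT INTO toy_bistro VALUES (?, ?, ?)',
                [('Friday', 10.0, 2.0), ('Friday', 100.0, 10.0)])

ratio_of_totals, mean_of_ratios = con.execute('''
SELECT SUM(tip) / SUM(`check`), AVG(tip / `check`)
    FROM toy_bistro
    GROUP BY day
;
''').fetchone()

# ratio_of_totals is 12 / 110 (about 0.109);
# mean_of_ratios is (0.20 + 0.10) / 2 (about 0.15).
```

Which definition is appropriate depends on whether each check or each dollar should count equally.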

> ### Question 10.  Are there any names in common between male and female patrons?  (E.g., `Chris` can refer to either a man or a woman)

(check `numpy.intersect1d()`; it could come in handy)

(https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.html)
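For reference, `numpy.intersect1d()` returns the sorted, unique values present in both inputs. A minimal sketch with hand-typed, hypothetical name arrays (not pulled from the database):

```python
import numpy as np

# Hypothetical names, typed by hand for illustration only.
female_names = np.array(['Ana', 'Sam', 'Lee'])
male_names = np.array(['Bo', 'Sam', 'Lee', 'Sam'])

# Values found in both arrays, sorted and de-duplicated.
shared = np.intersect1d(female_names, male_names)
```

The `INTERSECT` operator used in the SQL solution plays the same role inside the database.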

In [33]:
# TODO
pd.io.sql.read_sql(
'''
SELECT name
    FROM bistro
    WHERE gender = 'Female'

INTERSECT 
SELECT name
    FROM bistro
    WHERE gender ='Male'
    
;
''', con = db)



Unnamed: 0,name
0,Casey


Answer: Yes; `Casey` appears as both a female and a male patron.

> ### Question 11.  If no patrons share the same name, how many unique patrons are in the dataset?

In [36]:
# TODO
pd.io.sql.read_sql(
'''
SELECT DISTINCT gender, name
    FROM bistro
    GROUP BY name
    
;
''', con = db)

Unnamed: 0,gender,name
0,Male,Adalberto
1,Male,Aiden
2,Male,Alejandro
3,Male,Alex
4,Male,Alexander
...,...,...
176,Female,Wanda
177,Male,Wilburn
178,Male,William
179,Male,Willie


Answer: 180 distinct names (the result is indexed 0 through 179).
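To get the count itself rather than a list of rows, `COUNT(DISTINCT ...)` is one option. The sketch below also counts distinct (gender, name) pairs, which treats a male and a female patron with the same name as two people. The `toy_bistro` table and its rows are made up for illustration.

```python
import sqlite3

# Throwaway in-memory table; 'Sam' appears under both genders.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (gender TEXT, name TEXT)')
con.executemany('INSERT INTO toy_bistro VALUES (?, ?)', [
    ('Female', 'Sam'), ('Male', 'Sam'), ('Male', 'Bo'), ('Male', 'Bo'),
])

# One patron per distinct name.
n_names = con.execute(
    'SELECT COUNT(DISTINCT name) FROM toy_bistro;').fetchone()[0]

# One patron per distinct (gender, name) pair.
n_pairs = con.execute('''
SELECT COUNT(*)
    FROM (SELECT DISTINCT gender, name FROM toy_bistro)
;
''').fetchone()[0]
```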

> ### Question 12.  How many times did `Kevin` patronize the bistro?  How about `Alice`?

In [43]:
# TODO
pd.io.sql.read_sql(
'''
SELECT name, COUNT(day) AS Patronage
    FROM bistro
    WHERE name = 'Kevin'
UNION ALL
SELECT name, COUNT(day) AS Patronage
    FROM bistro
    WHERE name = 'Alice'

;
''', con = db)

Unnamed: 0,name,Patronage
0,Kevin,4
1,Alice,2


Answer: `Kevin` visited 4 times and `Alice` visited 2 times.
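An alternative to `UNION ALL` is a single query that filters with `IN` and groups by name. A minimal sketch on made-up rows (the `toy_bistro` table and the names are hypothetical):

```python
import sqlite3

# Throwaway in-memory table with a few repeated names.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (name TEXT, day TEXT)')
con.executemany('INSERT INTO toy_bistro VALUES (?, ?)', [
    ('Ann', 'Friday'), ('Ann', 'Sunday'), ('Bob', 'Friday'), ('Cy', 'Sunday'),
])

# Keep only the names of interest, then count rows per name.
visits = con.execute('''
SELECT name, COUNT(*) AS Patronage
    FROM toy_bistro
    WHERE name IN ('Ann', 'Bob')
    GROUP BY name
    ORDER BY name
;
''').fetchall()
```

One difference to keep in mind: with `GROUP BY`, a name that has no rows at all simply does not appear in the result.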

> ### Question 13.  Who are the top 3 female and male patrons?

In [139]:
# TODO
pd.io.sql.read_sql(
'''
SELECT DISTINCT gender, name, COUNT(day)
    FROM bistro
    WHERE gender = 'Male'
    GROUP BY name
    ORDER BY COUNT(day) DESC
    LIMIT 3
    
;
''', con = db)

Unnamed: 0,gender,name,COUNT(day)
0,Male,David,8
1,Male,Casey,5
2,Male,James,5


In [140]:
pd.io.sql.read_sql(
'''
SELECT DISTINCT gender, name, COUNT(day)
    FROM bistro
    WHERE gender = 'Female'
    GROUP BY name
    ORDER BY COUNT(day) DESC
    LIMIT 3
    
;
''', con = db)

Unnamed: 0,gender,name,COUNT(day)
0,Female,Mary,4
1,Female,Casey,3
2,Female,Laura,3


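Answer: Among male patrons, David (8 visits), Casey (5), and James (5).  Among female patrons, Mary (4 visits), Casey (3), and Laura (3).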
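The two queries above can also be combined into one result by wrapping each `LIMIT`ed query in a subquery and joining them with `UNION ALL`. This is a sketch on made-up rows (the `toy_bistro` table and names are hypothetical); it keeps the top 2 per gender and breaks ties alphabetically by name, which is an arbitrary choice.

```python
import sqlite3

# Throwaway in-memory table: one row per visit.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (gender TEXT, name TEXT)')
rows = ([('Female', 'Ana')] * 3 + [('Female', 'Lee')] * 2 + [('Female', 'Sam')]
        + [('Male', 'Bo')] * 2 + [('Male', 'Sam')] * 2 + [('Male', 'Cy')])
con.executemany('INSERT INTO toy_bistro VALUES (?, ?)', rows)

# Each subquery ranks one gender; UNION ALL stacks the two results.
top = con.execute('''
SELECT * FROM (
    SELECT gender, name, COUNT(*) AS visits
        FROM toy_bistro
        WHERE gender = 'Female'
        GROUP BY gender, name
        ORDER BY visits DESC, name
        LIMIT 2
)
UNION ALL
SELECT * FROM (
    SELECT gender, name, COUNT(*) AS visits
        FROM toy_bistro
        WHERE gender = 'Male'
        GROUP BY gender, name
        ORDER BY visits DESC, name
        LIMIT 2
)
ORDER BY gender, visits DESC, name
;
''').fetchall()
```

Grouping by both `gender` and `name` keeps a male and a female patron with the same name as separate rows.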

> ### Question 14.  Who's the best tipper (as a fraction of all tips over all check totals)?  Who's the worst?  How many times did they patronize the bistro?

In [158]:
# TODO
pd.io.sql.read_sql(
'''
SELECT name, (SUM(tip) / SUM(`check`)) AS Avg_Tips, COUNT(day) AS Visits
    FROM bistro
    GROUP BY name
    ORDER BY (SUM(tip) / SUM(`check`)) DESC
    LIMIT 1
;
''', con = db)

Unnamed: 0,name,Avg_Tips,Visits
0,Maryann,0.416667,1


In [159]:
pd.io.sql.read_sql(
'''
SELECT name, (SUM(tip) / SUM(`check`)) AS Avg_Tips, COUNT(day) AS Visits
    FROM bistro
    GROUP BY name
    ORDER BY (SUM(tip) / SUM(`check`)) ASC
    LIMIT 1
;
''', con = db)

Unnamed: 0,name,Avg_Tips,Visits
0,Jeremy,0.035638,1


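Answer: The best tipper is Maryann (about 41.7% of her check total) and the worst is Jeremy (about 3.6%).  Each patronized the bistro once.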
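Since the best and worst tippers come from the same ranking, one option is to run the ranking query once and read both ends of the result in Python. A minimal sketch on made-up rows (the `toy_bistro` table, names, and amounts are hypothetical):

```python
import sqlite3

# Throwaway in-memory table: one row per visit.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE toy_bistro (name TEXT, `check` REAL, tip REAL)')
con.executemany('INSERT INTO toy_bistro VALUES (?, ?, ?)', [
    ('Ana', 10.0, 5.0),
    ('Bo', 20.0, 2.0),
    ('Bo', 30.0, 3.0),
    ('Cy', 40.0, 2.0),
])

# Rank every patron by total tips over total checks, highest first.
ranked = con.execute('''
SELECT name, SUM(tip) / SUM(`check`) AS tip_rate, COUNT(*) AS visits
    FROM toy_bistro
    GROUP BY name
    ORDER BY tip_rate DESC, name
;
''').fetchall()

# First row is the best tipper, last row is the worst.
best, worst = ranked[0], ranked[-1]
```

Patrons with a single visit can dominate both ends of such a ranking, so the `visits` column is worth reading alongside the rate.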