# DS-SF-34 | 03 | Databases, Scraping, and APIs | Assignment | Starter Code

## The Bistro Meets `SQLite`

In this assignment, we will explore the `bistro` dataset.  The previous assignment used `pandas`; today, we'll answer the same questions using `SQLite`.  In some situations, `pandas` will be the better tool.  In others, `SQL` will make more sense.  As you gain experience, you'll learn which one to reach for.

> ### Question 1.  Import the `sqlite3` package.

In [2]:
import os

import pandas as pd
pd.set_option('display.max_rows', 10)
pd.set_option('display.notebook_repr_html', True)
pd.set_option('display.max_columns', 10)

import sqlite3

> ### Question 2.  Connect to the `dataset-03-bistro.db` database.  The rest of this assignment focuses on the `bistro` table.

In [8]:
db = sqlite3.connect(os.path.join('..', 'datasets', 'dataset-03-bistro.db'))
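
One thing worth knowing: `sqlite3.connect()` quietly creates a new, empty database when the path does not point to an existing file, so a typo in the path only surfaces later as a "no such table" error. A quick sanity check is to list the tables and columns right after connecting. The sketch below uses a hypothetical in-memory database (`db_demo`, with made-up rows) rather than the real file.

```python
import sqlite3

# In-memory stand-in for dataset-03-bistro.db (toy rows, not the real data).
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (day TEXT, name TEXT, tip REAL)')
db_demo.execute("INSERT INTO bistro VALUES ('Sunday', 'Ann', 1.5)")

# sqlite_master is SQLite's built-in catalog of schema objects.
tables = [row[0] for row in db_demo.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)

# PRAGMA table_info returns one row per column; field 1 is the column name.
columns = [row[1] for row in db_demo.execute('PRAGMA table_info(bistro)')]
print(columns)
```

An empty `tables` list on the real connection would suggest the path is wrong.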

> ### Question 3.  How many samples (i.e., rows) are in this dataset?

In [15]:
pd.io.sql.read_sql(
'''
SELECT COUNT(*)
    FROM bistro
    ;
''', con = db)

Unnamed: 0,COUNT(*)
0,244


Answer: TODO
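
When only a single number is needed, the connection's own cursor can return it without building a `DataFrame`. A minimal sketch, assuming a hypothetical in-memory table `bistro` with three made-up rows:

```python
import sqlite3

# Toy in-memory table; the names are invented for illustration.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (name TEXT)')
db_demo.executemany('INSERT INTO bistro VALUES (?)',
                    [('Ann',), ('Bob',), ('Cy',)])

# fetchone() returns the single result row as a tuple; [0] is COUNT(*).
n_rows = db_demo.execute('SELECT COUNT(*) FROM bistro').fetchone()[0]
print(n_rows)
```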

> ### Question 4.  Print the first two rows of the table to the console.

In [17]:
pd.io.sql.read_sql(
'''
SELECT *
    FROM bistro
    LIMIT 2
    ;
''', con = db)

Unnamed: 0,index,day,time,name,gender,is_smoker,party,check,tip
0,0,Sunday,Dinner,Kimberly,Female,0,2,16.99,1.01
1,1,Sunday,Dinner,Nicholas,Male,0,3,10.34,1.66


In [16]:
pd.io.sql.read_sql(
'''
SELECT *
    FROM bistro
    ;
''', con = db)

Unnamed: 0,index,day,time,name,gender,is_smoker,party,check,tip
0,0,Sunday,Dinner,Kimberly,Female,0,2,16.99,1.01
1,1,Sunday,Dinner,Nicholas,Male,0,3,10.34,1.66
2,2,Sunday,Dinner,Larry,Male,0,3,21.01,3.50
3,3,Sunday,Dinner,Joseph,Male,0,2,23.68,3.31
4,4,Sunday,Dinner,Janice,Female,0,4,24.59,3.61
...,...,...,...,...,...,...,...,...,...
239,239,Saturday,Dinner,Kevin,Male,0,3,29.03,5.92
240,240,Saturday,Dinner,Sandra,Female,0,2,27.18,2.00
241,241,Saturday,Dinner,Carl,Male,0,2,22.67,2.00
242,242,Saturday,Dinner,Jon,Male,0,2,17.82,1.75


> ### Question 5.  For which week days does the dataset have data?

In [19]:
pd.io.sql.read_sql(
'''
SELECT DISTINCT [day]
    FROM bistro
    ;
''', con = db)

Unnamed: 0,day
0,Sunday
1,Saturday
2,Thursday
3,Friday


Answer: TODO

> ### Question 6.  How often was the bistro patronized for each week day?

In [36]:
pd.io.sql.read_sql(
'''
SELECT day, COUNT(*) AS visits
    FROM bistro
    GROUP BY day
    ;
''', con = db)

Unnamed: 0,day,visits
0,Friday,19
1,Saturday,87
2,Sunday,76
3,Thursday,62


Answer: TODO
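
Since the previous assignment answered this with `pandas`, it may help to see the two approaches side by side. The sketch below runs a `GROUP BY` in SQL and a `groupby` in `pandas` on a hypothetical in-memory table (`db_demo`, made-up rows), not on the real dataset.

```python
import sqlite3
import pandas as pd

# Toy in-memory table; the rows are invented for illustration.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (day TEXT, name TEXT)')
db_demo.executemany('INSERT INTO bistro VALUES (?, ?)', [
    ('Friday', 'Ann'), ('Saturday', 'Bob'),
    ('Saturday', 'Cy'), ('Sunday', 'Di'),
])

# SQL: the database does the counting.
# (pd.read_sql is the same function as pd.io.sql.read_sql.)
sql_counts = pd.read_sql(
    'SELECT day, COUNT(*) AS visits FROM bistro GROUP BY day ORDER BY day',
    con=db_demo)

# pandas: load every row first, then count in memory.
df = pd.read_sql('SELECT * FROM bistro', con=db_demo)
pandas_counts = df.groupby('day').size().reset_index(name='visits')

print(sql_counts)
print(pandas_counts)
```

The SQL version only transfers one row per day, which matters more as tables grow.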

> ### Question 7.  How much tip did waiters collect for each week day?

In [40]:
pd.io.sql.read_sql(
'''
SELECT sum(tip), day
    FROM bistro
    GROUP BY day
    ;
''', con = db)

Unnamed: 0,sum(tip),day
0,51.96,Friday
1,260.4,Saturday
2,247.39,Sunday
3,171.83,Thursday


Answer: TODO

> ### Question 8.  What is the average tip per check (in absolute \$) for each week day?

In [42]:
pd.io.sql.read_sql(
'''
SELECT avg(tip) AS avg_tip, day
    FROM bistro
    GROUP BY day
    ;
''', con = db)

Unnamed: 0,avg_tip,day
0,2.734737,Friday
1,2.993103,Saturday
2,3.255132,Sunday
3,2.771452,Thursday


Answer: TODO

> ### Question 9.  What is the average tip per check (as a percentage of the check) for each week day?

(`CHECK` is a reserved keyword; put the name between backticks, `` `check` ``, to reference the `check` column)
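
To see the quoting rule in action, here is a small sketch on a hypothetical in-memory database (`db_demo`, with one made-up row). Double quotes are the standard SQL way to quote an identifier; SQLite also accepts backticks and square brackets for compatibility with other databases.

```python
import sqlite3

db_demo = sqlite3.connect(':memory:')

# Unquoted, CHECK is parsed as a keyword, so this statement is rejected.
try:
    db_demo.execute('CREATE TABLE broken (check REAL)')
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

# Quoted, the same name is accepted as a column identifier.
db_demo.execute('CREATE TABLE bistro ("check" REAL, tip REAL)')
db_demo.execute('INSERT INTO bistro VALUES (20.0, 3.0)')

# All three quoting styles reach the same column.
quoted = [
    db_demo.execute(sql).fetchone()[0]
    for sql in ('SELECT "check" FROM bistro',
                'SELECT `check` FROM bistro',
                'SELECT [check] FROM bistro')
]
print(unquoted_ok, quoted)
```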

In [50]:
pd.io.sql.read_sql(
'''
SELECT (SUM(tip) / SUM(`check`)) AS avg_tip, day
    FROM bistro
    GROUP BY day
    ORDER BY avg_tip asc
    ;
''', con = db)

Unnamed: 0,avg_tip,day
0,0.146424,Saturday
1,0.152038,Sunday
2,0.156732,Thursday
3,0.159445,Friday


Answer: TODO
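
Note that "average tip as a percentage of the check" can be read two ways: the mean of each check's own tip rate, `AVG(tip / check)`, or total tips over total checks, `SUM(tip) / SUM(check)`. The second weights large checks more heavily, so the two generally differ. A sketch on a hypothetical in-memory table (`db_demo`, two made-up checks) shows the gap:

```python
import sqlite3

# Toy data: a small check tipped at 20% and a large check tipped at 10%.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (`check` REAL, tip REAL)')
db_demo.executemany('INSERT INTO bistro VALUES (?, ?)',
                    [(10.0, 2.0), (100.0, 10.0)])

mean_of_ratios, ratio_of_sums = db_demo.execute('''
SELECT AVG(tip / `check`), SUM(tip) / SUM(`check`)
    FROM bistro
    ;
''').fetchone()

print(mean_of_ratios)  # about 0.15: each check counts equally
print(ratio_of_sums)   # about 0.109: the large check dominates
```

Whichever reading you choose, state it in your answer.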

> ### Question 10.  Are there any names in common between male and female patrons?  (E.g., `Chris` can refer to either a man or a woman)

(check `numpy.intersect1d()`; it could come in handy)

(https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.html)
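
For reference, this is roughly how `numpy.intersect1d()` behaves: it returns the sorted, unique values present in both inputs. The sketch uses a small made-up `DataFrame` (`df`), not the bistro data.

```python
import numpy as np
import pandas as pd

# Toy patrons; the names are invented for illustration.
df = pd.DataFrame({
    'name':   ['Casey', 'Ann', 'Casey', 'Bob', 'Ann'],
    'gender': ['Female', 'Female', 'Male', 'Male', 'Female'],
})

female_names = df.loc[df['gender'] == 'Female', 'name'].tolist()
male_names = df.loc[df['gender'] == 'Male', 'name'].tolist()

# Duplicates within each list are ignored; only shared values are kept.
shared = np.intersect1d(female_names, male_names)
print(shared.tolist())
```

In SQL, the `INTERSECT` operator plays the same role.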

In [59]:
pd.io.sql.read_sql(
'''
SELECT name
    FROM bistro
    WHERE gender = 'Female'

INTERSECT
SELECT name
    FROM bistro
    WHERE gender = 'Male'
    
    ;
''', con = db)

Unnamed: 0,name
0,Casey


Answer: TODO

> ### Question 11.  If no patrons share the same name, how many unique patrons are in the dataset?

In [63]:
pd.io.sql.read_sql(
'''
SELECT DISTINCT gender, name
    FROM bistro
    ;
''', con = db)

Unnamed: 0,gender,name
0,Female,Kimberly
1,Male,Nicholas
2,Male,Larry
3,Male,Joseph
4,Female,Janice
...,...,...
177,Male,Darwin
178,Male,Henry
179,Male,Jeremy
180,Female,Dorothy


Answer: TODO
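
Rather than reading the number of rows off a truncated display, the database can count the distinct pairs itself by wrapping the `DISTINCT` query in a subquery. A minimal sketch on a hypothetical in-memory table (`db_demo`, made-up rows):

```python
import sqlite3

# Toy table: 'Casey' appears under both genders, 'Ann' visits twice.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (gender TEXT, name TEXT)')
db_demo.executemany('INSERT INTO bistro VALUES (?, ?)', [
    ('Female', 'Casey'), ('Male', 'Casey'),
    ('Female', 'Ann'), ('Female', 'Ann'), ('Male', 'Bob'),
])

# Count distinct (gender, name) pairs via a subquery.
distinct_pairs = db_demo.execute('''
SELECT COUNT(*)
    FROM (SELECT DISTINCT gender, name FROM bistro)
    ;
''').fetchone()[0]

# Counting names alone merges the two Caseys into one.
distinct_names = db_demo.execute(
    'SELECT COUNT(DISTINCT name) FROM bistro').fetchone()[0]

print(distinct_pairs, distinct_names)
```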

> ### Question 12.  How many times did `Kevin` patronize the bistro?  How about `Alice`?

In [None]:
# TODO

Answer: TODO
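
When the same query is run for several values, a `?` placeholder with the `params` argument avoids pasting strings into the SQL. The sketch below assumes a hypothetical helper `count_visits()` and a made-up in-memory table (`db_demo`); neither is part of the assignment files.

```python
import sqlite3
import pandas as pd

# Toy in-memory table; the names are invented for illustration.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (name TEXT)')
db_demo.executemany('INSERT INTO bistro VALUES (?)',
                    [('Ann',), ('Bob',), ('Ann',)])

def count_visits(con, patron):
    """Return how many rows in bistro have the given name (hypothetical helper)."""
    result = pd.read_sql(
        'SELECT COUNT(*) AS visits FROM bistro WHERE name = ?',
        con=con, params=(patron,))
    return int(result.loc[0, 'visits'])

print(count_visits(db_demo, 'Ann'))
print(count_visits(db_demo, 'Zed'))
```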

> ### Question 13.  Who are the top 3 female and male patrons?

In [None]:
# TODO

Answer: TODO
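
A "top N per group" query needs `ORDER BY` and `LIMIT` applied within each group. One way to sketch this in SQLite is to wrap each group's ranking in a subquery and combine the results with `UNION ALL`. The example uses a hypothetical in-memory table (`db_demo`, made-up rows) and keeps the top 2 rather than the top 3.

```python
import sqlite3

# Toy visits; the names are invented for illustration.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (gender TEXT, name TEXT)')
db_demo.executemany('INSERT INTO bistro VALUES (?, ?)', [
    ('Female', 'Ann'), ('Female', 'Ann'), ('Female', 'Ann'),
    ('Female', 'Bea'), ('Female', 'Bea'), ('Female', 'Cat'),
    ('Male', 'Dan'), ('Male', 'Dan'), ('Male', 'Ed'), ('Male', 'Fox'),
])

# Each subquery ranks one gender; ties are broken alphabetically by name.
top_patrons = db_demo.execute('''
SELECT * FROM (
    SELECT gender, name, COUNT(*) AS visits
        FROM bistro
        WHERE gender = 'Female'
        GROUP BY gender, name
        ORDER BY visits DESC, name
        LIMIT 2)
UNION ALL
SELECT * FROM (
    SELECT gender, name, COUNT(*) AS visits
        FROM bistro
        WHERE gender = 'Male'
        GROUP BY gender, name
        ORDER BY visits DESC, name
        LIMIT 2)
    ;
''').fetchall()

print(top_patrons)
```

How to break ties is a choice you should make explicit in your answer.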

> ### Question 14.  Who's the best tipper (as a fraction of all tips over all check totals)?  Who's the worst?  How many times did they patronize the bistro?

In [None]:
# TODO

Answer: TODO
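
The pattern here is an aggregate per patron followed by a sort on the computed ratio. A minimal sketch, again on a hypothetical in-memory table (`db_demo`, made-up rows) rather than the real dataset:

```python
import sqlite3

# Toy checks and tips; the names and amounts are invented for illustration.
db_demo = sqlite3.connect(':memory:')
db_demo.execute('CREATE TABLE bistro (name TEXT, `check` REAL, tip REAL)')
db_demo.executemany('INSERT INTO bistro VALUES (?, ?, ?)', [
    ('Ann', 10.0, 2.0), ('Ann', 30.0, 6.0),
    ('Bob', 50.0, 5.0),
    ('Cy', 20.0, 3.0), ('Cy', 20.0, 3.0),
])

# One row per patron: overall tip rate and number of visits, best rate first.
rows = db_demo.execute('''
SELECT name, SUM(tip) / SUM(`check`) AS tip_rate, COUNT(*) AS visits
    FROM bistro
    GROUP BY name
    ORDER BY tip_rate DESC
    ;
''').fetchall()

best, worst = rows[0], rows[-1]
print(best)
print(worst)
```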