# SQL Case Study - Country Club
### Chloe Jung


### Part 1

/* Welcome to the SQL mini project. You will carry out this project partly in
the PHPMyAdmin interface, and partly in Jupyter via a Python connection.
 
This is Tier 2 of the case study, which means that there'll be less guidance for you about how to set up
your local SQLite connection in PART 2 of the case study. This will make the case study more challenging for you:
you might need to do some digging, and revise the Working with Relational Databases in Python chapter in the previous resource.
 
Otherwise, the questions in the case study are exactly the same as with Tier 1.
 
PART 1: PHPMyAdmin
You will complete questions 1-9 below in the PHPMyAdmin interface.
Log in by pasting the following URL into your browser, and
using the following Username and Password:
 
URL: https://sql.springboard.com/
Username: student
Password: learn_sql@springboard
 
The data you need is in the "country_club" database. This database
contains 3 tables:
	i) the "Bookings" table,
	ii) the "Facilities" table, and
	iii) the "Members" table.
 
In this case study, you'll be asked a series of questions. You can
solve them using the platform, but for the final deliverable,
paste the code for each solution into this script, and upload it
to your GitHub.
 
Before starting with the questions, feel free to take your time,
exploring the data, and getting acquainted with the 3 tables. */
 
 
/* QUESTIONS */
/* Q1: Some of the facilities charge a fee to members, but some do not.
Write a SQL query to produce a list of the names of the facilities that do. */
 
SELECT name FROM Facilities
WHERE membercost > 0
 
/* Q2: How many facilities do not charge a fee to members? */
 
SELECT COUNT(*) FROM Facilities WHERE membercost = 0
 
/* Q3: Write an SQL query to show a list of facilities that charge a fee to members,
where the fee is less than 20% of the facility's monthly maintenance cost.
Return the facid, facility name, member cost, and monthly maintenance of the
facilities in question. */
 
SELECT facid, name, membercost, monthlymaintenance FROM Facilities
WHERE membercost > 0
AND membercost < 0.2*monthlymaintenance
 
 
/* Q4: Write an SQL query to retrieve the details of facilities with ID 1 and 5.
Try writing the query without using the OR operator. */
 
SELECT * FROM Facilities
WHERE facid IN (1,5)

/* Q5: Produce a list of facilities, with each labelled as
'cheap' or 'expensive', depending on if their monthly maintenance cost is
more than $100. Return the name and monthly maintenance of the facilities
in question. */
 
SELECT name, monthlymaintenance,
CASE WHEN monthlymaintenance >100 THEN 'expensive'
ELSE 'cheap' END AS cost
FROM Facilities

/* Q6: You'd like to get the first and last name of the last member(s)
who signed up. Try not to use the LIMIT clause for your solution. */
 
SELECT surname, firstname
FROM Members
WHERE joindate = (SELECT MAX(joindate) FROM Members)

/* Q7: Produce a list of all members who have used a tennis court.
Include in your output the name of the court, and the name of the member
formatted as a single column. Ensure no duplicate data, and order by
the member name. */

SELECT DISTINCT facid, CONCAT(m.firstname, ' ', m.surname) AS name
FROM Bookings as b
JOIN Members as m
ON m.memid = b.memid
WHERE facid IN (0,1)

SELECT DISTINCT f.name AS facility, CONCAT(m.firstname, ' ', m.surname) AS name
FROM Bookings AS b
LEFT JOIN Facilities AS f
USING ( facid )
LEFT JOIN Members AS m
USING ( memid )
WHERE f.name LIKE 'Tennis Court%'
ORDER BY name, facility

/* Q8: Produce a list of bookings on the day of 2012-09-14 which
will cost the member (or guest) more than $30. Remember that guests have
different costs to members (the listed costs are per half-hour 'slot'), and
the guest user's ID is always 0. Include in your output the name of the
facility, the name of the member formatted as a single column, and the cost.
Order by descending cost, and do not use any subqueries. */

SELECT f.name AS facility, CONCAT(m.firstname, ' ', m.surname) AS name,
CASE WHEN memid>0 THEN f.membercost*b.slots
ELSE f.guestcost*b.slots END AS cost
FROM Bookings AS b 
LEFT JOIN Facilities AS f 
USING (facid)
LEFT JOIN Members AS m
USING (memid)
WHERE starttime LIKE '2012-09-14%'
AND (CASE WHEN memid>0
THEN f.membercost*b.slots
ELSE f.guestcost*b.slots
END) > 30
ORDER BY cost DESC 

/* Q9: This time, produce the same result as in Q8, but using a subquery. */

SELECT facility, name, cost
FROM (
SELECT f.name AS facility, CONCAT(m.firstname, ' ', m.surname) AS name,
CASE WHEN memid>0 THEN f.membercost*b.slots
ELSE f.guestcost*b.slots END AS cost
FROM Bookings AS b 
LEFT JOIN Facilities AS f 
USING (facid)
LEFT JOIN Members AS m
USING (memid)
WHERE starttime LIKE '2012-09-14%'
) AS priced
WHERE cost > 30
ORDER BY cost DESC 
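
The cost rule from Q8 and Q9 (per-slot price times the number of slots, with the guest price when memid is 0) can be checked locally before running it in PHPMyAdmin. The following is only a sketch on made-up rows in an in-memory SQLite database, not the real country_club data. The booking rows, the `COST` and `JOINS` helper strings and the `priced` alias are assumptions for illustration. SQLite's `||` operator is used for concatenation in place of MySQL's `CONCAT`.

```python
import sqlite3

# Hypothetical toy rows, NOT the real country_club data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Facilities (facid INTEGER, name TEXT, membercost REAL, guestcost REAL);
CREATE TABLE Members (memid INTEGER, surname TEXT, firstname TEXT);
CREATE TABLE Bookings (facid INTEGER, memid INTEGER, starttime TEXT, slots INTEGER);
""")
conn.executemany("INSERT INTO Facilities VALUES (?, ?, ?, ?)",
                 [(0, "Tennis Court 1", 5, 25), (4, "Massage Room 1", 9.9, 80)])
conn.executemany("INSERT INTO Members VALUES (?, ?, ?)",
                 [(0, "GUEST", "GUEST"), (1, "Smith", "Darren")])
conn.executemany("INSERT INTO Bookings VALUES (?, ?, ?, ?)", [
    (0, 0, "2012-09-14 09:00:00", 2),  # guest: 2 * 25 = 50
    (0, 1, "2012-09-14 10:00:00", 3),  # member: 3 * 5 = 15, below the threshold
    (4, 1, "2012-09-14 11:00:00", 4),  # member: 4 * 9.9 = 39.6
    (4, 0, "2012-09-15 11:00:00", 1),  # guest, but on a different day
])

# Shared SQL fragments so both variants use exactly the same cost rule.
COST = "b.slots * CASE WHEN b.memid = 0 THEN f.guestcost ELSE f.membercost END"
JOINS = """FROM Bookings AS b
JOIN Facilities AS f ON f.facid = b.facid
JOIN Members AS m ON m.memid = b.memid
WHERE b.starttime LIKE '2012-09-14%'"""

# Q8 style: the cost expression is repeated in the WHERE clause.
no_subquery = f"""
SELECT f.name, m.firstname || ' ' || m.surname, {COST} AS cost
{JOINS} AND {COST} > 30
ORDER BY cost DESC
"""

# Q9 style: compute the cost once in a subquery, then filter on it.
with_subquery = f"""
SELECT facility, name, cost FROM (
    SELECT f.name AS facility, m.firstname || ' ' || m.surname AS name, {COST} AS cost
    {JOINS}
) AS priced
WHERE cost > 30
ORDER BY cost DESC
"""

rows_a = conn.execute(no_subquery).fetchall()
rows_b = conn.execute(with_subquery).fetchall()
for row in rows_a:
    print(row)
print("same result:", rows_a == rows_b)
```

The subquery form avoids repeating the CASE expression, which is the main practical difference between the two variants.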


### Part 2

In [1]:
import sqlite3
from sqlite3 import Error

 
def create_connection(db_file):
    """ create a database connection to the SQLite database
        specified by the db_file
    :param db_file: database file
    :return: Connection object or None
    """
    conn = None
    try:
        conn = sqlite3.connect(db_file)
        print(sqlite3.version)
    except Error as e:
        print(e)
 
    return conn

 
def select_all_tasks(conn):
    """
    Query all rows in the Facilities table and print them
    :param conn: the Connection object
    :return: None
    """
    cur = conn.cursor()
    
    query1 = """
        SELECT *
        FROM FACILITIES
        """
    cur.execute(query1)
 
    rows = cur.fetchall()
 
    for row in rows:
        print(row)


def main():
    database = "sqlite_db_pythonsqlite.db"
 
    # create a database connection
    conn = create_connection(database)
    with conn: 
        print("2. Query all tasks")
        select_all_tasks(conn)
 
 
if __name__ == '__main__':
    main()

2.6.0
2. Query all tasks
(0, 'Tennis Court 1', 5, 25, 10000, 200)
(1, 'Tennis Court 2', 5, 25, 8000, 200)
(2, 'Badminton Court', 0, 15.5, 4000, 50)
(3, 'Table Tennis', 0, 5, 320, 10)
(4, 'Massage Room 1', 9.9, 80, 4000, 3000)
(5, 'Massage Room 2', 9.9, 80, 4000, 3000)
(6, 'Squash Court', 3.5, 17.5, 5000, 80)
(7, 'Snooker Table', 0, 5, 450, 15)
(8, 'Pool Table', 0, 5, 400, 15)
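
One detail about the connection code above: in Python's `sqlite3` module, using a connection as a context manager (`with conn:`) commits or rolls back the open transaction, but it does not close the connection. A minimal sketch of one way to close it explicitly, using `contextlib.closing`, follows. The helper name `fetch_all` is hypothetical and not part of the case study, and the example runs against an in-memory database rather than `sqlite_db_pythonsqlite.db`.

```python
import sqlite3
from contextlib import closing


def fetch_all(db_file, query):
    """Run a query and return all rows, closing the connection afterwards.

    Hypothetical helper for illustration only.
    """
    # closing() calls conn.close() when the block exits, even on error.
    with closing(sqlite3.connect(db_file)) as conn:
        return conn.execute(query).fetchall()


# Ask SQLite itself for its library version.
rows = fetch_all(":memory:", "SELECT sqlite_version()")
print(rows[0][0])
```

The same value is available as `sqlite3.sqlite_version`, which reports the version of the underlying SQLite library.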


In [2]:
import pandas as pd
from datetime import datetime 

# Create the connection
db = sqlite3.connect('sqlite_db_pythonsqlite.db')

# create the dataframe from a query
bookings = pd.read_sql_query("SELECT * FROM bookings", db)
facilities = pd.read_sql_query("SELECT * FROM facilities", db)
members = pd.read_sql_query("SELECT * FROM members", db)

In [3]:
bookings.head()

Unnamed: 0,bookid,facid,memid,starttime,slots
0,0,3,1,2012-07-03 11:00:00,2
1,1,4,1,2012-07-03 08:00:00,2
2,2,6,0,2012-07-03 18:00:00,2
3,3,7,1,2012-07-03 19:00:00,2
4,4,8,1,2012-07-03 10:00:00,1


In [4]:
facilities.head()

Unnamed: 0,facid,name,membercost,guestcost,initialoutlay,monthlymaintenance
0,0,Tennis Court 1,5.0,25.0,10000,200
1,1,Tennis Court 2,5.0,25.0,8000,200
2,2,Badminton Court,0.0,15.5,4000,50
3,3,Table Tennis,0.0,5.0,320,10
4,4,Massage Room 1,9.9,80.0,4000,3000


In [5]:
members.head()

Unnamed: 0,memid,surname,firstname,address,zipcode,telephone,recommendedby,joindate
0,0,GUEST,GUEST,GUEST,0,(000) 000-0000,,2012-07-01 00:00:00
1,1,Smith,Darren,"8 Bloomsbury Close, Boston",4321,555-555-5555,,2012-07-02 12:02:05
2,2,Smith,Tracy,"8 Bloomsbury Close, New York",4321,555-555-5555,,2012-07-02 12:08:23
3,3,Rownam,Tim,"23 Highway Way, Boston",23423,(844) 693-0723,,2012-07-03 09:32:15
4,4,Joplette,Janice,"20 Crossing Road, New York",234,(833) 942-4710,1.0,2012-07-03 10:25:05


### /* Q10: 
Produce a list of facilities with a total revenue less than 1000.
Output the facility name and total revenue, sorted by revenue. Remember
that there's a different cost for guests and members! */


In [6]:
def total_revenue(facid):
    """Add the per-slot price once per booking for one facility (slots are not multiplied in)."""
    revenue = 0
    # Per-slot prices; the [facid] lookup relies on the row index matching facid
    member_cost = facilities.membercost[facilities.facid==facid][facid]
    guest_cost = facilities.guestcost[facilities.facid==facid][facid]
    fac_bookings = bookings[bookings.facid==facid]
    for i, row in fac_bookings.iterrows():
        # memid 0 is the guest account
        if row.memid > 0: revenue += member_cost
        else: revenue += guest_cost
    return revenue

In [7]:
ten = pd.DataFrame(columns=['facility name', 'total revenue'])
for i, row in facilities.iterrows():
    ten.loc[i] = [facilities.name[i],total_revenue(i)]

In [8]:
ten

Unnamed: 0,facility name,total revenue
0,Tennis Court 1,4040.0
1,Tennis Court 2,4205.0
2,Badminton Court,604.5
3,Table Tennis,90.0
4,Massage Room 1,20807.9
5,Massage Room 2,6987.3
6,Squash Court,4970.0
7,Snooker Table,115.0
8,Pool Table,265.0


In [9]:
ten[ten['total revenue'] < 1000].sort_values('total revenue')

Unnamed: 0,facility name,total revenue
3,Table Tennis,90.0
7,Snooker Table,115.0
8,Pool Table,265.0
2,Badminton Court,604.5
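
Since the listed costs are per half-hour slot, revenue can also be computed by multiplying each booking's price by its `slots` value and letting SQL do the aggregation. The sketch below shows that idea on made-up rows in an in-memory SQLite database. The facility names, prices and bookings are hypothetical, so the numbers say nothing about the real database.

```python
import sqlite3

# Hypothetical toy rows, NOT the real country_club data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Facilities (facid INTEGER, name TEXT, membercost REAL, guestcost REAL)")
conn.execute("CREATE TABLE Bookings (facid INTEGER, memid INTEGER, slots INTEGER)")
conn.executemany("INSERT INTO Facilities VALUES (?, ?, ?, ?)", [
    (0, "Court A", 5, 25),
    (1, "Table B", 0, 5),
    (2, "Table C", 2, 10),
])
conn.executemany("INSERT INTO Bookings VALUES (?, ?, ?)", [
    (0, 1, 100),  # member: 100 * 5  = 500
    (0, 0, 30),   # guest:   30 * 25 = 750
    (1, 2, 10),   # member:  10 * 0  = 0
    (1, 0, 8),    # guest:    8 * 5  = 40
    (2, 3, 5),    # member:   5 * 2  = 10
])

REVENUE_SQL = """
SELECT name, revenue
FROM (
    SELECT f.name AS name,
           SUM(b.slots * CASE WHEN b.memid = 0 THEN f.guestcost
                              ELSE f.membercost END) AS revenue
    FROM Bookings AS b
    JOIN Facilities AS f ON f.facid = b.facid
    GROUP BY f.facid, f.name
) AS totals
WHERE revenue < 1000
ORDER BY revenue
"""

low_revenue = conn.execute(REVENUE_SQL).fetchall()
for name, revenue in low_revenue:
    print(name, revenue)
```

With pandas, the same query string could be passed to `pd.read_sql_query` together with a connection to get a DataFrame instead of a list of tuples.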


### /* Q11: 
Produce a report of members and who recommended them in alphabetic surname,firstname order */

In [10]:
members.head()

Unnamed: 0,memid,surname,firstname,address,zipcode,telephone,recommendedby,joindate
0,0,GUEST,GUEST,GUEST,0,(000) 000-0000,,2012-07-01 00:00:00
1,1,Smith,Darren,"8 Bloomsbury Close, Boston",4321,555-555-5555,,2012-07-02 12:02:05
2,2,Smith,Tracy,"8 Bloomsbury Close, New York",4321,555-555-5555,,2012-07-02 12:08:23
3,3,Rownam,Tim,"23 Highway Way, Boston",23423,(844) 693-0723,,2012-07-03 09:32:15
4,4,Joplette,Janice,"20 Crossing Road, New York",234,(833) 942-4710,1.0,2012-07-03 10:25:05


In [11]:
eleven = pd.DataFrame(columns=['member name','recommender name'])
eleven

Unnamed: 0,member name,recommender name


In [12]:
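for i, row in members.iterrows():
    # Skip the GUEST placeholder row (memid 0)
    if i==0: continue
    # An empty recommendedby means nobody recommended this member
    if members.recommendedby[i]=='': recommender = None
    else: recommender = (members.surname[int(members.recommendedby[i])] + ', ' + members.firstname[int(members.recommendedby[i])])
    eleven.loc[i] = [(members.surname[i]+ ' '+members.firstname[i]), recommender]
# Display sorted by recommender name; members without a recommender sort last
eleven.sort_values(by=['recommender name'])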

Unnamed: 0,member name,recommender name
21,Sarwin Ramnaresh,"Bader, Florence"
20,Coplin Joan,"Baker, Timothy"
18,Genting Matthew,"Butters, Gerald"
23,Rumney Henrietta,"Coplin, Joan"
16,Baker Timothy,"Farrell, Jemima"
17,Pinker David,"Farrell, Jemima"
22,Jones Douglas,"Jones, David"
7,Dare Nancy,"Joplette, Janice"
11,Jones David,"Joplette, Janice"
8,Boothe Tim,"Rownam, Tim"


### /* Q12: 
Find the facilities and their usage by member, excluding guests */

In [16]:
twelve = bookings[bookings.memid > 0].groupby(['facid', 'memid'])[['bookid']].count()
twelve.index = twelve.index.set_levels(list(facilities.name), level=0)
twelve

Unnamed: 0_level_0,Unnamed: 1_level_0,bookid
facid,memid,Unnamed: 2_level_1
Tennis Court 1,2,30
Tennis Court 1,3,6
Tennis Court 1,4,19
Tennis Court 1,5,57
Tennis Court 1,6,31
...,...,...
Pool Table,27,3
Pool Table,28,25
Pool Table,29,33
Pool Table,30,5


### /* Q13: 
Find the facilities' usage by month, excluding guests */

In [18]:
bookings['month'] = bookings.starttime.apply(lambda x: x.split()[0].split('-')[1])
thirteen = bookings[bookings.memid > 0].groupby(['facid', 'month'])[['bookid']].count()
thirteen

Unnamed: 0_level_0,Unnamed: 1_level_0,bookid
facid,month,Unnamed: 2_level_1
0,7,65
0,8,111
0,9,132
1,7,41
1,8,109
1,9,126
2,7,51
2,8,132
2,9,161
3,7,48
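
The month can also be extracted inside the query with SQLite's `strftime` function instead of splitting the string in pandas. The sketch below assumes `starttime` is stored as text in `YYYY-MM-DD HH:MM:SS` form, as the `bookings.head()` output suggests, and runs on made-up rows in an in-memory database.

```python
import sqlite3

# Hypothetical toy rows, NOT the real country_club data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Facilities (facid INTEGER, name TEXT);
CREATE TABLE Bookings (facid INTEGER, memid INTEGER, starttime TEXT);
""")
conn.executemany("INSERT INTO Facilities VALUES (?, ?)",
                 [(0, "Tennis Court 1"), (8, "Pool Table")])
conn.executemany("INSERT INTO Bookings VALUES (?, ?, ?)", [
    (0, 1, "2012-07-03 11:00:00"),
    (0, 2, "2012-07-09 08:00:00"),
    (0, 1, "2012-08-01 18:00:00"),
    (0, 0, "2012-08-02 19:00:00"),  # guest booking, excluded below
    (8, 2, "2012-09-14 10:00:00"),
])

MONTHLY_SQL = """
SELECT f.name AS facility,
       strftime('%m', b.starttime) AS month,
       COUNT(*) AS bookings
FROM Bookings AS b
JOIN Facilities AS f ON f.facid = b.facid
WHERE b.memid != 0
GROUP BY f.facid, f.name, strftime('%m', b.starttime)
ORDER BY f.facid, month
"""

monthly = conn.execute(MONTHLY_SQL).fetchall()
for row in monthly:
    print(row)
```

Here `strftime('%m', ...)` returns the month as a two-digit string such as `'07'`, so the result sorts correctly as text.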
