Welcome to the SQL mini project. You will carry out this project partly in the PHPMyAdmin interface, and partly in Jupyter via a Python connection.

This is Tier 2 of the case study, which means that there'll be less guidance for you about how to set up your local SQLite connection in PART 2 of the case study. This will make the case study more challenging for you: you might need to do some digging, and revise the Working with Relational Databases in Python chapter in the previous resource.

Otherwise, the questions in the case study are exactly the same as with Tier 1. 

## PART 1: PHPMyAdmin

You will complete questions 1-9 below in the PHPMyAdmin interface. 

The data you need is in the "country_club" database. This database contains 3 tables:
    i) the "Bookings" table,
    ii) the "Facilities" table, and
    iii) the "Members" table.

In this case study, you'll be asked a series of questions. You can solve them using the platform, but for the final deliverable, paste the code for each solution into this script, and upload it to your GitHub.

Before starting with the questions, feel free to take your time exploring the data and getting acquainted with the 3 tables.

### QUESTIONS 

Q1: Some of the facilities charge a fee to members, but some do not. Write a SQL query to produce a list of the names of the facilities that do.

In [None]:
SELECT name
FROM Facilities
WHERE membercost > 0;

Q2: How many facilities do not charge a fee to members? 

In [None]:
SELECT COUNT(facid)
FROM Facilities
WHERE membercost = 0;

Q3: Write an SQL query to show a list of facilities that charge a fee to members, where the fee is less than 20% of the facility's monthly maintenance cost. Return the facid, facility name, member cost, and monthly maintenance of the facilities in question.

In [None]:
SELECT facid, name, membercost, monthlymaintenance
FROM Facilities
WHERE membercost > 0
AND membercost < 0.2 * monthlymaintenance;

Q4: Write an SQL query to retrieve the details of facilities with ID 1 and 5. Try writing the query without using the OR operator.

In [None]:
SELECT *
FROM Facilities
WHERE facid
IN (1, 5);

Q5: Produce a list of facilities, with each labelled as 'cheap' or 'expensive', depending on whether their monthly maintenance cost is more than $100. Return the name and monthly maintenance of the facilities in question.

In [None]:
SELECT name, monthlymaintenance,
CASE 
    WHEN monthlymaintenance >100
        THEN 'expensive'
    ELSE 'cheap'
    END 
    AS expense
FROM Facilities;

Q6: You'd like to get the first and last name of the last member(s) who signed up. Try not to use the LIMIT clause for your solution.

In [None]:
SELECT firstname, surname
FROM Members
WHERE joindate = (
    SELECT MAX(joindate)
    FROM Members
    );
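
The "member(s)" wording in Q6 matters: comparing `joindate` against `MAX(joindate)` returns every member tied for the latest date, which a `LIMIT 1` would not. The sketch below illustrates this with Python's built-in `sqlite3` module and an in-memory table. The three rows are made-up illustration data, not the country club records.

```python
import sqlite3

# Hypothetical toy data: two members share the latest join date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (firstname TEXT, surname TEXT, joindate TEXT)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [
        ("Ann", "Early", "2012-07-01 10:00:00"),
        ("Bob", "Late", "2012-09-26 18:08:45"),
        ("Cat", "Later", "2012-09-26 18:08:45"),
    ],
)

# The subquery finds the latest join date; the outer query keeps every match.
latest = conn.execute(
    """
    SELECT firstname, surname
    FROM Members
    WHERE joindate = (SELECT MAX(joindate) FROM Members)
    ORDER BY surname
    """
).fetchall()
print(latest)
conn.close()
```

Both of the tied members come back, so no sign-up is silently dropped.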

Q7: Produce a list of all members who have used a tennis court. Include in your output the name of the court, and the name of the member formatted as a single column. Ensure no duplicate data, and order by the member name.

In [None]:
SELECT DISTINCT(
    CONCAT(m.firstname,' ', m.surname,' ', 
           CASE 
           WHEN facid = 0 THEN 'Tennis Court 1' ELSE 'Tennis Court 2' END)) AS list
FROM Members AS m
LEFT JOIN Bookings AS b
ON m.memid = b.memid
WHERE facid IN (0,1)
ORDER BY list;

# Just reread the question and now I'm like... this was probably meant to be two columns
# not one column, but also am unsure, so I'm just gonna provide both, also, yes, I could
# join in the facilities table, and would if there were more than two options
# for the tennis court names, but that honestly seemed like a waste of processing power

SELECT 
    CONCAT(m.firstname,' ', m.surname) AS member_name, 
    CASE WHEN facid = 0 THEN 'Tennis Court 1' ELSE 'Tennis Court 2' END AS court
FROM Members AS m
LEFT JOIN Bookings AS b
ON m.memid = b.memid
WHERE facid IN (0,1)
GROUP BY member_name, court
ORDER BY member_name;

Q8: Produce a list of bookings on the day of 2012-09-14 which will cost the member (or guest) more than $30. Remember that guests have different costs to members (the listed costs are per half-hour 'slot'), and the guest user's ID is always 0. Include in your output the name of the facility, the name of the member formatted as a single column, and the cost. Order by descending cost, and do not use any subqueries.

In [None]:
SELECT f.name AS facility, CONCAT(m.firstname,' ', m.surname) AS member,
    CASE 
        WHEN b.memid = 0 THEN f.guestcost * b.slots
        ELSE f.membercost * b.slots
    END AS cost
FROM Bookings AS b
LEFT JOIN Facilities AS f
ON b.facid = f.facid
LEFT JOIN Members AS m
ON b.memid = m.memid
WHERE b.starttime LIKE '2012-09-14%' AND 
((b.memid = 0) AND (f.guestcost * b.slots > 30) OR
((b.memid != 0) AND (f.membercost * b.slots > 30)))
ORDER BY cost DESC;
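
The pricing rule in Q8 can be sanity-checked outside SQL. The sketch below restates it as a small Python function: the guest (member ID 0) pays `guestcost` per slot, everyone else pays `membercost` per slot. The function name `booking_cost` and the rates passed to it are hypothetical, chosen only for illustration.

```python
def booking_cost(memid: int, slots: int, membercost: float, guestcost: float) -> float:
    """Cost of one booking: the guest user (memid 0) pays the guest rate per slot."""
    rate = guestcost if memid == 0 else membercost
    return rate * slots


# Hypothetical rates, not taken from the Facilities table.
guest_total = booking_cost(memid=0, slots=3, membercost=5.0, guestcost=25.0)
member_total = booking_cost(memid=7, slots=3, membercost=5.0, guestcost=25.0)

# Only bookings costing more than 30 would pass the Q8 filter.
print(guest_total, guest_total > 30)
print(member_total, member_total > 30)
```

This mirrors the `CASE` expression in the query: the same three-slot booking clears the $30 threshold for the guest but not for a member.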

Q9: This time, produce the same result as in Q8, but using a subquery.

In [None]:
SELECT a.member_name AS 'Member Name', b.name AS Facility, 
    CASE 
        WHEN a.memid = 0 THEN b.guestcost * a.slots
        ELSE b.membercost * a.slots
    END AS cost
FROM (SELECT b.memid, b.facid, b.starttime, b.slots, m.member_name
FROM Bookings AS b
JOIN (SELECT CONCAT(firstname, ' ', surname) AS member_name, memid
FROM Members) AS m
ON m.memid = b.memid
WHERE starttime LIKE '2012-09-14%') AS a
LEFT JOIN Facilities as b
ON a.facid = b.facid
WHERE (member_name = 'GUEST GUEST' AND b.guestcost * a.slots > 30) 
    OR (member_name != 'GUEST GUEST' AND b.membercost * a.slots > 30)
ORDER BY cost DESC;

# yeah.... I had memid in the query and forgot, and didn't feel like going back in
# and changing the member name being the filter since it still works...

## PART 2: SQLite

Export the country club data from PHPMyAdmin, and connect to a local SQLite instance from a Jupyter notebook for the following questions.

In [1]:
from sqlalchemy import create_engine
import pandas as pd
# Create engine: engine
engine = create_engine('sqlite:///sqlite_db_pythonsqlite.db')

In [2]:
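# Check that everything has been imported correctly.
# engine.table_names() was removed in SQLAlchemy 2.0; the inspector works in both 1.x and 2.x.
from sqlalchemy import inspect
table_names = inspect(engine).get_table_names()
# Print the table names to the shell
print(table_names)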

['Bookings', 'Facilities', 'Members']
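
If SQLAlchemy is not available, the same check can be sketched with the standard-library `sqlite3` module by querying SQLite's `sqlite_master` catalog. The snippet below builds an in-memory database with empty stand-in tables so that it runs on its own. With the real export you would presumably connect to `sqlite_db_pythonsqlite.db` instead.

```python
import sqlite3

# Stand-in database: empty tables with the same names as the case study.
conn = sqlite3.connect(":memory:")
for table in ("Bookings", "Facilities", "Members"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER)")

# sqlite_master lists every schema object; keep only the tables.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
stdlib_table_names = [name for (name,) in rows]
print(stdlib_table_names)
conn.close()
```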


### QUESTIONS:

Q10: Produce a list of facilities with a total revenue less than 1000. The output should contain the facility name and total revenue, sorted by revenue. Remember that there's a different cost for guests and members!

In [3]:
df = pd.read_sql_query("""
    SELECT f.name,
           SUM(CASE WHEN b.memid = 0 THEN b.slots * f.guestcost
                    ELSE b.slots * f.membercost END) AS total_cost
    FROM Bookings AS b
    JOIN Facilities AS f ON b.facid = f.facid
    GROUP BY f.name
    """, engine)
# Keep only facilities with total revenue below 1000, as the question asks
dr = df.loc[df['total_cost'] < 1000]
dr = dr.sort_values(by = 'total_cost')

dr.head()
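
The revenue threshold can also be applied inside SQL with a `HAVING` clause, which filters on the aggregated value after grouping. A minimal sketch on made-up rows, using the standard-library `sqlite3` module: the facility names, rates, and bookings below are hypothetical and only mimic the shape of the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Facilities (facid INTEGER, name TEXT, membercost REAL, guestcost REAL);
    CREATE TABLE Bookings (facid INTEGER, memid INTEGER, slots INTEGER);
    INSERT INTO Facilities VALUES (0, 'Court A', 5, 25), (1, 'Table B', 0, 5);
    INSERT INTO Bookings VALUES (0, 0, 40), (0, 3, 10), (1, 0, 6), (1, 4, 2);
    """
)

# Court A: 40 * 25 + 10 * 5 = 1050; Table B: 6 * 5 + 2 * 0 = 30.
low_revenue = conn.execute(
    """
    SELECT f.name,
           SUM(CASE WHEN b.memid = 0 THEN b.slots * f.guestcost
                    ELSE b.slots * f.membercost END) AS revenue
    FROM Bookings AS b
    JOIN Facilities AS f ON b.facid = f.facid
    GROUP BY f.name
    HAVING revenue < 1000
    ORDER BY revenue
    """
).fetchall()
print(low_revenue)
conn.close()
```

Filtering in SQL keeps the whole answer in one query, while filtering in pandas makes it easy to inspect the unfiltered totals first.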


Q11: Produce a report of members and who recommended them in alphabetic surname, firstname order

In [4]:
# Self-join: m is the member, r is the member who recommended them.
# The recommendedby filter belongs on the member row, and the ordering is done by surname, then firstname.
recs = pd.read_sql_query("""
    SELECT m.firstname || ' ' || m.surname AS member_name,
           r.firstname || ' ' || r.surname AS recommender
    FROM Members AS m
    JOIN Members AS r ON m.recommendedby = r.memid
    WHERE m.recommendedby > 0
    ORDER BY m.surname, m.firstname
    """, engine)
recs.head(10)
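
Joining `Members` to itself is one way to pair each member with their recommender while sorting on the member's surname and first name. A minimal sketch using the standard-library `sqlite3` module. The three members below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (memid INTEGER, firstname TEXT, surname TEXT, recommendedby INTEGER)"
)
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?, ?)",
    [
        (1, "Dana", "Young", None),  # joined without a recommendation
        (2, "Eli", "Adams", 1),      # recommended by Dana
        (3, "Fay", "Brown", 2),      # recommended by Eli
    ],
)

# m is the member, r is the row of the member who recommended them.
pairs = conn.execute(
    """
    SELECT m.firstname || ' ' || m.surname AS member,
           r.firstname || ' ' || r.surname AS recommender
    FROM Members AS m
    JOIN Members AS r ON m.recommendedby = r.memid
    ORDER BY m.surname, m.firstname
    """
).fetchall()
print(pairs)
conn.close()
```

Dana still appears as a recommender even though nobody recommended her, because the join places no condition on the recommender's own `recommendedby` value.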


Q12: Find the facilities with their usage by member, but not guests

In [5]:
member_use = pd.read_sql_query("SELECT f.name, COUNT(b.bookid) AS member_uses FROM Bookings AS b JOIN Facilities AS f ON b.facid = f.facid WHERE memid != 0 GROUP BY f.name", engine)
member_use.head(10)

Unnamed: 0,name,member_uses
0,Badminton Court,344
1,Massage Room 1,421
2,Massage Room 2,27
3,Pool Table,783
4,Snooker Table,421
5,Squash Court,195
6,Table Tennis,385
7,Tennis Court 1,308
8,Tennis Court 2,276


Q13: Find the facilities usage by month, but not guests

In [6]:
# Exclude guest bookings (memid 0), as the question asks
month_use = pd.read_sql_query("""
    SELECT strftime('%m', b.starttime) AS month, COUNT(b.bookid) AS uses, f.name
    FROM Bookings AS b
    JOIN Facilities AS f ON b.facid = f.facid
    WHERE b.memid != 0
    GROUP BY month, f.name
    """, engine)
month_use.head(10)