Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = ""
COLLABORATORS = ""

---

## Set User Credentials

When the data lives on a shared provider such as a MySQL RDBMS, we need credentials to authenticate ourselves to the server, and we need the logical location of the server itself.

For these notebooks, this information is kept in a text file named `creds.json`, stored either in the same directory as the notebook or in a data directory.  For this notebook, it is stored in the same directory as the notebook.

- Right click on the `creds.json` file and select *Open With*->*Editor*
- Replace the value for the `"user"` key in the `"mysql"` dictionary (currently `"nostudent"`) with the base part of your email address (i.e., without the `@denison.edu`).  Your password on the MySQL server is, at present, the same as your user name, so change the `"pass"` value from `"nostudent"` as well.  The server should already be correct, mapped to `"hadoop2.mathsci.denison.edu"`. Likewise, the scheme should already be correct, mapped to `"mysql+mysqlconnector"`.

**Make sure to use double quotes for strings** ... this is `JSON`, not Python, and we have to follow JSON syntax.
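
As a small illustration of the double-quote rule, the sketch below parses two strings with Python's `json` module. The file contents shown are hypothetical stand-ins, not the real `creds.json`; the point is only that the single-quoted version is rejected.

```python
import json

# Hypothetical contents for illustration; the real values live in creds.json.
valid_text = '{"mysql": {"user": "nostudent", "pass": "nostudent"}}'
creds = json.loads(valid_text)
print(creds["mysql"]["user"])

# The same structure written with Python-style single quotes is not JSON.
invalid_text = "{'mysql': {'user': 'nostudent'}}"
try:
    json.loads(invalid_text)
    parsed_ok = True
except json.JSONDecodeError as err:
    parsed_ok = False
    print("Not valid JSON:", err.msg)
```

If the cell that reads `creds.json` raises a `JSONDecodeError`, a stray single quote or a trailing comma is a likely cause.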

Once this is complete, execute the following cell to read the credentials that will be used to connect to the database with SQLAlchemy. If you are off campus, you will need to connect through a VPN first.

In [None]:
import pandas as pd
import os
import os.path
import json
import sqlalchemy as sa

def getmysql_creds(dirname=".", filename="creds.json"):
    """ Using directory and filename parameters, open a credentials file
        and obtain the five parts needed for a connection string to
        a remote provider using the "mysql" dictionary within
        an outer dictionary.  
        
        Return a scheme, server, user, password, and database
    """
    assert os.path.isfile(os.path.join(dirname, filename))
    with open(os.path.join(dirname, filename)) as f:
        D = json.load(f)
    mysql = D["mysql"]
    return mysql["scheme"], mysql["server"], mysql["user"], mysql["pass"], mysql["database"]

scheme, server, user, password, database = getmysql_creds()

**Q1** In the prior cell, we use the provided function `getmysql_creds` to obtain the five pieces of information needed to establish a connection.  Write the lines of code needed to create a connection string, to create the `engine` object, and to create the `connection` object.  We have given you the Python template string.
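
To see how the five `{}` slots in the template line up, here is a minimal sketch that fills it with made-up placeholder values. The user, password, host, and database names below are hypothetical and are not the course's credentials; creating the engine and connection is left to you.

```python
template = '{}://{}:{}@{}/{}'

# All five values are made-up placeholders, not real credentials.
example = template.format("mysql+mysqlconnector", "someuser", "somepass",
                          "db.example.edu", "somedb")
print(example)
```

The slots read, in order, as scheme, user, password, server, and database, which differs from the order in which `getmysql_creds` returns them.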

In [None]:
template = '{}://{}:{}@{}/{}'
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
assert isinstance(connection, sa.engine.base.Connection)

**Q2** Write a query to obtain all information about departments in the Fine Arts division of the college, and use the connection object to make the request.  The result should be a list of tuples assigned to the Python variable `tuple_list`.

In [None]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# Testing cell

assert isinstance(tuple_list, list)
assert len(tuple_list) == 5

**Q3** Make the following assignments to variables associated with the results of the last two questions:

- `col_list`: the names of the columns 
- `first_dept`: the id of the first department result returned
- `third_deptname`: the *name* of the department for the third result returned.

In [None]:
# Solution cell

col_list = None
first_dept = None
third_deptname = None

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# Testing cell

assert len(col_list) == 4
assert 'departmentchair' in col_list
assert first_dept == 'ART'
assert third_deptname == 'Dance'

**Q4** Write a function

    fetch_query(dbcon, query)
    
that submits the given `query`, a Python string, to the given database `dbcon` and returns a list of row tuples of the result.  Your function should **not** use `fetchall` and **must** use a `for` or a `while` loop to accumulate the results.  The database-programming information for successfully completing this question comes from your reading in 13.3.
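
The general shape of an accumulation loop over a result cursor can be sketched with Python's built-in `sqlite3` module standing in for the MySQL connection. This is an assumption made only so the example is self-contained: the `demo` table and its rows are hypothetical, and the SQLAlchemy result object from your reading may differ in detail.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and rows are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (id TEXT, name TEXT)")
con.executemany("INSERT INTO demo VALUES (?, ?)",
                [("A", "Alpha"), ("B", "Beta"), ("C", "Gamma")])

cursor = con.execute("SELECT id, name FROM demo ORDER BY id")
rows = []
for row in cursor:      # the cursor yields one row tuple per iteration
    rows.append(row)

print(rows)
con.close()
```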

In [None]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
query1 = """
SELECT D.departmentid, departmentname, 
       instructorlast || ', ' || instructorfirst AS chairname
FROM departments AS D LEFT JOIN instructors AS I
     ON D.departmentchair = I.instructorid
WHERE division = 'Natural Sciences'
ORDER BY D.departmentid
"""

dlist = fetch_query(connection, query1)
assert len(dlist) == 6
assert dlist[3]['departmentid'] == 'MATH'

**Q4** Repeat the last exercise in writing a generic function

    fetch_query(dbcon, query)
    
that submits the given `query`, a Python string, to the given database `dbcon`.  But this time, construct and return a `pandas` DataFrame with the result.  You still need to use an accumulation loop.  You should be able to just append a few lines to your existing function to get this done.

In [None]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
query1 = """
SELECT D.departmentid, departmentname, 
       instructorlast || ', ' || instructorfirst AS chairname
FROM departments AS D LEFT JOIN instructors AS I
     ON D.departmentchair = I.instructorid
WHERE division = 'Natural Sciences'
ORDER BY D.departmentid
"""

df = fetch_query(connection, query1)
assert len(df) == 6
assert isinstance(df, pd.core.frame.DataFrame)
assert df.iloc[3]['departmentid'] == 'MATH'

**Q5** Assign to the variable `query` an SQL statement that yields the last name, first name, and major for the students whose last names range from `B` up to and including `Bz`.

In [None]:
# Solution cell

query = """
"""
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
results = connection.execute(query).fetchall()
assert len(results) == 194

**Q6** Write a function

    getStudentsInNameRange(dbcon, namefrom, nameto)
    
that abstracts the query from the last question: it retrieves the `studentlast`, `studentfirst`, and `studentmajor` fields from the students table for rows whose last names fall between the given parameters `namefrom` and `nameto`.

Note that, in the solution cell and in the asserts, `namefrom` and `nameto` are ordinary Python strings.  The caller **should not** need any knowledge of SQL: neither how the SQL is written inside the function nor how SQL syntax represents string constants.

Your return should be a `pandas` DataFrame.

Do not undervalue this last question.  The culmination of understanding how to work with a database should be functions that abstract away the details of the SQL, so that the higher level steps of a client application clearly convey the steps and not the underlying (SQL) details.
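
One common way to keep SQL string syntax away from the caller is to bind parameters rather than paste values into the query text. The sketch below shows the idea with the built-in `sqlite3` module and a hypothetical `people` table; the helper name `names_in_range` is made up for illustration, and it returns a plain list rather than the DataFrame this question asks for.

```python
import sqlite3

def names_in_range(con, low, high):
    """Hypothetical helper: rows (last, first) with low <= last <= high."""
    # The ? placeholders are filled in by the driver, so the caller
    # passes ordinary Python strings with no SQL quoting.
    sql = ("SELECT last, first FROM people "
           "WHERE last BETWEEN ? AND ? ORDER BY last")
    return con.execute(sql, (low, high)).fetchall()

# Made-up data in an in-memory database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE people (last TEXT, first TEXT)")
con.executemany("INSERT INTO people VALUES (?, ?)",
                [("Adams", "Ann"), ("Baker", "Bo"),
                 ("Byrd", "Cy"), ("Cole", "Di")])

matches = names_in_range(con, "B", "Bz")
print(matches)
con.close()
```

Placeholder syntax varies by driver; with SQLAlchemy, named parameters such as `:low` inside `sa.text(...)` play the same role as the `?` marks here.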

In [None]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
namefrom = """B"""
nameto = """Bz"""
resultdf = getStudentsInNameRange(connection, namefrom, nameto)
assert resultdf.shape == (194,3)