Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = ""
COLLABORATORS = ""

---

In [None]:
import pandas as pd
import os
import os.path
import json

In [None]:
def getmysql_creds(dirname=".", filename="creds.json"):
    """ Open the credentials file given by the directory and filename
        parameters and read the four parts needed for a connection
        string to a remote provider from the "mysql" dictionary
        nested within the outer dictionary.

        Return a scheme, server, user, and password
    """
    path = os.path.join(dirname, filename)
    # Fail early, with a readable message, if the file is not where we expect
    assert os.path.isfile(path), "credentials file not found: " + path
    with open(path) as f:
        D = json.load(f)
    mysql = D["mysql"]
    return mysql["scheme"], mysql["server"], mysql["user"], mysql["pass"]

## Set User Credentials

With a shared resource at a provider like a MySQL RDBMS, we need to use credentials to authenticate ourselves to the server, and need the logical location of the server itself.

For these notebooks, the credentials are kept in a text file named `creds.json`, stored either in the same directory as the notebook or in a data directory.  For this notebook, the file is in the same directory as the notebook.

- Right click on the `creds.json` file and select *Open With*->*Editor*
- In the `"mysql"` dictionary, replace the value of the `"user"` key (currently `"nostudent"`) with the base part of your email address (i.e. without the `.denison.edu`).  Your password on the MySQL server is, at present, the same as your user name, so change the value of `"pass"` from `"nostudent"` as well.  The server should already be correct, set to `"hadoop2.mathsci.denison.edu"`. Likewise, the scheme should already be correct, set to `"mysql+mysqlconnector"`.

**Make sure to use double quotes for strings** ... this is `JSON`, not Python, and we have to follow JSON syntax.
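As a rough illustration of the double-quote rule, the sketch below parses two strings with Python's `json` module. The layout of `valid_text` is inferred from the keys that `getmysql_creds` reads (`"scheme"`, `"server"`, `"user"`, `"pass"` inside `"mysql"`); your actual `creds.json` may contain other entries as well. The variable names are made up for this example.

```python
import json

# Assumed layout, inferred from the keys that getmysql_creds reads.
# The values are the placeholders mentioned above, not real credentials.
valid_text = '''
{
    "mysql": {
        "scheme": "mysql+mysqlconnector",
        "server": "hadoop2.mathsci.denison.edu",
        "user": "nostudent",
        "pass": "nostudent"
    }
}
'''

creds = json.loads(valid_text)          # parses: every string uses double quotes
example_user = creds["mysql"]["user"]

# Python-style single quotes are not valid JSON and make the parser fail.
invalid_text = "{'mysql': {'user': 'nostudent'}}"
try:
    json.loads(invalid_text)
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

print(example_user, single_quotes_ok)
```

If your edited file triggers a `JSONDecodeError` when the credentials are loaded, single quotes or a trailing comma are the usual suspects.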

Once this is complete, execute the following cell to get these values into global variables.

In [None]:
scheme, server, user, password = getmysql_creds()

## Package Double Check

If you followed all the instructions at the initial install of Anaconda on your computer, these packages should already be installed on your machine.  

1. Check for what SQL-related packages are installed:  Click the `+` in the upper left of Jupyter Lab and open a Terminal.  At the command prompt, type the following command:

    `$ conda list | grep sql`

    I get the following:
    ```
    ipython-sql               0.3.9                    pypi_0    pypi
    mysql-connector-c         6.1.11               hccea1a4_0  
    mysql-connector-python    8.0.18           py37h3febbb0_1  
    sqlalchemy                1.3.9            py37h1de35cc_0  
    sqlite                    3.30.1               ha441bb4_0  
    sqlparse                  0.3.0                    pypi_0    pypi
    ```

2. If `mysql-connector-python` is not present:

    `$ conda install mysql-connector-python`
    
    Accept the defaults and install.
    
3. If `ipython-sql`  is not present:

    `$ pip install ipython-sql`
    
    Accept the defaults and install.

## Establish Connection to Server from This Notebook Client

### Connection String

A connection string rolls together the scheme, the user, the password, and the server.  At this point, it is still just a string that we have constructed in Python and named `cstring`; no connection has been made yet.

In [None]:
template = '{}://{}:{}@{}/'
cstring = template.format(scheme, user, password, server)
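To see what the finished string looks like, here is a small sketch that fills the same template with placeholder values. All `example_*` names and the password are invented for illustration. The sketch also percent-encodes the user and password with `urllib.parse.quote_plus`, which is an extra step not used in the cell above; it only matters if a password contains characters such as `@` or `/` that would otherwise be read as part of the URL structure.

```python
from urllib.parse import quote_plus

# Placeholder values for illustration; the real ones come from creds.json.
example_scheme = "mysql+mysqlconnector"
example_server = "hadoop2.mathsci.denison.edu"
example_user = "nostudent"
example_password = "p@ss/word"   # made-up password with reserved characters

example_template = "{}://{}:{}@{}/"
example_cstring = example_template.format(
    example_scheme,
    quote_plus(example_user),      # unchanged here, encoded if it had special characters
    quote_plus(example_password),  # "@" becomes %40 and "/" becomes %2F
    example_server,
)
print(example_cstring)
```

For a password made only of letters and digits, as in this course, the encoded and unencoded strings are identical.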

### Load Notebook Extension to Enable "SQL Magic"

This extension is provided by the `ipython-sql` package installed previously, and allows three forms of SQL within notebooks (in code cells, not markdown cells):

1. `%%sql`: Magic that makes an entire multi-line cell be interpreted as an SQL statement to be sent as a request to a connected server.
2. Single line `%sql`: Magic that allows a single-line SQL statement, placed in its entirety to the right of the `%sql`, to be the only thing in the cell.
3. Embedded `%sql`: Magic that allows the SQL *following the `%sql`* to be executed as part of a larger Python code block.  This allows SQL requests to be combined with Python code and variables.

**Execute the following cell to load the sql-magics extension**

> Note that if this cell is executed a second time within the same kernel, it will not reload.  Normally this is fine, unless the sql magics module has changed.

In [None]:
%load_ext sql

### Establish Connection from Client to Server

The cell below uses the `cstring` variable constructed earlier to actually establish a connection with the server.  If you get a `'Connected'` message when you execute this cell, you are ready to proceed.  If not, the most likely problem is the JSON in your credentials file.  Much less likely is a problem with the user that was created on the MySQL server.

If you cannot get a connection established, switch over to MySQL Workbench and try from that different client.

In [None]:
%sql $cstring
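If the connection fails, a quick check of the credentials file can rule out the most likely cause. The helper below is a hypothetical sketch, not part of the course materials: `check_creds_text` and `REQUIRED_KEYS` are names invented here, and the list of keys is simply the one that `getmysql_creds` reads.

```python
import json

# Keys that getmysql_creds reads from the "mysql" dictionary.
REQUIRED_KEYS = ("scheme", "server", "user", "pass")

def check_creds_text(text):
    """Return a list of readable problems found in creds.json content."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return ["not valid JSON: {}".format(err)]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    mysql = data.get("mysql")
    if not isinstance(mysql, dict):
        return ['missing a "mysql" dictionary']
    problems = []
    for key in REQUIRED_KEYS:
        value = mysql.get(key)
        # Each entry should be a non-empty string
        if not isinstance(value, str) or not value:
            problems.append('"{}" is missing or empty'.format(key))
    return problems

# Example with deliberately incomplete content
print(check_creds_text('{"mysql": {"scheme": "mysql+mysqlconnector"}}'))
```

To check your own file, read it with `open("creds.json").read()` and pass the text to `check_creds_text`; an empty list means the structure looks right, though it says nothing about whether the user and password are accepted by the server.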

### Tell the Provider the Default Database

In the next cell, we use the multi-line `%%sql` magic to issue our first real SQL statement: one that sets the default database for future interactions.  In this case, we want to use the `book` database, which includes many of the tables we have seen and used in the last few chapters.

The expected result is an echo of the connection string followed by an empty return, reported as 0 rows and an empty list.  This is because a `USE` statement produces no rows to give back as a result.

In [None]:
%%sql
USE book

## Focus on Projecting Columns

1. Project the `pop` column from the `indicators0` table.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

2. Project the `name` column from the `topnames` table.

    - Then amend the query to limit the results to at most 8 rows.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

3. Project the `year`, `sex`, and `name` columns from the `topnames` table, limiting the result to at most 7 rows.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

4. Project the `code`, `pop`, and `gdp` fields of `indicators0`, but, in the result, name the second field `Population` and the third field `GDP`.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

5. Project all columns (by wildcard) from the `indicators0` table.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

6. Project all columns from the `topnames` table such that the result starts with the most recent year, working backwards toward the oldest year.  You can limit the result to 10 records.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

7. Project all columns from the `topnames` table such that the result starts with the most recent year, working backwards toward the oldest year, and so that `'Female'` occurs before `'Male'` within each year.  You can limit the result to 8 records.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

## Focus on Selecting and Filtering Rows

In this next section, we explore how, in SQL, we create subsets of the *rows* (or *records*) of the data requested from tables.

1. Start by changing the default database to the `nycflights13` schema.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

2. Project the `tzone` column from the `airports` table.
    - Then amend your query to filter based on uniqueness

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

3. Project unique `manufacturer` and `engine` combinations from the `planes` table.  Is this a different result than the unique manufacturers?  Why or why not?
    - Amend your query to sort by manufacturer

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

4. As a way of seeing how to deal with two databases *without* changing the default, use the database name *and* table name to get the distinct `income` categories from the `countries` table of the `book` database.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

5. Project all columns from the `indicators1` view in the `book` database where the population is greater than 1000.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

6. Project `carrier` and `flight` for those distinct records in the `flights` table where the departure delay is more than 600 minutes.  How many records do we get if we do not eliminate duplicates?

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

7. Same question as before, but also restrict to records with *either* American Airlines ('AA') or Delta ('DL') as the carrier.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

8. Find the plane models that have a seat capacity greater than or equal to 140 and less than or equal to 149. Use the SQL syntax for specifying a range.
    - How would you change your query to see whether the range is inclusive or exclusive at the two ends?

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

9. Repeat question 7, but use the SQL *Set Inclusion* condition.  Then extend the query to include Frontier ('F9').

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

10. Find the airports that have the string "Island" anywhere in their name.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()