# Denison CS181/DA210 SW Lab #10 - Step 1

Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says `# YOUR CODE HERE` or "YOUR ANSWER HERE".

---

In [None]:
import pandas as pd
import os
import os.path
import json

In [None]:
def getsqlite_creds(dirname=".", filename="creds.json"):
    """ Using directory and filename parameters, open a credentials file
        and obtain the three parts needed for a connection string to
        a local provider using the "sqlite" dictionary within
        an outer dictionary.

        Return a scheme, a dbdir, and a database name
    """
    # Build the path once and fail early if the file is missing
    path = os.path.join(dirname, filename)
    assert os.path.isfile(path), "credentials file not found: " + path
    with open(path) as f:
        D = json.load(f)
    sqlite = D["sqlite"]
    return sqlite["scheme"], sqlite["dbdir"], sqlite["database"]
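
For reference, here is a minimal sketch of the kind of `creds.json` file that `getsqlite_creds` expects. The key names come from the function above; the values (`sqlite`, `./dbfiles`, `book`) are assumptions for illustration, and the file is written to a temporary directory so that your real credentials file is not touched.

```python
import json
import os
import tempfile

# Hypothetical credentials: the key names match getsqlite_creds,
# but the values are assumptions for illustration only.
sample_creds = {
    "sqlite": {
        "scheme": "sqlite",
        "dbdir": "./dbfiles",
        "database": "book",
    }
}

# Write the sample to a temporary directory so no real file is overwritten.
tmpdir = tempfile.mkdtemp()
creds_path = os.path.join(tmpdir, "creds.json")
with open(creds_path, "w") as f:
    json.dump(sample_creds, f, indent=4)

# Read it back the same way getsqlite_creds does.
with open(creds_path) as f:
    D = json.load(f)
sqlite_section = D["sqlite"]
print(sqlite_section["scheme"], sqlite_section["dbdir"], sqlite_section["database"])
```

Your own `creds.json` may contain other top-level dictionaries as well; only the `"sqlite"` entry is read here.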

---

## Part A: Working with Independent Client(s)

#### Download/install client(s)

Working with databases gives us the opportunity to use an independent client as a tool to look at the tables of a database and even to create and try out SQL queries.  This independent route lets us separate the Python programming that we do in Jupyter notebooks from the problem solving involved in writing data queries.

There are several independent clients you could use, depending on the type of relational database you're working with.  For now, I'll use SQLiteStudio:
- **SQLiteStudio** is cross-platform (PC, Mac, Linux) and open source, and works with SQLite databases.  This is the type of database we'll work with on our local machines.
    - Link: https://sqlitestudio.pl/
    - Works with: local SQLite file-based databases

Other choices include:
- **Table Plus**, a cross-platform (PC, Mac, Linux) client with a free trial version.  The trial version only allows two open tabs and/or two database connections, but can be sufficient for many "check it out" scenarios.  
    - Link: https://tableplus.com/
    - Works with: both remote MySQL databases as well as local SQLite file-based databases
- **MySQL Workbench**, another cross-platform client application that is more full-featured than Table Plus, but is specific to remote MySQL databases.
    - Link: https://www.mysql.com/products/workbench/
    - Works with: remote MySQL databases
    
Go ahead and download at least SQLiteStudio on your own machine.  You need not install this program; instead, you can run it directly from the folder you download.

We will not spend time configuring these now, or trying to connect to remote MySQL databases.  For now, we'll focus on local SQLite databases.  You should take note of the information needed to connect to the SQLite databases in the cells below, as this same information will be used to configure connection setups in independent clients.

#### Explore a SQLite database

You can use SQLiteStudio to visualize the schema and data of local SQLite file-based databases.  Use these steps:

1. Open `SQLiteStudio`.

2. Go to `Database` -> `Add a database`.

3. Click the folder icon and navigate to your class repository -> `dbfiles` -> `book.db`.

4. Leave the "Name (on the list)" setting as `book` and the "Database type" as `SQLite 3`, then click `OK`.

> You've reached the first checkpoint in the lab.  Make sure to have it signed off by the instructor or TA.
>
> Checkpoint 1: In the far left pane, expand `Tables` and double-click on `indicators`.  Based on the `Structure` tab (which should be the default view), how many fields are there in this table?  Which field(s) form the primary key?
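
The same structural information that SQLiteStudio shows can also be read programmatically. The sketch below uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` on a small in-memory table. The table name `demo` and its columns are hypothetical, chosen only for illustration; they are not the schema of `indicators`.

```python
import sqlite3

# An in-memory database with a hypothetical table (not the lab's schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE demo (
        code  TEXT,
        year  INTEGER,
        value REAL,
        PRIMARY KEY (code, year)
    )
""")

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
# pk is 0 for non-key columns, otherwise the 1-based position in the key.
rows = conn.execute("PRAGMA table_info(demo)").fetchall()
field_names = [r[1] for r in rows]
pk_fields = [r[1] for r in sorted(rows, key=lambda r: r[5]) if r[5] > 0]

print("Fields:", field_names)
print("Primary key:", pk_fields)
conn.close()
```

To try this against a real file, you could replace `":memory:"` with the path to a `.db` file and skip the `CREATE TABLE` step.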

---

## Part B: SQL and Jupyter Notebooks

#### Get SQL-Related Packages

We need to install some SQL-related packages.

1. Use `pip` to install `ipython-sql` and its dependencies (don't type the `$` -- that's your prompt):

    `$ pip install ipython-sql`

    Accept the defaults and install.

    Mine generates lots of lines of text, with this at the end:

    ```
    Installing collected packages: sqlparse, prettytable, ipython-sql
    Successfully installed ipython-sql-0.4.0 prettytable-0.7.2 sqlparse-0.4.2
    ```
    
    &nbsp;

2. Verify that the SQL-related packages are installed:  If you have a Mac, click the `+` in the upper left of Jupyter Lab and open a Terminal.  If you are using Windows, open the `Anaconda Powershell Prompt` program.  In Linux, this may just work; if not, ask for help!

    At the command prompt, type the following command (but without the dollar sign):

    `$ conda list | grep sql`

    I get the following (you may have other packages, but you should at least have these):
    ```
    ipython-sql               0.4.0                    pypi_0    pypi
    sqlalchemy                1.4.22           py39h2bbff1b_0  
    sqlite                    3.36.0               h2bbff1b_0  
    sqlparse                  0.4.2                    pypi_0    pypi
    ```

#### Load notebook extension to enable "SQL Magic"

This extension is provided by the `ipython-sql` package installed previously, and allows three forms of SQL within notebooks (in code cells, not markdown cells):

1. `%%sql`: Magic to make an entire multi-line cell be interpreted as SQL and sent as a request to the connected server.
2. Single-line `%sql`: Magic for a single line of SQL, placed in its entirety to the right of the `%sql`, as the only thing in the cell.
3. Embedded `%sql`: Magic to allow the SQL command *following the `%sql`* to be executed as part of a larger Python code block.  This allows Python code and variables to be integrated with SQL requests.

**Execute the following cell to load the sql-magics extension**

_Note that if this cell is executed a second time within the same kernel, it will not reload.  Normally this is fine, unless the sql magics module has changed._

In [None]:
%load_ext sql

---

## Part C: Connect to a Local SQLite Database

#### Connection String

For a local database file, like those used by SQLite for `book.db` and `school.db` in the `dbfiles` directory, a connection string rolls together the information about the scheme, the directory holding the database files, and the name of the database.  At this point, it is just a string that we construct in Python, named `cstring`.

In [None]:
scheme, dbdir, database = getsqlite_creds()
template = '{}:///{}/{}.db'
cstring = template.format(scheme, dbdir, database)
print("Connection string:", cstring)
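
To see what the template produces, here is a small sketch with assumed credential values (`sqlite`, `./dbfiles`, and `book` are illustrative; your `creds.json` may differ). It also splits the string back apart to show which portion names the database file on disk.

```python
# Assumed values for illustration; in the lab these come from getsqlite_creds().
scheme, dbdir, database = "sqlite", "./dbfiles", "book"

template = '{}:///{}/{}.db'
cstring = template.format(scheme, dbdir, database)
print("Connection string:", cstring)

# Everything after the ':///' separator is the path to the database file.
dbfile = cstring.split(":///", 1)[1]
print("Database file:", dbfile)
```

With these assumed values, the file portion is a relative path, so it is resolved relative to the directory the notebook is running in.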

#### Establish Connection from Client to Server

The cell below uses the Python variable constructed earlier to actually establish a connection with the server (for SQLite, there is no formal server process, as SQLite works directly with ordinary files on the disk).  We'll maintain the illusion of a server, however, for our mental model of relational databases.

In [None]:
%sql $cstring

#### Make a Query

Try executing the following line, which uses an "inline" SQL magic to execute a query against the database and retrieve the result.

In [None]:
%sql SELECT * from indicators0

> You've reached the second checkpoint in the lab.  Make sure to have it signed off by the instructor or TA.
>
> Checkpoint 2: Based on the result of your query in the previous cell, how many fields are present in the `indicators0` table?  How many records are there?
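
If you want to double-check a count of fields and records without the SQL magic, one option is Python's built-in `sqlite3` module. The sketch below runs against a hypothetical in-memory table named `demo` with made-up rows, so it does not reveal anything about `indicators0`; it only shows the technique.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (code TEXT, year INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [("A", 2000, 1.5), ("A", 2001, 2.5), ("B", 2000, 3.0), ("B", 2001, 4.0)],
)

cur = conn.execute("SELECT * FROM demo")
# cursor.description holds one entry per result column; the name is item 0.
column_names = [d[0] for d in cur.description]
records = cur.fetchall()

print("Number of fields:", len(column_names))
print("Number of records:", len(records))
conn.close()
```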