# Introduction to the SQLite Command Prompt
---

* There are many ways to access an SQLite database:
    * Through Python's `sqlite3` module and Pandas (we'll talk about this next week)
    * Through Python with an ORM like [SQLAlchemy](https://www.sqlalchemy.org/)
    * Via a GUI Application like [DB Browser for SQLite](https://sqlitebrowser.org/)
    * On the command line using the `sqlite3` [command line shell](https://sqlite.org/cli.html) or an alternative shell like [LiteCLI](https://litecli.com/)
* Today we will focus on the `sqlite3` command line shell because it is available on any \*nix system with SQLite installed
* If you want to try `litecli` at a later date, you must first run the following command in the terminal on the CRC cluster:
    
```
module load python/anaconda3.6-5.2.0
``` 

* This will load the `litecli` command shell as an alternative to the `sqlite3` command line shell
    * This shell has better autocomplete and other nice features
   

## Setup and SQLite
---
For this lesson you will need a UNIX shell (open one with the + button in the upper left-hand corner of this page and select the terminal), SQLite3, and the database file we'll be using, `survey.db`. All of these are already available here, but when you work on your own machine you will have to make sure it is set up correctly.

A **relational database** is a way to store and manipulate information. Databases are arranged as **tables**. Each table has columns (also known as **fields**) that describe the data, and rows (also known as **records**) which contain the data.

When we are using a spreadsheet, we put formulas into cells to calculate new values based on old ones. When we are using a database, we send commands (usually called **queries**) to a **database manager**: a program that manipulates the database for us. The database manager does whatever lookups and calculations the query specifies, returning the results in a tabular form that we can then use as a starting point for further queries.
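To make the vocabulary concrete, here is a minimal sketch using Python's standard `sqlite3` module (covered later in the course) rather than the command line shell. It builds a throwaway in-memory table from three of the `Person` records shown later in this lesson, sends one query to the database manager, and gets rows back in tabular form. The column types are an assumption; only the column names and values come from the lesson data.

```python
import sqlite3

# In-memory database, so this sketch never touches survey.db.
conn = sqlite3.connect(":memory:")

# Column types are an assumption; the lesson only shows the column names.
conn.execute("CREATE TABLE Person (id TEXT, personal TEXT, family TEXT)")

# Three of the records from the Person table shown later in this lesson.
people = [
    ("dyer", "William", "Dyer"),
    ("pb", "Frank", "Pabodie"),
    ("lake", "Anderson", "Lake"),
]
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)", people)

# Send a query to the database manager; the result comes back as rows.
cursor = conn.execute("SELECT personal, family FROM Person")
fields = [column[0] for column in cursor.description]  # column (field) names
records = cursor.fetchall()                            # rows (records)

print(fields)
for record in records:
    print(record)

conn.close()
```

Each tuple in `records` is one record, and each position in the tuple corresponds to one of the names in `fields`.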

#### Changing Database Managers

Every database manager — Oracle, IBM DB2, PostgreSQL, MySQL, Microsoft Access, and SQLite — stores data in a different way, so a database created with one cannot be used directly by another. However, every database manager can import and export data in a variety of formats, like .csv, so it is possible to move information from one to another.
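As a rough illustration of moving data through a neutral format, the sketch below exports a table to CSV text and imports it into a second, separate database. Both databases are in-memory SQLite databases standing in for two different database managers, `io.StringIO` stands in for a real `.csv` file, and the column types are assumptions. The values are the `Site` records from this lesson.

```python
import csv
import io
import sqlite3

# Source database: stands in for data held by one database manager.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Site (name TEXT, lat REAL, long REAL)")
source.executemany(
    "INSERT INTO Site VALUES (?, ?, ?)",
    [("DR-1", -49.85, -128.57), ("DR-3", -47.15, -126.72), ("MSK-4", -48.87, -123.4)],
)

# Export: write a header row followed by the records as CSV text.
buffer = io.StringIO()  # stands in for a .csv file on disk
writer = csv.writer(buffer)
cursor = source.execute("SELECT name, lat, long FROM Site")
writer.writerow([column[0] for column in cursor.description])
writer.writerows(cursor)

# Import: read the CSV text back into a separate database.
buffer.seek(0)
reader = csv.reader(buffer)
header = next(reader)  # skip (and keep) the header row
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE Site (name TEXT, lat REAL, long REAL)")
target.executemany("INSERT INTO Site VALUES (?, ?, ?)", reader)

imported = target.execute("SELECT name, lat, long FROM Site ORDER BY name").fetchall()
print(header)
print(imported)

source.close()
target.close()
```

CSV carries only text, so the target table's declared column types decide how the values are stored after import.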

Queries are written in a language called **SQL**, which stands for “Structured Query Language”. SQL provides hundreds of different ways to analyze and recombine data. We will only look at a handful of queries, but that handful accounts for most of what scientists do.

#### Getting Into and Out of SQLite
To use SQLite commands interactively, we need to enter the SQLite console. Open a terminal, change your current working directory to the Data Basics session 7 directory, and run `sqlite3 survey.db`.

The SQLite command is `sqlite3`, and you are telling SQLite to open `survey.db`. You need to specify the **.db** file; otherwise, SQLite will open a temporary, empty database.

To leave SQLite, type `.exit` or `.quit`. In some terminals, **Ctrl-D** also works. If you forget a SQLite dot command, type `.help` to list them.

#### Checking if Data is Available

All SQLite-specific commands are prefixed with a `.` to distinguish them from SQL commands. Type `.tables` in the sqlite shell to list the tables in the database.
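Outside the shell, dot commands are not available, but the same information can be read with ordinary SQL. The sketch below is a Python analogue of `.tables`, not what the shell runs internally: it creates empty stand-ins for the four tables described in this lesson (the column types are assumptions) and then asks SQLite's built-in `sqlite_master` catalog for their names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Empty stand-ins for the four tables in survey.db; column types are assumptions.
conn.executescript("""
    CREATE TABLE Person (id TEXT, personal TEXT, family TEXT);
    CREATE TABLE Site (name TEXT, lat REAL, long REAL);
    CREATE TABLE Visited (id INTEGER, site TEXT, dated TEXT);
    CREATE TABLE Survey (taken INTEGER, person TEXT, quant TEXT, reading REAL);
""")

# sqlite_master is SQLite's built-in catalog of tables, indexes, and views.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
table_names = [row[0] for row in rows]
print(table_names)

conn.close()
```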

The command `.schema TABLENAME` shows the `CREATE` statement for a table, which tells us the name and declared type of every attribute in that table. Run it in the sqlite shell, replacing `TABLENAME` with one of the names listed by `.tables`.
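The stored `CREATE` statement can also be read from Python. This is a sketch under the assumption that `Site` was created with the column types shown; the real types in `survey.db` may differ, and `.schema` is the way to check them. It reads the statement text from `sqlite_master` and then uses `PRAGMA table_info` to list the columns one by one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types here are an assumption, used only for illustration.
conn.execute("CREATE TABLE Site (name TEXT, lat REAL, long REAL)")

# The sql column of sqlite_master holds the text of the CREATE statement.
(create_statement,) = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", ("Site",)
).fetchone()
print(create_statement)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Site)")]
print(columns)

conn.close()
```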

You can change some SQLite settings to make the output easier to read. First, set the output mode to display left-aligned columns with `.mode column`. Then turn on the display of column headers with `.headers on`.
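To see what left-aligned columns with headers amount to, here is a small Python sketch that formats query results in a similar style. It is only an imitation for illustration, not the shell's own formatting code; the helper `format_row` is hypothetical, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Site (name TEXT, lat REAL, long REAL)")
conn.executemany(
    "INSERT INTO Site VALUES (?, ?, ?)",
    [("DR-1", -49.85, -128.57), ("DR-3", -47.15, -126.72), ("MSK-4", -48.87, -123.4)],
)

cursor = conn.execute("SELECT name, lat, long FROM Site ORDER BY name")
headers = [column[0] for column in cursor.description]
rows = [[str(value) for value in row] for row in cursor]
conn.close()

# Each column is as wide as its longest entry, header included.
widths = [max(len(cell) for cell in column) for column in zip(headers, *rows)]


def format_row(cells):
    """Left-align every cell to its column width (hypothetical helper)."""
    padded = (cell.ljust(width) for cell, width in zip(cells, widths))
    return "  ".join(padded).rstrip()


lines = [format_row(headers), format_row(["-" * width for width in widths])]
lines += [format_row(row) for row in rows]
print("\n".join(lines))
```

The first printed line holds the column headers, the second a row of dashes, and every value starts at the same position as its header.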

## The Data(base)

Before we get into using SQLite to work with the data, let’s take a look at the tables of the database we will use in our examples. The image below shows how the tables relate to one another:

![The survey.db database structure](https://swcarpentry.github.io/sql-novice-survey/fig/sql-join-structure.svg)

And here is the data:


### Person

A table of the people who took the readings.

<table class="table table-striped">
  <thead>
    <tr>
      <th>id</th>
      <th>personal</th>
      <th>family</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>dyer</td>
      <td>William</td>
      <td>Dyer</td>
    </tr>
    <tr>
      <td>pb</td>
      <td>Frank</td>
      <td>Pabodie</td>
    </tr>
    <tr>
      <td>lake</td>
      <td>Anderson</td>
      <td>Lake</td>
    </tr>
    <tr>
      <td>roe</td>
      <td>Valentina</td>
      <td>Roerich</td>
    </tr>
    <tr>
      <td>danforth</td>
      <td>Frank</td>
      <td>Danforth</td>
    </tr>
  </tbody>
</table>
    
### Site

A table of the locations where readings were taken.

<table class="table table-striped">
  <thead>
    <tr>
      <th>name</th>
      <th>lat</th>
      <th>long</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>DR-1</td>
      <td>-49.85</td>
      <td>-128.57</td>
    </tr>
    <tr>
      <td>DR-3</td>
      <td>-47.15</td>
      <td>-126.72</td>
    </tr>
    <tr>
      <td>MSK-4</td>
      <td>-48.87</td>
      <td>-123.4</td>
    </tr>
  </tbody>
</table>


### Visited

A table of the visits to the sites.

<table class="table table-striped">
  <thead>
    <tr>
      <th>id</th>
      <th>site</th>
      <th>dated</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>619</td>
      <td>DR-1</td>
      <td>1927-02-08</td>
    </tr>
    <tr>
      <td>622</td>
      <td>DR-1</td>
      <td>1927-02-10</td>
    </tr>
    <tr>
      <td>734</td>
      <td>DR-3</td>
      <td>1930-01-07</td>
    </tr>
    <tr>
      <td>735</td>
      <td>DR-3</td>
      <td>1930-01-12</td>
    </tr>
    <tr>
      <td>751</td>
      <td>DR-3</td>
      <td>1930-02-26</td>
    </tr>
    <tr>
      <td>752</td>
      <td>DR-3</td>
      <td>-null-</td>
    </tr>
    <tr>
      <td>837</td>
      <td>MSK-4</td>
      <td>1932-01-14</td>
    </tr>
    <tr>
      <td>844</td>
      <td>DR-1</td>
      <td>1932-03-22</td>
    </tr>
  </tbody>
</table>

### Survey

A table of the readings. The field `quant` is short for quantitative and indicates what is being measured. Values are `rad`, `sal`, and `temp` referring to ‘radiation’, ‘salinity’ and ‘temperature’, respectively.

<table class="table table-striped">
  <thead>
    <tr>
      <th>taken</th>
      <th>person</th>
      <th>quant</th>
      <th>reading</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>619</td>
      <td>dyer</td>
      <td>rad</td>
      <td>9.82</td>
    </tr>
    <tr>
      <td>619</td>
      <td>dyer</td>
      <td>sal</td>
      <td>0.13</td>
    </tr>
    <tr>
      <td>622</td>
      <td>dyer</td>
      <td>rad</td>
      <td>7.8</td>
    </tr>
    <tr>
      <td>622</td>
      <td>dyer</td>
      <td>sal</td>
      <td>0.09</td>
    </tr>
    <tr>
      <td>734</td>
      <td>pb</td>
      <td>rad</td>
      <td>8.41</td>
    </tr>
    <tr>
      <td>734</td>
      <td>lake</td>
      <td>sal</td>
      <td>0.05</td>
    </tr>
    <tr>
      <td>734</td>
      <td>pb</td>
      <td>temp</td>
      <td>-21.5</td>
    </tr>
    <tr>
      <td>735</td>
      <td>pb</td>
      <td>rad</td>
      <td>7.22</td>
    </tr>
    <tr>
      <td>735</td>
      <td>-null-</td>
      <td>sal</td>
      <td>0.06</td>
    </tr>
    <tr>
      <td>735</td>
      <td>-null-</td>
      <td>temp</td>
      <td>-26.0</td>
    </tr>
    <tr>
      <td>751</td>
      <td>pb</td>
      <td>rad</td>
      <td>4.35</td>
    </tr>
    <tr>
      <td>751</td>
      <td>pb</td>
      <td>temp</td>
      <td>-18.5</td>
    </tr>
    <tr>
      <td>751</td>
      <td>lake</td>
      <td>sal</td>
      <td>0.1</td>
    </tr>
    <tr>
      <td>752</td>
      <td>lake</td>
      <td>rad</td>
      <td>2.19</td>
    </tr>
    <tr>
      <td>752</td>
      <td>lake</td>
      <td>sal</td>
      <td>0.09</td>
    </tr>
    <tr>
      <td>752</td>
      <td>lake</td>
      <td>temp</td>
      <td>-16.0</td>
    </tr>
    <tr>
      <td>752</td>
      <td>roe</td>
      <td>sal</td>
      <td>41.6</td>
    </tr>
    <tr>
      <td>837</td>
      <td>lake</td>
      <td>rad</td>
      <td>1.46</td>
    </tr>
    <tr>
      <td>837</td>
      <td>lake</td>
      <td>sal</td>
      <td>0.21</td>
    </tr>
    <tr>
      <td>837</td>
      <td>roe</td>
      <td>sal</td>
      <td>22.5</td>
    </tr>
    <tr>
      <td>844</td>
      <td>roe</td>
      <td>rad</td>
      <td>11.25</td>
    </tr>
  </tbody>
</table>

Now let's try *querying* the data with SQL.