# Introduction to SQL

_SQL_ stands for 'structured query language.' For this demonstration we will be using an IPython notebook SQL library developed by Catherine Devlin, available at [https://github.com/catherinedevlin/ipython-sql](https://github.com/catherinedevlin/ipython-sql).

Note that in order to run SQL commands within a Jupyter notebook, code blocks need to begin with a 'magic' function:

```
%sql
```

for inline SQL or

```
%%sql
```

for multiple lines of SQL in a code block.

This is a minor addition that is not needed within a standard SQL database or interface, but we like this option because it's notebook friendly and the SQL syntax is otherwise the same.

It may be necessary to install the library:

In [3]:
#!pip install ipython-sql # https://github.com/catherinedevlin/ipython-sql

## Load the extension and connect to a database

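In this case we're using SQLite. Because SQLite is a file-based database, we don't need to configure a connection to a server. The bare `sqlite://` URL used in the cell below, with no file path, connects to a temporary in-memory database.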

In [46]:
%load_ext sql
%sql sqlite://

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


'Connected: None@None'
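
For comparison, here is a minimal sketch of a similar connection made without the `%sql` magic, using Python's standard-library `sqlite3` module. This is not part of the original notebook workflow; it assumes only that a temporary in-memory database is acceptable, as it is with the `sqlite://` URL above.

```python
import sqlite3

# ":memory:" asks SQLite for a temporary database held in RAM,
# so no file or server needs to exist beforehand.
conn = sqlite3.connect(":memory:")

# A trivial query confirms that the connection works.
version = conn.execute("SELECT sqlite_version()").fetchone()[0]
print("SQLite version:", version)

conn.close()
```

The database disappears when the connection is closed, which suits a demo but not data that needs to persist.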

## Getting Info

Not using a GUI for this demo makes some things more difficult. In particular, database GUIs for SQL and SQLite make it easy to find out information about the database and database objects. But even without a GUI we often need to know things about the database including: 

* the number and names of tables
* structural information and schema

To do this in a notebook we can use SQLite PRAGMA statements. These commands provide control over and information about the SQLite environment. The base syntax needed for using PRAGMA commands here is

```
%sql pragma [command]
```

For example:

```
%sql pragma database_list;
%sql pragma stats;
%sql pragma table_info([table name]);
```

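Those three should satisfy most of our requirements today. Note that the SQLite documentation describes `PRAGMA stats` as intended for testing only, so its output may differ between SQLite versions. More information about PRAGMA statements is available in the SQLite documentation at [https://www.sqlite.org/pragma.html](https://www.sqlite.org/pragma.html).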
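
As a rough illustration of what PRAGMA statements return, the sketch below runs two of them through Python's standard-library `sqlite3` module rather than the `%sql` magic. The table `example` and its columns are hypothetical, created only so that `table_info` has something to report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table, used only to give the PRAGMA statements something to describe.
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")

# Each database_list row is (seq, name, file).
databases = conn.execute("PRAGMA database_list").fetchall()

# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(example)").fetchall()

for row in databases:
    print(row)
for row in columns:
    print(row)

conn.close()
```

PRAGMA results come back as ordinary rows, so they can be fetched and inspected like the result of any other query.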

### Create a simple table

Note that the **IF EXISTS** and **IF NOT EXISTS** clauses in the commands below are optional. We use them here because we want a clean, new table each time we run the code block. Without them, re-running the block is more likely to raise an error or generate duplicate data.

In [83]:
%%sql
DROP TABLE IF EXISTS donuts;
CREATE TABLE IF NOT EXISTS donuts (doNum, dough, glaze, filling);
INSERT INTO donuts VALUES (1, 'cake', 'maple', 'none');
INSERT INTO donuts VALUES (2, 'yeast', 'sugar', 'none');

Done.
Done.
1 rows affected.
1 rows affected.


[]
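
To see why the optional clauses matter, the following sketch re-runs the same statements twice, as happens when a notebook cell is executed again. It uses Python's standard-library `sqlite3` module in place of the `%%sql` magic, which is an assumption made for illustration; the SQL itself is copied from the cell above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

setup = """
DROP TABLE IF EXISTS donuts;
CREATE TABLE IF NOT EXISTS donuts (doNum, dough, glaze, filling);
INSERT INTO donuts VALUES (1, 'cake', 'maple', 'none');
INSERT INTO donuts VALUES (2, 'yeast', 'sugar', 'none');
"""

# Run the setup twice, simulating a re-executed notebook cell.
conn.executescript(setup)
conn.executescript(setup)

# The table was dropped and rebuilt, so the rows are not duplicated.
row_count = conn.execute("SELECT COUNT(*) FROM donuts").fetchone()[0]
print("rows after two runs:", row_count)

# Without IF NOT EXISTS, creating a table that already exists is an error.
try:
    conn.execute("CREATE TABLE donuts (doNum, dough, glaze, filling)")
    create_error = None
except sqlite3.OperationalError as exc:
    create_error = str(exc)
print("error without IF NOT EXISTS:", create_error)

conn.close()
```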

Check the status of the database and get some info about it and our table using PRAGMA statements.

In [88]:
%sql PRAGMA DATABASE_LIST;

Done.


seq,name,file
0,main,


In [89]:
%sql PRAGMA STATS;

Done.


table,index,width,height
donuts,,43,200
sqlite_master,,65,200


Note that the output above references a system-generated table, _sqlite_master_, in addition to the table _donuts_ that we just created. This table provides metadata and gives us another way to access info about our tables:

In [90]:
%sql SELECT * from sqlite_master;

Done.


type,name,tbl_name,rootpage,sql
table,donuts,donuts,2,"CREATE TABLE donuts (doNum, dough, glaze, filling)"
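
Because _sqlite_master_ is queried like any other table, it can also be used to list just the table names. The sketch below assumes Python's standard-library `sqlite3` module instead of the `%sql` magic, and it filters on the `type` column with a WHERE clause, which is covered later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donuts (doNum, dough, glaze, filling)")

# Keep only rows describing tables; sqlite_master also tracks indexes, views, and triggers.
query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
table_names = [name for (name,) in conn.execute(query)]
print(table_names)

conn.close()
```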


In either case, once we know the name of a table we can get more detailed info about it:

In [91]:
%sql PRAGMA TABLE_INFO(donuts);

Done.


cid,name,type,notnull,dflt_value,pk
0,doNum,,0,,0
1,dough,,0,,0
2,glaze,,0,,0
3,filling,,0,,0
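
The empty `type` values and the zeros under `notnull` and `pk` in the output above mean that this table places no restrictions on what is stored. A small sketch of the consequence, using Python's standard-library `sqlite3` module (an assumption for illustration) and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donuts (doNum, dough, glaze, filling)")

# Illustrative rows only: an exact duplicate, then NULLs and a number where text is expected.
rows = [
    (1, "cake", "maple", "none"),
    (1, "cake", "maple", "none"),
    (None, 42, None, None),
]
conn.executemany("INSERT INTO donuts VALUES (?, ?, ?, ?)", rows)

# Every row is accepted and stored exactly as given.
stored = conn.execute("SELECT * FROM donuts").fetchall()
for row in stored:
    print(row)

conn.close()
```

None of these inserts raises an error, which is why a more carefully defined table is worth designing.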


### The SELECT statement

In addition to not being very interesting, our _donuts_ table is poorly structured. There is nothing to prevent duplicate data, enforce non-null or data type requirements, etc. Before going much further, we want to design a better table. But while we've so far been able to get info ABOUT the table, we haven't yet gotten info FROM the table. Doing so requires use of a SELECT statement.

The basic syntax is straightforward:

```
SELECT column_1, column_2 FROM table
```

where a comma-separated list of a table's column names is used in place of 'column_1, column_2' above, and 'table' is replaced with the name of the table we want to view.

Often an asterisk is used as a shortcut for all columns.

In [94]:
%sql SELECT * FROM donuts

Done.


doNum,dough,glaze,filling
1,cake,maple,none
2,yeast,sugar,none
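
The same SELECT forms can be sketched outside the notebook magic with Python's standard-library `sqlite3` module (an assumption for illustration, not the approach used in this demo). The example compares the asterisk shortcut with an explicit column list and reads the column names from the cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donuts (doNum, dough, glaze, filling)")
conn.executemany(
    "INSERT INTO donuts VALUES (?, ?, ?, ?)",
    [(1, "cake", "maple", "none"), (2, "yeast", "sugar", "none")],
)

# The asterisk returns every column, in the order the table defines them.
cursor = conn.execute("SELECT * FROM donuts")
all_columns = [description[0] for description in cursor.description]
all_rows = cursor.fetchall()

# An explicit list returns only the named columns, in the order listed.
cursor = conn.execute("SELECT glaze, dough FROM donuts")
some_columns = [description[0] for description in cursor.description]
some_rows = cursor.fetchall()

print(all_columns, all_rows)
print(some_columns, some_rows)

conn.close()
```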


Keep in mind that the column list in a SELECT statement controls which columns are returned, not which rows. If we want to limit the number of columns returned, we name only those columns:

```
SELECT glaze FROM donuts
```

which will return just the _glaze_ column and all rows in the table. Limiting the number of rows returned can be done in various ways, notably by using a WHERE clause as detailed below.

In [95]:
%sql SELECT glaze FROM donuts

Done.


glaze
maple
sugar
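
The difference between limiting columns and limiting rows can be sketched as follows. The example assumes Python's standard-library `sqlite3` module rather than the `%sql` magic, and it previews the WHERE and LIMIT clauses only briefly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donuts (doNum, dough, glaze, filling)")
conn.executemany(
    "INSERT INTO donuts VALUES (?, ?, ?, ?)",
    [(1, "cake", "maple", "none"), (2, "yeast", "sugar", "none")],
)

# Fewer columns: one column, every row.
glaze_only = conn.execute("SELECT glaze FROM donuts").fetchall()

# Fewer rows: every column, only the rows matching the condition.
maple_rows = conn.execute("SELECT * FROM donuts WHERE glaze = 'maple'").fetchall()

# Fewer rows by count: ORDER BY makes the choice of "first" row explicit.
first_row = conn.execute("SELECT * FROM donuts ORDER BY doNum LIMIT 1").fetchall()

print(glaze_only)
print(maple_rows)
print(first_row)

conn.close()
```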


In [97]:
%%sql
INSERT INTO donuts VALUES (3, 'yeast', 'maple', 'boston creme');
INSERT INTO donuts VALUES (4, 'yeast', 'chocolate', 'none');
INSERT INTO donuts VALUES (5, 'cruller', 'vanilla', 'none');
INSERT INTO donuts VALUES (6, 'yeast', 'chocolate ', 'boston creme');

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]

In [99]:
%sql SELECT * FROM donuts

Done.


doNum,dough,glaze,filling
1,cake,maple,none
2,yeast,sugar,none
3,yeast,maple,boston creme
4,yeast,chocolate,none
5,cruller,vanilla,none
6,yeast,chocolate,boston creme
