Creating tables in SQL
---------------------

Before we get into basic SQL queries (**asking questions _of_ data in tables**), we'll look at the basics of how to **create** tables.

**NOTE: Make sure to have a copy of the database file, "dataset_1.db", from the last lecture downloaded and in this directory for the below to work!**

In [1]:
%load_ext sql
%sql sqlite:///dataset_1.db

u'Connected: None@dataset_1.db'

Activity 2-1:
------------

Schemas & table creation

Recall that the database we just connected to contains a table (among others) called `precipitation_full`, with the following schema:

> * `state_code`
> * `station_id`
> * `year`
> * `month`
> * `day`
> * `hour`
> * `precipitation`
> * `flag_1`
> * `flag_2`

Each tuple in this table describes one hour of rainfall (`precipitation`, in hundredths of an inch) at one station (`station_id`) in one state (`state_code`). Note that tuples with `hour=25` record the total rainfall for that day, and that we can ignore the values of attributes `flag_1` and `flag_2` for now.

Now, however, let's see how to view the **schema** of existing tables on your own; there are several ways, including but not limited to:
* `DESCRIBE tablename`
* `SHOW CREATE TABLE tablename`
* `SHOW COLUMNS FROM tablename`

Unfortunately, support for these varies widely between DBMSs and is also limited by our IPython interface. For example, SQLite, which we are using, supports none of the above. It does have a `.schema tablename` command, but that is a feature of the `sqlite3` command-line shell rather than SQL, so it doesn't work in IPython notebooks.

One that does work for us here though is:

In [2]:
%sql PRAGMA table_info(precipitation_full);

Done.


cid,name,type,notnull,dflt_value,pk
0,state_code,INT,0,,0
1,station_id,INT,0,,0
2,year,INT,0,,0
3,month,INT,0,,0
4,day,INT,0,,0
5,hour,INT,0,,0
6,precipitation,INT,0,,0
7,flag_1,VARCHAR(1),0,,0
8,flag_2,VARCHAR(1),0,,0


A bit verbose, but gets the job done!
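The same `PRAGMA` also works outside the notebook. As a minimal sketch, the snippet below uses Python's standard `sqlite3` module and an in-memory database as a stand-in for `dataset_1.db` (an assumption, so that the example runs without the course file) and keeps only the column names and types:

```python
import sqlite3

# In-memory stand-in for dataset_1.db (assumption: same schema as shown above).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE precipitation_full("
    "state_code INT, station_id INT, year INT, month INT, day INT, "
    "hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1))"
)

# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(precipitation_full)").fetchall()

# Keep just the two fields we usually care about: column name and declared type.
schema = [(row[1], row[2]) for row in columns]

for name, col_type in schema:
    print("{:15s} {}".format(name, col_type))
```

To run this against the real file instead, `sqlite3.connect("dataset_1.db")` would replace the in-memory connection and the `CREATE TABLE` call would be dropped.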

We can also get the exact statement used to create the table as follows (**a great way to find guidance here!!**):

In [3]:
%sql SELECT sql FROM sqlite_master WHERE name = 'precipitation_full';

Done.


sql
"CREATE TABLE precipitation_full(state_code INT, station_id INT, year INT, month INT, day INT, hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1))"


Without going into full detail (yet), the above table contains one record for each hour at each station, and contains the amount of precipitation that was measured during that hour.
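The `sqlite_master` lookup used above can be wrapped in a small helper. This is only a sketch: `create_statement` is a hypothetical name, and the in-memory database with a `states` table stands in for `dataset_1.db`.

```python
import sqlite3

# In-memory stand-in for dataset_1.db (assumption), holding one sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states(code INT, name VARCHAR(30), abbrev VARCHAR(2))")


def create_statement(connection, table_name):
    """Return the stored CREATE TABLE text for table_name, or None if absent."""
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),  # bound as a parameter rather than pasted into the SQL
    ).fetchone()
    return row[0] if row else None


found = create_statement(conn, "states")
missing = create_statement(conn, "no_such_table")
print(found)
print(missing)
```

Restricting the lookup to `type = 'table'` keeps indexes and views, which `sqlite_master` also records, out of the result.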

Let's create another table that will hold the same information. The name of the new table must not be that of an existing table.

In [4]:
%%sql 
CREATE TABLE precipitation_full_2(
    state_code INT, 
    station_id INT, 
    year INT, 
    month INT, 
    day INT, 
    hour INT, 
    precipitation INT, 
    flag_1 VARCHAR(1), 
    flag_2 VARCHAR(1)
);

Done.


[]
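The rule that a new table's name must not already be in use is easy to check. The sketch below uses Python's `sqlite3` module, an in-memory database, and a shortened two-column schema (both assumptions for brevity). It tries to create the same table twice, then shows `CREATE TABLE IF NOT EXISTS`, which SQLite accepts as a way to skip creation quietly when the name is taken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE precipitation_full_2(station_id INT, precipitation INT)"
conn.execute(ddl)

# A second CREATE TABLE with the same name is rejected by SQLite.
try:
    conn.execute(ddl)
    duplicate_error = None
except sqlite3.OperationalError as exc:
    duplicate_error = str(exc)
print(duplicate_error)

# IF NOT EXISTS turns the duplicate into a no-op instead of an error.
conn.execute(
    "CREATE TABLE IF NOT EXISTS precipitation_full_2(station_id INT, precipitation INT)"
)

table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'precipitation_full_2'"
).fetchone()[0]
print(table_count)
```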

In [5]:
%sql PRAGMA table_info(precipitation_full_2);

Done.


cid,name,type,notnull,dflt_value,pk
0,state_code,INT,0,,0
1,station_id,INT,0,,0
2,year,INT,0,,0
3,month,INT,0,,0
4,day,INT,0,,0
5,hour,INT,0,,0
6,precipitation,INT,0,,0
7,flag_1,VARCHAR(1),0,,0
8,flag_2,VARCHAR(1),0,,0


In [6]:
%sql SELECT sql FROM sqlite_master WHERE name = 'precipitation_full_2';

Done.


sql
"CREATE TABLE precipitation_full_2(  state_code INT, station_id INT, year INT, month INT, day INT, hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1) )"


Now, let's remove the duplicate table.

In [7]:
%sql DROP TABLE precipitation_full_2;

Done.


[]
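To confirm that a drop took effect, we can ask `sqlite_master` whether the table is still listed. The sketch below does this in plain Python with an in-memory database. `table_exists` is a hypothetical helper, and the two-column schema is shortened for brevity. It also shows that dropping a missing table is an error unless `IF EXISTS` is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE precipitation_full_2(station_id INT, precipitation INT)")


def table_exists(connection, table_name):
    """Return True if sqlite_master lists a table with this name."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


existed_before = table_exists(conn, "precipitation_full_2")
conn.execute("DROP TABLE precipitation_full_2")
exists_after = table_exists(conn, "precipitation_full_2")

# Dropping a table that is already gone raises an error...
try:
    conn.execute("DROP TABLE precipitation_full_2")
    second_drop_failed = False
except sqlite3.OperationalError:
    second_drop_failed = True

# ...unless IF EXISTS is given, in which case nothing happens.
conn.execute("DROP TABLE IF EXISTS precipitation_full_2")

print(existed_before, exists_after, second_drop_failed)
```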

In [8]:
%sql SHOW tables;

(sqlite3.OperationalError) near "SHOW": syntax error [SQL: u'SHOW tables;']


SQLite does not implement the `SHOW TABLES` command. Instead, we can query the `sqlite_master` table, which records every table in the database.

In [9]:
%sql SELECT * FROM sqlite_master WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,precipitation_full,precipitation_full,2,"CREATE TABLE precipitation_full(state_code INT, station_id INT, year INT, month INT, day INT, hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1))"
table,states,states,468,"CREATE TABLE states(code INT, name VARCHAR(30), abbrev VARCHAR(2))"
table,precipitation,precipitation,469,"CREATE TABLE precipitation(  station_id INT,  day INT,  precipitation INT )"
table,A,A,472,"CREATE TABLE A(i INT, j INT, val INT)"
table,B,B,473,"CREATE TABLE B(i INT, j INT, val INT)"
table,streets,streets,474,"CREATE TABLE streets(id INT, direction CHAR(1), A text, B text, d INT, PRIMARY KEY (id, direction))"
table,franchise,franchise,476,"CREATE TABLE franchise (name TEXT, db_type TEXT)"
table,store,store,477,"CREATE TABLE store (franchise TEXT, location TEXT)"
table,bagel,bagel,478,"CREATE TABLE bagel (name TEXT, price MONEY, made_by TEXT)"
table,purchase,purchase,479,"CREATE TABLE purchase (bagel_name TEXT, franchise TEXT, date INT, quantity INT, purchaser_age INT)"
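When only the table names are needed, selecting the `name` column gives a more compact listing. A minimal sketch, using an in-memory database with two of the tables above as stand-ins for the real file:

```python
import sqlite3

# In-memory stand-in for dataset_1.db (assumption), with two sample tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A(i INT, j INT, val INT)")
conn.execute("CREATE TABLE B(i INT, j INT, val INT)")

# Rough equivalent of SHOW TABLES: one name per table, sorted alphabetically.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```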


Now, on to our in-class activity...

Suppose that our class was asked to assist in collecting rainfall data! Based on what we've covered so far, the above example, and the Internet, create a table for storing the staff assignments. Table requirements:
* Everyone in the class will be holding a cup in the rain for a specific several-hour shift at a specific station; this assignment will remain the same every day
* Each person will have one off-day per week
* Each person's cup might be of a different size, measured as a float value
* The Dept. of Interior data servers can't handle the full dataset we would generate, and require a random subsample, so some people will be randomly chosen to stand in the rain without a cup. These assignments need to be recorded somehow in the table too.
* Some people in the class have [Welsh names](https://www.youtube.com/watch?v=fHxO0UdpoxM)

Type your create table statement here:

*NB:* Remember to start with `%sql` for single-line SQL or `%%sql` for multi-line SQL.

In [10]:
%%sql
CREATE TABLE rain_corps_assignments(
    student_id INT PRIMARY KEY,
    name VARCHAR(1000),
    station_id INT,
    state_code INT,
    start_hour INT,
    end_hour INT,
    holding_cup BOOLEAN,
    cup_size FLOAT,
    off_day INT
);

Done.


[]
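To see how this design covers the requirements, here is a sketch that loads two made-up assignments into the same table using Python's `sqlite3` module and an in-memory database. The student names, station and state codes, and the `off_day` encoding (0 = Monday through 6 = Sunday) are all hypothetical. A person chosen to stand without a cup gets `holding_cup = 0` and a `NULL` `cup_size`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE rain_corps_assignments(
        student_id INT PRIMARY KEY,
        name VARCHAR(1000),
        station_id INT,
        state_code INT,
        start_hour INT,
        end_hour INT,
        holding_cup BOOLEAN,
        cup_size FLOAT,
        off_day INT
    )
""")

# Hypothetical sample rows; off_day uses an assumed 0 = Monday ... 6 = Sunday code.
rows = [
    (1, "Angharad Llywelyn", 101, 4, 9, 13, 1, 0.35, 2),
    (2, "Gwenllian Ffoulkes", 101, 4, 13, 17, 0, None, 6),  # no cup: NULL size
]
conn.executemany(
    "INSERT INTO rain_corps_assignments VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)

# Everyone assigned to stand in the rain without a cup.
no_cup = conn.execute(
    "SELECT name, cup_size FROM rain_corps_assignments WHERE holding_cup = 0"
).fetchall()
print(no_cup)

# The primary key rejects a second assignment for the same student_id.
try:
    conn.execute(
        "INSERT INTO rain_corps_assignments VALUES (1, 'Duplicate', 0, 0, 0, 0, 0, NULL, 0)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Because `student_id` is the primary key, each person has exactly one row, which matches the requirement that the assignment remains the same every day.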