For this demonstration we will be using an IPython notebook SQL library developed by Catherine Devlin and others at https://github.com/catherinedevlin/ipython-sql

Note that in order to run SQL commands within a Jupyter Notebook, code blocks need to begin with a 'magic' function:

%sql
for inline SQL or

%%sql
for multiple lines of SQL in a code block.

This is a minor addition that is not needed within a standard SQL database or interface, but we like this option because it's notebook friendly and the SQL syntax is otherwise the same.
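
To illustrate that only the magic prefix is notebook-specific, here is a minimal sketch that runs the same kind of SQL through Python's standard `sqlite3` module instead of `%sql`. The `demo` table and its columns are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory SQLite database, comparable to `%sql sqlite://` in the notebook.
conn = sqlite3.connect(":memory:")

# The same SQL text that would follow `%%sql` in a notebook cell.
# `demo` is a hypothetical table used only for this sketch.
conn.executescript("""
DROP TABLE IF EXISTS demo;
CREATE TABLE demo (id INTEGER PRIMARY KEY, note TEXT);
INSERT INTO demo (note) VALUES ('hello');
""")

rows = conn.execute("SELECT id, note FROM demo").fetchall()
print(rows)
```

The SQL statements themselves are unchanged; only the way they are handed to the database differs.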

It may be necessary to install the library:

## Round 1

In [1]:
#!pip install ipython-sql
#!pip3 install ipython-sql

In [2]:
%load_ext sql
%sql sqlite://

'Connected: None@None'

In [3]:
%%sql
DROP TABLE IF EXISTS observation_list;
CREATE TABLE observation_list (
  'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
  'location' TEXT NOT NULL,
  'observer' TEXT,
  'weather' TEXT,
  'date' TEXT NOT NULL,
  'time_start' TEXT NOT NULL,
  'time_end' TEXT NOT NULL,
  'laysan_albatross' INTEGER NULL DEFAULT NULL,
  'black_footed_albatross' INTEGER,
  'wedge_tailed_shearwater' INTEGER,
  'christmas_shearwater' INTEGER,
  'audubons_shearwater' INTEGER,
  'bonin_petrel' INTEGER,
  'phoenix_petrel' INTEGER,
  'bulwers_petrel' INTEGER,
  'sooty_petrel' INTEGER,
  'redtailed_tropicbird' INTEGER,
  'whitetailed_tropicbird' INTEGER,
  'masked_booby' INTEGER,
  'brown_booby' INTEGER,
  'redfooted_booby' INTEGER,
  'great_frigatebird' INTEGER,
  'golden_plover' INTEGER,
  'ruddy_turnstone' INTEGER,
  'wandering_tattler' INTEGER,
  'sanderling' INTEGER,
  'bristlethighed_curlew' INTEGER,
  'sooty_tern' INTEGER,
  'graybacked_tern' INTEGER,
  'brownwinged_tern' INTEGER,
  'common_noddy' INTEGER,
  'hawaiian_noddy' INTEGER,
  'bluegray_noddy' INTEGER,
  'fairy_tern' INTEGER ,
  'remarks' TEXT,
  'total_birds' INTEGER
);

Done.
Done.


[]

In [4]:
%sql PRAGMA TABLE_INFO(observation_list);

Done.


cid,name,type,notnull,dflt_value,pk
0,id,INTEGER,1,,1
1,location,TEXT,1,,0
2,observer,TEXT,0,,0
3,weather,TEXT,0,,0
4,date,TEXT,1,,0
5,time_start,TEXT,1,,0
6,time_end,TEXT,1,,0
7,laysan_albatross,INTEGER,0,,0
8,black_footed_albatross,INTEGER,0,,0
9,wedge_tailed_shearwater,INTEGER,0,,0


In [5]:
try:
    %sql INSERT INTO observation_list ('location', 'date', 'time_start', 'time_end', 'wedge_tailed_shearwater', 'redfooted_booby', 'great_frigatebird', 'sooty_tern', 'common_noddy', 'skua', 'tern', 'pterochroza', 'remarks', 'total_birds') VALUES ('oahu to 20.38 N 158.34 W', '1964-10-01', '14:20', '17:30', 119, 5, 1, 6, 7, 1, 2, 5, "37.2 and 1.9", 148);
except Exception as e:
    print(str(e))

(sqlite3.OperationalError) table observation_list has no column named skua [SQL: 'INSERT INTO observation_list (\'location\', \'date\', \'time_start\', \'time_end\', \'wedge_tailed_shearwater\', \'redfooted_booby\', \'great_frigatebird\', \'sooty_tern\', \'common_noddy\', \'skua\', \'tern\', \'pterochroza\', \'remarks\', \'total_birds\') VALUES (\'oahu to 20.38 N 158.34 W\', \'1964-10-01\', \'14:20\', \'17:30\', 119, 5, 1, 6, 7, 1, 2, 5, "37.2 and 1.9", 148);']


In [6]:
try:
    %sql INSERT INTO observation_list ('location', 'date', 'time_start', 'time_end', 'wedge_tailed_shearwater', 'redfooted_booby', 'great_frigatebird', 'sooty_tern', 'common_noddy', 'remarks', 'total_birds') VALUES ('oahu to 20.38 N 158.34 W', '1964-10-01', '14:20', '17:30', 119, 5, 1, 6, 7, "37.2 and 1.9", 148);
except Exception as e:
    print(str(e))

1 rows affected.


In [7]:
# Success, but:
# - we can't analyze location, can't align remarks with birds, and we are missing
#   observations (skua, tern, pterochroza), so the total doesn't add up
# - in some cases we have start and end locations
# - what about abundance and breeding?

%sql select * from observation_list

Done.


id,location,observer,weather,date,time_start,time_end,laysan_albatross,black_footed_albatross,wedge_tailed_shearwater,christmas_shearwater,audubons_shearwater,bonin_petrel,phoenix_petrel,bulwers_petrel,sooty_petrel,redtailed_tropicbird,whitetailed_tropicbird,masked_booby,brown_booby,redfooted_booby,great_frigatebird,golden_plover,ruddy_turnstone,wandering_tattler,sanderling,bristlethighed_curlew,sooty_tern,graybacked_tern,brownwinged_tern,common_noddy,hawaiian_noddy,bluegray_noddy,fairy_tern,remarks,total_birds
1,oahu to 20.38 N 158.34 W,,,1964-10-01,14:20,17:30,,,119,,,,,,,,,,,5,1,,,,,,6,,,7,,,,37.2 and 1.9,148
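
The mismatch between the per-species columns and `total_birds` can be checked directly. The following is a rough sketch using the standard `sqlite3` module and a cut-down version of `observation_list` that keeps only the columns populated in the row above; the helper names `species_columns` and `summed_expr` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down version of observation_list: only the columns used in the inserted row.
conn.execute("""
CREATE TABLE observation_list (
    id INTEGER PRIMARY KEY,
    wedge_tailed_shearwater INTEGER,
    redfooted_booby INTEGER,
    great_frigatebird INTEGER,
    sooty_tern INTEGER,
    common_noddy INTEGER,
    total_birds INTEGER
)
""")
conn.execute(
    "INSERT INTO observation_list "
    "(wedge_tailed_shearwater, redfooted_booby, great_frigatebird, "
    "sooty_tern, common_noddy, total_birds) "
    "VALUES (119, 5, 1, 6, 7, 148)"
)

species_columns = [
    "wedge_tailed_shearwater",
    "redfooted_booby",
    "great_frigatebird",
    "sooty_tern",
    "common_noddy",
]
# COALESCE turns NULL (species not seen) into 0 so the sum is never NULL.
summed_expr = " + ".join(f"COALESCE({col}, 0)" for col in species_columns)

recorded, summed = conn.execute(
    f"SELECT total_birds, {summed_expr} FROM observation_list"
).fetchone()
print(recorded, summed, recorded - summed)
```

With a wide table, every new species means a schema change and a longer sum expression, which is part of the motivation for the redesign in the next round.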


## Round 2

This round is relevant to entities, attributes, and keys (primary and foreign).

We still have a problem with the observation->species relationship. Other relationships are 1:1, as far as we can tell from the data.

As given, the species table is the most obviously non-normalized one: it stores count and remarks, which describe a single sighting rather than the species.

As implemented, this requires us to record one complete observation row for each observed species.

In [17]:
%%sql
DROP TABLE IF EXISTS location;
CREATE TABLE location (
    'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'start_northing' TEXT NOT NULL,
    'start_easting' TEXT NOT NULL,
    'end_northing' TEXT NOT NULL,
    'end_easting' TEXT NOT NULL,
    'start_name' TEXT,
    'end_name' TEXT
);
DROP TABLE IF EXISTS observer;
CREATE TABLE observer (
    'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'fname' TEXT,
    'lname' TEXT,
    'org' TEXT
);
DROP TABLE IF EXISTS species;
CREATE TABLE species (
    'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'taxon' TEXT,
    'common_name' TEXT,
    'count' INTEGER,
    'remarks' TEXT
);
DROP TABLE IF EXISTS observation;
CREATE TABLE observation (
    'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'date' TEXT NOT NULL,
    'location' INTEGER NOT NULL,
    'observer' INTEGER NOT NULL,
    'species' INTEGER NOT NULL
);

Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]

In [18]:
try:
    %sql INSERT INTO location ('start_name', 'start_northing', 'start_easting', 'end_northing', 'end_easting') VALUES ('Oahu', '20.50 N', '158.20 W', '20.38 N', '158.34 W');
    %sql INSERT INTO observer ('org') VALUES ('ATF')
    %sql INSERT INTO species ('common_name', 'count', 'remarks') VALUES ('wedge-tailed shearwater', 119, '37.2')
    %sql INSERT INTO species ('common_name', 'count') VALUES ('red-footed booby', 5)
    %sql INSERT INTO species ('common_name', 'count') VALUES ('great frigatebird', 1)
    %sql INSERT INTO species ('common_name', 'count', 'remarks') VALUES ('sooty tern', 6, '1.9')
    %sql INSERT INTO species ('common_name', 'count') VALUES ('common noddy', 7)
    %sql INSERT INTO species ('common_name', 'count') VALUES ('skua', 1)
    %sql INSERT INTO species ('common_name', 'count') VALUES ('tern', 2)
    %sql INSERT INTO species ('common_name', 'count') VALUES ('pterochroza', 5)
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


In [19]:
# now can reference ids to insert into observations
# note - this is not an example of good design!
%sql select * from location

Done.


id,start_northing,start_easting,end_northing,end_easting,start_name,end_name
1,20.50 N,158.20 W,20.38 N,158.34 W,Oahu,


In [20]:
%sql select * from observer

Done.


id,fname,lname,org
1,,,ATF


In [21]:
%sql select * from species

Done.


id,taxon,common_name,count,remarks
1,,wedge-tailed shearwater,119,37.2
2,,red-footed booby,5,
3,,great frigatebird,1,
4,,sooty tern,6,1.9
5,,common noddy,7,
6,,skua,1,
7,,tern,2,
8,,pterochroza,5,


In [22]:
try:
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 1);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 2);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 3);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 4);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 5);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 6);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 7);
    %sql INSERT INTO observation ('date', 'location', 'observer', 'species') VALUES ('1964-10-01', 1, 1, 8);
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


In [23]:
# the problem here is that without foreign keys or a lot of joins we can't dereference the ids of location, observer, etc.
# but for the most part this is a (poor) solution to the problems we encountered trying to capture
# observation data with a flat, spreadsheet-like design

# join example below

%sql select * from observation

Done.


id,date,location,observer,species
1,1964-10-01,1,1,1
2,1964-10-01,1,1,2
3,1964-10-01,1,1,3
4,1964-10-01,1,1,4
5,1964-10-01,1,1,5
6,1964-10-01,1,1,6
7,1964-10-01,1,1,7
8,1964-10-01,1,1,8


In [24]:
%%sql
SELECT observation.date, species.common_name, species.count, species.remarks
FROM observation
INNER JOIN species ON observation.species = species.id

Done.


date,common_name,count,remarks
1964-10-01,wedge-tailed shearwater,119,37.2
1964-10-01,red-footed booby,5,
1964-10-01,great frigatebird,1,
1964-10-01,sooty tern,6,1.9
1964-10-01,common noddy,7,
1964-10-01,skua,1,
1964-10-01,tern,2,
1964-10-01,pterochroza,5,
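
The join above resolves only the species id. As a sketch of what "a lot of joins" looks like, the following rebuilds a reduced Round 2 schema with the standard `sqlite3` module (fewer columns and only two species rows, chosen for brevity) and resolves all three ids in one query. The table aliases `o`, `l`, `ob`, and `s` are introduced here for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced Round 2 schema: same table and key names, fewer columns.
conn.executescript("""
CREATE TABLE location (id INTEGER PRIMARY KEY, start_name TEXT);
CREATE TABLE observer (id INTEGER PRIMARY KEY, org TEXT);
CREATE TABLE species (id INTEGER PRIMARY KEY, common_name TEXT, count INTEGER, remarks TEXT);
CREATE TABLE observation (
    id INTEGER PRIMARY KEY,
    date TEXT NOT NULL,
    location INTEGER NOT NULL,
    observer INTEGER NOT NULL,
    species INTEGER NOT NULL
);
INSERT INTO location (start_name) VALUES ('Oahu');
INSERT INTO observer (org) VALUES ('ATF');
INSERT INTO species (common_name, count, remarks) VALUES ('wedge-tailed shearwater', 119, '37.2');
INSERT INTO species (common_name, count) VALUES ('red-footed booby', 5);
INSERT INTO observation (date, location, observer, species) VALUES ('1964-10-01', 1, 1, 1);
INSERT INTO observation (date, location, observer, species) VALUES ('1964-10-01', 1, 1, 2);
""")

# One join per referenced table turns every stored id back into readable values.
rows = conn.execute("""
SELECT o.date, l.start_name, ob.org, s.common_name, s.count
FROM observation AS o
JOIN location AS l ON o.location = l.id
JOIN observer AS ob ON o.observer = ob.id
JOIN species AS s ON o.species = s.id
ORDER BY o.id
""").fetchall()

for row in rows:
    print(row)
```

Note that date, location, and observer are repeated on every output row, which reflects the redundancy in the observation table itself.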


## Round 3

Relevant to normalization: 'observation' and 'species' have an M:N cardinality that needs to be resolved.

Also, do other tables as defined satisfy 1NF, 2NF, and 3NF?

* 1NF - no repeating columns
* 2NF - 1NF AND a) PK is a single attribute or b) if composite, each non-key attribute must be dependent on the entire key for uniqueness (eliminate redundant values)
* 3NF - 2NF AND eliminate transitive dependency: non-key attributes may not be functionally dependent on another non-key attribute (https://opentextbc.ca/dbdesign01/chapter/chapter-12-normalization/)

So the Round 2 definitions were 1NF, and also 2NF since each PK is a single attribute.
If we had created composite keys with species and observation, they would not be 2NF.

* location is 2NF
* observer is 2NF
* species is 2NF with transitive dependencies
* observation is 2NF with transitive dependencies

Now achieve 3NF for all

In observation, what makes each row unique is the species. Create a dependent entity to resolve the M:N relationship.

In [47]:
%%sql
DROP TABLE IF EXISTS location;
CREATE TABLE location (
    'locID' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'start_northing' TEXT NOT NULL,
    'start_easting' TEXT NOT NULL,
    'end_northing' TEXT NOT NULL,
    'end_easting' TEXT NOT NULL,
    'start_name' TEXT,
    'end_name' TEXT
);
DROP TABLE IF EXISTS observer;
CREATE TABLE observer (
    'observerID' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'fname' TEXT,
    'lname' TEXT,
    'org' TEXT
);
DROP TABLE IF EXISTS species;
CREATE TABLE species (
    'speciesID' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'taxon' TEXT,
    'common_name' TEXT
);
DROP TABLE IF EXISTS observation;
CREATE TABLE observation (
    'observationID' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'date' TEXT NOT NULL,
    'locID' INTEGER NOT NULL,
    'observerID' INTEGER NOT NULL,
    FOREIGN KEY('locID') REFERENCES location('locID'),
    FOREIGN KEY('observerID') REFERENCES observer('observerID')
);
DROP TABLE IF EXISTS observed_species;
CREATE TABLE observed_species (
    'id' INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    'observationID' INTEGER NOT NULL,
    'speciesID' INTEGER NOT NULL,
    'count' INTEGER NOT NULL,
    'remarks' TEXT,
    FOREIGN KEY('observationID') REFERENCES observation('observationID'),
    FOREIGN KEY('speciesID') REFERENCES species('speciesID')
);

Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]
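
One caveat with the `FOREIGN KEY` clauses above: SQLite parses them but does not enforce them unless foreign key support is switched on for the connection with `PRAGMA foreign_keys = ON`. The sketch below uses the standard `sqlite3` module and a reduced version of the Round 3 tables to show an insert with a nonexistent `speciesID` being rejected; the value 99 and the `rejected` flag are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Foreign key enforcement is off by default in SQLite; enable it per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Reduced Round 3 schema: same key names, fewer columns.
conn.executescript("""
CREATE TABLE species (speciesID INTEGER PRIMARY KEY, common_name TEXT);
CREATE TABLE observation (observationID INTEGER PRIMARY KEY, date TEXT NOT NULL);
CREATE TABLE observed_species (
    id INTEGER PRIMARY KEY,
    observationID INTEGER NOT NULL,
    speciesID INTEGER NOT NULL,
    count INTEGER NOT NULL,
    FOREIGN KEY (observationID) REFERENCES observation(observationID),
    FOREIGN KEY (speciesID) REFERENCES species(speciesID)
);
INSERT INTO species (common_name) VALUES ('sooty_tern');
INSERT INTO observation (date) VALUES ('1964-10-01');
""")

try:
    # speciesID 99 does not exist in species, so this should be refused.
    conn.execute(
        "INSERT INTO observed_species (observationID, speciesID, count) VALUES (1, 99, 3)"
    )
    rejected = False
except sqlite3.IntegrityError as e:
    rejected = True
    print(e)

remaining = conn.execute("SELECT COUNT(*) FROM observed_species").fetchone()[0]
print(rejected, remaining)
```

In the notebook, the equivalent would presumably be a `%sql PRAGMA foreign_keys = ON;` cell run before the inserts, though that is not part of the original demonstration.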

In [48]:
# only the key column names changed in location and observer, so we can reuse the previous insert statements

try:
    %sql INSERT INTO location ('start_name', 'start_northing', 'start_easting', 'end_northing', 'end_easting') VALUES ('Oahu', '20.50 N', '158.20 W', '20.38 N', '158.34 W');
    %sql INSERT INTO observer ('org') VALUES ('ATF')
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.


In [49]:
# a much simpler insert statement for species - populate all at once

try:
    %sql INSERT INTO species ('common_name') VALUES ('laysan_albatross')
    %sql INSERT INTO species ('common_name') VALUES ('black_footed_albatross' )
    %sql INSERT INTO species ('common_name') VALUES ('wedge_tailed_shearwater')
    %sql INSERT INTO species ('common_name') VALUES ('christmas_shearwater')
    %sql INSERT INTO species ('common_name') VALUES ('audubons_shearwater')
    %sql INSERT INTO species ('common_name') VALUES ('bonin_petrel')
    %sql INSERT INTO species ('common_name') VALUES ('phoenix_petrel')
    %sql INSERT INTO species ('common_name') VALUES ('bulwers_petrel')
    %sql INSERT INTO species ('common_name') VALUES ('sooty_petrel')
    %sql INSERT INTO species ('common_name') VALUES ('redtailed_tropicbird' )
    %sql INSERT INTO species ('common_name') VALUES ('whitetailed_tropicbird')
    %sql INSERT INTO species ('common_name') VALUES ('masked_booby')
    %sql INSERT INTO species ('common_name') VALUES ('brown_booby')
    %sql INSERT INTO species ('common_name') VALUES ('redfooted_booby')
    %sql INSERT INTO species ('common_name') VALUES ('great_frigatebird')
    %sql INSERT INTO species ('common_name') VALUES ('golden_plover')
    %sql INSERT INTO species ('common_name') VALUES ('ruddy_turnstone')
    %sql INSERT INTO species ('common_name') VALUES ('wandering_tattler' )
    %sql INSERT INTO species ('common_name') VALUES ('sanderling')
    %sql INSERT INTO species ('common_name') VALUES ('bristlethighed_curlew')
    %sql INSERT INTO species ('common_name') VALUES ('sooty_tern')
    %sql INSERT INTO species ('common_name') VALUES ('graybacked_tern')
    %sql INSERT INTO species ('common_name') VALUES ('brownwinged_tern')
    %sql INSERT INTO species ('common_name') VALUES ('common_noddy')
    %sql INSERT INTO species ('common_name') VALUES ('hawaiian_noddy')
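    # note: 'redtailed_tropicbird' was already inserted above; this duplicate row is left in place
    # because the speciesID values used in later cells depend on it
    %sql INSERT INTO species ('common_name') VALUES ('redtailed_tropicbird' )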
    %sql INSERT INTO species ('common_name') VALUES ('bluegray_noddy')
    %sql INSERT INTO species ('common_name') VALUES ('fairy_tern')
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
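
As an alternative to one `%sql` line per species, the list can be loaded in a single call. This is a sketch with the standard `sqlite3` module, not the notebook's method. It assumes a `UNIQUE` constraint on `common_name`, which the schema above does not have, so that a repeated name in the list is skipped instead of stored twice. The `names` list is shortened for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: common_name is declared UNIQUE here, unlike in the notebook schema.
conn.execute("""
CREATE TABLE species (
    speciesID INTEGER PRIMARY KEY AUTOINCREMENT,
    taxon TEXT,
    common_name TEXT UNIQUE
)
""")

# Shortened list; the last entry is a deliberate repeat.
names = [
    "laysan_albatross",
    "black_footed_albatross",
    "wedge_tailed_shearwater",
    "redtailed_tropicbird",
    "redtailed_tropicbird",
]

# executemany runs the statement once per parameter tuple;
# INSERT OR IGNORE skips rows that would violate the UNIQUE constraint.
conn.executemany(
    "INSERT OR IGNORE INTO species (common_name) VALUES (?)",
    [(name,) for name in names],
)
conn.commit()

species_count = conn.execute("SELECT COUNT(*) FROM species").fetchone()[0]
print(species_count)
```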


In [50]:
%sql select * from location

Done.


locID,start_northing,start_easting,end_northing,end_easting,start_name,end_name
1,20.50 N,158.20 W,20.38 N,158.34 W,Oahu,


In [51]:
%sql select * from observer

Done.


observerID,fname,lname,org
1,,,ATF


In [52]:
try:
    %sql INSERT INTO observation ('date', 'locID', 'observerID') VALUES ('1964-10-01', 1, 1);
except Exception as e:
    print(str(e))

1 rows affected.


In [53]:
%sql select * from observation

Done.


observationID,date,locID,observerID
1,1964-10-01,1,1


In [54]:
%sql select * from species

Done.


speciesID,taxon,common_name
1,,laysan_albatross
2,,black_footed_albatross
3,,wedge_tailed_shearwater
4,,christmas_shearwater
5,,audubons_shearwater
6,,bonin_petrel
7,,phoenix_petrel
8,,bulwers_petrel
9,,sooty_petrel
10,,redtailed_tropicbird


In [55]:
# for each observation we can add any birds not on the list
# then add observations

try:
    %sql INSERT INTO species ('common_name') VALUES ('skua')
    %sql INSERT INTO species ('common_name') VALUES ('tern' )
    %sql INSERT INTO species ('common_name') VALUES ('pterochroza')
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.
1 rows affected.


In [56]:
%sql select * from species

Done.


speciesID,taxon,common_name
1,,laysan_albatross
2,,black_footed_albatross
3,,wedge_tailed_shearwater
4,,christmas_shearwater
5,,audubons_shearwater
6,,bonin_petrel
7,,phoenix_petrel
8,,bulwers_petrel
9,,sooty_petrel
10,,redtailed_tropicbird
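
The listing above shows only the first rows, so reading `speciesID` values off it by hand is error-prone. One possible approach, sketched here with the standard `sqlite3` module and a reduced schema, is to let the insert look up the id by `common_name` using `INSERT ... SELECT`. The `sightings` list and the three-row species table are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced Round 3 tables with a small, hypothetical species list.
conn.executescript("""
CREATE TABLE species (speciesID INTEGER PRIMARY KEY, common_name TEXT);
CREATE TABLE observed_species (
    id INTEGER PRIMARY KEY,
    observationID INTEGER NOT NULL,
    speciesID INTEGER NOT NULL,
    count INTEGER NOT NULL,
    remarks TEXT
);
INSERT INTO species (common_name)
VALUES ('laysan_albatross'), ('wedge_tailed_shearwater'), ('skua');
""")

# (common_name, count, remarks) for observation 1.
sightings = [
    ("wedge_tailed_shearwater", 119, "37.2"),
    ("skua", 1, None),
]

# The SELECT supplies speciesID from the species table, so no id is hardcoded.
conn.executemany(
    """
    INSERT INTO observed_species (observationID, speciesID, count, remarks)
    SELECT ?, speciesID, ?, ? FROM species WHERE common_name = ?
    """,
    [(1, count, remarks, name) for name, count, remarks in sightings],
)
conn.commit()

rows = conn.execute(
    "SELECT speciesID, count, remarks FROM observed_species ORDER BY id"
).fetchall()
print(rows)
```

A name that is not in `species` inserts nothing, and a name stored more than once inserts one row per match, so this approach works best when `common_name` values are unique.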


In [57]:
try:
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count', 'remarks') VALUES (1, 3, 119, '37.2');
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 14, 5);
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 15, 1);
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count', 'remarks') VALUES (1, 21, 6, '1.9');
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 24, 7);
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 29, 1);
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 30, 2);
    %sql INSERT INTO observed_species ('observationID', 'speciesID', 'count') VALUES (1, 31, 5);
except Exception as e:
    print(str(e))

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
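
To read the normalized data back, the dependent entity is joined to both of its parents. The sketch below uses the standard `sqlite3` module with a reduced Round 3 schema and the same ids and counts as the inserts above. It lists each sighting and sums the counts for the observation. The alias `seen` and the explicit `speciesID` values in the species insert are conveniences for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced Round 3 schema, loaded with the ids and counts used in the notebook.
conn.executescript("""
CREATE TABLE species (speciesID INTEGER PRIMARY KEY, common_name TEXT);
CREATE TABLE observation (observationID INTEGER PRIMARY KEY, date TEXT NOT NULL);
CREATE TABLE observed_species (
    id INTEGER PRIMARY KEY,
    observationID INTEGER NOT NULL,
    speciesID INTEGER NOT NULL,
    count INTEGER NOT NULL,
    remarks TEXT
);
INSERT INTO species (speciesID, common_name) VALUES
    (3, 'wedge_tailed_shearwater'), (14, 'redfooted_booby'),
    (15, 'great_frigatebird'), (21, 'sooty_tern'), (24, 'common_noddy'),
    (29, 'skua'), (30, 'tern'), (31, 'pterochroza');
INSERT INTO observation (date) VALUES ('1964-10-01');
INSERT INTO observed_species (observationID, speciesID, count, remarks) VALUES
    (1, 3, 119, '37.2'), (1, 14, 5, NULL), (1, 15, 1, NULL), (1, 21, 6, '1.9'),
    (1, 24, 7, NULL), (1, 29, 1, NULL), (1, 30, 2, NULL), (1, 31, 5, NULL);
""")

# One row per sighting, with ids resolved to a date and a species name.
per_species = conn.execute("""
SELECT o.date, s.common_name, seen.count, seen.remarks
FROM observed_species AS seen
JOIN observation AS o ON seen.observationID = o.observationID
JOIN species AS s ON seen.speciesID = s.speciesID
ORDER BY seen.id
""").fetchall()

# The total is now derived from the rows instead of being stored separately.
total = conn.execute(
    "SELECT SUM(seen.count) FROM observed_species AS seen WHERE seen.observationID = 1"
).fetchone()[0]

for row in per_species:
    print(row)
print(total)
```

Because the total is computed rather than stored, it cannot drift out of step with the individual counts the way `total_birds` could in Round 1.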
