# Referential integrity and referential actions
## Solution Notebook

This is a set of worked solutions to the `10.6 Referential integrity and referential actions` Notebook. 
The solutions given here should be viewed as a guide only: other equally acceptable solutions may be possible.

Enable access to the PostgreSQL database engine via [SQL Cell Magic](https://pypi.python.org/pypi/ipython-sql).

In [2]:
%load_ext sql
%sql postgresql://test:test@localhost:5432/tm351test

'Connected: test@tm351test'

As the `doctor` and `patient` tables may have been updated by another Notebook, recreate them.

In [3]:
%%sql
DROP TABLE IF EXISTS patient CASCADE;
DROP TABLE IF EXISTS doctor CASCADE;

CREATE TABLE doctor (
 doctor_id CHAR(3) NOT NULL
  CHECK (doctor_id SIMILAR TO 'd[0-9][0-9]'),
 doctor_name VARCHAR(20) NOT NULL,
 date_of_birth DATE NOT NULL,
 PRIMARY KEY (doctor_id)
 );

CREATE TABLE patient (
  patient_id CHAR(4) NOT NULL
    CHECK (patient_id SIMILAR TO 'p[0-9][0-9][0-9]'),
  patient_name VARCHAR(20) NOT NULL,
  date_of_birth DATE NOT NULL,
  gender CHAR(1) NOT NULL
    CHECK (gender = 'F' OR gender = 'M'),
  height DECIMAL(4,1)
    CHECK (height > 0),
  weight DECIMAL(4,1)
    CHECK (weight > 0),
  doctor_id CHAR(3),
 PRIMARY KEY (patient_id),
 FOREIGN KEY (doctor_id) REFERENCES doctor(doctor_id)
 );

Done.
Done.
Done.
Done.


[]

Populate the tables from the `.dat` data files using [Psycopg](http://initd.org/psycopg/docs/index.html), 
a PostgreSQL database adapter for Python.

In [4]:
import psycopg2 as pg
import pandas as pd
import pandas.io.sql as psqlg

In [5]:
# open a connection to the PostgreSQL database tm351test
conn = pg.connect(dbname='tm351test', host='localhost', user='test', password='test', port=5432)
# create a cursor
c = conn.cursor()

# open doctor.dat
io = open('data/doctor.dat', 'r')
# execute the PostgreSQL copy command
c.copy_from(io, 'doctor')
# close doctor.dat
io.close()
# commit transaction
conn.commit()

# open patient+doctor_id.dat
io = open('data/patient+doctor_id.dat', 'r')
# execute the PostgreSQL copy command
c.copy_from(io, 'patient')
# close patient+doctor_id.dat
io.close()
# commit transaction
conn.commit()

# close cursor
c.close()
# close database connection
conn.close()

In [6]:
%%sql
SELECT * 
FROM doctor
ORDER BY doctor_id;

5 rows affected.


doctor_id,doctor_name,date_of_birth
d06,Gibson,1954-02-24
d07,Paxton,1960-05-23
d09,Tamblin,1972-12-22
d10,Rampton,1980-09-25
d11,Nolan,1988-04-01


In [7]:
%%sql
SELECT * 
FROM patient
ORDER BY patient_id;

17 rows affected.


patient_id,patient_name,date_of_birth,gender,height,weight,doctor_id
p001,Thornton,1980-01-22,F,162.3,71.6,d06
p007,Tennent,1980-04-01,M,176.8,70.9,d07
p008,James,1980-07-08,M,167.9,70.5,d07
p009,Kay,1980-09-25,F,164.7,53.2,d06
p015,Harris,1980-12-04,M,180.6,64.3,d06
p031,Rubinstein,1980-12-23,F,,,
p037,Boswell,1981-06-11,F,,,
p038,Ming,1981-09-23,M,186.3,85.4,d11
p039,Maher,1981-10-09,F,161.9,73.0,d11
p068,Monroe,1981-10-21,F,165.0,62.6,d10


## Activity 1 - Referential integrity

Referential integrity is enforced by the DBMS, which rejects any operation that would violate it. 
A violation could arise, for example, in one of the following ways:
- when a row containing an invalid foreign key value is inserted in the *referencing* table
- when a foreign key in the *referencing* table is updated to an invalid value
- when a row with a referenced primary key is deleted from the *referenced* table
- when a referenced primary key is updated in the *referenced* table.

For each of the above, execute an SQL statement to demonstrate that PostgreSQL maintains the integrity of the 
relationship between the `doctor` and `patient` tables.

Notes:

As the `patient` table definition above includes a `FOREIGN KEY` declaration that `REFERENCES` the `doctor` table,
the `patient` table is the *referencing* table and the `doctor` table is the *referenced* table.
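
The first three cases can also be tried without a database server. The following is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for PostgreSQL (SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`). The cut-down tables and the helper `is_rejected` are illustrative only and are not part of the original Notebook.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (an assumption made for portability);
# SQLite only enforces foreign keys once the pragma below is switched on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE doctor (
        doctor_id TEXT PRIMARY KEY
    );
    CREATE TABLE patient (
        patient_id TEXT PRIMARY KEY,
        doctor_id TEXT REFERENCES doctor(doctor_id)
    );
    INSERT INTO doctor VALUES ('d06'), ('d11');
    INSERT INTO patient VALUES ('p001', 'd06'), ('p038', 'd11');
""")


def is_rejected(statement):
    """Hypothetical helper: True if the statement is refused by a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(statement)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    # 'd12' is not a doctor, so it is an invalid foreign key value
    "insert invalid foreign key": is_rejected(
        "INSERT INTO patient VALUES ('p090', 'd12')"),
    "update to invalid foreign key": is_rejected(
        "UPDATE patient SET doctor_id = 'd12' WHERE patient_id = 'p001'"),
    # 'd11' is still referenced by patient p038
    "delete referenced row": is_rejected(
        "DELETE FROM doctor WHERE doctor_id = 'd11'"),
    # a NULL foreign key references nothing, so it is allowed
    "insert null foreign key": is_rejected(
        "INSERT INTO patient VALUES ('p031', NULL)"),
}

for case, rejected in results.items():
    print(case, "->", "rejected" if rejected else "accepted")
```

The last case is included as a contrast: a row whose foreign key is `NULL` does not reference any doctor, which is why patients without a `doctor_id` can exist in the `patient` table.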

- when a row containing an invalid foreign key value is inserted in the referencing table:

In [8]:
%%sql
INSERT INTO patient (patient_id, patient_name, date_of_birth, gender, doctor_id)
            VALUES ('p090','Yamashita','1970-07-27','F','d12');

IntegrityError: (psycopg2.IntegrityError) insert or update on table "patient" violates foreign key constraint "patient_doctor_id_fkey"
DETAIL:  Key (doctor_id)=(d12) is not present in table "doctor".
 [SQL: "INSERT INTO patient (patient_id, patient_name, date_of_birth, gender, doctor_id)\n            VALUES ('p090','Yamashita','1970-07-27','F','d12');"]

- when a foreign key in the referencing table is updated to an invalid value:

In [9]:
%%sql
 UPDATE patient
  SET doctor_id = 'd12'
  WHERE patient_id = 'p089';

IntegrityError: (psycopg2.IntegrityError) insert or update on table "patient" violates foreign key constraint "patient_doctor_id_fkey"
DETAIL:  Key (doctor_id)=(d12) is not present in table "doctor".
 [SQL: "UPDATE patient\n  SET doctor_id = 'd12'\n  WHERE patient_id = 'p089';"]

- when a row with a referenced primary key is deleted from the referenced table:

In [10]:
%%sql
DELETE FROM doctor
WHERE doctor_id = 'd11';

IntegrityError: (psycopg2.IntegrityError) update or delete on table "doctor" violates foreign key constraint "patient_doctor_id_fkey" on table "patient"
DETAIL:  Key (doctor_id)=(d11) is still referenced from table "patient".
 [SQL: "DELETE FROM doctor\nWHERE doctor_id = 'd11';"]

Notes:

This restriction also prevents a *referenced* table from being deleted. For example, the following statement will 
fail because the foreign key constraint in the `patient` table depends on the `doctor` table.

`DROP TABLE doctor;`

However, a *referenced* table may be deleted if we delete the *referencing* table beforehand. 

`DROP TABLE patient;`

`DROP TABLE doctor;`

If we include the `CASCADE` option on a [`DROP TABLE`](http://www.postgresql.org/docs/9.3/static/sql-droptable.html) 
statement when deleting a *referenced* table, the `FOREIGN KEY` declarations in all the *referencing* tables will be 
deleted, which allows the *referenced* table to be deleted. The *referencing* tables themselves, and their rows, are 
kept. This is illustrated in the following sequence of SQL statements.

In [11]:
%%sql
DROP TABLE IF EXISTS referenced_table;
CREATE TABLE referenced_table (
 referenced_table_primary_key CHAR(2),
 PRIMARY KEY (referenced_table_primary_key)
);

DROP TABLE IF EXISTS referencing_table;
CREATE TABLE referencing_table (
 referencing_table_primary_key CHAR(2),
 referencing_table_foreign_key CHAR(2),
 PRIMARY KEY (referencing_table_primary_key),
 FOREIGN KEY (referencing_table_foreign_key) REFERENCES referenced_table(referenced_table_primary_key)
);

Done.
Done.
Done.
Done.


[]

In [12]:
%%sql
-- try to delete referenced_table.
DROP TABLE referenced_table;

InternalError: (psycopg2.InternalError) cannot drop table referenced_table because other objects depend on it
DETAIL:  constraint referencing_table_referencing_table_foreign_key_fkey on table referencing_table depends on table referenced_table
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
 [SQL: '-- try to delete referenced_table.\nDROP TABLE referenced_table;']

In [13]:
%%sql
-- display constraints defined on the referencing_table, noting the presence of the FOREIGN KEY constraint.
-- See 09.1 SQL DDL Notebook, Information schema
SELECT constraint_name, constraint_type 
FROM information_schema.table_constraints 
WHERE table_name = 'referencing_table';

3 rows affected.


constraint_name,constraint_type
referencing_table_pkey,PRIMARY KEY
referencing_table_referencing_table_foreign_key_fkey,FOREIGN KEY
2200_24801_1_not_null,CHECK


In [14]:
%%sql
-- delete referenced_table using CASCADE, which also removes the FOREIGN KEY constraint in referencing_table.
DROP TABLE referenced_table CASCADE;

Done.


[]

In [15]:
%%sql
-- display constraints defined on the referencing_table, noting the absence of the FOREIGN KEY constraint.
SELECT constraint_name, constraint_type  
FROM information_schema.table_constraints 
WHERE table_name = 'referencing_table';

2 rows affected.


constraint_name,constraint_type
referencing_table_pkey,PRIMARY KEY
2200_24801_1_not_null,CHECK


- when a referenced primary key is updated in the referenced table:

In [16]:
%%sql
UPDATE doctor
 SET doctor_id = 'd11'
 WHERE doctor_id = 'd12';

0 rows affected.


[]

In [17]:
%%sql
SELECT * 
FROM doctor
ORDER BY doctor_id;

5 rows affected.


doctor_id,doctor_name,date_of_birth
d06,Gibson,1954-02-24
d07,Paxton,1960-05-23
d09,Tamblin,1972-12-22
d10,Rampton,1980-09-25
d11,Nolan,1988-04-01


Notes:
    
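The `UPDATE` statement above affects 0 rows because no row in the `doctor` table has a `doctor_id` of `d12`, so the 
`doctor` table is unchanged and the restriction is not actually exercised. PostgreSQL does not treat primary key 
values as immutable. To attempt an update of a referenced primary key, swap the two values (set `doctor_id` to `d12` 
where it is `d11`): as patients still reference `d11` and the `FOREIGN KEY` declaration defines no `ON UPDATE` action, 
PostgreSQL rejects the change with a foreign key violation similar to the one raised by the `DELETE` statement above.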
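
A minimal, self-contained sketch of this case follows, assuming Python's built-in `sqlite3` module as a stand-in for PostgreSQL (SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`). The table contents and variable names are illustrative. It contrasts an update that matches no row, an update of a referenced key, and an update of a key that no row references.

```python
import sqlite3

# SQLite is used here in place of PostgreSQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE doctor (doctor_id TEXT PRIMARY KEY);
    CREATE TABLE patient (
        patient_id TEXT PRIMARY KEY,
        doctor_id TEXT REFERENCES doctor(doctor_id)
    );
    INSERT INTO doctor VALUES ('d09'), ('d11');
    INSERT INTO patient VALUES ('p038', 'd11');
""")

# No doctor has the id 'd12', so no row is matched and no constraint is checked.
no_match = conn.execute(
    "UPDATE doctor SET doctor_id = 'd11' WHERE doctor_id = 'd12'"
).rowcount

# 'd11' is referenced by patient p038, so changing it is refused.
try:
    conn.execute("UPDATE doctor SET doctor_id = 'd12' WHERE doctor_id = 'd11'")
    referenced_update_rejected = False
except sqlite3.IntegrityError:
    referenced_update_rejected = True

# 'd09' is not referenced by any patient, so its key can be changed.
unreferenced_update = conn.execute(
    "UPDATE doctor SET doctor_id = 'd13' WHERE doctor_id = 'd09'"
).rowcount

print(no_match, referenced_update_rejected, unreferenced_update)
```

In this sketch only the update of the referenced key is refused; the update that matches no row simply reports zero affected rows.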

## Activity 2 - Referential actions

Which would be the appropriate referential action to be taken when a row is deleted from the `doctor` table 
(for example, when a doctor leaves the surgery)?



`SET NULL` would be the appropriate referential action as it would automatically set the value of `patient.doctor_id` 
of the referencing rows to `null` when a row is deleted from the `doctor` table, denoting that these patients are not 
under the care of a doctor.

We can revise the `FOREIGN KEY` declaration in the `patient` table as follows to define and illustrate the effect of 
the `SET NULL` referential action.

In [21]:
%%sql
-- determine the name of the foreign key constraint
SELECT constraint_name, constraint_type 
FROM information_schema.table_constraints 
WHERE table_name = 'patient';

10 rows affected.


constraint_name,constraint_type
patient_patient_id_check,CHECK
patient_gender_check,CHECK
patient_height_check,CHECK
patient_weight_check,CHECK
patient_pkey,PRIMARY KEY
patient_doctor_id_fkey,FOREIGN KEY
2200_24782_1_not_null,CHECK
2200_24782_2_not_null,CHECK
2200_24782_3_not_null,CHECK
2200_24782_4_not_null,CHECK


In [19]:
%%sql
-- replace foreign key constraint to define referential action
ALTER TABLE patient
 DROP CONSTRAINT patient_doctor_id_fkey;

ALTER TABLE patient
 ADD CONSTRAINT patient_doctor_id_fkey 
  FOREIGN KEY (doctor_id) REFERENCES doctor(doctor_id) ON DELETE SET NULL;

Done.
Done.


[]

In [22]:
%%sql
-- delete a doctor
DELETE FROM doctor 
WHERE doctor_id = 'd10';

SELECT * 
FROM patient
ORDER BY patient_id;

0 rows affected.
17 rows affected.


patient_id,patient_name,date_of_birth,gender,height,weight,doctor_id
p001,Thornton,1980-01-22,F,162.3,71.6,d06
p007,Tennent,1980-04-01,M,176.8,70.9,d07
p008,James,1980-07-08,M,167.9,70.5,d07
p009,Kay,1980-09-25,F,164.7,53.2,d06
p015,Harris,1980-12-04,M,180.6,64.3,d06
p031,Rubinstein,1980-12-23,F,,,
p037,Boswell,1981-06-11,F,,,
p038,Ming,1981-09-23,M,186.3,85.4,d11
p039,Maher,1981-10-09,F,161.9,73.0,d11
p068,Monroe,1981-10-21,F,165.0,62.6,


`CASCADE` would be an inappropriate referential action as it would automatically delete all the rows of the `patient` 
table where `patient.doctor_id` has the same value as the primary key value of the deleted row in the `doctor` table. 
That is, all the details of the patients who are under the care of a doctor would be deleted when that doctor leaves 
the surgery.

We can revise the `FOREIGN KEY` declaration in the `patient` table as follows to define and illustrate the effect of the 
`CASCADE` referential action.

In [23]:
%%sql
-- replace foreign key constraint to define referential action
ALTER TABLE patient
 DROP CONSTRAINT patient_doctor_id_fkey;

ALTER TABLE patient
 ADD CONSTRAINT patient_doctor_id_fkey 
  FOREIGN KEY (doctor_id) REFERENCES doctor(doctor_id) ON DELETE CASCADE;

Done.
Done.


[]

In [24]:
%%sql
-- delete a doctor
DELETE FROM doctor 
WHERE doctor_id = 'd07';

SELECT * 
FROM patient
ORDER BY patient_id;

1 rows affected.
13 rows affected.


patient_id,patient_name,date_of_birth,gender,height,weight,doctor_id
p001,Thornton,1980-01-22,F,162.3,71.6,d06
p009,Kay,1980-09-25,F,164.7,53.2,d06
p015,Harris,1980-12-04,M,180.6,64.3,d06
p031,Rubinstein,1980-12-23,F,,,
p037,Boswell,1981-06-11,F,,,
p038,Ming,1981-09-23,M,186.3,85.4,d11
p039,Maher,1981-10-09,F,161.9,73.0,d11
p068,Monroe,1981-10-21,F,165.0,62.6,
p071,Harris,1981-12-12,M,186.3,76.7,
p078,Hunt,1982-02-25,M,179.9,74.3,
