# Working With CONSTRAINT Actions

By adding foreign key constraints to a table definition, we can prevent rows being added to a child table that do not contain a valid cross-reference into the parent table(s). But what if a row in the parent table that is referenced by one or more records in the child table is deleted? In that case, won't the child table contain orphaned elements?

In this notebook, we will look at some of the different ways in which the database can behave when referential constraints risk being violated. This gives us a high level of control over how database integrity is maintained.

You should spend about 2 hours on this notebook.

## Setting up

The next group of cells set up your database connection, and reset the database to a clean state. Check notebook *08.1 Data Definition Language in SQL* if you are unsure what the next cells do.

You may need to change the given values of the variables `DB_USER` and `DB_PWD`, depending on which environment you are using.

In [None]:
# Make the connection

%run sql_init.ipynb
print("Connecting with connection string : {}".format(DB_CONNECTION))
%sql $DB_CONNECTION

In [None]:
%run reset_databases.ipynb

## The `hospital` schema

As with notebook *09.2 Using foreign keys in SQL*, in this notebook we will use the `hospital` schema. This schema contains the four populated tables `patient`, `doctor`, `drug` and `prescription`, as well as the primary key constraints and foreign key constraints defined on them, as in Activity 3 of notebook *09.1 Defining Foreign Keys in SQL*.


As before, we can use the built-in variable `search_path` to tell PostgreSQL to search the `hospital` schema before the `public` schema:

In [None]:
%%sql

SET search_path TO hospital, public;

And check the value of `search_path` with:

In [None]:
%%sql

SHOW search_path;

which should list `hospital` before `public`.

## Referential actions

The previous notebooks have defined `doctor_id` as a column in the child table `patient`, which references the parent `doctor` table. If we look at the `patient` table, we can see which values in the `patient` table reference values in the `doctor` table:

In [None]:
%%sql

SELECT *
FROM patient;

If you ran the cell from the initial database state, the doctors which are referenced have identifiers `d06`, `d07`, `d10` and `d11`. We can see these in the `doctor` table:

In [None]:
%%sql

SELECT * 
FROM doctor;

### Activity 1

The statements in the next two cells attempt to remove the rows from the `doctor` table with values of `d09` and `d10` in the `doctor_id` column. What do you think will happen if you try to execute the statements (with the tables as they are populated at this point in the notebook)? Why?

```sql
DELETE FROM doctor
WHERE doctor_id='d09';

DELETE FROM doctor
WHERE doctor_id='d10';
```


Write your answer in this cell

#### Our solution

To reveal our solution, run this cell or click on the triangle symbol on the left-hand side of the cell.

The following cell executes the first statement:

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d09';

You should find that this deletes the row from `doctor` with a value of `doctor_id` of `'d09'`. Check this with a `SELECT` query:

In [None]:
%%sql

SELECT * 
FROM doctor;

You should have found that the row from `doctor` with a value of `doctor_id` of `'d09'` has been deleted. However, compare this with the behaviour when we try to delete the row with value `'d10'`:

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d10';

If you execute the cell, you should find that an `IntegrityError` is raised, with the message `update or delete on table "doctor" violates foreign key constraint "hospital_patient_doctor_fk" on table "patient"`. As we have seen ([Part 9, Section 4](https://learn2.open.ac.uk/mod/oucontent/view.php?id=1349963&section=4)), the foreign key constraint requires that each value that exists in the foreign key column (here, `patient.doctor_id`) must match a value in the `doctor.doctor_id` column. If we were to delete the row in `doctor` in which `doctor_id` is equal to `d10`, then there would be rows in `patient` in which the value of `doctor_id` does not correspond to any row in `doctor`. This is stated in the error message as: `Key (doctor_id)=(d10 ) is still referenced from table "patient".`

Therefore, the deletion is prevented.
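Before attempting a deletion, it can help to see which rows in `doctor` are currently referenced from `patient`. The following query is a minimal sketch of such a check; it uses only the `doctor_id` and `doctor_name` columns seen above, and it looks only at references from `patient` (other tables, such as `prescription`, may also reference `doctor`):

```sql
-- Count the rows in patient that reference each row in doctor.
-- A count of 0 means no patient row refers to that doctor.
SELECT d.doctor_id,
       d.doctor_name,
       COUNT(p.doctor_id) AS referencing_patients
FROM doctor AS d
LEFT JOIN patient AS p
       ON p.doctor_id = d.doctor_id
GROUP BY d.doctor_id, d.doctor_name
ORDER BY d.doctor_id;
```

A doctor with a non-zero count cannot be deleted while the foreign key constraint from `patient` blocks deletions of referenced rows.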

#### End of Activity 1

------------------------------------------------

If you have read [Part 9, section 9: Taking actions when foreign key constraints are violated](https://learn2.open.ac.uk/mod/oucontent/olinkremote.php?website=TM351&targetdoc=Part%209%20Relational%20data%20modelling&targetptr=9.1) of the online materials, you will have seen that there are several different actions that an RDBMS can take when a referenced foreign key value is deleted. In this case, the behaviour is the default behaviour for PostgreSQL. When no action is specified, PostgreSQL uses `NO ACTION` for both *updates* and *deletions*. For a constraint that is checked immediately (the usual case), `NO ACTION` has the same visible effect as `RESTRICT`: the statement is rejected with an error if it would result in a constraint violation.

As a reminder, here is the table from the materials which summarises the behaviours described in Harrington’s section [Foreign Keys](https://www.open.ac.uk/libraryservices/resource/bookchapter:117139&f=28414):

|<p align="left">SQL</p> |<p align="left">behaviour</p> |ON UPDATE | ON DELETE|
|--|--|--|--|
|<p align="left">SET NULL </p>    |<p align="left">Replace the foreign key value with NULL; this isn’t <br> possible when the foreign key  is part of the primary key <br> of its table, or where it has been defined as NOT NULL <br> as part of the column definition </p> |  Y | Y |
|<p align="left">SET DEFAULT</p>  | <p align="left">Replace the foreign key value with the column’s default value</p>      |    Y | Y |
|<p align="left">CASCADE</p>      | <p align="left">Delete or update all the foreign key rows which contain unmatched <br> values</p>      |     Y | Y |
|<p align="left">NO ACTION</p>    | <p align="left">On update, make no modifications of foreign key values</p>      |     Y | N |
|<p align="left">RESTRICT</p>     | <p align="left">Do not allow deletions of primary key rows if this would result <br> in a foreign key constraint violation</p>      |     N | Y |

To specify a particular action when a foreign key constraint would be violated, we use additional `ON UPDATE` and `ON DELETE` clauses, so that the constraint definition syntax becomes:

<code>ALTER TABLE &#x2329;table_name&#x232A;
ADD CONSTRAINT &#x2329;constraint_name&#x232A;
    FOREIGN KEY (&#x2329;column1, column2, ...&#x232A;) REFERENCES &#x2329;table_name&#x232A;
        ON UPDATE &#x2329;action&#x232A;
        ON DELETE &#x2329;action&#x232A;;
</code>

The clause <code>ON UPDATE &#x2329;action&#x232A;</code> is optional; if it is not specified, then the default behaviour is <code>ON UPDATE NO ACTION</code>. Similarly, the <code>ON DELETE &#x2329;action&#x232A;</code> clause is optional: if it is not specified, then PostgreSQL's default behaviour is <code>ON DELETE NO ACTION</code> which, unless the constraint check is deferred, prevents the deletion of a referenced row in the same way as <code>ON DELETE RESTRICT</code>.
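To see which actions are currently attached to each foreign key constraint, we can query the standard `information_schema.referential_constraints` view. This is a sketch which assumes that the constraints live in the `hospital` schema and that the connected user has sufficient privileges on the tables to see them in this view:

```sql
-- List each foreign key constraint in the hospital schema,
-- together with its ON UPDATE and ON DELETE actions.
SELECT constraint_name,
       update_rule,
       delete_rule
FROM information_schema.referential_constraints
WHERE constraint_schema = 'hospital'
ORDER BY constraint_name;
```

Re-running this query after dropping and redefining a constraint is a quick way to confirm that the new action has been recorded.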

In Activity 1 of this notebook, attempting to delete the referenced row in the `doctor` table invoked the `RESTRICT` behaviour of the DBMS. We will now look at two of the other possible behaviours: `ON DELETE/UPDATE CASCADE` and `ON DELETE/UPDATE SET NULL`.

### ON DELETE CASCADE

In order to experiment with the different constraint management behaviours, we need to be able to redefine the constraints on the various tables. If you check the error message generated in Activity 1, you can find the name of the constraint which was violated:

<code>update or delete on table "doctor" violates foreign key constraint "hospital_patient_doctor_fk" on table "patient"
DETAIL:  Key (doctor_id)=(d10 ) is still referenced from table "patient".
</code>

So the foreign key constraint in this case is called `hospital_patient_doctor_fk` (which is similar to the name we gave the foreign key constraint in notebook *09.1 Defining Foreign Keys in SQL*, but with `hospital_` prepended to clarify that the constraint has been defined in the `hospital` schema). Let's start by removing this constraint: 

In [None]:
%%sql

ALTER TABLE patient
DROP CONSTRAINT hospital_patient_doctor_fk;

Now we can redefine the constraint, with the action on deletion being `CASCADE` rather than the default:

In [None]:
%%sql

ALTER TABLE patient
ADD CONSTRAINT hospital_patient_doctor_fk
    FOREIGN KEY (doctor_id) REFERENCES doctor ON DELETE CASCADE;

Now, let's see what happens when we try the `DELETE` statement again: 

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d10';

If we check the `doctor` table, we should now find that the row with `d10` in the `doctor_id` column has been removed:

In [None]:
%%sql

SELECT *
FROM doctor;

As we would hope, the row has now been deleted. But what about the rows in `patient` which referenced the deleted row? To see, we can look at the `patient` table:

In [None]:
%%sql

SELECT *
FROM patient;

You should see that all the rows in `patient` which had contained `d10` in the `doctor_id` column have also been deleted. The following query, which selects only those rows, should therefore return no results:

In [None]:
%%sql

SELECT *
FROM patient
WHERE doctor_id = 'd10';

What will happen if we try to change a value in the `doctor` table? The following statement will attempt to change the row:

|doctor_id|doctor_name|
|---------|-----------|
|d11      |Nolan      |

in `doctor` to:

|doctor_id|doctor_name|
|---------|-----------|
|d17      |Nolan      |



In [None]:
%%sql

UPDATE doctor
SET doctor_id='d17'
    WHERE doctor_id='d11';

When the previous cell is executed, you should find that an `IntegrityError` is raised, with the explanatory text `update or delete on table "doctor" violates foreign key constraint "hospital_patient_doctor_fk" on table "patient"`. This indicates that our redefined foreign key constraint specified an action only for deletions (`ON DELETE`): we did not specify what should happen on an update (`ON UPDATE`), so the default behaviour still applies and the update is rejected. Let's redefine the `hospital_patient_doctor_fk` constraint, but this time specify that the `CASCADE` action should take effect for both deletions and updates:

In [None]:
%%sql

ALTER TABLE patient
DROP CONSTRAINT hospital_patient_doctor_fk;

ALTER TABLE patient
ADD CONSTRAINT hospital_patient_doctor_fk
    FOREIGN KEY (doctor_id) REFERENCES doctor 
        ON UPDATE CASCADE 
        ON DELETE CASCADE;

Now we can try to update the table again:

In [None]:
%%sql

UPDATE doctor
SET doctor_id='d17'
WHERE doctor_id='d11';

Having executed the previous cell, we should now find that the row in the `doctor` table whose value of `doctor_name` is `Nolan` has `d17` as the value in `doctor_id`.

In [None]:
%%sql

SELECT *
FROM doctor;

How has this affected the values in `patient`? We can check the table:

In [None]:
%%sql

SELECT *
FROM patient
WHERE doctor_id='d11' OR doctor_id='d17';

You should find that the update has cascaded through to the referring values in `patient`, and that the values in `patient.doctor_id` which had been `d11` are now `d17`.

## Constraint interactions can be complicated!

It is often the case that several constraints will act upon a given row or table at the same time. In these circumstances, it can be tricky to work out what the final behaviour will be. Remember that a change to the database will only be made if the change does not cause a constraint violation anywhere in the database!

### Activity 2

The statement in the next cell attempts to remove the row from the `doctor` table with a value of `d06` in the `doctor_id` column. What do you think will happen if you try to execute the statement (with the tables as they are populated at this point in the notebook)? Why?

```sql
DELETE FROM doctor
WHERE doctor_id='d06';
```

Write your answer in this cell

#### Our solution

To reveal our solution, run this cell or click on the triangle symbol on the left-hand side of the cell.

The following cell executes the given statement:

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d06';

If you execute the cell, you should find that an `IntegrityError` is raised, with the message `update or delete on table "doctor" violates foreign key constraint "hospital_prescription_doctor_fk" on table "prescription"`. Although we changed the action of the foreign key constraint between `patient` and `doctor` to `CASCADE`, we did not change the constraint between `prescription` and `doctor`, which still has the default behaviour of preventing the deletion.

Therefore, although the constraint `hospital_patient_doctor_fk` would allow the row in `doctor` to be deleted, the deletion is prevented by the constraint `hospital_prescription_doctor_fk`.


#### End of Activity 2

----------------------------------------------------

In Activity 2, we saw that the `hospital_prescription_doctor_fk` constraint could block the deletion of a row from the `doctor` table. What happens if we now drop that constraint?

The statement in the following cell drops the foreign key constraint defined on the `prescription` table referencing the `doctor` table:

In [None]:
%%sql

ALTER TABLE prescription
DROP CONSTRAINT hospital_prescription_doctor_fk;

### Activity 3

The statement in the next cell again attempts to remove the row from the `doctor` table with a value of `d06` in the `doctor_id` column. What do you think will happen if you try to execute the statement (with the tables as they are populated at this point in the notebook)? Why?

```sql
DELETE FROM doctor
WHERE doctor_id='d06';
```

Write your answer in this cell

#### Our solution

To reveal our solution, run this cell or click on the triangle symbol on the left-hand side of the cell.

The following cell executes the given statement:

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d06';

If you execute the cell, you should find that an `IntegrityError` is raised again. This time, the message states `update or delete on table "patient" violates foreign key constraint "hospital_prescription_patient_fk" on table "prescription"`. What's happened here?

Although we dropped the foreign key constraint between `prescription` and `doctor`, we did not change the definition between `prescription` and `patient`, and this constraint still has the default `ON DELETE` action of `RESTRICT`.

In this case, when we attempted to delete the row of `doctor` with `d06` in the `doctor.doctor_id` column, the `ON DELETE CASCADE` behaviour of the constraint `hospital_patient_doctor_fk` means that the rows in `patient` with `d06` in the `patient.doctor_id` column would be deleted. To see those rows, we can do a `SELECT` query:

In [None]:
%%sql

SELECT *
FROM patient
WHERE doctor_id='d06';

All these rows would be deleted following the `ON DELETE CASCADE` action. However, some of these rows are referenced in the `prescription` table. Because the behaviour defined by `hospital_prescription_patient_fk` is `RESTRICT` (i.e. prevent the deletion of referenced rows), the row from `doctor` cannot be deleted without also violating the constraint `hospital_prescription_patient_fk`. The deletion therefore fails.

#### End of Activity 3

-----------------------------------------

As Activity 2 and Activity 3 show, the interaction of constraints in a complete database can be very complex. It is important to be able to reason about how all the constraints in a large database will behave, and plan any changes accordingly.
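One way to reason about these interactions is to list every foreign key constraint in the schema alongside the table it is defined on and the table it references. The following is a sketch using the PostgreSQL system catalog `pg_constraint` and the function `pg_get_constraintdef`; it assumes the constraints are defined in the `hospital` schema, and the `regnamespace` cast requires PostgreSQL 9.5 or later:

```sql
-- List all foreign key constraints in the hospital schema,
-- showing the referencing table, the referenced table and
-- the full definition (including any ON UPDATE / ON DELETE action).
SELECT conname                   AS constraint_name,
       conrelid::regclass        AS referencing_table,
       confrelid::regclass       AS referenced_table,
       pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE contype = 'f'
  AND connamespace = 'hospital'::regnamespace
ORDER BY conname;
```

Reading down the `referenced_table` column shows which constraints must be satisfied when a row in that table is deleted, including constraints reached indirectly through a `CASCADE` action.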

## ON DELETE SET NULL

We have now seen how we can use the referential actions of `RESTRICT` to prevent deletions, and `CASCADE` to remove rows in referring tables to ensure that referential integrity is maintained. The final referential action we will look at is `SET NULL`.

If you have made only the adjustments to the database that we have shown in this notebook, both the `patient` and `doctor` tables should contain rows with `d07` in the `doctor_id` column, and the foreign key on the `prescription` table which references `doctor` has been `DROP`ped.

If you've changed these tables by experimenting further, that's great! You can always reset the database by executing the next cell, remembering to change the cell to be Code, rather than Raw NBConvert.

We can check that the `doctor` table contains a row for the doctor with id `d07`:

In [None]:
%%sql 

-- Check that the value 'd07' exists in the 'doctor.doctor_id' column

SELECT *
FROM doctor
WHERE doctor_id='d07';

and that there are some referencing rows in the `patient` table:

In [None]:
%%sql

-- Check that the value 'd07' exists in the 'patient.doctor_id' column

SELECT *
FROM patient
WHERE doctor_id='d07';

As before, to investigate the behaviour of the referential action we can drop the existing constraint and replace it with a new constraint with the new referential action:

In [None]:
%%sql

ALTER TABLE patient
DROP CONSTRAINT hospital_patient_doctor_fk;

ALTER TABLE patient
ADD CONSTRAINT hospital_patient_doctor_fk
    FOREIGN KEY (doctor_id) REFERENCES doctor
        ON DELETE SET NULL;

We can now attempt to remove the row from `doctor` with `d07` in the `doctor_id` column:

In [None]:
%%sql

DELETE FROM doctor
WHERE doctor_id='d07';

The row has been removed as we would expect. We can now see how the `patient` table has been affected:

In [None]:
%%sql

SELECT *
FROM patient;

There are now several rows which do not have a value in the `doctor_id` column (shown as `None` in the returned `DataFrame`). These are where the references to `d07` have been replaced by `NULL` after `d07` was removed from the `doctor` table.
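To pick out just the affected rows, we need the `IS NULL` predicate, because a comparison such as `doctor_id = NULL` never evaluates to true. A minimal sketch:

```sql
-- Count the patients who no longer have an associated doctor.
SELECT COUNT(*) AS patients_without_doctor
FROM patient
WHERE doctor_id IS NULL;
```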

### Activity 4

We can also check that both `prescription` and `drug` contain a row in which the value of `drug_code` is `T02378`:

In [None]:
%%sql

SELECT *
FROM prescription
WHERE drug_code='T02378';

In [None]:
%%sql

SELECT *
FROM drug
WHERE drug_code='T02378';

Suppose we changed the foreign key constraint in `prescription` referencing `drug` so that the referential action on deletion is `SET NULL` rather than `RESTRICT`:

In [None]:
%%sql

ALTER TABLE prescription
DROP CONSTRAINT hospital_prescription_drug_fk;

ALTER TABLE prescription
ADD CONSTRAINT hospital_prescription_drug_fk
    FOREIGN KEY (drug_code) REFERENCES drug
        ON DELETE SET NULL;

The statement in the next cell attempts to remove the row from the `drug` table with a value of `T02378` in the `drug_code` column. What do you think will happen if you try to execute the statement (with the referential action as `SET NULL`)? Why?


```sql
DELETE FROM drug
WHERE drug_code='T02378';
```

Write your answer in this cell

#### Our solution

To reveal our solution, run this cell or click on the triangle symbol on the left-hand side of the cell.

The following cell executes the given statement:

In [None]:
%%sql

DELETE FROM drug
WHERE drug_code='T02378';

If you execute the cell, you should find that an `IntegrityError` is raised. This time, the message states `null value in column "drug_code" violates not-null constraint`.

In this case, when we attempted to remove the row from `drug`, the database attempted to replace the value of `prescription.drug_code` in the referring row with `NULL` (in accordance with the `SET NULL` referential action). However, `drug_code` is part of the primary key for `prescription`. As we have seen, values in a primary key may not be `NULL`, and so the primary key constraint would be violated. The deletion therefore fails.
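Before choosing `SET NULL` as a referential action, it is worth checking whether the foreign key column can actually hold `NULL`. The following sketch queries the standard `information_schema.columns` view, assuming the table is in the `hospital` schema:

```sql
-- Show which columns of prescription accept NULL values.
-- Columns that form part of the primary key, or that are declared
-- NOT NULL, are reported with is_nullable = 'NO'.
SELECT column_name,
       is_nullable
FROM information_schema.columns
WHERE table_schema = 'hospital'
  AND table_name   = 'prescription'
ORDER BY ordinal_position;
```

If the foreign key column is reported as not nullable, a `SET NULL` action on that column will always fail when it is triggered.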

#### End of Activity 4

-----------------------------------------------------

We have not discussed the final referential action, `SET DEFAULT`, in this notebook. `SET DEFAULT` is not very commonly used, so we will not discuss it in TM351. This action behaves in a similar way to `SET NULL`, but replaces values with a default value rather than `NULL`. If you are interested, the details are [in the PostgreSQL documentation](https://www.postgresql.org/docs/9.5/ddl-constraints.html).

## Summary

You have now completed the Part 9 notebooks. In working through Parts 8 and 9, you have seen how data is modelled in a database, and how SQL is used to define and manipulate the data. Primary keys are used to ensure that individual entities in a database can be identified, and foreign keys are used to define the relationships between entities.

In Part 10, we will see how normalisation is used to minimise redundancy in a database, and so make it easier to maintain a database's integrity.