### Week 4 - Normalisation

#### INSERT, UPDATE, DELETE Anomalies
▪ INSERT Anomaly  
    – When adding data to a relation you are required to add other (related) data  
    – Danger: other data may not be available so cannot proceed with the insert  
▪ UPDATE Anomaly  
    – Changing a value for an attribute requires multiple tuples to be changed  
    – Danger: only some tuples will be updated leading to inconsistent data  
▪ DELETE Anomaly  
    – When a tuple in a relation is deleted, all data in that tuple is removed  
    – Danger: related data, which may be the only such data, will be lost  
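
The three anomalies can be seen in a small sketch. The rows below are a hypothetical unnormalised relation that stores part and category facts in the same tuple; the attribute names follow STOCK_DETAILS from later in these notes, and all sample values are invented for illustration.

```python
# Hypothetical unnormalised rows: part and category facts share one tuple.
# All sample values are invented for illustration.
stock = [
    {"part_no": 1, "part_name": "bolt", "cat_code": "H", "cat_name": "Hardware"},
    {"part_no": 2, "part_name": "nut", "cat_code": "H", "cat_name": "Hardware"},
    {"part_no": 3, "part_name": "glue", "cat_code": "A", "cat_name": "Adhesive"},
]

# UPDATE anomaly: renaming a category must touch every matching tuple.
# Here only the first match is changed, leaving inconsistent data.
for row in stock:
    if row["cat_code"] == "H":
        row["cat_name"] = "Fasteners"
        break

names_for_h = {row["cat_name"] for row in stock if row["cat_code"] == "H"}
print(names_for_h)  # two different names now exist for category code "H"

# DELETE anomaly: removing the only part in category "A" also removes
# the only record that category "A" is called "Adhesive".
stock = [row for row in stock if row["part_no"] != 3]
known_categories = {row["cat_code"] for row in stock}
print(known_categories)

# INSERT anomaly: a new category cannot be recorded without a part,
# because part_no (the identifier) would have to be left empty.
new_category = {"part_no": None, "part_name": None,
                "cat_code": "E", "cat_name": "Electrical"}
insert_allowed = new_category["part_no"] is not None
print(insert_allowed)
```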

#### Data Normalisation
▪ Relations MUST be normalised in order to avoid anomalies which may
occur when inserting, updating and deleting data.  
▪ Normalisation is a systematic series of steps for progressively refining the
data model.  
▪ A formal approach to analysing relations based on their primary key /
candidate keys and functional dependencies.  
▪ Used:  
    ▪ as a design technique "bottom up design", and  
    ▪ as a way of validating structures produced via "top down design" (ER model
    converted to a logical model - see next topic)  
    ▪ for this unit we are only concerned with conversion to third normal form; higher
    normal forms exist (Boyce-Codd Normal Form, fourth normal form … )  



#### Normalisation Process Goals
▪ Creating valid relations, i.e. each relation meets the properties of the
relational model. In particular:  
    – Entity integrity  
    – Referential integrity  
    – No many-to-many relationship  
    – Each cell contains a single value (is atomic).  

▪ In practical terms when implemented in an RDBMS:  
    – Each table represents a single subject  
    – No data item will be unnecessarily stored in more than one table (remember
    some redundancy still exists - minimal redundancy).  
    – The relationship between tables can be established (via PK and FK pairs).  
    – Each table is free of insert, update and delete anomalies.  
    

#### Unnormalised Form (UNF)
![image.png](attachment:image.png)
<style type="text/css">
    img {
        width: 400px;
    }
</style>
- The UNF representation of a relation is the one you map directly from your inspection of the form.  
- It has a single name, which is not pluralised.  
- No primary key is identified at this stage.  

STOCK_DETAILS(part_no, part_name, cat_code, cat_name, part_stock, part_sell(vendor_no, vendor_name, restock_date_purchased, restock_costpu, restock_qtysupplied, restock_payment))  


#### Functional Dependency
B is functionally dependent on A (written A → B) if each value of A determines exactly one value of B  

Total dependency  
- A determines B and B also determines A  

Full dependency  
- An attribute depends on all attributes of a composite primary key, not on just some of them

Partial dependency
- An attribute depends on only part of a composite primary key  

Transitive dependency  
- B depends on A, and C depends on B, so C depends on A through B
- and B is not a candidate key
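
As a rough illustration of these definitions, the sketch below tests whether one set of attributes determines another in some sample rows. The helper name `determines` and the sample values are hypothetical, and the composite key (part_no, vendor_no) is a simplifying assumption. Sample data can only refute a dependency, never prove it: real dependencies come from the business rules, not from whichever rows happen to exist.

```python
def determines(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented sample rows; assumed composite key is (part_no, vendor_no).
rows = [
    {"part_no": 1, "vendor_no": 10, "part_name": "bolt",
     "cat_code": "H", "cat_name": "Hardware", "restock_qtysupplied": 100},
    {"part_no": 1, "vendor_no": 20, "part_name": "bolt",
     "cat_code": "H", "cat_name": "Hardware", "restock_qtysupplied": 50},
    {"part_no": 2, "vendor_no": 10, "part_name": "nut",
     "cat_code": "H", "cat_name": "Hardware", "restock_qtysupplied": 75},
    {"part_no": 3, "vendor_no": 20, "part_name": "glue",
     "cat_code": "A", "cat_name": "Adhesive", "restock_qtysupplied": 30},
]

# Full dependency: the quantity needs the whole composite key.
full = determines(rows, ["part_no", "vendor_no"], ["restock_qtysupplied"])
part_alone = determines(rows, ["part_no"], ["restock_qtysupplied"])
vendor_alone = determines(rows, ["vendor_no"], ["restock_qtysupplied"])

# Partial dependency: part_name needs only part of the composite key.
partial = determines(rows, ["part_no"], ["part_name"])

# One link of a transitive dependency: a non-key attribute (cat_code)
# determines another non-key attribute (cat_name).
transitive_step = determines(rows, ["cat_code"], ["cat_name"])

print(full, part_alone, vendor_alone, partial, transitive_step)
```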


▪ Dependencies are depicted with the help of a Dependency Diagram.  
▪ Normalisation converts a relation into relations with a progressively smaller
number of attributes and tuples until an optimum level of decomposition is
reached, where little or no data redundancy exists.  
▪ The output from normalisation is a set of relations that meet all conditions
set in the relational model principles.  

#### First Normal Form
• a unique primary key has been identified for each tuple/row.  
• it is a valid relation  
    – Entity integrity (no part of PK is null)  
    – Single value for each cell, i.e. no repeating group
    (multivalued attribute).  
• all attributes are functionally dependent on all or part of the
primary key  

▪ Move from UNF to 1NF by:  
    – identifying a unique identifier for the repeating group.  
    – removing any repeating group, along with the PK of the main relation,
    into a new relation.  
    – checking the PK of the new relation: it will normally be a composite PK
    made up of the PK of the main relation and the unique identifier chosen
    in the first step, but this must be checked.  
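
The UNF to 1NF steps above can be sketched with nested Python data. The attribute names are taken from STOCK_DETAILS, but the sample values, the list name `restocks`, and the choice of (vendor_no, restock_date_purchased) as the identifier of the repeating group are assumptions made only for this illustration.

```python
# UNF: one record per part, with the restock history nested as a repeating group.
# Sample values are invented; "restocks" is a hypothetical name for the group.
unf = [
    {
        "part_no": 1, "part_name": "bolt",
        "restocks": [
            {"vendor_no": 10, "restock_date_purchased": "2024-01-05",
             "restock_qtysupplied": 100},
            {"vendor_no": 20, "restock_date_purchased": "2024-02-11",
             "restock_qtysupplied": 50},
        ],
    },
    {
        "part_no": 2, "part_name": "nut",
        "restocks": [
            {"vendor_no": 10, "restock_date_purchased": "2024-01-05",
             "restock_qtysupplied": 75},
        ],
    },
]

# Main relation: the non-repeating attributes, identified by part_no.
part = [{k: v for k, v in rec.items() if k != "restocks"} for rec in unf]

# New relation: each repeating-group row plus the PK of the main relation.
restock = [{"part_no": rec["part_no"], **r}
           for rec in unf for r in rec["restocks"]]

# Assumed composite PK: part_no plus the identifier chosen for the group.
# This must be checked; here the check is simply a test for duplicate keys.
pk = ["part_no", "vendor_no", "restock_date_purchased"]
keys = [tuple(row[a] for a in pk) for row in restock]
pk_is_unique = len(keys) == len(set(keys))

print(len(part), len(restock), pk_is_unique)
```

Every cell in `part` and `restock` now holds a single value, which is the property 1NF asks for.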
    

#### Second Normal Form
– all non-key attributes are fully functionally dependent on the primary
key (simple definition)
    • used by the textbook in examples:
    • see textbook section 6-3 (last paragraph immediately below table 6.2), "Although
    normalization is typically presented from the perspective of candidate keys, this initial
    discussion assumes for the sake of simplicity that each table has only one candidate
    key"
– all non-key attributes are fully functionally dependent on any
candidate key (general definition)
    • The general definition is the requirement for our unit
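
A minimal sketch of moving a 1NF relation to 2NF by removing a partial dependency. It assumes a hypothetical relation with composite PK (part_no, vendor_no) in which vendor_name depends on vendor_no alone; the helper `project` and all sample values are invented for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen = set()
    out = []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


# 1NF relation, assumed PK (part_no, vendor_no).
# vendor_name depends only on vendor_no: a partial dependency.
restock_1nf = [
    {"part_no": 1, "vendor_no": 10, "vendor_name": "Acme",
     "restock_qtysupplied": 100},
    {"part_no": 1, "vendor_no": 20, "vendor_name": "Bolt Co",
     "restock_qtysupplied": 50},
    {"part_no": 2, "vendor_no": 10, "vendor_name": "Acme",
     "restock_qtysupplied": 75},
]

# The partially dependent attribute moves out with its determinant,
# which becomes the PK of the new relation.
vendor = project(restock_1nf, ["vendor_no", "vendor_name"])

# The original relation keeps the full key and the fully dependent attribute.
restock_2nf = project(restock_1nf,
                      ["part_no", "vendor_no", "restock_qtysupplied"])

print(vendor)
print(restock_2nf)
```

After the split, each vendor name is stored once, and vendor_no in the remaining relation acts as the link back to it.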

    

#### Third Normal Form
– all transitive dependencies have been removed: check for a non-key
attribute dependent on another non-key attribute

▪ Move from 2NF to 3NF by removing transitive dependencies
    – Remove the attributes with transitive dependency into a new relation.
    – The determinant will be an attribute in both the original and new
    relations (it will become a PK / FK relationship)
    – Assign the determinant to be the PK of the new relation
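
The resulting PK / FK pair can be sketched with Python's built-in `sqlite3` module. This assumes that cat_code determines cat_name (a transitive dependency in STOCK_DETAILS); the table names `category` and `part`, the reduced attribute list, and the sample values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# cat_code (the determinant) is the PK of the new relation
# and stays behind in part as an FK.
conn.executescript("""
CREATE TABLE category (
    cat_code TEXT PRIMARY KEY,
    cat_name TEXT NOT NULL
);
CREATE TABLE part (
    part_no   INTEGER PRIMARY KEY,
    part_name TEXT NOT NULL,
    cat_code  TEXT NOT NULL REFERENCES category (cat_code)
);
""")

conn.execute("INSERT INTO category VALUES ('H', 'Hardware')")
conn.executemany("INSERT INTO part VALUES (?, ?, ?)",
                 [(1, "bolt", "H"), (2, "nut", "H")])

# Renaming the category now changes exactly one tuple.
cur = conn.execute(
    "UPDATE category SET cat_name = 'Fasteners' WHERE cat_code = 'H'")
rows_updated = cur.rowcount

# The FK rejects a part that refers to a category that does not exist.
try:
    conn.execute("INSERT INTO part VALUES (3, 'glue', 'A')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

# Every part sees the new name through the PK / FK join.
names = [r[0] for r in conn.execute(
    "SELECT c.cat_name FROM part p "
    "JOIN category c ON c.cat_code = p.cat_code ORDER BY p.part_no")]

print(rows_updated, fk_enforced, names)
```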

    

DRONE_RENTAL(drone_id, drone_type, drone_manufacturer, drone_date_purchased, drone_rental(rental_number, rental_date, employee_no_rental, return_date, employee_no_return, drone_damage))



#### Summary
- Represent the form as presented, not interpreted, to yield the starting point (UNF)
- Identify the functional dependencies
- Remove attributes into new relations step by step, based on the concepts of 1NF, 2NF and 3NF
    - UNF to 1NF: define the PK and remove repeating groups
    - 1NF to 2NF: remove partial dependencies
    - 2NF to 3NF: remove transitive dependencies


