# Codd’s Rules and Normalization

As part of this class we have covered:

* Rationale behind DBMS/RDBMS
* Database principles such as Codd’s rules
* Normalization principles
* Data Definition Language

### Codd’s Rules:

* E. F. Codd defined thirteen rules, numbered 0 to 12, that a database system must satisfy to be considered relational.
    * **Rule 0**: The foundation rule: the system must manage the database entirely through its relational capabilities.
    * **Rule 1**: The information rule: all information is represented as values in tables.
    * **Rule 2**: The guaranteed access rule: every value is accessible through a combination of table name, primary key value and column name.
    * **Rule 3**: Systematic treatment of null values.
    * **Rule 4**: Dynamic online catalog based on the relational model.
    * **Rule 5**: The comprehensive data sublanguage rule.
    * **Rule 6**: The View updating rule.
    * **Rule 7**: High level insert, update and delete.
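    * **Rule 8**: Physical data independence: applications are not affected by changes to how the data is physically stored or accessed.
    * **Rule 9**: Logical data independence: applications are not affected by information-preserving changes to table structures.
    * **Rule 10**: Integrity independence: constraints are defined in the data sublanguage and stored in the catalog, not in application programs.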
    * **Rule 11**: Distribution Independence.
    * **Rule 12**: The nonsubversion rule.
* Information rule: all information is stored as values in tables. Ex: <mark>**select * from employees;**</mark>
* Guaranteed access rule: query the data by using the appropriate table name and column names. Ex: <mark>**select employee_id, first_name, last_name from employees;**</mark>
* Systematic treatment of null values: in the output of <mark>**select * from employees;**</mark> the commission value is null for some of the records.
* To get only the records where the commission is not null, use <mark>**select * from employees where commission_pct is not null;**</mark>
* Online catalog: to get the list of tables owned by the current user, use <mark>**select * from user_tables;**</mark>
* <mark>**select * from user_tab_cols;**</mark> retrieves the list of all columns for those tables.
* **DDL** (Data Definition Language) commands define or change the structure of data. Ex: <mark>**CREATE, DROP, ALTER.**</mark>
* **DML** (Data Manipulation Language) commands are <mark>**SELECT, INSERT, UPDATE, DELETE.**</mark>
* **TCL** (Transaction Control Language) commands are <mark>**COMMIT, ROLLBACK.**</mark>
* The available constraints are <mark>**PRIMARY KEY, FOREIGN KEY, NOT NULL, CHECK, UNIQUE.**</mark>
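
The null-handling and catalog bullets above can be tried out with a minimal sketch. It assumes SQLite through Python's built-in `sqlite3` module instead of Oracle, so the catalog is read from `sqlite_master` and `PRAGMA table_info` rather than `user_tables` and `user_tab_cols`. The column list of `employees` is trimmed and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; a stand-in for the Oracle schema used in the notes.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employees (
        employee_id    INTEGER PRIMARY KEY,
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        commission_pct REAL  -- nullable: not every employee earns commission
    )
    """
)

# Hypothetical sample rows, invented for illustration only.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Rao", 0.10),
        (2, "Ben", "Lee", None),
        (3, "Cara", "Diaz", 0.25),
    ],
)

# Rule 3: NULL is tested with IS [NOT] NULL, never with = or <>.
with_commission = conn.execute(
    "SELECT employee_id FROM employees "
    "WHERE commission_pct IS NOT NULL ORDER BY employee_id"
).fetchall()

# A comparison with NULL evaluates to unknown, so this query matches no rows.
equals_null = conn.execute(
    "SELECT employee_id FROM employees WHERE commission_pct = NULL"
).fetchall()

# Rule 4: the catalog is itself queried with SQL.
# SQLite exposes it as sqlite_master; Oracle uses user_tables / user_tab_cols.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]

print(with_commission)  # employees 1 and 3 have a commission
print(equals_null)      # empty list
print(tables, columns)
```

The `= NULL` query is included only to show why `is not null` is needed: it runs without error but never matches a row.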

### Normalization:

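Normalization is the process of splitting data into related tables so that each fact is stored only once; the relationships between the tables are expressed through keys.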
* **1st Normal Form**
* **2nd Normal Form**
* **3rd Normal Form**
* **BCNF**
* Typically, we normalize the raw data and store it in multiple tables.
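* Normalization addresses the update anomaly, insertion anomaly and deletion anomaly issues that arise when the same fact is stored in more than one place. For more detailed information go through https://en.wikipedia.org/wiki/Database_normalization
* First normal form (1NF) says every record has the same set of columns and every column holds a single atomic value, with no repeating groups.
* Second normal form (2NF) is defined in terms of **functional dependency.** Every non-key attribute must depend on the whole primary key. Attributes that depend on only part of a composite key are moved to a new table keyed by that part.
* Ex: each order item belongs to exactly one order and each order belongs to exactly one customer. It means one customer can place many orders and one order can have multiple order items, so customers, orders and order items are stored in separate tables.
* An attribute B is functionally dependent on an attribute A if each value of A determines exactly one value of B. Many values of A can share the same value of B, which is why functional dependencies often show up as one to many relationships between tables.
* Third normal form (3NF) removes **transitive dependency**: if a non-key attribute depends on another non-key attribute, which in turn depends on the key, the dependent attributes are moved to their own table, even if the relationship between the resulting tables is one to one. Ex: an order item determines its order and the order determines its customer, so customer details belong in the customers table and not in order items.
* BCNF (Boyce-Codd normal form) is a stricter version of 3NF in which every determinant must be a candidate key. It differs from 3NF mainly when a table has composite, overlapping candidate keys.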
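
As a rough illustration of the update anomaly that normalization is meant to prevent, the sketch below compares a flat table with a normalized pair of tables. It assumes SQLite through Python's `sqlite3` module. The `orders_flat` table, the `customer_city` column and the sample rows are hypothetical and not part of the course data model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the customer's city is repeated on every order row.
conn.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, customer_city TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, 100, "Austin"), (2, 100, "Austin"), (3, 200, "Boston")],
)

# Update anomaly: fixing the city on one row leaves the other row stale.
conn.execute("UPDATE orders_flat SET customer_city = 'Dallas' WHERE order_id = 1")
cities_flat = conn.execute(
    "SELECT DISTINCT customer_city FROM orders_flat "
    "WHERE customer_id = 100 ORDER BY customer_city"
).fetchall()

# Normalized: customer_city depends only on customer_id, so it is stored once.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_city TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "order_customer_id INTEGER REFERENCES customers (customer_id))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(100, "Austin"), (200, "Boston")]
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100), (2, 100), (3, 200)])

# A single UPDATE changes the one stored fact; every order sees the new value.
conn.execute("UPDATE customers SET customer_city = 'Dallas' WHERE customer_id = 100")
cities_normalized = conn.execute(
    """
    SELECT DISTINCT c.customer_city
    FROM orders o
    JOIN customers c ON c.customer_id = o.order_customer_id
    WHERE c.customer_id = 100
    """
).fetchall()

print(cities_flat)        # two conflicting cities for customer 100
print(cities_normalized)  # one consistent city
```

In the flat table customer 100 ends up with two different cities, while the normalized design cannot express that inconsistency.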

### Creating Tables:

* The CREATE TABLE command is used to create a table definition. It is a DDL command.
* A typical table should have a primary key constraint, which enforces that the key columns are unique and not null.

SYNTAX:

```
CREATE TABLE <TABLE_NAME> ( <COLUMN_NAME> <DATA_TYPE>, <COLUMN_NAME> <DATA_TYPE> )
```

```sql
-- Primary key defined as part of CREATE TABLE
CREATE TABLE orders (
  order_id INTEGER,
  order_date DATE,
  order_customer_id INTEGER,
  order_status VARCHAR2(30),
  CONSTRAINT orders_pk PRIMARY KEY (order_id)
);

-- Table created without constraints; they are added afterwards
CREATE TABLE order_items (
  order_item_id INTEGER,
  order_item_order_id INTEGER,
  order_item_product_id INTEGER,
  order_item_quantity INTEGER,
  order_item_subtotal NUMBER(8,2),
  order_item_product_price NUMBER(8,2)
);

-- Primary key added after the table exists
ALTER TABLE order_items ADD CONSTRAINT order_item_pk PRIMARY KEY (order_item_id);

-- Foreign key: every order item must reference an existing order
ALTER TABLE order_items ADD CONSTRAINT order_item_fk FOREIGN KEY (order_item_order_id) REFERENCES orders (order_id);
```
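
To see what the primary key and foreign key constraints above actually enforce, here is a minimal sketch that assumes SQLite through Python's `sqlite3` module instead of Oracle. SQLite has no `ALTER TABLE ... ADD CONSTRAINT`, so both constraints on `order_items` are declared inside `CREATE TABLE`. The `try_insert` helper and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER,
        order_date DATE,
        order_customer_id INTEGER,
        order_status VARCHAR2(30),
        CONSTRAINT orders_pk PRIMARY KEY (order_id)
    )
    """
)
conn.execute(
    """
    CREATE TABLE order_items (
        order_item_id INTEGER,
        order_item_order_id INTEGER,
        order_item_product_id INTEGER,
        order_item_quantity INTEGER,
        order_item_subtotal NUMBER(8,2),
        order_item_product_price NUMBER(8,2),
        CONSTRAINT order_item_pk PRIMARY KEY (order_item_id),
        CONSTRAINT order_item_fk FOREIGN KEY (order_item_order_id)
            REFERENCES orders (order_id)
    )
    """
)


def try_insert(sql, params):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


insert_order = "INSERT INTO orders VALUES (?, ?, ?, ?)"
insert_item = "INSERT INTO order_items VALUES (?, ?, ?, ?, ?, ?)"

results = [
    try_insert(insert_order, (1, "2024-01-15", 100, "COMPLETE")),
    # Same order_id again: violates the primary key.
    try_insert(insert_order, (1, "2024-01-16", 200, "PENDING")),
    try_insert(insert_item, (1, 1, 500, 2, 39.98, 19.99)),
    # Order 99 does not exist: violates the foreign key.
    try_insert(insert_item, (2, 99, 500, 1, 19.99, 19.99)),
]
print(results)
```

The second and fourth inserts are rejected by the database itself, so no application code is needed to keep the data consistent.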

* There are multiple ways to define a primary key for a table: as part of CREATE TABLE, as done for orders, or afterwards with ALTER TABLE, as done for order_items.
* Similarly, create the rest of the tables as per the data model.
* Whenever we create a primary key, the database creates a unique index to enforce it, unless a suitable index already exists. The index keeps the key values in sorted order, so searching by primary key is faster than scanning the whole table.
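
The link between a primary key and its index can be observed with a minimal sketch, again assuming SQLite through Python's `sqlite3` module rather than Oracle. The `orders` table is trimmed to two columns, and `order_id` is declared `INT` instead of `INTEGER` only because SQLite treats an `INTEGER PRIMARY KEY` as an alias for its internal row id and then creates no separate index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# INT (not INTEGER) on purpose, so SQLite builds a separate unique index.
conn.execute(
    """
    CREATE TABLE orders (
        order_id INT,
        order_status VARCHAR2(30),
        CONSTRAINT orders_pk PRIMARY KEY (order_id)
    )
    """
)

# No CREATE INDEX was issued, yet the catalog lists an index on orders.
indexes = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'orders'"
).fetchall()

# Ask the planner how it would find a single order by its primary key.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE order_id = 42"
).fetchall()
# The last column of each plan row holds the human-readable description.
plan_detail = " ".join(row[-1] for row in plan)

print(indexes)
print(plan_detail)
```

The plan text names the automatically created index, which shows that a primary key lookup searches the index instead of scanning the table. The exact wording of the plan differs between SQLite versions.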