## Exercises - Managing Database Objects

This exercise assesses your ability to put the important DDL concepts into practice by working out a solution to a typical data migration problem from one database (MySQL) to another (Postgres).
* Here are the high-level steps for migrating from one type of database to another.
  * Extract DDL Statements from source database (MySQL).
  * Extract the data in the form of delimited files and ship them to target database.
  * Refactor scripts as per target database (Postgres).
  * Create tables in the target database.
  * Execute pre-migration steps (disable constraints, drop indexes, etc.).
  * Load the data using native utilities.
  * Execute post-migration steps (enable constraints, create or rebuild indexes, reset sequences, etc.).
  * Run sanity checks with basic queries.
  * Make sure all the impacted applications are validated thoroughly.
* The scripts and data set are available in our GitHub repository. If you are using our environment, the repository is already cloned under **/data/retail_db**.
* It has scripts to create tables with primary keys. Those scripts are generated from MySQL tables and refactored for Postgres.
  * Script to create tables: **create_db_tables_pg.sql**
  * Load data into tables: **load_db_tables_pg.sql**
* Here are the steps you need to perform to complete this exercise.
  * Create tables
  * Load data
  * All the tables have surrogate primary keys. Here are the details.
    * orders.order_id
    * order_items.order_item_id
    * customers.customer_id
    * products.product_id
    * categories.category_id
    * departments.department_id
  * Get the maximum value from all surrogate primary key fields.
  * Create sequences for all surrogate primary key fields using the maximum value. Make sure to use standard naming conventions for sequences.
  * Ensure sequences are mapped to the surrogate primary key fields.
  * Create foreign key constraints based upon this information.
    * orders.order_customer_id to customers.customer_id
    * order_items.order_item_order_id to orders.order_id
    * order_items.order_item_product_id to products.product_id
    * products.product_category_id to categories.category_id
    * categories.category_department_id to departments.department_id
  * Insert a few records in `departments` to ensure that sequence-generated numbers are used for `department_id`.
* Here are the commands to launch `psql` and run scripts to create tables as well as load data into tables.

```sql
psql -U itversity_retail_user \
  -h localhost \
  -p 5432 \
  -d itversity_retail_db \
  -W

\i /data/retail_db/create_db_tables_pg.sql

\i /data/retail_db/load_db_tables_pg.sql
```
* This approach (create the tables, load the data, then add constraints and reset sequences) is the one we use for large-volume data migrations from one database to another.
* Here are the commands or queries you need to come up with to solve this problem.

In [1]:
%load_ext sql

In [5]:
%env DATABASE_URL=postgresql://itv001477_retail_user:xndxklzpkk2h37en8z32vzeekbuvwkca@m01.itversity.com:5433/itv001477_retail_db

env: DATABASE_URL=postgresql://itv001477_retail_user:xndxklzpkk2h37en8z32vzeekbuvwkca@m01.itversity.com:5433/itv001477_retail_db


### Exercise 1

Queries to get maximum values from surrogate primary keys.

In [13]:
%%sql
select constraint_name,constraint_type from information_schema.table_constraints limit 6

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
6 rows affected.


constraint_name,constraint_type
departments_pkey,PRIMARY KEY
categories_pkey,PRIMARY KEY
products_pkey,PRIMARY KEY
customers_pkey,PRIMARY KEY
orders_pkey,PRIMARY KEY
order_items_pkey,PRIMARY KEY


In [20]:
%%sql
select max(order_id) from orders

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
68883


In [31]:
%%sql
select max(order_item_id) from order_items

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
172198


In [35]:
%%sql
select max(category_id) from categories

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
58


In [34]:
%%sql
select max(product_id) from products

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
1345


In [33]:
%%sql
select max(customer_id) from customers

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
12435


In [32]:
%%sql
select max(department_id) from departments

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1 rows affected.


max
7
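
The six queries above follow one pattern, so they can also be driven from a list of (table, column) pairs. The following is a minimal sketch of that idea. It assumes an in-memory SQLite database with made-up rows as a stand-in for the Postgres tables; the `max_keys` helper and the sample ids are hypothetical, and only the table and column names come from the exercise.

```python
import sqlite3

# (table, surrogate key) pairs listed in the exercise.
SURROGATE_KEYS = [
    ("orders", "order_id"),
    ("order_items", "order_item_id"),
    ("customers", "customer_id"),
    ("products", "product_id"),
    ("categories", "category_id"),
    ("departments", "department_id"),
]


def max_keys(conn, keys):
    """Return {table: max(key column)} for each (table, column) pair."""
    result = {}
    for table, column in keys:
        # Identifiers come from the fixed list above, never from user input.
        row = conn.execute(f"SELECT max({column}) FROM {table}").fetchone()
        result[table] = row[0]
    return result


# Stand-in data: every table holds the ids 1, 2 and 3 (hypothetical values).
conn = sqlite3.connect(":memory:")
for table, column in SURROGATE_KEYS:
    conn.execute(f"CREATE TABLE {table} ({column} INTEGER PRIMARY KEY)")
    conn.executemany(f"INSERT INTO {table} VALUES (?)", [(1,), (2,), (3,)])

maxima = max_keys(conn, SURROGATE_KEYS)
print(maxima)
```

Against the real database the same loop would run over a Postgres connection instead of the SQLite stand-in.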


### Exercise 2

Commands to add sequences with `START WITH` pointing to the maximum value for the corresponding surrogate primary key fields. Make sure to give the sequences meaningful names of the form **TABLENAME_SURROGATEFIELD_seq** (example: `users_user_id_seq` for `users.user_id`).

In [41]:
%%sql
create sequence if not exists orders_order_id_seq
start with 68883;
create sequence if not exists order_items_order_item_id_seq
start with 172198;
create sequence if not exists categories_category_id_seq
start with 58;
create sequence if not exists products_product_id_seq
start with 1345;
create sequence if not exists customers_customer_id_seq
start with 12435;
create sequence if not exists departments_department_id_seq
start with 7;

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.
Done.
Done.
Done.
Done.
Done.


[]
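
One detail is worth checking here: in PostgreSQL the first `nextval` call on a new sequence returns the `START WITH` value itself. A sequence that starts at the current maximum would therefore hand out a key that already exists once it is used for inserts. The sketch below builds the statements with the maximum plus one instead. The maxima are the values recorded in Exercise 1; the `create_sequence_sql` helper is hypothetical and only produces the statement text.

```python
# Maximum key values as reported by the Exercise 1 queries.
MAX_KEYS = {
    ("orders", "order_id"): 68883,
    ("order_items", "order_item_id"): 172198,
    ("categories", "category_id"): 58,
    ("products", "product_id"): 1345,
    ("customers", "customer_id"): 12435,
    ("departments", "department_id"): 7,
}


def create_sequence_sql(table, column, current_max):
    """Build a CREATE SEQUENCE statement starting one past the current maximum."""
    name = f"{table}_{column}_seq"  # TABLENAME_SURROGATEFIELD_seq convention
    return f"CREATE SEQUENCE IF NOT EXISTS {name} START WITH {current_max + 1};"


statements = [create_sequence_sql(t, c, m) for (t, c), m in MAX_KEYS.items()]
for stmt in statements:
    print(stmt)
```

For a sequence that already exists with the maximum as its start value, `SELECT setval('departments_department_id_seq', 7);` marks 7 as used, so the next `nextval` call returns 8.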

### Exercise 3

Commands to alter sequences to bind them to corresponding surrogate primary key fields.

In [42]:
%%sql
alter sequence orders_order_id_seq
owned by orders.order_id;
alter sequence order_items_order_item_id_seq
owned by order_items.order_item_id;
alter sequence products_product_id_seq
owned by products.product_id;
alter sequence customers_customer_id_seq
owned by customers.customer_id;
alter sequence categories_category_id_seq
owned by categories.category_id;
alter sequence departments_department_id_seq
owned by departments.department_id;

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.
Done.
Done.
Done.
Done.
Done.


[]

In [47]:
%%sql
select sequence_name from information_schema.sequences

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
6 rows affected.


sequence_name
orders_order_id_seq
order_items_order_item_id_seq
categories_category_id_seq
products_product_id_seq
customers_customer_id_seq
departments_department_id_seq
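
`OWNED BY` ties the lifetime of a sequence to its column (dropping the column drops the sequence), but it does not by itself make inserts draw values from the sequence. For that, the column also needs a default of `nextval(...)`. The sketch below generates the corresponding `ALTER TABLE` statements; the `set_default_sql` helper is hypothetical, and the sequence names assume the naming convention from Exercise 2.

```python
# (table, surrogate key) pairs listed in the exercise.
SURROGATE_KEYS = [
    ("orders", "order_id"),
    ("order_items", "order_item_id"),
    ("customers", "customer_id"),
    ("products", "product_id"),
    ("categories", "category_id"),
    ("departments", "department_id"),
]


def set_default_sql(table, column):
    """Build the statement that makes the column default to its sequence."""
    seq = f"{table}_{column}_seq"  # assumes TABLENAME_SURROGATEFIELD_seq naming
    return (
        f"ALTER TABLE {table} "
        f"ALTER COLUMN {column} SET DEFAULT nextval('{seq}');"
    )


default_statements = [set_default_sql(t, c) for t, c in SURROGATE_KEYS]
for stmt in default_statements:
    print(stmt)
```

Once the defaults are in place, an insert into `departments` that omits `department_id` should pick up its key from `departments_department_id_seq`, which is what the last step of the exercise asks you to confirm.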


### Exercise 4

Add Foreign Key constraints to the tables.
* Validate whether the tables have data violating the foreign key constraints (Hint: You can use a left outer join to find rows that are in the child table but not in the parent table).
* Alter tables to add foreign keys as specified.
* Here are the relationships for your reference.
  * orders.order_customer_id to customers.customer_id
  * order_items.order_item_order_id to orders.order_id
  * order_items.order_item_product_id to products.product_id
  * products.product_category_id to categories.category_id
  * categories.category_department_id to departments.department_id
  * There might be rows in a child table that have no matching row in the parent table. You should find them as part of the validations.
  * If there are foreign key violations, make sure to update the data in the child table's foreign key column to null values.
* Solution should contain the following:
  * Commands to add foreign keys to the tables.
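
The left outer join hint can be sketched as follows. This is a minimal example that assumes an in-memory SQLite database with a few made-up rows as a stand-in for `products` and `categories`; the sample ids are hypothetical. The join keeps every child row, and the rows whose parent columns come back as NULL are the orphans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (category_id INTEGER PRIMARY KEY);
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_category_id INTEGER
    );
    INSERT INTO categories VALUES (1), (2);
    -- Product 12 points at category 99, which does not exist.
    INSERT INTO products VALUES (10, 1), (11, 2), (12, 99);
    """
)

# Child rows with a non-null foreign key but no matching parent row.
orphans = conn.execute(
    """
    SELECT p.product_id, p.product_category_id
    FROM products AS p
    LEFT OUTER JOIN categories AS c
        ON c.category_id = p.product_category_id
    WHERE p.product_category_id IS NOT NULL
      AND c.category_id IS NULL
    ORDER BY p.product_id
    """
).fetchall()
print(orphans)
```

The same query shape applies to each child and parent pair in the list of relationships.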

In [48]:
%%sql 
alter table orders
add foreign key(order_customer_id)
references customers(customer_id)

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [49]:
%%sql 
alter table order_items
add foreign key(order_item_order_id)
references orders(order_id)

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [50]:
%%sql 
alter table order_items
add foreign key(order_item_product_id)
references products(product_id)

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [69]:
%%sql
select * from information_schema.table_constraints where table_name='products' 

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
7 rows affected.


constraint_catalog,constraint_schema,constraint_name,table_catalog,table_schema,table_name,constraint_type,is_deferrable,initially_deferred,enforced
itv001477_retail_db,public,products_pkey,itv001477_retail_db,public,products,PRIMARY KEY,NO,NO,YES
itv001477_retail_db,public,2200_58750_1_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES
itv001477_retail_db,public,2200_58750_2_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES
itv001477_retail_db,public,2200_58750_3_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES
itv001477_retail_db,public,2200_58750_4_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES
itv001477_retail_db,public,2200_58750_5_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES
itv001477_retail_db,public,2200_58750_6_not_null,itv001477_retail_db,public,products,CHECK,NO,NO,YES


In [70]:
%%sql
alter table products 
 alter column product_category_id drop not null

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [71]:
%%sql
update products
    set product_category_id=NULL
    from products p
    left outer join categories c on c.category_id=p.product_category_id
    where c.category_id IS NULL

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
1345 rows affected.


[]
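
The `update` above reports 1345 rows affected, the same number as the maximum `product_id` found in Exercise 1, which suggests that every row in `products` was set to NULL rather than only the orphans. In PostgreSQL, a table named again in the `FROM` clause of an `UPDATE` is a separate instance of that table; with no condition tying `p` back to the row being updated, every row qualifies as soon as at least one orphan exists. The sketch below shows a correlated alternative. It assumes an in-memory SQLite stand-in with made-up rows, and the `NOT EXISTS` form is a substitute chosen for illustration, not the notebook's original statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (category_id INTEGER PRIMARY KEY);
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_category_id INTEGER
    );
    INSERT INTO categories VALUES (1), (2);
    -- Product 12 points at category 99, which does not exist.
    INSERT INTO products VALUES (10, 1), (11, 2), (12, 99);
    """
)

# The subquery refers to the row being updated, so only orphans change.
cursor = conn.execute(
    """
    UPDATE products
    SET product_category_id = NULL
    WHERE product_category_id IS NOT NULL
      AND NOT EXISTS (
          SELECT 1
          FROM categories AS c
          WHERE c.category_id = products.product_category_id
      )
    """
)
updated = cursor.rowcount

remaining = conn.execute(
    "SELECT product_id, product_category_id FROM products ORDER BY product_id"
).fetchall()
print(updated, remaining)
```

Comparing the reported row count with the number of orphans found by the validation query is a quick way to confirm that only the intended rows were changed.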

In [72]:
%%sql 
alter table products
add foreign key(product_category_id)
references categories(category_id)

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [73]:
%%sql
alter table categories 
 alter column category_department_id drop not null

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

In [75]:
%%sql
update categories
    set category_department_id=NULL
    from categories c
    left outer join departments d on d.department_id=c.category_department_id
    where d.department_id IS NULL

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
58 rows affected.


[]

In [76]:
%%sql 
alter table categories
add foreign key(category_department_id)
references departments(department_id)

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
Done.


[]

### Exercise 5

Queries to validate whether constraints are created or not. You can come up with queries against `information_schema` tables such as `columns`, `sequences` etc.

In [80]:
%%sql
select * from information_schema.sequences

 * postgresql://itv001477_retail_user:***@m01.itversity.com:5433/itv001477_retail_db
6 rows affected.


sequence_catalog,sequence_schema,sequence_name,data_type,numeric_precision,numeric_precision_radix,numeric_scale,start_value,minimum_value,maximum_value,increment,cycle_option
itv001477_retail_db,public,orders_order_id_seq,bigint,64,2,0,68883,1,9223372036854775807,1,NO
itv001477_retail_db,public,order_items_order_item_id_seq,bigint,64,2,0,172198,1,9223372036854775807,1,NO
itv001477_retail_db,public,categories_category_id_seq,bigint,64,2,0,58,1,9223372036854775807,1,NO
itv001477_retail_db,public,products_product_id_seq,bigint,64,2,0,1345,1,9223372036854775807,1,NO
itv001477_retail_db,public,customers_customer_id_seq,bigint,64,2,0,12435,1,9223372036854775807,1,NO
itv001477_retail_db,public,departments_department_id_seq,bigint,64,2,0,7,1,9223372036854775807,1,NO
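
The output above covers the sequences; the foreign keys can be checked in a similar way through `information_schema.table_constraints`. The sketch below is one possible approach. It assumes the constraints kept the names PostgreSQL assigns by default to an unnamed single-column foreign key (`TABLE_COLUMN_fkey`) and that the tables live in the `public` schema; the helper names are hypothetical.

```python
# Child (table, column) pairs from the relationships listed in Exercise 4.
FOREIGN_KEYS = [
    ("orders", "order_customer_id"),
    ("order_items", "order_item_order_id"),
    ("order_items", "order_item_product_id"),
    ("products", "product_category_id"),
    ("categories", "category_department_id"),
]

# Query to run against Postgres; it lists the foreign keys that exist.
FK_QUERY = """
SELECT table_name, constraint_name
FROM information_schema.table_constraints
WHERE constraint_type = 'FOREIGN KEY'
  AND table_schema = 'public'
ORDER BY constraint_name
"""


def default_fk_name(table, column):
    """Name given to an unnamed single-column foreign key (assumed default)."""
    return f"{table}_{column}_fkey"


expected = sorted(default_fk_name(t, c) for t, c in FOREIGN_KEYS)


def missing_constraints(found_names):
    """Return the expected constraint names absent from the query result."""
    found = set(found_names)
    return [name for name in expected if name not in found]


print(expected)
```

Passing the `constraint_name` column returned by `FK_QUERY` to `missing_constraints` should yield an empty list when all five foreign keys are in place.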
