## Exercises - Managing Database Objects

This exercise assesses your ability to put the important DDL concepts into practice by working out a solution to a typical data migration problem from one database (MySQL) to another (Postgres).
* Here are the high-level steps for migrating from one type of database to another.
  * Extract DDL Statements from source database (MySQL).
  * Extract the data in the form of delimited files and ship them to target database.
  * Refactor scripts as per target database (Postgres).
  * Create tables in the target database.
  * Execute pre-migration steps (disable constraints, drop indexes, etc.).
  * Load the data using native utilities.
  * Execute post-migration steps (enable constraints, create or rebuild indexes, reset sequences, etc.).
  * Run sanity checks with basic queries.
  * Make sure all the impacted applications are validated thoroughly.
* We have scripts and a data set available in our GitHub repository. If you are using our environment, the repository is already cloned under **/data/retail_db**.
* It has scripts to create tables with primary keys. Those scripts are generated from MySQL tables and refactored for Postgres.
  * Script to create tables: **create_db_tables_pg.sql**
  * Load data into tables: **load_db_tables_pg.sql**
* Here are the steps you need to perform to take care of this exercise.
  * Create tables
  * Load data
  * All the tables have surrogate primary keys. Here are the details.
    * orders.order_id
    * order_items.order_item_id
    * customers.customer_id
    * products.product_id
    * categories.category_id
    * departments.department_id
  * Get the maximum value from all surrogate primary key fields.
  * Create sequences for all surrogate primary key fields using maximum value. Make sure to use standard naming conventions for sequences.
  * Ensure sequences are mapped to the surrogate primary key fields.
  * Create foreign key constraints based on this information.
    * orders.order_customer_id to customers.customer_id
    * order_items.order_item_order_id to orders.order_id
    * order_items.order_item_product_id to products.product_id
    * products.product_category_id to categories.category_id
    * categories.category_department_id to departments.department_id
  * Insert a few records into `departments` to ensure that sequence-generated numbers are used for `department_id`.
* Here are the commands to launch `psql` and run scripts to create tables as well as load data into tables.

```sql
psql -U itversity_retail_user \
  -h pg.itversity.com \
  -p 5432 \
  -d itversity_retail_db \
  -W

\i /data/retail_db/create_db_tables_pg.sql

\i /data/retail_db/load_db_tables_pg.sql
```
* We use this approach of creating tables, loading data and then adding constraints as well as resetting sequences for large volume data migrations from one database to another database.
* Here are the commands or queries you need to come up with to solve this problem.

### Exercise 1

Queries to get maximum values from surrogate primary keys.
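The six lookups below can also be generated from a single mapping of table to surrogate key, which keeps the table list in one place. This is a minimal Python sketch that only builds the SQL text; `SURROGATE_KEYS` and `max_key_queries` are illustrative names, not part of the exercise material.

```python
# Surrogate primary keys as listed in the exercise description.
SURROGATE_KEYS = {
    "orders": "order_id",
    "order_items": "order_item_id",
    "customers": "customer_id",
    "products": "product_id",
    "categories": "category_id",
    "departments": "department_id",
}


def max_key_queries(keys):
    """Return one SELECT MAX(...) statement per table (hypothetical helper)."""
    return [
        f"SELECT MAX({column}) AS max_id FROM {table}"
        for table, column in keys.items()
    ]


# Print the statements so they can be pasted into a %sql cell or psql.
for query in max_key_queries(SURROGATE_KEYS):
    print(query)
```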

In [69]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [70]:
%env DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db

env: DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db


In [71]:
%sql SELECT current_database()

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


current_database
itv002461_retail_db


In [72]:
%%sql 
SELECT * FROM information_schema.sequences 

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
4 rows affected.


sequence_catalog,sequence_schema,sequence_name,data_type,numeric_precision,numeric_precision_radix,numeric_scale,start_value,minimum_value,maximum_value,increment,cycle_option
itv002461_retail_db,public,user_logins_user_login_id_seq,integer,32,2,0,1,1,2147483647,1,NO
itv002461_retail_db,public,users_user_id_seq,integer,32,2,0,1,1,2147483647,1,NO
itv002461_retail_db,public,users_part_user_id_seq,integer,32,2,0,1,1,2147483647,1,NO
itv002461_retail_db,public,products_product_id_seq,bigint,64,2,0,1345,1,9223372036854775807,1,NO


In [73]:
%sql SELECT MAX(orders.order_id) FROM orders

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
68883


In [75]:
%sql SELECT MAX(order_items.order_item_id) FROM order_items

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
172198


In [76]:
%sql SELECT MAX(customers.customer_id) FROM customers

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
12435


In [77]:
%sql SELECT MAX(products.product_id) FROM products

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
1345


In [78]:
%sql SELECT MAX(categories.category_id) FROM categories

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
58


In [79]:
%sql SELECT MAX(departments.department_id) FROM departments

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


max
7


### Exercise 2

Commands to add sequences with `START WITH` pointing to the maximum value of the corresponding surrogate primary key field. Give the sequences meaningful names of the form **TABLENAME_SURROGATEFIELD_seq** (example: `users_user_id_seq` for `users.user_id`).

#### Orders table

In [80]:
%sql DROP SEQUENCE IF EXISTS orders_order_id_seq

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [81]:
%%sql 
CREATE SEQUENCE orders_order_id_seq
START WITH 68883
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [82]:
%%sql 
SELECT nextval('orders_order_id_seq')

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


nextval
68883


In [83]:
%sql SELECT currval('orders_order_id_seq')

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


currval
68883


In [84]:
%sql SELECT nextval('orders_order_id_seq')

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


nextval
68884


In [85]:
%sql SELECT currval('orders_order_id_seq')

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
1 rows affected.


currval
68884
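The first `nextval` call above returned 68883, which equals the current `MAX(order_id)`, so a row inserted with that value would collide with the existing key. One way to avoid this is to start the sequence at the maximum plus one. The sketch below only builds the statement text; `create_sequence_sql` is a hypothetical helper, and the name pattern follows the TABLENAME_SURROGATEFIELD_seq convention from this exercise.

```python
def create_sequence_sql(table, column, max_value):
    """Build a CREATE SEQUENCE statement whose first value is max_value + 1."""
    # MAX(...) on an empty table is NULL (None in Python); start at 1 then.
    start = 1 if max_value is None else max_value + 1
    return (
        f"CREATE SEQUENCE {table}_{column}_seq "
        f"START WITH {start} MINVALUE 1"
    )


# Maximum values as reported by the MAX queries in this notebook.
print(create_sequence_sql("orders", "order_id", 68883))
print(create_sequence_sql("departments", "department_id", 7))
```

For a sequence that already exists, `SELECT setval('orders_order_id_seq', 68883)` has a similar effect: the next `nextval` call then returns 68884.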


#### Order Item Table

In [86]:
%sql DROP SEQUENCE IF EXISTS order_items_order_id_seq


 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [87]:
%%sql 
CREATE SEQUENCE order_items_order_id_seq
START WITH 172198
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

#### Customers Table

In [88]:
%sql DROP SEQUENCE IF EXISTS customers_customer_id_seq

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [89]:
%%sql
CREATE SEQUENCE customers_customer_id_seq
START WITH 12435
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

#### Product Table

In [90]:
%sql DROP SEQUENCE IF EXISTS products_product_id_seq

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [91]:
%%sql
CREATE SEQUENCE products_product_id_seq
START WITH 1345
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

#### Categories Table

In [92]:
%sql DROP SEQUENCE IF EXISTS categories_categories_id_seq

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [93]:
%%sql
CREATE SEQUENCE categories_categories_id_seq
START WITH 58
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

#### Departments

In [94]:
%sql DROP SEQUENCE IF EXISTS departments_department_id_seq

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [95]:
%%sql
CREATE SEQUENCE departments_department_id_seq
START WITH 7
MINVALUE 1

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

### Exercise 3

Commands to alter sequences to bind them to corresponding surrogate primary key fields.

In [96]:
%%sql

ALTER SEQUENCE orders_order_id_seq
    OWNED BY orders.order_id

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [97]:
%%sql

ALTER SEQUENCE order_items_order_id_seq
    OWNED BY order_items.order_item_id

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [98]:
%%sql

ALTER SEQUENCE customers_customer_id_seq
    OWNED BY customers.customer_id

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [99]:
%%sql
ALTER SEQUENCE categories_categories_id_seq
    OWNED BY categories.category_id

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [100]:
%%sql
ALTER SEQUENCE departments_department_id_seq
    OWNED BY departments.department_id

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]
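`OWNED BY` ties the lifetime of a sequence to its column (dropping the column drops the sequence), but it does not make the column draw values from the sequence. For that, the column also needs a default of `nextval(...)`. The following sketch builds both statements for each pair. The sequence names are the ones created in this notebook, `products` is included because it has no `OWNED BY` command above, and `bind_sequence_sql` is an illustrative helper.

```python
# (sequence, table, column) triples, using the sequence names created above.
BINDINGS = [
    ("orders_order_id_seq", "orders", "order_id"),
    ("order_items_order_id_seq", "order_items", "order_item_id"),
    ("customers_customer_id_seq", "customers", "customer_id"),
    ("products_product_id_seq", "products", "product_id"),
    ("categories_categories_id_seq", "categories", "category_id"),
    ("departments_department_id_seq", "departments", "department_id"),
]


def bind_sequence_sql(sequence, table, column):
    """Return the statements that attach a sequence to a column."""
    return [
        # Ownership: the sequence is dropped together with the column.
        f"ALTER SEQUENCE {sequence} OWNED BY {table}.{column}",
        # Default: inserts that omit the column take the next sequence value.
        f"ALTER TABLE {table} ALTER COLUMN {column} "
        f"SET DEFAULT nextval('{sequence}')",
    ]


for sequence, table, column in BINDINGS:
    for statement in bind_sequence_sql(sequence, table, column):
        print(statement + ";")
```

With the default in place, an insert such as `INSERT INTO departments (department_name) VALUES ('Toys')` would take its `department_id` from the sequence. The column name `department_name` is an assumption here; check the table definition before running it.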

### Exercise 4

Add Foreign Key constraints to the tables.
* Validate whether the tables have data violating foreign key constraints (Hint: You can use a left outer join to find rows in the child table that are not in the parent table).
* Alter tables to add foreign keys as specified.
* Here are the relationships for your reference.
  * orders.order_customer_id to customers.customer_id
  * order_items.order_item_order_id to orders.order_id
  * order_items.order_item_product_id to products.product_id
  * products.product_category_id to categories.category_id
  * categories.category_department_id to departments.department_id
  * There might be data in the child table that is not in the parent table. You should figure this out as part of the validation.
  * If there are foreign key violations, make sure to update the data in the child table's foreign key column to null values.
* Solution should contain the following:
  * Commands to add foreign keys to the tables.
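The validation and clean-up steps can be tried on toy data before touching the real tables. The sketch below uses Python's built-in `sqlite3` module with made-up rows (not the retail data and not Postgres) to show the left outer join that finds child rows without a parent, followed by the update that sets those foreign key values to NULL.

```python
import sqlite3

# In-memory toy database; table and column names mirror the exercise,
# but the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (category_id INTEGER PRIMARY KEY);
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_category_id INTEGER
    );
    INSERT INTO categories VALUES (1), (2);
    -- Product 12 points at category 3, which does not exist.
    INSERT INTO products VALUES (10, 1), (11, 2), (12, 3);
""")

# Child rows whose foreign key value has no matching parent row.
ORPHANS = """
    SELECT p.product_id, p.product_category_id
    FROM products p
    LEFT OUTER JOIN categories c
        ON p.product_category_id = c.category_id
    WHERE p.product_category_id IS NOT NULL
      AND c.category_id IS NULL
"""

orphans_before = conn.execute(ORPHANS).fetchall()
print(orphans_before)

# Null out foreign key values that have no matching parent row.
conn.execute("""
    UPDATE products
    SET product_category_id = NULL
    WHERE product_category_id NOT IN (SELECT category_id FROM categories)
""")

orphans_after = conn.execute(ORPHANS).fetchall()
print(orphans_after)
```

The extra `IS NOT NULL` condition keeps rows whose foreign key is already NULL out of the report, since those do not violate the constraint.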

#### Order and Customer

In [101]:
%%sql

select o.order_customer_id, c.customer_id from
orders o left join customers c
on o.order_customer_id =c.customer_id
where c.customer_id IS NULL

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
0 rows affected.


order_customer_id,customer_id


In [102]:
%%sql
ALTER TABLE orders
ADD FOREIGN KEY (order_customer_id)
REFERENCES customers(customer_id)

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [103]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'orders'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
6 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,orders,PRIMARY KEY,orders_pkey
itv002461_retail_db,orders,FOREIGN KEY,orders_order_customer_id_fkey
itv002461_retail_db,orders,CHECK,2200_153113_1_not_null
itv002461_retail_db,orders,CHECK,2200_153113_2_not_null
itv002461_retail_db,orders,CHECK,2200_153113_3_not_null
itv002461_retail_db,orders,CHECK,2200_153113_4_not_null


#### Order Item and Orders

In [105]:
%%sql
select oi.order_item_order_id,o.order_id
from order_items oi left join orders o 
on oi.order_item_order_id=o.order_id
where o.order_id is null

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
0 rows affected.


order_item_order_id,order_id


In [106]:
%%sql
ALTER TABLE order_items
ADD FOREIGN KEY(order_item_order_id)
REFERENCES orders(order_id)

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [107]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'order_items'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
8 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,order_items,PRIMARY KEY,order_items_pkey
itv002461_retail_db,order_items,FOREIGN KEY,order_items_order_item_order_id_fkey
itv002461_retail_db,order_items,CHECK,2200_153118_1_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_2_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_3_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_4_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_5_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_6_not_null


#### Order Item and Products

In [108]:
%%sql
select oi.order_item_product_id, p.product_id
from order_items oi left join products p 
on oi.order_item_product_id=p.product_id
where p.product_id is null

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
0 rows affected.


order_item_product_id,product_id


In [109]:
%%sql
ALTER TABLE order_items
ADD FOREIGN KEY(order_item_product_id)
REFERENCES products(product_id)

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [110]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'order_items'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
9 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,order_items,PRIMARY KEY,order_items_pkey
itv002461_retail_db,order_items,FOREIGN KEY,order_items_order_item_order_id_fkey
itv002461_retail_db,order_items,FOREIGN KEY,order_items_order_item_product_id_fkey
itv002461_retail_db,order_items,CHECK,2200_153118_1_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_2_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_3_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_4_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_5_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_6_not_null


#### Products and category

In [111]:
%%sql

select p.product_category_id, c.category_id 
from products p left join categories c
on p.product_category_id=c.category_id
where c.category_id IS NULL

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
24 rows affected.


product_category_id,category_id
59,
59,
59,
59,
59,
59,
59,
59,
59,
59,


In [112]:
%%sql
ALTER TABLE products
ALTER COLUMN product_category_id DROP NOT NULL

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [113]:
%%sql
UPDATE products
SET product_category_id=null
where product_category_id=59

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
24 rows affected.


[]

In [114]:
%%sql
ALTER TABLE products
ADD FOREIGN KEY(product_category_id)
REFERENCES categories(category_id)

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [115]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'products'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
7 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,products,PRIMARY KEY,products_pkey
itv002461_retail_db,products,FOREIGN KEY,products_product_category_id_fkey
itv002461_retail_db,products,CHECK,2200_153097_1_not_null
itv002461_retail_db,products,CHECK,2200_153097_3_not_null
itv002461_retail_db,products,CHECK,2200_153097_4_not_null
itv002461_retail_db,products,CHECK,2200_153097_5_not_null
itv002461_retail_db,products,CHECK,2200_153097_6_not_null


#### Category and Department

In [116]:
%%sql
select c.category_department_id, d.department_id 
from categories c left join departments d
on c.category_department_id=d.department_id
where d.department_id IS NULL

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


category_department_id,department_id
8,
8,
8,
8,
8,
8,
8,
8,
8,
8,


In [117]:
%%sql
ALTER TABLE categories
ALTER COLUMN category_department_id DROP NOT NULL

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]

In [118]:
%%sql
UPDATE categories
SET category_department_id=null
where category_department_id=8

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


[]

In [119]:
%%sql
ALTER TABLE categories
ADD FOREIGN KEY(category_department_id)
REFERENCES departments(department_id)

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
Done.


[]
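The updates above hard-code the offending values (59 and 8). A more general form nulls out any child value with no matching parent and then adds the constraint. This sketch generates both statements for each relationship listed in the exercise; `foreign_key_sql` is a hypothetical helper, and the explicit constraint name simply mirrors the `<table>_<column>_fkey` names that appear in the outputs above.

```python
# (child table, child column, parent table, parent column)
RELATIONSHIPS = [
    ("orders", "order_customer_id", "customers", "customer_id"),
    ("order_items", "order_item_order_id", "orders", "order_id"),
    ("order_items", "order_item_product_id", "products", "product_id"),
    ("products", "product_category_id", "categories", "category_id"),
    ("categories", "category_department_id", "departments", "department_id"),
]


def foreign_key_sql(child, child_col, parent, parent_col):
    """Return (cleanup, constraint) statements for one relationship."""
    # Set child values to NULL when no parent row carries that key.
    cleanup = (
        f"UPDATE {child} SET {child_col} = NULL "
        f"WHERE {child_col} IS NOT NULL AND NOT EXISTS ("
        f"SELECT 1 FROM {parent} "
        f"WHERE {parent}.{parent_col} = {child}.{child_col})"
    )
    # Add the foreign key under an explicit, predictable name.
    constraint = (
        f"ALTER TABLE {child} "
        f"ADD CONSTRAINT {child}_{child_col}_fkey "
        f"FOREIGN KEY ({child_col}) "
        f"REFERENCES {parent} ({parent_col})"
    )
    return cleanup, constraint


for relationship in RELATIONSHIPS:
    for statement in foreign_key_sql(*relationship):
        print(statement + ";")
```

As in the cells above, the clean-up statement only succeeds if the child column allows NULL, so `ALTER COLUMN ... DROP NOT NULL` may need to run first.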

### Exercise 5

Queries to validate whether constraints are created or not. You can come up with queries against `information_schema` tables such as `columns`, `sequences` etc.

In [120]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'orders'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
6 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,orders,PRIMARY KEY,orders_pkey
itv002461_retail_db,orders,FOREIGN KEY,orders_order_customer_id_fkey
itv002461_retail_db,orders,CHECK,2200_153113_1_not_null
itv002461_retail_db,orders,CHECK,2200_153113_2_not_null
itv002461_retail_db,orders,CHECK,2200_153113_3_not_null
itv002461_retail_db,orders,CHECK,2200_153113_4_not_null


In [121]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'order_items'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
9 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,order_items,PRIMARY KEY,order_items_pkey
itv002461_retail_db,order_items,FOREIGN KEY,order_items_order_item_order_id_fkey
itv002461_retail_db,order_items,FOREIGN KEY,order_items_order_item_product_id_fkey
itv002461_retail_db,order_items,CHECK,2200_153118_1_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_2_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_3_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_4_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_5_not_null
itv002461_retail_db,order_items,CHECK,2200_153118_6_not_null


In [122]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'customers'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,customers,PRIMARY KEY,customers_pkey
itv002461_retail_db,customers,CHECK,2200_153105_1_not_null
itv002461_retail_db,customers,CHECK,2200_153105_2_not_null
itv002461_retail_db,customers,CHECK,2200_153105_3_not_null
itv002461_retail_db,customers,CHECK,2200_153105_4_not_null
itv002461_retail_db,customers,CHECK,2200_153105_5_not_null
itv002461_retail_db,customers,CHECK,2200_153105_6_not_null
itv002461_retail_db,customers,CHECK,2200_153105_7_not_null
itv002461_retail_db,customers,CHECK,2200_153105_8_not_null
itv002461_retail_db,customers,CHECK,2200_153105_9_not_null


In [123]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'categories'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
4 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,categories,PRIMARY KEY,categories_pkey
itv002461_retail_db,categories,FOREIGN KEY,categories_category_department_id_fkey
itv002461_retail_db,categories,CHECK,2200_153092_1_not_null
itv002461_retail_db,categories,CHECK,2200_153092_3_not_null


In [124]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'departments'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
3 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,departments,PRIMARY KEY,departments_pkey
itv002461_retail_db,departments,CHECK,2200_153087_1_not_null
itv002461_retail_db,departments,CHECK,2200_153087_2_not_null


In [125]:
%%sql
SELECT table_catalog,
table_name,
constraint_type,
constraint_name
FROM information_schema.table_constraints  
WHERE table_name = 'products'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
7 rows affected.


table_catalog,table_name,constraint_type,constraint_name
itv002461_retail_db,products,PRIMARY KEY,products_pkey
itv002461_retail_db,products,FOREIGN KEY,products_product_category_id_fkey
itv002461_retail_db,products,CHECK,2200_153097_1_not_null
itv002461_retail_db,products,CHECK,2200_153097_3_not_null
itv002461_retail_db,products,CHECK,2200_153097_4_not_null
itv002461_retail_db,products,CHECK,2200_153097_5_not_null
itv002461_retail_db,products,CHECK,2200_153097_6_not_null
