# Partitioning Tables and Indexes

In this section, we will primarily talk about partitioning tables as well as indexes.

* Overview of Partitioning
* List Partitioning
* Managing Partitions - List
* Manipulating Data
* Range Partitioning
* Managing Partitions - Range
* Repartitioning - Range
* Hash Partitioning
* Managing Partitions - Hash
* Usage Scenarios
* Sub Partitioning
* Exercise - Partitioning Tables

Here are the key objectives of this section.
* Understand the different partitioning strategies.
* Learn how to create and manage partitioned tables.
* Learn how to manipulate data by inserting, updating and deleting data in partitioned tables.
* Learn how to repartition tables when the partitioning strategy changes (example: from yearly to monthly).
* Learn about sub partitioning (also called nested or multi level partitioning) with examples.
* Self evaluate your understanding of the key skills related to partitioned tables using the exercises.

## Overview of Partitioning

Most modern database technologies support a wide variety of partitioning strategies. However, here are the most commonly used ones.

In [14]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/cwL9iAaKHSw?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* List Partitioning
* Range Partitioning
* Hash Partitioning
* List and Range are more widely used compared to Hash Partitioning.
* We can also mix and match these to have multi level partitioning. It is known as sub partitioning.
* We can either partition a table without a primary key, or partition a table with a primary key as long as the partition column is a prime attribute (one of the primary key columns).
* Indexes can be added to the partitioned table. In PostgreSQL 11 and later, an index created on the partitioned table is automatically created on each of its partitions. An index can also be created on an individual partition, in which case it applies to that partition only.
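
To make the three strategies concrete, here is a minimal Python sketch of how a partition key could be mapped to a partition under each strategy. The partition names and the use of `zlib.crc32` as the hash are assumptions for illustration only; PostgreSQL uses its own internal hash functions and routing logic.

```python
from datetime import date
from zlib import crc32

# Hypothetical partition layouts, for illustration only.
LIST_PARTITIONS = {'U': 'users_part_u', 'A': 'users_part_a'}
RANGE_PARTITIONS = [
    # (lower bound inclusive, upper bound exclusive, partition name)
    (date(2016, 1, 1), date(2017, 1, 1), 'users_range_part_2016'),
    (date(2017, 1, 1), date(2018, 1, 1), 'users_range_part_2017'),
]
HASH_MODULUS = 4


def route_list(value, default='users_part_default'):
    """List: an exact match on the key picks the partition."""
    return LIST_PARTITIONS.get(value, default)


def route_range(value, default='users_range_part_default'):
    """Range: the key must fall in a half-open interval [lower, upper)."""
    for lower, upper, name in RANGE_PARTITIONS:
        if lower <= value < upper:
            return name
    return default


def route_hash(value):
    """Hash: the remainder of hash(key) divided by the modulus picks the partition."""
    remainder = crc32(str(value).encode()) % HASH_MODULUS
    return f'users_hash_part_{remainder}'


print(route_list('A'))
print(route_list('X'))
print(route_range(date(2016, 12, 31)))
print(route_range(date(2020, 5, 1)))
print(route_hash(42))
```

Keys that match no list value or range fall through to the default partition in this sketch, while the hash strategy always lands on one of the remainder partitions.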

## List Partitioning

Let us understand how to partition tables using the list partitioning strategy.

In [15]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/8vgJD-l6bzk?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* List partitioning is primarily used to create partitions based on specific values of the partition column.
* Here are the steps involved in creating a table using the list partitioning strategy.
  * Create table using `PARTITION BY LIST`
  * Add default and value specific partitions
  * Validate by inserting data into the table
* We can detach as well as drop the partitions from the table.
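
As a rough sketch of the second step, the helper below builds the DDL text for a default partition and for value specific partitions. The function name `list_partition_ddl` and the `table_suffix` naming scheme are hypothetical; the code only formats strings and does not connect to a database.

```python
def list_partition_ddl(table_name, suffix, values=None):
    """Build CREATE TABLE ... PARTITION OF DDL for a list partition.

    With values=None a DEFAULT partition is generated instead.
    """
    partition_name = f'{table_name}_{suffix}'
    if values is None:
        bound = 'DEFAULT'
    else:
        # Values are quoted naively; acceptable for trusted role codes,
        # not for arbitrary user input.
        quoted = ', '.join(f"'{value}'" for value in values)
        bound = f'FOR VALUES IN ({quoted})'
    return f'CREATE TABLE {partition_name}\nPARTITION OF {table_name} {bound}'


for suffix, values in [('default', None), ('u', ['U']), ('a', ['A'])]:
    print(list_partition_ddl('users_part', suffix, values))
    print()
```

Each generated statement could then be run through `%%sql` or a database cursor once the partitioned table exists.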


### Create Partitioned Table

Let us create a partitioned table named `users_part`.
* It contains the same columns as `users`.
* We will partition it based on the `user_role` field.

In [16]:
%load_ext sql

In [17]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [18]:
%sql DROP TABLE IF EXISTS users

Done.


[]

In [19]:
%%sql

CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [20]:
%sql DROP TABLE IF EXISTS users_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [21]:
%%sql

CREATE TABLE users_part (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_role, user_id)
) PARTITION BY LIST(user_role)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
We can also create additional indexes on the `users_part` table.
```

In [22]:
%%sql

CREATE INDEX users_part_email_id_idx 
    ON users_part(user_email_id)


 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{error}
The `INSERT` statement below fails because we have not yet added any partitions to the table `users_part`, even though it is created as a partitioned table.
```

In [23]:
%%sql

INSERT INTO users_part (user_first_name, user_last_name, user_email_id)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com'),
    ('Donald', 'Duck', 'donald@duck.com'),
    ('Mickey', 'Mouse', 'mickey@mouse.com')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db


IntegrityError: (psycopg2.errors.CheckViolation) no partition of relation "users_part" found for row
DETAIL:  Partition key of the failing row contains (user_role) = (U).

[SQL: INSERT INTO users_part (user_first_name, user_last_name, user_email_id)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com'),
    ('Donald', 'Duck', 'donald@duck.com'),
    ('Mickey', 'Mouse', 'mickey@mouse.com')]
(Background on this error at: http://sqlalche.me/e/13/gkpj)

## Managing Partitions - List

Let us understand how to manage partitions for a partitioned table using `users_part`.

In [24]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/Tg2HxR0uDn4?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* All users data with `user_role` as **'U'** should go to one partition named `users_part_u`.
* All users data with `user_role` as **'A'** should go to one partition named `users_part_a`.
* We can add a partition to an existing partitioned table using `CREATE TABLE partition_name PARTITION OF table_name`.
* We can have a default partition so that all the data that does not satisfy any other partition condition is added to it.
* We can have a partition for each value or for a set of values.
  * We can have one partition for both `U` and `A`, and a default partition for all other values.
  * We can have individual partitions for `U` and `A` respectively, and a default partition for all other values.
  * We can use `FOR VALUES IN (val1, val2)` as part of `CREATE TABLE partition_name PARTITION OF table_name` to specify the values for the table created for the partition.
* Once partitions are added, we can insert data into the partitioned table.
* We can detach a partition using `ALTER TABLE ... DETACH PARTITION` and then drop it, or drop the partition directly. To drop the partition we need to use the `DROP TABLE` command.
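
The interaction between the default partition and value specific partitions can be pictured with a toy in-memory model. The class `ListPartitionedTable` below is hypothetical and is not how PostgreSQL is implemented; it only mimics the behavior demonstrated in this section, including the rejection of a new partition whose values already sit in the default partition.

```python
class ListPartitionedTable:
    """Toy in-memory model of a list partitioned table (not PostgreSQL code)."""

    def __init__(self):
        self.partitions = {}    # partition name -> set of accepted values
        self.rows = {}          # partition name -> list of (key, payload)
        self.default = None     # name of the default partition, if any

    def add_partition(self, name, values=None):
        if values is None:
            self.default = name
        else:
            # A new partition cannot claim values that are already
            # stored in the default partition.
            if self.default and any(k in values for k, _ in self.rows[self.default]):
                raise ValueError(
                    f'default partition {self.default} holds rows for {values}'
                )
            self.partitions[name] = set(values)
        self.rows[name] = []

    def insert(self, key, payload):
        for name, values in self.partitions.items():
            if key in values:
                self.rows[name].append((key, payload))
                return name
        if self.default is None:
            raise ValueError(f'no partition found for key {key!r}')
        self.rows[self.default].append((key, payload))
        return self.default


table = ListPartitionedTable()
table.add_partition('users_part_default')
print(table.insert('U', 'scott@tiger.com'))
table.add_partition('users_part_a', ['A'])
print(table.insert('A', 'matt@clarke.com'))
try:
    table.add_partition('users_part_u', ['U'])
except ValueError as error:
    print(error)
```

In this model, as in the cells that follow, the row with role `U` lands in the default partition first, which is why adding a dedicated partition for `U` afterwards is refused until the default partition is emptied or detached.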

```{note}
Here is how we can create a default partition for a list partitioned table **users_part**.
```

In [25]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [26]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [27]:
%%sql

CREATE TABLE users_part_default
PARTITION OF users_part DEFAULT

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
All 3 records will go to the default partition as we have not defined any partition for user_role 'U'.
```

In [28]:
%%sql

INSERT INTO users_part (user_first_name, user_last_name, user_email_id, user_role)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com', 'U'),
    ('Donald', 'Duck', 'donald@duck.com', 'U'),
    ('Mickey', 'Mouse', 'mickey@mouse.com', 'U')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


[]

In [29]:
%%sql

SELECT * FROM users_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Scott,Tiger,scott@tiger.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594
3,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594
4,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594


In [30]:
%%sql

CREATE TABLE users_part_a 
PARTITION OF users_part  
FOR VALUES IN ('A')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [31]:
%%sql

UPDATE users_part
SET
    user_role = 'A'
WHERE user_email_id = 'scott@tiger.com'

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [32]:
%%sql

SELECT * FROM users_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Scott,Tiger,scott@tiger.com,False,,A,False,2020-11-24,2020-11-24 12:11:46.894594
3,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594
4,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594


In [33]:
%%sql

SELECT * FROM users_part_a

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Scott,Tiger,scott@tiger.com,False,,A,False,2020-11-24,2020-11-24 12:11:46.894594


In [34]:
%%sql

SELECT * FROM users_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
3,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594
4,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594


```{error}
This will fail as there are records with user_role 'U' in the default partition.
```

In [35]:
%%sql

CREATE TABLE users_part_u 
PARTITION OF users_part  
FOR VALUES IN ('U')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db


IntegrityError: (psycopg2.errors.CheckViolation) updated partition constraint for default partition "users_part_default" would be violated by some row

[SQL: CREATE TABLE users_part_u PARTITION OF users_part  
FOR VALUES IN ('U')]
(Background on this error at: http://sqlalche.me/e/13/gkpj)

```{note}
We can detach the default partition, add a partition for 'U', and load the data from the detached partition into the newly created partition.
```

In [36]:
%%sql

ALTER TABLE users_part
    DETACH PARTITION users_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [37]:
%%sql

CREATE TABLE users_part_u 
PARTITION OF users_part  
FOR VALUES IN ('U')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [38]:
%%sql

INSERT INTO users_part
SELECT * FROM users_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


[]

In [39]:
%%sql

SELECT * FROM users_part_a

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Scott,Tiger,scott@tiger.com,False,,A,False,2020-11-24,2020-11-24 12:11:46.894594


In [40]:
%%sql

SELECT * FROM users_part_u

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
3,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594
4,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:11:46.894594


```{note}
We can either drop and recreate the default partition, or truncate the detached default partition and attach it again.
```

In [41]:
%%sql

DROP TABLE users_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [42]:
%%sql

CREATE TABLE users_part_default
PARTITION OF users_part DEFAULT

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

## Manipulating Data

Let us understand how we can manipulate data for a partitioned table.

In [43]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/9GHGi-3_cf0?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* We can insert data using the partitioned table (the preferred way).
* As we define a table for each partition, we can also insert data using the table created for a specific partition.
* In the case of the `users_part` partitioned table, we can either use the table name `users_part` or the partition name `users_part_u` to insert records with user_role **'U'**.

```sql
CREATE TABLE users_part_u 
PARTITION OF users_part  
FOR VALUES IN ('U')
```

* When an update changes the value of the partition column so that the row belongs to a different partition, the row is internally moved from one partition to the other.
* We can delete the data using the partitioned table `users_part` or the table created for each partition (such as `users_part_u` or `users_part_a`).

```{note}
DML is the same irrespective of the partitioning strategy. This applies to all 3 partitioning strategies - **list**, **range** as well as **hash**.
```
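
The row movement caused by an update of the partition column can be sketched with plain Python data structures. This is only a toy model with hypothetical names such as `update_role`; it is not how PostgreSQL stores or moves rows.

```python
# Toy model: each partition is a list of row dicts keyed by partition name.
partitions = {
    'users_part_u': [
        {'user_email_id': 'scott@tiger.com', 'user_role': 'U'},
        {'user_email_id': 'donald@duck.com', 'user_role': 'U'},
    ],
    'users_part_a': [],
}
role_to_partition = {'U': 'users_part_u', 'A': 'users_part_a'}


def update_role(email, new_role):
    """Change user_role and relocate the row if its partition changes."""
    for name, rows in partitions.items():
        for row in rows:
            if row['user_email_id'] == email:
                target = role_to_partition[new_role]
                row['user_role'] = new_role
                if target != name:
                    # Comparable to a delete from the old partition
                    # followed by an insert into the new one.
                    rows.remove(row)
                    partitions[target].append(row)
                return target
    return None


print(update_role('donald@duck.com', 'A'))
print([row['user_email_id'] for row in partitions['users_part_u']])
print([row['user_email_id'] for row in partitions['users_part_a']])
```

The caller only issues one update against the table as a whole; which partition ends up holding the row is decided by the new value of the partition column.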

In [44]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [45]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [46]:
%%sql

TRUNCATE TABLE users_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [47]:
%%sql

INSERT INTO users_part (user_first_name, user_last_name, user_email_id, user_role)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com', 'U'),
    ('Donald', 'Duck', 'donald@duck.com', 'U'),
    ('Mickey', 'Mouse', 'mickey@mouse.com', 'U')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


[]

In [48]:
%%sql

SELECT * FROM users_part_u

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
5,Scott,Tiger,scott@tiger.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850
6,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850
7,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850


In [49]:
%%sql

INSERT INTO users_part_a (user_first_name, user_last_name, user_email_id, user_role)
VALUES
    ('Matt', 'Clarke', 'matt@clarke.com', 'A')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [50]:
%%sql

SELECT * FROM users_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
4 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
8,Matt,Clarke,matt@clarke.com,False,,A,False,2020-11-24,2020-11-24 12:12:09.284614
5,Scott,Tiger,scott@tiger.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850
6,Donald,Duck,donald@duck.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850
7,Mickey,Mouse,mickey@mouse.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850


In [51]:
%%sql

UPDATE users_part SET
    user_role = 'A'
WHERE user_email_id = 'donald@duck.com'

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [52]:
%%sql

SELECT * FROM users_part_a

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
8,Matt,Clarke,matt@clarke.com,False,,A,False,2020-11-24,2020-11-24 12:12:09.284614
6,Donald,Duck,donald@duck.com,False,,A,False,2020-11-24,2020-11-24 12:12:08.505850


In [53]:
%%sql

DELETE FROM users_part WHERE user_email_id = 'donald@duck.com'

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [54]:
%%sql

DELETE FROM users_part_u WHERE user_email_id = 'mickey@mouse.com'

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [55]:
%%sql

SELECT * FROM users_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
8,Matt,Clarke,matt@clarke.com,False,,A,False,2020-11-24,2020-11-24 12:12:09.284614
5,Scott,Tiger,scott@tiger.com,False,,U,False,2020-11-24,2020-11-24 12:12:08.505850


## Range Partitioning

Let us understand how to partition tables using the range partitioning strategy.

In [56]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/bvEiDEwItVA?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* Range partitioning is primarily used to create partitions based on a given range of values.
* Here are the steps involved in creating a table using the range partitioning strategy.
  * Create table using `PARTITION BY RANGE`
  * Add default and range specific partitions
  * Validate by inserting data into the table
* We can detach as well as drop the partitions from the table.


### Create Partitioned Table

Let us create a partitioned table named `users_range_part`.
* It contains the same columns as `users`.
* We will partition the table based on the `created_dt` field.
* We will create one partition per year with the naming convention **users_range_part_yyyy** (for example, users_range_part_2016).

In [57]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [58]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [59]:
%sql DROP TABLE IF EXISTS users_range_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [60]:
%%sql

CREATE TABLE users_range_part (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (created_dt, user_id)
) PARTITION BY RANGE(created_dt)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
We will not be able to insert the data until we add at least one partition.
```

## Managing Partitions - Range

Let us understand how to manage partitions for the table `users_range_part`.

In [61]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/RhQGZF3QEQ8?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* All users data created in a specific year should go to the respective partition created.
* For example, all users data created in the year of 2016 should go to `users_range_part_2016`.
* We can add a partition to an existing partitioned table using `CREATE TABLE partition_name PARTITION OF table_name`.
* We can have a default partition so that all the data that does not satisfy any other partition condition is added to it.
* We can have a partition for a specific range of values using `FOR VALUES FROM (from_value) TO (to_value)` as part of `CREATE TABLE partition_name PARTITION OF table_name`. The `from_value` is inclusive and the `to_value` is exclusive.
* Once partitions are added, we can insert data into the partitioned table.
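
Because the `TO` bound of a range partition is exclusive in PostgreSQL, a yearly partition that should cover December 31st needs the first day of the following year as its upper bound. The helper below is a minimal sketch that formats such DDL; the function name `yearly_partition_ddl` is hypothetical and the code does not connect to a database.

```python
from datetime import date


def yearly_partition_ddl(table_name, year):
    """Build DDL for one yearly range partition.

    The FROM bound is inclusive and the TO bound is exclusive, so the
    upper bound is January 1st of the following year.
    """
    lower = date(year, 1, 1)
    upper = date(year + 1, 1, 1)
    return (
        f'CREATE TABLE {table_name}_{year}\n'
        f'PARTITION OF {table_name}\n'
        f"FOR VALUES FROM ('{lower}') TO ('{upper}')"
    )


for year in range(2016, 2021):
    print(yearly_partition_ddl('users_range_part', year))
    print()
```

Adjacent years share a boundary date in this scheme without overlapping, since the shared date belongs only to the later partition.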

```{note}
Here is how we can create a default partition for a range partitioned table **users_range_part**.
```

In [62]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [63]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [64]:
%%sql

CREATE TABLE users_range_part_default
PARTITION OF users_range_part DEFAULT

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [65]:
%%sql

CREATE TABLE users_range_part_2016
PARTITION OF users_range_part
-- The TO bound is exclusive, so use the first day of the next year.
FOR VALUES FROM ('2016-01-01') TO ('2017-01-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{error}
As there is an overlap between the previous partition and the one below, the command to create a partition for data ranging from 2016-01-01 till 2017-12-31 will fail.
```

In [66]:
%%sql

CREATE TABLE users_range_part_2017
PARTITION OF users_range_part
FOR VALUES FROM ('2016-01-01') TO ('2017-12-31')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
(psycopg2.errors.InvalidObjectDefinition) partition "users_range_part_2017" would overlap partition "users_range_part_2016"

[SQL: CREATE TABLE users_range_part_2017 PARTITION OF users_range_part
FOR VALUES FROM ('2016-01-01') TO ('2017-12-31')]
(Background on this error at: http://sqlalche.me/e/13/f405)
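
The overlap rule can be pictured as a check on half-open ranges. The sketch below is an illustration in plain Python, not PostgreSQL's implementation. It treats each partition as `[lower, upper)` because the `TO` bound is exclusive, and the bounds used are illustrative.

```python
from datetime import date


def ranges_overlap(first, second):
    """Return True if two half-open ranges [lower, upper) share any value."""
    first_lower, first_upper = first
    second_lower, second_upper = second
    return first_lower < second_upper and second_lower < first_upper


part_2016 = (date(2016, 1, 1), date(2017, 1, 1))
rejected_2017 = (date(2016, 1, 1), date(2017, 12, 31))
accepted_2017 = (date(2017, 1, 1), date(2018, 1, 1))

print(ranges_overlap(part_2016, rejected_2017))
print(ranges_overlap(part_2016, accepted_2017))
```

The second pair shares the boundary date 2017-01-01 yet does not overlap, because that date is excluded from the first range and included in the second.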


```{note}
This is how we can create partitions for the years **2017**, **2018**, **2019**, etc.
```

In [67]:
%%sql

CREATE TABLE users_range_part_2017
PARTITION OF users_range_part
FOR VALUES FROM ('2017-01-01') TO ('2018-01-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [68]:
%%sql

CREATE TABLE users_range_part_2018
PARTITION OF users_range_part
FOR VALUES FROM ('2018-01-01') TO ('2019-01-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [69]:
%%sql

CREATE TABLE users_range_part_2019
PARTITION OF users_range_part
FOR VALUES FROM ('2019-01-01') TO ('2020-01-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [70]:
%%sql

CREATE TABLE users_range_part_2020
PARTITION OF users_range_part
FOR VALUES FROM ('2020-01-01') TO ('2021-01-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [71]:
%%sql

INSERT INTO users_range_part 
    (user_first_name, user_last_name, user_email_id, created_dt)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com', '2018-10-01'),
    ('Donald', 'Duck', 'donald@duck.com', '2019-02-10'),
    ('Mickey', 'Mouse', 'mickey@mouse.com', '2017-06-22')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


[]

In [72]:
%%sql

SELECT user_first_name, user_last_name, user_email_id, created_dt
FROM users_range_part_default

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_first_name,user_last_name,user_email_id,created_dt


In [73]:
%%sql

SELECT user_first_name, user_last_name, user_email_id, created_dt
FROM users_range_part_2017

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_first_name,user_last_name,user_email_id,created_dt
Mickey,Mouse,mickey@mouse.com,2017-06-22


In [74]:
%%sql

SELECT user_first_name, user_last_name, user_email_id, created_dt
FROM users_range_part_2018

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_first_name,user_last_name,user_email_id,created_dt
Scott,Tiger,scott@tiger.com,2018-10-01


In [75]:
%%sql

SELECT user_first_name, user_last_name, user_email_id, created_dt
FROM users_range_part_2019

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_first_name,user_last_name,user_email_id,created_dt
Donald,Duck,donald@duck.com,2019-02-10


In [76]:
%%sql

SELECT user_first_name, user_last_name, user_email_id, created_dt
FROM users_range_part_2020

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_first_name,user_last_name,user_email_id,created_dt


## Repartitioning - Range

Let us understand how we can repartition the existing partitioned table.

In [77]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/QgVo5IRryCg?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* We will use **users_range_part** table. It is originally partitioned for each year.
* Now we would like to partition for each month.
* Here are the steps that are involved in repartitioning from year to month.
  * Detach all yearly partitions from **users_range_part**.
  * Add new partitions for each month.
  * Load data from detached partitions into the table with new partitions for each month.
  * Validate to ensure that all the data is copied.
  * Drop all the detached partitions.
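
The second step needs one pair of bounds per month. As a minimal sketch that relies only on the standard library, the generator below yields the partition suffix together with an inclusive lower bound and an exclusive upper bound (the first day of the next month). The name `month_bounds` is hypothetical.

```python
from datetime import date


def month_bounds(start_year, end_year):
    """Yield (yyyymm, lower, upper) for each month in the given years.

    lower is the first day of the month (inclusive) and upper is the
    first day of the next month (exclusive).
    """
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            lower = date(year, month, 1)
            # December rolls over to January of the following year.
            upper = date(year + month // 12, month % 12 + 1, 1)
            yield f'{year}{month:02d}', lower, upper


bounds = list(month_bounds(2016, 2020))
print(len(bounds))
for yyyymm, lower, upper in bounds[:3]:
    print(f'users_range_part_{yyyymm}: FROM {lower} TO {upper}')
```

Using the first day of the next month as the upper bound avoids having to know how many days each month has, including February in leap years.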

In [78]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [79]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


```{note}
Detach all yearly partitions
```

In [80]:
%%sql

ALTER TABLE users_range_part
    DETACH PARTITION users_range_part_2016

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [81]:
%%sql

ALTER TABLE users_range_part
    DETACH PARTITION users_range_part_2017

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [82]:
%%sql

ALTER TABLE users_range_part
    DETACH PARTITION users_range_part_2018

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [83]:
%%sql

ALTER TABLE users_range_part
    DETACH PARTITION users_range_part_2019

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [84]:
%%sql

ALTER TABLE users_range_part
    DETACH PARTITION users_range_part_2020

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
Add new partitions for every month between 2016 January and 2020 December.
```

In [85]:
!pip install psycopg2

Defaulting to user installation because normal site-packages is not writeable


In [86]:
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd

months = pd.date_range(start='1/1/2016', end='3/31/2016', freq='1M')

for month in months:
    begin_date = month - MonthBegin(1)
    end_date = month + MonthEnd(0)
    print(str(month)[:7].replace('-', ''), end=':')
    print(str(begin_date).split(' ')[0], end=':')
    print(str(end_date).split(' ')[0])

201601:2016-01-01:2016-01-31
201602:2016-02-01:2016-02-29
201603:2016-03-01:2016-03-31


In [87]:
import psycopg2

In [88]:
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd

months = pd.date_range(start='1/1/2016', end='12/31/2020', freq='1M')

connection = psycopg2.connect(
    host='localhost',
    port='5432',
    database='itversity_sms_db',
    user='itversity_sms_user',
    password='sms_password'
)
cursor = connection.cursor()
table_name = 'users_range_part'
query = '''
CREATE TABLE {table_name}_{yyyymm}
PARTITION OF {table_name}
FOR VALUES FROM ('{begin_date}') TO ('{upper_bound}')
'''
for month in months:
    begin_date = month - MonthBegin(1)
    end_date = month + MonthEnd(0)
    # The TO bound is exclusive, so use the first day of the next month
    # to make sure rows dated on the last day of the month are included.
    upper_bound = month + MonthBegin(1)
    print(f'Adding partition for {begin_date} and {end_date}')
    cursor.execute(
        query.format(
            table_name=table_name,
            yyyymm=str(month)[:7].replace('-', ''),
            begin_date=str(begin_date).split(' ')[0],
            upper_bound=str(upper_bound).split(' ')[0]
        ), ()
    )
connection.commit()
cursor.close()
connection.close()

Adding partition for 2016-01-01 00:00:00 and 2016-01-31 00:00:00
Adding partition for 2016-02-01 00:00:00 and 2016-02-29 00:00:00
Adding partition for 2016-03-01 00:00:00 and 2016-03-31 00:00:00
Adding partition for 2016-04-01 00:00:00 and 2016-04-30 00:00:00
Adding partition for 2016-05-01 00:00:00 and 2016-05-31 00:00:00
Adding partition for 2016-06-01 00:00:00 and 2016-06-30 00:00:00
Adding partition for 2016-07-01 00:00:00 and 2016-07-31 00:00:00
Adding partition for 2016-08-01 00:00:00 and 2016-08-31 00:00:00
Adding partition for 2016-09-01 00:00:00 and 2016-09-30 00:00:00
Adding partition for 2016-10-01 00:00:00 and 2016-10-31 00:00:00
Adding partition for 2016-11-01 00:00:00 and 2016-11-30 00:00:00
Adding partition for 2016-12-01 00:00:00 and 2016-12-31 00:00:00
Adding partition for 2017-01-01 00:00:00 and 2017-01-31 00:00:00
Adding partition for 2017-02-01 00:00:00 and 2017-02-28 00:00:00
Adding partition for 2017-03-01 00:00:00 and 2017-03-31 00:00:00
Adding partition for 2017

```{note}
Load data from detached yearly partitions into monthly partitioned table.
```

In [89]:
%%sql

INSERT INTO users_range_part
SELECT * FROM users_range_part_2016

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


[]

In [90]:
%%sql

INSERT INTO users_range_part
SELECT * FROM users_range_part_2017

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [91]:
%%sql

INSERT INTO users_range_part
SELECT * FROM users_range_part_2018

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [92]:
%%sql

INSERT INTO users_range_part
SELECT * FROM users_range_part_2019

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


[]

In [93]:
%%sql

INSERT INTO users_range_part
SELECT * FROM users_range_part_2020

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


[]

In [94]:
%%sql

SELECT * FROM users_range_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
3,Mickey,Mouse,mickey@mouse.com,False,,U,False,2017-06-22,2020-11-24 12:12:27.094936
1,Scott,Tiger,scott@tiger.com,False,,U,False,2018-10-01,2020-11-24 12:12:27.094936
2,Donald,Duck,donald@duck.com,False,,U,False,2019-02-10,2020-11-24 12:12:27.094936


In [95]:
%%sql

SELECT * FROM users_range_part_201706

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
3,Mickey,Mouse,mickey@mouse.com,False,,U,False,2017-06-22,2020-11-24 12:12:27.094936


In [96]:
%%sql

SELECT * FROM users_range_part_201810

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
1,Scott,Tiger,scott@tiger.com,False,,U,False,2018-10-01,2020-11-24 12:12:27.094936


In [97]:
%%sql

SELECT * FROM users_range_part_201902

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Donald,Duck,donald@duck.com,False,,U,False,2019-02-10,2020-11-24 12:12:27.094936


```{note}
As we are able to see the data in the monthly partitioned table, we can drop the tables that were created earlier using the yearly partitioning strategy.
```

In [98]:
%%sql

DROP TABLE users_range_part_2016

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [99]:
%%sql

DROP TABLE users_range_part_2017

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [100]:
%%sql

DROP TABLE users_range_part_2018

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [101]:
%%sql

DROP TABLE users_range_part_2019

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [102]:
%%sql

DROP TABLE users_range_part_2020

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [103]:
%%sql

SELECT table_catalog, 
    table_schema, 
    table_name FROM information_schema.tables
WHERE table_name ~ 'users_range_part_'
ORDER BY table_name

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
61 rows affected.


table_catalog,table_schema,table_name
itversity_sms_db,public,users_range_part_201601
itversity_sms_db,public,users_range_part_201602
itversity_sms_db,public,users_range_part_201603
itversity_sms_db,public,users_range_part_201604
itversity_sms_db,public,users_range_part_201605
itversity_sms_db,public,users_range_part_201606
itversity_sms_db,public,users_range_part_201607
itversity_sms_db,public,users_range_part_201608
itversity_sms_db,public,users_range_part_201609
itversity_sms_db,public,users_range_part_201610


## Hash Partitioning

Let us understand how to partition tables using the hash partitioning strategy.

In [104]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/f1zzOh7hU5E?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* Hash partitioning distributes rows across partitions based on a modulus and a remainder.
* Here are the steps involved in creating a table using the hash partitioning strategy.
  * Create the table using `PARTITION BY HASH`.
  * Add one partition for each remainder of the chosen modulus. A hash partitioned table cannot have a default partition.
  * Validate by inserting data into the table.
* We can detach as well as drop the partitions from the table.
* Hash partitioning is typically done on columns with many distinct values, such as `user_id`.
* If we want to use hash partitioning on more than one table with a common key, we typically partition all the tables using the same key.


### Create Partitioned Table

Let us create a partitioned table with the name `users_hash_part`.
* It contains the same columns as `users`.
* We will partition the table based on the `user_id` field.
* We will create one partition for each remainder with modulus 8.

In [105]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [106]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [107]:
%sql DROP TABLE IF EXISTS users_hash_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [108]:
%%sql

CREATE TABLE users_hash_part (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id)
) PARTITION BY HASH(user_id)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
We will not be able to insert the data until we add at least one partition.
```

## Managing Partitions - Hash

Let us understand how to manage partitions using table `users_hash_part` which is partitioned using **hash**.

In [109]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/RYTQmErsahs?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* We would like to divide our data into 8 hash buckets.
* While adding partitions for **hash partitioned table**, we need to specify modulus and remainder.
* For every record inserted, the following happens for the column specified as the partition key.
  * A hash is computed from the key value. The hash is an integer.
  * The integer is divided by the value specified as **modulus**.
  * Based on the remainder, the record is inserted into the corresponding partition.
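The routing steps above can be sketched in Python. This is only an illustration of the modulus and remainder arithmetic: `stand_in_hash` is a hypothetical substitute (CRC32) and not PostgreSQL's internal hash function, so the partition it picks for a given `user_id` will generally differ from the one the database picks.

```python
import zlib

MODULUS = 8  # number of hash buckets, matching the 8 partitions in this section


def stand_in_hash(key: int) -> int:
    """Stand-in hash (CRC32 of the decimal text). This is NOT PostgreSQL's hash."""
    return zlib.crc32(str(key).encode("utf-8"))


def target_remainder(key: int, modulus: int = MODULUS) -> int:
    """Return the remainder that decides which partition receives the key."""
    return stand_in_hash(key) % modulus


for user_id in (1, 2, 3):
    remainder = target_remainder(user_id)
    print(f"user_id={user_id} -> users_hash_part_{remainder}_of_{MODULUS}")
```

The same key always produces the same remainder, which is what lets the database find a row again without scanning every partition.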

In [110]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [111]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


```{error}
We cannot have a default partition for hash partitioned table.
```

In [112]:
%%sql

CREATE TABLE users_hash_part_default
PARTITION OF users_hash_part DEFAULT

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
(psycopg2.errors.InvalidTableDefinition) a hash-partitioned table may not have a default partition

[SQL: CREATE TABLE users_hash_part_default PARTITION OF users_hash_part DEFAULT]
(Background on this error at: http://sqlalche.me/e/13/f405)


```{note}
Let us add partitions using 8 as the modulus. We need to add one partition for each remainder from 0 to 7.
```
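The cells that follow add each partition by hand. Since the statements differ only in the remainder, they could also be generated. The following is a minimal sketch in Python, assuming the `<parent>_<remainder>_of_<modulus>` naming convention used in this section; `hash_partition_ddl` is a hypothetical helper that only builds the SQL text and does not run it.

```python
from typing import List


def hash_partition_ddl(parent: str, modulus: int) -> List[str]:
    """Build one CREATE TABLE statement per remainder (0 .. modulus - 1)."""
    statements = []
    for remainder in range(modulus):
        statements.append(
            f"CREATE TABLE {parent}_{remainder}_of_{modulus}\n"
            f"PARTITION OF {parent}\n"
            f"FOR VALUES WITH (modulus {modulus}, remainder {remainder})"
        )
    return statements


for statement in hash_partition_ddl("users_hash_part", 8):
    print(statement + ";\n")
```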

In [113]:
%%sql

CREATE TABLE users_hash_part_0_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 0)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [114]:
%%sql

CREATE TABLE users_hash_part_1_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 1)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [115]:
%%sql

CREATE TABLE users_hash_part_2_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 2)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [116]:
%%sql

CREATE TABLE users_hash_part_3_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 3)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [117]:
%%sql

CREATE TABLE users_hash_part_4_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 4)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [118]:
%%sql

CREATE TABLE users_hash_part_5_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 5)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [119]:
%%sql

CREATE TABLE users_hash_part_6_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 6)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [120]:
%%sql

CREATE TABLE users_hash_part_7_of_8
PARTITION OF users_hash_part
FOR VALUES WITH (modulus 8, remainder 7)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [121]:
%%sql

INSERT INTO users_hash_part
    (user_first_name, user_last_name, user_email_id, created_dt)
VALUES 
    ('Scott', 'Tiger', 'scott@tiger.com', '2018-10-01'),
    ('Donald', 'Duck', 'donald@duck.com', '2019-02-10'),
    ('Mickey', 'Mouse', 'mickey@mouse.com', '2017-06-22')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


[]

```{note}
**user_id** is populated by a sequence. The hash of each generated integer is divided by the modulus (which is 8), and the row is inserted into the partition that matches the remainder. The hash is not the same as the `user_id` value itself, which is why `user_id` 1 is stored in `users_hash_part_0_of_8`.
```

In [122]:
%%sql

SELECT * FROM users_hash_part

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
3 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
1,Scott,Tiger,scott@tiger.com,False,,U,False,2018-10-01,2020-11-24 12:13:06.353736
3,Mickey,Mouse,mickey@mouse.com,False,,U,False,2017-06-22,2020-11-24 12:13:06.353736
2,Donald,Duck,donald@duck.com,False,,U,False,2019-02-10,2020-11-24 12:13:06.353736


In [123]:
%%sql

SELECT * FROM users_hash_part_0_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
1,Scott,Tiger,scott@tiger.com,False,,U,False,2018-10-01,2020-11-24 12:13:06.353736


In [124]:
%%sql

SELECT * FROM users_hash_part_1_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
3,Mickey,Mouse,mickey@mouse.com,False,,U,False,2017-06-22,2020-11-24 12:13:06.353736


In [125]:
%%sql

SELECT * FROM users_hash_part_2_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
2,Donald,Duck,donald@duck.com,False,,U,False,2019-02-10,2020-11-24 12:13:06.353736


In [126]:
%%sql

SELECT * FROM users_hash_part_3_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts


In [127]:
%%sql

SELECT * FROM users_hash_part_4_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts


In [128]:
%%sql

SELECT * FROM users_hash_part_5_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts


In [129]:
%%sql

SELECT * FROM users_hash_part_6_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts


In [130]:
%%sql

SELECT * FROM users_hash_part_7_of_8

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
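Instead of querying each partition separately, the row count per partition can be fetched with one query that uses the `tableoid` system column. The following is a minimal sketch in Python that only builds the query text; `partition_count_query` is a hypothetical helper and it assumes the table name comes from a trusted source, since it is placed into the SQL text directly.

```python
def partition_count_query(parent: str) -> str:
    """Build a query that counts rows per partition of a partitioned table."""
    return (
        # tableoid identifies the physical table (partition) that stores each row.
        "SELECT tableoid::regclass::text AS partition_name, count(*) AS row_count\n"
        f"FROM {parent}\n"
        "GROUP BY 1\n"
        "ORDER BY 1"
    )


print(partition_count_query("users_hash_part"))
```

Partitions that hold no rows do not appear in the result of this query, because there is no row to report a `tableoid` for.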


## Usage Scenarios

Let us go through some of the usage scenarios with respect to partitioning.

In [131]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/ahabU0ywRBw?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* Partitioning is typically used to manage large tables so that no single table grows abnormally over a period of time.
* Partitioning is quite often used on log tables, reporting tables, etc.
* If a log table is partitioned and we need to retain data for 7 years, partitions older than 7 years can be quickly dropped.
* Dropping partitions to clean up a huge chunk of data is much faster than running a `DELETE` command on a non-partitioned table.
* For tables like orders with a limited set of statuses, we often use list partitioning based on the status. It can be 2 partitions (CLOSED orders and ACTIVE orders) or a separate partition for each status.
  * As most of the operations will be on **Active Orders**, this approach can significantly improve the performance.
* In the case of log tables, where we might want to retain data for several years, we tend to use range partitioning on the date column. If we use list partitioning, we have to add a derived column (such as the month) and end up duplicating data unnecessarily.
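The retention scenario above can be sketched in Python. This is a minimal sketch, assuming monthly partitions whose names end in a `yyyymm` suffix; the table name `app_logs` and the helper `partitions_to_drop` are hypothetical and only used for illustration.

```python
from datetime import date
from typing import Iterable, List


def partitions_to_drop(
    partition_names: Iterable[str], today: date, retain_years: int = 7
) -> List[str]:
    """Return partitions whose yyyymm suffix is older than the retention window."""
    # Months strictly before this yyyymm value lie entirely outside the window.
    cutoff = (today.year - retain_years) * 100 + today.month
    expired = []
    for name in partition_names:
        suffix = name.rsplit("_", 1)[-1]
        # Skip names without a yyyymm suffix, such as a default partition.
        if len(suffix) == 6 and suffix.isdigit() and int(suffix) < cutoff:
            expired.append(name)
    return expired


names = ["app_logs_201701", "app_logs_201703", "app_logs_202401", "app_logs_default"]
for name in partitions_to_drop(names, today=date(2024, 3, 15)):
    print(f"DROP TABLE {name};")
```

Each generated statement removes a whole month of data at once, which is the reason dropping a partition is cheaper than deleting the same rows one by one.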

In [132]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [133]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


```{note}
Monthly partitioning using list. We need an additional column to store the month in order to use the list partitioning strategy.
```
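The additional column has to be derived from the date whenever a row is written. The following is a minimal sketch in Python of one way to compute it, assuming the month is stored as a `yyyymm` integer to match partition values such as `201601`; `month_key` is a hypothetical helper.

```python
from datetime import date


def month_key(created_dt: date) -> int:
    """Derive the yyyymm integer stored in the created_mnth column."""
    return created_dt.year * 100 + created_dt.month


created_dt = date(2016, 1, 15)
created_mnth = month_key(created_dt)
print(f"{created_dt.isoformat()} -> created_mnth={created_mnth} -> users_mthly_{created_mnth}")
```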

In [134]:
%%sql

DROP TABLE IF EXISTS users_mthly

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [135]:
%%sql

CREATE TABLE users_mthly (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    created_mnth INT,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (created_mnth, user_id)
) PARTITION BY LIST(created_mnth)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [136]:
%%sql

CREATE TABLE users_mthly_201601
PARTITION OF users_mthly
FOR VALUES IN (201601)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [137]:
%%sql

CREATE TABLE users_mthly_201602
PARTITION OF users_mthly
FOR VALUES IN (201602)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

```{note}
Monthly partitioning using range. The partitioning strategy is defined on **created_dt**, so no additional column is required.
```

In [138]:
%%sql

DROP TABLE IF EXISTS users_mthly

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [139]:
%%sql

CREATE TABLE users_mthly (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (created_dt, user_id)
) PARTITION BY RANGE(created_dt)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [140]:
%%sql

CREATE TABLE users_mthly_201601
PARTITION OF users_mthly
-- The upper bound is exclusive, so we use the first day of the next month
FOR VALUES FROM ('2016-01-01') TO ('2016-02-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [141]:
%%sql

CREATE TABLE users_mthly_201602
PARTITION OF users_mthly
-- The upper bound is exclusive, so we use the first day of the next month
FOR VALUES FROM ('2016-02-01') TO ('2016-03-01')

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]
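In PostgreSQL range partitioning, the lower bound is inclusive and the upper bound is exclusive, so a monthly partition runs from the first day of the month to the first day of the next month. The following is a minimal sketch in Python that generates such statements, assuming the `<parent>_<yyyymm>` naming used in this section; `monthly_range_ddl` is a hypothetical helper that only builds the SQL text.

```python
from datetime import date


def next_month_start(first_day: date) -> date:
    """Return the first day of the month after the given date."""
    if first_day.month == 12:
        return date(first_day.year + 1, 1, 1)
    return date(first_day.year, first_day.month + 1, 1)


def monthly_range_ddl(parent: str, year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly range partition."""
    lower = date(year, month, 1)      # inclusive lower bound
    upper = next_month_start(lower)   # exclusive upper bound
    return (
        f"CREATE TABLE {parent}_{year}{month:02d}\n"
        f"PARTITION OF {parent}\n"
        f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}')"
    )


for month in (1, 2):
    print(monthly_range_ddl("users_mthly", 2016, month) + ";\n")
```

Computing the upper bound from the calendar avoids hard coding the last day of each month, which is easy to get wrong for February.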

## Sub Partitioning

We can have sub partitions created with different permutations and combinations. Sub Partitioning is also known as nested partitioning.

In [142]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/tO3oEo9h-EE?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* List - List
* List - Range
* Other combinations

```{note}
Try different sub-partitioning strategies based on your requirements.
```

In [143]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [144]:
%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


### List - List Partitioning

Let us understand how we can create a table using list - list sub partitioning. We would like to have a main partition per year and then sub partitions per quarter.
* Create table `users_qtly` using `PARTITION BY LIST` on `created_year`.
* Create tables for the yearly partitions using `PARTITION BY LIST` on `created_mnth`.
* Create tables for the quarterly partitions with the list of months using `FOR VALUES IN`.
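The steps above can be sketched in Python for all four quarters of a year. This is a minimal sketch, assuming the `<parent>_<year>` and `<parent>_<year>q<quarter>` naming used in this section; `quarterly_list_ddl` is a hypothetical helper that only builds the SQL text and does not create the parent table.

```python
from typing import List

# Months that belong to each quarter.
QUARTER_MONTHS = {
    1: (1, 2, 3),
    2: (4, 5, 6),
    3: (7, 8, 9),
    4: (10, 11, 12),
}


def quarterly_list_ddl(parent: str, year: int) -> List[str]:
    """Build DDL for one yearly partition and its four quarterly sub partitions."""
    year_table = f"{parent}_{year}"
    statements = [
        f"CREATE TABLE {year_table}\n"
        f"PARTITION OF {parent}\n"
        f"FOR VALUES IN ({year})\n"
        "PARTITION BY LIST (created_mnth)"
    ]
    for quarter, months in QUARTER_MONTHS.items():
        month_list = ", ".join(str(month) for month in months)
        statements.append(
            f"CREATE TABLE {year_table}q{quarter}\n"
            f"PARTITION OF {year_table}\n"
            f"FOR VALUES IN ({month_list})"
        )
    return statements


for statement in quarterly_list_ddl("users_qtly", 2016):
    print(statement + ";\n")
```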

In [145]:
%%sql

DROP TABLE IF EXISTS users_qtly

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [146]:
%%sql

CREATE TABLE users_qtly (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    created_year INT,
    created_mnth INT,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (created_year, created_mnth, user_id)
) PARTITION BY LIST(created_year)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [147]:
%%sql

CREATE TABLE users_qtly_2016
PARTITION OF users_qtly
FOR VALUES IN (2016)
PARTITION BY LIST (created_mnth)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [148]:
%%sql

CREATE TABLE users_qtly_2016q1
PARTITION OF users_qtly_2016
FOR VALUES IN (1, 2, 3)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [149]:
%%sql

CREATE TABLE users_qtly_2016q2
PARTITION OF users_qtly_2016
FOR VALUES IN (4, 5, 6)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

### List - Range Partitioning

Let us understand how we can create a table using list - range sub partitioning using the same example as before (partitioning by year and then by quarter).
* Create the table using `PARTITION BY LIST` on `created_year`.
* Create tables for the yearly partitions using `PARTITION BY RANGE` on `created_mnth`.
* Create tables for the quarterly partitions with the range of months using `FOR VALUES FROM (lower_bound) TO (upper_bound)`. The lower bound is inclusive and the upper bound is exclusive.
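Because the upper bound of a range is exclusive, the bounds for a quarter run from its first month to the first month of the next quarter. The following is a minimal sketch in Python of that arithmetic, assuming months are stored as integers 1 to 12 in `created_mnth`; `quarter_bounds` and `quarterly_range_ddl` are hypothetical helpers that only build the SQL text.

```python
from typing import List, Tuple


def quarter_bounds(quarter: int) -> Tuple[int, int]:
    """Return (inclusive lower month, exclusive upper month) for a quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    lower = 3 * (quarter - 1) + 1
    return lower, lower + 3


def quarterly_range_ddl(parent: str, year: int) -> List[str]:
    """Build DDL for the four quarterly range sub partitions of a yearly partition."""
    year_table = f"{parent}_{year}"
    statements = []
    for quarter in (1, 2, 3, 4):
        lower, upper = quarter_bounds(quarter)
        statements.append(
            f"CREATE TABLE {year_table}q{quarter}\n"
            f"PARTITION OF {year_table}\n"
            f"FOR VALUES FROM ({lower}) TO ({upper})"
        )
    return statements


for statement in quarterly_range_ddl("users_qtly", 2016):
    print(statement + ";\n")
```

With these bounds the first quarter covers months 1, 2 and 3, and the fourth quarter ends at the exclusive bound 13 so that month 12 is included.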

In [150]:
%%sql

DROP TABLE IF EXISTS users_qtly

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [151]:
%%sql

CREATE TABLE users_qtly (
    user_id SERIAL,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    created_year INT,
    created_mnth INT,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (created_year, created_mnth, user_id)
) PARTITION BY LIST(created_year)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [152]:
%%sql

CREATE TABLE users_qtly_2016
PARTITION OF users_qtly
FOR VALUES IN (2016)
PARTITION BY RANGE (created_mnth)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [153]:
%%sql

CREATE TABLE users_qtly_2016q1
PARTITION OF users_qtly_2016
-- The upper bound is exclusive, so months 1, 2 and 3 need TO (4)
FOR VALUES FROM (1) TO (4)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [154]:
%%sql

CREATE TABLE users_qtly_2016q2
PARTITION OF users_qtly_2016
-- The upper bound is exclusive, so months 4, 5 and 6 need TO (7)
FOR VALUES FROM (4) TO (7)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

## Exercise - Partitioning Tables

Here is the exercise to get comfortable with partitioning. We will be using range partitioning.
* Use retail database. Make sure **orders** table already exists.

In [155]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/uAkrpaJmbx0?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

### Exercise 1

Create table **orders_part** with the same columns as orders.
* Partition the table by month using range partitioning on **order_date**.
* Add 14 partitions - 13 based up on the data and 1 default. Here is the naming convention.
  * Default - orders_part_default
  * Partition for 2014 January - orders_part_201401

### Exercise 2

Let us load and validate data in the partitioned table.
* Load the data from **orders** into **orders_part**.
* Get the count on **orders_part** as well as on each of the 14 partitions. You should get 0 for the default partition, and all the records should be distributed across the other 13 partitions.