Script to run against your RDS Postgres databases that tries to work around the Lazy Load feature of EBS
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


AWS RDS PostgreSQL uses EBS which has an interesting feature called Lazy Loading that allows it to instantiate a disk (of any size) at almost constant time. Although a fantastic feature, this however, can lead to unexpected outcomes when high-end production load is thrown at a newly launched RDS Postgres instance immediately after Restoring from a Snapshot.

This project tries to use various methods available to allow RDS Postgres Users to force 'Initialization' of Disk Blocks, using pg_prewarm PostgreSQL extension (which is supported in RDS Postgres). This is useful for instance when you restore an RDS Postgres instance and cannot afford high Latencies for the initial workload thrown at it.

Although pg_prewarm was originally meant for populating buffer-cache (and not for this purpose), the idea is common and (in this specific use-case) heaven-sent to Initialize (almost) the entire snapshot from S3 on to the RDS EBS volume in question.

For those who understand the working of pg_prewarm (and are concerned about Instance Memory / Cache sizes), do note that even if pg_prewarm runs through all tables etc., thereby effectively evicting the disk-blocks pushed to cache in a previous run for a recent table, it still does the job of initializing all disk-blocks with respect to the EBS volume, and thus still recommended (for the above use-case).

Notably, TOAST tables are handled in a special way, owing to how Postgres treats them. For this, please refer to 'toast.sql' Script in this Repository.


When this SQL is run against each Database in an RDS Postgres Instance, it forces the Disk-Block Initialization for all Database Objects owned by the User (or for which there are SELECT privileges).

On a sample run (run on a pgbench database), my RDS Postgres instance returns this:

clock_timestamp table_size freespace_map_size visibility_map_size init_size blocks_prefetched schema_name table_name
2018-09-29 07:11:33.688139+00 1281 MB 344 kB 48 kB 0 bytes 163984 public pgbench_accounts
2018-09-29 07:11:57.970511+00 8192 bytes 24 kB 8192 bytes 0 bytes 5 public pgbench_branches
2018-09-29 07:11:57.970735+00 0 bytes 0 bytes 0 bytes 0 bytes 0 public pgbench_history
2018-09-29 07:11:57.970804+00 48 kB 24 kB 8192 bytes 0 bytes 10 public pgbench_tellers
2018-09-29 07:11:57.971753+00 ¤ 0 bytes 0 bytes 0 bytes 0 ¤ User's large objects
(5 rows)


On a sample run (run on a pgbench database), my RDS Postgres instance returns this:

pg_user@pgbench=> SELECT 'VACUUM FULL ' || (SELECT nspname FROM pg_namespace n WHERE n.oid = c.relnamespace) || '.' || relname || ';' AS vacuum_sql FROM pg_class c WHERE reltoastrelid > 0 ORDER BY 1;

VACUUM FULL information_schema.sql_features;
VACUUM FULL information_schema.sql_implementation_info;
VACUUM FULL information_schema.sql_languages;
VACUUM FULL information_schema.sql_packages;
VACUUM FULL information_schema.sql_parts;
VACUUM FULL information_schema.sql_sizing;
VACUUM FULL information_schema.sql_sizing_profiles;
VACUUM FULL pg_catalog.pg_attrdef;
VACUUM FULL pg_catalog.pg_constraint;
VACUUM FULL pg_catalog.pg_db_role_setting;
VACUUM FULL pg_catalog.pg_description;
VACUUM FULL pg_catalog.pg_proc;
VACUUM FULL pg_catalog.pg_rewrite;
VACUUM FULL pg_catalog.pg_seclabel;
VACUUM FULL pg_catalog.pg_shdescription;
VACUUM FULL pg_catalog.pg_shseclabel;
VACUUM FULL pg_catalog.pg_statistic;
VACUUM FULL pg_catalog.pg_statistic_ext;
VACUUM FULL pg_catalog.pg_trigger;
(19 rows)