perf: optimize tables query #542

aokiji · 2025-09-28T15:10:07Z

Summary

Tables info can be queried faster by calling pg_total_relation_size only once per row, pg_total_relation_size is a volatile function, meaning that it would be called twice per row as volatile functions are not guaranteed the same result per table lookup, but that is not meaningful in this context

Additionally the group by expression can be removed entirely since only one row per oid can be produced by the join conditions

the speed up can be noticed specially in databases with large number of tables, to test this out I will set up a test database and measure the improvement

Set up the environment

We will create a postgres 12 database and populate it with 50000 tables with primary key

Using the following init.sql script:

-- init.sql
CREATE TABLE IF NOT EXISTS public.client_types (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255)
);

-- create multiple tables with foreign keys in public
DO $$
DECLARE
    table_prefix TEXT := 'client_catalog_';
    number_of_tables_to_create INT := 50000;
BEGIN

    -- create tables
    FOR i IN 1..number_of_tables_to_create LOOP
        EXECUTE format('
            CREATE TABLE IF NOT EXISTS public.%I ( LIKE public.client_types )' , table_prefix || i::text);

    END LOOP;
END $$;

Let's start a postgres server that initializes with init.sql

❯ docker run --rm -p 15432:5432 -e POSTGRES_USER=example_user -e POSTGRES_PASSWORD=example_password -e POSTGRES_DB=example_db -v $PWD/init.sql:/docker-entrypoint-initdb.d/init.sql:ro --name pg_sql_bench postgres:12-alpine -c shared_buffers=512MB -c max_locks_per_transaction=1000

Validation

Using the two versions of the query:

current_query

-- current_query.sql
select
  c.oid :: int8 as "id!",
  nc.nspname as schema,
  c.relname as name,
  c.relkind as table_kind,
  c.relrowsecurity as rls_enabled,
  c.relforcerowsecurity as rls_forced,
  case
    when c.relreplident = 'd' then 'DEFAULT'
    when c.relreplident = 'i' then 'INDEX'
    when c.relreplident = 'f' then 'FULL'
    else 'NOTHING'
  end as "replica_identity!",
  pg_total_relation_size(format('%I.%I', nc.nspname, c.relname)) :: int8 as "bytes!",
  pg_size_pretty(
    pg_total_relation_size(format('%I.%I', nc.nspname, c.relname))
  ) as "size!",
  pg_stat_get_live_tuples(c.oid) as "live_rows_estimate!",
  pg_stat_get_dead_tuples(c.oid) as "dead_rows_estimate!",
  obj_description(c.oid) as comment
from
  pg_namespace nc
  join pg_class c on nc.oid = c.relnamespace
where
  c.relkind in ('r', 'p', 'v', 'm')
  and not pg_is_other_temp_schema(nc.oid)
  and (
    pg_has_role(c.relowner, 'USAGE')
    or has_table_privilege(
      c.oid,
      'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'
    )
    or has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFERENCES')
  )
group by
  c.oid,
  c.relname,
  c.relkind,
  c.relrowsecurity,
  c.relforcerowsecurity,
  c.relreplident,
  nc.nspname;

new_query

-- new_query.sql
select
  c.oid :: int8 as "id!",
  nc.nspname as schema,
  c.relname as name,
  c.relkind as table_kind,
  c.relrowsecurity as rls_enabled,
  c.relforcerowsecurity as rls_forced,
  case
    when c.relreplident = 'd' then 'DEFAULT'
    when c.relreplident = 'i' then 'INDEX'
    when c.relreplident = 'f' then 'FULL'
    else 'NOTHING'
  end as "replica_identity!",
  relation_size:: int8 as "bytes!",
  pg_size_pretty(relation_size) as "size!",
  pg_stat_get_live_tuples(c.oid) as "live_rows_estimate!",
  pg_stat_get_dead_tuples(c.oid) as "dead_rows_estimate!",
  obj_description(c.oid) as comment
from
  pg_namespace nc
  join pg_class c on nc.oid = c.relnamespace
  cross join lateral pg_total_relation_size(c.oid) relation_size
where
  c.relkind in ('r', 'p', 'v', 'm')
  and not pg_is_other_temp_schema(nc.oid)
  and (
    pg_has_role(c.relowner, 'USAGE')
    or has_table_privilege(
      c.oid,
      'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'
    )
    or has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFERENCES')
  )

The first step is to test that both queries produce the same results

❯ psql postgres://example_user:example_password@localhost:15432/example_db -t -A < current_query.sql | sort > current_query_results
❯ psql postgres://example_user:example_password@localhost:15432/example_db -t -A < new_query.sql | sort > new_query_results
❯ diff -s current_query_results new_query_results
Files current_query_results and new_query_results are identical

Performance analysis

Knowing both queries produce the same result lets check the performance

Prior to each hyperfine run we must restart the database to ensure queries run on equal context

❯ hyperfine 'psql postgres://example_user:example_password@localhost:15432/example_db -t -A < current_query.sql'
Benchmark 1: psql postgres://example_user:example_password@localhost:15432/example_db -t -A < current_query.sql
  Time (mean ± σ):      1.670 s ±  0.379 s    [User: 0.050 s, System: 0.041 s]
  Range (min … max):    1.542 s …  2.748 s    10 runs

❯ hyperfine 'psql postgres://example_user:example_password@localhost:15432/example_db -t -A < new_query.sql'
Benchmark 1: psql postgres://example_user:example_password@localhost:15432/example_db -t -A < new_query.sql
  Time (mean ± σ):      1.049 s ±  0.359 s    [User: 0.046 s, System: 0.039 s]
  Range (min … max):    0.929 s …  2.071 s    10 runs

Mean and max time differ on 700ms

tables info can be queried faster by calling pg_total_relation_size only once per row

psteinroe

thanks for the verbose PR!

perf: optimize tables query

f2a2bd0

tables info can be queried faster by calling pg_total_relation_size only once per row

psteinroe approved these changes Sep 29, 2025

View reviewed changes

psteinroe merged commit f61d5a1 into supabase-community:main Sep 29, 2025
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: optimize tables query #542

perf: optimize tables query #542

Uh oh!

aokiji commented Sep 28, 2025

Uh oh!

psteinroe left a comment

Uh oh!

Uh oh!

Uh oh!

perf: optimize tables query #542

perf: optimize tables query #542

Uh oh!

Conversation

aokiji commented Sep 28, 2025

Summary

Set up the environment

Validation

Performance analysis

Uh oh!

psteinroe left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!