From 5ecaef0b8a608a8c81c4c055fa7ec3df6cc25f25 Mon Sep 17 00:00:00 2001
From: Evan Tahler
Date: Wed, 15 May 2024 09:06:14 -0700
Subject: [PATCH] more destination postgres warnings (#38219)

---
 docs/integrations/destinations/postgres.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/integrations/destinations/postgres.md b/docs/integrations/destinations/postgres.md
index ce26efcad565a4..75bea1411352b7 100644
--- a/docs/integrations/destinations/postgres.md
+++ b/docs/integrations/destinations/postgres.md
@@ -4,14 +4,15 @@ This page guides you through the process of setting up the Postgres destination
 
 :::caution
 
-Postgres, while an excellent relational database, is not a data warehouse.
+Postgres, while an excellent relational database, is not a data warehouse. Please only consider using postgres as a destination for small data volumes (e.g. less than 10GB) or for testing purposes. For larger data volumes, we recommend using a data warehouse like BigQuery, Snowflake, or Redshift.
 
 1. Postgres is likely to perform poorly with large data volumes. Even postgres-compatible
    destinations (e.g. AWS Aurora) are not immune to slowdowns when dealing with large writes or
-   updates over ~500GB. Especially when using normalization with `destination-postgres`, be sure to
+   updates over ~100GB. Especially when using [typing and deduplication](/using-airbyte/core-concepts/typing-deduping) with `destination-postgres`, be sure to
    monitor your database's memory and CPU usage during your syncs. It is possible for your
    destination to 'lock up', and incur high usage costs with large sync volumes.
-2. Postgres column [name length limitations](https://www.postgresql.org/docs/current/limits.html)
+2. When attempting to scale a postgres database to handle larger data volumes, scaling IOPS (disk throughput) is as important as increasing memory and compute capacity.
+3. Postgres column [name length limitations](https://www.postgresql.org/docs/current/limits.html)
    are likely to cause collisions when used as a destination receiving data from highly-nested and
    flattened sources, e.g. `{63 byte name}_a` and `{63 byte name}_b` will both be truncated to
    `{63 byte name}` which causes postgres to throw an error that a duplicate column name was
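To make the column-name warning added in this patch concrete, below is a minimal Python sketch of how the 63-byte identifier limit turns two distinct flattened field names into the same Postgres column name. The column name, the `truncate_like_postgres` helper, and the constant are illustrative assumptions that approximate Postgres's default `NAMEDATALEN` behavior; they are not part of this patch or of Airbyte's code.

```python
# Illustrative sketch: why two long, flattened column names can collide in Postgres.
# Assumptions: a default Postgres build (NAMEDATALEN = 64, so identifiers are cut to
# 63 bytes) and a hypothetical column name; this is not Airbyte's implementation.

POSTGRES_IDENTIFIER_LIMIT_BYTES = 63  # default NAMEDATALEN (64) minus the trailing NUL byte


def truncate_like_postgres(identifier: str) -> str:
    """Approximate Postgres's truncation of over-long identifiers to 63 bytes."""
    return identifier.encode("utf-8")[:POSTGRES_IDENTIFIER_LIMIT_BYTES].decode(
        "utf-8", errors="ignore"
    )


# A deeply nested source field, once flattened, can easily exceed 63 bytes (this one is 64).
long_prefix = "order_line_items_product_attributes_dimensions_packaging_details"

col_a = truncate_like_postgres(long_prefix + "_a")
col_b = truncate_like_postgres(long_prefix + "_b")

print(col_a)           # both names lose their distinguishing suffix
print(col_a == col_b)  # True -> Postgres would reject the second column as a duplicate
```

Because both suffixed names share the same first 63 bytes, Postgres truncates them to an identical identifier, which is why a destination table built from such a source fails with a duplicate-column error.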