[YSQL] Enable TOAST compression for PG CatCache #21040

Open
1 task done
kai-franz opened this issue Feb 13, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments


kai-franz commented Feb 13, 2024

Jira Link: DB-10009

Description

PostgreSQL compresses attributes for oversized heap tuples. When PostgreSQL copies tuples to the catcache, it keeps their attributes compressed if they were compressed in the heap.

In Yugabyte, when PG reads a tuple from DocDB, it is always uncompressed. This means that all tuples in the catcache are uncompressed, causing high memory usage, especially when preloading large tables.

We would like to enable TOAST compression in YB for the PG catcache to reduce the per-backend memory consumption.

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@kai-franz kai-franz added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Feb 13, 2024
@kai-franz kai-franz self-assigned this Feb 13, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 13, 2024
kai-franz added a commit that referenced this issue Apr 8, 2024
Summary:
In YB, catalog tuples are cached in memory in a per-backend process catcache (`catcache.c`). This can lead to high per-backend memory consumption when users have many large objects (e.g. functions or views). Additionally, with catalog table preloading, entire catalog tables are prefetched into every backend's catcache, using up even more memory.

To reduce the memory impact of the catcache, we re-introduce TOAST compression, a feature from Postgres which compresses large attributes.

In Postgres, TOAST compression is applied to tuples in the heap. When tuples are read from the heap, they are stored in the catcache before decompression, so compressed attributes stay compressed in the cache.

In Yugabyte, the heap is replaced by DocDB storage, which uses its own compression scheme but always returns data uncompressed. This diff adds compression when inserting or updating values in the catcache; whenever we read a value from the catcache, we decompress it (YB already has the logic to decompress on catcache reads).

Note that the relcache, which sits on top of the catcache, remains uncompressed (as it is in PG). Many large objects stored in catalog tables, such as views in `pg_rewrite` or default values in `pg_attrdef`, are served from the relcache, so we avoid decompressing them on every catalog lookup.
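The compress-on-write, decompress-on-read design described above can be sketched as a toy cache. This is an illustrative model only, not YB's implementation: Postgres TOAST uses pglz or LZ4, while `zlib` stands in here, and the names (`CatCacheSketch`, `COMPRESSION_THRESHOLD`) are hypothetical.

```python
import zlib

# Assumed default size threshold (2000 bytes), per this issue's description.
COMPRESSION_THRESHOLD = 2000

class CatCacheSketch:
    """Toy catcache: compress values on insert/update, decompress on read."""

    def __init__(self):
        self._entries = {}  # key -> (is_compressed, payload)

    def put(self, key, value):
        # Compress only oversized values, and keep the compressed copy
        # only if it is actually smaller than the original.
        if len(value) >= COMPRESSION_THRESHOLD:
            packed = zlib.compress(value)
            if len(packed) < len(value):
                self._entries[key] = (True, packed)
                return
        self._entries[key] = (False, value)

    def get(self, key):
        # Readers always see the uncompressed value.
        is_compressed, payload = self._entries[key]
        return zlib.decompress(payload) if is_compressed else payload
```

Repetitive catalog payloads (view definitions, function bodies) compress well, which is why this trades a small amount of CPU on cache reads for a large reduction in per-backend resident memory.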

#### Details

Here is the precise process for compressing a tuple:

1. Check whether the total size of the uncompressed tuple exceeds the threshold (2000 bytes by default). If it does, proceed to step 2; otherwise, keep the original tuple as-is.
2. For each variable-length attribute, try to compress it. If the compressed version is larger than the original, discard it and keep the uncompressed attribute.
3. Form a new tuple from the original fixed-length attributes, the variable-length attributes that did not shrink when compressed, and the compressed variable-length attributes.
4. Copy this new tuple to the catcache.
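The four steps above can be sketched as follows. This is a simplified model under stated assumptions: attributes are plain byte strings tagged with a varlena flag, `zlib` stands in for Postgres's pglz/LZ4, and `compress_tuple` is a hypothetical name, not a function in the actual patch.

```python
import zlib

THRESHOLD = 2000  # default tuple-size threshold from step 1

def compress_tuple(attrs):
    """attrs: list of (is_varlena, raw_bytes) pairs, where is_varlena marks a
    variable-length attribute. Returns the tuple to be copied into the cache."""
    # Step 1: tuples at or under the threshold are kept as-is.
    if sum(len(v) for _, v in attrs) <= THRESHOLD:
        return list(attrs)
    out = []
    for is_varlena, value in attrs:
        if not is_varlena:
            # Step 3: fixed-length attributes are carried over unchanged.
            out.append((is_varlena, value))
            continue
        # Step 2: try to compress each variable-length attribute.
        packed = zlib.compress(value)
        # Steps 2-3: keep the compressed copy only if it actually shrank.
        out.append((is_varlena, packed if len(packed) < len(value) else value))
    # Step 4: the caller copies this new tuple into the catcache.
    return out
```

Note that the size check in step 1 is on the whole tuple, while compression in step 2 is per-attribute, so a large tuple can end up with a mix of compressed and uncompressed attributes.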
Jira: DB-10009

Test Plan:
## Java unit tests

  - `TestToastFunction`: Tests the correctness and memory efficiency of TOAST compression for large functions.

## Perf tests (in progress)

[[ https://drive.google.com/drive/folders/1lFzZ74fsmWHS_qPGZxcZYJ_3nRNd2Xuu | Google Drive folder with results ]]

  - Workloads / schemas:
    - Empty system schema (baseline)
    - GetNeed schema
    - Compression-heavy schema: Large views, large default values to force TOAST compression
  - Scenarios (measure latency for):
    - Cache refresh (after a DDL or on initial connection)
      - Connection 1: ALTER TABLE DDL (to force a cache refresh)
      - Connection 2: DML with `\timing` enabled
    - Ad hoc metadata lookup
      - Start a connection without cache preloading
      - Run a series of DML statements, each a `SELECT count(*)` from a different table
  - Dev portal cluster with an LTO-enabled build
    - RF=3, one preferred AZ
    - Connect to the preferred AZ

Reviewers: myang

Reviewed By: myang

Subscribers: steve.varnau, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D29916
kai-franz added a commit that referenced this issue Apr 10, 2024
Summary:
Original commit: 064914c / D29916
Reviewers: myang

Reviewed By: myang

Subscribers: yql, steve.varnau

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33958
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jun 27, 2024