
[YSQL] ERROR: row is too big, TOAST support #2003

Open
jaki opened this issue Aug 8, 2019 · 1 comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

jaki (Contributor) commented Aug 8, 2019

Jira Link: DB-1486
This is taken from the arrays Postgres regress test.

```
-- Check that arrays of composites are safely detoasted when needed
create temp table src (f1 text);
insert into src
  select string_agg(random()::text,'') from generate_series(1,10000);
ERROR:  row is too big: size 121224, maximum size 8160
```

It should not error.

@jaki jaki added the area/ysql Yugabyte SQL (YSQL) label Aug 8, 2019
@jaki jaki added this to To do in YSQL via automation Aug 8, 2019
ndeodhar (Contributor) commented Aug 8, 2019

This is related to TOAST-able columns (we use Postgres storage for temp tables):
https://www.postgresql.org/docs/9.5/storage-toast.html

The same example works in vanilla postgres and in YB for regular tables.
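The regular-table variant that succeeds can be sketched as follows; per the comment above it is expected to work, since the rows live in DocDB rather than in Postgres heap pages (the table name `src_regular` is illustrative):

```sql
-- Same aggregate, but into a regular table: rows are stored in DocDB,
-- so the ~8 KB Postgres heap-page limit does not apply.
create table src_regular (f1 text);
insert into src_regular
  select string_agg(random()::text,'') from generate_series(1,10000);
```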

@ddorian ddorian changed the title [YSQL] ERROR: row is too big [YSQL] ERROR: row is too big, TOAST support Jul 14, 2020
@ndeodhar ndeodhar moved this from To do to Backlog in YSQL Aug 5, 2020
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 8, 2022
lnguyen-yugabyte added a commit that referenced this issue Jul 3, 2022
…e index tuple encoding

Summary:
Issue:
On a secondary index scan, the index tuple is read from DocDB and may have been compressed via the TOAST mechanism.

Because Yugabyte stores (and may compress) data in the DocDB layer, TOAST compression is unnecessary and only adds latency.

Solution:
We disable TOAST compression for YB tables only. Note that temporary tables and their indexes are not stored in DocDB and thus may still need TOAST compression.
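For context, upstream PostgreSQL exposes TOAST behavior as a per-column storage strategy that can be inspected and overridden; a sketch against a hypothetical table `t` (not the mechanism this commit uses, just the vanilla-Postgres analogue):

```sql
-- attstorage codes: p = plain (no compression, no out-of-line),
-- m = main, e = external, x = extended (the default for text)
select attname, attstorage
  from pg_attribute
 where attrelid = 't'::regclass and attnum > 0;

-- Forcing plain storage disables TOAST for a column:
alter table t alter column f1 set storage plain;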

When a tuple is read from DocDB, the result is cast to an `IndexTuple` struct, which caps the index tuple size at 8191 bytes (the struct's size field is 13 bits), so the uncompressed index tuple can exceed this limit and raise an error. To fix that, we change the `IndexTuple` encoding format to support index sizes up to 32 MB (larger sizes are not recommended because of the memory cost of holding many tuples at once and the RPC size limit when reading from DocDB).
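The old cap corresponds to PostgreSQL's 13-bit size field in `IndexTupleData.t_info` (`INDEX_SIZE_MASK = 0x1FFF`); assuming that mapping, the two limits can be illustrated directly in psql:

```sql
-- 13-bit size field => at most 8191 bytes; the new encoding allows up to 32 MB.
select x'1FFF'::int      as old_max_index_bytes,  -- 8191
       32 * 1024 * 1024  as new_max_index_bytes;  -- 33554432
```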

Note that temp tables and indexes on temp tables are still subject to the 8 KB limit (as described in #2003).

Test Plan:
A unit test was added with a column size larger than 8 KB.
`ybd --java-test org.yb.pgsql.TestPgRegressTable`

Performance test:

Before the change:
```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0';
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ix on t1  (cost=0.00..15.25 rows=100 width=32) (actual time=59.430..258.773 rows=1000 loops=1)
   Index Cond: (cid = 'user-0-0'::text)
   Heap Fetches: 0
 Planning Time: 16.887 ms
 Execution Time: 263.399 ms
 Peak Memory Usage: 8208 kB
(6 rows)
```

```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18.57..18.82 rows=100 width=32) (actual time=260.664..261.011 rows=1000 loops=1)
   Sort Key: data
   Sort Method: quicksort  Memory: 4040kB
   ->  Index Only Scan using ix on t1  (cost=0.00..15.25 rows=100 width=32) (actual time=41.748..223.814 rows=1000 loops=1)
         Index Cond: (cid = 'user-0-0'::text)
         Heap Fetches: 0
 Planning Time: 58.788 ms
 Execution Time: 262.599 ms
 Peak Memory Usage: 16473 kB
(9 rows)
```

On a very wide column (a JSON column with 4000 attributes), it throws an error:
```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data;
ERROR:  index row requires 25984 bytes, maximum size is 8191
```

After the change:
```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0';
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ix on t1  (cost=0.00..15.25 rows=100 width=32) (actual time=34.926..48.935 rows=1000 loops=1)
   Index Cond: (cid = 'user-0-0'::text)
   Heap Fetches: 0
 Planning Time: 35.742 ms
 Execution Time: 50.382 ms
 Peak Memory Usage: 8208 kB
(6 rows)
```

```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18.57..18.82 rows=100 width=32) (actual time=100.812..109.367 rows=1000 loops=1)
   Sort Key: data
   Sort Method: external merge  Disk: 8096kB
   ->  Index Only Scan using ix on t1  (cost=0.00..15.25 rows=100 width=32) (actual time=62.023..76.775 rows=1000 loops=1)
         Index Cond: (cid = 'user-0-0'::text)
         Heap Fetches: 0
 Planning Time: 47.625 ms
 Execution Time: 112.364 ms
 Peak Memory Usage: 17590 kB
(9 rows)
```

Adding support for wide indexes makes the query work for a very wide JSON column with 4000 attributes:
```
yugabyte=# explain analyze select data from t1 where cid = 'user-0-0';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ix on t1  (cost=0.00..15.25 rows=100 width=32) (actual time=270.117..300.666 rows=1000 loops=1)
   Index Cond: (cid = 'user-0-0'::text)
   Heap Fetches: 0
 Planning Time: 36.454 ms
 Execution Time: 302.121 ms
 Peak Memory Usage: 84736 kB
(6 rows)
```

When testing with temp tables and indexes on temp tables, the issue in #2003 still persists.

Testing with wide-column foreign key references succeeded both before and after the change.

Reviewers: mihnea, smishra, neil, dmitry

Reviewed By: dmitry

Subscribers: fhoogland, kannan, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D17737
lnguyen-yugabyte added a commit that referenced this issue Jul 3, 2022
…elation and change index tuple encoding

Summary: (same commit message as the commit above)

Differential Revision: https://phabricator.dev.yugabyte.com/D18075
lnguyen-yugabyte added a commit that referenced this issue Jul 3, 2022
…ation and change index tuple encoding

Summary: (same commit message as the commit above)

Differential Revision: https://phabricator.dev.yugabyte.com/D18076
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Sep 18, 2022