-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] ERROR: row is too big, TOAST support #2003
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Projects
Comments
This is related to TOAST-able columns (we use postgres storage for temp tables): The same example works in vanilla postgres and in YB for regular tables. |
ddorian
changed the title
[YSQL] ERROR: row is too big
[YSQL] ERROR: row is too big, TOAST support
Jul 14, 2020
yugabyte-ci
added
kind/bug
This issue is a bug
priority/medium
Medium priority issue
labels
Jun 8, 2022
lnguyen-yugabyte
added a commit
that referenced
this issue
Jul 3, 2022
…e index tuple encoding Summary: Issue: On a secondary index scan, the index tuple is read from DocDB and potentially be compressed via TOAST mechanism. As Yugabyte stores (and potentially compresses data) in DocDB layer, the TOAST compression is not necessary and only results in extra latency. Solution: We disable TOAST compression for YB tables only. Note that temporary tables / indices are not stored in DocDB and thus might still need TOAST compression. When reading a tuple from DocDB, as the result is casted to an `IndexTuple` struct (which limits the index size to 8192 bytes), the uncompressed index tuple might exceed this limit and throw an error. To fix that, we change the encoding format of `IndexTuple` to support index size up to 32 MB (it's not recommended to support larger index size due to memory issue (of holding many tuples in memory), and RPC limit size when reading from DocDB). Note that temp tables / index of temp tables are still subject to the 8 kb limit (as described in #2003). Test Plan: Unit test added with a column size larger than 8 kb. `ybd --java-test org.yb.pgsql.TestPgRegressTable` Performance test: Before the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=59.430..258.773 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 16.887 ms Execution Time: 263.399 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=260.664..261.011 rows=1000 loops=1) Sort Key: data Sort Method: quicksort Memory: 4040kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=41.748..223.814 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 58.788 ms Execution Time: 262.599 ms Peak Memory Usage: 16473 kB (9 rows) ``` On a very wide column (JSON column with 4000 attributes instead), it throws an error: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; ERROR: index row requires 25984 bytes, maximum size is 8191 ``` After the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=34.926..48.935 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 35.742 ms Execution Time: 50.382 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=100.812..109.367 rows=1000 loops=1) Sort Key: data Sort Method: external merge Disk: 8096kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=62.023..76.775 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 47.625 ms Execution Time: 112.364 ms Peak Memory Usage: 17590 kB (9 rows) ``` Adding support for wide index makes the query works for a very wide JSON column with 4000 attributes: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=270.117..300.666 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 36.454 ms Execution Time: 302.121 ms Peak Memory Usage: 84736 kB (6 rows) ``` On testing with temp table / indexes of temp table, the issue in #2003 still persists. Testing with wide column Foreign Key references, both the old and new change succeeded. Reviewers: mihnea, smishra, neil, dmitry Reviewed By: dmitry Subscribers: fhoogland, kannan, yql Differential Revision: https://phabricator.dev.yugabyte.com/D17737
lnguyen-yugabyte
added a commit
that referenced
this issue
Jul 3, 2022
…elation and change index tuple encoding Summary: Issue: On a secondary index scan, the index tuple is read from DocDB and potentially be compressed via TOAST mechanism. As Yugabyte stores (and potentially compresses data) in DocDB layer, the TOAST compression is not necessary and only results in extra latency. Solution: We disable TOAST compression for YB tables only. Note that temporary tables / indices are not stored in DocDB and thus might still need TOAST compression. When reading a tuple from DocDB, as the result is casted to an `IndexTuple` struct (which limits the index size to 8192 bytes), the uncompressed index tuple might exceed this limit and throw an error. To fix that, we change the encoding format of `IndexTuple` to support index size up to 32 MB (it's not recommended to support larger index size due to memory issue (of holding many tuples in memory), and RPC limit size when reading from DocDB). Note that temp tables / index of temp tables are still subject to the 8 kb limit (as described in #2003). Test Plan: Unit test added with a column size larger than 8 kb. `ybd --java-test org.yb.pgsql.TestPgRegressTable` Performance test: Before the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=59.430..258.773 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 16.887 ms Execution Time: 263.399 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=260.664..261.011 rows=1000 loops=1) Sort Key: data Sort Method: quicksort Memory: 4040kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=41.748..223.814 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 58.788 ms Execution Time: 262.599 ms Peak Memory Usage: 16473 kB (9 rows) ``` On a very wide column (JSON column with 4000 attributes instead), it throws an error: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; ERROR: index row requires 25984 bytes, maximum size is 8191 ``` After the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=34.926..48.935 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 35.742 ms Execution Time: 50.382 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=100.812..109.367 rows=1000 loops=1) Sort Key: data Sort Method: external merge Disk: 8096kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=62.023..76.775 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 47.625 ms Execution Time: 112.364 ms Peak Memory Usage: 17590 kB (9 rows) ``` Adding support for wide index makes the query works for a very wide JSON column with 4000 attributes: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=270.117..300.666 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 36.454 ms Execution Time: 302.121 ms Peak Memory Usage: 84736 kB (6 rows) ``` On testing with temp table / indexes of temp table, the issue in #2003 still persists. Testing with wide column Foreign Key references, both the old and new change succeeded. Reviewers: mihnea, smishra, neil, dmitry Reviewed By: dmitry Subscribers: yql, kannan, fhoogland Differential Revision: https://phabricator.dev.yugabyte.com/D18075
lnguyen-yugabyte
added a commit
that referenced
this issue
Jul 3, 2022
…ation and change index tuple encoding Summary: Issue: On a secondary index scan, the index tuple is read from DocDB and potentially be compressed via TOAST mechanism. As Yugabyte stores (and potentially compresses data) in DocDB layer, the TOAST compression is not necessary and only results in extra latency. Solution: We disable TOAST compression for YB tables only. Note that temporary tables / indices are not stored in DocDB and thus might still need TOAST compression. When reading a tuple from DocDB, as the result is casted to an `IndexTuple` struct (which limits the index size to 8192 bytes), the uncompressed index tuple might exceed this limit and throw an error. To fix that, we change the encoding format of `IndexTuple` to support index size up to 32 MB (it's not recommended to support larger index size due to memory issue (of holding many tuples in memory), and RPC limit size when reading from DocDB). Note that temp tables / index of temp tables are still subject to the 8 kb limit (as described in #2003). Test Plan: Unit test added with a column size larger than 8 kb. `ybd --java-test org.yb.pgsql.TestPgRegressTable` Performance test: Before the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=59.430..258.773 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 16.887 ms Execution Time: 263.399 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=260.664..261.011 rows=1000 loops=1) Sort Key: data Sort Method: quicksort Memory: 4040kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=41.748..223.814 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 58.788 ms Execution Time: 262.599 ms Peak Memory Usage: 16473 kB (9 rows) ``` On a very wide column (JSON column with 4000 attributes instead), it throws an error: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; ERROR: index row requires 25984 bytes, maximum size is 8191 ``` After the change: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=34.926..48.935 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 35.742 ms Execution Time: 50.382 ms Peak Memory Usage: 8208 kB (6 rows) ``` ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0' order by data; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Sort (cost=18.57..18.82 rows=100 width=32) (actual time=100.812..109.367 rows=1000 loops=1) Sort Key: data Sort Method: external merge Disk: 8096kB -> Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=62.023..76.775 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 47.625 ms Execution Time: 112.364 ms Peak Memory Usage: 17590 kB (9 rows) ``` Adding support for wide index makes the query works for a very wide JSON column with 4000 attributes: ``` yugabyte=# explain analyze select data from t1 where cid = 'user-0-0'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Only Scan using ix on t1 (cost=0.00..15.25 rows=100 width=32) (actual time=270.117..300.666 rows=1000 loops=1) Index Cond: (cid = 'user-0-0'::text) Heap Fetches: 0 Planning Time: 36.454 ms Execution Time: 302.121 ms Peak Memory Usage: 84736 kB (6 rows) ``` On testing with temp table / indexes of temp table, the issue in #2003 still persists. Testing with wide column Foreign Key references, both the old and new change succeeded. Reviewers: mihnea, smishra, neil, dmitry Reviewed By: dmitry Subscribers: yql, kannan, fhoogland Differential Revision: https://phabricator.dev.yugabyte.com/D18076
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
and removed
kind/bug
This issue is a bug
labels
Sep 18, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-1486
This is taken from the
arrays
Postgres regress test.It should not error.
The text was updated successfully, but these errors were encountered: