[Bug]: error compressing wide table #4398
Comments
It seems like you are running into postgres limitations here. The 8160-byte row limit is the maximum size a row can have, and short of recompiling postgres with a different page size there is no way around it. With > 1000 columns the likelihood of hitting this is pretty high, assuming an average field size of 8 bytes. But postgres supports storing data out of line to handle data > 8kb, so I'm wondering why this is not done here. Did you change the storage options for the fields? |
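(For reference, a minimal sketch of how per-column storage can be inspected and changed in postgres; the table name foo and the column name c_text are placeholders, not taken from this issue:)
-- show storage per column: 'p' = plain, 'm' = main, 'e' = external, 'x' = extended
SELECT attname, attstorage FROM pg_attribute WHERE attrelid = 'foo'::regclass AND attnum > 0 AND NOT attisdropped;
-- allow a varlena column (text/varchar/etc.) to be compressed and moved out of line (TOAST)
ALTER TABLE foo ALTER COLUMN c_text SET STORAGE EXTENDED;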
Our tables have up to ~1000 columns, which are essentially all 2/4/8 byte ints
or 4/8 byte floats (using "plain" storage). In the error I sent, the rows have
~3600 bytes of "plain" columns, and another 5 varlen columns (which can be
toasted). We do not change the column storage stuff.
We have no issue with vanilla postgres - the problem only appears when enabling
compression in timescale.
|
Reproduce like this:
DROP TABLE foo;
DROP TABLE foo_ts;
SELECT format('CREATE TABLE foo AS SELECT now() AS tm, %s', array_to_string(array_agg(format('1 AS c%s',a)), ', ')) FROM generate_series(1,666)a\gexec
CREATE TABLE foo_ts (LIKE foo); SELECT * FROM create_hypertable('foo_ts', 'tm');
ALTER TABLE foo_ts SET (timescaledb.compress, timescaledb.compress_segmentby='c1', timescaledb.compress_orderby='tm');
INSERT INTO foo_ts SELECT * FROM foo;
SELECT compress_chunk(c) FROM show_chunks('foo_ts')c;
|
Below is the explanation for what is happening. Consider this example:
Consider the above scenario where we create a hypertable and enable compression on that table. When compress_chunk() is called, the following things happen:
Now assume the above table 'foo_ts' has 600 columns (tm, c1, c2 ... c599). Tested on: |
Hi Justin, |
On Wed, Aug 10, 2022 at 01:42:21AM -0700, Bharathy wrote:
Hi Justin,
Thanks for the reproducible testcase.
You mentioned that you don't see any issue with vanilla postgres. Can you please share a similar testcase which I can run on postgres (without timescaledb) to better understand what is happening? This will give me more information to decide whether this is a bug or expected behaviour.
The testcase I gave shows a table which 1) has no issue under vanilla postgres
and 2) gives an error with timescaledb. What would you want another test case
to show?
--
Justin
|
Hi Justin, consider a simple table like below:
Please see that storage for char and varchar is set to 'extended'. It means values for these attributes (aka columns) can be toasted and also compressed by postgres.
There are no internal timescaledb tables created yet. Now create a hypertable and enable compression on the table.
If you notice, an internal timescaledb compression table ('_timescaledb_internal._compressed_hypertable_85') has been created.
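(The storage listing itself is not preserved in this thread; a sketch of how one could inspect it, using the internal table named above:)
\d+ _timescaledb_internal._compressed_hypertable_85
-- or, equivalently, query the catalog directly ('e' = external, 'x' = extended, 'p' = plain)
SELECT attname, attstorage FROM pg_attribute WHERE attrelid = '_timescaledb_internal._compressed_hypertable_85'::regclass AND attnum > 0 AND NOT attisdropped;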
Please notice the storage qualifier for columns tm, c2, c3, c4. 'External' means the attribute can be toasted but not compressed. Now please create a table like below with 460 columns:
Now do an insert; it will fail with the same error.
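(The 460-column table definition is not preserved in this thread; below is a minimal sketch of such a reproducer in vanilla postgres, assuming 460 toastable text columns and the placeholder table name wide_toast. Even when every value is moved out of line, each one still leaves an ~18-byte TOAST pointer in the heap row, and 460 * 18 = 8280 bytes already exceeds the 8160-byte limit:)
SELECT format('CREATE TABLE wide_toast (%s)', array_to_string(array_agg(format('c%s text', a)), ', ')) FROM generate_series(1,460)a\gexec
SELECT format('INSERT INTO wide_toast SELECT %s', array_to_string(array_agg('repeat(''x'', 3000)'::text), ', ')) FROM generate_series(1,460)a\gexec
-- the INSERT is expected to fail with a "row is too big: ... maximum size 8160" error,
-- because the 460 TOAST pointers alone no longer fit on one 8K heap page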
In short, we are hitting hard limits set by postgres. Thus IMHO this is expected behaviour. |
On Wed, Aug 10, 2022 at 11:01:53PM -0700, Bharathy wrote:
In short, we are hitting hard limits set by postgres. Thus IMHO this is expected behaviour.
postgres has an absolute, compile-time limit of 1600 columns, further limited
by column widths, which is documented:
https://www.postgresql.org/docs/current/limits.html
Your test case demonstrates the documented limitation in vanilla postgres.
The column limit in postgres is an intentional limit (max 1600 columns, to avoid
bad performance of any O(n**2) loops), but timescale's use of additional columns
and storage settings is an implementation detail that users would ideally not
need to be aware of. Alternately, if timescale supports fewer columns or a
narrower tuple width than postgres, I think that limit should be documented.
We use timeseries data and many columns - historically up to 1600 cols and
currently up to ~1000. With the exception of a few "key" columns, they're all
int/float, with "plain" storage. This limitation prevents us from using the
most interesting functionality from timescale.
The problem for me is not only that compressed hypertables have a lower limit,
but also that they fail not when we're INSERTing data, but later when calling
compress_chunk(). It would help if a WARNING (or ERROR) were issued when
creating the hypertable, to indicate that the widths of inline columns might
exceed the maximum (depending I think on null values and var width columns)
when compressing chunks later.
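(Until something like that exists, a rough, purely illustrative check one can run by hand is to count the hypertable's columns and assume each one becomes at least an ~18-byte TOAST pointer in the compressed row; this ignores segmentby columns and timescale's metadata columns, so treat it only as a back-of-the-envelope estimate. Using foo_ts from the reproducer above:)
SELECT count(*) AS ncols, count(*) * 18 AS approx_min_compressed_row_bytes -- compare against ~8160
FROM pg_attribute WHERE attrelid = 'foo_ts'::regclass AND attnum > 0 AND NOT attisdropped;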
--
Justin
|
Hi Justin, |
Consider a compressed hypertable that has many columns (e.g. more than 600). In a call to compress_chunk(), the compressed tuple size exceeds 8K, which causes the error "row is too big: size 10856, maximum size 8160." This patch estimates the tuple size of the compressed hypertable and reports a warning when compression is enabled on the hypertable, so the user becomes aware of this before calling compress_chunk(). Fixes timescale#4398 Co-Authored-By: Mats Kindahl <mats.kindahl@gmail.com>
What type of bug is this?
Unexpected error
What subsystems and features are affected?
Compression
What happened?
We have wide tables (with up to ~1000 columns).
When I try to compress a table, it fails like:
TimescaleDB version affected
2.7
PostgreSQL version used
14.3
What operating system did you use?
centos7
What installation method did you use?
Source
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
How can we reproduce the bug?
No response