OOM on full outer join queries involving many partitions #1274
Could you provide the OOM details that show all the processes together with their memory usage? The relevant details look similar to the ones mentioned here:
Linux is not running out of memory, since Kubernetes terminates the pod at some threshold (which is ~52 GB). Here it is terminated at 73% memory:
Do you think I have a problem in the query?
No; this behavior seems off. This should only happen if you have a query requiring a lot of memory. This leads to the questions:
Also, this does not directly relate to TimescaleDB, but I guess we'd rather have PostgreSQL return a sqlstate 53200 / out_of_memory error to the client than have Kubernetes intervene. I'm not sure which parameters are involved in the Kubernetes case; on classical Linux you'd reduce overcommit_ratio and disable overcommit altogether so that memory starvation is never possible on the running node.
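For the classical-Linux route, a minimal sketch of the overcommit sysctls involved (the file name and values here are illustrative assumptions, not recommendations; tune them for your host):

```
# /etc/sysctl.d/99-overcommit.conf (illustrative values)
# Mode 2 = strict commit accounting: allocations beyond the commit
# limit fail with ENOMEM instead of invoking the kernel OOM killer,
# so PostgreSQL can surface sqlstate 53200 / out_of_memory itself.
vm.overcommit_memory = 2
# Commit limit = swap + physical RAM * overcommit_ratio / 100
vm.overcommit_ratio = 80
```

Apply with `sysctl --system` (or `sysctl -p <file>`). Note that inside a Kubernetes pod the memory limit is enforced by the cgroup, so these host-level knobs do not stop the pod-level kill.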
@feikesteenbergen, all settings are specified in the first post.
With regards to Kubernetes, it terminates the pod at some threshold (before the node itself suffers). I agree that it'd be better if Postgres returned an OOM error - how do you suggest configuring that? Thanks
@cloud-rocket Would you be happy to share the (obfuscated) explain plan in this issue?
Sure
@cloud-rocket thanks for the plan. If I understand you correctly, the problem does not occur when running the sole select? I'm going to see if I can reproduce something like that as well, to find out what is causing the memory increase.
It's a good question. For the obvious reason of returning millions of records, I cannot run this select on its own, but I will try. I actually never thought there might be a difference between creating a view with a select and just running the select....
But we have had the same out-of-memory issues since the 0.x versions....
It looks like the |
@cloud-rocket good news and bad news. The good news is that I'm able to reproduce this issue. It is however not Timescale-specific; it seems to be a PostgreSQL issue when doing a FULL OUTER JOIN across many partitions.
Reproducing testcase:

```sql
CREATE TABLE buggy(
    inserted timestamptz not null
)
PARTITION BY RANGE (inserted);

-- Create some partitions
DO $BODY$
DECLARE
    partname text;
    start date := date_trunc('week', '1999-12-31'::date);
BEGIN
    FOR i IN 0..1000 LOOP
        partname := format('buggy_%s', to_char(start, 'IYYYIW'));
        EXECUTE format('CREATE TABLE %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
            partname,
            'buggy',
            start,
            start + 7
        );
        start := start + 7;
    END LOOP;
END;
$BODY$;
```
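The loop above derives each weekly partition's name from the ISO year and ISO week (`to_char(start, 'IYYYIW')`). As a sanity check on the generated names and bounds, here is a small Python sketch (a hypothetical helper, not part of the testcase) that mirrors the same computation:

```python
from datetime import date, timedelta

def week_partitions(anchor: date, n: int):
    """Yield (partition_name, from_date, to_date) for n weekly
    partitions, mirroring the plpgsql loop: the anchor date is
    truncated to the Monday of its ISO week (date_trunc('week', ...)),
    and names use ISO year + ISO week number ('IYYYIW')."""
    start = anchor - timedelta(days=anchor.weekday())
    for _ in range(n):
        iso_year, iso_week, _dow = start.isocalendar()
        yield f"buggy_{iso_year}{iso_week:02d}", start, start + timedelta(days=7)
        start += timedelta(days=7)

parts = list(week_partitions(date(1999, 12, 31), 3))
# first partition covers the week starting Monday 1999-12-27
```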
This works fine:

```sql
EXPLAIN
SELECT
    inserted
FROM
    buggy b1
LEFT JOIN
    buggy b2 USING (inserted)
LEFT JOIN
    buggy b3 USING (inserted)
LEFT JOIN
    buggy b4 USING (inserted);
```
This one, however, will do one of the following:

- `ERROR: 53200: out of memory` on systems with overcommit disabled, or
- an out-of-memory kill (kernel), or
- a rescheduling of the pod (k8s)

```sql
EXPLAIN
SELECT
    inserted
FROM
    buggy b1
FULL OUTER JOIN
    buggy b2 USING (inserted)
FULL OUTER JOIN
    buggy b3 USING (inserted)
FULL OUTER JOIN
    buggy b4 USING (inserted);
```

I'll do a bit more debugging to isolate the issue, but I think we'll report this upstream to the PostgreSQL bugs mailing list, and once the bug# is known, we'll close this issue with a link to the discussion on the PostgreSQL mailing list.
@cloud-rocket as only |
Thank you @feikesteenbergen. |
It depends. There is no such thing as an index over multiple tables in PostgreSQL, but you could create a single table with one column containing these values and put an index on that. The other option is to do a bit of the planning yourself, e.g. do one full outer join in a CTE, and do another one a level lower. This forces the planner into a specific order of execution, which is normally something you want to avoid, but here it may be useful.

For example, this query causes an OOM on my machine:

```sql
EXPLAIN
SELECT
    inserted
FROM
    buggy b1
FULL OUTER JOIN
    buggy b2 USING (inserted)
FULL OUTER JOIN (
    buggy b3
    FULL OUTER JOIN
    buggy b4 USING (inserted)
) USING (inserted);
```

But this one doesn't:

```sql
EXPLAIN
WITH my_cte AS (
    SELECT
        inserted
    FROM
        buggy b1
    FULL OUTER JOIN
        buggy b2 USING (inserted)
)
SELECT
    inserted
FROM
    my_cte -- contains b1 and b2
FULL OUTER JOIN
    buggy b3 USING (inserted)
FULL OUTER JOIN
    buggy b4 USING (inserted);
```

The above approach requires that my_cte be materialized, so a lot of memory or temp files may be needed. Perhaps a better approach would be to compute the aggregates per table first and join only the aggregates, e.g.:

```sql
WITH a AS (
    SELECT
        time_bucket('5 minute', timestamp) AS tbucket,
        ...
    FROM
        bbb_23_calcs
    GROUP BY
        time_bucket('5 minute', timestamp)
), b AS (
    SELECT
        time_bucket('5 minute', timestamp) AS tbucket,
        ...
    FROM
        bbb_23_22_ip
    GROUP BY
        time_bucket('5 minute', timestamp)
), [...]
SELECT
    tbucket,
    a.col1,
    b.col2,
    coalesce(a.col3, b.col3)
FROM
    a
FULL OUTER JOIN
    b USING (tbucket)
ORDER BY
    tbucket;
```
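The aggregate-first shape can be illustrated in miniature with a toy Python sketch (not TimescaleDB code; the bucket size and sample data are invented): aggregate each input per time bucket first, then full-outer-merge the much smaller per-bucket results.

```python
from collections import defaultdict

BUCKET = 300  # five minutes in seconds, mirroring time_bucket('5 minute', ...)

def bucket_sum(rows):
    """Aggregate (epoch_seconds, value) rows per bucket, like
    SELECT time_bucket(...), sum(value) ... GROUP BY 1."""
    out = defaultdict(float)
    for ts, val in rows:
        out[ts - ts % BUCKET] += val
    return dict(out)

def full_outer_merge(a, b):
    """Combine two bucket->value dicts the way FULL OUTER JOIN
    USING (tbucket) would: keep every bucket present on either
    side, with None where one side has no row."""
    return {t: (a.get(t), b.get(t)) for t in sorted(a.keys() | b.keys())}

# Aggregate each input first, then join the aggregates; the join now
# sees one row per bucket instead of one row per raw record.
calcs = bucket_sum([(10, 1.0), (70, 2.0), (400, 5.0)])
ip = bucket_sum([(20, 4.0), (900, 7.0)])
merged = full_outer_merge(calcs, ip)
```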
I raised this as a bug on the PostgreSQL mailing list here:
A patch was added to PostgreSQL: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d25ea0127598dd0ad27a029215172f396fbd1009

I've done a few tests with and without this patch, using a 3-way full outer join with 1000 partitions each:

```sql
EXPLAIN (ANALYZE)
SELECT
    inserted
FROM
    buggy b1
FULL OUTER JOIN
    buggy b2 USING (inserted)
FULL OUTER JOIN
    buggy b3 USING (inserted);
```

Before:
After:
The 4-way join is now planned within reasonable enough time for a materialized view/continuous aggregate, and it does not consume huge amounts of memory:
There won't be a backpatch for PostgreSQL < 12, so for current releases the only way to make this work is to use a workaround. It will take some time for TimescaleDB to be available on PostgreSQL 12, but as soon as it is, this problem should no longer exist. We can close the issue, as there is nothing we can do from our side to make this work better before PostgreSQL 12 is released.
Thanks @feikesteenbergen!! Great job!!
Closing! |
Relevant system information:
- PostgreSQL version (`postgres --version`): 11
- TimescaleDB version (`\dx` in psql): 1.3.0
- Settings:

```
shared_buffers = 6531MB
effective_cache_size = 19594MB
maintenance_work_mem = 2047MB
wal_buffers = 16MB
work_mem = 33441kB
max_worker_processes = 11
max_parallel_workers = 4
```
Describe the bug
Postgres runs out of memory
We have three tables with several years of data each, and we are trying to create a materialized view with 5-minute aggregation containing all the data together.
The memory usage increases while the query runs, until it reaches the threshold of 52 GB.
To Reproduce
Query example:
Expected behavior
View should be created after long run time
Actual behavior
Pod is terminated as it runs out of memory
Screenshots
Only a single query is running until the pod is terminated with out of memory: