Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficient plan is generated for queries with many (>= 24) joined tables which take a long time or an HEAPalloc error. I get Error: GDK reported error. HEAPalloc: Insufficient space for HEAP of 400000000 bytes. #3809

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2015-09-10 15:20:43 +0200
From: Martin van Dinther <<martin.van.dinther>>
To: SQL devs <>
Version: 11.21.5 (Jul2015)
CC: @njnes, vijayakrishna55

Last updated: 2016-04-05 12:09:20 +0200

Comment 21261

Date: 2015-09-10 15:20:43 +0200
From: Martin van Dinther <<martin.van.dinther>>

User-Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0
Build Identifier:

The following query on 24 tables with join conditions in the WHERE clause:
SELECT x20,x47,x38,x18,x10,x22,x37,x3,x63,x8,x30,x43,x54,x9,x21,x25,x2,x61,x55,x32,x52,x29,x50,x12
FROM t9,t20,t63,t54,t32,t22,t52,t25,t47,t29,t12,t18,t10,t21,t43,t37,t2,t61,t55,t3,t38,t50,t30,t8
WHERE b3=a18
AND a25=b38
AND a54=b8
AND a61=b47
AND a43=b29
AND b18=a12
AND b32=a30
AND a22=b43
AND a29=b21
AND a10=b25
AND a12=4
AND b22=a32
AND a20=b55
AND b30=a9
AND a2=b61
AND a38=b52
AND a55=b9
AND a21=b50
AND a37=b54
AND b10=a3
AND a52=b2
AND a50=b63
AND a8=b20
AND a47=b37;
fails to execute and aborts with:
GDK reported error.
HEAPalloc: Insufficient space for HEAP of 400000000 bytes.

Please note: this is only one example query. There are many more and even larger ones (up to 64 tables joined) in select5.test.sql

Reproducible: Always

Steps to Reproduce:

  1. Start mserver5 (MonetDB 5 server v11.21.6) compiled with --enable-assert
  2. Start mclient
  3. Run SQL script to create the 64 tables with each 10 rows, see: create_64_tables_with_10_rows_each.sql
  4. Run SQL query:
    SELECT x20,x47,x38,x18,x10,x22,x37,x3,x63,x8,x30,x43,x54,x9,x21,x25,x2,x61,x55,x32,x52,x29,x50,x12
    FROM t9,t20,t63,t54,t32,t22,t52,t25,t47,t29,t12,t18,t10,t21,t43,t37,t2,t61,t55,t3,t38,t50,t30,t8
    WHERE b3=a18
    AND a25=b38
    AND a54=b8
    AND a61=b47
    AND a43=b29
    AND b18=a12
    AND b32=a30
    AND a22=b43
    AND a29=b21
    AND a10=b25
    AND a12=4
    AND b22=a32
    AND a20=b55
    AND b30=a9
    AND a2=b61
    AND a38=b52
    AND a55=b9
    AND a21=b50
    AND a37=b54
    AND b10=a3
    AND a52=b2
    AND a50=b63
    AND a8=b20
    AND a47=b37;
  5. to cleanup tables, run SQL script: drop_64_tables.sql

Actual Results:

builtin opt gdk_dbpath = /export/scratch1/dinther/INSTALL/var/monetdb5/dbfarm/demo
builtin opt gdk_debug = 0
builtin opt gdk_vmtrim = no
builtin opt monet_prompt = >
builtin opt monet_daemon = no
builtin opt mapi_port = 50000
builtin opt mapi_open = false
builtin opt mapi_autosense = false
builtin opt sql_optimizer = default_pipe
builtin opt sql_debug = 0
cmdline opt embedded_r = true
cmdline opt mapi_port = 41000
cmdline opt gdk_debug = 10
MonetDB 5 server v11.21.6
This is an unreleased version
Serving database 'demo', using 8 threads
Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs and 128bit integers dynamically linked
Found 15.590 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:41000/
MonetDB/GIS module loaded
Start processing logs sql/sql_logs version 52200
Finished processing logs sql/sql_logs
MonetDB/SQL module loaded
MonetDB/R module loaded

SQL catalog created, loading sql scripts once

bash-4.2$ mclient -p41000
Welcome to mclient, the MonetDB/SQL interactive terminal (unreleased)
Database: MonetDB v11.21.6 (unreleased), 'demo'
Type \q to quit, ? for a list of available commands
auto commit mode: on
sql>SELECT x20,x47,x38,x18,x10,x22,x37,x3,x63,x8,x30,x43,x54,x9,x21,x25,x2,x61,x55,x32,x52,x29,x50,x12
more> FROM t9,t20,t63,t54,t32,t22,t52,t25,t47,t29,t12,t18,t10,t21,t43,t37,t2,t61,t55,t3,t38,t50,t30,t8
more> WHERE b3=a18
more> AND a25=b38
more> AND a54=b8
more> AND a61=b47
more> AND a43=b29
more> AND b18=a12
more> AND b32=a30
more> AND a22=b43
more> AND a29=b21
more> AND a10=b25
more> AND a12=4
more> AND b22=a32
more> AND a20=b55
more> AND b30=a9
more> AND a2=b61
more> AND a38=b52
more> AND a55=b9
more> AND a21=b50
more> AND a37=b54
more> AND b10=a3
more> AND a52=b2
more> AND a50=b63
more> AND a8=b20
more> AND a47=b37
more>;
GDK reported error.
HEAPalloc: Insufficient space for HEAP of 400000000 bytes.
sql>

Expected Results:

No HEAPalloc error.

The plan generator (or optimizer) should try to generate joins instead of crossproducts (see plan below).

If crossproducts are not avoidable (so when no equality condition for the table exists) do them after the joins have been completed and not before the joins. It will result in smaller initial intermediates, less large intermediate results and hence faster and potentially successful (no GDK HEAPalloc error) execution.

The plan contains many crossproduct() calls which are computed first and thus produce very large intermediate results and very long execution time or abort with HEAPalloc error.

rel
project (
| join (
| | join (
| | | join (
| | | | join (
| | | | | join (
| | | | | | join (
| | | | | | | join (
| | | | | | | | join (
| | | | | | | | | join (
| | | | | | | | | | join (
| | | | | | | | | | | join (
| | | | | | | | | | | | join (
| | | | | | | | | | | | | join (
| | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | | | | | crossproduct (
| | | | | | | | | | | | | | | | | | | | | | | join (
| | | | | | | | | | | | | | | | | | | | | | | | table(sys.t18) [ t18.a18 NOT NULL HASHCOL , t18.b18, t18.x18 ] COUNT ,
| | | | | | | | | | | | | | | | | | | | | | | | select (
| | | | | | | | | | | | | | | | | | | | | | | | | table(sys.t12) [ t12.a12 NOT NULL HASHCOL , t12.x12 ] COUNT
| | | | | | | | | | | | | | | | | | | | | | | | ) [ t12.a12 NOT NULL HASHCOL = int "4" ]
| | | | | | | | | | | | | | | | | | | | | | | ) [ t18.b18 = t12.a12 NOT NULL HASHCOL ],
| | | | | | | | | | | | | | | | | | | | | | | table(sys.t22) [ t22.a22 NOT NULL HASHCOL , t22.b22, t22.x22 ] COUNT
| | | | | | | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | | | | | | table(sys.t32) [ t32.a32 NOT NULL HASHCOL , t32.b32, t32.x32 ] COUNT
| | | | | | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | | | | | table(sys.t54) [ t54.a54 NOT NULL HASHCOL , t54.b54, t54.x54 ] COUNT
| | | | | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | | | | table(sys.t63) [ t63.a63 NOT NULL HASHCOL , t63.b63, t63.x63 ] COUNT
| | | | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | | | table(sys.t20) [ t20.a20 NOT NULL HASHCOL , t20.b20, t20.x20 ] COUNT
| | | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | | table(sys.t9) [ t9.a9 NOT NULL HASHCOL , t9.b9, t9.x9 ] COUNT
| | | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | | table(sys.t52) [ t52.a52 NOT NULL HASHCOL , t52.b52, t52.x52 ] COUNT
| | | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | | table(sys.t25) [ t25.a25 NOT NULL HASHCOL , t25.b25, t25.x25 ] COUNT
| | | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | | table(sys.t47) [ t47.a47 NOT NULL HASHCOL , t47.b47, t47.x47 ] COUNT
| | | | | | | | | | | | | | ) [ ],
| | | | | | | | | | | | | | table(sys.t29) [ t29.a29 NOT NULL HASHCOL , t29.b29, t29.x29 ] COUNT
| | | | | | | | | | | | | ) [ t22.b22 = t32.a32 NOT NULL HASHCOL ],
| | | | | | | | | | | | | table(sys.t55) [ t55.a55 NOT NULL HASHCOL , t55.b55, t55.x55 ] COUNT
| | | | | | | | | | | | ) [ t20.a20 NOT NULL HASHCOL = t55.b55, t55.a55 NOT NULL HASHCOL = t9.b9 ],
| | | | | | | | | | | | table(sys.t43) [ t43.a43 NOT NULL HASHCOL , t43.b43, t43.x43 ] COUNT
| | | | | | | | | | | ) [ t43.a43 NOT NULL HASHCOL = t29.b29, t22.a22 NOT NULL HASHCOL = t43.b43 ],
| | | | | | | | | | | table(sys.t61) [ t61.a61 NOT NULL HASHCOL , t61.b61, t61.x61 ] COUNT
| | | | | | | | | | ) [ t61.a61 NOT NULL HASHCOL = t47.b47 ],
| | | | | | | | | | table(sys.t2) [ t2.a2 NOT NULL HASHCOL , t2.b2, t2.x2 ] COUNT
| | | | | | | | | ) [ t2.a2 NOT NULL HASHCOL = t61.b61, t52.a52 NOT NULL HASHCOL = t2.b2 ],
| | | | | | | | | table(sys.t38) [ t38.a38 NOT NULL HASHCOL , t38.b38, t38.x38 ] COUNT
| | | | | | | | ) [ t25.a25 NOT NULL HASHCOL = t38.b38, t38.a38 NOT NULL HASHCOL = t52.b52 ],
| | | | | | | | table(sys.t30) [ t30.a30 NOT NULL HASHCOL , t30.b30, t30.x30 ] COUNT
| | | | | | | ) [ t32.b32 = t30.a30 NOT NULL HASHCOL , t30.b30 = t9.a9 NOT NULL HASHCOL ],
| | | | | | | table(sys.t37) [ t37.a37 NOT NULL HASHCOL , t37.b37, t37.x37 ] COUNT
| | | | | | ) [ t37.a37 NOT NULL HASHCOL = t54.b54, t47.a47 NOT NULL HASHCOL = t37.b37 ],
| | | | | | table(sys.t50) [ t50.a50 NOT NULL HASHCOL , t50.b50, t50.x50 ] COUNT
| | | | | ) [ t50.a50 NOT NULL HASHCOL = t63.b63 ],
| | | | | table(sys.t21) [ t21.a21 NOT NULL HASHCOL , t21.b21, t21.x21 ] COUNT
| | | | ) [ t21.a21 NOT NULL HASHCOL = t50.b50, t29.a29 NOT NULL HASHCOL = t21.b21 ],
| | | | table(sys.t10) [ t10.a10 NOT NULL HASHCOL , t10.b10, t10.x10 ] COUNT
| | | ) [ t10.a10 NOT NULL HASHCOL = t25.b25 ],
| | | table(sys.t3) [ t3.a3 NOT NULL HASHCOL , t3.b3, t3.x3 ] COUNT
| | ) [ t10.b10 = t3.a3 NOT NULL HASHCOL , t3.b3 = t18.a18 NOT NULL HASHCOL ],
| | table(sys.t8) [ t8.a8 NOT NULL HASHCOL , t8.b8, t8.x8 ] COUNT
| ) [ t54.a54 NOT NULL HASHCOL = t8.b8, t8.a8 NOT NULL HASHCOL = t20.b20 ]
) [ t20.x20, t47.x47, t38.x38, t18.x18, t10.x10, t22.x22, t37.x37, t3.x3, t63.x63, t8.x8, t30.x30, t43.x43, t54.x54, t9.x9, t21.x21, t25.x25, t2.x2, t61.x61, t55.x55, t32.x32, t52.x52, t29.x29, t50.x50, t12.x12 ]

Comment 21262

Date: 2015-09-10 15:22:48 +0200
From: Martin van Dinther <<martin.van.dinther>>

Created attachment 351
create_64_tables_with_10_rows_each.sql

Attached file: create_64_tables_with_10_rows_each.sql (application/sql, 34794 bytes)
Description: create_64_tables_with_10_rows_each.sql

Comment 21263

Date: 2015-09-10 15:23:19 +0200
From: Martin van Dinther <<martin.van.dinther>>

Created attachment 352
drop_64_tables.sql

Attached file: drop_64_tables.sql (application/sql, 1042 bytes)
Description: drop_64_tables.sql

Comment 21269

Date: 2015-09-17 16:45:39 +0200
From: Martin van Dinther <<martin.van.dinther>>

Created attachment 354
explain_query_3809.out

Also adding the explain output of the query

Attached file: explain_query_3809.out (text/plain, 45025 bytes)
Description: explain_query_3809.out

Comment 21270

Date: 2015-09-17 17:20:17 +0200
From: Martin van Dinther <<martin.van.dinther>>

Additional issue is that after getting the
GDK reported error.
HEAPalloc: Insufficient space for HEAP of 8000000000 bytes.
the server does NOT remove/purge the already created karge intermediate bat files and thus the disk remains full after running this query.
In my case the file monetdb5/dbfarm/demo/bat/20 is 7.3 GB !

Even after disconnecting from the mserver5 and stopping mserver5 process the large file(s) still remains on disk.

Only after restarting mserver5 again it seems to purge the large obsolete data file(s).

This server error handling should be improved (IMHO) by purging all the for this MAL plan created intermediate files from disk, as when the query would have been executed successfully and the query resultset was closed.

Comment 21275

Date: 2015-09-20 21:46:04 +0200
From: @njnes

cleanout empty select operators more agressively. This results in a large join tree being reordered fully, solving the crossproducts.

Comment 21276

Date: 2015-09-20 22:05:28 +0200
From: MonetDB Mercurial Repository <>

Changeset b35df980282e made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=b35df980282e

Changeset description:

remove empty select during join_reorder, solves bug #3809

Comment 21458

Date: 2015-11-03 10:18:52 +0100
From: @sjoerdmullender

Jul2015 SP1 has been released.

Comment 22010

Date: 2016-04-05 12:09:20 +0200
From: Vijay Krishna <>

Is this issue and the below one same or related? The below issue is from 2014 releases and till July 2015-SP3.

https://www.monetdb.org/bugzilla/show_bug.cgi?id=3972

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant