New handling of delta tables hurts badly reusage of bats #6261

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

Date: 2017-04-13 14:02:07 +0200
From: @swingbit
To: SQL devs <>
Version: 11.25.15 (Dec2016-SP3)
CC: @mlkersten, @njnes

Last updated: 2017-10-26 14:01:35 +0200

Comment 25218

Date: 2017-04-13 14:02:07 +0200
From: @swingbit

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36
Build Identifier:

Here's a minimal example:

CREATE table t(a int);
START TRANSACTION;
INSERT INTO t VALUES (1);

explain
SELECT *
FROM t as t1,
t as t2
;

ROLLBACK;

The output of Jun2016 looks like this:

C_2:bat[:oid] := sql.tid(X_1,"spinque","t");
X_5:bat[:int] := sql.bind(X_1,"spinque","t","a",0);
(C_8:bat[:oid],r1_8:bat[:int]) := sql.bind(X_1,"spinque","t","a",2);
X_11:bat[:int] := sql.bind(X_1,"spinque","t","a",1);
X_13 := sql.delta(X_5,C_8,r1_8,X_11);
X_14 := algebra.projection(C_2,X_13);
(C_15,r1_22) := algebra.crossproduct(X_14,X_14);

Notice that, as you would expect, the final crossproduct happens between two instances of X_14.

The output of Dec2016 looks like this:

C_4:bat[:oid] := sql.tid(X_3,"spinque","t");
X_7:bat[:int] := bat.new(nil:int);
X_10:bat[:oid] := bat.new(nil:oid);
r1_11:bat[:int] := bat.new(nil:int);
X_13:bat[:int] := sql.bind(X_3,"spinque","t","a",1:int);
X_15 := sql.delta(X_7,X_10,r1_11,X_13);
X_16 := algebra.projection(C_4,X_15);
X_18:bat[:int] := bat.new(nil:int);
X_19:bat[:oid] := bat.new(nil:oid);
r1_20:bat[:int] := bat.new(nil:int);
X_22 := sql.delta(X_18,X_19,r1_20,X_13);
X_23 := algebra.projection(C_4,X_22);
(X_24,r1_25) := algebra.crossproduct(X_16,X_23);

Notice that now the final crossproduct is using two different variables.
The content of X_16 and X_23 is obviously the same, but the commonTerms optimizer can no longer simplify them because the application of delta tables is now based on bat.new(), which is not reusable.
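To make the effect concrete, here is a minimal sketch of a commonTerms-style pass over a simplified copy of the Dec2016 plan. It is a toy model, not MonetDB's optimizer: the tuple encoding of instructions, the `common_terms` function and the `SIDE_EFFECTS` set are all assumptions made for illustration.

```python
# Toy common-subexpression elimination over a MAL-like plan.
# An instruction is (result_vars, op, args). All names are hypothetical.

SIDE_EFFECTS = frozenset({"bat.new"})  # assumption: ops that are never merged

def common_terms(plan, side_effects=SIDE_EFFECTS):
    alias = {}  # eliminated variable -> surviving variable
    seen = {}   # (op, canonical args) -> results of the first occurrence
    out = []
    for results, op, args in plan:
        # Rewrite arguments to the surviving names before comparing.
        args = tuple(alias.get(a, a) for a in args)
        key = (op, args)
        if op not in side_effects and key in seen:
            # Same operation on the same inputs: reuse the earlier results.
            alias.update(zip(results, seen[key]))
            continue
        if op not in side_effects:
            seen[key] = results
        out.append((results, op, args))
    return out

dec2016 = [
    (("C_4",), "sql.tid", ("X_3", "t")),
    (("X_7",), "bat.new", ("nil:int",)),
    (("X_10",), "bat.new", ("nil:oid",)),
    (("r1_11",), "bat.new", ("nil:int",)),
    (("X_13",), "sql.bind", ("X_3", "t", "a", "1")),
    (("X_15",), "sql.delta", ("X_7", "X_10", "r1_11", "X_13")),
    (("X_16",), "algebra.projection", ("C_4", "X_15")),
    (("X_18",), "bat.new", ("nil:int",)),
    (("X_19",), "bat.new", ("nil:oid",)),
    (("r1_20",), "bat.new", ("nil:int",)),
    (("X_22",), "sql.delta", ("X_18", "X_19", "r1_20", "X_13")),
    (("X_23",), "algebra.projection", ("C_4", "X_22")),
    (("X_24", "r1_25"), "algebra.crossproduct", ("X_16", "X_23")),
]

# bat.new blocks merging: every sql.delta sees "different" inputs.
kept = common_terms(dec2016)
print(len(kept), kept[-1][2])

# For contrast only: pretend bat.new were reusable.
merged = common_terms(dec2016, side_effects=frozenset())
print(len(merged), merged[-1][2])
```

In this sketch the first call keeps all 13 instructions and the crossproduct still takes X_16 and X_23, because each bat.new result is a fresh variable and the two sql.delta calls never compare equal. The second call is shown only for contrast: once the empty inputs are shared, the delta and projection instructions collapse and the crossproduct takes X_16 twice.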

Please don't stop at this simple example. In a more complex query that, for example, reuses the same views in multiple places, this leads to those views being computed over and over.

This is currently a complete show-stopper for us. All our plans become several times longer and query times explode. As much as I like Dec2016 in many other ways, we will have to go back to Jun2016.

If I may raise a (hopefully constructive) concern, I keep seeing what look to me like conflicting goals: on the one hand, much effort is put into a simple assembly-like language, which is perfect for optimizations like commonTerms (crucial in real applications!); on the other hand, there are choices that clearly go in the opposite direction, first and foremost the use of operators with side effects (e.g. new, in-place append, packIncrement), which make many optimizations almost useless. See also https://www.monetdb.org/bugzilla/show_bug.cgi?id=3992, which has never been tackled.
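As an illustration of why such operators block reuse, here is a small sketch in which a Python list stands in for a bat. `new_bat` and `append_in_place` are hypothetical stand-ins for operators like bat.new and in-place append, not MonetDB functions.

```python
# Toy "bat" as a Python list. Both helpers are hypothetical stand-ins.

def new_bat():
    # Each call yields a fresh, independent container.
    return []

def append_in_place(bat, value):
    # Mutates its argument instead of returning a new value.
    bat.append(value)
    return bat

# Original plan: two independent empty bats, one of them appended to.
x = new_bat()
y = new_bat()
append_in_place(x, 1)
separate = (x, y)

# Plan after merging the two new_bat() calls as if they were common terms.
x = new_bat()
y = x
append_in_place(x, 1)
merged = (x, y)

print(separate)  # ([1], [])
print(merged)    # ([1], [1])
```

Once an in-place append exists, the two "identical" calls are no longer interchangeable, so an optimizer has to treat the constructor as non-reusable. With purely functional operators that restriction would not be needed.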

Reproducible: Always

$ mserver5 --version
MonetDB 5 server v11.25.12 (64-bit, 128-bit integers)
This is an unreleased version
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2017 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Found 15.6GiB available memory, 8 available cpu cores
Libraries:
libpcre: 8.40 2017-01-11 (compiled with 8.40)
openssl: OpenSSL 1.0.2j 26 Sep 2016 (compiled with OpenSSL 1.0.2j-fips 26 Sep 2016)
libxml2: 2.9.3 (compiled with 2.9.3)
Compiled by: roberto@photon.hq.spinque.com (x86_64-unknown-linux-gnu)
Compilation: gcc -g -Werror -Wall -Wextra -W -Werror-implicit-function-declaration -Wpointer-arith -Wdeclaration-after-statement -Wundef -Wformat=2 -Wno-format-nonliteral -Winit-self -Winvalid-pch -Wmissing-declarations -Wmissing-format-attribute -Wmissing-prototypes -Wold-style-definition -Wpacked -Wunknown-pragmas -Wvariadic-macros -fstack-protector-all -Wstack-protector -Wpacked-bitfield-compat -Wsync-nand -Wjump-misses-init -Wmissing-include-dirs -Wlogical-op -Wunreachable-code
Linking : /usr/bin/ld -m elf_x86_64

Comment 25246

Date: 2017-04-19 09:58:27 +0200
From: @swingbit

Digging a bit into it, I found that optimizer.emptybind() turns sql.emptybind() calls into bat.new() calls.

Removing this optimizer gives:

C_4:bat[:oid] := sql.tid(X_3,"spinque","t");
X_7:bat[:int] := sql.emptybind(X_3,"spinque","t","a",0:int);
(C_10:bat[:oid],r1_11:bat[:int]) := sql.emptybind(X_3,"spinque","t","a",2:int);
X_13:bat[:int] := sql.bind(X_3,"spinque","t","a",1:int);
X_15 := sql.delta(X_7,C_10,r1_11,X_13);
X_16 := algebra.projection(C_4,X_15);
X_18:bat[:int] := sql.emptybind(X_3,"spinque","t","a",0:int);
(C_19:bat[:oid],r1_20:bat[:int]) := sql.emptybind(X_3,"spinque","t","a",2:int);
X_22 := sql.delta(X_18,C_19,r1_20,X_13);
X_23 := algebra.projection(C_4,X_22);
(X_24,r1_25) := algebra.crossproduct(X_16,X_23);

However, sql.emptybind still turns out not to be reusable, which seems weird to me.
I think the following is missing:

diff -r 80c50983077c monetdb5/optimizer/opt_support.c
--- a/monetdb5/optimizer/opt_support.c Tue Mar 28 15:09:09 2017 +0200
+++ b/monetdb5/optimizer/opt_support.c Wed Apr 19 09:51:00 2017 +0200
@@ -437,6 +437,8 @@
               if (getFunctionId(p) == bindRef) return FALSE;
               if (getFunctionId(p) == bindidxRef) return FALSE;
               if (getFunctionId(p) == binddbatRef) return FALSE;
+              if (getFunctionId(p) == emptybindRef) return FALSE;
+              if (getFunctionId(p) == emptybindidxRef) return FALSE;
               if (getFunctionId(p) == columnBindRef) return FALSE;
               if (getFunctionId(p) == copy_fromRef) return FALSE;
               /* assertions are the end-point of a flow path */

Could you please comment on the purpose of optimizer.emptybind()?

Comment 25247

Date: 2017-04-19 10:00:20 +0200
From: @swingbit

Forgot to mention, though it was probably clear already: by removing optimizer.emptybind and adding those two lines to the hasSideEffects() function, plans are back to normal.

Comment 25382

Date: 2017-06-13 16:58:58 +0200
From: @swingbit

It turns out that removing optimizer.emptybind() from the default pipeline (together with the little patch to the hasSideEffects() function) did solve the plan explosion issue, but I now see that in some cases the resulting plan is not correct.

However, reintroducing it gives me unacceptable plans.

Getting stuck with older versions is no real solution.

I'm a bit frustrated with no good option forward.

Does anyone have a little time to look at this and comment?

Comment 25571

Date: 2017-08-13 20:28:21 +0200
From: MonetDB Mercurial Repository <>

Changeset 7018c1058588 made by Martin Kersten mk@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=7018c1058588

Changeset description:

Avoid removal of sql.bind operations
Empty persistent table operations should only be removed when
there are no other updates within the same transaction

addressing bug #6261

Comment 25572

Date: 2017-08-13 20:31:15 +0200
From: @mlkersten

Reduced the aggressive replacement of empty-bat-producing operations
in the empty-bind operations to cope with the issue reported in the first example.

  X_4 := sql.mvc();
  C_5:bat[:oid] := sql.tid(X_4, "sys", "t");
  X_8:bat[:int] := sql.bind(X_4, "sys", "t", "a", 0:int);
  (C_13:bat[:oid], X_14:bat[:int]) := sql.bind(X_4, "sys", "t", "a", 2:int);
  X_11:bat[:int] := sql.bind(X_4, "sys", "t", "a", 1:int);
  X_16 := sql.delta(X_8, C_13, X_14, X_11);
  X_17 := algebra.projection(C_5, X_16);
  (X_25, X_26) := algebra.crossproduct(X_17, X_17);
  X_28 := algebra.projection(X_26, X_17);
  X_27 := algebra.projection(X_25, X_17);