Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when 2 multiplexed functions in MAL plan, only one is mapped correctly to bat<mod>.function primitive #3556

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2014-08-29 11:26:21 +0200
From: @swingbit
To: MonetDB5 devs <>
Version: 11.15.19 (Feb2013-SP6)
CC: @mlkersten

Last updated: 2015-08-28 13:41:52 +0200

Comment 20106

Date: 2014-08-29 11:26:21 +0200
From: @swingbit

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
Build Identifier:

This happens on Feb2013.
I am under the impression that in more recent versions where manifold was introduced, the cause of the problem is still there, although the effects are less serious.

Also, I have the feeling this can be related to bug #3361 (https://www.monetdb.org/bugzilla/show_bug.cgi?id=3361)

Consider this explain:

explain select replace(a1,a2,a3), id
from (
select name as a1, 'a' as a2, 'A' as a3, id as id
from sys.functions
) as x;

+---------------------------------------------------------------+
| mal |
+===============================================================+
| function user.s3_1{autoCommit=true}(A0:str,A1:str):void; |
| X_18:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_25 := nil:bat[:oid,:int]; |
| barrier X_48 := language.dataflow(); |
| X_4 := sql.mvc(); |
| X_5:bat[:oid,:oid] := sql.tid(X_4,"sys","functions"); |
| X_8 := sql.bind(X_4,"sys","functions","name",0); |
| (X_11,r1_11) := sql.bind(X_4,"sys","functions","name",2); |
| X_14 := sql.bind(X_4,"sys","functions","name",1); |
| X_16 := sql.delta(X_8,X_11,r1_11,X_14); |
| X_17 := algebra.leftfetchjoin(X_5,X_16); |
| X_18:bat[:oid,:str] := batstr.replace(X_17,A0,A1); |
| X_19 := sql.bind(X_4,"sys","functions","id",0); |
| (X_21,r1_25) := sql.bind(X_4,"sys","functions","id",2); |
| X_23 := sql.bind(X_4,"sys","functions","id",1); |
| X_24 := sql.delta(X_19,X_21,r1_25,X_23); |
| X_25 := algebra.leftfetchjoin(X_5,X_24); |
| language.pass(X_5); |
| exit X_48; |
| X_26 := sql.resultSet(2,1,X_18); |
| sql.rsColumn(X_26,"sys.L1","L1","varchar",0,0,X_18); |
| sql.rsColumn(X_26,"sys.x","id","int",32,0,X_25); |
| X_35 := io.stdout(); |
| sql.exportResult(X_35,X_26); |
| end s3_1; |
+---------------------------------------------------------------+

Now the same query, where the only difference is "id" -> "id + 1" in the subquery:

explain select replace(a1,a2,a3), id
from (
select name as a1, 'a' as a2, 'A' as a3, id + 1 as id
from sys.functions
) as x;

+-----------------------------------------------------------------+
| mal |
+=================================================================+
| function user.s4_1{autoCommit=true}(A0:str,A1:str,A2:lng):void; |
| X_18:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_25 := nil:bat[:oid,:int]; |
| X_26:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_27:bat[:oid,:str] := nil:bat[:oid,:str]; |
| barrier X_56 := language.dataflow(); |
| X_5 := sql.mvc(); |
| X_6:bat[:oid,:oid] := sql.tid(X_5,"sys","functions"); |
| X_9 := sql.bind(X_5,"sys","functions","name",0); |
| (X_12,r1_12) := sql.bind(X_5,"sys","functions","name",2); |
| X_15 := sql.bind(X_5,"sys","functions","name",1); |
| X_17 := sql.delta(X_9,X_12,r1_12,X_15); |
| X_18:bat[:oid,:str] := algebra.leftfetchjoin(X_6,X_17); |
| X_19 := sql.bind(X_5,"sys","functions","id",0); |
| (X_21,r1_21) := sql.bind(X_5,"sys","functions","id",2); |
| X_23 := sql.bind(X_5,"sys","functions","id",1); |
| X_24 := sql.delta(X_19,X_21,r1_21,X_23); |
| X_25 := algebra.leftfetchjoin(X_6,X_24); |
| X_26:bat[:oid,:str] := algebra.project(X_25,A0); |
| X_27:bat[:oid,:str] := algebra.project(X_25,A1); |
| language.pass(X_6); |
| exit X_56; |
| X_69:bat[:str,:oid] := bat.reverse(X_18); |
| X_71 := bat.new(nil:oid,nil:str); |
| barrier (X_75,X_76) := iterator.new(X_18); |
| X_78 := algebra.fetch(X_26,X_75); |
| X_80 := algebra.fetch(X_27,X_75); |
| X_82 := str.replace(X_76,X_78,X_80); |
| X_83 := algebra.fetch(X_69,X_75); |
| bat.insert(X_71,X_83,X_82); |
| redo (X_75,X_76) := iterator.next(X_18); |
| exit (X_75,X_76); |
| X_18:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_26:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_27:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_69:bat[:str,:oid] := nil:bat[:str,:oid]; |
| X_28:bat[:oid,:str] := X_71; |
| X_31 := batcalc.lng(X_25); |
| X_32:bat[:oid,:lng] := batcalc.+(X_31,A2); |
| X_33 := sql.resultSet(2,1,X_28); |
| sql.rsColumn(X_33,"sys.L1","L1","varchar",0,0,X_28); |
| sql.rsColumn(X_33,"sys.x","id","bigint",64,0,X_32); |
| X_43 := io.stdout(); |
| sql.exportResult(X_43,X_33); |
| end s4_1; |
+-----------------------------------------------------------------+

The main difference is: batstr.replace has become a loop over str.replace.
What I do not understand, is how the "+ 1" is related to this change. I have the feeling it comes from the additional constant A2 passed to the function as a parameter, but I can't figure out the exact reason.

Why is this bad?

  1. Obviously, because the loop can in principle be much slower than the bat version
  2. Potentially more serious, because the loop, which involves bat.new and bat.insert will not be recognised as reusable by the commonTerms optimizer, if the query becomes bigger and repeats the same pattern. I could verify this, with generated query which have such repeated patterns at the very beginning and become very slow. Notice that eliminating redundant computations can be crucial. When the non-recognised pattern happens at the beginning of long subqueries, potentially very long portions that could be reused will be in fact recomputed.

Reproducible: Always

Comment 20112

Date: 2014-09-01 15:29:55 +0200
From: @swingbit

Update: the choice of expanding mal.multiplex("str","replace", .., ..) into either batstr.replace() or into the loop is done by the inline optimizer.

I've used the embedded debugger on both queries.

On the version without "+ 1", opt_inline turns mal.multiplex("str","replace", .., ..) into batstr.replace.

On the version with "+ 1", opt_inline turns mal.multiplex("str","replace", .., ..) into a loop.

I still can't see for what reason though.

Comment 20113

Date: 2014-09-01 15:52:40 +0200
From: @swingbit

Sorry, previous comment is wrong.

I wrote about the inline optimizer, but I meant the REMAP optimizer.

In the first query, it turns the multiplex into a batstr call. In the second one, it leaves it as multiplex (which then is turned into a loop by the multiplex optimizer).

The question remains: why does the remap optimizer fail to create the batstr call?

Comment 20114

Date: 2014-09-01 16:38:10 +0200
From: @swingbit

UPDATE: Te problem has nothing to do with constants.

The following is also generating a loop on str.replace, just because of the addition of abs() on the subquery.

sql>explain
more>select replace(a1,a2,a3), id
more>from (
more> select name as a1, 'a' as a2, 'A' as a3, abs(id) as id
more> from sys.functions
more>) as x;
+---------------------------------------------------------------+
| mal |
+===============================================================+
| function user.s2_1{autoCommit=true}(A0:str,A1:str):void; |
| X_17:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_24 := nil:bat[:oid,:int]; |
| X_25:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_26:bat[:oid,:str] := nil:bat[:oid,:str]; |
| barrier X_54 := language.dataflow(); |
| X_4 := sql.mvc(); |
| X_5:bat[:oid,:oid] := sql.tid(X_4,"sys","functions"); |
| X_8 := sql.bind(X_4,"sys","functions","name",0); |
| (X_11,r1_11) := sql.bind(X_4,"sys","functions","name",2); |
| X_14 := sql.bind(X_4,"sys","functions","name",1); |
| X_16 := sql.delta(X_8,X_11,r1_11,X_14); |
| X_17:bat[:oid,:str] := algebra.leftfetchjoin(X_5,X_16); |
| X_18 := sql.bind(X_4,"sys","functions","id",0); |
| (X_20,r1_20) := sql.bind(X_4,"sys","functions","id",2); |
| X_22 := sql.bind(X_4,"sys","functions","id",1); |
| X_23 := sql.delta(X_18,X_20,r1_20,X_22); |
| X_24 := algebra.leftfetchjoin(X_5,X_23); |
| X_25:bat[:oid,:str] := algebra.project(X_24,A0); |
| X_26:bat[:oid,:str] := algebra.project(X_24,A1); |
| language.pass(X_5); |
| exit X_54; |
| X_67:bat[:str,:oid] := bat.reverse(X_17); |
| X_69 := bat.new(nil:oid,nil:str); |
| barrier (X_73,X_74) := iterator.new(X_17); |
| X_76 := algebra.fetch(X_25,X_73); |
| X_78 := algebra.fetch(X_26,X_73); |
| X_80 := str.replace(X_74,X_76,X_78); |
| X_81 := algebra.fetch(X_67,X_73); |
| bat.insert(X_69,X_81,X_80); |
| redo (X_73,X_74) := iterator.next(X_17); |
| exit (X_73,X_74); |
| X_17:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_25:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_26:bat[:oid,:str] := nil:bat[:oid,:str]; |
| X_67:bat[:str,:oid] := nil:bat[:str,:oid]; |
| X_27:bat[:oid,:str] := X_69; |
| X_30:bat[:oid,:int] := batcalc.abs(X_24); |
| X_31 := sql.resultSet(2,1,X_27); |
| sql.rsColumn(X_31,"sys.L1","L1","varchar",0,0,X_27); |
| sql.rsColumn(X_31,"sys.x","id","int",32,0,X_30); |
| X_41 := io.stdout(); |
| sql.exportResult(X_41,X_31); |
| end s2_1; |
+---------------------------------------------------------------+
44 tuples (3.339ms)
sql>

It seems that when two candidates for conversion to bat operations are present, only the second gets mapped right (batcalc.abs() in this case), while the first doesn't (batstr.replace() in this case).

Comment 20246

Date: 2014-10-04 14:17:34 +0200
From: @mlkersten

Bug has been added to test suite.
The code appears to be properly generated for all three cases in OCT2014 branch.

Comment 20247

Date: 2014-10-04 14:19:16 +0200
From: MonetDB Mercurial Repository <>

Changeset 506bb6a962f7 made by Martin Kersten mk@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=506bb6a962f7

Changeset description:

Manifold code generation test
The code seems properly generated for Bug #3556

Comment 20254

Date: 2014-10-06 10:07:06 +0200
From: @swingbit

Martin,

I don't quite agree that the problem is solved in Oct2014.

The loop is no longer generated because manifold takes care of it internally.

But the point is that the specialised bat version of the function should have been called (batreplace), not the manifold call.

Am I wrong?

Comment 20255

Date: 2014-10-06 10:21:44 +0200
From: @swingbit

Compare these two:

In the first case:

explain select replace(a1,a2,a3), id
from (
select name as a1, 'a' as a2, 'A' as a3, id as id
from sys.functions
) as x;

we get batstr.replace(X_17,A0,A1) <- (bat,constant,constant)

Then, just by adding an abs() to id:

select replace(a1,a2,a3), id
from (
select name as a1, 'a' as a2, 'A' as a3, abs(id) as id
from sys.functions
) as x;

we get a (bat,bat,bat) call <- hence the manifold call, because no batreplace(bat,bat,bat) is available.

Why is that? How is the abs() on the id column triggering this?

Comment 20256

Date: 2014-10-06 11:02:19 +0200
From: @mlkersten

Roberto,

Look at the latest (shortened) test output. It illustrates that two algebra.project() calls precede it.
Indeed, the ABS() over a scalar value (or argument) would not require it.

Comment 20257

Date: 2014-10-06 11:12:39 +0200
From: @swingbit

Indeed, I see what is happening.

So initially I had thought this issue was related to the remap optimizer, but it comes instead from how the constants are expanded into bat constants.

Now that the real cause is clear, the title of this bug report is no longer accurate.

Personally, I would think those unnecessary constant expansion should be removed, but I leave it up to you what to do with this bug report. Feel free to change the title as appropriate, or close it if you don't think it should be fixed.

Roberto

Comment 20988

Date: 2015-07-09 22:20:44 +0200
From: @mlkersten

The constant propagation optimizer has been re-enabled and the commonTerms optimizer now performs a much better job in removing duplicate sub-expressions.

Bug closed for the time being. Patches will appear in future release.

Comment 21191

Date: 2015-08-28 13:41:52 +0200
From: @sjoerdmullender

Jul2015 has been released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant