ENH: improved multiValueBind #1328

rPraml · 2018-03-02T02:15:15Z

This fixes a bug: even if multiValueBind is supported by the platform, we cannot assume in general that it can be used on ID columns.

If the ID column is of type string or integer it works, but if the ID column is of a type that multiValueBind does not support, you'll get an error.

It also re-adds my existing implementation for SQL Server and Oracle.
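
As a rough illustration of the fallback described above, here is a minimal sketch. The class and method names are hypothetical and not taken from the PR; the idea is only that an unsupported column type yields `null` for the array type, and the caller then falls back to a plain IN list.

```java
import java.sql.Types;
import java.util.Collections;

public class MultiValueFallbackSketch {

  /**
   * Hypothetical mapping from a JDBC type code to a database array type name.
   * Returns null when the type is not supported for multi-value binding.
   */
  static String arrayTypeFor(int jdbcType) {
    switch (jdbcType) {
      case Types.INTEGER:
        return "integer";
      case Types.BIGINT:
        return "bigint";
      case Types.VARCHAR:
        return "varchar";
      default:
        return null; // unsupported: caller must fall back
    }
  }

  /** Builds the predicate: one array parameter if supported, else one '?' per value. */
  static String inClause(String column, int jdbcType, int valueCount) {
    if (arrayTypeFor(jdbcType) != null) {
      return column + " = any(?)";
    }
    String placeholders = String.join(",", Collections.nCopies(valueCount, "?"));
    return column + " in (" + placeholders + ")";
  }
}
```

With this sketch, an integer ID column produces `t0.id = any(?)`, while an ID column of an unsupported type produces the ordinary `t0.id in (?,?,?)` form instead of failing at bind time.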

rbygrave · 2018-03-02T03:43:49Z

Have you checked the query plans that Oracle and SQL Server produce? Can you post them up?

rPraml · 2018-03-02T15:33:17Z

Example queries for Oracle:

16:28:51.864 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where t0.cretime in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ; --bind(Array[1100]={2018-03-02 16:28:51.555,1970-01-01 01:00:01.234,...},4)
16:28:51.865 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[22078] rows[1] predicates[t0.cretime in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ] bind[Array[1100]={2018-03-02 16:28:51.555,1970-01-01 01:00:01.234,...},4]
16:28:51.881 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where t0.id in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ; --bind(Array[1100]={1,2,3,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,...},4)
16:28:51.882 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[12871] rows[3] predicates[t0.id in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ] bind[Array[1100]={1,2,3,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,...},4]
16:28:51.887 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where 1=1 and t0.id > ? ; --bind(0)
16:28:51.888 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[5370] rows[5] predicates[1=1 and t0.id > ? ] bind[0]
16:28:51.907 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.name c2, t0.smallnote c3, t0.anniversary c4, t0.cretime c5, t0.updtime c6, t0.version c7, t0.billing_address_id c8, t0.shipping_address_id c9 from o_customer t0 where t0.name in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ; --bind(Array[1100]={Rob,Fiona,FooBar2,FooBar3,FooBar4,FooBar5,...},4)
16:28:51.908 [main] DEBUG io.ebean.SUM - FindMany type[Customer] origin[RwAkj.A.A] exeMicros[17590] rows[2] predicates[t0.name in (SELECT * FROM TABLE (SELECT ? FROM DUAL))  and t0.id <= ? ] bind[Array[1100]={Rob,Fiona,FooBar2,FooBar3,FooBar4,FooBar5,...},4]

And for SQL Server:

16:32:33.554 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where t0.cretime in (SELECT * FROM ?)  and t0.id <= ? ; --bind(Array[2200]={2018-03-02 16:32:33.199,1970-01-01 01:00:01.234,...},4)
16:32:33.554 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[29471] rows[1] predicates[t0.cretime in (SELECT * FROM ?)  and t0.id <= ? ] bind[Array[2200]={2018-03-02 16:32:33.199,1970-01-01 01:00:01.234,...},4]
16:32:33.564 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where t0.id in (SELECT * FROM ?)  and t0.id <= ? ; --bind(Array[2200]={1,2,3,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,...},4)
16:32:33.565 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[6361] rows[3] predicates[t0.id in (SELECT * FROM ?)  and t0.id <= ? ] bind[Array[2200]={1,2,3,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,...},4]
16:32:33.570 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.order_date c2, t0.ship_date c3, t1.name c4, t0.cretime c5, t0.updtime c6, t0.kcustomer_id c7 from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  where 1=1 and t0.id > ? ; --bind(0)
16:32:33.571 [main] DEBUG io.ebean.SUM - FindMany type[Order] origin[BmouL7.A.A] exeMicros[5719] rows[5] predicates[1=1 and t0.id > ? ] bind[0]
16:32:33.596 [main] DEBUG io.ebean.SQL - select t0.id c0, t0.status c1, t0.name c2, t0.smallnote c3, t0.anniversary c4, t0.cretime c5, t0.updtime c6, t0.version c7, t0.billing_address_id c8, t0.shipping_address_id c9 from o_customer t0 where t0.name in (SELECT * FROM ?)  and t0.id <= ? ; --bind(Array[2200]={Rob,Fiona,FooBar2,FooBar3,FooBar4,FooBar5,...},4)
16:32:33.597 [main] DEBUG io.ebean.SUM - FindMany type[Customer] origin[RwAkj.A.A] exeMicros[21726] rows[2] predicates[t0.name in (SELECT * FROM ?)  and t0.id <= ? ] bind[Array[2200]={Rob,Fiona,FooBar2,FooBar3,FooBar4,FooBar5,...},4]

rPraml · 2018-03-02T15:37:22Z

src/main/java/io/ebeaninternal/server/persist/platform/OracleHelpImpl.txt

+import java.sql.Connection;
+import java.sql.SQLException;
+
+import oracle.jdbc.OracleConnection;


NOTE: this is a text file, as oracle.jdbc.OracleConnection is not publicly available. So I included the class in binary form:
https://github.com/ebean-orm/ebean/pull/1328/files#diff-1d0fda0b6639c184544f2e59c5ae5036

Do you think this hack is acceptable?

rbygrave · 2018-03-09T01:19:04Z

Sorry, I actually meant the explain plan for the queries ... that the explain plan shows them hitting the index. We need to check and confirm ...

rPraml · 2018-03-09T14:08:19Z

Did I get it right, that I should execute "explain select * from xxx where ..." manually and check which indices are hit? Or is there a feature in ebean that dumps the "explain plan" for each query?

rbygrave · 2018-03-11T23:51:48Z

> Did I get it right, that I should execute "explain select * from xxx where ..." manually and check which indices are hit?

Yes. Specifically we want to compare the 2 explain plans ... to confirm that they are effectively the same. That we still hit the indexes etc and yes we need to do this manually at the moment.

(Yes, there is a desire and plan to automate the collection of explain plans as part of a performance monitoring tool).

rPraml

@rbygrave I analyzed the queries for Postgres and concluded that the query optimizer produces the same query plan for both forms (at least for Postgres).

select t0.id, t0.status, t0.order_date, t0.ship_date, t1.name, t0.cretime, t0.updtime, 
  t0.kcustomer_id from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  
where order_date in (?, ?, <REMOVED~1000> ?, ? )  and t0.id <= ?

and

select t0.id, t0.status, t0.order_date, t0.ship_date, t1.name, t0.cretime, t0.updtime, 
  t0.kcustomer_id from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id  
where order_date = any(?) and t0.id <= ?

result in the SAME query plan:

Hash Join  (cost=22.16..624.38 rows=422 width=134)
  Hash Cond: (t0.kcustomer_id = t1.id)
  ->  Bitmap Heap Scan on o_order t0  (cost=7.43..604.34 rows=422 width=36)
        Recheck Cond: (id <= 4)
        Filter: (order_date = ANY ('{2018-03-27,2018-03-28,2018-03-29 ... 31,2018-04-01,2018-04-02}'::date[]))
        ->  Bitmap Index Scan on pk_o_order  (cost=0.00..7.32 rows=423 width=0)
              Index Cond: (id <= 4)
  ->  Hash  (cost=12.10..12.10 rows=210 width=102)
        ->  Seq Scan on o_customer t1  (cost=0.00..12.10 rows=210 width=102)

I also spot-checked the plans for other DBMSs and concluded that they are OK.
I added a quick & dirty mechanism to log the plans in ebean. See here:
FOCONIS@172bc44
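
For reference, the `= any(?)` form compared above binds all values as a single JDBC array parameter. The following is a minimal sketch using only standard JDBC calls (`Connection.createArrayOf`, `PreparedStatement.setArray`); the class name and the shortened select list are illustrative assumptions, not code from this PR.

```java
import java.sql.Array;
import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PostgresAnyBindSketch {

  // Same predicate shape as the second query above, shortened to one selected column.
  static final String SQL =
      "select t0.id from o_order t0 where t0.order_date = any(?) and t0.id <= ?";

  /** Binds all dates as ONE array parameter instead of one '?' per value. */
  static PreparedStatement prepare(Connection connection, Date[] orderDates, long maxId)
      throws SQLException {
    PreparedStatement statement = connection.prepareStatement(SQL);
    // "date" is the Postgres element type name for the array.
    Array dates = connection.createArrayOf("date", orderDates);
    statement.setArray(1, dates);
    statement.setLong(2, maxId);
    return statement;
  }
}
```

The statement text stays the same regardless of how many dates are bound, whereas the IN-list form needs a different SQL string for every distinct value count.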

rPraml · 2018-03-09T14:35:06Z

src/main/java/io/ebeaninternal/server/core/InternalConfiguration.java

+      case SQLSERVER16:
+      case SQLSERVER17:
+      case SQLSERVER:
+        return new SqlServerMultiValueBind();


Note to @rbygrave - please check https://github.com/ebean-orm/ebean/pull/1328/files#diff-78b982f628a680c43549c680aaadfa2aR271 if we need SQLSERVER16/17, too

rPraml · 2018-03-09T14:41:23Z

src/main/java/io/ebeaninternal/server/persist/platform/AbstractMultiValueBind.java

@@ -73,7 +90,8 @@ protected String getArrayType(int dbType) {
      case TIMESTAMP:
      case TIME_WITH_TIMEZONE:
      case TIMESTAMP_WITH_TIMEZONE:
-        return "timestamp";
+        return null; // NO: Does not work reliable due time zone issues! - Fall back to normal query


Timestamps don't work reliably due to time zone issues. It would require performing a time zone conversion before putting the timestamps into the multi-value data structure.
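
To illustrate the kind of mismatch meant here, the sketch below renders the same instant as a wall-clock timestamp in two zones. This is only an illustration with a hypothetical class name; how a particular driver converts array elements is an assumption and is not confirmed in this thread. The bind log earlier in the thread shows the value `1970-01-01 01:00:01.234`, which is presumably epoch millisecond 1234 rendered in a UTC+1 zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampZoneSketch {

  /** Renders an instant as a zone-less wall-clock timestamp in the given zone. */
  static LocalDateTime wallClock(Instant instant, ZoneId zone) {
    return LocalDateTime.ofInstant(instant, zone);
  }

  public static void main(String[] args) {
    Instant instant = Instant.ofEpochMilli(1234);
    // The same instant yields two different zone-less timestamps.
    System.out.println(wallClock(instant, ZoneOffset.UTC));        // 1970-01-01T00:00:01.234
    System.out.println(wallClock(instant, ZoneOffset.ofHours(1))); // 1970-01-01T01:00:01.234
  }
}
```

If the values inside the array are converted with a different zone than a normally bound timestamp parameter, the comparison in the database is off by the zone offset, so the values would have to be normalised before being placed in the array.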

rbygrave · 2018-03-28T12:24:48Z

Right sorry, I already knew that Postgres ANY was good - I checked that before allowing that in and we have been using it to good effect in Postgres for a while now.

What I really want to do is absolutely confirm that the query plans for Oracle and SQL Server (which are the 2 platforms this change wants to add this support for right) are good. So I need to see actual query plans - we need to be sure.

rPraml · 2018-03-28T21:01:21Z

> So I need to see actual query plans - we need to be sure.

Good that you are so persistent ;)

I checked the query plans: on Oracle and SQL Server, plans for the normal form, where id in (?,?,?), are more efficient than plans for the multi-value form, where id in (SELECT * .... ).

I found an interesting article here: https://www.spiderstrategies.com/blog/2014-11-03-sql-server-query-type-performance.html

There is an interesting summary:

You should definitely stop using batched parameterized queries for selecting rows by ID. They were the bottom performer in every test. They should be replaced with temporary tables if you're willing to do a little work to make sure you're not hitting the create temporary table delay. If, for whatever reason, the temp table approach is not chosen, you should use the constructed query approach within a framework that prevents SQL injection attacks.

But here are the query plans for Oracle and SQL Server:

Oracle normal:

select t0.id c0, t0.name c1 from tuuid_entity t0 where t0.id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ? ) 
 
------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     1 |   139 |     1   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                 |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TUUID_ENTITY    |     1 |   139 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_TUUID_ENTITY |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("T0"."ID"=:1 OR "T0"."ID"=:2 OR "T0"."ID"=:3 OR "T0"."ID"=:4 OR 
              "T0"."ID"=:5 OR "T0"."ID"=:6 OR "T0"."ID"=:7 OR "T0"."ID"=:8 OR "T0"."ID"=:9 OR 
              "T0"."ID"=:10)
 
Note
-----
   - dynamic sampling used for this statement (level=2)

Oracle with multi value bind

select t0.id c0, t0.name c1 from tuuid_entity t0 where t0.id in (SELECT * FROM TABLE (SELECT ? FROM DUAL)) 
 
----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |     1 | 16524 |    32   (4)| 00:00:01 |
|*  1 |  HASH JOIN SEMI                     |              |     1 | 16524 |    32   (4)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                 | TUUID_ENTITY |    27 |  3753 |     2   (0)| 00:00:01 |
|   3 |   VIEW                              | VW_NSO_1     |  8168 |   127M|    29   (0)| 00:00:01 |
|   4 |    COLLECTION ITERATOR PICKLER FETCH|              |  8168 | 16336 |    29   (0)| 00:00:01 |
|   5 |     FAST DUAL                       |              |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("T0"."ID"="COLUMN_VALUE")
 
Note
-----
   - dynamic sampling used for this statement (level=2)

SQL Server normal

SQL Server Multi value bind

rPraml · 2018-03-29T08:00:53Z

What do you think about using multi-value bind for SQL Server and Oracle only for higher parameter counts (currently hard-coded to > 100)?

# Conflicts: # src/main/java/io/ebeaninternal/server/core/OrmQueryRequest.java # src/main/java/io/ebeaninternal/server/persist/Binder.java # src/test/resources/dbmigration/migrationtest/sqlserver17/1.2__dropsFor_1.1.sql # src/test/resources/dbmigration/migrationtest/sqlserver17/1.4__dropsFor_1.3.sql

rPraml · 2018-09-24T18:13:15Z

I updated this PR. I have some new information: using TVPs is not always optimal.
https://stackoverflow.com/questions/23120360/table-valued-parameters-with-estimated-number-of-rows-1
So I think it is a good strategy to switch to TVPs only if the number of parameters exceeds a limit (currently 100 for SQL Server and Oracle).
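
A minimal sketch of that threshold strategy, assuming a hypothetical helper (the class, method and constant names are illustrative and not the PR's actual code). The Oracle multi-value expression is copied from the SQL logged earlier in this thread.

```java
import java.util.Collections;

public class BindStrategySketch {

  // Assumed threshold, matching the "currently 100" mentioned above.
  static final int MULTI_VALUE_THRESHOLD = 100;

  // Oracle multi-value form as it appears in the logged SQL.
  static final String ORACLE_MULTI_VALUE = "(SELECT * FROM TABLE (SELECT ? FROM DUAL))";

  /** Uses a plain IN list up to the threshold and one array parameter above it. */
  static String inExpression(String column, int valueCount) {
    if (valueCount > MULTI_VALUE_THRESHOLD) {
      return column + " in " + ORACLE_MULTI_VALUE;
    }
    String placeholders = String.join(", ", Collections.nCopies(valueCount, "?"));
    return column + " in (" + placeholders + ")";
  }
}
```

For three values this yields `t0.id in (?, ?, ?)`, which keeps the index-friendly plan for small lists; only lists with more than 100 values switch to the single-parameter form.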

rPraml · 2018-10-25T09:36:57Z

@rbygrave updated and resolved merge conflicts, maybe you have time to review

btw: where are the travis builds?

# Conflicts: # pom.xml # src/main/java/io/ebeaninternal/server/expression/InExpression.java # src/main/java/io/ebeaninternal/server/persist/platform/PostgresMultiValueBind.java # src/main/java/io/ebeaninternal/server/query/CQueryBindCapture.java # src/test/java/org/tests/model/basic/xtra/TestInsertBatchThenUpdate.java

rbygrave · 2019-08-20T09:12:45Z

> it is a good strategy to switch to TVPs only if the number of parameters exceeds a limit (currently 100 for SQL Server and Oracle)

In OLTP applications, are we going to be binding more than 100 IDs frequently? I don't think so. Given that TVPs have worse query plans on SQL Server and Oracle, that suggests we maybe should not use them at all for SQL Server and Oracle.

Do you still want to push for this?

rbygrave · 2019-08-29T23:56:13Z

I think we are going to let this change go. Multi-value binding with Oracle and SQL Server has worse query execution plans, and I don't think it is worth having this just for > 100 bind values.

So unless you are going to push hard for this, we should close this PR and only use MVB with Postgres ANY (until such time as other DBs make the query plans as good as IN).

rPraml · 2019-08-30T08:52:34Z

Hello Rob, sorry for the late response, I was on vacation for the last few weeks. I am back in the office on Monday and will try to discuss the further plans with my team.
Background: We have a reporting/filter feature in our application, where the user can select/deselect certain values (like AutoFilter in Excel).
The values that the end user can select are often out of our control, and the resulting query may hit the parameter limit of 2100 in SQL Server.

rbygrave · 2019-09-12T10:24:17Z

Closing.

rPraml added 3 commits March 2, 2018 02:35

ENH: improved multiValueBind

ad1e976

more refactor

69d57fe

changed scope of sqlserver driver

2456c4a

FIX: MultiValueBind for oracle

3086da6

rPraml commented Mar 2, 2018

rbygrave added the needs work label Mar 9, 2018

rPraml added 4 commits March 9, 2018 15:22

Merge remote-tracking branch 'ebean/master' into multivaluebind

10043ab

# Conflicts: # src/main/java/io/ebeaninternal/server/deploy/BeanDescriptorManager.java

removed timestamps as it has time zone issues

75d446d

update reference scripts

624f848

Change javadoc

eba16bd

rPraml added 2 commits March 21, 2018 09:58

Merge remote-tracking branch 'ebean/master' into multivaluebind

91b970e

# Conflicts: # pom.xml # src/main/java/io/ebeaninternal/server/deploy/BeanDescriptorManager.java

Update to new factory pattern

cfebc51

rPraml commented Mar 27, 2018

rPraml added 2 commits March 29, 2018 00:45

Use MultiValueBind only if more than 100 parameters

f7bd881

Fix tests

316ac5f

rPraml added 4 commits March 29, 2018 14:28

FIX: MultiValueBind

51112cb

updated the PR - fixes some issues with cache keys

efa0d19

Moved the Oracle TVP creation to initScript

fcdba0b

Merge remote-tracking branch 'ebean/master' into multivaluebind

fbc630a

# Conflicts: # src/main/resources/io/ebeaninternal/dbmigration/builtin-extra-ddl.xml

rPraml mentioned this pull request Oct 26, 2018

Waiting for PR "Multivaluebind" FOCONIS/ebean#18

Closed

rPraml added 10 commits January 2, 2019 15:26

FIX testcases for MariaDB+utf8mb4 (Failure was: "Specified key was too long; max key length is 767 bytes")

0b63915

Excluded tests that won't work on SqlServers generated ids

869e39c

SqlServer queryplan captured in transaction & fix of possible NPE

fb7a840

Merge remote-tracking branch 'ebean/master' into pr/multivaluebind

5cff95a

# Conflicts: # src/main/java/io/ebeaninternal/server/deploy/id/IdBinder.java # src/main/java/io/ebeaninternal/server/expression/InExpression.java

Merge branch 'fix-possible-npe-queyplanlogger' into pr/multivaluebind

981aa3f

Merge branch 'fix-mysql' into pr/multivaluebind

8c64a03

rolled back transaction earlier

f327b38

Merge branch 'fix-possible-npe-queyplanlogger' into pr/multivaluebind

3ea86cf

fixes in multivaluebind and BindCapture

092693e

rbygrave closed this Sep 12, 2019

rPraml deleted the multivaluebind branch August 3, 2022 12:38
