-
-
Notifications
You must be signed in to change notification settings - Fork 260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: improved multiValueBind #1328
Conversation
Have you checked the query plans that Oracle and SQL Server produce? Can you post them up? |
example query for oracle
and sqlserver
|
import java.sql.Connection; | ||
import java.sql.SQLException; | ||
|
||
import oracle.jdbc.OracleConnection; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: this is a text file, as oracle.jdbc.OracleConnection is not public available. So I included the class in binary form:
https://github.com/ebean-orm/ebean/pull/1328/files#diff-1d0fda0b6639c184544f2e59c5ae5036
Do you think this hack is acceptable?
Sorry, I actually meant the |
Did I get it right, that I should execute "explain select * from xxx where ..." manually and check which indices are hit? Or is there a feature in ebean that dumps the "explain plan" for each query? |
# Conflicts: # src/main/java/io/ebeaninternal/server/deploy/BeanDescriptorManager.java
Yes. Specifically we want to compare the 2 explain plans ... to confirm that they are effectively the same. That we still hit the indexes etc and yes we need to do this manually at the moment. (Yes, there is a desire and plan to automate the collection of explain plans as part of a performance monitoring tool). |
# Conflicts: # pom.xml # src/main/java/io/ebeaninternal/server/deploy/BeanDescriptorManager.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rbygrave I analyzed the queries for postgres and came to the conclusion that the internal query optimizer will produce the same query plans (at least for postgres).
select t0.id, t0.status, t0.order_date, t0.ship_date, t1.name, t0.cretime, t0.updtime,
t0.kcustomer_id from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id
where order_date in (?, ?, <REMOVED~1000> ?, ? ) and t0.id <= ?
and
select t0.id, t0.status, t0.order_date, t0.ship_date, t1.name, t0.cretime, t0.updtime,
t0.kcustomer_id from o_order t0 join o_customer t1 on t1.id = t0.kcustomer_id
where order_date = any(?) and t0.id <= ?
results to the SAME query plan
Hash Join (cost=22.16..624.38 rows=422 width=134)
Hash Cond: (t0.kcustomer_id = t1.id)
-> Bitmap Heap Scan on o_order t0 (cost=7.43..604.34 rows=422 width=36)
Recheck Cond: (id <= 4)
Filter: (order_date = ANY ('{2018-03-27,2018-03-28,2018-03-29 ... 31,2018-04-01,2018-04-02}'::date[]))
-> Bitmap Index Scan on pk_o_order (cost=0.00..7.32 rows=423 width=0)
Index Cond: (id <= 4)
-> Hash (cost=12.10..12.10 rows=210 width=102)
-> Seq Scan on o_customer t1 (cost=0.00..12.10 rows=210 width=102)
I also checked sporadically the plans for other DBMS and came to the conclusion, that they are OK.
I added a quick & dirty mechanism to log the plans in ebean. See here:
FOCONIS@172bc44
case SQLSERVER16: | ||
case SQLSERVER17: | ||
case SQLSERVER: | ||
return new SqlServerMultiValueBind(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to @rbygrave - please check https://github.com/ebean-orm/ebean/pull/1328/files#diff-78b982f628a680c43549c680aaadfa2aR271 if we need SQLSERVER16/17, too
@@ -73,7 +90,8 @@ protected String getArrayType(int dbType) { | |||
case TIMESTAMP: | |||
case TIME_WITH_TIMEZONE: | |||
case TIMESTAMP_WITH_TIMEZONE: | |||
return "timestamp"; | |||
return null; // NO: Does not work reliable due time zone issues! - Fall back to normal query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timestamps don't work reliable due timezone issues. It would require to perform timezone conversion before putting the timestamps into the multi-value datastructure.
Right sorry, I already knew that Postgres ANY was good - I checked that before allowing that in and we have been using it to good effect in Postgres for a while now. What I really want to do is absolutely confirm that the query plans for Oracle and SQL Server (which are the 2 platforms this change wants to add this support for right) are good. So I need to see actual query plans - we need to be sure. |
good that you are so persistent ;) I checked the query plans and plans like I found an interesting article here: https://www.spiderstrategies.com/blog/2014-11-03-sql-server-query-type-performance.html there is an intersting summary:
but here the query plans for oracle and sqlserver Oracle normal:select t0.id c0, t0.name c1 from tuuid_entity t0 where t0.id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 139 | 1 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TUUID_ENTITY | 1 | 139 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_TUUID_ENTITY | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T0"."ID"=:1 OR "T0"."ID"=:2 OR "T0"."ID"=:3 OR "T0"."ID"=:4 OR
"T0"."ID"=:5 OR "T0"."ID"=:6 OR "T0"."ID"=:7 OR "T0"."ID"=:8 OR "T0"."ID"=:9 OR
"T0"."ID"=:10)
Note
-----
- dynamic sampling used for this statement (level=2) Oracle with multi value bindselect t0.id c0, t0.name c1 from tuuid_entity t0 where t0.id in (SELECT * FROM TABLE (SELECT ? FROM DUAL))
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16524 | 32 (4)| 00:00:01 |
|* 1 | HASH JOIN SEMI | | 1 | 16524 | 32 (4)| 00:00:01 |
| 2 | TABLE ACCESS FULL | TUUID_ENTITY | 27 | 3753 | 2 (0)| 00:00:01 |
| 3 | VIEW | VW_NSO_1 | 8168 | 127M| 29 (0)| 00:00:01 |
| 4 | COLLECTION ITERATOR PICKLER FETCH| | 8168 | 16336 | 29 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T0"."ID"="COLUMN_VALUE")
Note
-----
- dynamic sampling used for this statement (level=2) SQL Server normalSQL Server Multi value bind |
What do you think, if we use the multivaluebind for SqlServer & Oracle only for higher parameter count? (currently hard coded > 100) |
# Conflicts: # src/main/java/io/ebeaninternal/server/core/OrmQueryRequest.java # src/main/java/io/ebeaninternal/server/persist/Binder.java # src/test/resources/dbmigration/migrationtest/sqlserver17/1.2__dropsFor_1.1.sql # src/test/resources/dbmigration/migrationtest/sqlserver17/1.4__dropsFor_1.3.sql
I updated this PR - I have some new information: Using TVPs is not always optimal |
# Conflicts: # src/main/resources/io/ebeaninternal/dbmigration/builtin-extra-ddl.xml
@rbygrave updated and resolved merge conflicts, maybe you have time to review btw: where are the travis builds? |
…o long; max key length is 767 bytes")
# Conflicts: # src/main/java/io/ebeaninternal/server/deploy/id/IdBinder.java # src/main/java/io/ebeaninternal/server/expression/InExpression.java
# Conflicts: # pom.xml # src/main/java/io/ebeaninternal/server/expression/InExpression.java # src/main/java/io/ebeaninternal/server/persist/platform/PostgresMultiValueBind.java # src/main/java/io/ebeaninternal/server/query/CQueryBindCapture.java # src/test/java/org/tests/model/basic/xtra/TestInsertBatchThenUpdate.java
In OLTP applications are we going to be binding more than 100 Ids frequently ? I don't think so and what that suggests (given the TVP's have worse query plans for sql server and oracle) is that we maybe should not use them at all for sql server and oracle. Do you still want to push for this? |
I think we are going to let this change go. Multi-value binding with Oracle and SQL Server has worse query execution plans and for me I don't think it isn't worth having this for > 100 bind values. So unless you are going to push hard for this we should close this PR. So only use MVB with Postgres ANY (until such time other DB's make the query plans as good as IN). |
Hello Rob, sorry for the late response, I was in vacation the last weeks. I am back in office at monday and try to discuss the further plans with my team. |
Closing. |
This fixes the bug, if multivalueBind is supported, we cannot say in genral that we can use it on ID columns.
If ID column is from type string or integer it would work, but if ID column is of a type that multivaluebind does not support, you'll get an error.
It als re-adds my existing implementation for SqlServer and oracle