Batch insert/update with Hibernate & HikariCP #758
Sounds more like you are using entities not compatible with batching. My guess is your entity uses an
JDBC batching will be silently disabled by JPA / Hibernate when using this ID generation strategy, because it requires asking the DB for the auto-generated ID of each inserted row. You would need to use a sequence or table ID generator instead, so that JPA / Hibernate can allocate more than one ID at a time before actually executing the batch insert. |
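To picture why a sequence or table generator keeps batching possible, here is a minimal pure-Java simulation of handing out IDs in blocks. It is only a sketch: `BlockIdAllocator` is a hypothetical name, the counter merely stands in for a database sequence, and this is not how Hibernate itself is implemented.

```java
/** Hypothetical illustration only; this is not Hibernate code. */
final class BlockIdAllocator {
    private final int blockSize;
    private long nextBlockStart = 1; // stands in for a database sequence
    private long next;               // next ID to hand out from the current block
    private long blockEnd;           // exclusive end of the current block
    private int roundTrips;          // simulated calls to the database

    BlockIdAllocator(int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be >= 1");
        }
        this.blockSize = blockSize;
    }

    /** Returns the next ID, reserving a new block only when the current one is used up. */
    long nextId() {
        if (next == blockEnd) {
            // Simulated round-trip: reserve blockSize IDs in one call.
            next = nextBlockStart;
            blockEnd = next + blockSize;
            nextBlockStart = blockEnd;
            roundTrips++;
        }
        return next++;
    }

    int roundTrips() {
        return roundTrips;
    }
}
```

With a block size of 50, assigning 100 IDs costs two simulated round-trips, and every ID is known before any INSERT is sent, so the inserts can be queued into a batch. With a block size of 1 (roughly the situation where each row needs its own round-trip for an ID), the same 100 IDs cost 100 round-trips.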
@jnehlmeier ty for your idea
actually, the simplest way to help me would be to give me a working configuration :) |
@cvaliere I do not have a working configuration. However, if you are using MySQL, you will not get batching behavior even if the code uses the JDBC batching API unless the MySQL See http://stackoverflow.com/questions/26307760/mysql-and-jdbc-with-rewritebatchedstatements-true |
I am using PostgreSQL, not MySQL |
@cvaliere unfortunately no, I don't |
@cvaliere Have you tried asking on the Hibernate forums? Are there any secondary tables involved? https://hibernate.atlassian.net/browse/HHH-5797 Have you tried setting What about |
This answer: Says that PgJDBC doesn't do anything useful with batching, and just runs each statement... |
actually I may have misunderstood the concept of "batch inserts"
batching them would transform it into a single multi-insert statement
but, from what I read on a lot of forums, "batching" does NOT do that; it just sends the 2 inserts in a single round-trip to the database, but still as 2 statements
and, in addition to the batching, we can ask the JDBC driver to transform the batches into multi-insert statements, with rewriteBatchedStatements (MySQL) or reWriteBatchedInserts (PostgreSQL)
am I right? |
@cvaliere Technically, JDBC defines two kinds of "batching". One is defined by the
Some drivers have the ability to optimize the query as you described, by coalescing the values into a single statement, i.e. "rewriting" the batched statements.
The second kind of batching is typically much more performant and requires the use of the PreparedStatement.
PreparedStatement ps = connection.prepareStatement("INSERT INTO account VALUES (?,?)");
ps.setInt(1, 1);
ps.setString(2, "John");
ps.addBatch();
ps.setInt(1, 2);
ps.setString(2, "Jane");
ps.addBatch();
...
ps.executeBatch();
Until recently, MySQL did not support server-side prepared statements, so this kind of batching couldn't work there. Note that the PostgreSQL I recommend enabling Hibernate query logging so that you can check to see what is being generated. Keep in mind that the query rewriting is taking place in the driver, which is "below" Hibernate. So you might see Hibernate send several "INSERT ..." statements, but the driver might still rewrite them. You can also enable PgJDBC debug logging by setting the driver URL property
UPDATE: Also, as noted above, having
UPDATE2: I suspect that Hibernate requires that the generated IDs be returned from the inserts, because it uses the object IDs everywhere internally, so that is basically going to disable batching on PostgreSQL, AFAICT. |
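The "rewriting" of batched statements discussed above can be pictured with a small string-building sketch. This is not the code of PgJDBC or MySQL Connector/J; `MultiRowRewrite` and its `rewrite` method are hypothetical names, and the sketch only shows the shape of the statement a driver could produce from a batch of identical single-row inserts.

```java
import java.util.Collections;

/** Hypothetical illustration of coalescing N identical single-row inserts into one statement. */
final class MultiRowRewrite {
    /** Builds "INSERT INTO table VALUES (?,?),(?,?),..." with rowCount tuples of columnCount placeholders. */
    static String rewrite(String table, int columnCount, int rowCount) {
        if (columnCount < 1 || rowCount < 1) {
            throw new IllegalArgumentException("columnCount and rowCount must be >= 1");
        }
        // One tuple of placeholders, e.g. "(?,?)" for two columns.
        String tuple = "(" + String.join(",", Collections.nCopies(columnCount, "?")) + ")";
        // Repeat the tuple once per batched row.
        return "INSERT INTO " + table + " VALUES "
                + String.join(",", Collections.nCopies(rowCount, tuple));
    }
}
```

For the two-row `account` batch in the snippet above, `MultiRowRewrite.rewrite("account", 2, 2)` returns `INSERT INTO account VALUES (?,?),(?,?)`: one statement carrying both rows, instead of two statements sent in one round-trip.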
@brettwooldridge thank you for all your explanations! So, if I understand correctly, here are the 4 ways of making 100 INSERTs, ordered from the least performant to the most performant:
Am I right in saying that Hibernate does n°3? (when hibernate.jdbc.batch_size is defined)
I don't think so. Hibernate gets the ID from the sequence before the INSERT, and sends the ID inside the INSERT, so it doesn't need to retrieve the ID from the INSERT because it already has it.
yes, that's why, in my tests, I don't look at the log; instead, I wrote a TRIGGER AFTER STATEMENT on my table to see how many times it gets executed when inserting 100 objects:
|
@cvaliere You've got it basically right. In a well optimized JDBC driver, such as Oracle, or even going back in the day jTDS for SQL Server, a prepared statement with batching is the most performant way of mass insert through the JDBC interface. Of course, database-specific bulk loading methods are most performant but are not abstracted by JDBC. A well optimized driver will:
Less optimized drivers will, as the PgJDBC driver is doing by default, simply implement the batching API as a series of individual inserts -- basically defeating the purpose of batching.
Muddying the waters is that these less optimized drivers include optimizations that recognize multiple sequential identical inserts (same table, same columns), and transform them into single inserts with multiple tuples. This is the "rewrite" that is supported by MySQL and PostgreSQL.
Then you combine the less optimized case, where the driver is going to transform a batch insert into individual inserts, and then pump that through the optimization that recombines them into a single insert with multiple tuples, and you get a kinda in-between performance.
The thing is that there is a length limit to a SQL statement, different for every database. So, for some queries the driver might be able to combine 100 inserts into a single insert with 100 tuples, but other queries it can only combine 20 inserts into a single insert with 20 tuples -- because of the size of the data.
So, to answer the questions in your 1-4 ranking.
It might be interesting to try pgjdbc-ng, but I would strongly recommend building it from master rather than using 0.6 from a year ago. |
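The effect of the statement length limit mentioned above can be sketched as a greedy packing step: keep appending tuples to the current multi-row insert until the next one would exceed the limit, then start a new statement. This is an assumption-laden illustration, not any driver's actual algorithm; `InsertChunker`, `pack`, and the character-count limit are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of splitting a rewritten insert when a length limit applies. */
final class InsertChunker {
    /**
     * Packs already-rendered tuples such as "(1,'John')" behind the given prefix,
     * producing as few statements as possible without exceeding maxLength characters each.
     */
    static List<String> pack(String prefix, List<String> tuples, int maxLength) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String tuple : tuples) {
            if (prefix.length() + tuple.length() > maxLength) {
                throw new IllegalArgumentException("a single tuple exceeds the limit: " + tuple);
            }
            if (current.length() == 0) {
                // Start the first statement.
                current.append(prefix).append(tuple);
            } else if (current.length() + 1 + tuple.length() <= maxLength) {
                // The tuple still fits (the +1 accounts for the separating comma).
                current.append(',').append(tuple);
            } else {
                // Limit reached: close this statement and start the next one.
                statements.add(current.toString());
                current = new StringBuilder(prefix).append(tuple);
            }
        }
        if (current.length() > 0) {
            statements.add(current.toString());
        }
        return statements;
    }
}
```

With small rows, all tuples end up in one statement; with larger rows or a tighter limit, the same batch is split across several statements, which is why the number of tuples per rewritten insert depends on the size of the data.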
Maybe this will be useful for someone. Spring Boot, Hibernate, Postgres. With the settings below I have inserts performed like
This ?reWriteBatchedInserts=true is what actually makes the inserts batch on the Postgres side. application.yml
entity
Useful post about reWriteBatchedInserts by Vlad Mihalcea |
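The commenter's own application.yml and entity are not reproduced here. As a rough sketch of the kind of settings discussed in this thread, the snippet below builds a PostgreSQL JDBC URL carrying `reWriteBatchedInserts=true` and a set of Hibernate properties including `hibernate.jdbc.batch_size`. The class name, host, port, database name, and batch size are placeholders, and adding `hibernate.order_inserts` / `hibernate.order_updates` is an assumption on my part rather than something taken from the comment.

```java
import java.util.Properties;

/** Hypothetical sketch of batching-related settings; all concrete values are placeholders. */
final class BatchSettingsSketch {
    /** Builds a PostgreSQL JDBC URL that asks the driver to rewrite batched inserts. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database
                + "?reWriteBatchedInserts=true";
    }

    /** Hibernate settings commonly combined with JDBC batching. */
    static Properties hibernateProperties(int batchSize) {
        Properties props = new Properties();
        // Number of statements Hibernate groups into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Assumption: ordering statements by entity so consecutive ones can share a batch.
        props.setProperty("hibernate.order_inserts", "true");
        props.setProperty("hibernate.order_updates", "true");
        return props;
    }
}
```

For example, `BatchSettingsSketch.jdbcUrl("localhost", 5432, "mydb")` yields `jdbc:postgresql://localhost:5432/mydb?reWriteBatchedInserts=true`. How these values are then handed to the connection pool and to Hibernate depends on the application setup.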
Hi everyone
I'm using HikariCP 2.5.1 & Hibernate 5.2.3
I'm trying to make Hibernate batch the updates/inserts, but no matter what I try, I can't get it to work.
Here is my XML config:
Then my test case is simply to create 10 simple objects, and I would expect only 1 statement, but I have 10 instead.
Apparently there is another way of setting things up (https://github.com/brettwooldridge/HikariCP/wiki/Hibernate4) but I can't find how it works.
Does someone know how to make it work?
Thanks a lot!