
Batch Inserts? #22

Closed
jbaiter opened this issue Feb 23, 2016 · 5 comments

Comments

@jbaiter

jbaiter commented Feb 23, 2016

Is it possible to do "real" batch inserts (à la clojure.java.jdbc/insert!) with hugsql?

I just tried to import ~8000 records with the syntax from the documentation:

-- :name create-sentences! :! :n
-- :doc creates a number of new sentence records
INSERT INTO sentences
(tokens, tags, document_id)
VALUES :t*:sentences
(->> sentences
     (map #(vec (map % [:tokens :tags :document_id])))
     vec
     (#(q/create-sentences! {:sentences %})))

This yields a "Too many SQL variables" exception (I'm using SQLite).

Doing the same with clojure.java.jdbc/insert!, however, works like a charm:

(->> sentences
     (apply (partial jdbc/insert! q/conn :sentences [:tokens :tags :document_id]))
     (reduce +))
@csummers
Member

The Tuple List Parameter supports multi-record inserts via the INSERT ... VALUES (...), (...) syntax, but you can still hit JDBC driver limits with it, which is what is happening here. The limits may be higher with other database drivers.

For these kinds of batch cases, you can map or doseq through your batch, calling your single-record insert, or use your multi-record insert after cutting the batch up with partition-all.
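A minimal sketch of the partition-all route, reusing the create-sentences! HugSQL function from the snippet above (the 500-row chunk size is an arbitrary assumption; pick one that keeps rows × columns under your driver's parameter limit):

```clojure
;; Sketch: chunk a large batch before inserting. Assumes the `q` namespace
;; and `create-sentences!` function from the snippet above. The chunk size
;; (500) is illustrative; with 3 columns per row that is 1500 parameters,
;; which should stay under SQLite's default limit of 999 only with a
;; smaller chunk, so tune it for your driver.
(defn insert-sentences-in-chunks!
  [sentences]
  (doseq [chunk (partition-all 500 sentences)]
    (q/create-sentences!
      {:sentences (mapv #(mapv % [:tokens :tags :document_id]) chunk)})))
```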

On the other hand, I say know your tools and use them to their full extent. Using clojure.java.jdbc/insert! is fine. It's already internally mapping over insert-single-row-sql, so why not use it instead of needing to write your own?

But, I can see how this is confusing, and I think this at least needs some more documentation with an example or two.

@jbaiter
Author

jbaiter commented Feb 23, 2016

Thanks for the clarification! I had a solution that map-ed over the single-record insert before, and I was under the impression that the insert! method was faster, but that may just as well have been a placebo effect; I didn't do any proper profiling.
Also, thank you for creating the library. I've been looking for a replacement for yesql, and this looks very much like it :-)

@jbaiter jbaiter closed this as completed Feb 23, 2016
@csummers
Member

Thanks again for the report!

@csummers
Member

...I was under the impression that the insert! method was faster, but that may just as well have been the placebo effect...

insert! may have been faster if it ran as a single transaction and your map-ing solution ran as multiple transactions. See Transactions.

@mikeball

I was curious about the actual limit on parameters. It looks like ~32k for PostgreSQL and ~2k for SQL Server.

http://www.postgresql.org/message-id/10218.1208960238@sss.pgh.pa.us
http://stackoverflow.com/questions/14631346/java-jdbc-prepared-statement-maximum-parameter-markers

edit: The SO post also mentions that for Postgres before 9.4, every record in a batch required a round trip if a generated key was returned. Am I reading that right? Or is that a different kind of JDBC "batch"?
