New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
return the inserted ids for postgres bulk insert #178
return the inserted ids for postgres bulk insert #178
Conversation
end | ||
end | ||
|
||
def add_objects(hash, parent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll want to hide this method and import_helper in the Import module. Right now it looks like it's being added to ActiveRecord::Base as a class method. We should only add the public methods that we want to expose to a user there so we don't unnecessarily pollute or add potential conflicts w/ other things on ActiveRecord::Base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a look tomorrow
On Sun, Feb 22, 2015 at 12:42 PM, Zach Dennis notifications@github.com
wrote:
In lib/activerecord-import/import.rb
#178 (comment)
:
# should probably take a hash to associations to follow.
hash={}
models.each {|model| add_objects(hash, model) }
hash.each_pair do |class_name, assocs|
assocs.each_pair do |assoc_name, subobjects|
subobjects.first.class.import(subobjects, options) unless subobjects.empty?
end
end
result
else
import_helper(*args)
end
- end
- def add_objects(hash, parent)
We'll want to hide this method and import_helper in the Import module.
Right now it looks like it's being added to ActiveRecord::Base as a class
method. We should only add the public methods that we want to expose to a
user there so we don't unnecessarily pollute or add potential conflicts w/
other things on ActiveRecord::Base.—
Reply to this email directly or view it on GitHub
https://github.com/zdennis/activerecord-import/pull/178/files#r25135051.
ad422b6
to
fd244d9
Compare
place the id return into every model and mark the model as clean? (since it has been saved) make the patch work for non-standard primary keys modification showing how save a tree of objects efficiently. pass options down into sub-objects that are imported set the foreign key on a sub-collection when the parent is imported tests for importing 3 levels of objects new model for testing 3 levels of objects: chapter move the setting of ids down into the postgresql adapter.
fd244d9
to
ee7411a
Compare
👍 |
if you have been using the GoodMeasuresLLC fork, you'll need to add :recursive => true if you want to import autosave associations with the root objects |
@zdennis - thoughts? |
I have tested this on millions of records and it's incredibly useful. Would love to see it in the main project. |
@johnnaegle @alainmeier, this is on the path to being merged. I will review the changes again in the next few days. If no other questions or suggestions then we'll get this in very soon. Sorry for the wait and thank you for your patience and work on this. |
@zdennis, that will be awesome! |
Hi @robmathews and @johnnaegle I am running some tests on this branch - works great but I am noticing in the logs that in my test case a number of select queries between the bulk insert of the parent objects and the bulk inserts of the child objects. I was wondering why these are necessary? I haven't run the tests you have checked in, but I expect that give your test case of 6 books, you would see 6 selects on table chapters after the bulk book insert but before the bulk chapters insert Why are these necessary? Or perhaps I haven't done set something correctly in my case.. Thanks for all your work on this, Tim. |
I see this in the logs as well, curious This is from the test case:
|
…ulk import, we want to bypass the proxy association and use the underlying target. Do this by marking each association as loaded
6d8d652
to
07be534
Compare
I've marked all the associations that get recursed on as loaded. This prevents a lot of extra queries. Great catch @timadler. |
Wow - thanks @johnnaegle. Fast turn around! Works great. |
…lling some SQL. I think pulling the autofollowing of associations out helps that
@zdennis - thoughts? |
|
||
That would be about 4M SQL insert statements vs 3, which results in vastly improved performance. In our case, it converted | ||
an 18 hour batch process to <2 hrs. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this. I've created a local copy of the branch and have formatted and massaged this a little bit, but it's mostly intact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey thanks - I noticed you have stuff over in the wiki section, but it is important that folks realize why you bothered to write this in the first place.
This is important stuff. I think it should really make its way into the rails core ....
zach, thanks so much for looking at this. we've been wanting to get back on the main branch for so long.... |
…de do its job. This removes the chance of name collision and ensures we can call the original method with super.
I've incorporated @zdennis's changes by cherry-picking your commits into this branch. I made a few renames that hoepfully make things clearer. I have one PR that I wanted @robmathews to review before I merge it in, but I think it simplifies the importing of associations: GoodMeasuresLLC#1 |
Looking good here. I've been waiting for this a long time but we really need it now so I gave it a try. I take it this depends on PostgreSQL returning the ids in the same order the rows were given. There has been some debate over whether this is guaranteed. The developers refuse to be held on it but it seems doubtful that they will be able to change it now given how many people rely on it. Many have said, and I agree, that the returning feature isn't very useful unless the order is maintained. You're probably aware of this but I just want to make sure. |
I've relied on the order of the returned ids being the same for over 15 yrs On Fri, Mar 27, 2015 at 11:52 AM, James Le Cuirot notifications@github.com
|
@zdennis thoughts about getting this merged in? Thanks! |
@johnnaegle +1 |
bump |
👍 |
@alainmeier, thanks for posting. @johnnaegle / @robmathews , any thoughts? |
Reverts commit 8e11c8a.
8df3967
to
a3d3848
Compare
I reverted a change and added a test for multiple has_many associations on a single model. |
@alainmeier, would you be able to update and run your code with @johnnaegle's latest changes? |
💥 everything works great again. Thanks @johnnaegle |
Awesome! Okay, I think it's about time we get this in there. |
return the inserted ids for postgres bulk insert
Thank you to @johnnaegle, @robmathews, and everyone that contributed this work. I know it's been a long time coming. |
Thank you all! This is a great feature 👍 |
Thanks everyone! @zdennis will you be releasing a new RubyGems version soon by any chance? |
@alainmeier, yeah, tonight or tomorrow night. |
Very useful, and very timely. We were just looking into how to import a many-to-many relationship and this will help a lot. |
@zdennis Could you release a new gem version? It would be much appreciated! |
He accepted the pull request a few weeks back. On Fri, May 15, 2015 at 7:44 AM, Michiel de Mare notifications@github.com
|
Sorry for the wait. A new gem (0.8.0) has been published just now. This is included in that. |
👍 |
Thanks! |
per #104 and #105, I've rebased the commits onto the latest zdennis master and got the adapter tests working for mysql, postgres and sqlite3.
insert_many now returns the number of imports and the ids of the objects inserted (if the adapter provides them). Those ids are then set on the model and the model is marked as clean. For auto-save associations, a recursive bulk import is performed. There are postgresql adapter specific tests that show the ids getting set and recursive import of autosave associations.