-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Destroy Associations in Background Job #36912
Conversation
activerecord/lib/active_record/associations/belongs_to_association.rb
Outdated
Show resolved
Hide resolved
Motivation: In rails#36912 I added background destroys, this pr brings in deletes as well. Currently in the GitHub codebase we have methods that make both of these functions available, I just learned this thanks @seejohnrun. However it is hidden and likely underutlized due to the fact that it is not core Rails functionality. It would be nice to upstream both these tools if the Rails core team is interested. Related Issues: - rails#34901 - rails/actionmailbox#16 - rails#36912 Changes: - Added a new job to handle deleting associations. - Added a new dependent option on associations, background_delete (Open to other names).
Hey @gwincr11! Nice work on this so far. I don’t think we’ll want to serialize all associated records into a job. What I had imagined from our discussion was something like enqueuing a job with a class name and whatever conditions are needed to find the destroyable records. The job would then look up those records in batches to destroy them. This avoids (1) needing to load all associated records into memory at once and (2) needing to load them synchronously in order to enqueue the destroy job. I’d also like to see us stick to the |
@georgeclaghorn thanks for the feedback. I have discovered we have a very similar job at GitHub, I am going to dig into how we handle this and see if I can do more to port over our functionality. It seems we are closer to what you are suggesting and I think you are right it is better not to serialize everything. |
activerecord/lib/active_record/associations/belongs_to_association.rb
Outdated
Show resolved
Hide resolved
activerecord/lib/active_record/associations/builder/belongs_to.rb
Outdated
Show resolved
Hide resolved
activerecord/lib/active_record/associations/builder/has_many.rb
Outdated
Show resolved
Hide resolved
activerecord/lib/active_record/associations/has_many_association.rb
Outdated
Show resolved
Hide resolved
activerecord/lib/active_record/associations/has_one_association.rb
Outdated
Show resolved
Hide resolved
One thing that we do that does not fit with this model, some times we do a soft delete where we set a certain db column to true if something has been soft deleted. We have a check to ensure that the record was not soft deleted before deleting associations. I am wondering if I should let people pass in a custom check as a method on the owning model, or possibly pass in a different job to allow for custom escape logic? Any thoughts opinions? cc @tenderlove or @eileencodes or @georgeclaghorn? |
@georgeclaghorn I have significantly changed how this works based on your feedback, what do you think of it now. I couldn't just use what we have as it doesn't deal with scopes and is a little hard to work with if the model was hard deleted, so I needed to come up with a new solution, I think it is headed in a good direction but would love some more feedback. Thanks! |
Errr, maybe not... I think I confused my self and should drink caffeine before I comment in issues. cc @tenderlove, @eileencodes and @georgeclaghorn |
I could follow the ensuring pattern @georgeclaghorn laid out, maybe something like has_many :children, dependent: :destroy_later, owner_ensuring_destroy: :method_to_call_on_owner, child_ensuring_destroy: :method_to_call_on_child I don't have a current need to call a method on the child, but that maybe nice to add. |
This pr now contains the ability to throw an exception if ActiveRecord is not present, it has also been rebased against the current master. I feel this is the best forward, I feel this way as often times we have to hold onto records for legal reasons for x number of days. If the app is suddenly render unable to complete a job for some reason or is mis-configured deleting content could lead to legal troubles for some users. I know others feel otherwise @rafaelfranca for example, so if we desire we can go an alternate route. Either way we go this test suite is now showing a condition I am unclear on how to remedy. Currently the test files |
Just to throw out a thought, we could use appraisal for testing different gem sets https://github.com/thoughtbot/appraisal |
@gwincr11 it should be easier to just introduce |
@simi I wasn't sure how comfortable we would be with mocking in this case vs. a true integration test. I do agree wrapping the check is a nice clean up. I have been struggling with where to put the logic though. |
@gwincr11 I was thinking about this before and wasn't sure if ActiveSupport or directly ActiveRecord should check this. If this is first check in codebase, it would make sense to scope it just locally to this class. Anyway I'm not sure, we can try to scan codebase if any other functionality depends on ActiveJob optionally and unify those checks. |
@simi, I added ActiveJob back as a development dependency, centralized the presence check and stubbed. This seems like a sane path forward, thanks for jumping in! |
@gwincr11 AFAIK development dependencies live in Gemfile. There's no need for that in gemspec. |
activerecord/lib/active_record.rb
Outdated
@@ -171,6 +174,10 @@ def self.eager_load! | |||
ActiveRecord::AttributeMethods.eager_load! | |||
ActiveRecord::ConnectionAdapters.eager_load! | |||
end | |||
|
|||
def self.active_job? | |||
defined?(ActiveJob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be here. We should not check Active Job, Active Job that should register itself on Active Record.
Can we please follow #34901 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafaelfranca thanks for the feedback.
Are you reffering to this portion of the comment?
In the Active Job railtie we use a on_load(:active_record) to set the destroy job to ActiveRecord::DestroyJob.
Are you suggesting we just set a config option such as this
config.active_job = ActiveSupport::OrderedOptions.new |
Or some other approach? Would I then just switch this method to check for the config as opposed to checking for class definition?
Thanks ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that part. I think the config is in the opposite direction. Active Job should change the active_record.destroy_later_job
and Active Record provides the job that raises if Active Job is not loaded and the job that does the right thing.
That way we don't need to check if Active Job is loaded in Active Record, neither we need to check if Active Record is loaded in Active Job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually raising earlier, prior to trying to enqueue the job and instead when trying to load the parent class. That way if you are in the development phase you get the notice as early as possible lowering the chance that you will miss the dependency when trying to use this option.
I could set the destroy_later_job to false in active_record and check for it not being false or do you think I should only fail when the job is enqueued and not when the developer tries to set the dependent_destroy
option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your solution of raising early is better, so setting the config to false
or nil
is good enough to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. One more question, hopefully only 1. Do I need to move my tests somewhere else or do something to fire the railtie? I haven't tried yet figured I would ask before I potentially bang my head against my desk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafaelfranca I took a stab at this. wanna sanity check I am going in the right direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still coupling Active Record to Active Job. We should be inverting the dependency. Active Record should not even know Active Job exists.
Wouldn't be the way to add way to add custom destroy strategies and add this one on active job load if active record is present? |
activejob/lib/active_job/railtie.rb
Outdated
ActiveSupport.on_load(:active_record) do | ||
active_record_options = app.config.active_record | ||
active_record_options.destroy_association_later_job = ActiveRecord::DestroyAssociationLaterJob | ||
active_record_options.destroy_later_job = ActiveRecord::DestroyLaterJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafaelfranca Is this the correct place and way to do this? Is there a good way to test this setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is the correct way, but you don't need to add the config to app.config.active_record
anymore. You are already inside ActiveRecord::Base
here. So this code can be:
self.destroy_association_later_job = ActiveRecord::DestroyAssociationLaterJob
self.destroy_later_job = ActiveRecord::DestroyLaterJob
To test it you will need an integration test in the railties folder. Similar to those tests here. https://github.com/rails/rails/blob/master/railties/test/application/configuration_test.rb#L423-L432
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is it. I just made a comment about the implementation and the tests.
activejob/lib/active_job/railtie.rb
Outdated
ActiveSupport.on_load(:active_record) do | ||
active_record_options = app.config.active_record | ||
active_record_options.destroy_association_later_job = ActiveRecord::DestroyAssociationLaterJob | ||
active_record_options.destroy_later_job = ActiveRecord::DestroyLaterJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is the correct way, but you don't need to add the config to app.config.active_record
anymore. You are already inside ActiveRecord::Base
here. So this code can be:
self.destroy_association_later_job = ActiveRecord::DestroyAssociationLaterJob
self.destroy_later_job = ActiveRecord::DestroyLaterJob
To test it you will need an integration test in the railties folder. Similar to those tests here. https://github.com/rails/rails/blob/master/railties/test/application/configuration_test.rb#L423-L432
activerecord/activerecord.gemspec
Outdated
@@ -34,6 +34,7 @@ Gem::Specification.new do |s| | |||
# NOTE: Please read our dependency guidelines before updating versions: | |||
# https://edgeguides.rubyonrails.org/security.html#dependency-management-and-cves | |||
|
|||
s.add_development_dependency "activejob", version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to add this here. The Gemfile is where we put development dependencies.
@rafaelfranca I think I have the test written correctly. Thanks for the assist. |
activerecord/lib/active_record/destroy_association_later_job.rb
Outdated
Show resolved
Hide resolved
Generalize Action Mailbox's incineration feature. Permit destroying records after a specified amount of time.
end | ||
|
||
|
||
test "enqueues the has_many through to be deleted with custom primary key" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this test trying to assert? If it works when the has_many
association has a primary_key option or when the model that defines association has a custom primary key?
I'm trying to reduce the number of new tables added and I was trying to reuse some of the existing models but I'm having a hard time to know what exactly is being tested here.
Motivation: - Sometimes cascading association deletions can cause timeouts due to IO issue. Perhaps a model has associations that are destroyed on deletion which in turn trigger other deletions and this can continue down a complex tree. Along this tree you may also hit other IO operations. Such deep deletions can lead to server timeouts while awaiting completion and really the user may not notice all the changes on their side immediately making them wait unnecesarially or worse causing a timeout during the operation. It is possible to move the association deletion into jobs today, however it is pretty boilerplate and it would be nice to make it easier to do. This pr builds off the work of @georgeclaghorn in PR rails#34901 and is currently a simple proposal of how this could work. I branched off his pr for now to quickly product this proof of concept. Related Issues: - rails#34901 - rails/actionmailbox#16 Changes: - Added a new job to handle deleting associations. - Added a new `dependent` option on `has_many` associations, `destroy_async` (Open to other names). - Added a single proof of concept test.
* Fix CHANGELOG entry * Avoid using abreviations in argument names * Refactor the job to use `find_by` * Improve documentation * Check if both jobs are configured when calling destroy_later at class * Pass keyword arguments to the job to avoid warnings on Ruby 2.7 * Remove duplicated method * Fix typos and avoid adding new tables * Move active record queues configuration to Active Record This config is related to jobs but it doesn't load anything on Active Job, so there is no need to move the config to the Active Job railtie. * Allow to configure the destroy later jobs per model We are still going to configure this globaly to the framework but if some user wants to use a different job in a model the system will work. This also make easier to isolate the Active Job tests from the Active Record jobs. * Change the tests description to talk about the framework features The tests were talking about the model features, not which features in the framework they were asserting. * Add documentation about how to configure the destroy jobs
5bc8da1
to
f6eff0f
Compare
assert tag.reload | ||
end | ||
|
||
test "enqueues the has_many through to be deleted with custom primary key" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test ensures that we can find and delete an association with a custom primary key:
self.primary_key = "many_key"
Merged in #40157 |
Motivation: - Now that destroy_async is an option on association, a natural next step seems to be adding this method to standard objects for a more consistent offering. Related Issues: - rails#40157 - rails#36912 Changes: - Add a destroy_async method to persistence. - Add a rail tie to setup the active job configs. - Add a default destroy async job. - Test
Motivation:
IO issue. Perhaps a model has associations that are destroyed on
deletion which in turn trigger other deletions and this can continue
down a complex tree. Along this tree you may also hit other IO
operations. Such deep deletions can lead to server timeouts while
awaiting completion and really the user may not notice all the
changes on their side immediately making them wait unnecessarily or
worse causing a timeout during the operation.
today, however it is pretty boilerplate and it would be nice to make
it easier to do.
currently a simple proposal of how this could work. I branched off his
pr for now to quickly product this proof of concept.
Related Issues:
Changes:
dependent
option on associations,destroy_async
(Open to other names).To Do:
[ ] Just load ids when loading associations?This is non-trivial and error prone.[ ] Add child ensuring methods.After thinking about this, I think it should only be implemented if it works with destroy later and normal dependent destroy.