ActiveJob Memory bloat #27002
Comments
cc @matthewd, since this has to do with |
This is caused by the query cache, which is enabled by the reloader. If you run the job directly with the cache enabled, you'll see memory usage grow steadily:

    ActiveRecord::Base.connection.cache { MemTestJob.perform_now }

Conversely, if you disable the cache while the job is running, memory growth will taper off:

    def perform
      ActiveRecord::Base.connection.uncached do
        Item.find_in_batches.with_index do |_, index|
          gc_start if index % 10 == 0
        end
      end
    end

This behaviour is intentional - Rails is using more memory to avoid duplicate queries. The cache is cleared when the job finishes, so memory isn't being permanently leaked, but peak usage will be higher.

The cache is also cleared when the connection writes to the database - in a real app, you would presumably write at least once per batch, and simulating this also controls memory usage:

    def perform
      Item.find_in_batches.with_index do |batch, index|
        gc_start if index % 10 == 0
        batch.first.touch
      end
    end

The example job here is a pathological case, where a million rows are read but nothing is written. While the query cache clearly doesn't help in this particular scenario, I'm not sure if there's anything we can do here, other than to recommend turning it off in these extreme cases.
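To illustrate the two behaviours described in this comment (results retained until the cached block ends, and the cache being cleared on a write), here is a toy model in plain Ruby. It is only a sketch and not Rails' actual implementation: `ToyConnection`, `with_cache`, `select`, `write` and the SQL strings are all hypothetical names invented for the illustration.

```ruby
# Toy model of a per-connection query cache. This is NOT Rails' implementation;
# every name below is hypothetical and exists only for this illustration.
class ToyConnection
  attr_reader :cache

  def initialize
    @cache = {}
    @cache_enabled = false
  end

  # Enable the cache for the duration of the block, then drop everything it held.
  def with_cache
    previous, @cache_enabled = @cache_enabled, true
    yield
  ensure
    @cache_enabled = previous
    @cache.clear unless @cache_enabled
  end

  # Reads are memoized by SQL string while the cache is enabled.
  def select(sql)
    return run(sql) unless @cache_enabled
    @cache[sql] ||= run(sql)
  end

  # Any write invalidates the whole cache.
  def write(_sql)
    @cache.clear
  end

  private

  # Stand-in for a real query: returns 1,000 fake rows.
  def run(sql)
    Array.new(1_000) { |i| "#{sql} row #{i}" }
  end
end

conn = ToyConnection.new
read_only_peak = 0
with_writes_peak = 0

# Read-only job: each batch is a distinct query, so every result set stays cached.
conn.with_cache do
  5.times do |batch|
    conn.select("SELECT * FROM items WHERE id > #{batch * 1000} LIMIT 1000")
    read_only_peak = [read_only_peak, conn.cache.size].max
  end
end

# Job that writes once per batch: the cache never holds more than one result set.
conn.with_cache do
  5.times do |batch|
    conn.select("SELECT * FROM items WHERE id > #{batch * 1000} LIMIT 1000")
    with_writes_peak = [with_writes_peak, conn.cache.size].max
    conn.write("UPDATE items SET updated_at = now() WHERE id = #{batch}")
  end
end

puts "peak cached result sets, read-only:   #{read_only_peak}"
puts "peak cached result sets, with writes: #{with_writes_peak}"
```

In this model the read-only loop keeps one cached result set per batch until the block exits, which mirrors why peak memory is higher even though nothing is leaked permanently.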
Thanks for the explanation and the workaround. One real-life example is indexing a large database: it reads millions of rows without writing. Anyway, is it really necessary to cache batched queries by default? Batches are intended to loop through a large collection of records, whether you write or not, when a
I'll close this issue, as it was focused on Active Job... but @inkstak if you'd like to open a new issue that batches should skip the query cache (or a PR 😁), I think that's a reasonable point. |
When parsing documentation we generate a lot of Active Record objects. Most of them are only needed for a split second or two and can then be collected; however, due to the QueryCache, the objects will live on in memory for the entire duration of the job:
- rails/rails#27002
- rails/rails#28646

This PR disables the query cache for the populate docs job. In the future we need to remove all these N+1 queries from the background doc parse, but for now hopefully this bandaid will at least prevent N+1 objects from being retained in memory for the duration of the job.
Hi, I'm investigating memory bloat linked to ActiveJob.
I first reported it to Sidekiq but the problem seems to come from ActiveJob.
System configuration
Rails 5.0.0 (also reproduced with master)
Ruby 2.3.1
Steps to reproduce
An app to reproduce is available here.
Set up the database to create thousands of records.
The job MemTestJob requests the records in batches, calls the garbage collector and outputs the heap size and memory usage.
In a console, run the job:
The memory usage and heap size will quickly stabilize.
Now run the job the way ActiveJob does asynchronously:
The heap size and the RSS increase each round until the process finally blows up.
In this example, memory grows by only a few bytes per request, but that amounts to several MB in a real app with many gems.
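For readers who want to take similar measurements, here is a minimal sketch of a helper that forces a GC run and reports live heap slots and RSS. It is not the code from the reproduction app: `memory_snapshot` and `rss_kb` are hypothetical names, and reading RSS through `ps` is an assumption that only holds on Unix-like systems (the helper returns nil otherwise).

```ruby
# Hypothetical helper, not taken from the reproduction app.
# Returns the resident set size in KB, or nil if `ps` is unavailable.
def rss_kb
  Integer(`ps -o rss= -p #{Process.pid}`.strip)
rescue StandardError
  nil
end

# Force a full GC run, then report live heap slots and RSS.
def memory_snapshot
  GC.start
  { live_slots: GC.stat[:heap_live_slots], rss_kb: rss_kb }
end

retained = []
before = memory_snapshot

# Simulate objects that stay referenced for the whole run, as cached
# query results would: they survive GC because `retained` points to them.
50_000.times { |i| retained << "row #{i}" }

after = memory_snapshot

puts "live slots before: #{before[:live_slots]}, after: #{after[:live_slots]}"
puts "RSS (KB) before: #{before[:rss_kb].inspect}, after: #{after[:rss_kb].inspect}"
```

Live slots that keep rising across snapshots, even after a forced GC, indicate that something still holds references to the objects.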
Investigation
After digging through the code, the cause seems to be linked to the reloader in ActiveJob.
The increase in memory may be tolerable in development, but it is problematic in production when browsing very large databases. Can we skip this callback when config.eager_load is set to true?