Improve performance in ActiveStorage::Service::MirrorService #51740

Merged
merged 1 commit into from
Jun 21, 2024

Conversation

heka1024
Contributor

@heka1024 heka1024 commented May 5, 2024

Motivation / Background

This pull request resolves the FIXME comment in ActiveStorage::Service::MirrorService.

Detail

This pull request uses a thread pool to parallelize delete and delete_prefixed in ActiveStorage::Service::MirrorService.
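
As a rough illustration of the idea, the sketch below fans a single call out to every service at once and then waits for all of them. It is not the code from this PR: the PR uses Concurrent::ThreadPoolExecutor and Concurrent::Promise from concurrent-ruby, while this sketch swaps in plain Ruby threads so it runs with the standard library alone. FakeService and MirrorSketch are hypothetical names, and the sleep only simulates network latency.

```ruby
# Minimal sketch: fan one call out to every service in parallel.
# FakeService and MirrorSketch are hypothetical stand-ins, not Active Storage classes.
class FakeService
  attr_reader :name, :deleted

  def initialize(name)
    @name = name
    @deleted = []
  end

  def delete(key)
    sleep 0.01 # simulate network latency (assumption for illustration)
    @deleted << key
    name
  end
end

class MirrorSketch
  def initialize(primary:, mirrors:)
    @primary, @mirrors = primary, mirrors
  end

  def delete(key)
    perform_across_services(:delete, key)
  end

  private
    def each_service
      [@primary, *@mirrors]
    end

    # Start one thread per service, then wait for all of them.
    # Thread#value joins the thread and re-raises any exception it raised.
    def perform_across_services(method, *args)
      threads = each_service.map do |service|
        Thread.new { service.public_send(method, *args) }
      end
      threads.map(&:value)
    end
end

primary = FakeService.new(:primary)
mirrors = [FakeService.new(:mirror_a), FakeService.new(:mirror_b)]
results = MirrorSketch.new(primary: primary, mirrors: mirrors).delete("avatar.png")
p results
```

A thread pool, as used in the PR, additionally caps the number of threads and reuses them across calls; one thread per call is the simplest form of the same fan-out.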

Additional information

If the performance gain is too small to justify the added complexity, how about simply removing the FIXME comment instead?

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Unrelated changes should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

@@ -30,6 +30,13 @@ def self.build(primary:, mirrors:, name:, configurator:, **options) # :nodoc:

def initialize(primary:, mirrors:)
  @primary, @mirrors = primary, mirrors
  @executor = Concurrent::ThreadPoolExecutor.new(
Member

Can you do a performance benchmarking?

Contributor Author

@heka1024 heka1024 May 5, 2024

Sure! I'll add performance benchmark and request review to you.

Contributor Author

@akhilgkrishnan
Here is a performance benchmark using GCS. Judging by real (wall-clock) time, the parallel version is faster.

  • Repetitions: 10
  • Service used: Google Cloud Storage (region: ap-northeast-3)
  • Image size: 512 KB

  • primary: 1, mirrors: 2

| Task | Real Time | User Time | System Time | Total Time |
|---|---|---|---|---|
| Parallel | 0.453 | 0.055 | 0.015 | 0.070 |
| Sequential | 0.979 | 0.045 | 0.011 | 0.056 |

  • primary: 1, mirrors: 1

| Task | Real Time | User Time | System Time | Total Time |
|---|---|---|---|---|
| Parallel | 0.380 | 0.053 | 0.020 | 0.074 |
| Sequential | 0.624 | 0.033 | 0.008 | 0.041 |
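
The script behind these numbers is not shown in the conversation. As a sketch of how such a comparison could be set up, the snippet below uses Ruby's standard Benchmark module and replaces the real GCS calls with a fixed 50 ms sleep. The latency value, the service count, and the slow_delete lambda are assumptions for illustration only.

```ruby
require "benchmark"

# Hypothetical stand-in for a remote storage call: each delete "costs" 50 ms.
SERVICE_COUNT = 3 # primary: 1, mirrors: 2
slow_delete = ->(_key) { sleep 0.05 }

# Sequential: each service is called only after the previous one returns.
sequential = Benchmark.realtime do
  SERVICE_COUNT.times { slow_delete.call("key") }
end

# Parallel: all calls start at once and the block waits for every thread.
parallel = Benchmark.realtime do
  SERVICE_COUNT.times.map { Thread.new { slow_delete.call("key") } }.each(&:join)
end

puts format("Sequential: %.3f s", sequential)
puts format("Parallel:   %.3f s", parallel)
```

Because the threads spend their time waiting on I/O, wall-clock time drops while CPU time (user plus system) stays about the same or rises slightly, which is the pattern visible in the tables above.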

Contributor Author

@heka1024 heka1024 May 16, 2024

@akhilgkrishnan Could you please review when you get a chance? 😄

Contributor Author

@akhilgkrishnan Gentle ping 🙏

Member

@akhilgkrishnan akhilgkrishnan Jun 18, 2024

Sorry @heka1024 for the late response. This looks good to me. Let's wait for someone from core to review.

@heka1024 heka1024 force-pushed the parallel-upload-in-mirror-service branch from 00febe1 to ecfbb5e Compare May 5, 2024 15:12
…ervice#delete` and `ActiveStorage::Service::MirrorService#delete_prefixed`
@heka1024 heka1024 force-pushed the parallel-upload-in-mirror-service branch from ecfbb5e to 3f2258a Compare May 12, 2024 08:07
service.public_send method, *args
tasks = each_service.collect do |service|
  Concurrent::Promise.execute(executor: @executor) do
    service.public_send method, *args
Member

I don't think we can guarantee the downstream services are thread-safe, and this could have breaking consequences.

I'm not sure even the built-in services are thread-safe, and developers using a custom service may not realize that it needs to be.

I'm also not sure introducing a thread pool here is cost-effective; we should measure it under more realistic load. Given the defaults you set, I think the best approach is to make this opt-in and configurable.

That FIXME has existed since the beginning (19a5191), so there are many, many applications that accept the current performance footprint (albeit with room for improvement). Was there a reason you wanted to add this, or were you just looking for TODOs to work on?
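
To make the thread-safety concern concrete, here is a minimal sketch of a custom service that holds shared mutable state and guards it with a Mutex. AuditedService and its audit log are hypothetical and not part of Active Storage; the point is only that state touched from several threads needs some form of synchronization.

```ruby
# Hypothetical custom service that keeps shared mutable state (an audit log).
# Not an Active Storage class; it only illustrates the thread-safety concern.
class AuditedService
  attr_reader :audit_log

  def initialize
    @audit_log = []
    @lock = Mutex.new
  end

  def delete(key)
    # Guard the shared array so concurrent callers cannot interleave
    # the read-modify-write below.
    @lock.synchronize do
      entry_number = @audit_log.size + 1
      @audit_log << "#{entry_number}: deleted #{key}"
    end
  end
end

service = AuditedService.new
threads = 20.times.map do |i|
  Thread.new { service.delete("file-#{i}") }
end
threads.each(&:join)

puts service.audit_log.size
```

Without the lock, two threads could read the same size before either appends, producing duplicate entry numbers. Whether that happens in practice depends on the Ruby implementation and on timing, which is why such bugs are easy to miss.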

Contributor Author

@zzak Actually, I found this TODO while reading the code and worked on it. I haven't experienced performance issues in my application.

I think there are two options:

  1. Make this configurable and keep the default as the non-threaded version.
  2. Close this PR and remove the TODO comment.

What do you think?
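
Option 1 could look roughly like the sketch below, where threading is opt-in and the default stays sequential. ConfigurableMirror, NamedService, and the parallel: flag are hypothetical names chosen for illustration. They are not an existing Active Storage option, and plain threads stand in for the thread pool used in the PR.

```ruby
# Hypothetical sketch of option 1: threading is opt-in via a `parallel:` flag.
# ConfigurableMirror, NamedService, and `parallel:` are assumptions for
# illustration; they are not part of Active Storage.
NamedService = Struct.new(:name) do
  def delete(key)
    "#{name} deleted #{key}"
  end
end

class ConfigurableMirror
  def initialize(services:, parallel: false)
    @services = services
    @parallel = parallel
  end

  def perform_across_services(method, *args)
    if @parallel
      # Opt-in path: one thread per service, then collect each result.
      @services.map { |s| Thread.new { s.public_send(method, *args) } }.map(&:value)
    else
      # Default path: call each service in turn, as before.
      @services.map { |s| s.public_send(method, *args) }
    end
  end
end

services = [NamedService.new("primary"), NamedService.new("mirror")]

default_mirror  = ConfigurableMirror.new(services: services)
threaded_mirror = ConfigurableMirror.new(services: services, parallel: true)

sequential_result = default_mirror.perform_across_services(:delete, "a.png")
threaded_result   = threaded_mirror.perform_across_services(:delete, "a.png")

p sequential_result
p threaded_result
```

Both paths return results in service order, so callers would not need to change when the flag is switched on.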

Member

I like to think of opening PRs sometimes as asking a question. Let's consider this PR the form of "should mirror service use threads?" and see what feedback we get.

I would wait for more feedback before changing it as in option 1; I was just thinking through possible scenarios.

If you don't hear back after a while feel free to reach out.

Thanks for your PR!

Member

I think we should just assume all services are thread-safe. Rails is entirely thread-safe, and we should push the community to be.

@rafaelfranca rafaelfranca merged commit 3a2ec9b into rails:main Jun 21, 2024
@heka1024 heka1024 deleted the parallel-upload-in-mirror-service branch June 23, 2024 01:42