Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change tootctl to use inline parallelization instead of Sidekiq #11776

Merged
merged 1 commit into from
Sep 10, 2019

Conversation

Gargron
Copy link
Member

@Gargron Gargron commented Sep 8, 2019

  • Add progress bar instead of printing .
  • Skip over any individual errors
  • Add -c/--concurrency flag
  • Remove --background flag

Commands affected by change (unless specified otherwise, progress bar and parallelization):

  • tootctl accounts rotate (not parallelized)
  • tootctl accounts cull
  • tootctl accounts refresh
  • tootctl accounts follow
  • tootctl accounts unfollow
  • tootctl accounts reset-relationships (not parallelized)
  • tootctl cache recount
  • tootctl domains purge
  • tootctl domains crawl
  • tootctl feeds build
  • tootctl media remove
  • tootctl preview_cards remove

Examples:

tootctl media remove --days=60 -c 20 will remove media older than 60 days, 20 at a time

@Gargron Gargron force-pushed the fix-parallelize-tootctl-media-remove branch 5 times, most recently from d4aea03 to 1d8940c Compare September 8, 2019 22:05
@Gargron Gargron changed the title Change tootctl media remove to use inline parallelization Change tootctl to use inline parallelization instead of Sidekiq Sep 8, 2019
@Gargron Gargron marked this pull request as ready for review September 8, 2019 22:07
@Gargron Gargron added the moderation Administration and moderation tooling label Sep 8, 2019
@Gargron Gargron force-pushed the fix-parallelize-tootctl-media-remove branch from 1d8940c to 4178725 Compare September 8, 2019 22:09
@ClearlyClaire
Copy link
Contributor

Won't removing the workers cause issues if someone updates their instance while such tasks are queued?

@Gargron
Copy link
Member Author

Gargron commented Sep 9, 2019

Won't removing the workers cause issues if someone updates their instance while such tasks are queued?

Since the workers are only used by the tootctl commands such timing is unlikely. I would prefer not to keep dead code around!

lib/mastodon/accounts_cli.rb Show resolved Hide resolved
lib/mastodon/accounts_cli.rb Outdated Show resolved Hide resolved
lib/mastodon/accounts_cli.rb Outdated Show resolved Hide resolved
lib/mastodon/media_cli.rb Outdated Show resolved Hide resolved
lib/mastodon/cli_helper.rb Show resolved Hide resolved
lib/mastodon/cli_helper.rb Outdated Show resolved Hide resolved
scope.reorder(nil).find_each do |item|
progress.total = futures.size + 1 if progress.total < futures.size + 1

futures << Concurrent::Future.execute(executor: pool) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, even though find_each work in batches, this code will go through all the items in the query and put them in the queue, which might not be what we want, wrt. RAM usage.

Would it be possible to reap finished jobs as soon as possible and block whenever the pool is busy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The futures start processing as soon as they are created, the map over values at the end is just to block execution until they all finish. What is stored are the futures and the return values, which are numbers in our case, not the ActiveRecord records, so it should help with memory usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still worried about it being too much if the tasks are slow. I guess it could be interesting to see how it runs for pruning old remote media on m.s

@Gargron Gargron force-pushed the fix-parallelize-tootctl-media-remove branch 2 times, most recently from 0d488ed to 1fe542e Compare September 9, 2019 15:45
@Gargron Gargron force-pushed the fix-parallelize-tootctl-media-remove branch from 1fe542e to d78dc49 Compare September 9, 2019 18:06
- Remove --background option
- Add --concurrency(=5) option
- Add progress bars
@Gargron Gargron force-pushed the fix-parallelize-tootctl-media-remove branch from d78dc49 to c87c6d5 Compare September 9, 2019 21:11
@Gargron Gargron merged commit 8674814 into master Sep 10, 2019
@Gargron Gargron deleted the fix-parallelize-tootctl-media-remove branch September 10, 2019 12:59
hiyuki2578 pushed a commit to ProjectMyosotis/mastodon that referenced this pull request Oct 2, 2019
…odon#11776)

- Remove --background option
- Add --concurrency(=5) option
- Add progress bars
rtucker referenced this pull request in vulpineclub/mastodon Jan 7, 2021
…allelization instead of Sidekiq

- Remove --background option
- Add --concurrency(=5) option
- Add progress bars
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
moderation Administration and moderation tooling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants