Add command to remove avatar and header images of inactive remote accounts from the local database #22149

Merged
merged 12 commits into from
Dec 14, 2022

Conversation

evanphilip
Contributor

@evanphilip evanphilip commented Dec 8, 2022

This implements a new sub-command for tootctl media called remove-profile-media, which removes avatar and header images of remote accounts that appear inactive from the local database.

Fixes #9567: the absence of a method to remove avatar and header images, which leads to excessive disk usage.

This PR is just a slight modification of #21066 by @dunkelstern, which was withdrawn by the author since last_webfingered_at was not a satisfactory way to determine old accounts. While imperfect, the existing tootctl accounts cull uses updated_at and last_webfingered_at. Maybe it is the best one can do?
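
The selection rule discussed here can be illustrated with a small, self-contained sketch. It assumes an account counts as inactive when both `updated_at` and `last_webfingered_at` are older than a cutoff. The `RemoteAccount` struct and the `inactive?` helper are hypothetical names used for illustration only; this is not the code from this PR or from `tootctl accounts cull`.

```ruby
# Hypothetical stand-in for a remote account row; not the Mastodon model.
RemoteAccount = Struct.new(:acct, :updated_at, :last_webfingered_at, keyword_init: true)

# Assumption: an account is "inactive" when it was neither updated nor
# webfingered after the cutoff. A missing webfinger timestamp counts as old.
def inactive?(account, cutoff)
  webfinger_stale = account.last_webfingered_at.nil? || account.last_webfingered_at < cutoff
  account.updated_at < cutoff && webfinger_stale
end

now = Time.utc(2022, 12, 8)
cutoff = now - (7 * 24 * 60 * 60) # 7 days, expressed in seconds

accounts = [
  RemoteAccount.new(acct: 'old@example.com', updated_at: Time.utc(2022, 10, 1),
                    last_webfingered_at: Time.utc(2022, 10, 2)),
  RemoteAccount.new(acct: 'fresh@example.com', updated_at: Time.utc(2022, 12, 7),
                    last_webfingered_at: Time.utc(2022, 12, 7)),
  RemoteAccount.new(acct: 'mixed@example.com', updated_at: Time.utc(2022, 10, 1),
                    last_webfingered_at: Time.utc(2022, 12, 6))
]

# Only accounts that are stale on both timestamps are selected.
puts accounts.select { |a| inactive?(a, cutoff) }.map(&:acct).inspect
```

Requiring both timestamps to be old is the conservative choice: an account that was recently webfingered but not updated (or the reverse) is left alone.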

I have also added an option which lets one keep avatars.

All credit to @dunkelstern; I am just desperate to clear my storage, and it looks like there are others like me who could use this too!

@evanphilip
Contributor Author

evanphilip commented Dec 8, 2022

Thank you @ykzts !

Contributor Author

@evanphilip evanphilip left a comment


Please note that both the linting errors

  1. Line 59 in lib/mastodon/media_cli.rb
  2. Line 324 in lib/mastodon/media_cli.rb

are unrelated to this commit.

@evanphilip
Contributor Author

Thank you @ineffyble and @ykzts for triggering the tests! I would really appreciate it if you could trigger them again since I corrected the linting errors.

Please note that both the linting errors were unrelated to this commit, but I corrected them hoping it would make the merge faster.

@evanphilip
Contributor Author

@connorshea I would really appreciate it if you could take a look at this pull request! I believe it would be a quick review for you since it is very similar to the other pull request I had made. I hope I am not being pushy.

🕯️🕯️🕯️🕯️
Next Sunday marks the 4-year anniversary of issue #9567. Though it has almost 50 participants and more than 100 comments, it is still open. I am hoping we will be able to close it soon 🙏

@Gargron
Member

Gargron commented Dec 12, 2022

Should this perhaps be a part of tootctl media remove? Why a separate command?

@evanphilip
Contributor Author

@ClearlyClaire Thank you for your suggestions! I have incorporated them.

@Gargron I kept the command separate initially just to keep the changes modular, but now I have made it part of media remove. Thank you for the suggestion!

@evanphilip evanphilip marked this pull request as draft December 13, 2022 08:51
evanphilip and others added 4 commits December 13, 2022 09:52
Co-authored-by: Claire <claire.github-309c@sitedethib.com>
Co-authored-by: Claire <claire.github-309c@sitedethib.com>
Co-authored-by: Claire <claire.github-309c@sitedethib.com>
@evanphilip evanphilip marked this pull request as ready for review December 13, 2022 09:18
@evanphilip
Contributor Author

@ClearlyClaire Thank you so much for taking the time to review the code! I would really appreciate it if you could trigger the CircleCI Checks. I am logged in to CircleCI and it has access to my fork, but last time it had to be triggered by @ineffyble.


@ClearlyClaire
Contributor

ClearlyClaire commented Dec 14, 2022

Hi, I can't see any way to trigger it on my end, but given that we don't have tests for tootctl and it's the only thing you touched, I would not worry about that.

@ghost

ghost commented Dec 22, 2022

Hi Claire, thank you very much for your follow-up!
I hit upon a good workaround: a "1-click" manual trigger for re-fetching the avatar and header.

Watching is much quicker than reading. Please see my toot below :)

https://aniyomechan.jp/system/media_attachments/files/109/556/673/488/107/880/original/ca3995c6f56886d4.mp4

What I've done is as follows.

In mastodon/app/controllers/api/v1/accounts_controller.rb, line 21:

    -     render json: @account, serializer: REST::AccountSerializer
    +     render json: @account.refresh!, serializer: REST::AccountSerializer

That's it!

This came based on what you kindly told me, again, thank you very much, Claire!
Also, my apologies, since this is not the right place for this kind of newbie question. Sorry about that!
I wish you a merry Christmas! :)

@everton137

everton137 commented Dec 26, 2022

I can't wait to get to my laptop and incorporate this into my instance. I'm the only user, and because of avatars and headers my cached media partition is growing too fast. In about one month it reached its 20 GB limit, even though I delete the other cached media daily.

Thank you @evanphilip!

@Tealk

Tealk commented Dec 27, 2022

Is that then also executed through the retention policy for cached content and media?

dariusk pushed a commit to hometown-fork/hometown that referenced this pull request Dec 28, 2022
…ounts from the local database (mastodon#22149)

* Add tootctl subcommand media remove-profile-media

* Trigger workflows

* Correcting external linting

* External linting error

* External linting fix

* Merging with remove command

* Linting

* Correct long option names

Co-authored-by: Claire <claire.github-309c@sitedethib.com>

* Correct long option names

Co-authored-by: Claire <claire.github-309c@sitedethib.com>

* Correct long option names

Co-authored-by: Claire <claire.github-309c@sitedethib.com>

* Remove saving a list of purged accounts

Co-authored-by: Claire <claire.github-309c@sitedethib.com>
dariusk added a commit to hometown-fork/hometown that referenced this pull request Dec 28, 2022
#1259)

This cherry-picks [this pull request
commit](mastodon#22149) into Hometown.
It will be coming in a future Mastodon release but we will get it early.
Basically it adds options to `tootctl media remove`:

> Removes locally cached copies of media attachments (and optionally
> profile headers and avatars) from other servers. By default, only media
> attachments are removed. The --days option specifies how old media
> attachments have to be before they are removed. In case of avatars and
> headers, it specifies how old the last webfinger request and update to
> the user has to be before they are pruned. It defaults to 7 days. If
> --prune-profiles is specified, only avatars and headers are removed. If
> --remove-headers is specified, only headers are removed. If
> --include-follows is specified along with --prune-profiles or
> --remove-headers, all non-local profiles will be pruned irrespective of
> follow status. By default, only accounts that are not followed by or
> following anyone locally are pruned.
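
The flag semantics in the description above can be summarized as a small decision function. This is only a sketch of the documented behavior, not the implementation in lib/mastodon/media_cli.rb: `removal_plan` is a hypothetical helper, and rejecting the combination of both profile flags is an assumption.

```ruby
# Hypothetical helper: maps the documented flags to what would be removed.
# Assumption: combining --prune-profiles with --remove-headers is rejected,
# because each flag is described as selecting a different mode.
def removal_plan(prune_profiles: false, remove_headers: false, include_follows: false)
  if prune_profiles && remove_headers
    raise ArgumentError, 'choose either prune_profiles or remove_headers, not both'
  end

  targets =
    if prune_profiles
      %i[avatars headers]
    elsif remove_headers
      %i[headers]
    else
      %i[media_attachments]
    end

  profile_mode = prune_profiles || remove_headers
  {
    targets: targets,
    # --include-follows only matters in the two profile modes; nil means
    # the setting does not apply when only media attachments are removed.
    skip_followed_accounts: profile_mode ? !include_follows : nil
  }
end

puts removal_plan.inspect
puts removal_plan(prune_profiles: true).inspect
puts removal_plan(remove_headers: true, include_follows: true).inspect
```

Note that in this reading there is no single invocation that removes media attachments and profile media together; each run covers one mode.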

Relates to but does not fully address #1209 because there needs to be a
web UI component, too.

Co-authored-by: Evan <35814742+evanphilip@users.noreply.github.com>
Co-authored-by: Claire <claire.github-309c@sitedethib.com>
@zeeZ

zeeZ commented Dec 31, 2022

> Is that then also executed through the retention policy for cached content and media?

No.

nametoolong pushed a commit to nametoolong/nuage that referenced this pull request Jan 12, 2023
…ounts from the local database (mastodon#22149)

@quicoto

quicoto commented Jan 20, 2023

Any brave soul willing to update the documentation to include these new params?

https://github.com/mastodon/documentation/blob/master/content/en/admin/tootctl.md

@evanphilip
Contributor Author

> Any brave soul willing to update the documentation to include these new params?
>
> https://github.com/mastodon/documentation/blob/master/content/en/admin/tootctl.md

@quicoto I assumed the CLI documentation was automatically generated from the docstring. Should have checked.

I won't have access to a full-size computer this week, so it would be great if someone could add this.

@OccultWarlock

I can't seem to get docker-compose run --rm web bin/tootctl media remove-headers to work when running via Docker:

Could not find command "remove_headers".
Could not find command "prune_profiles".

@rtxanson

rtxanson commented Jan 21, 2023

@OccultWarlock I noticed the same and then realized it's only been merged into main, but not into any of the latest releases. Here's hoping that will be soon.

edit: Ah yes, it's in the coming 4.1.0 release

@ShaunGVos

ShaunGVos commented Jan 31, 2023

> I can't seem to get docker-compose run --rm web bin/tootctl media remove-headers to work when running via Docker
>
> Could not find command "remove_headers". Could not find command "prune_profiles".

If it's any help I installed v4.1.0rc2 and ran the following commands successfully:
RAILS_ENV=production /home/mastodon/live/bin/tootctl media remove --remove_headers true
RAILS_ENV=production /home/mastodon/live/bin/tootctl media remove --prune_profiles true

@quicoto

quicoto commented Jan 31, 2023

@evanphilip A PR with the updated docs has been created; someone needs to review it:

mastodon/documentation#1172

Thank you

@trwnh
Member

trwnh commented Feb 15, 2023

i hate to raise this concern two months after it has been merged (and after it has been included in a tagged release), but looking at the way this command is called and the functionality it provides, it seems incredibly confusing. there is no clean way to specify what gets removed. it essentially has three completely different behaviors depending on which flag is provided, and the flags are mutually exclusive. additionally, using certain flags will change the meaning of other flags.

i'm not sure how best to fix this, but i would propose at minimum reworking the flags so that they are inclusive rather than exclusive, and so that you can use multiple flags together:

  • by default, removes cached status attachments
  • --include-avatars will additionally remove cached avatars
  • --include-banners will additionally remove cached banners

additionally, i would consider allowing --include-follows even on status attachments. i don't really have a suggestion for --days having two different meanings, but maybe that one is okay.
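
A minimal sketch of the inclusive scheme proposed above, for comparison with the current mutually exclusive flags. The option names come from the proposal and do not exist in tootctl; `removal_targets` is a hypothetical helper.

```ruby
# Hypothetical sketch of the proposed inclusive flags: every flag only adds
# to the set of things removed, so flags can be combined freely.
def removal_targets(include_avatars: false, include_banners: false)
  targets = [:status_attachments] # always removed, as in the default behavior
  targets << :avatars if include_avatars
  targets << :banners if include_banners
  targets
end

puts removal_targets.inspect
puts removal_targets(include_avatars: true).inspect
puts removal_targets(include_avatars: true, include_banners: true).inspect
```

Because no flag removes anything from the set, the meaning of one flag never depends on another, which is the property the proposal is after.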

@evanphilip
Contributor Author

@trwnh I am not very happy with my implementation and I agree that it is confusing. I would be glad to rework the flags myself if we can reach some consensus. Maybe we should open an issue?

I like your suggestion a lot. There would not be a way to leave media attachments untouched, but I don’t think that would be a problem.

@saschafoerster

I can confirm that it was confusing. I was looking forward to winning back a lot of space, but I had to look at the documentation several times to fully understand the scope of the different flags. After using all of the commands, I only saved about 9 GB of about 200 GB used. I had hoped to reduce it much more. Was my expectation wrong?

[Screenshot, 2023-02-15 at 22:45:21]

I had hoped that the 30 GB of remote avatars and 60 GB of remote headers would be reduced significantly.

@evanphilip
Contributor Author

evanphilip commented Feb 16, 2023

@saschafoerster Could you run the following as a test?

tootctl media remove --prune-profiles --include-follows --days 0 --dry-run

This will ONLY clear up avatars and headers, but it should clear up practically everything (since it is a dry run, it won’t actually do anything but will show what it will do). If this doesn’t clear up almost everything, something is amiss.
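
For readers unfamiliar with the pattern, a dry run typically visits every record and tallies what would be freed, but skips the destructive step. The sketch below illustrates that idea only; `CachedProfile` and `prune_profiles` are hypothetical names, and this is not how tootctl is implemented.

```ruby
# Hypothetical stand-in for an account with cached profile media (sizes in bytes).
CachedProfile = Struct.new(:acct, :avatar_bytes, :header_bytes, :purged, keyword_init: true)

# Visits every profile and tallies the bytes that would be freed. A profile
# is only marked as purged when dry_run is false.
def prune_profiles(profiles, dry_run:)
  freed = 0
  profiles.each do |profile|
    freed += profile.avatar_bytes + profile.header_bytes
    profile.purged = true unless dry_run
  end
  { visited: profiles.size, freed_bytes: freed, dry_run: dry_run }
end

profiles = [
  CachedProfile.new(acct: 'a@example.com', avatar_bytes: 100, header_bytes: 400),
  CachedProfile.new(acct: 'b@example.com', avatar_bytes: 50, header_bytes: 0)
]

puts prune_profiles(profiles, dry_run: true).inspect
puts profiles.count(&:purged) # nothing is purged during a dry run
```

The reported total is the same with or without the dry-run flag, which is what makes the dry run useful as a preview.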

@trwnh
Member

trwnh commented Feb 16, 2023

@evanphilip i opened #23628

@saschafoerster

@evanphilip I tried this (I am using a docker installation):
docker compose run --rm web bundle exec tootctl media remove --prune-profiles --include-follows --days 0 --verbose --dry-run
I get this error message:

ERROR: "tootctl media remove" was called with arguments ["--verbose"]
Usage: "tootctl media remove"

I tried it then by removing the --verbose part:

464897/465030 |====================================================================================================================================  |  ETA: ??:??:??
Visited 465027 accounts and removed profile media totaling 87,6 GB (DRY RUN)
root@vm4bonndigital:/home/mastodon# 

But then I am a bit afraid that I would be deleting things that are not reloaded on request from external servers. :)

@evanphilip
Contributor Author

@saschafoerster Thank you for your feedback! --verbose was indeed wrong, my mistake.

I would certainly not recommend setting --days to a low number, I merely wanted to know if there was a bug. You may just be federated with a lot of active people! Also note that, by default, people who are followed by or following anyone on the instance will not be touched. As ClearlyClaire mentioned above, whatever is cleared will be re-fetched safely, but

They would not be re-fetched when accessing the profiles, but when refreshing it, which occur on certain interactions, or can be re-triggered by pasting the remote user's URL in the search bar.

@Exagone313

I opened #24070 for the --verbose removal, I had my two Mastodon instances filling up the space for a month...

@evanphilip
Contributor Author

> I opened #24070 for the --verbose removal, I had my two Mastodon instances filling up the space for a month...

I did not intend to remove --verbose, I apologize for the trouble. That is indeed a bug.

Development

Successfully merging this pull request may close these issues.

tootctl media remove does not include profile avatars and headers