Any idea to stop activitypub-troll.cf or likewise attacks? #21977

Closed
bearice opened this issue Dec 3, 2022 · 27 comments · Fixed by #22025 or #22026
Labels
security Security issues and fixes, vulnerabilities suggestion Feature suggestion

Comments

@bearice
Contributor

bearice commented Dec 3, 2022

Pitch

Starting about an hour ago, my instance's federated timeline was spammed with content from activitypub-troll.cf, and the Sidekiq queue filled up with fetch requests. I had to purge all the jobs and block the domain in my DNS server.
According to its code: http://activitypub-troll.cf/ (backup: https://gist.github.com/bearice/5aa1b86cba027ee58bcbd6cb1be9b258 )
the attacker crafted an ActivityPub site that causes Mastodon to follow links in an infinite loop.

Motivation

Maybe we should limit the depth of recursion when pulling links from a status update?

@bearice bearice added the suggestion Feature suggestion label Dec 3, 2022
@bixu

bixu commented Dec 3, 2022

Also seeing this.

@bearice
Contributor Author

bearice commented Dec 3, 2022

and it left lots of garbage in the database too:

mastodon=# select count(*) from accounts where domain like '%activitypub-troll.cf';
 count
-------
 10553
(1 row)
mastodon=# select count(*) from mentions where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
 count
-------
 10541
(1 row)
mastodon=# select count(*) from statuses where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
 count
-------
  5273
(1 row)

Is there a good way to clean these up (i.e., which tables should I look at)?

EDIT:
Adding a domain block with “suspend” severity will delete all of these.

@Ghryphen

Ghryphen commented Dec 3, 2022

Yes, we as end users (non-admins) also need the ability to block the second-level domain, not just the third level. This attack uses random subdomains as instances, so you can't block them individually.

[Image: Screenshot_20221203_025134]

@ineffyble
Member

> Yes, we need the ability to block the second level domain

If you block activitypub-troll.cf it will apply to all subdomains under it.
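
To make "applies to all subdomains" concrete, here is a minimal Python sketch of label-based domain matching. It only illustrates the idea and is not Mastodon's actual implementation; the function name `is_blocked` and the example block list are assumptions made for this example.

```python
from typing import Set


def is_blocked(domain: str, blocked_domains: Set[str]) -> bool:
    """Return True if `domain` is a blocked domain or a subdomain of one."""
    labels = domain.lower().rstrip(".").split(".")
    # Check the domain itself, then each parent domain in turn:
    # a.b.example.cf -> b.example.cf -> example.cf -> cf
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


blocked = {"activitypub-troll.cf"}  # hypothetical block list
print(is_blocked("x1y2z3.activitypub-troll.cf", blocked))
print(is_blocked("notactivitypub-troll.cf", blocked))
```

Comparing whole labels means one entry covers every randomly generated subdomain, while an unrelated domain that merely ends in the same characters (such as `notactivitypub-troll.cf`) is not caught, which a plain string-suffix test would get wrong.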

@bearice
Contributor Author

bearice commented Dec 3, 2022

Yes, the block policy works on subdomains, but

> The domain block will not prevent creation of account entries in the database,

so resources are still wasted, both database space and network traffic. Changing domains is also cheap for attackers. The impact is big, especially for small instances.

@ineffyble
Member

@bearice I suspect that line is outdated or unclear. Looking at the code, a blocked domain should prevent creation of account entries in the database, unless I'm missing something.

@bearice
Contributor Author

bearice commented Dec 3, 2022

If so, this line needs to be updated:

hint: The domain block will not prevent creation of account entries in the database, but will retroactively and automatically apply specific moderation methods on those accounts.

Blocking the attacker's domain de-escalates the issue, but it turns this into a cat-and-mouse game without fixing the root cause.
We should prevent unlimited recursive fetching from happening.

@ClearlyClaire
Contributor

We are looking into the issue. In the meantime, I can confirm when remote domains are blocked with “suspend” severity, new accounts do not get created in the database.

@yingziwu

yingziwu commented Dec 3, 2022

Misskey has fixed this security vulnerability.

https://misskey.io/notes/98bjfxxwv1

[Image]

@mohe2015
Contributor

mohe2015 commented Dec 3, 2022

> Misskey has fixed this security vulnerability.
>
> [...]

The fix seems to be to add a recursion limit.

misskey-dev/misskey@66513b9
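
As a rough illustration of what a recursion limit does here, the Python sketch below follows mention links but gives up after a fixed number of hops. It is not the Misskey or Mastodon code; `resolve_mentions`, `MAX_DEPTH`, and the `endless_remote` stand-in for a hostile server are all hypothetical names invented for this example.

```python
from typing import Callable, List, Optional, Set

MAX_DEPTH = 3  # hypothetical limit, not a value taken from Mastodon or Misskey


def resolve_mentions(
    url: str,
    fetch_mentions: Callable[[str], List[str]],
    max_depth: int = MAX_DEPTH,
    depth: int = 0,
    seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Follow mention links from `url`, giving up after `max_depth` hops."""
    if seen is None:
        seen = set()
    # Stop before fetching anything that is too deep or already visited.
    if depth > max_depth or url in seen:
        return seen
    seen.add(url)
    for linked_url in fetch_mentions(url):
        resolve_mentions(linked_url, fetch_mentions, max_depth, depth + 1, seen)
    return seen


def endless_remote(url: str) -> List[str]:
    """Stand-in for a hostile server: every object mentions a new account."""
    return [url + "/next"]


fetched = resolve_mentions("https://troll.example/users/a", endless_remote)
print(len(fetched))  # MAX_DEPTH + 1 objects: depths 0 through 3
```

Without the depth check, the call against `endless_remote` would never terminate on its own; with it, the work per incoming object is bounded. The `seen` set additionally guards against simple cycles.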

@ledlamp

ledlamp commented Dec 3, 2022

why not just resolve mentioned users on-demand as needed instead of automatically once the toot is discovered?

@Skitals

Skitals commented Dec 4, 2022

Today I noticed high cpu usage on my single-user instance. Htop showed sidekiq and ffmpeg were the top cpu users. Ffmpeg was converting random png files. I loaded the sidekiq gui and the "busy" tab was being flooded with jobs from a single instance I was unfamiliar with. The instance appears legit, but I have no posts from them on my federated timeline, and I see no posts from my instance on their timeline. Within a few minutes of me noticing the activity it stopped and has been quiet since. I checked my logs and I have no records of this instance/url anywhere... if I didn't check sidekiq in real-time I would have never known this happened. The only remnants I found are two jpegs in the /tmp directory. One is a random selfie of an asian man. The other is of the Tule Lake Segregation Center, a ww2 Japanese internment camp.

"Legitimate" activitypub requests were flooding my server and causing it to convert image files, pegging my cpu.

Does anyone have any explanation for this? This seems like a serious exploit/vulnerability.

@bearice
Contributor Author

bearice commented Dec 4, 2022

> Today I noticed high cpu usage on my single-user instance. Htop showed sidekiq and ffmpeg were the top cpu users. Ffmpeg was converting random png files. I loaded the sidekiq gui and the "busy" tab was being flooded with jobs from a single instance I was unfamiliar with. The instance appears legit, but I have no posts from them on my federated timeline, and I see no posts from my instance on their timeline. Within a few minutes of me noticing the activity it stopped and has been quiet since. I checked my logs and I have no records of this instance/url anywhere... if I didn't check sidekiq in real-time I would have never known this happened. The only remnants I found are two jpegs in the /tmp directory. One is a random selfie of an asian man. The other is of the Tule Lake Segregation Center, a ww2 Japanese internment camp.
>
> "Legitimate" activitypub requests were flooding my server and causing it to convert image files, pegging my cpu.
>
> Does anyone have any explanation for this? This seems like a serious exploit/vulnerability.

I think your problem is related to #15195 but it's also not fixed yet, and that is the reason that I don't dare to add any relays on my instances.

@Skitals

Skitals commented Dec 4, 2022

> I think your problem is related to #15195 but it's also not fixed yet, and that is the reason that I don't dare to add any relays on my instances.

I don't have any relays, I am the only user, and I have no federated posts from this instance. I can't fathom why my server would be processing these images. Not to mention I've never noticed any cpu usage whatsoever with legitimate usage, and suddenly my cpu is pegged at 100% with this very questionable activity. My fear is someone is testing out something nefarious on small-fish servers.

@bearice
Contributor Author

bearice commented Dec 4, 2022

> > I think your problem is related to #15195 but it's also not fixed yet, and that is the reason that I don't dare to add any relays on my instances.
>
> I don't have any relays, I am the only user, and I have no federated posts from this instance. I can't fathom why my server would be processing these images. Not to mention I've never noticed any cpu usage whatsoever with legitimate usage, and suddenly my cpu is pegged at 100% with this very questionable activity. My fear is someone is testing out something nefarious on small-fish servers.

It's not about relays, it's about media caching: whenever Mastodon finds media in a toot (from someone you follow, from a boost, or because someone else mentions it), it downloads the file, converts it into different formats, and uploads it to media storage.
This job takes a lot of CPU time, especially for video files.

I'm running my instances on a Raspberry Pi, so any CPU-intensive work makes the instance very unstable.

@ClearlyClaire
Contributor

> why not just resolve mentioned users on-demand as needed instead of automatically once the toot is discovered?

This would require deep changes to how the data is processed, stored, and communicated through the API, as well as add delays to the “on-demand” requests. This is worth considering in the future, but that would be a breaking change (API-wise) and require a lot of consideration.

@ellispritchard

To reduce the "cat and mouse" maybe a federated domain block-list (like dnsbl.info) feature could be developed?

@Yonle

Yonle commented Dec 4, 2022

> To reduce the "cat and mouse" maybe a federated domain block-list (like dnsbl.info) feature could be developed?

Still somewhat problematic, especially if that feature is on by default.

@ne20002

ne20002 commented Dec 4, 2022

It is always a good idea to protect servers. I run a small node using Nginx as a reverse proxy.
I set an overall request limit that my node can easily handle; I chose three times the normal load.
I also set a lower limit on /inbox requests, which is 20 per source IP. The /inbox limit can be set in requests/s with a burst allowance. It is quite helpful in keeping single nodes from flooding the /inbox queue.
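
The "requests/s with burst" idea can be sketched as a per-IP token bucket. The Python below only illustrates the concept; it is not Nginx configuration and does not claim to mirror how Nginx implements its limits internally. The class name `PerIpLimiter` and the burst size of 5 are assumptions for this example; only the rate of 20 comes from the comment above.

```python
import time
from collections import defaultdict
from typing import Callable, Dict


class PerIpLimiter:
    """Token bucket per source IP: `rate` tokens/second, at most `burst` saved up."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        # Each previously unseen IP starts with a full bucket.
        self.tokens: Dict[str, float] = defaultdict(lambda: float(burst))
        self.last_seen: Dict[str, float] = {}

    def allow(self, ip: str) -> bool:
        now = self.clock()
        elapsed = now - self.last_seen.get(ip, now)
        self.last_seen[ip] = now
        # Refill for the time that has passed, capped at the burst size.
        self.tokens[ip] = min(float(self.burst), self.tokens[ip] + elapsed * self.rate)
        if self.tokens[ip] >= 1:
            self.tokens[ip] -= 1
            return True
        return False


limiter = PerIpLimiter(rate=20, burst=5)
print(limiter.allow("203.0.113.7"))  # first request from this IP is accepted
```

Each source IP gets its own bucket, so one noisy node exhausts only its own allowance and other senders are unaffected.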

@ClearlyClaire
Contributor

Note that rate-limiting incoming activities would not have helped much for this issue, which is your server being tricked into fetching a lot of remote accounts. Those are all outgoing queries from your server.

@Skitals

Skitals commented Dec 4, 2022

> it's not about relays, it's about media caching, whenever mastodon find a media in a toot (it may be from your follows, or boosted, or someone else mentions it) it will download the thing, convert it into different formats, upload it to media storage. a lot cpu time is required for this job, especially for video files.
>
> I'm running my instances on a Raspberry Pi, so any cpu intensive work will make the instance very unstable.

It is a single-user instance, and I only follow about 40 people. No one I follow has ever boosted anything from the server in question. Nothing in my media cache is unusual. My server had a bombardment of images processed that had absolutely no business being there.

I understand how the media cache is supposed to work, and that it should only include local, follows, boosted, etc. That is not what was happening. "Valid" activitypub activity was causing my instance to process a flood of images, enough to heat up my 16c/32t cpu.

@CEbbinghaus

This is off topic and not related to this issue. Can this discussion be moved into a separate issue specifically for media caching?

@ineffyble ineffyble added the security Security issues and fixes, vulnerabilities label Dec 5, 2022
@d3cline

d3cline commented Dec 5, 2022

Some observations I have had because of this attack.

  1. We can poll domains from a few places in advance. The poduptime GNU licensed project can provide the core of the 'data collection' and has a graphQL api.
  2. An app can be built which processes these domains and can post back to the mastodon API to manage the block lists dynamically or based on rules.
  3. Once this system is in place, more robust checks could be added later for content using pytorch libraries

Case in point: the TLD of most of these domains is already very suspicious for spam and other abuse. As an admin, I likely don't need .cf unless it's approved in advance. Of course, the TLD is just one check. I am curious about other checks and am digging into the RDAP spec and the actual data now.

I have a general outline of this in a repo here,
https://github.com/d3cline/fossilize

I like the idea of a core fix for recursion or something to this effect.

We will see how far I get on this idea. Figured I would post here since this thread and the attack directly caused me to work in this direction as a potential solution. I welcome any feedback there.

@afontenot
Contributor

> It is a single user instance, I only follow about 40 people. No one I follow has ever boosted anything from the server in question.

This is one aspect of this I'm confused by. The attack tricks the server into fetching recursive remote accounts; but when the server receives a message in its inbox (a) that doesn't mention any local accounts, and (b) is sent by an account that is not followed by any local user, why doesn't it just drop the message without processing any of its contents? Wouldn't that have prevented this issue? Processing every incoming message even if the server doesn't plan to do anything with it seems to open up a lot more avenues for DoS.

Now, maybe an attacker could work around this fix by mentioning one or more local users in each message, but hopefully this kind of "mentions spam" is already addressed in some other way, as it seems an obvious issue? Some kind of timeout for mentions flooding. Just speculating here.
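
The check proposed above could be sketched roughly as follows. This only illustrates the suggestion under discussion, not how Mastodon actually decides what to process; `IncomingNote`, `should_process`, and the account identifiers are hypothetical names for this example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class IncomingNote:
    sender: str                                      # remote account that sent the message
    mentions: Set[str] = field(default_factory=set)  # accounts mentioned in it


def should_process(note: IncomingNote,
                   local_accounts: Set[str],
                   followed_remote_accounts: Set[str]) -> bool:
    """Keep a message only if it mentions a local user or comes from a followed account."""
    mentions_local = bool(note.mentions & local_accounts)
    sender_followed = note.sender in followed_remote_accounts
    return mentions_local or sender_followed


local = {"alice@home.example"}
followed = {"bob@friendly.example"}

spam = IncomingNote(sender="x@sub1.troll.example",
                    mentions={"y@sub2.troll.example"})
print(should_process(spam, local, followed))
```

As the comment notes, this filter alone would not stop an attacker who mentions a local user in every message, so it would have to be combined with some other limit on mentions.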

@april83c

april83c commented Apr 9, 2023

> and it left lots of garbage in the database too:
>
> mastodon=# select count(*) from accounts where domain like '%activitypub-troll.cf';
>  count
> -------
>  10553
> (1 row)
> mastodon=# select count(*) from mentions where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
>  count
> -------
>  10541
> (1 row)
> mastodon=# select count(*) from statuses where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
>  count
> -------
>   5273
> (1 row)
>
> is there any good way to clean these (i.e., which tables should I lookup) ?
>
> EDIT: add domain block with “suspend” severity will delete all these things.

even after adding a domain block for activitypub-troll.cf, it did not go through and retroactively delete all the garbage:

mastodon=# select count(*) from accounts where domain like '%activitypub-troll.cf';
 count  
--------
 126456
(1 row)

mastodon=# select count(*) from mentions where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
 count 
-------
 21408
(1 row)

mastodon=# select count(*) from statuses where account_id in (select id from accounts where domain like '%activitypub-troll.cf');
 count 
-------
 10704
(1 row)

it seems that for subdomains, it doesn't do that?

is there a correct/proper way to clean this up? (like, if i just deleted the rows, i assume there would probably be references to them elsewhere that cause problems?)

edit: it seems that tootctl domains purge has an undocumented option --include-subdomains

@ClearlyClaire
Contributor

> even after adding a domain block for activitypub-troll.cf, it did not go through and retroactively delete all the garbage

Domain suspension does not remove account records, to not clear individual suspensions, limits, blocks and mutes in case you end up unsuspending the domain at a later date (but I guess we could change it to delete account records that do not have that kind of information stored).

Account records can be wiped per-domain using the “Purge” button in the admin interface, but it's obviously ill-suited for this case.

As you figured out, the best way to perform this kind of clean up with current Mastodon is tootctl domains purge --include-subdomains.

@phocks

phocks commented Dec 20, 2023

> > even after adding a domain block for activitypub-troll.cf, it did not go through and retroactively delete all the garbage
>
> Domain suspension does not remove account records, to not clear individual suspensions, limits, blocks and mutes in case you end up unsuspending the domain at a later date (but I guess we could change it to delete account records that do not have that kind of information stored).
>
> Account records can be wiped per-domain using the “Purge” button in the admin interface, but it's obviously ill-suited for this case.
>
> As you figured out, the best way to perform this kind of clean up with current Mastodon is tootctl domains purge --include-subdomains.

Thanks so much for this. 120,000 + subdomains removed on my single-user instance.

As user mastodon: tootctl domains purge --include-subdomains activitypub-troll.cf
