Reduce load of preview fetching on third-party servers #23662

Open

Description

Steps to reproduce the problem

Posting a link on Mastodon causes hundreds or thousands of federated servers to fetch the preview details all at once, putting a high load on the target server and creating the by-now well-known "Mastodon DDoS" effect. For static websites this may not be much of an issue, but it routinely brings down larger sites and those which serve dynamic, server-rendered content.

Expected behaviour

No DDoS

Actual behaviour

DDoS

Detailed description

The discussion at #4486 has been going on for six years and is quite a mess. I'm opening this ticket to start fresh and collect the information scattered throughout the thread into one place so that the discussion doesn't have to keep going in circles.

Rationale

It is the responsibility of software like Mastodon to be a good neighbor on the internet. DDoSing others is not being a good neighbor! It's important to figure out how to prevent this issue from occurring.

IMPORTANT: Let's make this discussion better than the last one. For Mastodon devs: don't blame the victims. It's not the website's fault that Mastodon is DDoSing them. For server operators: Mastodon is a volunteer-run free software project. Be mindful of that.

Present mitigations

Currently, after a federated server is made aware of a post which includes a link, it waits for a random jitter of between 0 and 60 seconds before fetching the preview details. This does not seem to be sufficient to prevent the DDoS effect from occurring.
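To make the effect of the jitter window concrete, here is a minimal Python sketch. It is not Mastodon's actual code: the function names are hypothetical, the delay is assumed to be uniformly distributed, and the server count in the usage note is an illustrative figure rather than a measurement.

```python
import random
from typing import Optional

JITTER_WINDOW_SECONDS = 60.0  # the 0-60 second window described above


def preview_fetch_delay(window: float = JITTER_WINDOW_SECONDS,
                        rng: Optional[random.Random] = None) -> float:
    """Pick a delay in [0, window] seconds (uniform distribution is an assumption)."""
    source = rng if rng is not None else random
    return source.uniform(0.0, window)


def average_request_rate(server_count: int,
                         window: float = JITTER_WINDOW_SECONDS) -> float:
    """Average requests per second hitting the target if every server
    fetches the preview exactly once within the window."""
    return server_count / window
```

With a hypothetical 3,000 servers, `average_request_rate(3000)` gives 50 requests per second on average over a 60-second window, and `average_request_rate(3000, 300.0)` gives 10. These are averages only; random clustering and any follow-up requests (images, redirects) would push the peaks higher.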

Suggested mitigations

The discussion has focused on two main suggestions for a fix.

Federating previews

The original poster's instance can fetch the preview details and attach them to the message, federating the preview details without requiring other servers to fetch it themselves. Criticism of this solution is mainly focused around the fact that Mastodon is a zero-trust environment, so instance A cannot trust instance B's word that a preview accurately represents the URL.

Because the original post is always fetched when a post is federated, the trust space can be reduced to the origin server alone; intermediates need not be trusted. Opinions on the depth of this problem have ranged from "it's no different from posting an image" to "we absolutely cannot trust anyone ever for any reason".

Proposed answers to the trust objections include random sampling, wherein the preview is fetched 1 in N times. If it is found to be inconsistent with the federated preview, some action can be taken: setting a flag on that instance which causes future (and past?) previews from that server to be fetched unconditionally, automatically reporting suspected foul play to the instance admins, or federating the flag so that other servers can force a sample from that instance when foul play is suggested.
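The sampling idea could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `Preview` fields, the `PreviewVerifier` class, the `fetch_preview` callable (standing in for the real HTTP fetch), the default of 1 in 20, and the choice to flag an instance on the first mismatch.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass(frozen=True)
class Preview:
    # Hypothetical preview fields; the real set of fields may differ.
    title: str
    description: str
    image_url: str


@dataclass
class PreviewVerifier:
    fetch_preview: Callable[[str], Preview]  # hypothetical: does the real fetch
    sample_rate: int = 20                    # N in "1 in N"; the value is assumed
    rng: random.Random = field(default_factory=random.Random)
    distrusted: Set[str] = field(default_factory=set)

    def resolve(self, origin: str, url: str, federated: Preview) -> Preview:
        """Return the preview to display for a post received from `origin`."""
        if origin in self.distrusted:
            # Flagged instance: always fetch the preview ourselves.
            return self.fetch_preview(url)
        if self.rng.randrange(self.sample_rate) != 0:
            # Not sampled this time: accept the federated preview as-is.
            return federated
        actual = self.fetch_preview(url)
        if actual != federated:
            # Mismatch: flag the origin so its later previews are re-fetched.
            self.distrusted.add(origin)
        return actual
```

Exact equality is the simplest possible comparison and would also flag honest differences, such as a page whose title changed between the two fetches. A real implementation would probably need a tolerance or a mismatch threshold before flagging an instance.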

Any of these changes would likely involve a slow roll-out across the fediverse. Sometimes link previews might not work for older clients as a consequence. My take: I believe this is quite acceptable; it's a small price to pay for correcting this behavior. User experience does not outweigh "don't DDoS people". Furthermore, a slow roll-out naturally implies that the problem does not get fixed overnight, but rather that the behavior corrects gradually over time as the fix is rolled out across the fediverse -- not an issue imo.

Reducing load

No federated previews, but instances don't immediately fetch the preview. Most to least effective mitigations along this line of thought:

  1. Add "show preview" button or so to the UI (or otherwise detect when the user is "interacting" with a post) and fetch the preview only then. Note that the preview only needs to be fetched once: if one user on an instance interacts with a post, the preview becomes available for all users without further interaction.
  2. Fetch robots.txt (and cache it) for that remote URL, and skip the preview if it disallows the fetch. See "Link previews should respect robots.txt" (#21738).
  3. Lazily fetch the preview only when it's shown in the UI. For smaller instances, this would reduce the load if no one is actively using the server (e.g. idling Mastodon in an unfocused tab). It would also likely reduce the need to fetch previews for posts that appear on the federated timeline.
  4. Policy-based approach, e.g. if a post comes in because a user follows the poster, fetch the preview; otherwise don't.
  5. Increase the random jitter for fetching previews from 60 seconds to some higher number.
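As an illustration of the robots.txt item in the list above, here is a small Python sketch built on the standard library's `urllib.robotparser`. The `RobotsCache` class, the `fetch_text` callable (standing in for whatever HTTP client downloads the file), and the `ExamplePreviewBot` user agent are hypothetical names, and cache expiry and error handling are left out.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExamplePreviewBot"  # hypothetical user agent string


class RobotsCache:
    """Cache one parsed robots.txt per origin (scheme plus host)."""

    def __init__(self, fetch_text: Callable[[str], str]) -> None:
        self._fetch_text = fetch_text  # hypothetical: downloads a URL as text
        self._parsers: Dict[str, RobotFileParser] = {}

    def allows(self, url: str) -> bool:
        """Return True if robots.txt permits fetching `url` for a preview."""
        parts = urlsplit(url)
        origin = f"{parts.scheme}://{parts.netloc}"
        parser = self._parsers.get(origin)
        if parser is None:
            # First link seen for this origin: fetch and parse robots.txt once.
            parser = RobotFileParser()
            parser.parse(self._fetch_text(origin + "/robots.txt").splitlines())
            self._parsers[origin] = parser
        return parser.can_fetch(USER_AGENT, url)
```

Note that this adds one request per origin for robots.txt itself, so it only reduces load when that file is cheap to serve and the cached result is reused across many links to the same site.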

Some combination of these mitigations is also possible. For instance, the jitter could be increased to five minutes, with the fetch done immediately if the post shows up in the UI.
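One way to picture that combination is the Python sketch below. It is only a sketch under stated assumptions: `LazyPreviewScheduler`, its method names, and the `fetch` callable are hypothetical, time is passed in explicitly as seconds, and the preview is represented as a plain string.

```python
import random
from typing import Callable, Dict, Optional

MAX_JITTER_SECONDS = 300.0  # the five-minute jitter suggested above


class LazyPreviewScheduler:
    """Fetch each preview at most once: after a jittered delay, or earlier
    if the post is shown in the UI first."""

    def __init__(self, fetch: Callable[[str], str],
                 rng: Optional[random.Random] = None) -> None:
        self._fetch = fetch                   # hypothetical: does the real fetch
        self._rng = rng or random.Random()
        self._due_at: Dict[str, float] = {}   # url -> scheduled fetch time
        self._cache: Dict[str, str] = {}      # url -> fetched preview

    def on_post_received(self, url: str, now: float) -> None:
        """Schedule a background fetch somewhere in the jitter window."""
        if url not in self._cache and url not in self._due_at:
            self._due_at[url] = now + self._rng.uniform(0.0, MAX_JITTER_SECONDS)

    def on_shown_in_ui(self, url: str) -> str:
        """A user is looking at the post: fetch now if not already cached."""
        return self._fetch_once(url)

    def run_due(self, now: float) -> None:
        """Fetch every preview whose jittered deadline has passed."""
        for url in [u for u, due in self._due_at.items() if due <= now]:
            self._fetch_once(url)

    def _fetch_once(self, url: str) -> str:
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        self._due_at.pop(url, None)  # cancel any pending background fetch
        return self._cache[url]
```

Because the result is cached per instance, a fetch triggered by one user's view serves every later viewer on that instance, and the background fetch is cancelled once the preview exists.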

Specifications

n/a, all versions affected
