
bitswap/server: wantlist overflow fails in a toxic manner preventing any data transfer #527

Open
Tracked by #10398
Jorropo opened this issue Dec 19, 2023 · 1 comment
Labels
kind/bug A bug in existing code (including security flaws) P1 High: Likely tackled by core team if no one steps up

Comments

@Jorropo (Contributor)

Jorropo commented Dec 19, 2023

This is a bug I introduced in 9cb5cb5 when fixing CVE-2023-25568.

Here is a screenshot of our gateway's wantlists:
[Screenshot, 2023-12-19: Grafana dashboard panel "Gateway - Main - IPFS Gateway" showing per-instance wantlist sizes]

You can see that many gateway instances have more than 1024 entries.
This causes issues with:

if wouldBe := s + uint(len(wants)); wouldBe > e.maxQueuedWantlistEntriesPerPeer {
    log.Debugw("wantlist overflow", "local", e.self, "remote", p, "would be", wouldBe)
    // truncate wantlist to avoid overflow
    available, o := bits.Sub(e.maxQueuedWantlistEntriesPerPeer, s, 0)
    if o != 0 {
        available = 0
    }
    wants = wants[:available]
}

e.peerRequestQueue.PushTasksTruncated(e.maxQueuedWantlistEntriesPerPeer, p, activeEntries...)

e.peerRequestQueue.PushTasksTruncated(e.maxQueuedWantlistEntriesPerPeer, entry.Peer, peertask.Task{

These three sections of code are needed to fix CVE-2023-25568: they make the server ignore any new queries that would overflow the 1024-entry limit. Previously, the server would remember an unbounded number of CIDs, eventually OOMing.
However, the truncation code always prefers keeping existing entries over new ones.
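
For illustration, here is a minimal standalone sketch (not the engine code itself; the 1000/100 numbers are made up) of what this right-truncation does: the entries already queued for the peer are kept, and only as many new wants as still fit under the cap are accepted, the tail of the incoming batch being silently dropped.

package main

import (
    "fmt"
    "math/bits"
)

const maxQueuedWantlistEntriesPerPeer uint = 1024

// truncateRight mirrors the current behavior: the s entries already queued are
// kept, and only as many *new* wants as still fit under the cap are accepted;
// the tail of the incoming batch is silently dropped.
func truncateRight(s uint, wants []string) []string {
    if wouldBe := s + uint(len(wants)); wouldBe > maxQueuedWantlistEntriesPerPeer {
        // available = cap - s, clamped to 0 if s already exceeds the cap (borrow != 0).
        available, borrow := bits.Sub(maxQueuedWantlistEntriesPerPeer, s, 0)
        if borrow != 0 {
            available = 0
        }
        wants = wants[:available]
    }
    return wants
}

func main() {
    // 1000 entries already queued and 100 new wants arriving: only 24 are accepted.
    fmt.Println(len(truncateRight(1000, make([]string, 100)))) // prints 24
}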

This means we can get into a situation where transfers are stuck:

  • let's say the client sends a 2048-entry wantlist.
  • out of these we only have 1/3 of the CIDs, and they are uniformly distributed.
  • the server truncates the wantlist to 1024.
  • the server starts serving these entries first; out of these, we don't have 683 CIDs.
  • we are now left with an effective wantlist size of 341. This is because the bitswap server never cleans up entries after sending DONT_HAVE.
    The point of this feature is -1 scaling: if the server is also downloading the same blocks, it might get them after having already sent DONT_HAVE, and it can then send the block or a HAVE message overriding the previous DONT_HAVE.
  • This repeats each time, shrinking the usable wantlist on this connection because the part of the wantlist the server is willing to keep fills up with CIDs it does not have.
    Eventually it reaches 0 (see the simulation sketched below the note).

(note: the client can send CANCEL or a message with the full flag and these 1024 "stuck" CIDs will be cleaned out properly, but the client isn't smart enough to realize this is happening)
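
To make the shrinkage concrete, here is a small back-of-the-envelope simulation of the scenario above (assumptions: a 1024-entry cap, 2/3 of requested CIDs missing, DONT_HAVE entries never cleaned up; the exact numbers are illustrative).

package main

import "fmt"

func main() {
    const maxEntries = 1024
    const missingRatio = 2.0 / 3.0 // we only have 1/3 of the requested CIDs

    stuck := 0 // entries pinned by DONT_HAVE answers that are never cleaned up
    for round := 1; round <= 8; round++ {
        free := maxEntries - stuck                 // slots still usable for new wants
        stuck += int(float64(free) * missingRatio) // ~2/3 of them turn into pinned DONT_HAVEs
        fmt.Printf("round %d: usable slots %d, now stuck %d\n", round, free, stuck)
    }
    // The usable window shrinks towards 0 while the stuck portion approaches 1024.
}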


I see three possible solutions:

Truncate left instead of truncating right.

This means newer entries would override older ones, instead of having older ones stick around.
My original thinking when writing the fix was that the server currently does not apply back-pressure, so we should serve existing entries first; that way we can make some progress and clear them out as we send responses. If we always cancel what is already in flight, a client that is too fast would never get any blocks, because everything would be canceled before being sent.
This could be fixed by only canceling what is already queued, but not entries that are actively being worked on.
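
A rough sketch of what that could look like, under the assumption (not present in the current code) that we can distinguish merely-queued entries from an inFlight count of entries actively being worked on; the name and signature are hypothetical.

// truncateLeft sketches option 1: when the combined wantlist would overflow,
// prefer the newest entries. Queued-but-idle entries may be evicted, while the
// inFlight entries actively being worked on are protected, so a fast client
// still gets some blocks back.
func truncateLeft(queued, inFlight uint, wants []string) (evictQueued uint, kept []string) {
    const maxEntries uint = 1024

    total := queued + inFlight + uint(len(wants))
    if total <= maxEntries {
        return 0, wants // everything fits, nothing to evict
    }
    overflow := total - maxEntries

    // First make room by evicting entries that are queued but not in flight.
    if overflow <= queued {
        return overflow, wants
    }
    evictQueued = queued
    overflow -= queued

    // Still over the cap: drop the oldest part of the incoming batch, keep the newest.
    if overflow >= uint(len(wants)) {
        return evictQueued, nil
    }
    return evictQueued, wants[overflow:]
}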

Completely rewrite the server and architect it as request-response

If we handled messages in a request-response manner, we could apply back-pressure when the client is sending too many queries.
We could still have a global CID → [PeerID] wantlist map, however this would be limited to CIDs we don't have, for -1 scaling. CIDs we already have would completely skip this flow and the queue, and be handled purely in the message handler.
We would still need a limit on how many -1-tracked CIDs we keep, but that means this bug would only impact blocks we don't have, which SGTM.
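
A rough sketch of the bookkeeping this implies (the type and method names are illustrative, not a real boxo API): only CIDs we do not have are tracked for later notification, and the table is bounded so a flood of unknown CIDs cannot grow memory without limit.

// missingWants tracks, per missing CID, which peers asked for it, with a hard cap.
type missingWants struct {
    limit int
    peers map[string]map[string]struct{} // CID -> set of peer IDs waiting on it
}

// add records that peer wants cid; it returns false when the bounded table is full,
// which by construction can only affect blocks we do not have.
func (m *missingWants) add(cid, peer string) bool {
    set, ok := m.peers[cid]
    if !ok {
        if len(m.peers) >= m.limit {
            return false
        }
        set = make(map[string]struct{})
        m.peers[cid] = set
    }
    set[peer] = struct{}{}
    return true
}

// blockReceived is called when a block arrives later: the entry is dropped and
// the peers that were waiting on it are returned so HAVEs/blocks can be sent.
func (m *missingWants) blockReceived(cid string) []string {
    waiting := m.peers[cid]
    delete(m.peers, cid)
    peers := make([]string, 0, len(waiting))
    for p := range waiting {
        peers = append(peers, p)
    }
    return peers
}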

Make the client aware and have it handle rotation itself, truncating its own wantlist and canceling the entries it overwrote.

This is nice because it does not require updating the servers, only the client, so a fixed client can download from buggy servers.
Sounds like a PITA to code and I don't really want to do it tbh. Also, this means we commit to having broken the protocol.
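
For completeness, a rough client-side sketch of this option (a hypothetical helper, not the real bitswap client API): the client enforces the 1024-entry window itself and, when it rotates a new CID in, explicitly CANCELs the entry it overwrote so the server's queue never silts up with stale entries.

// clientWantWindow keeps at most limit CIDs outstanding, oldest first.
type clientWantWindow struct {
    limit int
    order []string // CIDs in the order they were sent
}

// add returns the CID that must be CANCELed on the wire before sending the new
// WANT, if the window was already full.
func (w *clientWantWindow) add(cid string) (cancel string, evicted bool) {
    if w.limit > 0 && len(w.order) >= w.limit {
        cancel, w.order = w.order[0], w.order[1:]
        evicted = true
    }
    w.order = append(w.order, cid)
    return cancel, evicted
}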

@Jorropo Jorropo added need/triage Needs initial labeling and prioritization P1 High: Likely tackled by core team if no one steps up kind/bug A bug in existing code (including security flaws) and removed need/triage Needs initial labeling and prioritization labels Dec 19, 2023
@Jorropo Jorropo self-assigned this Dec 19, 2023
@Jorropo Jorropo changed the title Bitswap server fails in a toxic manner when the client exceeds 1024 entries in the wantlist by a *lot* preventing any data transfer bitswap/server: wantlist overflow fails in a toxic manner preventing any data transfer Dec 19, 2023
@BigLep BigLep mentioned this issue Jan 3, 2024
3 tasks
@hacdias hacdias mentioned this issue Jan 16, 2024
8 tasks
@lidel (Member)

lidel commented Feb 19, 2024

I don't think it was written down anywhere, but @aschmahmann and I had a verbal conversation
that, as a compromise to avoid rewriting everything, we should try:

  • implementing "Truncate left instead of truncating right" (make new entries override old ones)
    • with a metric that shows how full the queue is (if we don't already have one)
    • with a configuration option to override the default MaxQueuedWantlistEntriesPerPeer (allowing operators to tune their setup so the queue matches increased load); a rough sketch of both follow-ups is below
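
A hypothetical sketch of what those two follow-ups could look like (the metric name, buckets, config field, and helper below are assumptions, not the actual boxo/Kubo identifiers):

package sketch

import "github.com/prometheus/client_golang/prometheus"

// Histogram an operator could graph to see how full per-peer queues get,
// as a fraction of the configured maximum (assumed metric name and buckets).
var wantlistQueueFullness = prometheus.NewHistogram(prometheus.HistogramOpts{
    Name:    "bitswap_server_wantlist_fullness_ratio",
    Help:    "Queued wantlist entries for a peer divided by the configured per-peer maximum.",
    Buckets: prometheus.LinearBuckets(0, 0.1, 11), // 0.0, 0.1, ..., 1.0
})

// serverConfig shows where an operator-facing override of the default
// 1024-entry cap could live.
type serverConfig struct {
    MaxQueuedWantlistEntriesPerPeer uint
}

// recordFullness would be called whenever a peer's queue size changes.
func recordFullness(cfg serverConfig, queued uint) {
    wantlistQueueFullness.Observe(float64(queued) / float64(cfg.MaxQueuedWantlistEntriesPerPeer))
}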

@hacdias hacdias mentioned this issue Mar 1, 2024
11 tasks
@Jorropo Jorropo removed their assignment Mar 4, 2024
@hacdias hacdias mentioned this issue Apr 11, 2024
5 tasks