[GossipSub 1.2] IDONTWANT control message #548

Merged
merged 11 commits into from
May 15, 2024
102 changes: 102 additions & 0 deletions pubsub/gossipsub/gossipsub-v1.2.md
@@ -0,0 +1,102 @@
# Gossipsub v1.2

# Overview

This document aims to provide a minimal extension to the [gossipsub
v1.1](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md)
protocol.

The proposed extensions are backwards-compatible and aim to enhance the
efficiency (minimize amplification/duplicates and decrease message latency) of
the gossip mesh networks for larger messages.

In more specific terms, a new control message is introduced: `DONTSEND`. It's primarily
intended to notify mesh peers that the node already received a message and there is no
need to send its duplicate.

# Specification

## Protocol Id

Nodes that support this Gossipsub extension should additionally advertise the
version number `1.2.0`. Gossipsub nodes can advertise their own protocol-id
prefix; by default this is `meshsub`, giving the default protocol id:
- `/meshsub/1.2.0`
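
As a rough illustration of "additionally advertise", the sketch below lists the new protocol id ahead of the older ones and picks the first id both sides support. The helper names `protocol_ids` and `negotiate` are hypothetical, and real negotiation is performed by libp2p's protocol negotiation layer rather than by code like this.

```python
from typing import List, Optional


def protocol_ids(prefix: str = "meshsub") -> List[str]:
    """Protocol ids a v1.2 node could advertise, newest first (illustrative only)."""
    versions = ["1.2.0", "1.1.0", "1.0.0"]
    return [f"/{prefix}/{version}" for version in versions]


def negotiate(local: List[str], remote: List[str]) -> Optional[str]:
    """Return the first locally preferred id that the remote also supports, else None."""
    remote_ids = set(remote)
    for protocol_id in local:
        if protocol_id in remote_ids:
            return protocol_id
    return None
```

A v1.2 node talking to a peer that only advertises `/meshsub/1.1.0` would fall back to that id, which is what makes the extension backwards-compatible.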

## Parameters

This section lists the configuration parameters that need to be agreed on across clients to avoid
peer penalization.

| Parameter | Description | Reasonable Default |
|-------------------------|------------------------------------------------------------------|--------------|
| `max_dontsend_messages` | The maximum number of `DONTSEND` messages per heartbeat per peer | ??? |
Contributor

@Menduist Menduist May 30, 2023

Is that necessary? We can just descore peer if they send duplicate DONTSENDs, or we don't eventually see the message id
GossipSub already has too many parameters

Contributor Author

> or we don't eventually see the message id

Sounds like a good idea 👍

Contributor

> or we don't eventually see the message id

Considering DONTSEND is allowed to be sent before validation, a peer could be downscored if the message it sent DONTSEND for turns out to be invalid

Contributor

Do we have numbers on how much we can gain by allowing to send DONTSENDs before validation? (ie, how long the validation is in practice)

Contributor

I think we can avoid the downscoring by keeping a bounded cache, that gets overfilled to /dev/null.

We probably don't need a parameter for this, each peer can configure appropriately according to the expected message rate.

If we do want to downscore excessive rates of IDONTWANT, then we should validate first or else we open the door for a spam attack.

Contributor

@vyzo vyzo May 30, 2023

Note that validation in some networks can be slow, so there is real benefit by sending early.



## DONTSEND Message

### Basic scenario

When a peer receives the first instance of a message, it immediately broadcasts `DONTSEND` with the
`messageId` to all its mesh peers, rather than queuing it for later piggybacking.
This could be performed prior to message validation to further increase the effectiveness of the approach.
Contributor

Concerns about spam attacks triggering amplified IDONTWANT spam?

Contributor Author

This doesn't look like a feasible attack vector to me:

  • IDONTWANT is primarily intended for larger messages, so the cumulative size of resulting IDONTWANT messages is expected to be significantly smaller than the original message
  • If an attacker is sending invalid messages to initiate IDONTWANT spamming it would be pretty quickly banned due to negative scoring
  • And we have max_idontwant_messages limit as the last resort


On the other side, a node maintains a per-peer `dont_send_message_ids` set. Upon receiving `DONTSEND` from
a peer, the node adds the `messageId` to that peer's `dont_send_message_ids` set.
When later relaying the message with that `messageId` to the mesh, the node may skip the peers whose `dont_send_message_ids` set contains it.

Old entries from `dont_send_message_ids` could be pruned during heartbeat processing.
The prune strategy is outside of the spec scope and can be decided by implementations.
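
The receiver-side bookkeeping described above could look roughly like the following sketch. The class `DontSendTracker`, its method names, and the time-based pruning with a `ttl_seconds` window are all assumptions made for illustration; the spec leaves the prune strategy to implementations.

```python
import time
from collections import defaultdict


class DontSendTracker:
    """Per-peer record of message ids a peer asked us not to send (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 3.0):
        # Assumed retention window; the spec does not define one.
        self.ttl_seconds = ttl_seconds
        # peer id -> {message id -> time the DONTSEND was received}
        self._dont_send_message_ids = defaultdict(dict)

    def on_dontsend(self, peer_id, message_id, now=None):
        """Record a DONTSEND received from peer_id for message_id."""
        received_at = time.monotonic() if now is None else now
        self._dont_send_message_ids[peer_id][message_id] = received_at

    def peers_to_relay(self, mesh_peers, message_id):
        """Mesh peers that have not announced DONTSEND for this message id."""
        return [
            peer for peer in mesh_peers
            if message_id not in self._dont_send_message_ids.get(peer, {})
        ]

    def on_heartbeat(self, now=None):
        """Prune entries older than ttl_seconds (one possible prune strategy)."""
        now = time.monotonic() if now is None else now
        for peer_id in list(self._dont_send_message_ids):
            entries = self._dont_send_message_ids[peer_id]
            fresh = {m: t for m, t in entries.items() if now - t < self.ttl_seconds}
            if fresh:
                self._dont_send_message_ids[peer_id] = fresh
            else:
                del self._dont_send_message_ids[peer_id]
```

Storing a timestamp per entry keeps pruning independent of the heartbeat interval; an implementation could just as well rotate a fixed number of per-heartbeat buckets.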

The `DONTSEND` message is _optional_ for both receiver and sender, i.e. the sender may or may not utilize
this message. The receiver in turn may ignore `DONTSEND`: sending a message after the corresponding `DONTSEND`
should not be penalized.

`DONTSEND` may have a negative effect on small messages, as it may increase the overall traffic and CPU load.
Thus it is better to utilize `DONTSEND` for larger messages.
The exact policy for applying `DONTSEND` is outside of the spec scope; every implementation may choose whatever
is most appropriate for it. Possible options are to either choose a message size threshold and broadcast `DONTSEND`
on a per-message basis when the size is exceeded, or to use `DONTSEND` for all messages on selected topics.
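
The two policy options mentioned above can be sketched as a single decision function. The class `DontSendPolicy`, the 1024-byte threshold, and the topic names in the usage note are placeholders chosen for illustration, not values from the spec; combining both options in one policy is also just one possible choice.

```python
from dataclasses import dataclass


@dataclass
class DontSendPolicy:
    """Decides whether to announce DONTSEND for a received message (hypothetical)."""

    size_threshold_bytes: int = 1024        # assumed placeholder value
    always_topics: frozenset = frozenset()  # topics where every message is announced

    def should_announce(self, topic: str, message_size: int) -> bool:
        # Option 2: announce for all messages on selected topics.
        if topic in self.always_topics:
            return True
        # Option 1: announce only when the size threshold is exceeded.
        return message_size > self.size_threshold_bytes
```

For example, `DontSendPolicy(always_topics=frozenset({"blocks"}))` would announce every message on a hypothetical `blocks` topic and only messages larger than 1024 bytes elsewhere.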

To prevent DoS, the number of `DONTSEND` control messages is limited to `max_dontsend_messages` per heartbeat per peer.
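
One way to enforce such a limit is a per-peer counter that resets on every heartbeat, as in the sketch below. `DontSendRateLimiter` is a hypothetical name, the limit value must be supplied by the caller since the spec draft gives no default, and silently ignoring the excess (rather than descoring the peer) is an assumption of this sketch.

```python
from collections import Counter


class DontSendRateLimiter:
    """Caps accepted DONTSEND messages per peer per heartbeat (hypothetical helper)."""

    def __init__(self, max_dontsend_messages: int):
        self.max_dontsend_messages = max_dontsend_messages
        self._accepted = Counter()  # peer id -> DONTSENDs accepted this heartbeat

    def allow(self, peer_id) -> bool:
        """Return True if this DONTSEND should be processed, False if it exceeds the cap."""
        if self._accepted[peer_id] >= self.max_dontsend_messages:
            return False
        self._accepted[peer_id] += 1
        return True

    def on_heartbeat(self):
        """Start a new accounting window."""
        self._accepted.clear()
```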

### Relying on `IHAVE`s

Another potential strategy is as follows. If a node receives `IHAVE` (from one or more peers)
before the message has appeared in the mesh, the node may request the message with `IWANT` and notify all mesh
peers that it doesn't want that message from them.

### Sending `IHAVE` to mesh peers who choked that particular message

A reasonable addition to the latter scenario would be to _immediately_ send `IHAVE` instead of the full message
to those mesh peers who reported `DONTSEND`. That would notify mesh peers that the node has this message,
so they could request it from the node in case their `IWANT` requests from the previous scenario fail.
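
A minimal sketch of that relay decision, assuming the node already knows which mesh peers reported `DONTSEND` for the message; the function name `plan_relay` is hypothetical.

```python
def plan_relay(mesh_peers, dont_send_peers):
    """Split mesh peers into those that get the full message and those that get IHAVE.

    dont_send_peers is the set of peers that reported DONTSEND for this message id.
    """
    full_message_peers, ihave_peers = [], []
    for peer in mesh_peers:
        if peer in dont_send_peers:
            ihave_peers.append(peer)         # announce availability only
        else:
            full_message_peers.append(peer)  # relay the message as usual
    return full_message_peers, ihave_peers
```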

### Cancelling `IWANT`

If a node requested a message via `IWANT` and then happens to receive the message from another peer, it may
try to cancel its `IWANT` requests with the corresponding `DONTSEND` message. This may work in cases where a
peer delays/queues `IWANT` requests: the `IWANT` request would be removed from the queue if not yet processed.
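
On the serving side, cancellation only helps if pending `IWANT` requests sit in a queue that can be edited. The sketch below assumes such a queue exists; `IWantQueue` and its methods are hypothetical and not part of the spec.

```python
from collections import deque


class IWantQueue:
    """Pending IWANT requests on the serving peer (hypothetical structure)."""

    def __init__(self):
        self._pending = deque()  # (peer id, message id) in arrival order

    def enqueue(self, peer_id, message_id):
        self._pending.append((peer_id, message_id))

    def on_dontsend(self, peer_id, message_id) -> bool:
        """Drop one queued request from peer_id for message_id; report whether one was removed."""
        try:
            self._pending.remove((peer_id, message_id))
            return True
        except ValueError:
            # Nothing queued: the request was already served or never made.
            return False

    def next_request(self):
        """Pop the oldest pending request, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None
```

If the request was already served before the `DONTSEND` arrived, the cancellation is simply a no-op, which matches the optional nature of the message.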

## Protobuf Extension

The protobuf messages are identical to those specified in the [gossipsub v1.0.0
specification](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md)
with the following control message modifications:

```protobuf
message RPC {
  // ... see definition in the gossipsub specification
}

message ControlMessage {
  // messages from v1.0
  repeated ControlDontSend dontSend = 5;
}

message ControlDontSend {
  required bytes messageID = 1;
Contributor

Just for discussion, what would be the pros/cons of also including the topic?
I think it's the first time that we reference messages only by their id on the wire, so needs some consideration

I guess the only con of including the topic is more bandwidth usage

Contributor

maybe an optional field?
Agreed about bandwidth usage, we should aim to keep this lean.

Contributor Author

IHAVE also has an optional topic field. But it looks like it is utilized by one implementation only (not sure which one exactly). I couldn't find any reasonable usage of topic for IDONTWANT tbh.

Contributor Author

If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit

Contributor

> If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit

The usage can be #548 (comment)

}

```