Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multiline] Multiline message batches #398

Merged
merged 13 commits into from Apr 27, 2020
Merged

[multiline] Multiline message batches #398

merged 13 commits into from Apr 27, 2020

Conversation

jwheare
Copy link
Member

@jwheare jwheare commented Sep 23, 2019

A solution to multiline messages (#208) using client to server batches based on https://gist.github.com/jesopo/092866e52b40cda6f6205263d04dbcf9


Currently implemented and testable with the IRCCloud client and servers (prattle: teams, bnc, slack) and bitbot.


Client sending a mutliline batch

Client: BATCH +123 draft/multiline #channel
Client: @batch=123 PRIVMSG #channel hello
Client: @batch=123 privmsg #channel :how is<SPACE>
Client: @batch=123;draft/multiline-concat PRIVMSG #channel :everyone?
Client: BATCH -123

Server sending the same batch

Server: @msgid=xxx;account=account :n!u@h BATCH +123 draft/multiline #channel
Server: @batch=123 :n!u@h PRIVMSG #channel hello
Server: @batch=123 :n!u@h PRIVMSG #channel :how is<SPACE>
Server: @batch=123;draft/multiline-concat :n!u@h PRIVMSG #channel :everyone?
Server: BATCH -123

Server sending messages to clients without multiline support

Server: @msgid=xxx;account=account :n!u@h PRIVMSG #channel hello
Server: @account=account :n!u@h PRIVMSG #channel :how is<SPACE>
Server: @account=account :n!u@h PRIVMSG #channel :everyone?

Final concatenated message

hello
how is everyone?

@jwheare jwheare added this to the Roadmap milestone Sep 23, 2019
@jwheare jwheare mentioned this pull request Sep 23, 2019

### Message ids ([spec][message ids])

Servers MUST only include a message ID on the first message of a batch when sending a fallback to non-supporting clients.
Copy link
Contributor

@progval progval Sep 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the first one? If the message ID is used eg for a reply or a reaction emoji, it visually makes more sense to have it attached to the last message.

Copy link
Contributor

@slingamn slingamn Nov 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this is saying that (some of) the fallback PRIVMSG should be sent without any msgid tag at all? This seems to upend some expectations (e.g., msgids being available for deduplication purposes).

Copy link
Contributor

@slingamn slingamn Nov 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For comparison, oragono.io/maxline-2 generates n+1 distinct message IDs for the original message and all n lines of the split message. This isn't perfect, but it maintains the expectation that PRIVMSG should have an ID.

The msgid spec says: "If this tag is being used, it SHOULD be attached to all PRIVMSG and NOTICE events."

Copy link
Member Author

@jwheare jwheare Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msgids can't appear more than once. Generating unique n+1 msgids and only sending them on the fallback creates undesirable situations where people might reply to a fallback message and the context would then be missing for people who fully support maxline (i.e. discard fallback message tags)

If you're concerned about missing msgids, the solution would be to properly support the maxline batch spec.

Copy link
Contributor

@edk0 edk0 Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's true. It'd be easy to generate unique fallback msgids that the server can recognise as such and translate back or refuse replies for or something.

Copy link
Contributor

@slingamn slingamn Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jwheare, just to be clear, when you say "maxline" you're referring to the present multiline spec, not to oragono.io/maxline-2?

Some degradation of experience is inevitable under any proposal to support PRIVMSG over 512 bytes. I'm much less concerned about the case you're describing than I am about sending PRIVMSG without IDs, which seems to complicate the handling of chat history at a more fundamental level.

I'm not 100% sold on @edk0's suggestion because it violates an expectation I have that servers shouldn't need to care about client-only tags, but I could be convinced.

Hypothetically, we could define a new tag like draft/fmsgid ("fallback msgid") for transmitting the fallback msgids to clients that support the multiline spec.

Copy link
Contributor

@edk0 edk0 Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to care about client-only tags anyway, so it's not a bother for me. You don't have to implement my suggestion; I'm just saying a solution is possible.

@slingamn
Copy link
Contributor

@slingamn slingamn commented Dec 20, 2019

Are there any constraints on which messages in the client batch can carry client-only tags, and what those tags can be?

@jwheare
Copy link
Member Author

@jwheare jwheare commented Dec 20, 2019

The way I've implemented this in the client, the only tags within the batch that I use are the @draft/multiline-concat, every other tag must appear on the opening BATCH message to be recognised.

Any tags that would have been added to the batch, e.g. message IDs, account tags etc MUST be included on the first message line. Tags MAY also be included on subsequent lines where it makes sense to do so.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Dec 20, 2019

Actually that wording is from the fallback handling and refers to unbatched messages. Could use some clearer guidance on non-fallback BATCHed messages and where to include the tag, though I feel it is implied by the fact that the BATCH indicates a single logical message.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Dec 20, 2019

Oh it is mentioned actually:

Clients MUST only send non-batch tags on the opening BATCH command when sending multiline batches.

That means only @batch and @multiline-concat are allowed on messages within the batch.

extensions/multiline.md Outdated Show resolved Hide resolved

This section is non-normative.

In the context of flood protection, keep in mind that clients might send an entire batched multiline message all at once.
Copy link
Contributor

@slingamn slingamn Dec 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought this through fully, but there's an issue here with sendqs. If the sendq is small relative to the value of max-bytes, you can DoS clients by completing multiple multiline batches at once and filling up their sendqs.

Copy link
Contributor

@slingamn slingamn Dec 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on a clarifying question from #ircv3: yeah, client A can try to disconnect client B from a server by filling up its sendq. Normally, flood control is a check on this. It's not clear what the best practices are for applying flood control to multiline messages --- the way to get parity with pre-multiline behavior w.r.t. the sendq would be to apply artificial delays to the outgoing batch.

Copy link
Contributor

@slingamn slingamn Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If max-lines is not set, the worst case for a DoS on the sendq is to send max-bytes lines, with a single byte on each line, and multiline-concat on every line after the first. Assuming the longest possible nickmask, the total size of the relayed message will be about 100 times max-bytes. Even for a conservative value of max-bytes, this should be large enough to overwhelm both the sendq and the TCP send buffer (I think it's typically around 85kB on modern Linux machines).

extensions/multiline.md Outdated Show resolved Hide resolved
@slingamn
Copy link
Contributor

@slingamn slingamn commented Dec 23, 2019

re. interactions between multiline and labeled-response: on which message does the client send the label tag, the BATCH + or the BATCH - line?

@jwheare
Copy link
Member Author

@jwheare jwheare commented Dec 23, 2019

BATCH +

@slingamn
Copy link
Contributor

@slingamn slingamn commented Dec 27, 2019

I have a (likely buggy) implementation of this up on testnet.oragono.io. It uses fallback msgids (each individual message in the multiline batch has an alternative msgid, which is sent to multiline-capable clients under the draft/fmsgid tag, and to legacy clients under the msgid tag).

Please report any bugs at https://github.com/oragono/oragono :-)

@xPaw
Copy link

@xPaw xPaw commented Jan 6, 2020

Do I understand correctly that the <SPACE> at the end of the line in these examples means there shouldn't be a new line inserted in the concatenated message?

foreach msg in messages
 concat_msg += msg
 if msg does not end with space
  concat_msg += "\n"

This is almost explained in splitting long lines, but for the sake of verbosity, it probably warrants an explicit explanation.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Jan 6, 2020

No, the draft/multiline-concat tag indicates that. The <SPACE> is there because without a newline, the words are separated with a space.

@xPaw
Copy link

@xPaw xPaw commented Jan 6, 2020

Oh duh, I completely missed the tag there.

The "leave the space character at the end of the line. This ensures that concatenated lines don't lose the space" paragraph mislead me. I do see why it's this way for fallback messages though.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Jan 6, 2020

Any suggestions for clarifying that paragraph to make it less misleading?

@xPaw
Copy link

@xPaw xPaw commented Jan 6, 2020

Probably mentioning the tag there would help clarify it.

I can't think of an example where "This ensures that concatenated lines don't lose the space" is relevant besides buggy code that trims the messages?

@jwheare
Copy link
Member Author

@jwheare jwheare commented Jan 17, 2020

Any outstanding comments before we merge this as a draft?

jesopo
jesopo approved these changes Jan 17, 2020
@slingamn
Copy link
Contributor

@slingamn slingamn commented Jan 17, 2020

Two issues I raised haven't been addressed:

  1. The question of "fallback msgids"
  2. Denial-of-service attacks that exhaust the sendq

@jwheare
Copy link
Member Author

@jwheare jwheare commented Jan 17, 2020

  1. I'm not in favour of them.
  2. Any suggested wording that should be in the spec to address that?

@slingamn
Copy link
Contributor

@slingamn slingamn commented Feb 7, 2020

These messages would only be sent to fully compliant clients, which in any case need to take the batch tag into account when interpreting the message. So there should be no increase in client complexity over the baseline required to implement multiline. As for server complexity, it's a MAY, so servers don't have to implement it if they don't want it.

The potential benefit is a 60% reduction in the size of relayed messages, which seems worthwhile to me.

@kythyria
Copy link
Contributor

@kythyria kythyria commented Feb 7, 2020

Compromise: Put only the nick as the sender. Could even put * as the sender, though that's perhaps inconsistent with other extensions using * to mean "yourself".

@slingamn
Copy link
Contributor

@slingamn slingamn commented Feb 7, 2020

Implementations might want to use the target for routing messages (in addition to the batch tag), so I withdraw my proposal to send * as the target.

I still think we should allow servers to omit the source nickmask: this is syntactically unproblematic, and should be semantically unproblematic for clients as well (at least one implementation supports this "by default", since sources for messages within the relayed S2C batch are ignored).

A lot of numbers have been bandied back and forth, so here's what I think is a realistic estimate: for a 100-line code snippet with an average line length of 60 characters, relayed to a channel with an 8-character name by a user with a 40-character nickmask, I compute a 30% reduction in wire size.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 2, 2020

Did we decide if this needs edits before merging as draft?

@slingamn
Copy link
Contributor

@slingamn slingamn commented Apr 2, 2020

  1. CRLF is still incorrectly specified as \n\r
  2. No action has been taken on my suggested "security considerations" section (proposed language)
  3. I am still holding out hope for including "servers MAY omit the source nickmask on PRIVMSG within the batch" --- is this proposal definitely vetoed? I haven't really heard compelling arguments against it, other than the convention that lines without a source are sent by the local server. Since this change would only affect clients with full support for multiline, this doesn't seem that persuasive to me.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 2, 2020

I’ll take a look at 1 and 2.

3: I think it comes under the same “not worth the micro optimisation” justification as the other stuff on this theme.

@slingamn
Copy link
Contributor

@slingamn slingamn commented Apr 2, 2020

The other proposed optimizations had stronger arguments against them. I think this one is safer than the other proposals, and achieves most of their possible gains.

@jesopo
Copy link

@jesopo jesopo commented Apr 2, 2020

I find the inclusion of entirely redundant information (:source) to make this look and feel strange. I'd much prefer we totally omit this, even if I ignore the fact that it is an optimisation.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 22, 2020

@slingamn those changes should cover your 1 and 2 suggestions.

I still think we need some sort of guidelines for configuring limits, but I'm struggling to arrive at something that's sensible, future-proof, not overly restrictive etc.

I tried plotting some numbers but it didn't really help.

Any ideas?

@slingamn
Copy link
Contributor

@slingamn slingamn commented Apr 22, 2020

The proposed language in my earlier comment is aimed more at drawing attention to the issue, rather than suggesting concrete numbers (which I agree is a difficult problem).

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 22, 2020

Yes I added a version of your proposal in ff7f73b but I still feel like something more is needed.

@slingamn
Copy link
Contributor

@slingamn slingamn commented Apr 22, 2020

Oh, sorry I missed that. I don't think I have much insight into this question, other than the suggestion of "envelope math", which you already tried.

It sounds like there is an upwards constraint on these limits (supporting all realistic use cases for the feature) and a downwards constraint (blocking abuse), and probably the best way to get more information about both of these constraints is simply to move forward with the specifications process (i.e., make it an official draft, if that's the correct next step) and see what happens in the wild. This is because:

a. Any guidance about limits we add later will be in a "non-normative" section of the spec
b. These limits will typically be configurable, not hardcoded, so operators will be able to adopt new recommendations without requiring any code changes from developers

p.s. In terms of moving forwards on the specifications process, I haven't given up on item 3, but I don't really have any further arguments for it beyond what I've already stated.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 22, 2020

OK, makes sense re: further limits guidance.

I think in the absence of consensus over how to optimise these messages, again we should just proceed to merged draft and see if anything comes out of implementations.

At the end of the day, the unbatched fallback messages will never be able to be optimised, so implementations being able to cope without it is effectively a requirement.

@slingamn
Copy link
Contributor

@slingamn slingamn commented Apr 22, 2020

That sounds fair.

@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 23, 2020

OK will merge if no further comments are forthcoming.

extensions/multiline.md Outdated Show resolved Hide resolved
extensions/multiline.md Outdated Show resolved Hide resolved
extensions/multiline.md Outdated Show resolved Hide resolved
extensions/multiline.md Show resolved Hide resolved
extensions/multiline.md Outdated Show resolved Hide resolved
@jwheare
Copy link
Member Author

@jwheare jwheare commented Apr 27, 2020

@DarthGandalf all good?

@DarthGandalf
Copy link
Member

@DarthGandalf DarthGandalf commented Apr 27, 2020

@DarthGandalf all good?

For draft, no further comments.

@jwheare jwheare merged commit eec2f95 into ircv3:master Apr 27, 2020
@jwheare jwheare deleted the multiline branch Apr 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants