Allow increased IRC message lengths #281

Closed. Wants to merge 8 commits into base: master.

@DanielOaks
Member

DanielOaks commented Nov 18, 2016

One of the big issues we want to solve is that IRC lines are capped at 512 octets. This... works, but it would be very nice to allow for longer messages and things like longer topics without needing to implement dodgy hacks for every single command we want to allow longer lengths on.

This should ensure that things stay 100% backwards compatible and work correctly for clients that do not support longer lines, while allowing more up-to-date clients to negotiate the longer message length allowed by the server.

This is gonna be something that is a bit controversial, but it would be extremely useful to allow and something we've been looking at for a while.

Shawn-Smith commented Nov 18, 2016

Replace every instance of octets with chars or characters

Other than that, seems like a good addition. Variable line lengths are something I've been wanting for a while now.

Member

DarthGandalf commented Nov 18, 2016

@Shawn-Smith every character can be more than one octet, especially as UTF-8 is recommended. The RFC talks about octets (bytes), not about Unicode characters.
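
To make the octet/character distinction concrete, here is a minimal sketch, assuming the text is sent as UTF-8 (the thread only says UTF-8 is recommended). The helper name `utf8_octets` is made up for illustration and is not part of any spec.

```python
# Character count vs. octet count for the same text, assuming UTF-8 on the wire.

def utf8_octets(text: str) -> int:
    """Return how many octets `text` occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Escapes are used so the exact code points are unambiguous.
samples = ["hello", "h\u00e9llo", "\u65e5\u672c\u8a9e", "\U0001F44D"]

for sample in samples:
    print(f"{sample!r}: {len(sample)} characters, {utf8_octets(sample)} octets")
```

A limit of 510 "characters" and a limit of 510 octets only coincide for text where every character encodes to a single octet, such as plain ASCII.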

Contributor

clokep commented Nov 18, 2016

Any thoughts on how the server is supposed to split the text? We (Instantbird and Thunderbird) split a user's message on the closest space that makes the message small enough to send (if there's no space, we split in the middle of a "word"). This is somewhat sub-optimal, though, depending on the language. If I recall correctly, certain spaces in French grammar are essentially the 'middle' of a word.

Probably out of scope for the specification, but figured I'd ask.

Also 👍 on being clear about octets vs. characters.
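
The split-on-closest-space behaviour described above could look roughly like the sketch below. It is not Instantbird's or Thunderbird's actual code: the function name `split_on_spaces` is hypothetical, UTF-8 is assumed, and the limit is counted in octets.

```python
def split_on_spaces(text: str, max_octets: int) -> list[str]:
    """Split `text` into chunks whose UTF-8 encoding fits in `max_octets`.

    Prefers the last space that keeps a chunk within the limit. If there is
    no usable space, it cuts mid-word, but never inside a multi-byte UTF-8
    sequence.
    """
    if max_octets < 4:
        # A single UTF-8 sequence can be up to 4 octets long.
        raise ValueError("max_octets must be at least 4")
    chunks: list[str] = []
    data = text.encode("utf-8")
    while len(data) > max_octets:
        # Last space at an index <= max_octets, so the chunk before it fits.
        cut = data.rfind(b" ", 0, max_octets + 1)
        if cut <= 0:
            # No usable space: cut mid-word, backing off continuation bytes
            # (0b10xxxxxx) so a character is never split in half.
            cut = max_octets
            while (data[cut] & 0xC0) == 0x80:
                cut -= 1
            chunks.append(data[:cut].decode("utf-8"))
            data = data[cut:]
        else:
            chunks.append(data[:cut].decode("utf-8"))
            data = data[cut + 1:]  # drop the space that was split on
    if data:
        chunks.append(data.decode("utf-8"))
    return chunks
```

Note that this drops the space it splits on, so the pieces are not a byte-for-byte copy of the original, and it knows nothing about language-specific word boundaries, which is exactly the weakness raised above.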

Shawn-Smith commented Nov 18, 2016

@DarthGandalf IRC line length goes by number of characters regardless of how many octets it takes per character.

IRC messages are always lines of characters terminated with a CR-LF
   (Carriage Return - Line Feed) pair, and these messages shall not
   exceed 512 characters in length, counting all characters including
   the trailing CR-LF. Thus, there are 510 characters maximum allowed
   for the command and its parameters.  There is no provision for
   continuation message lines.  See section 7 for more details about
   current implementations.

Member

DanielOaks commented Nov 18, 2016

The RFCs refer to bytes, octets and characters (which is sketchy, but it's what we need to work with). "Octets" seems to be the most specific term and the least likely to cause confusion.

@clokep I don't really define how to split messages in this on purpose, but my thoughts essentially boil down to "somewhere that makes sense", which is difficult to codify. Some will split on spaces, some on dashes and things as well, etc.

Shawn-Smith commented Nov 18, 2016

@DanielOaks An octet is a group of 8. You're talking specifically about characters in your specification.

@DarthGandalf

@Shawn-Smith octet is 8 bit = 1 byte.
@DanielOaks to make it less confusing, let's use "byte" instead of octet. These days there are no non-8-bit bytes anymore AFAIK.

Contributor

digitalcircuit commented Nov 18, 2016

@clokep In some cases, the language/libraries might handle this automatically - e.g. Quassel uses Qt's QTextBoundaryFinder to handle word and grapheme splitting in different languages.

That might be overkill for simpler/low-resource servers, though, and I'd agree with @DanielOaks on not firmly defining it in the spec. The ideal path involves all clients using the new maxline capability anyways, eventually removing this issue entirely.

However, it might help to also suggest looking into whatever tools already exist for your given language/framework (if any).

Shawn-Smith commented Nov 18, 2016

> @Shawn-Smith octet is 8 bit = 1 byte.
> @DanielOaks to make it less confusing, let's use "byte" instead of octet. These days there are no non-8-bit bytes anymore AFAIK.

@DarthGandalf The number of bytes required for a character varies with the charset and encoding. RFC1459 states the max line length in characters rather than octets or bytes because it's charset/encoding agnostic. Regardless of what you use, there will be 510 usable characters in an IRC line plus the terminating CRLF.

This should be done the same way. Using characters and not octets or bytes.

Member

DarthGandalf commented Nov 18, 2016

@Shawn-Smith

2.2 Character codes
[...] The protocol is based on a set of codes which are composed of eight (8) bits, making up an octet.

The RFC assumes that a character is 8 bits, which is why it uses these terms interchangeably.

dwfreed commented Nov 18, 2016

@Shawn-Smith Section 8.2 of RFC1459 specifies a buffer of 512 bytes holds 1 full message. Section 2.2 can easily be interpreted to mean that a character is 8 bits (as @DarthGandalf pointed out while I was writing this). Furthermore, every implementation I've seen has used C char arrays (or something semantically similar), with 512 usable locations, to hold a line. Specifying that the line length limit is based on encoding-agnostic characters means that the IRCd must know the encoding used (not always possible) and that the storage is variable, which does not work well in C-based IRCds. Therefore, the line length should be specified in a unit that does not vary (eg bytes) for simplicity and ease of implementation.

Contributor

dequis commented Nov 19, 2016

What if the server advertises a line length that is larger than what the client is willing to accept?

If this is enabled directly through CAP REQ, that leaves no place for negotiation.

My idea was that the value in CAP LS is just a suggestion, and that CAP REQ just enables a new verb to set the maximum line length, which is client-initiated and acked by the server. Something like this:

C: CAP LS
S: CAP * LS :maxline=4096
C: CAP REQ :maxline
S: CAP * ACK :maxline
C: MAXLINE 2048
S: MAXLINE * ACK 2048
(or)
C: MAXLINE 8192
S: MAXLINE * NAK 8192 4096

(verb name subject to change)

Downside: this may add too much additional complexity for servers and reduce their chances to reuse messages. For other ircd devs: How does this look from your point of view? My ircd isn't real enough to care about reusing messages.
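
As a rough sketch of the server side of the exchange above: `MAXLINE` is only the verb proposed in this comment (name subject to change), `handle_maxline` is a hypothetical helper, and the 512-octet floor is an assumption added here, not something the proposal states.

```python
SERVER_MAXLINE = 4096  # value advertised in CAP LS in the example above
LEGACY_MAXLINE = 512   # assumed floor: the classic line length


def handle_maxline(requested: str, server_max: int = SERVER_MAXLINE) -> str:
    """Build the server's reply to a hypothetical 'MAXLINE <n>' command.

    ACKs values between the legacy limit and the advertised maximum, and
    NAKs everything else, echoing the request and the server's maximum.
    """
    try:
        value = int(requested)
    except ValueError:
        return f"MAXLINE * NAK {requested} {server_max}"
    if LEGACY_MAXLINE <= value <= server_max:
        return f"MAXLINE * ACK {value}"
    return f"MAXLINE * NAK {requested} {server_max}"
```

The cost mentioned in the downside still applies: once each client can pick its own value, the server has to track a per-client limit and can no longer format a line once and reuse it for everyone.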

Contributor

dequis commented Nov 19, 2016

Another option: Specify a maximum upper bound that clients must accept.

Say, up to 64kb or some other arbitrary high-but-not-too-high-number. 640kb should be enough for everyone.

One could say that clients aren't as resource constrained as servers, or, at least, can't easily take advantage of message reuse like servers often do.

If the server advertises something higher than $upperbound, the client may reject it and choose to stay with 512+512 message lengths. Or it might accept it anyway, if its internal limit is higher than what the server advertises. But to be compliant with this spec that internal limit must not be lower than $upperbound

IotaSpencer commented Nov 19, 2016

IRC is a protocol, so clients should, if anything, use what the server sends them. Granted, there should be an upper limit that servers can set, so we aren't having 1000+ character topics etc.

Member

DanielOaks commented Nov 19, 2016

RFC1459:

2.3 Messages

   IRC messages are always lines of characters terminated with a CR-LF
   (Carriage Return - Line Feed) pair, and these messages shall not
   exceed 512 characters in length, counting all characters including
   the trailing CR-LF. Thus, there are 510 characters maximum allowed

5.8 Ison Message

   set of nicks  given  in  the parameter.  The only limit on the number
   of nicks that may be checked is that the combined length must not be
   too large as to cause the server to chop it off so it fits in 512
   characters.

8.2 Command Parsing

   results of the most recent read and parsing are kept.  A buffer size
   of 512 bytes is used so as to hold 1 full message, although, this

The RFC refers to both bytes and characters. For this spec, I think just saying bytes makes sense (afaik most IRC servers already do it based on bytes), so I'll do that.

Member

DanielOaks commented Nov 19, 2016

@dequis The main reason I've done it this way is to avoid that complexity. Just with mine, having to regenerate where to split messages for every single user based on whatever line length they've accepted seems really dodgy and could be more resource-intensive than it needs to be. At least with this (taking into account max nicklen/userlen/hostlen and all), you can generally split it once and use it across all the clients that haven't accepted longer line lengths.

If the client's not willing to accept the larger line length, I think they'd just not request the cap and leave it there.

djahandarie commented Nov 20, 2016

A few thoughts on the basic idea of longer message lengths:

  • Rate limiting: Most IRCds rate limit on lines/second for a given PRIVMSG/NOTICE target. With longer potential line lengths, those rate limits would need to be reduced to account for worst-case floods. Unfortunately, simply reducing the rate limit can cause problems for active channels. To both support active channels and protect users' sendq buffers, IRCds would likely need to implement bandwidth-based rate limiting.

  • Denial of service: One effective protection most IRC-based software has is that the length of any message they need to parse is bound to 512 bytes. With longer messages, inefficient parsers become exposed to a far wider range of potentially dangerous inputs.

    Putting services and normal clients/bots aside, even IRCds themselves have various parsers/algorithms that I'd be fairly worried about under long message lengths: what happens when someone provides a max-length input to a cryptographic algorithm (like SASL or oper passwords or CHALLENGE), or to a channel mode parser, or to an extban parser, etc.

    Every single reasonably complex block of code which had been able to take this max-length protection for granted would need to be tested for performance issues under long inputs.

I think there should likely be some guidance about these points in the spec.

Member

DanielOaks commented Nov 20, 2016

Hmm, considering most users' messages (privmsgs/notices) are likely to be under 512 chars, could simply suggest something along the lines of this in a non-normative implementation considerations section:

Servers should carefully consider their rate-limiting policies when implementing this extension. For example, with a rate-limiter based on penalties applied to each client, where the server currently applies one rate-limiting penalty for messages 512-bytes and under, they may apply multiple rate-limiting penalties for larger messages to compensate.

This is particularly useful on the PRIVMSG and NOTICE commands, where the required message splitting may increase the workload on servers by a significant amount on larger messages.

Those general denial-of-service issues on the other hand... yeah that's hard to address, and becomes very interesting particularly as you look at passwords on registration/authentication. Unfortunately I don't think we can address that in the spec itself, more just something that implementers need to keep a close eye on. Maybe a similar sort of non-normative suggestion:

Most current algorithms and parsers are tuned and intended specifically to be used with 512-byte messages. This extension allows servers to vastly increase these limits. As such, server, client, and services authors should all take a very close look at their algorithms and message-parsing code while investigating this extension. In particular, some algorithms may overflow or show degraded performance when dealing with lines longer than were anticipated when those algorithms were written.

As an example, certain clients may find it worth having a minimum and a maximum limit to allow.

One example of an area to pay specific attention to is usernames/passwords when used for registration and authentication. As an example, clients may be locked out of accounts and channels when registering on a client with maxline, but later using a client without maxline to authenticate, and similar issues.

Thoughts?

edit: Better specified section and improved language, thanks @jwheare
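
The penalty idea in the suggested text could be sketched like this, assuming one penalty per started 512-octet unit. That scaling rule is one possible reading; the suggestion only says "multiple" penalties, and `rate_limit_penalties` is a hypothetical name.

```python
BASE_UNIT = 512  # octets covered by a single penalty, per the suggested text


def rate_limit_penalties(line_octets: int, unit: int = BASE_UNIT) -> int:
    """Return how many rate-limit penalties a line of this size incurs.

    Lines of up to `unit` octets cost one penalty, and every further
    started `unit` costs one more (ceiling division, minimum of 1).
    """
    if line_octets < 0 or unit <= 0:
        raise ValueError("line_octets must be >= 0 and unit must be > 0")
    return max(1, -(-line_octets // unit))
```

Under this rule a 4096-octet line is charged like eight legacy lines, which approximates bandwidth-based limiting without replacing a penalty-based limiter.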

Contributor

grawity commented Nov 20, 2016

SASL, at least, shouldn't be a problem since AUTHENTICATE already supports continuations, so mechanisms already expect huge messages.

line-lengths: Add implementation considerations.
Thanks @djahandarie for the advice here and for bringing this issue up.

Contributor

nomis commented Nov 25, 2016

Is the problem with wanting increased line lengths or just with knowing when your line is going to be truncated?

I don't think defining a limit in terms of line length alone works because the final message limit varies based on the hostname that the recipient sees, which may be different from what the sender thinks their hostname is or for privileged users that can see real hostnames. The end result is that clients may still try to guess where the text will be truncated and get it wrong.

If you split messages on behalf of clients, what happens when a message with colours and other formatting characters gets split?

What happens when a message with commands (in the middle of the line) intended for a bot gets split?
i.e. a long response from bot A (negotiating very long lines) prefixed with "nickname: " is split and another bot B (negotiating very short lines) interprets text inside that long response as a command.

I'd be inclined to add a MAXMSGLEN= to 005 and reduce the current limit based on the maximum length of other components of the message...
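
A worked example of that arithmetic, as a sketch only: `MAXMSGLEN` is the token proposed in this comment, `max_text_octets` is a hypothetical helper, and the nick/user/host/target limits used below are made-up values rather than those of any particular network.

```python
def max_text_octets(nicklen: int, userlen: int, hostlen: int, targetlen: int,
                    command: str = "PRIVMSG", line_limit: int = 512) -> int:
    """Worst-case octets left for the text of a relayed message.

    Accounts for the largest possible ':nick!user@host COMMAND target :'
    wrapper plus the trailing CR-LF, so text of this length is never
    truncated no matter which hostname the recipient sees.
    """
    prefix = 1 + nicklen + 1 + userlen + 1 + hostlen  # ":nick!user@host"
    overhead = (
        prefix
        + 1 + len(command)   # " PRIVMSG"
        + 1 + targetlen      # " #channel"
        + 2                  # " :"
        + 2                  # CR-LF
    )
    return line_limit - overhead


# Assumed limits for illustration: NICKLEN=30, USERLEN=10, HOSTLEN=63, and
# a 50-octet channel name. Overhead is 169 octets, leaving 343 for text.
print(max_text_octets(nicklen=30, userlen=10, hostlen=63, targetlen=50))
```

Because the value is computed from maximum component lengths, it is conservative: most messages could carry more text, but a client that stays under it does not have to guess what prefix the recipient will see.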

@jwheare jwheare added the protocol label Jan 7, 2017

@jwheare jwheare modified the milestone: Roadmap Jan 7, 2017

Member

DanielOaks commented Jan 13, 2017

I'll write the changes to let it interact much more nicely with message-tags' extension of tag lengths once those changes are merged in.

Member

DarthGandalf commented on extensions/line-lengths.md in 73f538c Jan 13, 2017

CAP LS 302

Otherwise, server isn't allowed to add =digits to the response

Member

DanielOaks commented Jan 19, 2017

There are still some q's to answer in the spec itself, but for peeps looking to test I've got an example implementation here.

Member

jwheare commented Jan 19, 2017

How will this interact with message IDs?

Given a stupidly short maxline (just for the sake of the example), a maxline supporting client might see:

@msgid=123 PRIVMSG #chan :long line that may or may not get split

While a legacy client might see:

@msgid=123 PRIVMSG #chan :long line that may
@msgid=123 PRIVMSG #chan : or may not get split

Is something like a @continuation=123 tag needed for the second line, so the client can treat it as a single logical message? For example:

@msgid=123 PRIVMSG #chan :long line that may
@continuation=123 PRIVMSG #chan : or may not get split
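
A server producing the lines above might do something like the following sketch. The `continuation` tag is only the idea floated in this comment, not an adopted tag, and `tag_split_message` is a hypothetical helper that expects the text to have been split already.

```python
def tag_split_message(msgid: str, target: str, parts: list[str]) -> list[str]:
    """Format already-split PRIVMSG text for a legacy client.

    The first part carries the real msgid. Every later part carries a
    `continuation` tag pointing back at that msgid, so the id itself is
    never duplicated.
    """
    lines = []
    for index, part in enumerate(parts):
        if index == 0:
            tag = f"@msgid={msgid}"
        else:
            tag = f"@continuation={msgid}"
        lines.append(f"{tag} PRIVMSG {target} :{part}")
    return lines


for line in tag_split_message("123", "#chan",
                              ["long line that may", " or may not get split"]):
    print(line)
```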

Member

DanielOaks commented Jan 19, 2017

I could see a continuation tag make sense there (that works akin to that last example you posted). I'll throw that into the spec.

Member

DarthGandalf commented Jan 19, 2017

@jwheare

This comment has been minimized.

Show comment
Hide comment
@jwheare

jwheare Jan 19, 2017

Member

Yeah perhaps a @split or @continued tag might make sense. Maybe with a value indicating how many continuations are to follow? So e.g. a theoretical message split into 3:

@msgid=123;continued=2 PRIVMSG #chan :long line that may
@continuation=123;continued=1 PRIVMSG #chan : or may not
@continuation=123 PRIVMSG #chan : get split

Dunno if the tag value is necessary, would it enable any valuable use cases?
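
For illustration, a server-side split along these lines could be sketched in Python as below. This is a minimal sketch, not spec text: `split_body` and `split_privmsg` are hypothetical names, the `continued` and `continuation` tags are only the ones floated in this thread, and the limit is applied to the message body alone (a real server would also have to budget for tags, prefix, and command).

```python
def split_body(text: str, max_bytes: int) -> list:
    """Split text into chunks of at most max_bytes UTF-8 octets each,
    never cutting through a multi-byte character."""
    if max_bytes < 4:
        # A single UTF-8 character can take up to 4 octets.
        raise ValueError("max_bytes must be at least 4")
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if current and size + n > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks


def split_privmsg(msgid: str, target: str, text: str, max_bytes: int) -> list:
    """Build the lines a legacy client would receive for one long PRIVMSG.

    Assumed tag scheme (from the discussion above, not from any spec):
    the first line carries msgid, later lines carry continuation=<msgid>,
    and continued=<n> counts how many lines are still to follow.
    """
    chunks = split_body(text, max_bytes)
    lines = []
    for i, chunk in enumerate(chunks):
        remaining = len(chunks) - 1 - i
        tags = ["msgid=" + msgid if i == 0 else "continuation=" + msgid]
        if remaining:
            tags.append("continued=" + str(remaining))
        lines.append("@" + ";".join(tags) + " PRIVMSG " + target + " :" + chunk)
    return lines
```

With an 18-octet body limit, the example text above comes out as three lines, the first tagged `msgid=123;continued=2` and the last tagged only `continuation=123`. The split points fall on octet counts rather than word boundaries, so they differ from the hand-written example.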

@jwheare

jwheare Jan 19, 2017

Member

Oops, forgot about @ mentioning usernames :/

@LFP6

LFP6 Jan 19, 2017

Any reason why that continuation type can't be a batch tag?

@LFP6

LFP6 Jan 19, 2017

*Continuation tag

@jwheare

jwheare Jan 19, 2017

Member

Yes it could be I suppose.

@msgid=123 BATCH +foo split-msg
@batch=foo PRIVMSG #chan :long line that may
@batch=foo PRIVMSG #chan : or may not
@batch=foo PRIVMSG #chan : get split
BATCH -foo

And I guess if the client didn't enable batch, only the first message would get the tags. Although all messages could safely have e.g. an @account or @time tag. But msgid, label and client tags should probably not be duplicated.
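
On the receiving side, a client that does enable batch could stitch the parts back together roughly like this. Again a sketch under assumptions: `split-msg` is just the batch type used in the example above, `parse_line` and `reassemble` are hypothetical helpers, and the parser ignores the optional `:prefix` and tag-value escaping for brevity.

```python
def parse_line(line: str):
    """Tiny IRC line parser returning (tags, command, params).

    Assumes no ':prefix' is present, as in the examples in this thread,
    and does not unescape tag values.
    """
    tags = {}
    if line.startswith("@"):
        raw_tags, line = line[1:].split(" ", 1)
        for item in raw_tags.split(";"):
            key, _, value = item.partition("=")
            tags[key] = value
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if sep:
        parts.append(trailing)
    return tags, parts[0], parts[1:]


def reassemble(lines):
    """Return (tags, target, text) tuples, joining 'split-msg' batches."""
    open_batches = {}  # batch reference -> collected state
    done = []
    for line in lines:
        tags, command, params = parse_line(line)
        if command == "BATCH" and params[0].startswith("+"):
            if len(params) > 1 and params[1] == "split-msg":
                # Tags on the BATCH line (e.g. msgid) apply to the whole message.
                open_batches[params[0][1:]] = {"tags": tags, "target": None, "parts": []}
        elif command == "BATCH" and params[0].startswith("-"):
            batch = open_batches.pop(params[0][1:], None)
            if batch is not None:
                done.append((batch["tags"], batch["target"], "".join(batch["parts"])))
        elif command == "PRIVMSG" and tags.get("batch") in open_batches:
            batch = open_batches[tags["batch"]]
            batch["target"] = params[0]
            batch["parts"].append(params[1])
        elif command == "PRIVMSG":
            done.append((tags, params[0], params[1]))
    return done
```

Keeping msgid on the `BATCH +foo` line means the reassembled message ends up with a single msgid, which matches the point about not duplicating it across the parts.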

@jwheare

jwheare Jan 19, 2017

Member

@lp0:

What happens when a message with commands (in the middle of the line) intended for a bot gets split?
i.e. a long response from bot A (negotiating very long lines) prefixed with "nickname: " is split and another bot B (negotiating very short lines) interprets text inside that long response as a command.

Is this a new or real world issue? Client side splitting has the same theoretical issue but does it cause problems?

@ProgVal

ProgVal Jan 19, 2017

Contributor

Is there a client that these continuation tags or batches would benefit?

To be more specific: is there a client that will be updated to support continuation tags or batches, but not to support arbitrary line length?

@jwheare

jwheare Jan 19, 2017

Member

It's a good point. It does seem like quite an edge case and may not be worth the bother.

@DanielOaks

DanielOaks Jan 19, 2017

Member

Exactly, that's why I'll just go with whatever option is simplest and throw in an update to the proposal. Should go well.

@SaberUK

SaberUK Jan 19, 2017

Contributor

To be more specific: is there a client that will be updated to support continuation tags or batches, but not to support arbitrary line length?

Having completely arbitrary line lengths is risky, what else should happen if the server limit is beyond that which the client limits to?

@dequis

dequis Jan 19, 2017

Contributor

Having completely arbitrary line lengths is risky, what else should happen if the server limit is beyond that which the client limits to?

Yeah, since this spec explicitly doesn't cover length negotiation to simplify things, clients have to choose to accept or reject the length offered by the server, but they might still want to reassemble split messages. Either continuation tags or batches are fine for that.

@ProgVal

ProgVal Jan 19, 2017

Contributor

Why would a client refuse a given message size if it can reassemble a message of the same size afterward?

@dequis

dequis Jan 19, 2017

Contributor

Not necessarily the same size. For example, if the server offers 1 MB and the client accepts up to 64 KB, it can choose to stop reassembling incoming messages after appending that much to the original message and show the rest as separate messages. Most manually written messages are going to be way smaller than the limit anyway; I'd expect almost everything to be in the range of 1-4 parts.
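
One possible reading of that, sketched in Python: the client keeps joining parts until its own cap would be exceeded, then starts a new displayed message. `reassemble_bounded` is a hypothetical name, and the cap is counted in UTF-8 octets of the joined body only, which is an assumption rather than anything the proposal defines.

```python
def reassemble_bounded(parts: list, limit: int) -> list:
    """Join consecutive parts of one logical message, starting a new
    displayed message whenever adding the next part would push the
    current one past `limit` UTF-8 octets.

    A single part that is larger than `limit` on its own is still shown,
    as its own message.
    """
    messages, current, size = [], [], 0
    for part in parts:
        n = len(part.encode("utf-8"))
        if current and size + n > limit:
            messages.append("".join(current))
            current, size = [], 0
        current.append(part)
        size += n
    if current:
        messages.append("".join(current))
    return messages
```

With a generous cap, every part ends up in one message; with a cap smaller than any single part, each part is simply displayed separately, which is the same thing a legacy client would show.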

LFP6 added a commit to EteRNAgame/HTML-Chat that referenced this pull request Feb 5, 2017

Update max message length
Hopefully this can be boosted as time goes on, particularly via ircv3/ircv3-specifications#281

LFP6 added a commit to EteRNAgame/HTML-Chat that referenced this pull request Sep 28, 2017

Update max message length
Hopefully this can be boosted as time goes on, particularly via ircv3/ircv3-specifications#281
@DanielOaks

DanielOaks Oct 29, 2017

Member

I'd rather go with some sort of continuation cap that works with labels than this. The boosted tag space in the new message-tags spec helps out to a decent extent anyway, and given the complicated nature of this spec, I'd rather focus on something grounded a bit more in reality.

@DanielOaks DanielOaks closed this Oct 29, 2017
