Implement the UTF8ONLY IRCv3 specification #2514

Merged: 2 commits, Aug 4, 2023

progval
Contributor

@progval progval commented Aug 28, 2021

When the server sends a UTF8ONLY isupport token, override the user configuration and use UTF-8 to send and receive messages.

Relevant bits from https://ircv3.net/specs/extensions/utf8-only:

> Servers publishing this token MUST NOT relay content [...] containing non-UTF-8 data to clients.

> Clients implementing this specification MUST NOT send non-UTF-8 data to the server once they have seen this token.

> If a client implementing this specification sees this token, they MUST set their outgoing encoding to UTF-8 without requiring any user intervention.
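
As a rough illustration of the behaviour described above (not the code in this PR), the following Python sketch watches RPL_ISUPPORT (005) lines for the `UTF8ONLY` token and, once seen, uses UTF-8 for that connection while leaving the stored user setting alone. The `Connection` class, `isupport_tokens` helper, and the example server name are hypothetical names made up for this sketch.

```python
from typing import List


def isupport_tokens(line: str) -> List[str]:
    """Return the ISUPPORT tokens from a raw 005 line, or [] for other lines."""
    # Drop an optional ":server.name " prefix.
    if line.startswith(":"):
        line = line.partition(" ")[2]
    # The trailing parameter (" :are supported by this server") is not a token.
    middle = line.split(" :", 1)[0]
    parts = middle.split()
    # parts[0] is the numeric, parts[1] is the client's nickname.
    if len(parts) < 2 or parts[0] != "005":
        return []
    return parts[2:]


class Connection:
    """Hypothetical per-connection state; not KVIrc's actual class."""

    def __init__(self, configured_encoding: str) -> None:
        # The user's setting is kept as-is; it is only bypassed, not rewritten.
        self.configured_encoding = configured_encoding
        self.utf8_only = False

    @property
    def encoding(self) -> str:
        # UTF8ONLY takes precedence over the user's choice for this connection.
        return "utf-8" if self.utf8_only else self.configured_encoding

    def handle_line(self, line: str) -> None:
        for token in isupport_tokens(line):
            # Tokens may carry a value ("NAME=value"); UTF8ONLY has none.
            if token.split("=", 1)[0] == "UTF8ONLY":
                self.utf8_only = True


conn = Connection("cp1252")
conn.handle_line(
    ":irc.example.org 005 nick NETWORK=Example UTF8ONLY :are supported by this server"
)
print(conn.encoding)             # effective encoding for this connection
print(conn.configured_encoding)  # stored user setting, unchanged
```

Keeping the flag separate from the configured value means connections to servers that do not advertise the token still use whatever the user chose.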

@wodim
Member

wodim commented Aug 28, 2021

> override the user configuration

???

@progval
Contributor Author

progval commented Aug 28, 2021

One of the purposes of UTF8ONLY is to prevent misconfigured clients from sending or decoding messages with the wrong charset, so I think it makes sense to override it in this case.

@DarthGandalf
Member

DarthGandalf commented Aug 28, 2021 via email

@wodim
Member

wodim commented Aug 28, 2021

Considering the server will send that "utf8only" command every single session, this permanently denies the user the choice of using any other encoding.

A general option that says something like "let servers choose an encoding for me" which is on by default would be a good compromise

@progval
Contributor Author

progval commented Aug 28, 2021

> this permanently denies the user the choice of using any other encoding.

Yes, that's a feature. It means that, once enough clients support it, server operators can safely enable UTF8ONLY, as it will switch all clients, even misconfigured ones.

Letting user configuration override UTF8ONLY nullifies the goals of UTF8ONLY.

@wodim
Member

wodim commented Aug 28, 2021

I don't understand this thing where you think it's appropriate to implement a backdoor in a client to allow a server to change (or ignore) the configuration without informing the user or asking for consent.

As far as I remember we already default to utf8 on all platforms. Changing the encoding for a particular server is a conscious decision by the user. What will happen when users find that they can no longer understand messages written by other users who have an old version of mirc that uses cp1252 for example?

@progval
Contributor Author

progval commented Aug 28, 2021

If I'm not mistaken, this PR only changes the encoding for connections to servers that advertise UTF8ONLY, not for other connections.

> What will happen when users find that they can no longer understand messages written by other users who have an old version of mirc that uses cp1252 for example?

If a server advertises UTF8ONLY, it will reject these messages. So with or without this patch, they won't be displayed.
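
To make the point concrete, here is a small Python sketch of the kind of validity check involved. It only shows that text encoded as cp1252 is generally not valid UTF-8 once it contains non-ASCII characters; what a given server does with such a message (drop it, return an error, and so on) is outside this sketch, and `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(payload: bytes) -> bool:
    """Return True if the payload decodes as strict UTF-8."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text = "café"
as_utf8 = text.encode("utf-8")      # "é" becomes the two bytes C3 A9
as_cp1252 = text.encode("cp1252")   # "é" becomes the single byte E9

print(is_valid_utf8(as_utf8))
print(is_valid_utf8(as_cp1252))
print(is_valid_utf8(b"plain ASCII"))
```

Pure ASCII passes either way, since ASCII is a subset of both encodings; only the non-ASCII bytes of a legacy encoding trip the check.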

@SoniEx2
Contributor

SoniEx2 commented Aug 28, 2021

So how do you use base128 scripts with this?

@progval
Contributor Author

progval commented Aug 28, 2021

I don't know how these scripts work, but if they send non-UTF-8 data, then UTF8ONLY servers will reject them, with or without this patch.

@SoniEx2
Contributor

SoniEx2 commented Aug 28, 2021

They send all bytes > 0x7F. So they can use a full 7 bits per byte. It works great today.

@progval
Contributor Author

progval commented Aug 28, 2021

Did you try on a UTF8ONLY network (e.g. ergo.chat)? They are likely to be rejected.

@SoniEx2
Contributor

SoniEx2 commented Aug 28, 2021

Yes, why break perfectly good stuff that's more efficient than the alternatives?

@progval
Contributor Author

progval commented Aug 28, 2021

If it works, it means your script only sends UTF-8 data, so it won't be affected by this patch.

@SoniEx2
Contributor

SoniEx2 commented Aug 28, 2021

No, your network breaks it. Something that had worked for years without anyone ever complaining.

@progval
Contributor Author

progval commented Aug 28, 2021

This is not my network, and I don't see how that's relevant to this discussion. If you don't want UTF8ONLY, you should bring it up there or not use their network. Either way, this patch won't affect your scripts.

@SadieCat

SadieCat commented Aug 28, 2021

Nobody is using "base128 scripts" other than Soni so this is not really a concern. They keep coming up with ridiculous extensions and trying to force developers to implement them to the point that they are banned from several IRC projects (InspIRCd, IRCv3, ircdocs, etc) for trolling.

On a more productive note, over on ircv3-ideas I have proposed a "UTF-8 recommended" ISUPPORT token to be used as part of a migration path to UTF8ONLY which tells clients that they should reconfigure their connection configuration to send UTF-8 data but should be prepared to receive non-UTF-8 data from other users. This will allow users of non-UTF-8 encodings to be automatically migrated over time. I'm likely to implement this into InspIRCd at some point in the future.
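
A client-side reading of that proposal might look like the Python sketch below: always encode outgoing text as UTF-8, but try UTF-8 first on incoming data and fall back to a legacy encoding when that fails. The function names and the choice of cp1252 as the fallback are assumptions for illustration; the proposal itself does not prescribe them.

```python
def encode_outgoing(text: str) -> bytes:
    """Outgoing messages are always UTF-8."""
    return text.encode("utf-8")


def decode_incoming(payload: bytes, fallback: str = "cp1252") -> str:
    """Try strict UTF-8 first, then the (assumed) legacy encoding."""
    try:
        return payload.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" covers bytes that the fallback codec leaves undefined.
        return payload.decode(fallback, errors="replace")


# A UTF-8 sender and a cp1252 sender both end up readable.
print(decode_incoming("café".encode("utf-8")))
print(decode_incoming("café".encode("cp1252")))
print(encode_outgoing("é"))
```

Trying UTF-8 first is a heuristic: a legacy-encoded message that happens to be valid UTF-8 would be decoded as UTF-8, though this is uncommon for typical text.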

@SoniEx2
Contributor

SoniEx2 commented Aug 28, 2021

Please don't wave around personal attacks as fact.

Also, wasn't there a CHARSET=UTF-8 ISUPPORT already? Why's nobody using it?

@ctrlaltca
Contributor

Since this:

  • doesn't overwrite any config
  • doesn't remove any freedom from the user (you can't send non-UTF-8 data to a server using this spec anyway)
  • is already implemented in multiple servers and clients (hexchat, mirc, ...)

I guess it's good to be merged.

@ctrlaltca ctrlaltca merged commit 1fc6abe into kvirc:master Aug 4, 2023
progval added a commit to progval/ircv3.github.io that referenced this pull request Aug 4, 2023
jwheare pushed a commit to ircv3/ircv3.github.io that referenced this pull request Aug 7, 2023
6 participants