Encoding issues on latest Windows build? #1636
Comments
The people who you're getting invalid characters from are not using UTF-8. Tell them to use UTF-8. |
I am sorry to see that this issue has been closed, as I believe it is valid. "Tell them to use UTF-8" is, to me, a rather ignorant approach, as this is an issue introduced by the latest HexChat build. |
The reverse of the issue (UTF-8 text incorrectly getting interpreted as system-locale-encoding text and thus breaking) was present in the previous builds. You were just lucky you never encountered it. If two sides of a conversation pass text to each other, they have to agree on the encoding. This is common sense, not ignorance. |
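To make the two failure directions concrete, here is a minimal Python sketch. It is not HexChat code: the sample text and the choice of cp1252 as the "system locale" encoding are assumptions made purely for illustration.

```python
# Illustrative only: shows why both sides must agree on an encoding.
# The sample text and the cp1252 "locale" are assumptions, not HexChat behavior.
text = "déjà vu"

utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

# UTF-8 bytes read as a single-byte locale encoding: each accented
# character turns into two wrong characters (mojibake).
utf8_read_as_cp1252 = utf8_bytes.decode("cp1252")

# Single-byte bytes read as UTF-8: the accented bytes are not valid UTF-8,
# so a lenient decoder substitutes U+FFFD for each of them.
latin1_read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(repr(utf8_read_as_cp1252))
print(repr(latin1_read_as_utf8))
```

The first case is the breakage described for the previous builds; the second is the '�' that users of the new build see when the sender is not using UTF-8.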
So it is pure coincidence that the last 3 months of usage within the same channel, with the same people, have not shown any sign of the problem? |
Yes, since I and many others encountered it constantly with words with quotes getting treated incorrectly.
I'm sure it does. Common sense is rather hard to be found these days. |
I see, and have not seen any issues with quotes getting treated incorrectly.
Seriously? |
It is as simple as this: they are sending text in an encoding that is not the encoding you use. Previous releases had various hacks that attempted multiple encodings. These were removed because that is an awful idea: it is client-specific and introduces various corruption issues that can be solved by everybody agreeing on an encoding. |
Says the guy who called me ignorant for knowing what I'm talking about. |
Thank you for actually explaining what and why we are seeing this issue.
I commented on the "Tell them to use UTF-8" statement. I am aware of the technical reasons for this happening. I just never got any response as to why something that had worked previously for me, and obviously for other users, had stopped working. |
This change was mentioned in the release notes because it is a "regression" for some users. |
All right, I see there was a language problem, so I apologize for assuming the worst. FYI when you say "an ignorant approach" the word "ignorant" applies to the person who suggested the approach. It doesn't mean "an approach for which there is no justification". So |
Hi, Best regards. |
The time to uninstall HC, sign up for Github, search for the relevant issue and comment on it could've been better spent setting the network encoding correctly. |
Hi Arnavion, Sadly impossible: 2 encodings are used on the network, UTF-8 and ISO-8859-15.
The problem is people who don't want to change their client or server configuration. Since HexChat is now stricter, I simply downgraded to 2.10. It may be buggy for some, but it works fine for me and gets rid of all this mess. |
They're all mIRC users, aren't they? Anyway, when everyone in this thread and #1198 can come up with a solution that's acceptable to all of you, I'm happy to consider it. |
I think not falling back to ISO-8859-1 on every line is just objectively the right thing to do. A more generic version of the "IRC" encoding is common in other clients though where a user can explicitly configure a fallback encoding which handles most situations (It does re-add old problems, but would be more explicit and opt-in). |
If this feature worked fine for some users (like me btw) and they have problems now, this feature shouldn't be removed. A better idea would be to make it optional and let the user choose what he wants. Telling all users in a chat to change their encoding when some of them use web clients is impossible. |
Which web client uses an awful encoding by default and doesn't support changing the encoding?
I'm not totally against that but the previous solution needed to be ripped out as everything was hardcoded to be broken. |
That's not the point. Users who use webclients have generally no idea about IRC at all, otherwise they would use a real client.
And by unbreaking one feature you broke another feature. I think you should fix it. |
Yes it is. If such a web client exists then tell us who it is, so we can tell them to fix it. |
The problem seems to originate from here: I can't see all characters from users connecting through this webclient and there is no option to change encoding in this client (at least I can't see an obvious one). This chat is used for livestreams that are watched by hundreds of users. Given that size, your solution of telling the members of a chatroom to change their settings is not applicable to the real world. |
That looks like a German network, so it's likely everyone there is using a German encoding. Set HC to use the same encoding. |
It is a German network, but with the same situation that grui has posted here:
So whichever encoding I set in HexChat, I can only lose. WebChat users cannot change their encoding. |
Is it possible to add an option like "[ ] Use heuristic to guess non-UTF8 characters" that would enable the old behavior? |
To be clear, the previous solution did nothing smart: it simply tried a second encoding and then just used that result, which produced corrupt garbage as often as it produced the correct text. As I've mentioned, I would be OK with a feature that allowed selecting a second encoding to always try. The same problem would basically exist, but at least the user would be opting into that broken behavior. |
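A rough Python sketch of what such an opt-in fallback could look like, and of why it stays lossy. The helper name `decode_line` and its `fallback` parameter are hypothetical; this is neither HexChat's old implementation nor a proposed patch.

```python
def decode_line(raw: bytes, fallback: str = "iso-8859-15") -> str:
    """Hypothetical helper: strict UTF-8 first, then a user-chosen fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A single-byte encoding accepts almost any byte sequence, so this
        # branch "succeeds" even when the sender used a different encoding.
        return raw.decode(fallback, errors="replace")


# Sender really used UTF-8: decoded correctly.
ok_utf8 = decode_line("café".encode("utf-8"))

# Sender used the configured fallback: decoded correctly.
ok_fallback = decode_line("café".encode("iso-8859-15"))

# Sender used a third encoding (cp1252): the euro sign silently becomes
# a C1 control character instead of raising an error.
wrong_guess = decode_line("10 €".encode("cp1252"))

# Legacy bytes that happen to be valid UTF-8 never reach the fallback,
# so the two characters the sender typed collapse into a different one.
misread = decode_line("Ã©".encode("iso-8859-15"))

print(repr(ok_utf8), repr(ok_fallback), repr(wrong_guess), repr(misread))
```

The last two cases are the "same problem would basically exist" part: the fallback only helps when the user's guess about the second encoding matches what the sender actually used.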
Technically, forcing UTF-8 only on IRC is a good thing and I agree with that goal. However, there's not enough time or support resources in the world to educate the dozens of people who send broken characters. I tried to work with one person using irssi (on a Linux shell machine), but they claim they have everything set correctly and that I'm the only one with problems, so it's "my problem". There are also various IRC clients, especially on mobile platforms, that don't even support UTF-8, so they are unfixable. It will therefore be impossible to get everybody else to fix their characters, and we are stuck with the ugly UTF-8 unknown-character symbols. This is especially a problem with country-specific channels on IRCnet.
I have a feeling all the other IRC clients are doing this detection, as nobody else is having problems (not only mIRC users) except users of the latest HexChat version. I guess there are no provisions in the IRC protocol for the encoding and the "standard" still assumes pure ASCII? Or has that changed in later revisions? Is there any other work going on to address this very real problem, or is reverting the patch and/or downgrading the only way?
I feel there's a possibility that something else was broken while removing this feature, as it doesn't seem to work against some IRC clients. Has a core developer verified the latest version on IRCnet with non-ASCII characters? If not, does anybody have time to check? I'll try to verify what settings this particular irssi user has and can provide any help I can to fix this issue.
In any case, please reopen and reconsider this issue. Otherwise I'm afraid lots of people will be stuck on an older version, along with any security problems it has. |
The number of clients that don't support setting an encoding and default to not-utf8 is pretty small I believe. I'll repeat myself yet again and say that an option to opt-in to being broken would be fine, I don't know of anybody who plans on doing that work though. |
Reopening because a PR to add a fallback encoding option is welcomed. |
Hello everyone,
I'm wondering if anyone else is experiencing this problem after updating to the latest Windows build of HexChat (2.12.0).
from: 97af149822e11c53385b2358a4419972 (HexChat 2.12.0 x86 installer)
Some characters are not rendered properly.
I always use UTF-8 (Unicode) on my channels and I never had this problem with previous HexChat versions.
Another person (who's using the x64 version) got the same problem after updating to 2.12.0, which is why I opened this thread to see if anyone else is having trouble.
As French speakers, we use a lot of accents and such, and it's a bit frustrating to read text with '�' characters instead of accents.
I don't know why some characters are parsed badly like that; depending on who is talking, accents display correctly:
I've looked at my settings, but I have no idea what I can do to fix this kind of problem.
Regards