This repository has been archived by the owner on Feb 10, 2024. It is now read-only.

Enhancement: Raw __undecoded__ line access from scripts #1430

Open
programadorhedonista opened this issue Jul 23, 2015 · 6 comments

Comments

@programadorhedonista

I wish scripts (Python) could access the raw line received from the server, before it is passed through text_convert_invalid() (in server_inline).

Maybe something like another function, hook_server_rawline(), or another print event, "UNDECODED_LINE", ...

This way, scripts could get an ISO-8859 (or cp1252) line from servers that still use ISO, convert it to utf-8, and print it, as they can with old xchat.

Many thanks for considering my request.

@TingPing
Member

If you want the raw line after encoding, see: https://hexchat.readthedocs.org/en/latest/script_python.html#hexchat.hook_server

If you want to change encodings see /charset

I have no clue why what you are asking for is valuable.

@programadorhedonista
Author

I use an IRC network where some people use iso-8859-15, some use cp1252, and some use utf-8.
I can't configure one single encoding that works with all of them.

Some time ago, with xchat, I could configure the iso-8859-15 encoding and load a script that received the original strings exactly as the server sent them and, if it detected utf-8, re-encoded them as iso-8859.
(There is an example script at the end of this page: http://labix.org/xchat-python )
This way, I could see all the text, no matter what encoding people used.
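A minimal sketch of the detection step such a script performs, written in plain Python with no XChat or HexChat API calls. The function name `decode_irc_bytes` and the fallback order are assumptions made for illustration, and the sketch assumes the script is handed the undecoded bytes of the line.

```python
def decode_irc_bytes(raw: bytes, fallbacks=("cp1252", "iso-8859-15")) -> str:
    """Decode one IRC line whose encoding is not known in advance.

    Hypothetical helper: the name and the fallback order are illustrative.
    """
    try:
        # Strict UTF-8 decoding rarely accepts legacy 8-bit text by accident,
        # so success here is a strong hint that the sender used UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for codec in fallbacks:
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            # e.g. cp1252 leaves a few byte values undefined
            continue
    # Last resort: ISO-8859-1 maps every byte value, so this cannot fail.
    return raw.decode("iso-8859-1")
```

The order of `fallbacks` matters: cp1252 and iso-8859-15 assign different characters to some byte values, so whichever comes first wins for those bytes.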

In hexchat this is impossible, because all strings are re-encoded before the script can get them, so the original string is lost and cannot be repaired.
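To illustrate why the original cannot be repaired afterwards, here is a small Python sketch. It assumes that bytes which are not valid UTF-8 get replaced by U+FFFD, which is what Python's `errors="replace"` does; HexChat's actual converter may behave differently in detail.

```python
raw_e_acute = b"caf\xe9"  # "café" as sent in ISO-8859-15 or cp1252
raw_e_grave = b"caf\xe8"  # "cafè" in the same encodings

# Assumption: invalid UTF-8 bytes are replaced by U+FFFD on arrival.
seen_1 = raw_e_acute.decode("utf-8", errors="replace")
seen_2 = raw_e_grave.decode("utf-8", errors="replace")

# Two different original lines collapse into the same sanitized string,
# so a script that only sees the sanitized text cannot tell them apart.
lines_are_indistinguishable = seen_1 == seen_2
```

Once both lines have become the same string, no later re-encoding step can recover which byte was originally sent.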

It's vital to get the strings unchanged, before encoding to utf-8.

(excuse my english)

@TingPing
Member

I think what you are trying to do should happen at a layer much lower than the plugin api. All strings stored in-memory by hexchat should be guaranteed to be valid utf-8 which means converting them the second they come in over the network. This is obviously a good thing that removes garbage text from being passed around. To allow a single api to bypass this only has this one use-case.

I think the best option for you would be to directly modify hexchat to maybe add some voodoo 'auto' encoding that tries to guess what encoding things are and hopes it's correct.

@programadorhedonista
Author

What if hexchat continues to use utf-8-only strings, but the bytes received from the server are also copied as a block of bytes, and that copy is kept untouched by hexchat and made accessible to scripts (and plugins)?

This way hexchat doesn't have to do any voodoo; it leaves that to the scripts.

@Arnavion
Contributor

HC doesn't do voodoo. It assumes the user selected the correct encoding for the server.

HC does already maintain the raw line bytes (server->linebuf), so it's not impossible to expose this to plugins in server_inline before running it through the server converter.

Even if we did that, the plugin's output would still be run through the network decoder. So say if you selected 1252 for the network encoding, then your plugin would have to output 1252 back to HC. Presumably you would select utf-8 since that is easiest for plugins to output.
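A rough sketch of what that round trip would mean for a plugin, in plain Python. No such raw-line API exists in this thread, so `reencode_for_network` is a hypothetical helper, and the idea that the plugin hands bytes back to HC is an assumption taken from the description above.

```python
def reencode_for_network(raw: bytes, source_charset: str,
                         network_charset: str = "utf-8") -> bytes:
    """Return the line as bytes in the charset selected for the network.

    Hypothetical helper: the plugin has already decided that `raw` was sent
    in `source_charset`; the client is assumed to decode whatever the plugin
    returns using `network_charset`.
    """
    text = raw.decode(source_charset)
    return text.encode(network_charset)


# With utf-8 selected for the network, the plugin outputs utf-8 ...
as_utf8 = reencode_for_network(b"caf\xe9", "cp1252")
# ... and with 1252 selected, it has to output 1252 again.
as_1252 = reencode_for_network("café".encode("utf-8"), "utf-8", "cp1252")
```

Selecting utf-8 as the network charset is the safer choice here, because utf-8 can represent every character a plugin might decode, while cp1252 cannot.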

@programadorhedonista
Author

> HC does already maintain the raw line bytes (server->linebuf), so it's not impossible to expose this to plugins in server_inline before running it through the server converter.

I hope so, as I said in the first message.

> Even if we did that, the plugin's output would still be run through the network decoder. So say if you selected 1252 for the network encoding, then your plugin would have to output 1252 back to HC. Presumably you would select utf-8 since that is easiest for plugins to output.

Yeah, that would work too: configure hexchat to use UTF-8, and have the script detect "raw bytes in ISO" and print everything as UTF-8 after re-encoding it.
