GoldSrc VGUIs READ_STRING() and 0xFF character bug #950

Open
ghost opened this issue Apr 5, 2013 · 11 comments

@ghost

ghost commented Apr 5, 2013

When a VGUI element uses the READ_STRING() function and its text contains at least one character with code 0xFF, all text from that character onward is missing. One such VGUI element is the MOTD window in HL:OF / TFC.

I looked for a way to bypass this bug a long time ago and, IIRC, the problem is that READ_STRING() compares every character's code with the value -1 (meaning end of string). So why not use a 0x00 byte as the terminator, as in standard ASCIIZ?
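
To illustrate the suspected mechanism, here is a minimal sketch in C. It is not the engine's code: `msg_reader`, `read_char` and `read_string` are hypothetical names, and the sketch assumes the reader returns each byte as a signed value and uses -1 for "no more data".

```c
#include <stddef.h>

/* Hypothetical stand-in for a network message buffer (not the real GoldSrc API). */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} msg_reader;

/* Returns the next byte as a signed value, or -1 when the buffer is exhausted.
   On the usual two's complement platforms a 0xFF byte also converts to -1,
   so it cannot be told apart from "end of data". */
static int read_char(msg_reader *r)
{
    if (r->pos >= r->size)
        return -1;
    return (signed char)r->data[r->pos++];
}

/* Copies bytes into out until a 0 or a -1 is read; returns the number of bytes kept. */
static size_t read_string(msg_reader *r, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    while (n + 1 < out_size) {
        int c = read_char(r);
        if (c == -1 || c == 0)   /* 0xFF is treated as the end of the string here */
            break;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

Under these assumptions, feeding the bytes `'a', 'b', 0xFF, 'c', 'd', 0` to `read_string` yields only "ab"; everything from the 0xFF byte onward is dropped, which matches the symptom described above.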

@alfred-valve
Contributor

How would this bug actually affect a customer? Shouldn't a server op just not put invalid chars in their MOTD?

@ghost
Author

ghost commented Apr 26, 2013

Actually, Western European languages use the 0xFF character (ÿ in Windows-1252). Also, using -1 as EOF looks like a cumbersome rudiment. :)

@johndrinkwater
Contributor

The problem is slightly complicated, FrenchMan, and might not be Half-Life's fault, except that it assumes the quality of the data given to it: if users enter ÿ as UTF-16, the engine is given the two bytes 0x00 0xFF, whereas in UTF-8 it's 0xC3 0xBF. Making certain you input UTF-8 would, afaics, mitigate this issue.

As a codepoint, ÿ (U+00FF) is valid Unicode but not valid ASCII, so I guess that compounds it. Moving to a different EOL is going to have the same issue with faulty input.
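
The byte sequences mentioned above can be compared with a small C sketch. The array names and the `contains_byte` helper are made up for illustration; the byte values are the standard encodings of U+00FF.

```c
#include <stddef.h>

/* The character ÿ (U+00FF) in four encodings. */
static const unsigned char Y_CP1252[]  = { 0xFF };        /* Windows-1252 / Latin-1 */
static const unsigned char Y_UTF8[]    = { 0xC3, 0xBF };  /* UTF-8 */
static const unsigned char Y_UTF16BE[] = { 0x00, 0xFF };  /* UTF-16, big-endian */
static const unsigned char Y_UTF16LE[] = { 0xFF, 0x00 };  /* UTF-16, little-endian */

/* Returns 1 if value occurs anywhere in buf, otherwise 0. */
static int contains_byte(const unsigned char *buf, size_t len, unsigned char value)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == value)
            return 1;
    }
    return 0;
}
```

Only the UTF-8 form contains neither a 0xFF nor a 0x00 byte, so it is the only one of the four that a byte-wise reader with those two sentinels would pass through intact. Whether the receiving UI then renders UTF-8 correctly is a separate question.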

@ghost
Author

ghost commented Apr 26, 2013

@johndrinkwater

Yes, you are partially right, but you're talking about the Unicode representation, while I'm talking about VGUI1, which uses ANSI and can display any given data as ANSI only -- it uses 8 bits per character. The text in motd.txt for VGUI1 games is actually stored as ANSI and transmitted from server to client 'as is'.

So, the problem is certainly caused by the engine.

And, regarding your remark about moving to a different EOF: ASCII 0x00 is Unicode 0x00 0x00 (both UTF-8 and UTF-16). Of course, the Unicode range is much wider than ANSI, but AFAIK there is no way for Unicode text to contain 0x00 0x00 together, so a detected EOF could not be a false one. OTOH, maybe I missed or misunderstood something?

@johndrinkwater
Contributor

To take one of the vgui files as an example, I imagine the problem is this,

Half-Life/platform/resource > file vgui_french.txt 
vgui_french.txt: Little-endian UTF-16 Unicode text, with CRLF line terminators

Little endian would put 0xFF 0x00 into the buffer and it would read and keep 0xFF, and then take 0x00 to be EOL and discard the rest. Could you supply an example of the motd taken from a server?
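
The scenario described here can be sketched in C as follows. This only models the hypothesis about UTF-16 input: `bytes_before_nul` and `SAMPLE_UTF16LE` are illustrative names, and the sketch assumes a byte-oriented reader that stops at the first 0x00.

```c
#include <stddef.h>

/* Returns how many bytes a byte-oriented reader would keep if it treats
   the first 0x00 byte as the end of the string. */
static size_t bytes_before_nul(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    while (n < len && buf[n] != 0x00)
        n++;
    return n;
}

/* "ÿa" encoded as UTF-16LE without a byte-order mark: U+00FF, then U+0061. */
static const unsigned char SAMPLE_UTF16LE[] = { 0xFF, 0x00, 0x61, 0x00 };
```

For `SAMPLE_UTF16LE` the function returns 1: the reader keeps the 0xFF byte and discards everything after the first 0x00, including the `a`.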

@ghost
Author

ghost commented Apr 26, 2013

Yep, but again, the *_[language].txt files are UTF-16 with the 0xFF 0xFE magic number at the beginning. The motd.txt is ANSI, probably Windows-1252. An example motd.txt is here (removed).

@johndrinkwater
Contributor

Would you put it on gist.github.com? That website is horrendous.

@alfred-valve
Contributor

The core issue here is the network engine, which uses -1 as a delimiter in HL1, while VGUI1 expects plain ANSI files. I am not sure a reasonable fix is possible here (no, I am not rewriting the network system ;-) ), and it appears to only affect MOTDs, and in an obvious way that you can work around.

@ghost ghost closed this as completed Apr 27, 2013
@ghost
Author

ghost commented Apr 27, 2013

Well, it seems that I found a workaround: the 0xFF character can be replaced on the server side, before transmitting, with any other unused code that cannot occur in the input ANSI text (say, 0x01), and replaced back with 0xFF on the client side after receiving. It looks like the same method is used in Spirit of Half-Life.
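
A minimal C sketch of this substitution, assuming 0x01 never appears in the MOTD text. The names `MOTD_PLACEHOLDER`, `motd_escape` and `motd_unescape` are hypothetical and are not taken from any SDK or mod.

```c
#include <stddef.h>

/* Hypothetical placeholder byte; it must never occur in the real MOTD text. */
#define MOTD_PLACEHOLDER 0x01

/* Server side: replace every 0xFF byte with the placeholder before sending. */
static void motd_escape(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0xFF)
            buf[i] = MOTD_PLACEHOLDER;
    }
}

/* Client side: turn every placeholder byte back into 0xFF after receiving. */
static void motd_unescape(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == MOTD_PLACEHOLDER)
            buf[i] = 0xFF;
    }
}
```

The round trip is lossless only if the placeholder byte is absent from the original text; any 0x01 already present would come back as 0xFF. A client without the unescape step would receive the placeholder byte as-is.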

@ghost ghost reopened this Apr 27, 2013
@LevShisterov

Could http://en.wikipedia.org/wiki/Substitute_character be used instead?
Just curious. You have to be sure that a non-updated client will behave fine.

@ghost
Author

ghost commented Apr 27, 2013

@LevShisterov Nice find, man! :)
