@thekhalifa
Contributor

Files in ISO-8859-15 or similar encodings are not always displayed correctly. Convert them to UTF-8 so that they display correctly and to avoid warnings when packaging FreeBASIC for Debian.

Note: the difference in encoding may not be visible on GitHub or when viewing a patch that contains both versions. It only shows up when you open the old and new files separately.
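
A minimal sketch of the kind of conversion described above, assuming the legacy encoding of each file is already known (ISO-8859-15 here). The helper name `reencode_to_utf8` is illustrative and not taken from the PR. Comparing the raw bytes is one way to see a difference that a rendered diff may hide.

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "iso8859_15") -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# 0xE9 is 'é' in ISO-8859-15; in UTF-8 the same character takes two bytes.
old = b"caf\xe9"
new = reencode_to_utf8(old)

# The decoded text is identical, but the stored bytes differ.
print(old.hex(" "), "->", new.hex(" "))
```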

@countingpine
Collaborator

Hi. For what it's worth, the farptr.bi symbols look like Mojibake. They should probably be interpreted as box drawing characters as in Codepage 850:

'╔═══════════════════════════════════════════════════════════════════════╗
'║               Far Pointer Simulation Functions                        ║
'╚═══════════════════════════════════════════════════════════════════════╝

(I also converted 8-tabs to spaces in the middle row, and since it seemed one character short, added another one before the text to better centralise it.)
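
As a small illustration of the point above, the same bytes can be decoded under both code pages. This sketch uses a shortened border (one corner byte, three line bytes, one corner byte) rather than the full line from farptr.bi.

```python
# Raw bytes of a shortened top border: C9, CD CD CD, BB.
raw = bytes([0xC9, 0xCD, 0xCD, 0xCD, 0xBB])

as_cp850 = raw.decode("cp850")        # interpreted as box-drawing characters
as_latin9 = raw.decode("iso8859_15")  # interpreted as accented letters (mojibake)

print(as_cp850)
print(as_latin9)
```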

It might also be worth just using ASCII single-quotes in stabs.bi.

@thekhalifa
Contributor Author

> Codepage 850:

Good catch, it looks much better that way.
I can convert it that way, along with the spacing fix.

FYI, this is what I saw in the patch; it was a little better:

-'<C9><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD>
<CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD>ͻ
-'<BA>          Far Pointer Simulation Functions                        <BA>
-'<C8><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD>
<CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD><CD>ͼ
+'ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
+'º             Far Pointer Simulation Functions                        º
+'ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍŒ
 '

> single-quotes in stabs.bi.

Is that what you see in the original, or do you mean it as general cleanup?
It's all in a comment, so it matters little, but I didn't want to change the original too much.
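
One possible reading of the patch view above, sketched under the assumption that the viewer decodes the file as UTF-8 and shows each undecodable byte as `<XX>`: most of the CP-850 bytes are invalid UTF-8 on their own, but the final pair CD BB happens to form a valid two-byte sequence, which would explain the stray character at the end of the line. The error-handler name `hexangle` is made up for this example.

```python
import codecs


def _hex_angle(err: UnicodeDecodeError):
    """Replace each undecodable byte with <XX> and resume after it."""
    bad = err.object[err.start:err.end]
    return "".join(f"<{b:02X}>" for b in bad), err.end


codecs.register_error("hexangle", _hex_angle)

# Shortened CP-850 top border: a quote, C9, three CD bytes, then BB.
line = b"'\xc9\xcd\xcd\xcd\xbb"
shown = line.decode("utf-8", errors="hexangle")
print(shown)
```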

@thekhalifa force-pushed the fix-encoding-to-utf8 branch from 95829ae to dc9d7aa on January 29, 2025 20:21
@countingpine
Collaborator

Thanks.
For stabs.bi, I'm suggesting just to remove the smart quotes and replace them with ASCII single-quotes, i.e. '.cb' file instead of ‘.cb’ file.
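
The suggested replacement could be sketched like this. The helper `ascii_quotes` is hypothetical, and the sketch assumes only the left and right single smart quotes (U+2018, U+2019) need handling.

```python
# Map the two single smart quotes to the ASCII apostrophe.
SMART_TO_ASCII = str.maketrans({"\u2018": "'", "\u2019": "'"})


def ascii_quotes(text: str) -> str:
    """Return text with single smart quotes replaced by ASCII single quotes."""
    return text.translate(SMART_TO_ASCII)


fixed = ascii_quotes("\u2018.cb\u2019 file")
print(fixed)
```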

@thekhalifa force-pushed the fix-encoding-to-utf8 branch 2 times, most recently from 1643d12 to 562eaca, on January 31, 2025 18:02
Files in ISO-8859-15, CP-1252 and CP-850 encoding are not always
displayed correctly.

Convert them to UTF-8 to make sure they're displayed
correctly and to avoid warnings when packaging FreeBASIC for Debian.
@thekhalifa
Contributor Author

I replaced those quotes, and also reconverted 'examples/graphics/OpenGL/NeHe/lesson22.bas' from CP-1252 instead of ISO-8859, as it didn't look quite right.
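
A short sketch of why the choice of source encoding can matter here. The two encodings differ for bytes in the 0x80-0x9F range: CP-1252 assigns printable characters there, while the ISO-8859 family maps them to C1 control characters. The sample bytes below are invented for illustration; which bytes lesson22.bas actually contained is not stated in the thread.

```python
# 0x92 is a right single quotation mark in CP-1252.
raw = b"it\x92s"

as_cp1252 = raw.decode("cp1252")      # printable apostrophe-like quote
as_latin9 = raw.decode("iso8859_15")  # U+0092, a non-printing C1 control

print(repr(as_cp1252))
print(repr(as_latin9))
```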

@jayrm
Member

jayrm commented Feb 18, 2025

Even though fbc can handle all the encodings, anything that is code page dependent is going to be a display problem eventually somewhere. And UTF-8 is going to be a display problem on any platform that doesn't support it.

Agreed: source for the fbc compiler itself and the build tools should probably be plain ASCII, so that editing the files has near-zero dependency on the authoring tools used across multiple platforms. So I agree with using plain single quotes in stabs.bi.

The include files and example files are probably OK as UTF-8, since those files don't get changed much and fbc can handle the encodings.

Except ./inc/dos/sys/farptr.bi could just remove the fancy box characters, or be changed to use ASCII characters only like +---+, | |, etc. The conversion to UTF-8 does nothing to enhance displaying this file within DOS.
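
An ASCII-only banner along those lines could be produced as sketched below. This is only an illustration of the idea; the replacement that was actually committed is not shown in this thread, and the helper `ascii_box` and the shortened banner are assumptions.

```python
# Corners become '+', horizontal lines '-', vertical lines '|'.
BOX_TO_ASCII = str.maketrans({
    "\u2554": "+", "\u2557": "+", "\u255a": "+", "\u255d": "+",
    "\u2550": "-", "\u2551": "|",
})


def ascii_box(text: str) -> str:
    """Replace double-line box-drawing characters with ASCII stand-ins."""
    return text.translate(BOX_TO_ASCII)


banner = "'\u2554\u2550\u2550\u2550\u2557\n'\u2551 x \u2551\n'\u255a\u2550\u2550\u2550\u255d"
plain = ascii_box(banner)
print(plain)
```

A file that passes `plain.isascii()` reads the same under any ASCII-compatible code page, including CP-850 and UTF-8, which is the property being asked for.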

@jayrm merged commit 94dc414 into freebasic:master on Feb 18, 2025
@thekhalifa
Contributor Author

> Except ./inc/dos/sys/farptr.bi could just remove the fancy box characters, or be changed to use ASCII characters only like +---+, | |, etc. The conversion to UTF-8 does nothing to enhance displaying this file within DOS.

Thanks for converting it to plain ASCII characters. I picked UTF-8 as the universal encoding, but forgot that maybe DOS doesn't support it.
I can still see the BOM sequence at the beginning of the file (3 bytes: EF BB BF) - if those are not readable in DOS or show up as rubbish, I can send a fix to remove them (i.e. convert the file to plain ASCII encoding)
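
A minimal sketch of the proposed fix, assuming the only non-ASCII content left in the file is the leading UTF-8 BOM. The helper `strip_utf8_bom` is a hypothetical name.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return raw unchanged."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


with_bom = codecs.BOM_UTF8 + b"' Far Pointer Simulation Functions\n"
cleaned = strip_utf8_bom(with_bom)

# The three BOM bytes are ordinary characters under a DOS code page,
# so they would show up as visible text at the start of the file.
print(repr(with_bom[:3].decode("cp850")))
print(cleaned.isascii())
```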

@jayrm
Member

jayrm commented Feb 18, 2025

> I can still see the BOM sequence at the beginning of the file (3 bytes: EF BB BF) - if those are not readable in DOS or show up as rubbish, I can send a fix to remove them (i.e. convert the file to plain ASCII encoding)

Dang it. I thought my editor removed the BOM.

Thanks for spotting this. I'll commit a fixup soon.
