This repository has been archived by the owner on May 28, 2023. It is now read-only.

Website charset issues #35

Open
Pecon opened this issue Nov 6, 2017 · 4 comments

Pecon commented Nov 6, 2017

Looks like you guys have the database set to latin1 so that it'll properly store and reproduce characters for Blockland to read. Unfortunately, the frontend site is set to UTF-8, and the characters are read from the database as-is, so they are still single-byte encoded when they reach the page. This causes any non-ASCII characters to show up as random glyphs. The solution is to either have PHP do:

$text=mb_convert_encoding($text, "UTF-8", "Windows-1252");

Or set the charset meta tag of all the pages to ISO-8859-1 (the closest supported set). Converting in PHP would probably be the best solution, but setting the page charset should also be acceptable.
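To illustrate why the glyphs break, here is a minimal sketch. It assumes a hypothetical value "café" read from a latin1 column; the variable names are made up for the example and are not from the site's code.

```php
<?php
// Hypothetical example: "café" as stored in a latin1 (Windows-1252) column.
// The "é" is the single byte 0xE9, which is not valid UTF-8 on its own.
$fromDb = "caf\xE9";

// Served as-is on a UTF-8 page, the byte sequence is malformed.
$validBefore = mb_check_encoding($fromDb, "UTF-8");   // false

// Converting re-encodes 0xE9 as the two-byte UTF-8 sequence 0xC3 0xA9.
$converted = mb_convert_encoding($fromDb, "UTF-8", "Windows-1252");
$validAfter = mb_check_encoding($converted, "UTF-8"); // true

// Prints: 636166e9 -> 636166c3a9
echo bin2hex($fromDb), " -> ", bin2hex($converted), PHP_EOL;
```

The same byte 0xE9 is what a browser tries (and fails) to decode as UTF-8, which is where the replacement glyphs come from.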

Contributor

McTwist commented Nov 6, 2017

The database charset is the MySQL default. I believe they never changed it when they created this.

The best fix would be to convert the database to UTF-8 and then do conversions on content only, where needed.

Some content is sent to Blockland, but that could easily be converted on the fly, if it is not stored in another charset.

The API, on the other hand, needs to be Windows-1252 to avoid any character issues. Making the whole site use that encoding could work, but should be avoided to keep the site more future-proof. Of course, it will put some strain on the programmer, who needs to remember this.

McTwist added the bug label Nov 6, 2017
Author

Pecon commented Nov 6, 2017

> The database charset is default for MySQL. I believe that they never changed it when they created this.

According to this, it's specifically set to latin1, which in MySQL is literally Windows-1252:
https://github.com/BlocklandGlass/GlassWebsite/blob/master/private/class/db.sql

If you switched it to UTF-8 and converted all the content in it, the only text you would have to convert is what is served through the API. Nothing would have to change on the website end that way.
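A rough sketch of that API-side conversion, assuming the database already holds UTF-8. The helper name `toBlockland` and the sample string are hypothetical, purely for illustration:

```php
<?php
// Hypothetical helper name; not part of the existing code base.
// Converts UTF-8 text (as it would be stored after the switch)
// into Windows-1252 bytes for the game client.
function toBlockland(string $utf8): string
{
    return mb_convert_encoding($utf8, "Windows-1252", "UTF-8");
}

// "café €5" written with explicit code points so the source file's
// own encoding does not matter.
$stored = "caf\u{E9} \u{20AC}5";
$wire = toBlockland($stored);

// Each character is now a single byte (é = 0xE9, € = 0x80).
echo strlen($stored), " bytes in, ", strlen($wire), " bytes out", PHP_EOL;
```

Characters that have no Windows-1252 equivalent cannot survive this step, so anything outside that set would need to be handled separately.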

Contributor

McTwist commented Nov 7, 2017

That is an SQL dump. That means it's just a copy of their own installation, which most probably was left at the default. The dump will always set the charset, as the database always requires one.

Contributor

McTwist commented Nov 8, 2017

As you mentioned, something like this could be achieved: bl_convert_encoding
So, to make a detailed list of what needs to be done:

  • Convert database to UTF-8 (utf8_unicode_ci)
  • API will output Windows-1252 when the client is Blockland (check User-Agent or Accept-Charset along with Content-Type; might need to modify TCPClient, I've contacted @Greek2me)
  • Reading an Add-On will convert to UTF-8; writing an Add-On will convert to Windows-1252

This should minimize work for the programmer, as the content is left untouched, while making the whole thing transparent to the user.
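The list above could be sketched roughly as follows. This is only an illustration: the function names, the "Blockland" User-Agent marker, and the sample User-Agent strings are all assumptions, since what the client actually sends depends on TCPClient.

```php
<?php
// Sketch only: the function names and the "Blockland" User-Agent marker
// are assumptions for illustration, not the project's actual code.

// Decide which charset a response should use for a given User-Agent.
function responseCharset(?string $userAgent): string
{
    $isBlockland = $userAgent !== null
        && stripos($userAgent, "Blockland") !== false;
    return $isBlockland ? "Windows-1252" : "UTF-8";
}

// Encode UTF-8 content (as stored after the database conversion) for output.
function encodeForClient(string $utf8, ?string $userAgent): string
{
    $charset = responseCharset($userAgent);
    return $charset === "UTF-8"
        ? $utf8
        : mb_convert_encoding($utf8, $charset, "UTF-8");
}

// Text read from an Add-On arrives as Windows-1252 and is stored as UTF-8.
function decodeFromAddOn(string $cp1252): string
{
    return mb_convert_encoding($cp1252, "UTF-8", "Windows-1252");
}

$stored     = decodeFromAddOn("caf\xE9");                   // now UTF-8
$forGame    = encodeForClient($stored, "Blockland-r2001");  // back to 0xE9
$forBrowser = encodeForClient($stored, "Mozilla/5.0");      // stays UTF-8
```

In a real endpoint the User-Agent would come from `$_SERVER['HTTP_USER_AGENT']`, and the chosen charset would also go into the Content-Type header so that both sides agree on the encoding.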

I am curious how this hasn't been found sooner.
