Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Selecting from JSON columns does not decode them from UTF-8 #142

Closed
Grinnz opened this issue Mar 8, 2019 · 11 comments
Closed

Selecting from JSON columns does not decode them from UTF-8 #142

Grinnz opened this issue Mar 8, 2019 · 11 comments

Comments

@Grinnz
Copy link
Contributor

Grinnz commented Mar 8, 2019

I don't have a new enough MySQL handy, but in a bug report for Mojo::mysql it was indicated with a test case that when selecting from a JSON type column, the output is not UTF-8 decoded (jhthorsen/mojo-mysql#57 (comment)). MySQL documents that when JSON is interacted with as a string it is treated as the utf8mb4 charset so it should be returned that way, the same way it is passed as a utf8mb4 string.

The issue should be reproducible by inserting into a JSON type column with a query like so:

INSERT INTO unicode_json_test SET my_json = JSON_OBJECT('drink', CONVERT(UNHEX('537AC5916CC5916CC3A9') USING utf8mb4));

then selecting that column. JSON and encoding is a tricky subject because it is commonly represented as UTF-8 encoded bytes rather than text, but as MySQL treats it on input and output as a character string, it should be decoded as the input string is encoded; this also is what happens in DBD::Pg.

Also opened for DBD::mysql perl5-dbi/DBD-mysql#309

@pali
Copy link
Member

pali commented Mar 8, 2019

use strict;
use warnings;
use feature 'say';
use DBI;
binmode *STDOUT, ':utf8';

my $dbh = DBI->connect("DBI:MariaDB:test");

$dbh->do(q(CREATE TEMPORARY TABLE unicode_json_test(my_json JSON)));
$dbh->do(q(INSERT INTO unicode_json_test SET my_json = JSON_OBJECT('drink', CONVERT(UNHEX('537AC5916CC5916CC3A9') USING utf8mb4))));

say "JSON output: " . $dbh->selectrow_array(q(SELECT my_json FROM unicode_json_test));
say "Server version: " . $dbh->selectrow_array(q(SELECT VERSION()));
say "Driver version: " . $dbh->{Driver}->{Version};
say "DBI version: $DBI::VERSION";

Output is:

JSON output: {"drink": "Szőlőlé"}
Server version: 10.3.10-MariaDB
Driver version: 1.21
DBI version: 1.636

Seems that DBD::MariaDB 1.21 with MariaDB 10.3.10 is working fine. Or is there any problem?

CC @ViktorNacht @Tekki

@pali
Copy link
Member

pali commented Mar 8, 2019

And here is another test case with decoding RAW jsons:

use strict;
use warnings;
use DBI;
use Cpanel::JSON::XS;
use feature 'say';
use utf8;
binmode *STDOUT, ':utf8';

my $cjson = Cpanel::JSON::XS->new->allow_nonref;

my $dbh = DBI->connect("DBI:MariaDB:test");

$dbh->do(q(CREATE TEMPORARY TABLE unicode_json_test(my_json JSON)));
$dbh->do(q(INSERT INTO unicode_json_test SET my_json = JSON_OBJECT('drink', 'Szőlőlé')));

say "RAW JSON output: " . $dbh->selectrow_array(q(SELECT my_json FROM unicode_json_test));
say "Decoded drink JSON output: " . $cjson->decode($dbh->selectrow_array(q(SELECT my_json FROM unicode_json_test)))->{drink};
say "RAW JSON field output: " . $dbh->selectrow_array(q(SELECT json_extract(my_json, "$.drink") FROM unicode_json_test));
say "Decoded JSON field output: " . $cjson->decode($dbh->selectrow_array(q(SELECT json_extract(my_json, "$.drink") FROM unicode_json_test)));
say "Server version: " . $dbh->selectrow_array(q(SELECT VERSION()));
say "Driver version: " . $dbh->{Driver}->{Version};
say "DBI version: $DBI::VERSION";

Output is:

RAW JSON output: {"drink": "Szőlőlé"}
Decoded drink JSON output: Szőlőlé
RAW JSON field output: "Szőlőlé"
Decoded JSON field output: Szőlőlé
Server version: 10.3.10-MariaDB
Driver version: 1.21
DBI version: 1.636

So output seems to be correct too.

If there is something wrong let me know. Otherwise I will close this issue as invalid.

@Grinnz
Copy link
Contributor Author

Grinnz commented Mar 8, 2019

The test cases were done on MySQL, so perhaps it is a difference between MySQL and MariaDB.

@pali
Copy link
Member

pali commented Mar 8, 2019

Ok, so it depends on a server version? Can you (or anybody else) specify the version which doesn't work, together with the output of the above test cases? Without it, it is impossible to debug this problem (if there really is any).

@Grinnz
Copy link
Contributor Author

Grinnz commented Mar 8, 2019

@ViktorNacht specified his test cases were run in MySQL 5.7.22 and 8.0.13

@pali
Copy link
Member

pali commented Mar 20, 2019

See jhthorsen/mojo-mysql#57 (comment) and jhthorsen/mojo-mysql#57 (comment). This is bug in MySQL server. So please report it to the MySQL server bug tracker.

@pali
Copy link
Member

pali commented May 14, 2019

I'm closing this issue as this reported problem is on MySQL server, not in DBD::MariaDB driver.

@pali pali closed this as completed May 14, 2019
@Grinnz
Copy link
Contributor Author

Grinnz commented Jun 7, 2019

FYI: I have opened a bug against MySQL here, but I still have no way to reproduce or test it myself. https://bugs.mysql.com/bug.php?id=95698

@pali
Copy link
Member

pali commented Oct 14, 2019

@Grinnz: Do you have any response about that reported bug in MySQL bugtracker?

@Tekki
Copy link

Tekki commented Oct 14, 2019

I see no sign of life on the side of MySQL since July 10.

@Grinnz
Copy link
Contributor Author

Grinnz commented Oct 14, 2019

There's been no response other than confirming the bug so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants