Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very recent UTF-16 issue? Not.sure. #2704

Closed
MSP-Greg opened this issue Aug 10, 2022 · 7 comments · Fixed by sparklemotion/sqlite3-ruby#334
Closed

Very recent UTF-16 issue? Not.sure. #2704

MSP-Greg opened this issue Aug 10, 2022 · 7 comments · Fixed by sparklemotion/sqlite3-ruby#334
Assignees
Labels

Comments

@MSP-Greg
Copy link

About 17 hours ago a CI run in sqlite3-ruby passed. There has been an update in a Window's vcpkg used for the CI, and I just ran it in my fork a moment ago. Both were on current master. TruffleRuby failed in my fork build.

Passing version:
truffleruby 22.3.0-dev-925274f4, like ruby 3.0.3, GraalVM CE Native [x86_64-linux]

Failing version:
truffleruby 22.3.0-dev-6e384700, like ruby 3.0.3, GraalVM CE Native [x86_64-linux]

Text of the errors below:

ArgumentError: UTF-16 string byte length is not a multiple of 2
    string.c:190:in `rb_str_resize'
    string.c:190:in `rb_str_cat'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/ext/sqlite3/database.c:34:in `rb_sqlite3_open16'
    /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:41:in `open16'

Everything else seems to match in the CI runs. Reporting only, not a bug affecting me...

@eregon
Copy link
Member

eregon commented Aug 11, 2022

Thanks for the report.
TruffleString raises this error, it does not support UTF-16 strings with an odd bytesize.
I think we should adapt the upstream test doing that if possible, @andrykonchin could you do that?
Full output to preserve it:

  1) Error:
SQLite3::TestDatabase#test_new_with_options:
ArgumentError: UTF-16 string byte length is not a multiple of 2
    string.c:190:in `rb_str_resize'
    string.c:190:in `rb_str_cat'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/ext/sqlite3/database.c:34:in `rb_sqlite3_open16'
    /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:41:in `open16'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/lib/sqlite3/database.rb:70:in `initialize'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/test/test_database.rb:207:in `test_new_with_options'

  2) Error:
SQLite3::TestEncoding#test_db_with_utf16:
ArgumentError: UTF-16 string byte length is not a multiple of 2
    string.c:190:in `rb_str_resize'
    string.c:190:in `rb_str_cat'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/ext/sqlite3/database.c:34:in `rb_sqlite3_open16'
    /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:41:in `open16'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/lib/sqlite3/database.rb:70:in `initialize'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/test/test_encoding.rb:114:in `test_db_with_utf16'

  3) Error:
SQLite3::TestEncoding#test_insert_encoding:
ArgumentError: UTF-16 string byte length is not a multiple of 2
    string.c:190:in `rb_str_resize'
    string.c:190:in `rb_str_cat'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/ext/sqlite3/database.c:34:in `rb_sqlite3_open16'
    /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:41:in `open16'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/lib/sqlite3/database.rb:70:in `initialize'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/test/test_encoding.rb:32:in `test_insert_encoding'

  4) Error:
SQLite3::TestEncoding#test_select_encoding_on_utf_16:
ArgumentError: UTF-16 string byte length is not a multiple of 2
    string.c:190:in `rb_str_resize'
    string.c:190:in `rb_str_cat'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/ext/sqlite3/database.c:34:in `rb_sqlite3_open16'
    /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:41:in `open16'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/lib/sqlite3/database.rb:70:in `initialize'
    /home/runner/work/sqlite3-ruby/sqlite3-ruby/test/test_encoding.rb:17:in `test_select_encoding_on_utf_16'

@eregon
Copy link
Member

eregon commented Aug 11, 2022

Re why it passed 17 hours ago is because the truffleruby dev builds did not work for a while due to some issue that is now fixed, and so it was using a dev build before the adoption of TruffleString.

@eregon eregon added the cexts label Aug 11, 2022
@andrykonchin
Copy link
Member

Yeah. Will work on it.

@eregon
Copy link
Member

eregon commented Aug 11, 2022

The issue seems https://github.com/sparklemotion/sqlite3-ruby/blob/94b7447b016b6eabb0d91d68ee73478b7db313f0/ext/sqlite3/database.c#L34
The intention seems to add a second \0 byte, just before the one added by RSTRING_PTR().
Given the error it seems the string before the rb_str_buf_cat() has an even size, but an odd size after, but would be good to verify.
A fix seems to use rb_str_buf_cat(str, "\x00\x00", 2L); instead.
Or potentially rely on Ruby to double-\0 terminate but that's currently not implemented in TruffleRuby (might be in CRuby with with the TERM_LEN stuff, unsure).

@eregon
Copy link
Member

eregon commented Aug 11, 2022

I'm not sure why sqlite3_open16 is used at all and not just sqlite3_open_v2, UTF-8 seems a much better choice/much more common for Ruby than UTF-16. That seems worth asking upstream.

@MSP-Greg
Copy link
Author

Thanks.

IDK.

I saw the change, (recent pass to recent fail), and all the other platforms were passing in both CI runs. As some point, right/wrong, good/bad isn't the issue, it's about compatibility.

I was just running CI because the Windows mswin build wasn't working (fixed), and then MSYS2 and vcpkg updated their sqlite3/sqlcipher packages, so I wanted to check the updates. Or, I'm not that familiar with SQLite3...

eregon added a commit to eregon/sqlite3-ruby that referenced this issue Aug 11, 2022
* TruffleRuby uses TruffleString which requires all UTF-16 strings to
  always have an even bytesize.
* The code here ensures the UTF-16 string always ends up with 2 \x00
  bytes, so append them explicitly to have an even bytesize instead of
  relying on RSTRING_PTR() adding the second \x00 byte.
* Fixes oracle/truffleruby#2704
@flavorjones
Copy link
Contributor

flavorjones commented Aug 12, 2022

FWIW, this appears to be happening only when sqlite3 is compiled to use packaged libraries (but not when it uses sytem libraries).

I now see @eregon's comment about about the dev builds.

graalvmbot pushed a commit that referenced this issue Aug 15, 2022
* So whether the string is UTF-32, UTF-16 or another encoding there are
  a safe amount of \0 bytes when passed to C functions.
* See #2704 (comment)
graalvmbot pushed a commit that referenced this issue Aug 15, 2022
nirvdrum pushed a commit to Shopify/truffleruby that referenced this issue Aug 17, 2022
* So whether the string is UTF-32, UTF-16 or another encoding there are
  a safe amount of \0 bytes when passed to C functions.
* See oracle#2704 (comment)
john-heinnickel pushed a commit to thermofisher-jch/truffleruby that referenced this issue Aug 16, 2023
* So whether the string is UTF-32, UTF-16 or another encoding there are
  a safe amount of \0 bytes when passed to C functions.
* See oracle#2704 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants