Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FFI::Pointer#write_string to always terminate with a \0 byte #806

Merged
merged 9 commits into from
Dec 11, 2020

Conversation

eregon
Copy link
Collaborator

@eregon eregon commented Jul 11, 2020

There was no specs for write_string yet, so actually changing as proposed in #805 caused no spec failures.

@eregon
Copy link
Collaborator Author

eregon commented Jul 18, 2020

@larskanis Could you review?

The CI fails on TruffleRuby as expected (If we are OK with this change I'll apply it in TruffleRuby before merging) but it's otherwise passing.

Copy link
Member

@larskanis larskanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in a struggle with mixing null termination and no null termination. IMHO write_string should always add a null termination. The mixed behaviour reminds me of strncpy() which is often considered insecure therefore. So what do you think about adding a null after len bytes, if len is given?

@eregon
Copy link
Collaborator Author

eregon commented Jul 19, 2020

I'm in a struggle with mixing null termination and no null termination. IMHO write_string should always add a null termination.

I kind of agree, and write_bytes can be used for the behavior without null termination.
I'm a little worried that people might expect ptr.write_string("str", 3) to write 3 and not 4 bytes though.

I'll update the PR to always add a \0 for write_string.

When len is given the null character is appended after the truncated string.
Due to discussion in ffi#806 (review)

Also add some Unicode characters to point out we're counting bytes, not characters.
@larskanis
Copy link
Member

@eregon I changed code and specs to add a null character in any case in write_string. The Truffleruby tests fail now as expected. Can you adjust it as well?

lib/ffi/pointer.rb Outdated Show resolved Hide resolved
…tring

* And try to improve the clarity of the docs.
@eregon eregon changed the title Fix FFI::Pointer#write_string to terminate with a NUL byte if not given a length Fix FFI::Pointer#write_string to always terminate with a NUL byte Sep 24, 2020
@eregon eregon changed the title Fix FFI::Pointer#write_string to always terminate with a NUL byte Fix FFI::Pointer#write_string to always terminate with a \0 byte Sep 24, 2020
@eregon
Copy link
Collaborator Author

eregon commented Sep 24, 2020

I did the change in TruffleRuby, will ping here when it's merged and there is a new truffleruby-head build.

@eregon
Copy link
Collaborator Author

eregon commented Sep 24, 2020

@larskanis I'm not sure if we can do this change regarding compatibility.

In TruffleRuby's own usages of write_string, all 16 of them actually expected no final \0.
A good part of that code is originally from Rubinius, which clearly had the semantics of not appending a final \0 for write_string (OTOH Rubinius FFI had other incompatibilities, some quite unintuitive).
All of them were either single-argument or write_string(str, str.bytesize) (so still writes the full String).
I'll change them to write_bytes(str).
Migrating to write_bytes is not necessarily trivial, because write_string(str, str.bytesize) becomes write_bytes(str).
write_bytes(str, str.bytesize) would write str[str.bytesize, str.bytesize] which does nothing and not what we want.
So if we write only part of the string then write_string(str, len) => write_bytes(str, 0, len).

I also noticed a problematic usage in OptCarrot:
https://github.com/mame/optcarrot/blob/7a38ff7ed8121fdd890bec201f98ebb3e750c611/lib/optcarrot/driver/misc.rb#L120

When running on CRuby there is at least a nice error:

/home/eregon/.rubies/ruby-2.6.6/lib/ruby/gems/2.6.0/gems/ffi-1.13.1/lib/ffi/pointer.rb:107:in `put_string': Memory access offset=0 size=1025 is out of bounds (IndexError)
	from /home/eregon/.rubies/ruby-2.6.6/lib/ruby/gems/2.6.0/gems/ffi-1.13.1/lib/ffi/pointer.rb:107:in `write_string'
	from /home/eregon/code/truffleruby-ws/truffleruby/bench/optcarrot/lib/optcarrot/driver/misc.rb:109:in `icon_data'

So this should help migration, but I'm not sure if the Pointer's size is always known.

On TruffleRuby there are currently no such checks, and so it results in malloc() assertions & errors.
So on the plus side, this really motivates me to implement the bound checks for FFI pointers on TruffleRuby :D

@eregon
Copy link
Collaborator Author

eregon commented Sep 29, 2020

@larskanis I'm OK to merge the change in TruffleRuby-head to experiment.
But if we do so, I'd like to have a FFI release soon, so any issue related to this change gets discovered quickly and the TruffleRuby FFI has the same behavior as the last FFI release.
Sounds good?

eregon added a commit to eregon/optcarrot that referenced this pull request Sep 29, 2020
* Since #write_string might append a final \0 byte in future FFI releases: ffi/ffi#806
mame pushed a commit to mame/optcarrot that referenced this pull request Sep 29, 2020
* Since #write_string might append a final \0 byte in future FFI releases: ffi/ffi#806
@eregon
Copy link
Collaborator Author

eregon commented Oct 28, 2020

@larskanis Could you reply here, do you think we should try this change and see what happens?
I could merge the change in TruffleRuby soon, TruffleRuby is starting a new development cycle. Then I think we should have a FFI release soon after to get feedback whether it's OK or too incompatible.

@larskanis
Copy link
Member

Thank you Benoit for your investigation above! I'm sorry that I've got very little time for OSS development over the past months.

Regarding this change I'm unsure how to solve it best. Obviously we break compatibility in some cases, but we shouldn't in a minor version. We could use Pointer#size to check if we have enough space to write the string with termination. We then have two options:

  1. Refine this PR: If the size permits, then write the string with termination. If the size doesn't permit, write the string without the termination and print a deprecation warning. The size check and the deprecation warning can be removed with ffi-2.0.
  2. Reject this PR: Don't write the string with termination. If the size wouldn't allow termination, then print a deprecation warning instead. The string termination will be introduced with ffi-2.0.

I'm somewhat in favor with option 1, but still unsure. If we reach an agreement on this PR, I can release a new ffi version 1.14.0 within some days.

If the string fits into the memory but the final 0-byte doesn't, print a deprecation warning.
This should mitigate the comptibility issue that @eregon noticed in ffi#806 (comment) .

Also extend the specs for ffi-2.x behavior and the behavior for too large strings.
@larskanis
Copy link
Member

@eregon I added a commit for option 1 above. What do you think?

@eregon
Copy link
Collaborator Author

eregon commented Dec 5, 2020

@larskanis That sounds good to me.

In the case the pointer is unbounded (size=LONG_MAX), then an extra \0 will be written, but I think there is no way to warn for that (the only alternative would be to delay that change until FFI 2).

Given there is a deprecation warning, maybe we could actually use the new behavior after a few FFI 1.x releases?
I think requiring a new major version for this seems a bit much to me.
It's more like a Ruby 2.5->2.6 change, not Ruby 2->3 change (but Ruby versioning is weird).

@eregon
Copy link
Collaborator Author

eregon commented Dec 5, 2020

What should we do about FFI::Pointer#write_string_length?
Same as write_string or same as write_bytes?

Because of the name (string and not bytes) it seems better to just call write_string(str, len) and \0-terminate.
Then we'll need to update the docs.

Currently the documentation is a mix:

    # @param [String] str string to write
    # @param [Numeric] len length of string to return
    # @return [self]
    # Write +len+ first bytes of +str+ in pointer's contents.
    #
    # Same as:
    #  ptr.write_string(str, len)   # with len not nil
    def write_string_length(str, len)
      write_bytes(str, 0, len)
    end

@larskanis
Copy link
Member

I think it's good to merge now. There are some errors on Truffleruby, since it's not implemented there. Can I merge the PR now or is there anything else to consider?

@eregon
Copy link
Collaborator Author

eregon commented Dec 10, 2020

Fine to merge for me, I should be able to fix TruffleRuby soon after.

@larskanis larskanis merged commit 2b44904 into ffi:master Dec 11, 2020
larskanis added a commit to larskanis/ffi that referenced this pull request Dec 19, 2020
.. but add test cases for write_sting semantics.

This reverts commit 2b44904, reversing
changes made to 2918fdf.

Fixes ffi#857
larskanis added a commit that referenced this pull request Dec 19, 2020
Revert "Merge pull request #806 from eregon/fix-write_string"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

#write_string is inconsistent with #put_string and does not NUL-terminates
2 participants