Skip to content
This repository was archived by the owner on Mar 10, 2025. It is now read-only.

Avoid call to rb_string_value_cstr() which doesn't handle binary strings properly#35

Merged
stancampbell3 merged 1 commit intobuoyant-data:masterfrom
rtyler:issues/32-binary-data
Sep 11, 2014
Merged

Avoid call to rb_string_value_cstr() which doesn't handle binary strings properly#35
stancampbell3 merged 1 commit intobuoyant-data:masterfrom
rtyler:issues/32-binary-data

Conversation

@rtyler
Copy link
Copy Markdown

@rtyler rtyler commented Sep 11, 2014

I believe this is all that's necessary to resolve #32

I've done some manual testing to validate that we're sending the right bytes through Kafka, in the integration test I printed out the md5 of the byte string going in, and then printed the md5 of the buffer that a Kafka consumer (using the Java Kafka library) reads off:

producer side

Hermann::Lib::Producer
  #push_single
    with binary data
hash: a3e3316785491f09d90fd57b2d412437
      should return

consumer side

"\n+AuOzetQrTrdwSY14ig7I_1oUwjp3DvTx3YWhSTGD4Fo\x12\x192014-09-10T00:18:47-07:00\x1A,\n\x06scream\x12\x0Emissing_device\x1A\x12flexd-today-0-app0\"\t\n\astarted*(\b\x00\x12$009f0305-b50a-455d-b137-e52b45f674aa*(\b\x01\x12$53c0d817-d94b-4b7a-9a58-95fe8cec4333"
Hash: a3e3316785491f09d90fd57b2d412437
Read: 213 from Kafka

@rtyler rtyler added this to the 0.14 milestone Sep 11, 2014
@rwygand
Copy link
Copy Markdown

rwygand commented Sep 11, 2014

-2 yo shit do not build. ;)

@rtyler
Copy link
Copy Markdown
Author

rtyler commented Sep 11, 2014

@rwygand are you referring to Travis? That's not going to be fixed until #8 is addressed

@stancampbell3
Copy link
Copy Markdown
Contributor

Travis can take a leap for now.

Tyler, good approach. I'm wary of relying on Ruby to "do the right thing"
rather than explicitly calling out the fact that we're passing an array of
bytes as the message. I guess types make me happy.

On Wed, Sep 10, 2014 at 8:37 PM, R. Tyler Croy notifications@github.com
wrote:

@rwygand https://github.com/rwygand are you referring to Travis? That's
not going to be fixed until #8
#8 is addressed


Reply to this email directly or view it on GitHub
#35 (comment).

@rtyler
Copy link
Copy Markdown
Author

rtyler commented Sep 11, 2014

@stancampbell3 Most of what I found on the internet suggested this approach to safely handle binary data in C extensions 1, 2 (see attached patch file), 3 (the author isn't using the macro properly but it's effectively the same thing). The rb_string_value_cstr() function actually calls RSTRING_PTR just prior to futzing with the string buffer >_<

Between Ruby 1.8.7 and the master branch of the Ruby repo, the RSTRING_PTR macro hasn't actually changed which indicates to me that it's a stable API for C extensions to use.

Using the buffer directly is safe IMO and obviates the need to create Ruby Array of FixNum objects only to convert them back into a C buffer.

@stancampbell3
Copy link
Copy Markdown
Contributor

No, that does make sense. What I meant was back up in the Ruby VM, I'm
slightly distrustful of the assumption that binary data can be safely held
in a Ruby string instead of an array of bytes/integers/hedgehogs.. I just
want to assure myself that duck-typing the push with binary data stuffed in
a Ruby string is not going to be an issue.

The RSTRING_PTR
https://github.com/ruby/ruby/blob/master/include/ruby/ruby.h#L659 bizness
is gold and good call just short-circuiting the rb_* conversion. A char*
is exactly what I'd expect the underlying data to present as.

On Wed, Sep 10, 2014 at 10:15 PM, R. Tyler Croy notifications@github.com
wrote:

@stancampbell3 https://github.com/stancampbell3 Most of what I found on
the internet suggested this approach to safely handle binary data in C
extensions 1
http://clalance.blogspot.com/2011/01/writing-ruby-extensions-in-c-part-8.html,
2 https://bugs.ruby-lang.org/issues/1472 (see attached patch file), 3
https://stackoverflow.com/questions/17179744/ruby-c-extension-for-manipulating-binary-data/17180408#17180408
(the author isn't using the macro properly but it's effectively the same
thing). The rb_string_value_cstr()
https://github.com/ruby/ruby/blob/v1_9_3_484/string.c#L1430 function
actually calls RSTRING_PTR just prior to futzing with the string buffer

_<

Between Ruby 1.8.7 and the master branch of the Ruby repo, the RSTRING_PTR
https://github.com/ruby/ruby/blob/master/include/ruby/ruby.h#L659 macro
hasn't actually changed which indicates to me that it's a stable API for C
extensions to use.

Using the buffer directly is safe IMO and obviates the need to create Ruby
Array of FixNum objects only to convert them back into a C buffer.


Reply to this email directly or view it on GitHub
#35 (comment).

stancampbell3 added a commit that referenced this pull request Sep 11, 2014
Avoid call to rb_string_value_cstr() which doesn't handle binary strings properly
@stancampbell3 stancampbell3 merged commit e7f0f90 into buoyant-data:master Sep 11, 2014
@rtyler rtyler deleted the issues/32-binary-data branch September 11, 2014 18:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hermann's Producer.push should be happy with messages containing binary data.

3 participants