convert_byte_array should be Vec<i8>? #101

Closed
daschl opened this issue Aug 20, 2018 · 9 comments

Comments

@daschl

daschl commented Aug 20, 2018

Hi,

I was wondering, since the byte java type maps to an i8, shouldn't byte array conversion return a Vec<i8> and not Vec<u8>?

@dmitry-timofeev
Contributor

That’s a good question: how often do users need signed bytes? Personally, I more often have to work around the lack of unsigned bytes in Java (e.g., use ints and cast them to bytes safely) and the confusion caused by signed bytes.

Josh Bloch, the author of Effective Java, considers signed bytes one of the platform's mistakes: https://youtu.be/hcY8cYfAEwU?t=747

@daschl
Author

daschl commented Aug 21, 2018

I agree with you, but what happens if the byte array coming from Java has negative values?

Maybe it would be better to take i8, and if the user knows it is safe, they can cast the slice?

@dmitry-timofeev
Contributor

what happens if the byte array coming from Java has negative values?

I'd assume the 8-bit values keep the same binary representation: a Java byte of -128 is 10000000 in two's complement, and the same bits 10000000 read as 128 in a Rust u8 (likewise, -1 as a byte is 11111111, which reads as 255 in a u8).
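The bit-pattern argument can be checked directly in Rust. This is a minimal sketch that assumes only the built-in `as` cast between `i8` and `u8`, which leaves the eight bits unchanged; the helper name `reinterpret` is hypothetical and is not part of jni-rs.

```rust
/// Reinterprets a signed byte (Java `byte`, Rust `i8`) as an unsigned byte.
/// `reinterpret` is a hypothetical helper name used only for illustration;
/// the `as` cast keeps the 8-bit pattern unchanged.
fn reinterpret(b: i8) -> u8 {
    b as u8
}

fn main() {
    // Print the signed value, its two's complement bits, and the unsigned view.
    for b in [-128i8, -1, 0, 127] {
        println!("{:>4} (i8) = {:08b} = {:>3} (u8)", b, b, reinterpret(b));
    }
}
```

Negative values therefore survive the round trip: casting the resulting `u8` back with `as i8` yields the original Java value.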

Maybe it would be better to take i8, and if the user knows it is safe, they can cast the slice?

Would love to hear opinions from the Rust-literate people, as I am not one :-) Is it a safe and efficient (O(1)) operation?

@daschl
Author

daschl commented Aug 22, 2018

I think the only way to do this cast is through unsafe/transmute without iterating through the whole thing, but if you know the data in it is fine then it should be okay.
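For readers wondering what such a cast might look like, here is a hedged sketch of an O(1) reinterpretation of a borrowed slice using `std::slice::from_raw_parts`. The function name `as_unsigned` is hypothetical and is not part of jni-rs; the safety argument in the comment is the assumption the cast relies on.

```rust
use std::slice;

/// Views a slice of signed bytes as unsigned bytes without copying.
/// `as_unsigned` is a hypothetical helper name, not a jni-rs API.
fn as_unsigned(bytes: &[i8]) -> &[u8] {
    // SAFETY: `i8` and `u8` have the same size and alignment, every bit
    // pattern is valid for both types, and the returned slice borrows
    // `bytes`, so it cannot outlive the original data.
    unsafe { slice::from_raw_parts(bytes.as_ptr() as *const u8, bytes.len()) }
}

fn main() {
    // Stand-in for a byte array received from Java.
    let from_java: Vec<i8> = vec![-128, -1, 0, 127];
    let view = as_unsigned(&from_java);
    println!("{:?}", view);
}
```

No element is touched, so the cost does not depend on the array length. The trade-off is the `unsafe` block, which callers may prefer to keep behind a small, well-reviewed helper like this one.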

I'm currently working on a tool that does this for you (not yet announced/released but if you want to take a look: https://github.com/roast-rs/roast/blob/master/docs/docs.adoc) so I'm trying to figure out the right mappings.

@dmitry-timofeev
Contributor

I see, thank you. I’d probably go with a list of u8 as the default; I think the much more common use case is one where the sign is not relevant (e.g., hashes, serialized data, raw strings).

The project looks cool! Is it correct that it is similar to #81, but uses Rust structs instead of Java classes with native methods as its input to auto-generate the JNI glue code (i.e., creates a Java proxy of a native object)? If you'd like to see how we use Rust JNI to solve that, here is a Rust glue code for a proxy of a native object. Would be wonderful if the amount of boilerplate could be reduced.

@daschl
Author

daschl commented Aug 22, 2018

Yes, basically you write Rust code, and then, using derive in combination with a CLI tool, we do codegen to generate the FFI stubs and the Java files. The CLI tool then puts them in the right spot. So all you have to do is write Rust code and then consume it from Java; the idea is zero boilerplate. I still have lots to work through, e.g., if you return results, we can turn them into exceptions on the Java side, and so forth.

@alexander-irbis
Contributor

On the one hand, binary data in Rust is usually represented as an array of unsigned bytes. On the other hand, sometimes the data really is an array of signed bytes.

In jni-rs, there are two methods for converting arrays: convert_byte_array, which works with u8, and get_byte_array_region, which works with i8. It would be more consistent if both functions worked with the same type.

But it would be more convenient if the functions working with arrays had two versions - for signed and unsigned data.

@dmitry-timofeev
Contributor

Closing for now — if any interest occurs in this feature, will revive. Thanks for reporting!

@MolotovCherry

MolotovCherry commented Dec 2, 2021

Well, I'll just say here, that in my API I need to return a byteArray from a Vec<u8>, and there's no way to convert the Vec<u8> to a Vec<i8>. It's nice that I can convert from i8 to u8 when I receive a byteArray from Java, but the other half that lets me return a byteArray by converting the bytes is missing.

I had to use the bytemuck crate as a workaround for this limitation. Even though the conversion is simple with bytemuck, I felt I shouldn't have needed to add another crate for such an obvious use case.
bytemuck::cast_slice::<u8, i8>(&*bytes);

In the end, I can get/set a Java byteArray (Vec<i8>) and also convert it to a Rust Vec<u8>, which is great.
But there's no function for going from Rust's u8 back to Java's i8, so I can't use set_byte_array_region without manually converting the data first.

If anyone's coming from Google, just use the bytemuck crate to convert:

bytemuck::cast_slice::<u8, i8>(&*bytes);
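If adding a crate is undesirable, the same direction (u8 to i8) can also be sketched with the standard library alone. The version below copies the data, so it is O(n) rather than a zero-cost cast, but it needs no `unsafe`; the helper name `to_signed` is hypothetical and is not part of jni-rs or bytemuck.

```rust
/// Copies unsigned bytes into a new vector of signed bytes.
/// `to_signed` is a hypothetical helper name used only for illustration.
fn to_signed(bytes: &[u8]) -> Vec<i8> {
    // `as i8` keeps each byte's bit pattern, so 128 becomes -128 and 255 becomes -1.
    bytes.iter().map(|&b| b as i8).collect()
}

fn main() {
    // Stand-in for Rust-side data that should be handed back to Java.
    let payload: Vec<u8> = vec![0, 127, 128, 255];
    let signed = to_signed(&payload);
    println!("{:?}", signed);
}
```

The resulting `Vec<i8>` can be borrowed as `&[i8]` wherever signed bytes are expected. For large buffers, the zero-copy bytemuck call above avoids the extra allocation.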
