Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo build failure and questions about handling callback in SPDK #5

Closed
xxks-kkk opened this issue Oct 18, 2018 · 13 comments
Closed

Cargo build failure and questions about handling callback in SPDK #5

xxks-kkk opened this issue Oct 18, 2018 · 13 comments

Comments

@xxks-kkk
Copy link

xxks-kkk commented Oct 18, 2018

Hello,

I'm trying to study your source code on how you build a rust-friendly wrapper on top of the rust bindings. I still cannot figure out how I can wrap around the callback function argument, which is heavily used in SPDK. I'm looking at your implementation of pub async fn write<'a> inside blob.rs, which is a wrapper around spdk_blob_io_write.

I don't understand the Line 286. What's cb_arg inside cb_arg::<()>(sender)? what does cb_arg::<()>(sender) mean? I'm trying to wrap around spdk_bdev_write and the callback function type in my case is pub type spdk_bdev_io_completion_cb = ::std::option::Option<unsafe extern "C" fn(bdev_io: *mut spdk_bdev_io, success: bool, cb_arg: *mut ::std::os::raw::c_void)>;. Can you share some tips on how you do the wrapper?

I try to clone your code and build it to see if cb_arg is from the generated bindings but when I run cargo build, I hit the following error

   Compiling clap v2.32.0
error[E0658]: scoped lint `clippy::cast_ptr_alignment` is experimental (see issue #44690)
   --> /home/zeyuanhu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-core-preview-0.3.0-alpha.8/src/future/future_obj.rs:232:21
    |
232 |             #[allow(clippy::cast_ptr_alignment)]
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: add #![feature(tool_lints)] to the crate attributes to enable

However, I notice you use #![feature(tool_lints)] inside lib.rs of spdk-sys. I'm not sure why I still hit the error.

Thanks!

@jkozlowski
Copy link
Owner

There are a few questions here, so let me try to break it down:

what does cb_arg::<()>(sender) mean?

Have a look at the definition: it simply boxes (heap allocates) the Sender arg and then converts it into a raw pointer (*mut c_void) that can be passed to spdk.

Can you share some tips on how you do the wrapper?

Certainly. The way spdk functions work is that you give them a pointer to a function (needs to be a extern "C", so I have either complete_callback_0 or complete_callback_1, depending on whether I get some value back from spdk or just an error code) and then also a pointer to a "context" (which is a heap allocated piece of state that your callback needs - to spdk it's the *mut c_void bit).

Spdk takes those args and when it wants to call your callback, it simply calls the extern "C" function you provided, giving it the *mut c_void pointer you gave it previously (plus some extra gobbins, depending on what the particular function is doing).

The clever bit in here is that I made complete_callback_1 generic over the return type, which means I only define it once, but rust will monomorphise it for me for any type I need to use it with (so e.g. when I used it as an arg to spdk_bs_init and spdk_bs_create_blob, rust will actually resolve those to 2 different extern "C" functions that I can then pass to spdk).

I try to clone your code and build it to see if cb_arg is from the generated bindings but when I run cargo build, I hit the following error

Not quite sure what is happening there, but the build setup is setup on CI, so you can try to have a look what you missed here.

One thing that stands out is that you're using alpha.8 of futures, which I didn't have time to update to yet. Also the error is coming from compiling futures, not my code.

@jkozlowski
Copy link
Owner

Can I also ask what's your interest here? Are you thinking of building something on top this code, or your interest is purely in the mechanics of wrapping callback based C APIs?

@xxks-kkk
Copy link
Author

xxks-kkk commented Oct 19, 2018

Thanks for the detailed response. I'm working with SPDK using Rust as well and I try to wrap the bdev.h, which your repo doesn't cover. I'm amazed how you provide the rust-friendly wrapper and I try to learn more about it :-) I find wrapping around the callbacks are tricky to do.

Regarding the pointer to "context", I haven't seen you wrapper requires passing in any context struct for the callback. Maybe I'm wrong but can you point a place that you actually need to pass in some struct to provide the context for the callback? Doing Some(complete_callback_0) and then cb_arg::<()>(sender) can take up the only spot we can pass in the context struct (i.e., in SPDK, we can pass in the context through void* cb_arg but now this place is occupied by cb_arg::<()>(sender)). How can you pass in the context struct for the callback?

In some places, you use cb_arg(sender) while in some places you use cb_arg::<()>(sender)? Are they equivalent?

You use the pattern of let (sender, receiver) = oneshot::channel(); and invoke the bindings and then do await! macro. I have hard time understanding the benefit of writing the code like this. What's the benefit of this pattern over say directly calling the bindings and return the result?

Thank you in advance!

@jkozlowski
Copy link
Owner

jkozlowski commented Oct 20, 2018 via email

@xxks-kkk
Copy link
Author

Thanks for the explanation. I still not sure I fully understand about the sender as context part. Here is my situation. I'm rewriting hello_bdev.c using Rust. As you can see, I try to pass in context as the additional argument to the callback. However, by how I'm wrapping right now, I have to pass my context struct as the surrounding capture using closure. However, from your code (e.g. the wrapper for spdk_blob_io_write), I think you probably fact the same situation as mine? I'm not sure how you're able to do this (passing context struct) without using closure?

If possible, do you have an example showing how you use your wrapper?

As you can see, I'm trying to build a file system on top of SPDK using Rust. I'll definitely give credits to your kindness help once my wrapper comes into a good shape :-)

@jkozlowski
Copy link
Owner

jkozlowski commented Oct 21, 2018 via email

@jkozlowski
Copy link
Owner

jkozlowski commented Oct 21, 2018 via email

@xxks-kkk
Copy link
Author

xxks-kkk commented Oct 22, 2018

Thanks for the comment. I haven't examples in main.rs in starfish-example-app and I haven't seen any tests under blob.rs.

You're right there will be a cascade of calls and I think my code will have the bug if there are multiple requests to the SPDK, how SPDK decides to finish the requests really depends on its internal scheduler as you pointed out earlier. I'm not a huge fan of SPDK chain of callback style and I think they're doing so due to the limitation of C itself. I think Rust can do a better job when comes to the provide the API and your approach is the solution to this. My current approach is inspired from your code here. However, I'm kind of wondering how my use will be different in main.rs if I use your approach instead (i.e., how write_complete and spdk_bdev_write will be glued together if I switch spdk_bdev_write to your approach).

Thank you for your time.

@jkozlowski
Copy link
Owner

jkozlowski commented Nov 23, 2018

I'm going to assume that you managed to get yourself unblocked. Feel free to reopen if not.

@xxks-kkk
Copy link
Author

xxks-kkk commented Dec 14, 2018

@jkozlowski I have made some small progress and now I hit the error when I try to call await! within some async function. I think the error I got is specific to the SPDK and not Rust. Here is the code. The error I got is:

118 |         tokio::run_async(run());
    |         ^^^^^^^^^^^^^^^^ `*mut libspdk_sys::spdk_bdev` cannot be sent between threads safely
    |
    = help: within `impl std::future::Future`, the trait `std::marker::Send` is not implemented for `*mut libspdk_sys::spdk_bdev`
    = note: required because it appears within the type `spdk_rs::SpdkBdev`
    = note: required because it appears within the type `std::option::Option<spdk_rs::SpdkBdev>`
    = note: required because it appears within the type `{std::option::Option<spdk_rs::SpdkBdev>, impl std::future::Future, ()}`
    = note: required because it appears within the type `[static generator@src/main.rs:63:43: 103:2 {std::option::Option<spdk_rs::SpdkBdev>, impl std::future::Future, ()}]`
    = note: required because it appears within the type `std::future::GenFuture<[static generator@src/main.rs:63:43: 103:2 {std::option::Option<spdk_rs::SpdkBdev>, impl std::future::Future, ()}]>`
    = note: required because it appears within the type `impl std::future::Future`
    = note: required because it appears within the type `impl std::future::Future`
    = note: required because it appears within the type `{impl std::future::Future, ()}`
    = note: required because it appears within the type `[static generator@src/main.rs:53:16: 61:2 {impl std::future::Future, ()}]`
    = note: required because it appears within the type `std::future::GenFuture<[static generator@src/main.rs:53:16: 61:2 {impl std::future::Future, ()}]>`
    = note: required because it appears within the type `impl std::future::Future`
    = note: required because it appears within the type `impl std::future::Future`
    = note: required by `tokio::run_async`

I'm new to Rust and I'm wondering if I need some special treatment when wrapping around spdk_bdev struct here? I'm wondering if you run into this issue before when you try to wrap around blob of spdk? What might be the cause for the error? I notice await! call works fine if I call it right at the beginning of run_inner() before any call relating to SPDK.

Also, I notice you implement your own executor. I'm wondering why you want to write your own instead of using something from Tokio?

Thanks much!

@jkozlowski
Copy link
Owner

jkozlowski commented Dec 15, 2018 via email

xxks-kkk added a commit to utsaslab/rustfs that referenced this issue Dec 15, 2018
@xxks-kkk
Copy link
Author

xxks-kkk commented Dec 15, 2018

@jkozlowski Thanks for the insights. I play around with your executor implementation and use it to implement hello_bdev. I run into the following issue:

  1. drop(poller) leads to seg fault code
ff697a41e in spdk_app_start (opts=0x7fffffffe150, start_fn=0x555555577670 <spdk_rs::event::SpdkAppOpts::start::start_wrapper::h27f22a723f05307c>, arg1=0x7fffffffdea8, arg2=0x0) at app.c:576
#3  0x000055555557772e in spdk_rs::event::SpdkAppOpts::start::h42df88f6a6f8e026 (self=..., f=...) at /home/zeyuanhu/share/rustfs/spdk-rs/src/event.rs:66
#4  0x000055555557aa6e in hello_nvme_bdev_rust_wrapper::main::he8692bb46e87c6f0 () at src/main.rs:107
#5  0x0000555555579f70 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h6afc4ff1576ed53f () at /rustc/21f26849506c141a6760532ca5bdfd8345247fdb/src/libstd/rt.rs:74
#6  0x00005555555c68f3 in std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h89c6e21182e0edb2 () at src/libstd/rt.rs:59
#7  std::panicking::try::do_call::h37211ff12254dd07 () at src/libstd/panicking.rs:310
#8  0x00005555555d465a in __rust_maybe_catch_panic () at src/libpanic_unwind/lib.rs:102
#9  0x00005555555c72c4 in std::panicking::try::hff26ecf2b679d32f () at src/libstd/panicking.rs:289
#10 std::panic::catch_unwind::h51f6113138b46c27 () at src/libstd/panic.rs:398
#11 std::rt::lang_start_internal::haf6df4f58cf174a0 () at src/libstd/rt.rs:58
#12 0x0000555555579f49 in std::rt::lang_start::hb3f6cdb8ed04a4e0 (main=0x55555557a750 <hello_nvme_bdev_rust_wrapper::main::he8692bb46e87c6f0>, argc=1, argv=0x7fffffffe4e8) at /rustc/21f26849506c141a6760532ca5bdfd8345247fdb/src/libstd/rt.rs:74
#13 0x000055555557ab7a in main ()

I check the drop implementation and the call to spdk_poller_unregister makes sense to me but you leave some comment say crash might happen. I'm wondering if you can explain more about your concern there?

  1. I cannot terminate SPDK framework by calling spdk_app_stop (code). If I follow your hello_blob example by calling spdk_app_stop once, reactor still keeps running. I figure that I have to drop(poller), which links to the previous issue. Now, I call spdk_app_stop twice to terminate SPDK framework. I'm wondering if my conjecture is correct (e.g., have to drop poller first before spdk_app_stop to shutdown SPDK)?

Thanks much!

@jkozlowski
Copy link
Owner

jkozlowski commented Dec 15, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants