Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

c ffi #39

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

c ffi #39

wants to merge 9 commits into from

Conversation

hanabi1224
Copy link
Contributor

@michaelsproul Here is an implementation proposal with some basic APIs for demonstration, let me know your comments.

@hanabi1224 hanabi1224 mentioned this pull request Aug 29, 2018
@michaelsproul
Copy link
Owner

I think this looks promising.

Do you think the use of a Trie<Vec<u8>, usize> is about as general as we can do? Obviously we can't allow arbitrary value types (like Trie<Vec<u8>, V>) because of C's lack of generics, but we can store pointers to data allocated on the C side, which is pretty good?

Cargo.toml Outdated
@@ -14,6 +14,7 @@ keywords = ["trie", "patricia", "collection", "generic", "prefix"]
categories = ["data-structures", "text-processing"]

[dependencies]
libc = "0.2"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small thing, but it would be good if this was an optional dependency behind a feature flag, see: https://doc.rust-lang.org/cargo/reference/manifest.html#the-features-section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What feature name would you suggest?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ffi or c-bindings or c_bindings would probably be good

src/c_ffi.rs Outdated
ffi_fn! {
fn radix_trie_insert(trie_ptr:*const Trie<Vec<u8>, usize>, key_ptr:*const c_char, value:usize){
let mut trie = unsafe { &mut *(trie_ptr as *mut Trie<Vec<u8>, usize>) };
let key = unsafe { CStr::from_ptr(key_ptr) }.to_bytes().to_vec();
Copy link
Owner

@michaelsproul michaelsproul Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could optimise this (avoid a double allocation), by implementing TrieKey for CStr. Using a Vec<u8> as the key means that it gets cloned each time, see:

impl TrieKey for Vec<u8> {
fn encode_bytes(&self) -> Vec<u8> {
self.clone()
}
}

Copy link
Contributor Author

@hanabi1224 hanabi1224 Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I did this on purpose to tolerant unexpected string pointer deallocation from foreign languages, especially managed ones whose memory is not being handled explicitly, does that make sense?

Copy link
Owner

@michaelsproul michaelsproul Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that we want that behaviour, but the current code copies the string twice. The .to_bytes().to_vec() call will copy everything into a Vec<u8>, and when that Vec<u8> gets inserted in the trie it will get cloned. Instead, if we generate the Vec<u8> in the TrieKey::encode_bytes function, the string will only get copied once, like:

impl TrieKey for CStr {
    fn encode_bytes(&self) -> Vec<u8> {
        self.to_bytes().to_vec()
    }
}

Maybe the compiler is smart enough to optimise the double-copy away, but it doesn't hurt to give it some help (and impl TrieKey for CStr is generally useful anyway)

@hanabi1224
Copy link
Contributor Author

@michaelsproul yes exactly

@hanabi1224
Copy link
Contributor Author

@michaelsproul Can you please take another look?

@michaelsproul
Copy link
Owner

Hi @hanabi1224! Sorry I missed your comment in September, I seemed to only get notified about it last week!

Anyway, I think the code is looking pretty good. I'm just wondering if you have tried using the API from a GC'd language like Python/Ruby/Haskell, and how that went?

src/lib.rs Outdated
#[cfg(feature = "cffi")]
pub use c_ffi::*;

pub use nibble_vec::NibbleVec;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate import (delete this line)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -11,6 +12,17 @@ extern crate quickcheck;
#[cfg(test)]
extern crate rand;

#[macro_use]
mod macros;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate, delete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

src/c_ffi.rs Outdated

ffi_fn! {
fn radix_trie_insert(trie_ptr:*const Trie<CString, usize>, key_ptr:*const c_char, value:usize){
let trie = unsafe { &mut *(trie_ptr as *mut Trie<CString, usize>) };
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the trie_ptr argument to this function should be *mut Trie<CString, usize>, at least to convey the intent that it's a mutable pointer (and to allow us to remove some of the casting gunk)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to run rustfmt over the whole project before the build will pass, you can see the .travis.yml for how to set it up

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! You can remove the cast (trie_ptr as *mut Trie<CString, usize>) too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants