Capi capture names #282
Hmm, it's hard to tell what exactly is going wrong... Here's my advice:
Sorry I can't diagnose your precise problem, but I'd attack this by ignoring Python and trying to make the code a bit simpler. Thank you for working on this!
Okay, point taken about not using Python's cffi. I changed to calling the
@BurntSushi: I think I'm finally grokking the lifetime aspect of this change. However, I think fully supporting it is going to require more changes than I'm comfortable making:
This change does not fix the segfault reported in #282 (comment).
That warning is there for good reason. Lifetime parameters are type parameters, and you can't expose generics in a C ABI. You have to remove it. (You might do it by using I'm not sure about your other problems. I'll need to have a closer look for that, which I can do later today thanks to your C test. :-)
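One way to keep lifetimes out of an exported signature is to have the opaque handle own its data, so the struct carries no lifetime parameter at all. The following is a minimal sketch under that assumption; `NameList`, `name_list_new`, `name_list_len`, and `name_list_free` are hypothetical names for illustration and are not part of this PR. A real export would also need an unmangled symbol name, which is omitted here.

```rust
/// Owns its data, so it needs no lifetime parameter and can sit behind a raw pointer.
pub struct NameList {
    names: Vec<String>,
}

/// Allocate the handle on the heap and hand ownership to the caller as a raw pointer.
pub extern "C" fn name_list_new() -> *mut NameList {
    let list = NameList {
        names: vec!["year".to_owned(), "month".to_owned()],
    };
    Box::into_raw(Box::new(list))
}

/// Borrow the handle without taking ownership; a null handle reports zero names.
pub extern "C" fn name_list_len(list: *const NameList) -> usize {
    if list.is_null() {
        return 0;
    }
    unsafe { (*list).names.len() }
}

/// Reclaim ownership so Rust drops the allocation exactly once.
pub extern "C" fn name_list_free(list: *mut NameList) {
    if !list.is_null() {
        unsafe { drop(Box::from_raw(list)) };
    }
}
```

The C side only ever sees `NameList *` as an opaque pointer, so no generic or lifetime information has to cross the ABI boundary.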
Just to say a bit more about
This is precisely what that (Technically, I don't think an
    Result::Err(err) => return false
};
println!("CString cast: '{}'", cs.to_str().unwrap());
*capture_name = *cs.into_raw();
I think the problem is here; the `CString` is getting freed when the `unsafe` block completes, thus the C unit test triggers a segfault trying to read the memory (similar to https://users.rust-lang.org/t/correct-way-to-implement-a-function-which-returns-a-c-string/315/3 ). I think the solution is to use a `Box`, but I'm unsure of the details of how to do that.
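The ownership rule involved can be sketched in isolation: `CString::into_raw` transfers ownership of the buffer to the caller, and the memory stays valid until the pointer is passed back to `CString::from_raw`. The helper names `name_into_raw`, `name_to_string`, and `name_free` below are hypothetical, and this is not necessarily how the PR resolved the problem.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Convert a Rust string to a C string and give ownership to the caller.
/// Returns null if `name` contains an interior NUL byte.
pub fn name_into_raw(name: &str) -> *mut c_char {
    match CString::new(name) {
        // `into_raw` consumes the CString without freeing its buffer.
        Ok(cs) => cs.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Read the string back the way a C caller would, without taking ownership.
pub fn name_to_string(ptr: *const c_char) -> String {
    let s = unsafe { CStr::from_ptr(ptr) };
    s.to_string_lossy().into_owned()
}

/// Take ownership back so the buffer is freed by Rust's allocator.
/// Must only be called with pointers produced by `name_into_raw`.
pub extern "C" fn name_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        unsafe { drop(CString::from_raw(ptr)) };
    }
}
```

Note that the value returned by `into_raw` is a `*mut c_char`; dereferencing it yields a single `c_char`, so a caller that needs the whole string has to receive the pointer itself, for example through a `*mut *mut c_char` out-parameter.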
@BurntSushi: Figured the problem out; it appears it all boiled down to 4702e69. The C unit test now passes correctly. Will test in Python's cffi next.
I've rebased my rure-python branch on top of this, and it's working great. https://github.com/davidblewett/regex/tree/rure-python
@BurntSushi: I would like to get this merged; everything appears to be in place. Tests are passing, and I've cleaned up the debug statements I had added earlier. I also tried to condense the code back down so that it is easier to follow. I have a functioning Python binding based on this that I would like to issue a PR for.
@davidblewett This all looks good. Would you mind squashing this down to one commit and rebasing it onto current master with a descriptive commit message? After that it should be good to go. Thanks so much for doing this! Very nice work. :-)
A PR to this repo? Or somewhere else?
Yeah, I can squash it. Will try to do that tomorrow. The branch where I've been working on Python support is here: davidblewett/regex@capi-capture-names...davidblewett:rure-python.
To be clear, I'm not sure I'm comfortable with adding other language bindings. Maintaining a canonical C API is one thing, but opening the door to every other language doesn't seem like a good idea. :(
* Add new `rure_iter_capture_names` struct
  - Opaque pointer encapsulates access to:
    - Underlying Rust iterator
    - Each capture group name CString
* Add functions for instantiating the iterator and processing:
  - `rure_iter_capture_names_new`
  - `rure_iter_capture_names_next`
  - `rure_iter_capture_names_free`
* Track CString objects handed out, and free them when called.
* Add unit test for new functions
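The shape described in this commit message (an opaque iterator that records every C string it hands out and frees them all at once) might look roughly like the sketch below. It is an illustration only, not the code from this PR: `NameIter`, `name_iter_new`, `name_iter_next`, and `name_iter_free` are hypothetical names, and the iterator here owns a plain `Vec<String>` instead of wrapping the regex crate's capture-name iterator.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Opaque handle: owns the remaining names and every C string handed out so far.
pub struct NameIter {
    names: std::vec::IntoIter<String>,
    handed_out: Vec<*mut c_char>,
}

/// Create the iterator and return it as an owning raw pointer.
pub fn name_iter_new(names: Vec<String>) -> *mut NameIter {
    Box::into_raw(Box::new(NameIter {
        names: names.into_iter(),
        handed_out: Vec::new(),
    }))
}

/// Writes the next name to `*out` and returns true, or returns false when exhausted.
pub extern "C" fn name_iter_next(it: *mut NameIter, out: *mut *mut c_char) -> bool {
    if it.is_null() || out.is_null() {
        return false;
    }
    let it = unsafe { &mut *it };
    let name = match it.names.next() {
        Some(name) => name,
        None => return false,
    };
    let cs = match CString::new(name) {
        Ok(cs) => cs,
        Err(_) => return false, // interior NUL byte
    };
    // Keep the pointer so the string can be freed together with the iterator.
    let ptr = cs.into_raw();
    it.handed_out.push(ptr);
    unsafe { *out = ptr };
    true
}

/// Frees the iterator and every string it handed out.
pub extern "C" fn name_iter_free(it: *mut NameIter) {
    if it.is_null() {
        return;
    }
    let it = unsafe { Box::from_raw(it) };
    for &ptr in &it.handed_out {
        unsafe { drop(CString::from_raw(ptr)) };
    }
}
```

With this design the caller never frees individual strings; every pointer obtained from `name_iter_next` stays valid until `name_iter_free` is called and must not be used afterwards.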
e52916e to 7f3d364
@BurntSushi: I went ahead and squashed the commits into a single one and updated the PR.
Yeah, I can understand that. I'm trying to find the best route to do this (I asked on the web forum a while back: https://users.rust-lang.org/t/binding-regex-from-python/7253 ). The problem that I'm running into is that I couldn't figure out how to have a simple "wrapper" crate of the By having it inside the
All right, all set! Thanks so much for your patience and pushing through this. I'm not sure what the in-crate/out-crate issue is, but if you open a new issue with more details I'd be happy to help!
Refs #279.
This isn't ready, but I need some help. I'm stumped by how Rust is handling strings. In my tests with these changes, I see this when using the API:
I'm not sure if this implementation is correct and the Python FFI is not working, or if the implementation is flawed. What I get back from each call to `rure_iter_next_captures` is either an empty string (when using `ffi.new('char[]', [])`), a single character (when using `ffi.new('char[]', '')`), or a single character followed by padding (when using `ffi.new('char[]', ' ' * 20)`).