Feature Request: re_take_until! #709
Comments
hello,
Example: Applying the above on
Honestly, a regex-based combinator would be absolutely amazing to have. I don't doubt for one moment that nom can do everything regex can do, but there's just something nice about the succinctness of being able to write something like If you're then able to throw that into the greater nom ecosystem, that would be splendid.
Without diving into the academic way of looking at this statement, I don't think there is a nom equivalent of this particular proposal. The take_* macros only do
There are a lot of regex-based combinators; you can find them by looking for the prefix
Not sure if replying to me, but to clarify, this doesn't exist (yet), so I just wrote it myself:
// `take_till_match!(alt!(tag!("John") | tag!("Amanda")))`
// Running that on `"Hello, Amanda"` gives `Ok(("Amanda", "Hello, "))`
macro_rules! take_till_match(
(__impl $i:expr, $submac2:ident!( $($args2:tt)* )) => (
{
use $crate::lib::std::result::Result::*;
use $crate::lib::std::option::Option::*;
// TODO: replace nom with $crate
use nom::{Err, Needed,need_more_err, ErrorKind};
use nom::InputLength;
use nom::FindSubstring;
use nom::InputTake;
use nom::Slice;
let ret;
let input = $i;
let mut index = 0;
loop {
let slice = input.slice(index..); // XXX: this is bad with multi-byte unicode
match $submac2!(slice, $($args2)*) {
Ok((_i, _o)) => {
ret = Ok(input.take_split(index));
break;
},
Err(_e1) => {
if index >= input.len() {
// XXX: this error is dramatically wrong
ret = need_more_err(input, Needed::Size(0), ErrorKind::TakeUntil::<u32>);
break;
} else {
index += 1;
}
},
}
}
ret
}
);
($i:expr, $submac2:ident!( $($args2:tt)* )) => (
take_till_match!(__impl $i, $submac2!($($args2)*));
);
($i:expr, $g:expr) => (
take_till_match!(__impl $i, call!($g));
);
);
I took @cormacrelf's macro and made some changes. First, I added a trait to allow "safe-slicing" of strings. Second, I modified the macro to make use of the trait.
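For readers who have not seen the modified macro, here is a minimal sketch of what a "safe-slicing" trait could look like. The trait name `SafeSlice` and the method `safe_slice_from` are hypothetical and are not taken from the comment above or from nom; the sketch only relies on the standard library's `str::get` and `<[T]>::get`, which return `None` instead of panicking on a bad index.

```rust
/// Hypothetical helper trait (not part of nom): slicing that returns `None`
/// instead of panicking when the index is unusable.
pub trait SafeSlice {
    /// Returns the tail starting at byte offset `index`, or `None` if `index`
    /// is out of range or (for `str`) falls inside a multi-byte character.
    fn safe_slice_from(&self, index: usize) -> Option<&Self>;
}

impl SafeSlice for str {
    fn safe_slice_from(&self, index: usize) -> Option<&str> {
        // `str::get` checks both the bounds and the UTF-8 char boundary.
        self.get(index..)
    }
}

impl SafeSlice for [u8] {
    fn safe_slice_from(&self, index: usize) -> Option<&[u8]> {
        // Byte slices only need the bounds check.
        self.get(index..)
    }
}
```

With something like this, the loop in the macro can skip any offset for which slicing yields `None` rather than panicking on input such as `"héllo"`.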
@lawliet89 that's closer, but you could reuse existing APIs by making the trait give you an Iterator instead. Just abstract
{
let input = $i;
for index in input.char_indices().map(|(i, _)| i) {
let slice = input.slice(index..);
match $submac2!(slice, $($args2)*) {
Ok((_i, _o)) => {
return Ok(input.take_split(index));
},
Err(_e1) => { },
}
}
need_more_err(input, Needed::Size(0), ErrorKind::TakeUntil::<u32>)
}
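The same idea can be sketched as a plain function with no nom dependency, which makes the control flow easier to follow. This is an illustration under assumptions, not the macro from this thread: parsers are modelled as closures returning `Option`, the function name reuses `take_till_match` only for continuity, and `john_or_amanda` is a hypothetical stand-in for `alt!(tag!("John") | tag!("Amanda"))`.

```rust
/// Walks `input` one character at a time and splits it at the first offset
/// where `parser` succeeds. Returns `Some((rest, consumed))`, mirroring nom's
/// `(remaining, output)` order, or `None` if the parser never matches.
pub fn take_till_match<'a, P, O>(input: &'a str, mut parser: P) -> Option<(&'a str, &'a str)>
where
    P: FnMut(&'a str) -> Option<O>,
{
    input
        .char_indices()
        // Only the byte offsets matter; they are always valid char boundaries.
        .map(|(i, _)| i)
        // Stop at the first offset where the sub-parser accepts the tail.
        .find(|&i| parser(&input[i..]).is_some())
        .map(|i| (&input[i..], &input[..i]))
}

/// Hypothetical stand-in for `alt!(tag!("John") | tag!("Amanda"))`:
/// succeeds when `input` starts with either name and returns the matched name.
pub fn john_or_amanda(input: &str) -> Option<&str> {
    ["John", "Amanda"]
        .iter()
        .find(|t| input.starts_with(**t))
        .map(|t| &input[..t.len()])
}
```

Calling `take_till_match("Hello, Amanda", john_or_amanda)` yields `Some(("Amanda", "Hello, "))`, the same split as the macro example earlier in the thread. Because the offsets come from `char_indices`, multi-byte input is never sliced mid-character.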
@cormacrelf Thanks for your suggestion! Made some changes and it looks much better.
Hey, I just stumbled upon this issue. I actually have a PR open for a Unfortunately it seems Geal is very busy right now, so I have no idea when it'll get eyes on it again. PR: #469
I'd propose closing this as regex functions are no longer present in this crate. I've opened up a new issue on nom-regex, rust-bakery/nom-regex#3, to continue the request. I don't think
I am trying to parse a not-so-structured document, and this would be a nice feature to have, so that I don't have to directly rely on the regex package (and for better code readability). If you are interested, I can do a PR since it seems fairly straightforward.