-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse macro expressions. #219
Conversation
@bors-servo try |
Parse macro expressions. Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
💔 Test failed - status-travis |
@bors-servo try |
Parse macro expressions. Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
r? @fitzgen @bors-servo delegate=fitzgen |
✌️ @fitzgen can now approve this pull request |
// FIXME(emilio): LLVM 3.9 at least always include an extra token for no | ||
// good reason (except if we're at EOF). So we do this kind of hack, | ||
// where we skip known-to-cause problems trailing punctuation and | ||
// trailing keywords. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just 3.9, I've observed this in 3.5 through 3.8. The problem is that the cursor extent is always off by one (even on EOF). You don't get an extra token on EOF, because, well, you're at EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really unfortunate :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After inspecting a bit clang's source, the source range for macro expansions comes from deep in the lexer. That's more that what I want to chew in right now inside LLVM internals, so I'll open a bug for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't that hard actually.
LLVM bug (open since 2011): https://llvm.org/bugs/show_bug.cgi?id=9069
Patch: https://reviews.llvm.org/D26446
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in trunk :)
CXToken_Literal => token::Kind::Literal, | ||
CXToken_Identifier => token::Kind::Identifier, | ||
CXToken_Keyword => token::Kind::Keyword, | ||
CXToken_Comment => token::Kind::Comment, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should strip out comments (here or somewhere else before calling Identifier::macro_definition).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I saw the other bindgen doing that, but is there a good reason to? I expected cexpr to take care of them... Or do comments before the definition get included in the extent?
Thanks anyway, I'll add a test for it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the docs:
Returns an error if the replacement is not a valid expression, if called on most function-like macros, or if the token stream contains comments, keywords or unknown identifiers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think cexpr should take care of it I can see if I can change that.
Wow, that was fast :) There is one unfortunate thing about C macros evaluating into integers: we don't know what type they should be for convienent usage with the rest of the definitions. If some macros are always intended to be passed into a function that takes a short, the type of the constant should ideally be short. If it's not, you're risking integer type casting all over the Rust code (or you need to manually edit the bindgen output). Upstream selects types based on the highest bit set in the constant, and it allows to user to specify the specific type in each case. This is kind of a kludge but the best I could come up with. |
Yep, I prefer to keep it as simple as possible and don't try to be uber-smart about macro types, so I prefer to keep Seems easy enough to re-export them casting as other type or similar if needed, and it simplifies quite a bit of code. |
Being able to do something about this in an automated fashion is blocking my adoption of this fork in mbedtls-auto-sys (along with several other features I think). I'd be ok with post-processing the AST in my build script, but
|
You mean |
Yes, sorry, I meant to highlight that line. |
What about the API I added in the last commit, does something like that fullfill your needs? I think it's nicer and more generic. |
Yeah, this looks great, thanks! |
Heh, in this case it felt a bit more ugly because I needed to do |
☔ The latest upstream changes (presumably #218) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a few questions and nitpicks :)
Thanks!
parsed_macros: HashSet<String>, | ||
/// hard errors while parsing duplicated macros, as well to allow macro | ||
/// expression parsing. | ||
parsed_macros: HashMap<Vec<u8>, cexpr::expr::EvalResult>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this become Vec<u8>
instead of String
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's what cexpr
wants.
/// range. | ||
/// | ||
/// The boolean means a whether this is a signed integer type or not. | ||
Custom(&'static str, bool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom {
name: &'static str,
is_signed: bool,
}
If you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, sounds good.
// identifier of a token comes straight from clang, and we | ||
// enforce utf8 there, so we should have already panicked at | ||
// this point. | ||
let name = String::from_utf8(id).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment about why unwrapping is safe should become the str literal for .expect("blah blah")
instead of using .unwrap()
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I don't use to use big strings in expect()
, I'll do if it looks pretty :)
|
||
let kind = match kind { | ||
Some(kind) => kind, | ||
None => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want:
let kind = ctx.options().type_chooser
.and_then(|ch| ch.int_macro(&name, value))
.unwrap_or_else(|| {
// <the stuff in this `None => {...}` match arm>
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that looks finer than what I thought. Will do
@@ -24,8 +24,9 @@ impl <T> ::std::clone::Clone for __BindgenUnionField<T> { | |||
fn clone(&self) -> Self { Self::new() } | |||
} | |||
impl <T> ::std::marker::Copy for __BindgenUnionField<T> { } | |||
pub const JSVAL_ALIGNMENT: ::std::os::raw::c_uint = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened to this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a alignas(8)
, so we were parsing it by chance.
Thanks for the review Nick! :) @bors-servo r=fitzgen |
📌 Commit 4ee31ce has been approved by |
Parse macro expressions. Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
☀️ Test successful - status-travis |
Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.