Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse macro expressions. #219

Merged
merged 3 commits into from
Nov 9, 2016
Merged

Parse macro expressions. #219

merged 3 commits into from
Nov 9, 2016

Conversation

emilio
Copy link
Contributor

@emilio emilio commented Nov 6, 2016

Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

@bors-servo try

@bors-servo
Copy link

⌛ Trying commit e0a0b1d with merge 9811b01...

bors-servo pushed a commit that referenced this pull request Nov 6, 2016
Parse macro expressions.

Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
@bors-servo
Copy link

💔 Test failed - status-travis

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

@bors-servo try

@bors-servo
Copy link

⌛ Trying commit 3c2af5e with merge ee85cd8...

bors-servo pushed a commit that referenced this pull request Nov 6, 2016
Parse macro expressions.

Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

r? @fitzgen

@bors-servo delegate=fitzgen

@bors-servo
Copy link

✌️ @fitzgen can now approve this pull request

// FIXME(emilio): LLVM 3.9 at least always include an extra token for no
// good reason (except if we're at EOF). So we do this kind of hack,
// where we skip known-to-cause problems trailing punctuation and
// trailing keywords.
Copy link
Contributor

@jethrogb jethrogb Nov 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just 3.9, I've observed this in 3.5 through 3.8. The problem is that the cursor extent is always off by one (even on EOF). You don't get an extra token on EOF, because, well, you're at EOF.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's really unfortunate :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After inspecting a bit clang's source, the source range for macro expansions comes from deep in the lexer. That's more that what I want to chew in right now inside LLVM internals, so I'll open a bug for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't that hard actually.

LLVM bug (open since 2011): https://llvm.org/bugs/show_bug.cgi?id=9069
Patch: https://reviews.llvm.org/D26446

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in trunk :)

llvm-mirror/clang@3b61b92

CXToken_Literal => token::Kind::Literal,
CXToken_Identifier => token::Kind::Identifier,
CXToken_Keyword => token::Kind::Keyword,
CXToken_Comment => token::Kind::Comment,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should strip out comments (here or somewhere else before calling Identifier::macro_definition).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I saw the other bindgen doing that, but is there a good reason to? I expected cexpr to take care of them... Or do comments before the definition get included in the extent?

Thanks anyway, I'll add a test for it :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the docs:

Returns an error if the replacement is not a valid expression, if called on most function-like macros, or if the token stream contains comments, keywords or unknown identifiers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think cexpr should take care of it I can see if I can change that.

@jethrogb
Copy link
Contributor

jethrogb commented Nov 6, 2016

Wow, that was fast :)

There is one unfortunate thing about C macros evaluating into integers: we don't know what type they should be for convienent usage with the rest of the definitions. If some macros are always intended to be passed into a function that takes a short, the type of the constant should ideally be short. If it's not, you're risking integer type casting all over the Rust code (or you need to manually edit the bindgen output).

Upstream selects types based on the highest bit set in the constant, and it allows to user to specify the specific type in each case. This is kind of a kludge but the best I could come up with.

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

Yep, I prefer to keep it as simple as possible and don't try to be uber-smart about macro types, so I prefer to keep UInt/Int in general, and longs if out of range, to keep with the usual C semantics.

Seems easy enough to re-export them casting as other type or similar if needed, and it simplifies quite a bit of code.

@jethrogb
Copy link
Contributor

jethrogb commented Nov 6, 2016

Being able to do something about this in an automated fashion is blocking my adoption of this fork in mbedtls-auto-sys (along with several other features I think).

I'd be ok with post-processing the AST in my build script, but

  1. I can't distinguish in the Rust AST between C macros and C constants
  2. bindgen won't let me modify the AST and then write it out

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

You mean macro_int_types?

@jethrogb
Copy link
Contributor

jethrogb commented Nov 6, 2016

Yes, sorry, I meant to highlight that line.

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

What about the API I added in the last commit, does something like that fullfill your needs? I think it's nicer and more generic.

@jethrogb
Copy link
Contributor

jethrogb commented Nov 6, 2016

Yeah, this looks great, thanks!

@emilio
Copy link
Contributor Author

emilio commented Nov 6, 2016

Heh, in this case it felt a bit more ugly because I needed to do chooser.as_ref() anyway, I don't think it matters much.

@bors-servo
Copy link

☔ The latest upstream changes (presumably #218) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a few questions and nitpicks :)

Thanks!

parsed_macros: HashSet<String>,
/// hard errors while parsing duplicated macros, as well to allow macro
/// expression parsing.
parsed_macros: HashMap<Vec<u8>, cexpr::expr::EvalResult>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this become Vec<u8> instead of String?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's what cexpr wants.

/// range.
///
/// The boolean means a whether this is a signed integer type or not.
Custom(&'static str, bool),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom {
    name: &'static str,
    is_signed: bool,
}

If you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good.

// identifier of a token comes straight from clang, and we
// enforce utf8 there, so we should have already panicked at
// this point.
let name = String::from_utf8(id).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment about why unwrapping is safe should become the str literal for .expect("blah blah") instead of using .unwrap() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I don't use to use big strings in expect(), I'll do if it looks pretty :)


let kind = match kind {
Some(kind) => kind,
None => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want:

let kind = ctx.options().type_chooser
    .and_then(|ch|  ch.int_macro(&name, value))
    .unwrap_or_else(|| {
        // <the stuff in this `None => {...}` match arm>
    });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that looks finer than what I thought. Will do

@@ -24,8 +24,9 @@ impl <T> ::std::clone::Clone for __BindgenUnionField<T> {
fn clone(&self) -> Self { Self::new() }
}
impl <T> ::std::marker::Copy for __BindgenUnionField<T> { }
pub const JSVAL_ALIGNMENT: ::std::os::raw::c_uint = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a alignas(8), so we were parsing it by chance.

@emilio
Copy link
Contributor Author

emilio commented Nov 8, 2016

Thanks for the review Nick! :)

@bors-servo r=fitzgen

@bors-servo
Copy link

📌 Commit 4ee31ce has been approved by fitzgen

@bors-servo
Copy link

⌛ Testing commit 4ee31ce with merge 8d83bdf...

bors-servo pushed a commit that referenced this pull request Nov 9, 2016
Parse macro expressions.

Clang is trolling me really hard so I'm going to see if the extra token I'm always getting is LLVM 3.9 specific.
@bors-servo
Copy link

☀️ Test successful - status-travis

@bors-servo bors-servo merged commit 4ee31ce into rust-lang:master Nov 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants