Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Whitequark's Code to extract #define's and other compile time info from object files #678

Merged
merged 3 commits into from
Apr 18, 2018

Conversation

rgrinberg
Copy link
Member

note: only the last commit is relevant to the PR. Here's the commit message:

Do not run compiled programs to extract #define's

Running a program to extract a #define value doesn't work in a cross compilation
environment. Nevertheless, we can extract #define constants by invoking the
preprocessor directly using the -E flag and doing some parsing to extract
values.

As a consequence, we now ignore the link_flags argument. As we're not going to
be linking any executables. We aren't removing the argument altogether since
it's technically a breaking change. The user will instead see a deprecation
warning when ~link_flags is provided.

@dra27 would be good to hear your opinion about this and how well would this hack work on Windows.

cc @Chris00 because I know he was interested in implementing this in the original configurator

cc @avsm b/c it's a general cross compilation issue

I know that @whitequark implemented a solution to the same problem in ctypes. But he opted for simply compiling the programs and then using preprocessor hacks in conjunction with regex to extract the #define's. Would be good to have his opinion here about this approach instead.

@rgrinberg rgrinberg requested review from a user, Chris00 and dra27 April 6, 2018 03:47
@whitequark
Copy link
Member

I know that @whitequark implemented a solution to the same problem in ctypes. But he opted for simply compiling the programs and then using preprocessor hacks in conjunction with regex to extract the #define's. Would be good to have his opinion here about this approach instead.

In case of ctypes the problem was more complex as the values that I needed weren't just hidden in preprocessor definitions but they were the result of evaluating sizeof. So, nothing short of actual compilation would have (reasonably portably) worked.

Also, she*.

@rgrinberg
Copy link
Member Author

rgrinberg commented Apr 6, 2018 via email

@whitequark
Copy link
Member

Hmm, a cross compilation solution that doesn't support ctypes is certainly
not satisfactory to us. How do you feel about us just taking your code and
incorporating it into configurator? I don't see any other way to support
this otherwise.

I hereby declare any contributions I made for the ctypes build system licensed under CC0, so go ahead and take them.

I also wonder if we should keep this preprocessor appoach as an
optimization or as backup.

The preprocessor approach is clearly cleaner, but is strictly less general, so I'm not sure if much can be gained from it. One thing it might be useful for is getting a list of all definitions, which is much faster.

@rgrinberg
Copy link
Member Author

The preprocessor approach is clearly cleaner, but is strictly less general, so I'm not sure if much can be gained from it. One thing it might be useful for is getting a list of all definitions, which is much faster.

Well I was thinking about possible portability issues. But this is probably just paranoia and me being pretty unfamiliar with this subject. I suppose we can keep the -E solution in the git history until we have a solid use case for it.

@rgrinberg
Copy link
Member Author

@whitequark quick question regarding Extract_from_c: there doesn't seem to have a way to check if a constant is defined (the Switch check . I could probably figure this out, but if you know something off the top of your head, then please shoot.

@rgrinberg
Copy link
Member Author

Ah, I just remembered that a CLA is necessary before we include your code.

https://janestreet.github.io/contributing.html

Sorry for the hassle, I know this is a drag.

@whitequark
Copy link
Member

"Fax"? What century is this?

Anyway, it's going to be a while until I have any access to a printer and scanner, so you'll have to wait, I guess.

@whitequark
Copy link
Member

quick question regarding Extract_from_c: there doesn't seem to have a way to check if a constant is defined (the Switch check . I could probably figure this out, but if you know something off the top of your head, then please shoot.

Sure. Insert into the usual template:

#ifdef %s
const char *s = "BEGIN-Y-END";
#else
const char *s = "BEGIN-N-END";
#endif

@rgrinberg
Copy link
Member Author

Okay, thanks for the quick response. If you have a smart phone then a scanner is likely not necessary btw 😅

There was talk about simplifying the CLA into a DCO. Any updates on that @avsm @yminsky?

@whitequark
Copy link
Member

There was talk about simplifying the CLA into a DCO.

Why not simplify the CLA signing process instead? E.g. with clahub.

@rgrinberg
Copy link
Member Author

Given that we want to keep this approach in the git history (in case it ever comes in handy), the plan is to merge this PR as is (after review). I will prepare a new PR to use the extract_from_c approach.

@emillon emillon mentioned this pull request Apr 6, 2018
@rgrinberg
Copy link
Member Author

Btw, we have 1 more reason to merge this PR in its current form. We don't want to delay things until whitequark signs the CLA. And we don't wanna rush her either.

Though I'm not yet sure if we'll include this feature in 1.21 - which will be the first release with jbuilder.configurator.

@Chris00
Copy link
Member

Chris00 commented Apr 11, 2018

I think @whitequark 's approach is very nice and suits my needs — while the -E approach does not. IMHO, it should be generalized in a straightforward manner to require a single compilation to extract multiple values (this should make the extra cost reasonable).

@ghost
Copy link

ghost commented Apr 11, 2018

Unless there is a need for the current PR right away, I think we should wait and merge only the final version.

@whitequark
Copy link
Member

Okay, I'll try to get CLA signed faster, given that it doesn't look like the situation with it will change. Will try to do it today.

@ghost
Copy link

ghost commented Apr 11, 2018

Thanks @whitequark.

@rgrinberg regarding switching to a DCO, we are making good progress. I just had another meeting to discuss some of the questions, but it's not yet confirmed.

@whitequark
Copy link
Member

CLA signed and sent to core-cla@.

@ghost
Copy link

ghost commented Apr 12, 2018

Well received, thanks!

@rgrinberg rgrinberg changed the title Use -E flag to extract #define's Use Whitequark's Code to extract #define's and other compile time info from object files Apr 13, 2018
@rgrinberg rgrinberg mentioned this pull request Apr 13, 2018
@rgrinberg
Copy link
Member Author

Because of the swift turnaround in the CLA situation, we can just proceed and do everything in this PR.

So I implemented whitequark's method and made it work with a single compilation step. What remains is:

  • Replace use of Str. I guess I'll use ocamllex since using jbuilder's re is a bit painful
  • Test things on Windows.
  • Share code between compile_c_prog and compile_and_link_prog.

@Chris00 or @dra27 running the test on windows would be appreciated.

@rgrinberg
Copy link
Member Author

@whitequark could you describe what the STRINGIFY macros is for? The definition is a bit mysterious:

#define STRINGIFY1(x) #x
#define STRINGIFY(x) STRINGIFY1(x)

@rgrinberg rgrinberg closed this Apr 14, 2018
@rgrinberg rgrinberg reopened this Apr 14, 2018
@whitequark
Copy link
Member

@rgrinberg
Copy link
Member Author

@whitequark thanks.

One thing that's left out of this PR is support for extracting the values of macros like STRINGIFY and other compile time information like alignof or sizeof. I've strictly implemented the old discovery API in terms of this new approach.

I will propose some ways to expose the new type of information plus the ability for a user to define their own macros in a follow up.

@@ -329,8 +329,6 @@ module C_define = struct
);
if has_type Type.String then (
pr {|
#define STRINGIFY1(x) #x
#define STRINGIFY(x) STRINGIFY1(x)

#if __USE_MINGW_ANSI_STDIO && defined(__MINGW64__)
#define REAL_ARCH_INTNAT_PRINTF_FORMAT \"ll\"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by the way this is internal to ctypes, you probably don't need it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll remove it. There will be a way to add and use your own custom macros so this use case will be covered. The end goal is to port ctypes to jbuilder, so if you see anything else that needs to be done for that happen, please let me know.

@ghost
Copy link

ghost commented Apr 16, 2018

This is a nice method, I didn't know about it!

@rgrinberg, IIUC the code currently relies on the order of strings in the object file? This looks a bit fragile to me.

@whitequark
Copy link
Member

Yes, it needs to emit strings like "BEGIN-0-" etc.

@ghost
Copy link

ghost commented Apr 16, 2018

ok

Copy link
Member

@Chris00 Chris00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good modulo the remark made that one should not rely on the order of the variables.

@@ -0,0 +1 @@
val extract : string list -> Lexing.lexbuf -> string list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please document it? In particular, you may want to let the reader know that this need to be synchronized with...

Buffer.contents buf

let extract_values obj_file vars =
let obj = Io.read_file obj_file in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation to read the whole file into memory? Granted, it is small so it is no harm but I am wondering...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I pushed a commit to fix this.

@rgrinberg
Copy link
Member Author

I'm no longer relying on the order of strings in the object file. @diml have another look.


let extract_values obj_file vars =
let values =
Exn.protectx (open_in obj_file)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use Io.with_lexbuf_from_file here

@rgrinberg
Copy link
Member Author

@diml good point. fixed.

I also added an entry module so that the helper modules isn't accessible to the user.

@rgrinberg rgrinberg force-pushed the configurator-E branch 2 times, most recently from 2bcfc11 to 4c2c7ba Compare April 17, 2018 11:53
@ghost
Copy link

ghost commented Apr 17, 2018

Ok, LGTM!

Running a program to extract a #define value doesn't work in a cross compilation
environment. Nevertheless, we can extract #define constants by invoking the
preprocessor directly using the -E flag and doing some parsing to extract
values.

As a consequence, we now ignore the link_flags argument. As we're not going to
be linking any executables. We aren't removing the argument altogether since
it's technically a breaking change. The user will instead see a deprecation
warning when ~link_flags is provided.
Only expose the public api (v1) in it
@rgrinberg rgrinberg merged commit cc8de7d into ocaml:master Apr 18, 2018
@rgrinberg rgrinberg deleted the configurator-E branch April 18, 2018 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants