Parse inputs from ActionGraphContainer instead of arguments #37
Hey, @alexander-born! Good catch--and definitely not a case I'd thought of. I hadn't realized people named directories with source extensions, but yep, that'll definitely trip us up. Sorry for being a little slow on the reply here. Know that I really appreciate you and your helping make this tool better. Here's what I'm thinking while reading:
Thanks for your thoughtfulness and care! |
|
Thank you! You're great :) |
@alexander-born, a potentially better alternative to directory-checking occurred to me while I was going to sleep last night: [A potential problem with directory checking is that the directories might not exist yet if they're build outputs.]

Instead, we're already getting information about the inputs a compile command uses from aquery. What about trying to filter the candidate source files from the inputs Bazel says the action takes? That list should exist even if the files themselves haven't been generated yet. We could maybe even just filter through the inputs list for source files, rather than parse the command line at all, but we'd need to test that Bazel never hides the source behind "middlemen"...

To see what Bazel thinks the inputs are--and to make sure that they don't include those problematically named directories--you can run, e.g.,

We're already parsing the aquery output in the Python code. See the

You can see that raw format either by dumping it from Python or via, e.g.,

Thoughts?

P.S. To show you an example of human-readable aquery output for a CppCompile action--and maybe save you some time (note the "Inputs:" list):

action 'Compiling kstring_test.cpp'
Mnemonic: CppCompile
Target: //:kstring_test
Configuration: darwin-fastbuild
Execution platform: @local_config_platform//:host
ActionKey: 2af4f0466d83be63669629d78d2553b1fd071310f90f59a8634a2e39c123e414
Inputs: [bazel-out/darwin-fastbuild/internal/_middlemen/_S_S_Ckstring_Utest-BazelCppSemantics_build_arch_darwin-fastbuild, external/bazel_tools/tools/cpp/grep-includes.sh, external/local_config_cc/cc_wrapper.sh, external/local_config_cc/libtool, external/local_config_cc/libtool_check_unique, external/local_config_cc/make_hashed_objlist.py, external/local_config_cc/wrapped_clang, external/local_config_cc/wrapped_clang_pp, external/local_config_cc/xcrunwrapper.sh, kstring_test.cpp]
Outputs: [bazel-out/darwin-fastbuild/bin/_objs/kstring_test/kstring_test.d, bazel-out/darwin-fastbuild/bin/_objs/kstring_test/kstring_test.o]
ExecutionInfo: {requires-darwin: '', supports-xcode-requirements-set: ''}
Command Line: (exec external/local_config_cc/wrapped_clang_pp \
'-D_FORTIFY_SOURCE=1' \
-fstack-protector \
-fcolor-diagnostics \
-Wall \
-Wthread-safety \
-Wself-assign \
-fno-omit-frame-pointer \
-O0 \
-DDEBUG \
'-std=c++11' \
'DEBUG_PREFIX_MAP_PWD=.' \
-iquote \
. \
-iquote \
bazel-out/darwin-fastbuild/bin \
-iquote \
external/com_google_googletest \
-iquote \
bazel-out/darwin-fastbuild/bin/external/com_google_googletest \
-iquote \
external/bazel_tools \
-iquote \
bazel-out/darwin-fastbuild/bin/external/bazel_tools \
-isystem \
external/com_google_googletest/googlemock \
-isystem \
bazel-out/darwin-fastbuild/bin/external/com_google_googletest/googlemock \
-isystem \
external/com_google_googletest/googlemock/include \
-isystem \
bazel-out/darwin-fastbuild/bin/external/com_google_googletest/googlemock/include \
-isystem \
external/com_google_googletest/googletest \
-isystem \
bazel-out/darwin-fastbuild/bin/external/com_google_googletest/googletest \
-isystem \
external/com_google_googletest/googletest/include \
-isystem \
bazel-out/darwin-fastbuild/bin/external/com_google_googletest/googletest/include \
-MD \
-MF \
bazel-out/darwin-fastbuild/bin/_objs/kstring_test/kstring_test.d \
'-frandom-seed=bazel-out/darwin-fastbuild/bin/_objs/kstring_test/kstring_test.o' \
-isysroot \
__BAZEL_XCODE_SDKROOT__ \
-F__BAZEL_XCODE_SDKROOT__/System/Library/Frameworks \
-F__BAZEL_XCODE_DEVELOPER_DIR__/Platforms/MacOSX.platform/Developer/Library/Frameworks \
'-mmacosx-version-min=12.3' \
-no-canonical-prefixes \
-pthread \
'-std=c++20' \
-no-canonical-prefixes \
-Wno-builtin-macro-redefined \
'-D__DATE__="redacted"' \
'-D__TIMESTAMP__="redacted"' \
'-D__TIME__="redacted"' \
-target \
x86_64-apple-macosx \
-c \
kstring_test.cpp \
-o \
bazel-out/darwin-fastbuild/bin/_objs/kstring_test/kstring_test.o)
# Configuration: 2bc885e6d31c22261d787b93d674f32836182b8da879aaf54b77618b9a74d265
# Execution platform: @local_config_platform//:host
ExecutionInfo: {requires-darwin: '', supports-xcode-requirements-set: ''} |
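To make the "filter the inputs list for source files" idea concrete, here is a minimal Python sketch. It assumes the inputs are already available as a list of path strings. The `SOURCE_EXTENSIONS` set and the `source_candidates` helper are hypothetical names for illustration, not the tool's actual code.

```python
import os

# Hypothetical set of extensions treated as sources; the real tool's list may differ.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".m", ".mm"}


def source_candidates(inputs):
    """Return the action inputs whose file extension marks them as source files."""
    return [path for path in inputs if os.path.splitext(path)[1] in SOURCE_EXTENSIONS]


# A few entries taken from the "Inputs:" list in the aquery output above.
inputs = [
    "external/local_config_cc/cc_wrapper.sh",
    "external/local_config_cc/make_hashed_objlist.py",
    "external/local_config_cc/wrapped_clang_pp",
    "kstring_test.cpp",
]
print(source_candidates(inputs))  # ['kstring_test.cpp']
```

Since this only looks at what Bazel reports as inputs, it would not depend on the files or directories existing on disk yet.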
Wow you're good. Directory checking did indeed not work because as you correctly assumed, they did not exist yet 👍 I will try your suggested solution. |
heh, well, honored to be of help. Wish only that I'd gotten to it earlier. |
Unfortunately, in my case the CppCompile action does not have an "inputs" list.
|
Would it make sense to use the arguments where the previous argument is
At first glance it seems that the input source files are always prepended with the |
The inputs are sneakily in there! See the "inputDepSetIds"? [If we can, I think we should try to avoid using the proximity to -c, since that's just by coincidence.] Getting late over here! I'd better crash for now, but good luck! And don't hesitate to write back. |
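For readers following along, here is a rough Python sketch of how the inputs could be recovered from "inputDepSetIds" in aquery's JSON output. The field names (`artifacts`, `depSetOfFiles`, `directArtifactIds`, `transitiveDepSetIds`, `pathFragments`, `parentId`) are my understanding of that format and worth checking against your own dump. The `input_paths` helper and the sample data are made up for illustration.

```python
def input_paths(action, aquery):
    """Collect the paths of all inputs of `action`, following transitive depsets."""
    artifacts = {a["id"]: a for a in aquery["artifacts"]}
    dep_sets = {d["id"]: d for d in aquery["depSetOfFiles"]}
    fragments = {f["id"]: f for f in aquery["pathFragments"]}

    def path_of(fragment_id):
        # Path fragments form a linked list from leaf to root via parentId.
        parts = []
        while fragment_id is not None:
            fragment = fragments[fragment_id]
            parts.append(fragment["label"])
            fragment_id = fragment.get("parentId")
        return "/".join(reversed(parts))

    paths, seen = set(), set()
    stack = list(action.get("inputDepSetIds", []))
    while stack:  # iterative traversal, so deep depset graphs can't hit the recursion limit
        dep_set_id = stack.pop()
        if dep_set_id in seen:
            continue
        seen.add(dep_set_id)
        dep_set = dep_sets[dep_set_id]
        for artifact_id in dep_set.get("directArtifactIds", []):
            paths.add(path_of(artifacts[artifact_id]["pathFragmentId"]))
        stack.extend(dep_set.get("transitiveDepSetIds", []))
    return paths


# Tiny hand-written sample (not real Bazel output).
aquery = {
    "artifacts": [{"id": 1, "pathFragmentId": 1}, {"id": 2, "pathFragmentId": 3}],
    "pathFragments": [
        {"id": 1, "label": "kstring_test.cpp"},
        {"id": 2, "label": "external"},
        {"id": 3, "label": "tool.sh", "parentId": 2},
    ],
    "depSetOfFiles": [
        {"id": 1, "directArtifactIds": [2], "transitiveDepSetIds": [2]},
        {"id": 2, "directArtifactIds": [1]},
    ],
    "actions": [{"mnemonic": "CppCompile", "inputDepSetIds": [1]}],
}
print(sorted(input_paths(aquery["actions"][0], aquery)))
```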
Ah I see, thanks. |
d5118d5 to b74a4e5
I just pushed changes to parse the input files from the |
Wahoo! Testing. |
Shucks. Okay, looks like we've got (at least) one case with two cpp files in the input.
I'll dig a bit |
Very weird. So I think we should maybe just intersect the two sets of input candidates? |
I'm considering parsing the file out of the action name instead. It's only available in the --text output, but I could match on actionKey. Another downside is that it excludes the external/ prefix. Hmm. |
That's unfortunate. I am just guessing here. Maybe we don't need to consider transitive dep sets? |
Oh, good point! I'd think, too, that the source file would always be directly depended on. Let's see |
Tried commenting out your good depset traversal code, but no dice. It seems like the source file isn't included directly. I'll dig. In the meantime, to confirm, the code to try disabling is the block starting with |
Correct. |
[Ugh, I'm sorry this parsing was so gross.] |
Looks like the source file is always one transitive layer deep, but it's probably dangerous to depend on that vs just intersecting the two. |
Ok, so creating the intersection it is? |
I think so, but gimme one sec. |
I'm trying to decide whether doing that intersection is cleaner than getting them from the action message. |
The benefit of that would be easy to modify, non-recursive code, and clearly simple and fast, though you've already done a great job with the other. The downside (as above) would be that the action message is a little different with tools and external workspaces. For example: I can imagine solving this with some intersect logic like making sure the path in the action name is contained in the path from the command line, but things like paths with spaces are going to be a bad time. |
I think that's likely to be worse. Let's do the intersect. |
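A minimal sketch of what "the intersect" could look like, assuming we already have the source-like arguments from the command line and the set of input paths from aquery. The function name and the sample paths are hypothetical.

```python
def intersect_sources(command_line_candidates, input_paths):
    """Keep command-line candidates that Bazel also lists as inputs, preserving order."""
    inputs = set(input_paths)
    return [arg for arg in command_line_candidates if arg in inputs]


# Made-up example: a directory named like a source file appears on the command
# line, and a second .cpp file appears only in the inputs.
candidates = ["bazel-out/bin/generated.cpp", "kstring_test.cpp"]
inputs = {"kstring_test.cpp", "helper.cpp", "external/local_config_cc/cc_wrapper.sh"}
print(intersect_sources(candidates, inputs))  # ['kstring_test.cpp']
```

This only helps if the problematic directory is not itself reported as an input, which is an assumption that would need testing.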
Do you want to do it, or should I? (Getting pretty late here, so I'd do it tomorrow.) |
Honestly, should we just take advantage of the proximity to the I'm regretting dismissing that earlier. I'd originally thought this would be a much easier filtering operation. |
I see the following options:
|
Okay, yeah, after all that, I think my vote is for using |
Give me a few minutes. |
Thanks for being great. I'm so sorry we're likely wasting the good recursive traversal you wrote. But I love your resourcefulness. |
b74a4e5 to 2424c89
No worries regarding wasting the traversal code, sometimes one has to explore a few options and it didn't take too much time. |
Can you try the current changes? |
Yep! On it. |
heh, yikes, just ran the pull and overwrote another repo. One sec. |
Works for me yes. |
Works for me! (Though my other repo is still a bit messed up.) [I'm still going to finish fixing that other repo. And I'm going to massage things just a little here: adding one performance trick I noticed while exploring the docs, documenting the assumption we made, and defending it with an assert.] You should feel super good about this! I really appreciate your tracking down a new problem case and being willing to explore around to find us the best solution. |
Thanks for your kind words and very good support! |
And likewise :) |
<Commits actually merged, but GitHub didn't recognize, so I'll manually close.> @alexander-born, if you get a chance, after you update, could you double check that this fixes things for you? Cheers! |
Thanks, yes the compile commands generation works now for the previously problematic targets! |
For some targets the assert for multiple source files in `_get_files` stops the compile commands extraction. This is due to some generated directories ending with source file extensions.
example:
This change does not interpret arguments as source files if the previous argument is `-isystem`.
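As a rough illustration of the rule described above (not the actual `_get_files` implementation), the following Python sketch skips any source-like argument that directly follows `-isystem`. The extension set, helper name, and sample command line are assumptions.

```python
import os

# Hypothetical set of extensions treated as sources; the real tool's list may differ.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".m", ".mm"}


def source_files_from_args(args):
    """Return source-like arguments, ignoring any that directly follow -isystem."""
    sources = []
    previous = None
    for arg in args:
        looks_like_source = os.path.splitext(arg)[1] in SOURCE_EXTENSIONS
        if looks_like_source and previous != "-isystem":
            sources.append(arg)
        previous = arg
    return sources


# Made-up command line with an include directory whose name ends in ".cpp".
args = [
    "clang", "-isystem", "bazel-out/bin/generated.cpp",
    "-c", "kstring_test.cpp", "-o", "kstring_test.o",
]
print(source_files_from_args(args))  # ['kstring_test.cpp']
```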