Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend source file finding heuristic. #72

Merged
merged 1 commit into from
Aug 13, 2022

Conversation

ergawy
Copy link

@ergawy ergawy commented Aug 12, 2022

It looks like the current heuristic to find the source file in the
compilation command doesn't cover all possible scenarios. In particular,
there are situations where the source file is specified just before the
-o flag. This commit extends the heuristic to check for that as well.

It looks like the current heuristic to find the source file in the
compilation command doesn't cover all possible scenarios. In particular,
there are situations where the source file is specified just before the
`-o` flag. This commit extends the heuristic to check for that as well.
@ergawy
Copy link
Author

ergawy commented Aug 12, 2022

Thanks for the awesome tool. For context, I found this while building Carbon.

@cpsauer cpsauer merged commit c5384aa into hedronvision:main Aug 13, 2022
@cpsauer
Copy link
Contributor

cpsauer commented Aug 13, 2022

Cool! Great to meet you, @ergawy. Thanks for using and contributing, leaving things better than you found them.
I'm delighted to merge, and I really appreciate your high quality patch.

@cpsauer
Copy link
Contributor

cpsauer commented Aug 13, 2022

Just for my learning, any chance you have one of these commands lying around so I can see an example?
Anything I should know about what causes that other pattern of commands to be generated? (Like are they CppCompile actions generated from Carbon code?) I'm just wondering to myself whether there's some way of making the heuristic even more general for the future.

@cpsauer
Copy link
Contributor

cpsauer commented Aug 13, 2022

And while you're reading, since I saw you work at GuardSquare :)
We're fans of ProGuard and have been working with the Bazel folks to get a bunch of aspects of their ProGuard support fixed.
Is there anything you'd recommend for iOS that's similar and would work well with Bazel? Our goal is less about obfuscation and more about hiding class names that could cause class name conflicts, if, say, a user of a library we built used a different version of the same dependency. That's a problem we have solved well on Android but not yet on iOS, and I figured there was a chance you might have the answer in your "mental cache." If it's a cache miss, no problem.

cpsauer added a commit that referenced this pull request Aug 13, 2022
@ergawy
Copy link
Author

ergawy commented Aug 14, 2022

Hi Christopher,

Thanks for merging, here is a an example command:

['/usr/local/Cellar/llvm@12/12.0.1_1/bin/clang++
', '-no-canonical-prefixes', '-fcolor-diagnostics', '-Werror', '-Wall', '-Wextra', '-Wthread-safety', '-Wself-assign',
 '-Wimplicit-fallthrough', '-Wctad-maybe-unsupported', '-Wnon-virtual-dtor', '-Wno-unused-parameter', '-c', '-MD', '-M
F', 'bazel-out/darwin-fastbuild/bin/explorer/interpreter/_objs/type_checker/builtins.pic.d', '-frandom-seed=bazel-out/
darwin-fastbuild/bin/explorer/interpreter/_objs/type_checker/builtins.pic.o', '-std=c++17', '-stdlib=libc++', '-ffunct
ion-sections', '-fdata-sections', '-fPIC', '-Wno-builtin-macro-redefined', '-D__DATE__="redacted"', '-D__TIMESTAMP__="
redacted"', '-D__TIME__="redacted"', '-DLLVM_ON_UNIX=1', '-DHAVE_BACKTRACE=1', '-DBACKTRACE_HEADER=<execinfo.h>', '-DL
TDL_SHLIB_EXT=".so"', '-DLLVM_PLUGIN_EXT=".so"', '-DLLVM_ENABLE_THREADS=1', '-DHAVE_DEREGISTER_FRAME=1', '-DHAVE_LIBPT
HREAD=1', '-DHAVE_PTHREAD_GETNAME_NP=1', '-DHAVE_PTHREAD_H=1', '-DHAVE_PTHREAD_SETNAME_NP=1', '-DHAVE_REGISTER_FRAME=1
', '-DHAVE_SETENV_R=1', '-DHAVE_STRERROR_R=1', '-DHAVE_SYSEXITS_H=1', '-DHAVE_UNISTD_H=1', '-DHAVE_MACH_MACH_H=1', '-D
HAVE_MALLOC_MALLOC_H=1', '-DHAVE_MALLOC_ZONE_STATISTICS=1', '-DHAVE_PROC_PID_RUSAGE=1', '-DHAVE_UNW_ADD_DYNAMIC_FDE=1'
, '-DLLVM_NATIVE_ARCH="X86"', '-DLLVM_NATIVE_ASMPARSER=LLVMInitializeX86AsmParser', '-DLLVM_NATIVE_ASMPRINTER=LLVMInit
ializeX86AsmPrinter', '-DLLVM_NATIVE_DISASSEMBLER=LLVMInitializeX86Disassembler', '-DLLVM_NATIVE_TARGET=LLVMInitialize
X86Target', '-DLLVM_NATIVE_TARGETINFO=LLVMInitializeX86TargetInfo', '-DLLVM_NATIVE_TARGETMC=LLVMInitializeX86TargetMC'
, '-DLLVM_NATIVE_TARGETMCA=LLVMInitializeX86TargetMCA', '-DLLVM_HOST_TRIPLE="x86_64-unknown-darwin"', '-DLLVM_DEFAULT_
TARGET_TRIPLE="x86_64-unknown-darwin"', '-D__STDC_LIMIT_MACROS', '-D__STDC_CONSTANT_MACROS', '-D__STDC_FORMAT_MACROS',
 '-DLLVM_ENABLE_TERMINFO=1', '-DLLVM_ENABLE_ZLIB=1', '-DBLAKE3_USE_NEON=0', '-iquote', '.', '-iquote', 'bazel-out/darw
in-fastbuild/bin', '-iquote', 'external/llvm-project', '-iquote', 'bazel-out/darwin-fastbuild/bin/external/llvm-projec
t', '-iquote', 'external/llvm_terminfo', '-iquote', 'bazel-out/darwin-fastbuild/bin/external/llvm_terminfo', '-iquote'
, 'external/zlib', '-iquote', 'bazel-out/darwin-fastbuild/bin/external/zlib', '-isystem', 'external/llvm-project/llvm/
include', '-isystem', 'bazel-out/darwin-fastbuild/bin/external/llvm-project/llvm/include', '-O1', '-mllvm', '-fast-ise
l', '-gmlt', '-fno-omit-frame-pointer', '-mno-omit-leaf-frame-pointer', '-fno-optimize-sibling-calls', '--sysroot=/App
lications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk', '-fsanitize=address,undef
ined,nullability', '-fsanitize-address-use-after-scope', '-fno-sanitize-recover=undefined', '-fsanitize-undefined-stri
p-path-components=-1', '-fno-sanitize=vptr', 'explorer/interpreter/builtins.cpp', '-o', 'bazel-out/darwin-fastbuild/bi
n/explorer/interpreter/_objs/type_checker/builtins.pic.o']

For the corresponding BUILD file, there seems to be nothing special is being done for the target. That's why I thought the problem might be general and is worth handling in the extractor. However, I am new to Bazel so I might be looking in the wrong location here.

@ergawy
Copy link
Author

ergawy commented Aug 14, 2022

Is there anything you'd recommend for iOS that's similar and would work well with Bazel?

Happy to hear you are enjoy working with ProGuard.

We have iXGuard for iOS. However, it is a licensed product (and doesn't have an open-source component) that focuses mainly on obfuscation. Also, it is an LLVM-based compiler that works on the IR level: the only requirement is that the input app embeds bitcode then iXGuard re-compiles the IR into an obfuscated one. Also, unfortunately it doesn't not integrate with any build system: it is a post-processing tool after you build and export your app.

@cpsauer
Copy link
Contributor

cpsauer commented Aug 18, 2022

Sweet! Thanks.

Yeah, similarly non-obvious to me what's causing this one to have such a different order than the others we've seen.
I'm surprised we haven't seen this case before with all the usage there's been!

[Sounds like iXGuard is not our man, but thanks so much for thinking about it!]

@cpsauer
Copy link
Contributor

cpsauer commented Aug 18, 2022

After some more thought, I'd propose we switch over to using just your heuristic.

I'd expect it to be more robust, since presumably the file and it's -o are added together--and clearly the -c is added separately somewhere. And I've been looking through some test repos, and I haven't yet seen any situations where it doesn't work. If it captures all the cases, then I should strip things down for simplicity.

Does that square with what you saw? By default, I'll move on that soon, maybe tomorrow.

The one exception is msvc, where, to my knowledge, there's no /o flag like the one you'd put in (instead /Fo). There, I think we want to just go after the /c, since the order is flipped (example). Wish there were a better overall option, so if you think of one, please do say something.

cpsauer added a commit that referenced this pull request Aug 20, 2022
@cpsauer
Copy link
Contributor

cpsauer commented Aug 20, 2022

Added a commit moving us to your heuristic as the primary for GCC-formatted commands. Please lmk if that works for you, too!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants