-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Swift: open(2) interception #10447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Swift: open(2) interception #10447
Conversation
9c5d078
to
4dba121
Compare
4dba121
to
c638789
Compare
We probably need a quick description for the PR, but if you prefer as I'm writing an internal document for this, I can later copy & paste the relevant part in the PR description (even after we merge it) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job! I think we should at least solve the possible use-after-destructor problem, the other comments are not really required to merge (especially if we later do a cross-platform thing without fishhook)
|
||
namespace codeql { | ||
|
||
static std::string scratchDir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be a use-after-destructor problem with this, in case open
is called after exit
by some other destructor of an object with static storage duration. Even though this is not happening now, we cannot really guarantee this won't happen. For example, what if the swift library initializes some non-POD static global that writes down diagnostics at the end of the program?
We could either:
- undo the interception more or less at the time of calling
remapArtifacts
, puttingoriginal_open
back toopen
- use a
static const char*
for thescratchDir
, makingcodeql_open
auto fallback onoriginal_open
ifscratchDir == nullptr
, and set it back tonullptr
inremapArtifacts
. TheSwiftExtractorConfiguration
object owning the scratch dir string will be alive between calls toinitInterception
andremapArtifacts
(or maybe both for maximum cleanness)
I'm always wary of using non-const globals (which is probably unavoidable here because of having to interface with C via free functions), but the wariness doubles when using non-POD globals (const or non const) because of the subtle bugs that can happen during end of program destructor calls. So if there is a way to avoid that I would rather take it, and go for the second option above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a very good point and it would've been a nightmare to catch/debug afterwards!
However, SwiftExtractorConfiguration
doesn't really own the scratch dir as we changed tempArtifactDir
to be a function returning a new string every time it's called 😅. We can easily workaround this limitation, but IMO having a char *
is way too fragile.
I'll see what's the best way to rebind the functions back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point on the const char lifetime!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you go for a RAII class you could use the lifetime of that class to make sure the raw pointer is alive while required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played a bit with the options and the best suggestion I have is to add a boolean flag, though as I'm writing I'm wondering if it should be an atomic boolean? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, does llvm or the swift frontend fire up threads? Even if they do, are they still not joined when we execute our code?
Another option could be maybe to make scratchDir
indestructible. Declare it as
static const std::string* scratchDir = nullptr;
and initialize it with
scratchDir = new std::string(dir);
and just fuggedaboutit... but I'm ok with this solution. We will probably need to come back to this code if we generalize the interception to Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, all the available options are more like "which footgun shall we pick" 😄
void initInterception(const std::string& dir); | ||
void remapArtifacts(const std::unordered_map<std::string, std::string>& mapping); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should consider an alternative API using RAII, something like
class Interceptor {
const std::unordered_map<std::string, std::string>& mapping;
public:
// does initInterception, stores reference to mapping
Interceptor(const std::string& dir, const std::unordered_map<std::string, std::string>& mapping);
// does remapArtifacts using stored mapping, does cleanup if needed
~Interceptor();
};
The mapping does not even need to be filled at construction time.
misc/bazel/workspace.bzl
Outdated
@@ -55,3 +56,11 @@ def codeql_workspace(repository_name = "codeql"): | |||
"https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.1.tar.gz", | |||
], | |||
) | |||
|
|||
new_git_repository( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using http_archive
with GitHubs codeload service, i.e.
https://github.com/facebook/fishhook/archive/aadc161ac3b80db07a9908851839a17ba63a9eb1.zip
tends to be much faster than using git to pull in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, this is not a drop-in replacement as I'm getting include path errors:
/usr/bin/sandbox-exec -f /private/var/tmp/_bazel_alexdenisov/5115cfbf6c0d699ed00ee2a590e1d941/sandbox/darwin-sandbox/46/sandbox.sb /var/tmp/_bazel_alexdenisov/install/0c7899cb691a00c6ca493ede5765e1af/process-wrapper '--timeout=0' '--kill_delay=15' external/local_config_cc/wrapped_clang_pp '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -O0 -DDEBUG '-std=c++11' 'DEBUG_PREFIX_MAP_PWD=.' -iquote . -iquote bazel-out/darwin-fastbuild/bin -iquote external/swift_prebuilt_darwin_x86_64 -iquote bazel-out/darwin-fastbuild/bin/external/swift_prebuilt_darwin_x86_64 -iquote external/fishhook -iquote bazel-out/darwin-fastbuild/bin/external/fishhook -Ibazel-out/darwin-fastbuild/bin/external/swift_prebuilt_darwin_x86_64/_virtual_includes/swift-llvm-support -Ibazel-out/darwin-fastbuild/bin/external/fishhook/_virtual_includes/fishhook -MD -MF bazel-out/darwin-fastbuild/bin/swift/extractor/remapping/_objs/remapping/SwiftOpenInterception.macOS.d '-frandom-seed=bazel-out/darwin-fastbuild/bin/swift/extractor/remapping/_objs/remapping/SwiftOpenInterception.macOS.o' -isysroot __BAZEL_XCODE_SDKROOT__ -F__BAZEL_XCODE_SDKROOT__/System/Library/Frameworks -F__BAZEL_XCODE_DEVELOPER_DIR__/Platforms/MacOSX.platform/Developer/Library/Frameworks '-mmacosx-version-min=12.3' -no-canonical-prefixes -pthread '-std=c++17' -fno-rtti '-arch=x86_64' -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -target x86_64-apple-macosx -c swift/extractor/remapping/SwiftOpenInterception.macOS.cpp -o bazel-out/darwin-fastbuild/bin/swift/extractor/remapping/_objs/remapping/SwiftOpenInterception.macOS.o)
clang++: warning: argument unused during compilation: '-arch=x86_64' [-Wunused-command-line-argument]
swift/extractor/remapping/SwiftOpenInterception.macOS.cpp:2:10: fatal error: 'fishhook.h' file not found
#include <fishhook.h>
^~~~~~~~~~~~
1 error generated.
Error in child process '/usr/bin/xcrun'. 1
Target //swift/extractor:extractor failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 3.137s, Critical Path: 2.37s
INFO: 6 processes: 6 internal.
FAILED: Build did NOT complete successfully
With a broken symbolic link:
> file bazel-out/darwin-fastbuild/bin/external/fishhook/_virtual_includes/fishhook/fishhook.h
bazel-out/darwin-fastbuild/bin/external/fishhook/_virtual_includes/fishhook/fishhook.h: broken symbolic link to /private/var/tmp/_bazel_alexdenisov/5115cfbf6c0d699ed00ee2a590e1d941/external/fishhook/fishhook.h
The link points to a wrong place as there is an extra folder after archive extraction:
> ls /private/var/tmp/_bazel_alexdenisov/5115cfbf6c0d699ed00ee2a590e1d941/external/fishhook/
BUILD.bazel fishhook-aadc161ac3b80db07a9908851839a17ba63a9eb1 WORKSPACE
With fishhook-aadc161ac3b80db07a9908851839a17ba63a9eb1
being the actual repo root:
> ls /private/var/tmp/_bazel_alexdenisov/5115cfbf6c0d699ed00ee2a590e1d941/external/fishhook/fishhook-aadc161ac3b80db07a9908851839a17ba63a9eb1
CODE_OF_CONDUCT.md CONTRIBUTING.md fishhook.c fishhook.h fishhook.podspec LICENSE README.md
Any ideas what I'm doing wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to set the strip_prefix
option to (presumably) strip_prefix = "fishhook-aadc161ac3b80db07a9908851839a17ba63a9eb1"
- see the WORKSPACE.bazel
file in semmle-code for examples where we already do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It did the trick 🎉
|
||
namespace codeql { | ||
|
||
static std::string scratchDir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, does llvm or the swift frontend fire up threads? Even if they do, are they still not joined when we execute our code?
Another option could be maybe to make scratchDir
indestructible. Declare it as
static const std::string* scratchDir = nullptr;
and initialize it with
scratchDir = new std::string(dir);
and just fuggedaboutit... but I'm ok with this solution. We will probably need to come back to this code if we generalize the interception to Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bazel changes LGTM
No description provided.