-
Notifications
You must be signed in to change notification settings - Fork 944
Profile low-privileged processes with perf_events #411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4e55259 to
768cf12
Compare
|
Rebased on latest master. |
3f5b544 to
3e2c7ce
Compare
|
Rebased on latest master & squashed all. |
3e2c7ce to
504b54d
Compare
|
Finally had some time to fix the remaining issues here. @apangin , would love your feedback. What's left:
|
|
Excuse me for a delay, I finally got a chance to look into this deeper. First of all, thank you for working on this - it's certainly an interesting feature.
More comments will follow inline in the diffs. |
|
Thanks for the review :)
Syscall vs transfer - you're correct, I have tried to keep the changes at minimum - the API does remain the same, About kallsyms parsing - existing code uses C++
Hmm, originally I designed the JVM as a server because it made some sense, as the JVM is the one who remains in its namespace, and the transfer program moves namespaces and finds the target. I also had in mind the use of this socket as a mutex preventing multiple APs from running concurrently (mentioned in the "Notes" section). We've since talked about the mutex in other issues and I now know it's not enough anyway :/ Swapping client/server will eliminate the listen loop and add the peer check. I'm not sure if the lifecycle of the socket will change - I will still maintain an open socket throughout profiling (so I can respond to new threads being created). Therefore only the initialization code will change. Overall I think it's worth it - so I will try to make this change.
Hmmm good idea, all requests are small enough to fit in a single message of a UDS (IDK what's the limit but I'm certain they fit...)
Nothing is shared indeed, each of the functions is used either in the client or in the server. Okay, I'll split them up and maintain just one header with whatever needs to be shared: I'll also rename So we'll have:
Of course, was planning to fix the build & CI in general after getting your general approval about this implementation.
Will do 👍 |
Right, this part is fine, except that the caller expects
That's what I meant. Thanks.
Yeah, there will be just
Naming is the most difficult programming task :) I thought about two files: |
504b54d to
de86115
Compare
|
I have attempted to make the changes according to your suggestions:
Ran another sanity after the changes. Also, I have rebased on latest master. Here is the result of (no diff besides the added commits) |
|
There are some synchronization issues now:
I tried to add a simple flag file (let fdtransfer flag I think we should add some |
de86115 to
cb874bb
Compare
9aa3cfe to
ed6cdf6
Compare
|
Whoopsie. Saw this just now. I really need to fix my Gmail notifications.
Well, the profiler merely logs the
👍
👍 that's how it goes now.
Naming is indeed hard 😉 |
|
Tested macOS (x64) build & |
|
Thank you for addressing my comments; we're approaching the finish :)
I though about it once again, and now I'm quite sure the error needs to be passed down the connection.
|
|
I realized there are two more problems:
|
No rush, I'm very pleased with the finish level of AP and I fully understand your desire to bring every detail to perfection.
Gotcha. I agree that it will work better for non-
Hmm. I didn't have startup-time support in mind. I can run fdtransfer with Do we want to support it in this PR? We can add partial support, perhaps - only for JVMs running in the host (or, in the same network namespace)
That's true. I previously added minimal protection against incorrect fd transfer, in the form of "request ID". In case requests do get mixed up, we'll get explicit error.
Looking in |
|
I briefly tested the Profiler log, as expected: |
Okay, I figured it out, it's funny.
I'll think of a synchronization mechanism, either something between |
I went with the simplest solution, which is retries. I don't like it much because it does add a bit of unnecessary delay, and because in extreme cases it can also fail (total timeout is 500ms). I thought for a while about a more accurate scheme. The thing is, I don't want to add anything in
@apangin do you have any idea here? I might try the pipe one meanwhile. |
Namespaces make this feature non-trivial indeed. But we can implement it just for a simple yet intuitive case: only if the socket is bind-mounted in the container explicitly (or if a container shares the host's namespace, of course).
I thought about running fdtransfer as a daemon service, which, when started, returns control as soon as the socket is ready. Basically, the same as your first idea. |
* Fork after binding, so caller can continue once we exit & there's no race with the profiler agent * Allow binding on a path * Accept fdtransfer=/path/to/socket in the agent, so it can connect to a pre-running fdtransfer
Sure, done. Here's the |
…s in serialization
I retained the same message - because it's the same error, although in the fdtransfer case it has happened in another process.
Combine SA_NOCLDWAIT with exiting the peer PID NS after the single fork().
ecb1950 to
4fd5f23
Compare
src/fdTransfer_shared_linux.h
Outdated
|
|
||
| static inline bool socketPath(const char *path, struct sockaddr_un *sun, socklen_t *addrlen) { | ||
| const int path_len = strlen(path); | ||
| if (path_len > sizeof(sun->sun_path) + 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, path_len + 1 > sizeof(sun->sun_path) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm the rationale was to save space for the null terminator, so yeah. But anyway the null terminator is not required for the kernel, since length is taken per the addrlen. So I will remove the + 1 and change it to memcpy.
src/perfEvents_linux.cpp
Outdated
| CStack PerfEvents::_cstack; | ||
|
|
||
| int PerfEvents::createForThread(int tid) { | ||
| int PerfEvents::createForThread(pid_t tid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's int in the header file; I'd change type in either place for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicer as pid_t, I suppose. So I'll make use of this option and change it everywhere :)
EDIT: oof, there are really many. I don't want to mix this unrelated change into the PR. Reverting to int.
|
Thank you for your hard work on the fixes. I think everything is conceptually good now. |
|
Made all discussed changes. I checked again both modes of execution ( I also discovered one edge case now, which resulted in unwanted behavior: if we get We can fix it either by remembering to kill |
Oh nvm, the action name has changed but I see they do run. |
|
I have no more substantial comments, I think the feature is ready for merge. |
@apangin done. I have added |
|
Thank you so much, and congratulations on the largest external contribution to async-profiler so far! |
Well, it says about non-privileged user, but fdtransfer needs to be run by root anyway. So I thinks it's OK to leave it as is, unless you know how to put it in a clear and concise way. |
Thanks! It was a pleasure.
Maybe: Anyway I agree that it doesn't matter much, a careful reader who encounters this problem will eventually find |
Implements what's been discussed in #397 .
General flow
startoperation now accepts a flagfdtransfer, which initializes theFdTransferclass (creates a UDS and waits for a connection).FdTransferpeer, and if so, can send requests to that peer, instead of doing certain operations on their own.perf_event_opencall, and obtaining a file descriptor of/proc/kallsyms.profiler.shaccepts--fdtransferwhich lets it start thefdtransferprogram alongside withjattach. Thefdtransferprogram connects to the target async-profiler and serves all requests it receives, until async-profiler closes the peer socket (uponstop).Using this, I'm able to profile a low-privileged container, running
test/Target.javaand started this way:docker run --user 8888:8888 java ...with this command:sudo ./profiler.sh -d 2 -e cpu -f ... -o flamegraph --fdtransfer $(pgrep java)- and I get proper kernel stacks. On my box,perf_event_paranoid = 3andkptr_restrict = 1.TODOs
Got some small ones in the code, these are the major ones.
Base branch - I'm currently based on 5dd9e86 (v1.8.4), on which should I rebase, master or v2.0?I just saw that v2.0 is merged into master, so I'll rebase on master.Decide on appropriate tests to add. I guess something likeAdded a basic smoke test.smoke-test.shwithfdtransfer, which proves that we get kernel stacks in addition to Java stacks.I currently ignoreI implemented this, but IIUC the purpose of thePerfEvents::check(that is, I don't check forfdtransferthere). Do you think it's relevant?checkaction, I don't think it's very relevant. I can add it if you think it's useful.Waiting forI run it without waiting for thefdtransferto exit? Currently it will connect to async-profiler and serve all requests until EOF (which happens uponstop). It works fine with thecollectaction. If runningstartdirectly,fdtransfermight need to be disowned so it continues executing afterprofiler.shexits.collectaction, because we runstopbefore exiting; forstart&resume, I run it withnohup &.perf_event_paranoidandkptr_restrict, since it's not needed if usingfdtransfer.Error handling:
Use ofPost-rebase and 7ec5c19, I'll use the new logger class.perror/fprintf(stderr)in async-profiler side - is it legit? async-profiler already writes to the process' stderr, but perhaps not as verbose as I've been here...fdtransfer, I think the correct behavior should be as if the "original" (non-transferred) operation has failed. Make sure it's the case in all relevant sites.Notes
FdTransfera static class, because it's simpler, and I don't see how we'll ever need more than 1 instance. Plus, in most of async-profiler code we already use such singletons.fdtransferusage in async-profiler, the listener socket remains open (and further invocations ofstart,fdtransferwill use it). This implicitly "solves" Potential crash when running async-profiler while already active #395 because the listener socket acts as a mutex.FdTransferclass in a way it's easier to add new "request types", because I already have some ideas that can be added after this PR:I think it'd be nice to change async-profiler to write all errors & warning to a separate file (instead of the process' stderr). This will be much more convenient when checking for errors. If we make this change -(this change was done on master already) we can, again, pass the fd of this "error file" to async-profiler, so its output is written "in the host".