Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

service/dap: Support for replay and core modes #2367

Merged
merged 44 commits into from
Jul 21, 2021

Conversation

lggomez
Copy link
Contributor

@lggomez lggomez commented Mar 4, 2021

This PR aims to add support for rr replay and core actions from the DAP layer. This basically encloses the following:

New launch modes: replay and core

The following modes are added:

  • replay: Replays an rr trace, allowing backwards flows (reverse continue and stepback). Requires a traceDirPath property on launch.json pointing to a valid rr trace directory.
    Equivalent to dlv replay <tracedir> command.
  • core: Replays a core dump file, showing its callstack and the file matching the callsite. Requires a coreFilePath property on launch.json pointing to a valid coredump file.
    Equivalent to dlv core <exe> <corefile> command.

Dependencies

To achieve this the following additional changes were made:

@lggomez lggomez changed the title WIP - service/dap: Support for rr modes WIP - service/dap: Support for rr modes and compatible DAP commands Mar 4, 2021
@lggomez
Copy link
Contributor Author

lggomez commented Mar 7, 2021

For the relatively small work it took without reworking any of the existing support in the dap service it looks very promising already @aarzilli @polinasok (sans the broken tests and the rest of the details I still must take care of)

Here's a preview of an unoptimized binary (compiled with gcflags '-N -l'). The debug flow along with the stacktrace and symbols seem to work just fine:

rr2.mp4

Here's an unoptimized binary (the common use case, as people will not usually compile with optimizations turn off except for edge cases). As expected, symbols that were optimized away due to inlining and further heuristics are broken, but no errors are triggered on the backend:

rr1.mp4

I still have to test the core dumps, but I have already some questions/TODOs in which I could use some help:

  • Variable handling: how should be handle the broken variables? assuming we do not have a way to tell apart optimized apps from debug builds
  • CPU identification: rr is only compatible with linux and certain CPU families. There is an official intel package which might be worth looking at for that end: https://github.com/intel-go/cpuid, but since this involves adding a third party dependency into the project (and still a TODO for AMD cpus) I bring this into the conversation first

@aarzilli
Copy link
Member

aarzilli commented Mar 7, 2021

Variable handling: how should be handle the broken variables? assuming we do not have a way to tell apart optimized apps from debug builds

I'm not sure why this is a problem specific to this. It can already happen with the "exec" launch mode. You say that this is:

the common use case, as people will not usually compile with optimizations turn off except for edge cases

but why would that be the case? Why wouldn't the rr backend be supported with the "debug" and "test" launch modes?

CPU identification: rr is only compatible with linux and certain CPU families. There is an official intel package which might be worth looking at for that end: https://github.com/intel-go/cpuid

Setting aside the cpuid discussion for a moment, you have a chicken and egg problem. You are trying to tell the client if SupportStepBack is true in the initialization response but you can't know that until you know the launch mode in the launch request. I think that's the wrong place to do it, either you do it with a capabilities event (assuming it works for this in VSCode), after you get the launch request, or with some magic in the javascript extension.

@lggomez
Copy link
Contributor Author

lggomez commented Mar 7, 2021

I'm not sure why this is a problem specific to this. It can already happen with the "exec" launch mode. You say that this is:

the common use case, as people will not usually compile with optimizations turn off except for edge cases

What I meant with this is that people replaying traces will see a lot of bogus entries on the variables pane unless the trace itself comes from a deoptimized build, which is not the default case. I aimed at the possibility of maybe trying to sanitize the invalid symbols from the debugger state but I guess that won't be a worthwhile approach

Setting aside the cpuid discussion for a moment, you have a chicken and egg problem. You are trying to tell the client if SupportStepBack is true in the initialization response but you can't know that until you know the launch mode in the launch request. I think that's the wrong place to do it, either you do it with a capabilities event (assuming it works for this in VSCode), after you get the launch request, or with some magic in the javascript extension.

Indeed, I handled this with the nodejs API on the old dap implementation I'm porting https://github.com/golang/vscode-go/pull/89/files#diff-42370209a717d9c657a14f0dcfc980d8321fff898c864c7be23b315deb977394R822-R827. I need to check with @hyangah if this is still feasible with the new design (I'm adding the VS code parameters to this dap service on golang/vscode-go#1268)

@aarzilli
Copy link
Member

aarzilli commented Mar 8, 2021

I aimed at the possibility of maybe trying to sanitize the invalid symbols from the debugger state but I guess that won't be a worthwhile approach

AFAICT there's nothing useful we can do about it.

@hyangah
Copy link
Contributor

hyangah commented Mar 8, 2021

I'm not sure why this is a problem specific to this. It can already happen with the "exec" launch mode. You say that this is:

the common use case, as people will not usually compile with optimizations turn off except for edge cases

What I meant with this is that people replaying traces will see a lot of bogus entries on the variables pane unless the trace itself comes from a deoptimized build, which is not the default case. I aimed at the possibility of maybe trying to sanitize the invalid symbols from the debugger state but I guess that won't be a worthwhile approach

Setting aside the cpuid discussion for a moment, you have a chicken and egg problem. You are trying to tell the client if SupportStepBack is true in the initialization response but you can't know that until you know the launch mode in the launch request. I think that's the wrong place to do it, either you do it with a capabilities event (assuming it works for this in VSCode), after you get the launch request, or with some magic in the javascript extension.

Indeed, I handled this with the nodejs API on the old dap implementation I'm porting https://github.com/golang/vscode-go/pull/89/files#diff-42370209a717d9c657a14f0dcfc980d8321fff898c864c7be23b315deb977394R822-R827. I need to check with @hyangah if this is still feasible with the new design (I'm adding the VS code parameters to this dap service on golang/vscode-go#1268)

I don't see a way for the Go extension to step in during initialization between the editor & the dap server using the DAP protocol. (unless we have a flag to supply when starting dlv dap). Any capabilities change has to be done through the CapabilitiesEvent. Not sure whether it will be too late if dlv dap tells the change when receiving the launch request that specifies rr mode, but I think it's worth trying.

Copy link
Collaborator

@polinasok polinasok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is mixing standalone features, without comments relating them to dlv cli, so it took me a moment to orient myself. I am not sure if I got this right, but I think this is what is going on:

  • Replay of core dumps

This appears to be the equivalent of dlv core <exe> <corefile>
You enable it with mode="coredump" while using program attribute for <exe> and coreDumpPath attribute for <corefile>.

  • Replay of rr recordings

This appears to be the equivalent of dlv replay <trace dir>.
So it looks like this one is enabled with mode="replay" and attribute traceDirectory. And program is ignored then?
dlv replay without any additional args generates Error: you must provide a path to a binary. But I think it actually refers to the trace dir, doesn't it? Unfortunately, I don't have a system right now to test this on myself.

  • Create an rr recording of the current app (from VS Code)

Is this the equivalent of dlv --backend=rr that enables rev and rewind commands with dlv cli (and additional buttons with vscode UI)? It is not clear to me why it is enabled with a separate mode while also requiring the backend to be set to rr. If I understand correctly, this backend option could be used with a pre-built binary (dlv exec) or a binary that is built by delve (dlv debug/test), so shouldn't you be able to specify rr backend with mode="debug", etc?

service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Show resolved Hide resolved
@polinasok
Copy link
Collaborator

Setting aside the cpuid discussion for a moment, you have a chicken and egg problem. You are trying to tell the client if SupportStepBack is true in the initialization response but you can't know that until you know the launch mode in the launch request. I think that's the wrong place to do it, either you do it with a capabilities event (assuming it works for this in VSCode), after you get the launch request, or with some magic in the javascript extension.

Indeed, I handled this with the nodejs API on the old dap implementation I'm porting https://github.com/golang/vscode-go/pull/89/files#diff-42370209a717d9c657a14f0dcfc980d8321fff898c864c7be23b315deb977394R822-R827. I need to check with @hyangah if this is still feasible with the new design (I'm adding the VS code parameters to this dap service on golang/vscode-go#1268)

I don't see a way for the Go extension to step in during initialization between the editor & the dap server using the DAP protocol. (unless we have a flag to supply when starting dlv dap). Any capabilities change has to be done through the CapabilitiesEvent. Not sure whether it will be too late if dlv dap tells the change when receiving the launch request that specifies rr mode, but I think it's worth trying.

If we were to change things after the initialize response, the Capabilities event would be the only way to go, but the editor gives no guarantees. We would need to try this to know for sure. That said, not all launch configuration attributes are meant to become launch/attach args. Some should stay as command-line arguments the dlv command. For example, the extension already parses and passes --log when starting dlv-dap, and it is ignored by the launch/attach request handler. I intended for backend to be handled the same way. You can access the value via s.config.Debugger.Backend in onInitializeRequest. You should be able to test this already if you just use the port in your launch config to connect to a server that you started manually, but we would need to make a change on the extension side to support --backend explicitly (although the shortcut for now would be to use the free-form dlvFlags attribute).

@polinasok
Copy link
Collaborator

Variable handling: how should be handle the broken variables? assuming we do not have a way to tell apart optimized apps from debug builds

I'm not sure why this is a problem specific to this. It can already happen with the "exec" launch mode. You say that this is:

the common use case, as people will not usually compile with optimizations turn off except for edge cases

but why would that be the case? Why wouldn't the rr backend be supported with the "debug" and "test" launch modes?

Agree. Whatever happens in other modes makes sense here. I don't think we should do anything special with the variables as opposed to just displaying whatever we get from Debugger backend.

I also remember seeing at some point a warning with dlv cli that the program being debugged was compiled with optimizations. I guess it could be nice to display a pop-message warning the user that that's what they are about to do and that it will impact variables, and if that's what they want, they should use so and so flags.

@lggomez
Copy link
Contributor Author

lggomez commented Mar 11, 2021

Variable handling: how should be handle the broken variables? assuming we do not have a way to tell apart optimized apps from debug builds

I'm not sure why this is a problem specific to this. It can already happen with the "exec" launch mode. You say that this is:

the common use case, as people will not usually compile with optimizations turn off except for edge cases

but why would that be the case? Why wouldn't the rr backend be supported with the "debug" and "test" launch modes?

Agree. Whatever happens in other modes makes sense here. I don't think we should do anything special with the variables as opposed to just displaying whatever we get from Debugger backend.

I also remember seeing at some point a warning with dlv cli that the program being debugged was compiled with optimizations. I guess it could be nice to display a pop-message warning the user that that's what they are about to do and that it will impact variables, and if that's what they want, they should use so and so flags.

I haven't gone deep enough into the debugger but if that info is available it should be inferred in some way because AFAIK the go compiler does not embed its compile flags into the ELF file or executable (please correct me if I'm wrong as I'm still not well versed in this territory). But either way, having that would be a plus to tell to the user as it would save them diagnosis time (and it would save you folks probably some time on non-issues here)

@aarzilli
Copy link
Member

I haven't gone deep enough into the debugger but if that info is available it should be inferred in some way because AFAIK the go compiler does not embed its compile flags into the ELF file or executable

It actually does since 1.10. We use it to compute proc.(*Function).Optimized, which is where the optimized warning comes from.

@polinasok
Copy link
Collaborator

polinasok commented Mar 23, 2021

Meta comment. We have quite a bit of back-and-forth and from the commits, I can see that things keep evolving, stuffs gets added, removed, reshuffled. It is starting to get easy to get lost in the evolution. Would it be possible to keep the top overview comment up-to-date on what the PR provides with some high-level summary for each bullet as to how that feature is provided (using which mode/backend attribute and/or cli flag) and on what platform. Thank you.

@aarzilli
Copy link
Member

Do you want a code review or is this still WIP?

@lggomez
Copy link
Contributor Author

lggomez commented Apr 1, 2021

It's still a WIP but it's halfway there. With most of the design concerns resolved (at least up to this point) I have work to do on testing and fixing any remaining issues on the core dump replay and rr trace record modes, and then implement the tests. rr traces work fine for the cases I've tested, although testing on other platforms and programs is more than welcome

Copy link
Member

@aarzilli aarzilli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestOptionalNotYetImplementedResponses is now failing because of the requests you implemented.

service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
vendor/modules.txt Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
@lggomez lggomez force-pushed the dev.rr-delve branch 3 times, most recently from 898be28 to 8d03c06 Compare April 6, 2021 17:00
@lggomez lggomez changed the title WIP - service/dap: Support for rr modes and compatible DAP commands service/dap: Support for rr modes and compatible DAP commands Apr 6, 2021
@lggomez
Copy link
Contributor Author

lggomez commented Apr 6, 2021

@aarzilli @polinasok @hyangah well, I can finally say that the main implementation is done, with https://github.com/golang/vscode-go/pull/1268/files I can record, and replay traces from vs code, along with core dumps

Now some review is in order and I should ask how should I start on testing this. As I understand currently the continue requests are blocking, thus not addressable on the tests but I wonder if we could emulate at least a threadsRequest for a fixture with a core file, which should return a stack trace on the supported scenario

@polinasok
Copy link
Collaborator

@aarzilli @polinasok @hyangah well, I can finally say that the main implementation is done, with https://github.com/golang/vscode-go/pull/1268/files I can record, and replay traces from vs code, along with core dumps

Now some review is in order and I should ask how should I start on testing this. As I understand currently the continue requests are blocking, thus not addressable on the tests but I wonder if we could emulate at least a threadsRequest for a fixture with a core file, which should return a stack trace on the supported scenario

Glad to see so much progress! I am currently trying to finish my code that will make the server support asynchronous requests and make continue not blocking (so it possible to pause, disconnect and set breakpoints while the program is running). I will be trying to get as much as I can done this week before being on vacation next week. If you could wait a bit for the dust to settle, we would be in a better place both code-wise and time-wise to give your work the attention that it deserves.

pkg/proc/gdbserial/rr.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
service/dap/server.go Outdated Show resolved Hide resolved
@polinasok
Copy link
Collaborator

I think this PR
Updates golang/vscode-go#110
Updates golang/vscode-go#1243
Updates golang/vscode-go#168

lggomez added a commit to lggomez/delve that referenced this pull request Apr 9, 2021
Copy link
Member

@derekparker derekparker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@derekparker
Copy link
Member

The TeamCity failures are known at this point and not because of this PR.

@derekparker derekparker merged commit 69615b3 into go-delve:master Jul 21, 2021
@polinasok
Copy link
Collaborator

Congratulations, @lggomez! Whoo-hoo!

gopherbot pushed a commit to golang/vscode-go that referenced this pull request Aug 10, 2021
## New launch modes: replay and core

The following modes are added:

- **replay**: Replays an rr trace, allowing backwards flows (reverse continue and stepback). Requires a `traceDirPath` property on `launch.json` pointing to a valid rr trace directory.
Equivalent to `dlv replay <tracedir>` command.
- **core**: Replays a core dump file, showing its callstack and the file matching the callsite. Requires a `coreFilePath` property on `launch.json` pointing to a valid coredump file.
Equivalent to `dlv core <exe> <corefile>` command.

With the new dap service, most of the heavy work will go to delve, so this PR mainly involves getting around passing the corresponding launchArgs to delve. It´s service counterpart lies on go-delve/delve#2367

Change-Id: Idc21f27152387c07c844fa471b89d95f38c2545b
GitHub-Last-Rev: e055352
GitHub-Pull-Request: #1268
Reviewed-on: https://go-review.googlesource.com/c/vscode-go/+/298569
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Suzy Mueller <suzmue@golang.org>
gopherbot pushed a commit to golang/vscode-go that referenced this pull request Aug 30, 2021
The following modes are added:

- **replay**: Replays an rr trace, allowing backwards flows (reverse continue and stepback). Requires a `traceDirPath` property on `launch.json` pointing to a valid rr trace directory.
Equivalent to `dlv replay <tracedir>` command.
- **core**: Replays a core dump file, showing its callstack and the file matching the callsite. Requires a `coreFilePath` property on `launch.json` pointing to a valid coredump file.
Equivalent to `dlv core <exe> <corefile>` command.

With the new dap service, most of the heavy work will go to delve, so this PR mainly involves getting around passing the corresponding launchArgs to delve. It´s service counterpart lies on go-delve/delve#2367

GitHub-Last-Rev: e055352
GitHub-Pull-Request: #1268
Reviewed-on: https://go-review.googlesource.com/c/vscode-go/+/298569
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Suzy Mueller <suzmue@golang.org>
Change-Id: I7b6b3b7b9d7c30d552a4401892492c74ec2e3023
Reviewed-on: https://go-review.googlesource.com/c/vscode-go/+/346092
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Suzy Mueller <suzmue@golang.org>
TryBot-Result: kokoro <noreply+kokoro@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants