Skip to content

Extend support for data breakpoints in VS Code API #226735

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

martin-fleck-at
Copy link

@martin-fleck-at martin-fleck-at commented Aug 27, 2024

What it does

The breakpoint API in VS Code supports to add

  • Allow the user to manage data breakpoints through vscode.debug API
  • Add methods to gain information about a potential data breakpoint
  • Allow data breakpoints to have different origins
    • Keep use case where data id is already known (current)
    • Add infrastructure for already existing address resolution

Communication:

  • Adapt DataBreakpoint with source between extension host and main
  • Expose DataBreakpoint in VS Code API, previously not exposed (should not cause any backwards compatibility issues)

Minor:

  • Make bytes optional in data bytes info, as it is the same in the DAP
  • Make access type optional in data breakpoint, as it is the same in the DAP

Fixes #195151

How to test

To test this change, you need to make sure you have a C++ file that you can debug as the Typescript debugger does not support data breakpoints as far as I can see.

  1. Install VS Code C++ Tools
  2. Create a C++ file, e.g.,
#include <iostream>

int main() {
    int counter = 0;

    for (int i = 0; i < 10; i++) {
        counter++;
        std::cout << "Counter: " << counter << std::endl;
    }

    return 0;
}
  1. Create and start an extension (or modify an existing one that is already started with VS Code) with the code similar to this one:
	vscode.debug.onDidChangeBreakpoints(evt => {
		vscode.window.showInformationMessage(JSON.stringify(evt, undefined, 2));
	});

	vscode.debug.onDidChangeActiveDebugSession(session => {
		setTimeout(async () => {
			if (!session) {
				return;
			}
			try {
				const info = await new vscode.VariableScopedDataBreakpointOrigin(1000, 'i').info(session);
				vscode.window.showInformationMessage(JSON.stringify(info, undefined, 2));
				if (info.dataId) {
					vscode.debug.addBreakpoints([new vscode.DataBreakpoint(info as vscode.AvailableDataBreakpointInfo)]);
				}
			} catch (error) {
				vscode.window.showErrorMessage(error.message);
			}
			try {
				const info = await new vscode.AddressDataBreakpointOrigin('0x7fffffffd7e8', 4).info(session);
				vscode.window.showInformationMessage(JSON.stringify(info, undefined, 2));
			} catch (error) {
				vscode.window.showErrorMessage(error.message);
			}
		}, 2000);
	});
	vscode.window.showInformationMessage('Breakpoints:', vscode.debug.breakpoints.length.toString());

	vscode.debug.addBreakpoints([
		new vscode.DataBreakpoint(new vscode.ResolvedDataBreakpointOrigin('0x7fffffffd7e8,counter,4'), 'readWrite', false, 'condition', 'hitCondition', 'logMessage'),
		new vscode.DataBreakpoint(new vscode.AddressDataBreakpointOrigin('0x7fffffffd7e8', 4), 'readWrite', false, 'condition', 'hitCondition', 'logMessage'),
		new vscode.DataBreakpoint(new vscode.VariableScopedDataBreakpointOrigin(1000, 'i'), 'readWrite', false, 'condition', 'hitCondition', 'logMessage'),
		new vscode.FunctionBreakpoint('Hello')
	]);

When starting the extension you should already see the three data breakpoints that you added:
image

  1. Enable the data breakpoints (or already produce them enabled in your extension code).
  2. Debug the program with the Debug C/C++ File command.
  3. Depending on your other setup not all breakpoints may resolve correctly but you can see that they are properly evaluated.

@martin-fleck-at
Copy link
Author

@microsoft-github-policy-service agree company="EclipseSource"

@martin-fleck-at
Copy link
Author

@thegecko FYI, I believe you have also shown interest in this issue at some point.

Copy link
Member

@roblourens roblourens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I just took a quick look at it. How would you use this without exposing more info about variables in the extension API? What scenario do you want to enable specifically?

* @param name The name of the variable's child to obtain data breakpoint information for. If `variablesReference` isn't specified, this can be an expression.
* @param variablesReference Reference to the variable container if the data breakpoint is requested for a child of the container.
*/
getDataBreakpointInfo(name: string, variablesReference?: number): Thenable<DataBreakpointInfo | undefined>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would variablesReference come from? I don't think this is exposed in API elsewhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that there is no API to directly retrieve the variablesReference at the moment. I made that choice to keep the API changes smaller but we could introduce such methods if we need to. The two main entry points I see that can currently be used to retrieve the reference is to:

  • Register a debug adapter factory vscode.debug.registerDebugAdapterTrackerFactory and listen to the messages that are being sent and tracking the variable information since there is no direct access to the variables through the API.
  • Call the debug adapter directly with vscode.debug.activeDebugSession.customRequest and use the Debug Adapter Protocol to access the variable information. For instance, in the C/C++ debugger, one could send an evaluate request to retrieve information about a variable if the variable name is known.

For now, I did not envision any particular support for the user in this regard but of course we could also look into extending the variable resolution area a bit more.

@roblourens roblourens requested a review from connor4312 August 31, 2024 00:41
@martin-fleck-at
Copy link
Author

@roblourens Thank you very much for your quick feedback, I really appreciate it!

Regarding the retrieval of the variable information, please see my inline comment.

As for the use cases: The original ticket describes a scenario where a data breakpoint feature is needed for a remote debugger. However, I do not have any details on that. The particular use case we are working on has to do with a dedicated memory view where we wanted to support setting data breakpoints through a context menu in our custom UI. The VS Code Debug UIs already have all the necessary access and many bits and pieces were already in place for most of the other breakpoint types. However, we hit a wall particularly with data breakpoints as they are not exposed in the extension API even though most of the functionality was already in place on the "main" side and only a few if-branches were missing on the extension host side. So the core of this change is really rather small and trying to align the support of data breakpoints with the support of other breakpoints. If you have any further questions, please let me know.

* @param name The name of the variable's child to obtain data breakpoint information for. If `variablesReference` isn't specified, this can be an expression.
* @param variablesReference Reference to the variable container if the data breakpoint is requested for a child of the container.
*/
getDataBreakpointInfo(name: string, variablesReference?: number): Thenable<DataBreakpointInfo | undefined>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method or the one below it should be in the VS Code API. Both are trivial wrappers around DAP calls and could be made via customRequest. We try to avoid taking unnecessary knowledge about DAP into VS Code API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully understand your point and agree that these methods are more like convenience wrappers around customRequest calls. The downside, however, is that with custom request calls we lack access to the session's capabilities, which means we cannot perform sanity checks on the client side (like RawDebugSession does) and have to rely on the debug adapter to send us proper errors. Would exposing these capabilities also be considered exposing unnecessary internal knowledge? I believe it could be useful in other scenarios like UI enablement as well.

@@ -646,6 +646,7 @@ export interface IExceptionBreakpoint extends IBaseBreakpoint {
export const enum DataBreakpointSetType {
Variable,
Address,
DynamicVariable
Copy link
Member

@connor4312 connor4312 Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the addition of DynamicVariable is only convenience for a debug adapter to reference a variable/expression without having to round trip dataBreakpointInfo themselves.

  • This enum expresses the two sources which can be used to set a data breakpoint in DAP. A dynamic variable is not that, it's just a variable that didn't yet get translated viadataBreakpointInfo
  • DynamicVariable adds an extra type to the API and brings in more DAP knowledge into the API that is necessary to implement functionality

Copy link
Author

@martin-fleck-at martin-fleck-at Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this scenario is a bit more nuanced because it involves the revival of data breakpoints on the VS Code side if they are persisted.

Retrieving the dataId via Data Breakpoint Info:

  1. Scoped Variables or Expressions
    By using name and specifying the variable container scope (variablesReference) or frame scope (frameId), the request yields a dataId that is valid only within that specific context.

  2. Address
    When using name to specify an address (by setting the asAddress flag), we get a dataId that is valid indefinitely. Additionally, we can set bytes to retrieve a memory range. This is known as a "data bytes breakpoint" in the codebase and is available only if the supportsDataBreakpointBytes capability is present.

  3. Expressions
    Using name to specify an expression without setting the asAddress flag results in a dataId that is also valid indefinitely. It’s unclear whether bytes should also be supported in this case.

In the first scenario, persisting the breakpoint likely doesn’t make sense. However, in the latter two cases, it might. Ultimately, it is up to the debug adapter to determine if a breakpoint can be persisted by returning the canPersist flag in the data breakpoint info response.

Persisted Data Breakpoints in VS Code (debugModel):

  • Type Variable: We reuse the dataId, assuming it remains consistent across sessions.
  • Type Address: We re-retrieve the dataId by requesting the data breakpoint info with the asAddress flag. It seems we assume this dataId could change across sessions.

Given this, I felt there might be a gap in supporting re-retrieval of the dataId for expressions or potentially scoped variables if the debug adapter allows them to be persisted. Probably the term DynamicVariable was a bit misleading and adding two additional types Expression and Scoped would be more useful, even though I am not sure about the scoped one being very useful as breakpoints are activated at the beginning of a session so we might want to skip that one. What are your thoughts on this? I'm very interested to hear your perspective.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed another commit that showcases the different types a little bit better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connor4312 What do you think of this approach?

- Allow the user to manage data breakpoints through vscode.debug API
- Add methods to gain information about a potential data breakpoint
- Allow data breakpoints to have different sources
-- Keep use case where data id is already known (current)
-- Add infrastructure for already existing address resolution
-- Extend for dynamic variables resolved for a session
--- Ensure dynamic variables are resolved in the debug model

Communication:
- Adapt DataBreakpoint with source between extension host and main
- Expose DataBreakpoint in VS Code API, previously not exposed

Minor:
- Make bytes optional in data bytes info, as it is the same in the DAP

Fixes microsoft#195151
@martin-fleck-at martin-fleck-at force-pushed the issues/195151 branch 3 times, most recently from 7e56de5 to 342b2c0 Compare September 4, 2024 14:19
- Properly extract API proposal into it's own file
- Remove data breakpoint info convenience method
- Provide proposal for different data breakpoint sources (discussion)
- Ensure that breakpoint mode is properly updated and passed through
@planger
Copy link

planger commented Oct 14, 2024

I just wanted to check in and see if there's been any progress on reviewing the PR. Let me know if there's anything I can do to help move it forward. Thank you!

@connor4312 connor4312 self-assigned this Oct 18, 2024
Copy link
Member

@connor4312 connor4312 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, and sorry for the slow review, it's been a busy few weeks.

I think there are some (partially existing) structural things we want to correct -- see my last comment. Let me know if you need help with that, I should have time during our next debt week.

}
} else {
// type === DataBreakpointSetType.Scoped
if (this.src.frameId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any given frame ID is valid only within the context of the current debug stopped state.

Copy link
Author

@martin-fleck-at martin-fleck-at Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, I am not sure if this is still relevant after re-working the resolution mechanism a bit more.

Comment on lines 1231 to 1234
if (sessionDataId) {
this.sessionDataId.set(session, sessionDataId);
dataId = sessionDataId;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if (sessionDataId) { is pretty repetitive, you can assign the return value of most calls to dataId and this.sessionDataId.set(session, dataId); down below

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand you now from your comment below, the dataId is actually valid across sessions anyway so I'm not sure if it still makes sense to store the data Id for each individual session anymore, what do you think?

frameId: number;
variablesReference?: never;
}
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types are more complex than we generally like in our API. We'd probably represent these as classes that extensions could construct and pass to DataBreakpointAccessType

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, I refactored the types as classes that can be constructed more naturally.

* @param hitCondition Expression that controls how many hits of the breakpoint are ignored.
* @param logMessage Log message to display when breakpoint is hit.
*/
constructor(source: DataBreakpointSource | string, accessType: DataBreakpointAccessType, canPersist?: boolean, label?: string, enabled?: boolean, condition?: string, hitCondition?: string, logMessage?: string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think canPersist should be part of the type: 'variable' union, since in other cases the value of canPersist should be derived from the data breakpoint info request.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree that canPersist should be part of the type/class that already knows about resolved values. I actually also added other properties such as the accessTypes which may also already be given if known.

sessionDataId = (await session.dataBytesBreakpointInfo(this.src.address, this.src.bytes))?.dataId;
if (!sessionDataId) {
return undefined;
let dataId = this.sessionDataId.get(session);
Copy link
Member

@connor4312 connor4312 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to revisit this pattern for breakpoints with canPersist. It kind of works for address breakpoints (minus ASLR shenanigans) but what should actually happen is that the value of canPersist gets derived from the data DataBreakpointInfoResponse and then we persist the data ID.

This breaks down more when we have sources set using frames and variable IDs. When a new session is started, or when VS Code is restarted and the breakpoints are read from storage, those frames and variable references will no longer have any meaning. We'll request them again here in toDap but that's likely to fail or point to the wrong addresses. Instead, if we've already translated them to a data ID with canPersist, we should just set them with that data ID instead.

I think we still want to keep the src in here for display purposes (and we could potentially present a "refresh" action for certain non-canPersist breakpoints within a VS Code session) but the resolution of the data ID should be done and kept at the instant the breakpoint it gets set.

This is a bit egg on my face since I started the pattern of requesting dataBytesBreakpointInfo in this method which was misleading 🙈

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a deeper look now and if I understand you correctly, you argue that the resolution of the data breakpoints should be done as early as possible (i.e., as soon as they are added. And what we are actually interested in persisting is only the dataId which needs to be valid across multiple sessions anyway. Is that right? I tried to incorporate that in my next commit but skipped the refreshing of breakpoints within a session for now but that would be surely interesting as well.

@martin-fleck-at
Copy link
Author

@connor4312 Thank you very much for your feedback, I'll have a more detailed look this week!

- Translate complex types into separate classes
- Properly resolve data breakpoints as soon as the user adds them
-- Use resolution result to store additional data used later
@martin-fleck-at
Copy link
Author

@connor4312 Thank you very much for your very valuable feedback! I pushed another commit to move the data breakpoint resolution to an earlier point in time and properly propagate the resolution result throughout VS Code. I'm sure there are still areas to improve and I'd be happy to get more feedback but I want to make sure I'm on the right track here.

@thegecko
Copy link
Contributor

@connor4312 @roblourens Could we have an update on this PR, please?

Is there any more which needs changing?

@planger
Copy link

planger commented Mar 19, 2025

@connor4312 Hi there! I'd like to ask if there's a chance of getting this PR merged. We're happy to rebase and resolve conflicts if you think it's worth moving forward. Thank you!

@connor4312
Copy link
Member

Hey, sorry, been deep in MCP work this iteration. Looking.

Copy link
Member

@connor4312 connor4312 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple comments

}
}

@es5ClassCompat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

es5ClassCompat should not be added to new classes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

*
* @param resolution a known resolution for this breakpoint
*/
set(resolution: DataBreakpointResolution): this;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are used internally but I don't think they should be public API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree, I refactored the public API to return the resolution (now more appropriately called "info") for a particular session instead.

}
}

@es5ClassCompat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

es5ClassCompat is not needed on new classes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 144 to 152
/**
* Resolves the data breakpoint and retrieves the dataId, canPersist, and description properties.
* By default, the data breakpoint is resolved automatically after the breakpoint is set.
* A data breakpoint is only resolved once at a certain point in time.
* Calling this method again will not update the resolution properties properties.
*
* @param session the session against which this breakpoint should be resolved.
*/
resolve(session: DebugSession): Thenable<this>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breakpoints are global and visible to all extensions. This changes the data on the DataBreakpoint to view what its resolution state is for a session, but if this is called concurrently between two different extensions or for two different sessions, you would get racey behavior. Instead of mutating DataBreakpoint, should this return the resolution on that session?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I adapted the API to make this a bit more obvious. You now have the origin of the breakpoint (previously called "source" but I think that is a bit overloaded but I'm open to discussion) for which you can retrieve the info (previously called resolution) about what breakpoint could be set on a particular session.

source: DataBreakpointSource;

/** The resolution of the data breakpoint based on the given source. This may be undefined if the breakpoint was not yet resolved. */
resolution?: DataBreakpointResolution;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to below, I wonder if we should not have this at all, but require people use resolve(session): Promise<DataBreakpointResolution> to see this information

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree, I just adapted the naming here a bit.

: dbp.src.type === DataBreakpointSetType.Expression ? new ExpressionDataBreakpointSource(dbp.src.expression)
: dbp.src.type === DataBreakpointSetType.FrameScoped ? new FrameScopedDataBreakpointSource(dbp.src.frameId, dbp.src.expression)
: dbp.src.type === DataBreakpointSetType.VariableScoped ? new VariableScopedDataBreakpointSource(dbp.src.variablesReference, dbp.src.variable)
: new ResolvedDataBreakpointSource('-1'), // should not happen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and above, you can use assertNever if TS agrees it can never be hit

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworked the whole flow a bit so that now we shouldn't run into that problem anymore.

@martin-fleck-at
Copy link
Author

@connor4312 Thank you very much for getting back to this! It took me some time to pick that up again and update it to the latest main but I believe now the API is much cleaner! We now have a much cleaner workflow from "origin" (previously "source") to "info" (previously "resolution") to the actual data breakpoint. We also handle the global nature of the data breakpoint more elegantly by storing the info per session. I think we have a very nice API now but please let me know if you like something changed.

@martin-fleck-at
Copy link
Author

@connor4312 I see that the 'Code OSS' check seems to fail but I do not have the credentials to look at the actual log. If it is something I need to fix, could you please show me the output somehow.

@roblourens
Copy link
Member

That should be open to the public, you can't see https://dev.azure.com/vscode/VSCode/_build/results?buildId=149534&view=results? I can load it in a window where I'm not logged in.

The failure is

  1) Debug - Breakpoints
       data breakpoints:
     
  AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
  
  0 !== 'variable'
  
      at Context.<anonymous> (file:///mnt/vss/_work/1/s/out/vs/workbench/contrib/debug/test/browser/breakpoints.test.js:264:12)

@martin-fleck-at
Copy link
Author

@roblourens Thank you for your quick response! I fixed the test case now.

When I clicked on the Code OSS link of the failed job in this PR, I was asked to sign in and then got an error but I just tried it on my phone and there it works so it actually is something I need to figure out on my side. Sorry for that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Data Breakpoints across sessions
5 participants