Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add getContext command #118722

Closed
wants to merge 1 commit into from
Closed

Add getContext command #118722

wants to merge 1 commit into from

Conversation

pokey
Copy link
Contributor

@pokey pokey commented Mar 11, 2021

In parallel to setContext. Can be tested by running eg

vscode.commands.executeCommand("getContext", "terminalFocus")

This PR partially fixes #10471

In parallel to `setContext`.  This PR partially solves microsoft#10471
@pokey
Copy link
Contributor Author

pokey commented Mar 24, 2021

I just wanted to follow up on this PR. It would be a huge help for accessibility in VSCode. What can I do to make this PR more likely to be accepted?

@jrieken
Copy link
Member

jrieken commented Mar 24, 2021

It would be a huge help for accessibility in VSCode.

I am curious about that and AFAIK something like getContext isn't a good fit for accessibility.

Honestly not convinced of the getContext-command. In general, we don't like "async-get-apis" because it adds potential for high IPC loads, we usually mirror data into the extension host for "reading". Another thing is that the world of context keys is much more complex. There are actually many context key containers - they roughly correspond to various "parts" of the workbench, like editors, editor-groups, views etc. For instance, in one container the "hasTextSelection" context key value is true whereas in another container it is false. These parts have no API counterpart and can therefore not be referenced, meaning you can only interact with the root context key container. For the setContext-command that's already fishy and for get it will be a bigger problem because you will likely never get the actual/wanted values.

@pokey
Copy link
Contributor Author

pokey commented Mar 24, 2021

Hi @jrieken; thanks for your response! I'll do my best to respond to each of your points inline, but bear in mind this is my absolute first VSCode PR so I'm still trying to understand all this stuff 😊

I am curious about that and AFAIK something like getContext isn't a good fit for accessibility.

There are a few use cases for this. The main conceptual reason here is that we need access to these contexts in order to "prime" the speech recognition grammar with allowable commands based on context. For example, when the terminal has focus, we can activate all the speech commands that enable us to use the terminal. When the editor pane has focus, a different set of commands need to be active, and when quick open is focused it would be another set of commands. In addition, we tend to string together multiple commands that then all get issued rapidly, and having the voice coding software able to check these contexts can allow us to avoid timing issues that would otherwise result in arbitrary and brittle sleep statements.

Also, this PR does have interest from two completely separate voice-coding communities: I use talon, and @daanzu is maintainer of kaldi active grammar

That being said, if you'd prefer to expose an alternative api that would allow us to get the same information, I'd be fine with that, but I'd imagine that would take longer than merging this PR.

Honestly not convinced of the getContext-command. In general, we don't like "async-get-apis" because it adds potential for high IPC loads, we usually mirror data into the extension host for "reading".

I see. I'd be happy to take a swing at implementing such mirroring functionality if you can point me in the right direction. Do you see this as a showstopper?

Another thing is that the world of context keys is much more complex. There are actually many context key containers - they roughly correspond to various "parts" of the workbench, like editors, editor-groups, views etc. For instance, in one container the "hasTextSelection" context key value is true whereas in another container it is false. These parts have no API counterpart and can therefore not be referenced, meaning you can only interact with the root context key container. For the setContext-command that's already fishy and for get it will be a bigger problem because you will likely never get the actual/wanted values.

Interesting. How come this complexity doesn't pose an issue for using these contexts to qualify keyboard shortcuts? Are the when-clause contexts that qualify keyboard shortcuts able to refer to keys in containers other than the root context key container? If so, maybe you could point me to how I might implement this?

My goal is that the value returned by this command at any given moment would be exactly the value that would be used by VSCode to determine whether a when-clause context is active for a keyboard shortcut at that same moment. Fwiw, ideally I'd like to get notified when the value changes so that I can change the active grammar, but that's probably out of scope here

The main issue is that if your users who are able to use a keyboard are able to leverage these contexts, I'd argue that users who cannot use a keyboard should also be able to leverage these contexts

@jrieken
Copy link
Member

jrieken commented Mar 24, 2021

Interesting. How come this complexity doesn't pose an issue for using these contexts to qualify keyboard shortcuts? Are the when-clause contexts that qualify keyboard shortcuts able to refer to keys in containers other than the root context key container? If so, maybe you could point me to how I might implement this?

The complexity happens when evaluating a context key expression, e.g one needs to start with the right child container. There is two ways how this is achieved:

  1. menus and others "know" the right container when they are created (e.g the instance is passed along)
  2. keybindings lookup the right container: each container is associated with a dom node (run document.querySelectorAll('[data-keybinding-context]') in dev-tools to see them) and when a key-event occurs we walk upwards from the target dom-element until we find a container dom-node.

So, those who evaluate context keys pay the price for the complexity, down-level consumers like keybindings or menus don't have to know anything of this.

My goal is that the value returned by this command at any given moment would be exactly the value that would be used by VSCode to determine whether a when-clause context is active for a keyboard shortcut at that same moment.

That's exactly the problem: there is no API representation for the objects that own a container. Pressing ESC inside the editor find widget is different than pressing ESC inside an editor widget or inside a peek editor inside an editor widget. This is somewhat endless and mirroring the dom structure into the API is not an option for us. The only thing that could work is the focused/active element, e.g start with the container which contains the active element. However, that would be a pretty bad (aggressive) driver for mirroring stuff into the extension host.

The main issue is that if your users who are able to use a keyboard are able to leverage these contexts, I'd argue that users who cannot use a keyboard should also be able to leverage these contexts

I agree with that. But I don't agree with the approach that has been picked. You should not be reading and evaluating context keys. Instead you should be more like an "event source" and we should own the tuple [event, when-clause, command] - just like we do it for keybindings. Something like this

  1. extension registers event source
  2. vscode allows to bind events to commands using when-clauses
  3. extension fires event -> vscode evaluates the binding (using the focused/active element) and invokes the command

@pokey
Copy link
Contributor Author

pokey commented Mar 25, 2021

Thanks for the detailed response. Super helpful, and I think I'm starting to get at least a rough grasp of the high-level picture here

I agree with that. But I don't agree with the approach that has been picked. You should not be reading and evaluating context keys. Instead you should be more like an "event source" and we should own the tuple [event, when-clause, command] - just like we do it for keybindings. Something like this

  1. extension registers event source
  2. vscode allows to bind events to commands using when-clauses
  3. extension fires event -> vscode evaluates the binding (using the focused/active element) and invokes the command

So I think an approach like this would solve about a third of the issue, which I guess might make it a good place to start depending on how complex impl would be. I will try to explain the other 2/3 that's missing from this solution, though. There are basically two issues not addressed, but I think I can lump them into one with an example. Here's a sample talon file (the details of kaldi active grammar will likely be different but I'd imagine similar concepts apply; but correct me if I'm wrong here @daanzu):

tag: user.terminal
-
make directory: "mkdir"

This code tells talon that when I am in a terminal context, it should listen for me to say "make directory", and in response it will simulate keyboard events as if I have typed mkdir on my keyboard. I am not sure I understand how your proposed solution would handle this case. Somehow talon needs to know that I am in a terminal context in order to activate this command. A few high-level possibilities for how this might be done:

a) Support a few specific when-clause contexts as contextual substitution variables in the "window.title" setting, eg terminalFocus and inQuickOpen. Talon has machinery for tracking window titles that is commonly used to solve problems like this.
b) Support a notification and lookup system that tracks what the when-clause context would be if the user were to type a character right now
c) Support just the lookup system from b), so that we can poll this information
d) Pick a few specific when-clause contexts, such as terminalFocus and inQuickOpen, and create specific apis to query and receive events for them
e) Same as d) but no events, so we'd have to poll for this info

I guess in theory I could solve half of the issue here using your proposed solution with the workbench.action.terminal.sendSequence command. I'd bind an event that triggers this command, qualified by the terminalFocus when-clause context. This only solves half the problem, though, because I'd need to have this command active in talon all the time, so it would still be polluting my command grammar, making misrecognitions more likely, although when it does misfire at least it would end up being a no-op. But this trick still wouldn't help me with inQuickOpen.

Fwiw, the contexts that I am most interested in are the following:

  • terminalFocus
  • inQuickOpen
  • editorFocus

See also the related discussion here. Also cc/ @knausj85 and @lunixbochs to make sure I'm not missing something here

Thanks again for your help!

@daanzu
Copy link

daanzu commented Mar 26, 2021

Let me preface this by saying that I don't know a whole lot about how the VSCode extension API works. Also mentioning two other Dragonfly/Caster maintainers: @Danesprite & @LexiconCode. I hope we can find a way to implement this.

@pokey did a good job explaining generally what would be helpful for our usage to have. I would agree with the ordered list of what would be easiest for us to implement to get this info. Having a way to implement an extension that could add context tags to a place in the title bar would be easiest, but it would also be possible for us to have our software interact more directly with an extension that could gather this info. For what it's worth, dragonfly makes it easy to poll at the start of a spoken utterance, although that doesn't entirely solve the problem of determining when each command in a multi command utterance has completed.

One other possibly-related idea that would be quite handy to have: a way to query the syntax language of the current editor cursor location and/or the current editor file.

@drmfinlay
Copy link

Hello. Thanks for mentioning me @daanzu. I don't have much to add really.

I would opt for the simpler solution of putting additional context information in the window title. My editor, Emacs, can be set to include the current "mode" name in the window or "frame" title (details here). That is how I have my Dragonfly speech grammars for particular Emacs modes active only when they are required. It seems to me that @pokey's above possibility A for additional "window.title" substitution variables would be good enough, or at least a good start.

@pokey
Copy link
Contributor Author

pokey commented May 11, 2021

@jrieken I see in the extension roadmap, under Accessibility:

We'll engage and work with our community to get input and guidance, and we need you to keep us honest.

Well here I am, keeping you honest 😊. Shall we set up a zoom / skype / teams meeting / whatever to hash out a solution?

@jrieken
Copy link
Member

jrieken commented May 12, 2021

Shall we set up a zoom / skype / teams meeting / whatever to hash out a solution?

Adding @isidorn who coordinates these. I am all in for a solution but I hope I made clear that getContext (see #118722 (comment)) is not that solution

@isidorn
Copy link
Contributor

isidorn commented May 12, 2021

@pokey feel free to join our Accessibility gitter channel where you explain more what you would like to discuss about and then we can setup a meeting if needed https://gitter.im/Microsoft/vscode-a11y

@pokey
Copy link
Contributor Author

pokey commented May 12, 2021

Will do thanks!

@jrieken
Copy link
Member

jrieken commented Jun 16, 2021

Closing this because getContext isn't the solution

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lift setContext from a command to proper API
5 participants