Dictation support for Visual Studio Code #40976

Closed
JoleCameron opened this issue Dec 31, 2017 · 40 comments
Labels: accessibility, feature-request, workbench-voice

Comments

JoleCameron commented Dec 31, 2017

Hi,

I wish to lodge a request to have VS Code updated so that it can accept dictation input. Currently, if you try to dictate into VS Code using software like Dragon (the industry standard), nothing happens.

This is important to fix for people like myself who have long term hand injuries and are trying to figure out ways to program by voice. People have managed programming by voice in these situations, but the solutions are difficult to develop and not pretty.

To be clear, I'm not asking that you develop voice commands to input symbols by voice, only that the text boxes in VS Code (and/or Visual Studio) can accept dictation input by Dragon (preferably with full 'select-and-say' support). Voice programmers can take care of the rest.

Does it have to be Dragon? Not necessarily. It could be any local speech recognition engine with good accuracy (I'd argue that the decade old Windows Voice Recognition isn't quite there yet) and the ability to write custom voice commands.

While there are few people using such technologies today, it is a subject of interest to all programmers, because they may need it in the future.

  • Jole
cleidigh (Contributor) commented Dec 31, 2017

@JoleCameron
I am a Code user and contributor. I am also a Dragon user, as I have ALS and can only program by voice.
For over a year now I've been making contributions while using and programming in Code, and some of my contributions have to do with improving accessibility. That said, I have been able to set up a pretty usable scenario. As a longtime programmer I had the ability to put together this setup; I'm sure you can do the same, and apologies if I'm telling you anything you already know:

Windows 7
Dragon DPI 14 (DPI 15 has limitations, albeit better recognition)
SpeechMatic directional high-performance microphone with USB-AGC, around-the-neck twist type (critical!)
Natlink (open-source Python framework and API for Dragon)
Dragonfly (Python grammar and rules engine allowing flexible custom commands)
AutoHotkey
Custom Python grammars for Code
Various user-contributed grammars

The key to making this all work well is to have grammars that can seamlessly enter text in either the main editors or input boxes. I also have commands set up for almost all of the common Code keybindings; a minimal sketch follows below.
Utilizing everything possible to avoid using the mouse makes everything faster.
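To give an idea, a minimal Dragonfly grammar along these lines might look like the sketch below (not my actual grammar; the spoken phrases and keybindings are just examples):

```python
# Minimal Dragonfly grammar sketch for Code; the phrases and keystrokes
# here are illustrative, not a complete setup.
from dragonfly import Grammar, MappingRule, Key, Text, Dictation

class VSCodeRule(MappingRule):
    mapping = {
        # Emulate common Code keybindings.
        "command palette": Key("cs-p"),   # Ctrl+Shift+P
        "save file": Key("c-s"),          # Ctrl+S
        # Open the Find widget, then type the dictated search text.
        "find <text>": Key("c-f") + Text("%(text)s"),
        # Plain dictation into whatever has focus.
        "say <text>": Text("%(text)s"),
    }
    extras = [Dictation("text")]

grammar = Grammar("vscode")
grammar.add_rule(VSCodeRule())
grammar.load()
```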

I do all of the above with no changes to Code.

I would be happy to walk you through what I have. It probably would be a good start to understand better what you have now and how you use it.

Cheers

Update: from your repositories I see you use Vocola and therefore Natlink - you already have most of what you need. (And now I know I was telling you things you already knew. :-( )

@cleidigh cleidigh self-assigned this Dec 31, 2017
@cleidigh cleidigh added the accessibility and workbench labels Dec 31, 2017
JoleCameron (Author)

@cleidigh

Thanks for your prompt response. May I start by saying that it's nice to talk about this problem with someone who themselves programs by voice.

My journey towards hands-free programming developed a little differently than yours. In my case, I developed a severe case of RSI when typing up my Honours thesis in late 2013. In order to finish my mathematics thesis, I developed basic macros using Vocola 2.

I was an inexperienced self-taught programmer before developing this injury, so I didn't want to start developing a system to program by voice until my hands could do a little bit of typing to write the commands. Between that, full-time work in a different industry, and a couple of years of poor health, I have only now returned to my goal of setting up programming by voice.

For the PC, I have licenses for Dragon 12.5 Preferred and Dragon 15. As you know, Natlink does not work with Dragon 15 and, given that it never had official support, may not work with future versions either. Since the compatibility issues have not been resolved in the year since Dragon 15 was released, I have no reason to believe that they will be resolved in the future. Because of this, I will develop a system using built-in DVC commands. Note that it is actually possible to write DVC commands like "camel <dictation>", provided that the open-ended variable is at the end of the command (a sketch of the idea follows).
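(For comparison only, since I'm avoiding Natlink: in Dragonfly the same idea would look roughly like this sketch, where the rule name and helper function are illustrative.)

```python
# Illustrative Dragonfly version of a "camel <dictation>" command:
# format the dictated words as camelCase before typing them.
from dragonfly import Grammar, MappingRule, Dictation, Function, Text

def type_camel(text):
    words = str(text).split()
    if not words:
        return
    camel = words[0].lower() + "".join(w.capitalize() for w in words[1:])
    Text(camel).execute()

class CamelRule(MappingRule):
    # Function() passes the "text" extra to type_camel by parameter name.
    mapping = {"camel <text>": Function(type_camel)}
    extras = [Dictation("text")]

grammar = Grammar("camel case")
grammar.add_rule(CamelRule())
grammar.load()
```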

Back to the main point: there are a few reasons why I think that Code needs Select-and-Say capabilities.

  1. While I admit that it's possible to program effectively by using commands to emulate keystrokes, this solution has its problems. First, it cuts the user off from being able to use "correct that" to improve recognition over time. Depending on the particular person, this can be more or less important. Second, it makes the learning curve steeper than necessary: a person could start by developing voice commands for some things and use ordinary dictation to write the rest of their code, albeit slowly. What we don't see is the number of programmers who lose the ability to use a keyboard and mouse and are then forced to change careers.

  2. Your solution is impractical for markup languages like LaTeX. You may well be aware that LaTeX is the industry standard for scientific and mathematical publication. These documents are a mix of prose and encoding for equations and pictures. It is impractical to dictate ordinary prose using only custom commands, so you need Select-and-Say, but you also need an editor with sufficient power to efficiently navigate the various symbols by voice.

  3. Finally, Visual Studio and VS Code fail to live up to Microsoft's own standards for disability access. And if Microsoft fails to live up to its own accessibility standards, what do you think everyone else will do? I note that you're using Windows 7. I'm using Windows 10. Windows 10 is actually less accessible by speech than Windows 7. For example, consider Edge in Windows 10 vs Internet Explorer in Windows 7 (Internet Explorer on Windows 10 is too unstable to use).

Anyway, I hope this helps explain why I think that VS Code should incorporate this change.

Cheers

cleidigh (Contributor) commented Dec 31, 2017

@JoleCameron
Thanks for the detailed response.

First, let me say that despite the fact that we arrived here from slightly different paths, one thing is probably very common: everyone starts off frustrated using voice control / dictation for programming. I was very reluctant to use Dragon in the beginning, given its peculiarities and limitations.
Necessity changed that, and I went crazy trying to make the best of it. I think there are some objective realities that one should start with:

  • System requirements: a fast system, 16 GB RAM, etc., and a very good microphone - I didn't catch what you use?

  • Dragon is not meant for programming, and unfortunately it never will be; as you know, Nuance does not support Natlink, and they broke "continuous recognition" in DPI 15 - more on that later.

  • Utilizing programming support elements is really a requirement, not an option.

  • Code cannot really add much directly to the puzzle; Dragon would have to implement more direct support for anything special, and they will never do that.

  • I believe I am accomplishing everything you mention without much more setup than you already have.

  • I think you see too many limitations with the current approach you are using with just Vocola.

  • I can do everything that Select-and-Say does; while I do not use "correct that", some of those facilities should be possible as well.

  • Doing Markdown is no problem using a mix of custom commands, Emmet, and normal dictated text.

  • The key factor is using continuous-recognition commands that allow you to chain symbols, words, and Code commands. For this you need Natlink + Dragonfly and some off-the-shelf grammars (see the sketch after this list).

  • Using some very basic Python you can add almost anything with little effort. I can share with you all my grammars, both personal and collected.
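To give a flavor of chaining, here is a minimal continuous-command-recognition sketch in Dragonfly (the commands are illustrative; real off-the-shelf grammars such as multiedit are far more complete):

```python
# Sketch of continuous command recognition (CCR) in Dragonfly: chain
# several spoken commands (symbols, keys, words) in a single utterance.
from dragonfly import (Grammar, CompoundRule, MappingRule, RuleRef,
                       Repetition, Key, Text)

class SingleCommand(MappingRule):
    exported = False  # only used inside the repetition below
    mapping = {
        "open paren": Text("("),
        "close paren": Text(")"),
        "equals": Text(" = "),
        "new line": Key("enter"),
        "save file": Key("c-s"),
    }

class ChainRule(CompoundRule):
    spec = "<chain>"
    extras = [Repetition(RuleRef(rule=SingleCommand()),
                         min=1, max=8, name="chain")]

    def _process_recognition(self, node, extras):
        # Execute each chained command in the order it was spoken.
        for action in extras["chain"]:
            action.execute()

grammar = Grammar("vscode ccr")
grammar.add_rule(ChainRule())
grammar.load()
```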

I would strongly suggest giving Dragonfly grammars a try, and I would be happy to help with this.
Let me know if you'd like to do this.

BTW, I think one way Code could support this more is with a combination of recipes and perhaps an extension to help with setup. I believe this is the most likely path, knowing both Code and Dragon.

JoleCameron (Author)

Thanks for the offer, but as I am choosing to stick with the current version of Dragon (for reasons of employability and to make sure my system works long-term), your method won't work in my case. It's easy enough to fake continuous command recognition - that's not my issue here. And my setup is fine.

When it comes down to it: yes, I think I can get it to work without any changes to Code. However, this will require using workarounds that I wouldn't need to use if Microsoft lived up to its own accessibility standards for speech input. Sure, Dragon's not designed for programming, but I'm not asking for a special method to program by voice, just that the text box is designed according to Microsoft's own standards.

cleidigh (Contributor) commented Jan 1, 2018

@JoleCameron
Happy New Year!

Having done all of my own work on this without help, I'd like to help you get the most out of voice programming.

I think you have a couple options:

  1. I believe you can install both 12 and 15; I cannot test this because I can never be without voice. You could try this, using 12 for programming and 15 for everything else and for future compatibility.
    After 40+ years of programming and a lot of research and work with Dragon, I can say that any article on programming by voice will point you to one of the frameworks like Dragonfly, Vocola, etc.
    Their flexibility and power cannot be matched by Dragon alone. FWIW.

  2. While I highly recommend the above approach, if you're absolutely determined to use 15, I would still like to make this work better for you.
    Natlink has been made to work partially with 15. Multiple people are trying to make this work better; it will most likely come with limitations, but I think you might be able to get a fair amount out of it. This might be an in-between approach.

  3. Lastly, a pure 15 approach. I want to point out a couple of things on your comments about compatibility:
  • I am not sure why you are not able to use Dragon alone to enter dictation into an input box in Code. With my 14, I can use just Dragon commands to open the Find widget and enter search text:
    Press Control+F (default key binding for search)
    Put Dragon into Normal or Dictation mode ("Start Normal Mode")
    (dictate the search text)
  • You can optimize the above with DVC commands.
  • The above is the standard way that Dragon interacts with any input item, with no knowledge of the application.
  • It is important to understand the several ways Dragon interacts with programs.
  • Dragon really only has custom interactions in a couple of ways.
  • It understands and can interact with menus and dialog buttons that utilize the Win32 Windows API.
  • Many new applications utilize WPF, which I think is currently not fully supported by Dragon; this is a Dragon issue, not an application issue.
  • With respect to Code in particular, it is somewhat of a special application itself. It does not use Win32 for anything other than the menu bar and a couple of native dialogs. Code is an Electron app centered around the Chromium standalone browser engine.
  • This architecture means the entire application is browser-like, not native-application-like. Dragon needs to interact with the application in the same manner that it typically interacts with a browser.
  • While Dragon has some add-ons for interacting with a few popular programs including browsers, these are done by Nuance, and personally I believe they are not that great. I have created a few things to get much more out of Chrome than the Dragon add-ons; my browsing experience is quite good this way.
  • You mentioned correction: I use the built-in suggestions in Code as well as Undo. I do not use Dragon's correction, as it will rarely ever help with code, but I believe it should work just fine anyway.

Finally, without sounding defensive (I am not): Code does not violate nor incorrectly implement "input boxes". These are implemented as HTML5 input elements which, when focused, will accept any input from Dragon. I have actually written extensions to Natlink and I have a pretty good idea of how it works. I have not actually determined how Dragon could be made more compatible; it almost always comes down to keyboard commands. I switched to Code after using many other editors for years, in particular because of its accessibility. It supports things like screen readers and contrast modes - not necessary for us, but nonetheless I think it makes Code the most accessible editor out there.

Let me know if you'd like to do some of these experiments.

@bpasero bpasero added editor and removed workbench labels Jan 3, 2018
cleidigh (Contributor) commented Jan 6, 2018

@JoleCameron

Any thoughts on the above?
Did you try my suggestion for dictation into input boxes?

Is there something very specific to address, given my comments?

@cleidigh cleidigh added the info-needed label Jan 6, 2018
JoleCameron (Author)

Sorry for not getting back to you sooner. I've been both busy and unwell this week, and I let this slide. My microphone also died last night, so I'm having to type this by hand. Hence, I'll be brief.

My concerns with VS Code boil down to the fact that I can't even dictate into the main text box without using a Dragon command, let alone have Select-and-Say access. Given that Microsoft provides an essential service (Windows), I think that the problem isn't entirely Nuance's fault. Past that, things exceed my level of knowledge.

I would appreciate input on setting up programming by voice using Dragon 15, but I'd rather not do that through a public forum. To that end, I sent you a private message on the KnowBrainer forum. I'll probably want input at about the one-month mark.

LexiconCode commented Jan 26, 2018

@claudioc

I also program by voice. Select-and-Say capabilities would be a blessing to have in VS Code, and there are a number of other ways VS Code could improve accessibility as well. First, a little bit about my setup.

  • Edited 2/11/2020 - updated information and links

Windows 10 64-bit - 8 GB of RAM - i5 7200U
Dragon DPI 15
SpeechWare FlexyMike Dual Ear Cardioid high-performance microphone with SpeechMatic MultiAdapter

  • Natlink - an open-source extension module for the speech recognition program Dragon.
  • Caster - a collection of tools aimed at enabling programming and accessibility entirely by voice. It runs on top of Dragonfly.
  • Dragonfly - a fork of dragonfly that supports CMU Sphinx, Dragon NaturallySpeaking, Windows Speech Recognition, and Kaldi as backends.

Microsoft and VS Code contributors could empower the voice-coding community to develop extensions that facilitate accessibility. There are some outstanding limitations: Caster and Dragonfly both interact with VS Code only by emulating keystrokes. This is why we need a method to expose VS Code's active 'when clause contexts':
#10471
#26882

zachgibson

I'm trying to use Dictation on a Mac, and it doesn't handle actually dictating text. I can perform commands such as "open new file" using Dictation in VS Code.

cece554 commented Oct 22, 2018

Having this problem as well with Mac dictation. I tried saving snippets in Dictation under commands, and VS Code appears to be unable to handle them, but when I say words that are not under commands, VS Code prints those.

sethwilsonUS

I'm legally blind, and while I can program reasonably well through conventional means, I'm still excited about the possibilities of voice programming.

I've been working on a voice programming web app using an open-source JavaScript library called annyang. It uses the web-standard SpeechRecognition API, which at present only works in Chrome (and apparently also Firefox now, though I haven't tested that). I'm wondering if, since VS Code uses Chromium, I/we could integrate annyang into a VS Code extension. If this could work, it would be awesome, because it'd be a free, integrated, cross-platform solution. I'm not sure how smooth the integration would be, or if annyang is powerful enough, but I think the idea has potential.

ryan-zheng-teki

I think many developers would really like it if we could use voice commands to write code, especially when people are back at home after a whole day's work. Voice recognition accuracy is improving, and Microsoft is promoting remote development. With the adoption of 5G, I really hope that voice coding can be integrated into VS Code. At the least, we could cut as much as 70% of the time we spend sitting down every day, which would be really good for our health as developers.

@isidorn isidorn added this to the Backlog milestone Jun 18, 2019
@isidorn isidorn added the feature-request label and removed the info-needed label Jun 18, 2019
irasanchez

As a student who is developing wrist pain, I'd also appreciate this.

rbavery commented Feb 11, 2020

Just want to chime in to support this.

niemyjski commented Mar 25, 2020

It is super important to have accessible tools for everyone to use.

LexiconCode commented Mar 25, 2020

I've been investigating alternatives that don't rely on the editor exposing information for accessibility via extensions. Microsoft's Accessibility Insights for Windows is a tool for investigating and testing the Windows accessibility API, UI Automation. Currently there are no official UI Automation bindings for Python, nor standardized support from a community project. I've worked with a few people to expose other editors' Scintilla components; from there we have been able to expose menus, editable text, cursor position, and so on. My hope is that this could be done through UI Automation, but there needs to be better support from Microsoft.
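As a stopgap, community libraries can at least probe the UIA tree today. Here is a minimal sketch using pywinauto's "uia" backend (the window-title pattern is an assumption):

```python
# Minimal probe of VS Code's UI Automation tree via pywinauto's UIA
# backend; the window-title regex below is an assumption.
from pywinauto import Desktop

vscode = Desktop(backend="uia").window(title_re=".*Visual Studio Code.*")
# Dump the control identifiers that UIA exposes (menus, documents, etc.).
vscode.print_control_identifiers(depth=3)
```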

CJohnDesign

I support this too. Wrist pain.

@isidorn isidorn assigned isidorn and unassigned cleidigh Jun 9, 2020
isidorn (Contributor) commented Jun 9, 2020

Hi, VS Code developer here 👋
First, thanks a lot for the great feedback. We definitely want to have a nice dictation experience in VS Code, so let's try to get some concise info here. I do not use dictation software, so I apologise for the simple questions:

  1. What dictation software is used on Windows / Mac / Linux? Is Dragon used everywhere? I plan to try it out on my Mac.
  2. Does this dictation software have a GitHub page where we can interact with the developers?
  3. Does this dictation software work well with Google Chrome, for example when you want to dictate into this GitHub input box?
  4. What is the experience with VS Code? It simply does not work? I know @cleidigh uses it with Dragon.

Then we can try to figure out what should be done on the VS Code side and what should be done on the dictation software side.

Thanks!

niemyjski commented Jun 9, 2020

I don't have answers to most of your questions :( but you have a whole team @ Microsoft (https://blogs.microsoft.com/accessibility/) who does nothing but accessibility. I'd recommend reaching out to Jessica Rafuse (She's pretty awesome).

isidorn (Contributor) commented May 28, 2021

Just FYI, there is a voice-assistant VS Code extension for Windows. You can find it here: https://github.com/b4rtaz/voice-assistant
I tried it out and it feels like it is in the early stages and still needs a lot of polish, but it nevertheless looks interesting.

@alexdima alexdima added the editor-input label and removed the editor label Oct 15, 2021
fusentasticus commented Dec 8, 2021

@isidorn Thanks for following this thread on automation needs for those of us who prefer or have to command our computer by voice!

  1. Now that there is a dedicated subdirectory for automation in the source tree, should we as dictation users go and vote for UI automation for extension authors using Playwright #136121, so that at least this part becomes easily user-installable?
  2. And with the automation already in place, would it be a big deal to write a full UIAutomation driver on top? By the real thing, I'm of course thinking about the excellent conceptual framework https://docs.microsoft.com/en-us/windows/win32/winauto/entry-uiautocore-overview, which Microsoft's Edge browser already supports very nicely and which Microsoft has given to the community per official commitments.
  3. So it is as if all the technology pieces are in place for something like word-under-the-mouse and custom select-and-say mechanisms to be easily implemented by dictation systems, if we could just get a full built-in UIAutomation service for VS Code! Specifically, we're looking for goodies like FromPoint, RangeFromPoint and Select from the Text pattern, the TextEdit patterns, and all the well-designed stuff for automation of panels, tabs, buttons, etc.
  4. I should add that I do see at least partial UIAutomation support when the VS Code window is focused (active window). However, the TextEdit control is disabled unless "Accessibility support" is turned on. Unfortunately, turning this setting on forces text wrapping off (which is not always good for visual users!). Also, when VS Code is unfocused, the automation elements returned by FromPoint appear to be an internal VS Code hierarchy not related to the UIAutomation model, which is why I am confused about the status of automation for VS Code! I'm not sure, for example, how much of the current automation in VS Code has bubbled up from underlying automation work on Chromium/Electron. [My preliminary testing is done via FlaUI in UIA3 mode; see the Python probe below for a rough equivalent.]
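For anyone wanting to reproduce the focused/unfocused observation without FlaUI, here is a rough Python probe using pywinauto (the coordinates are placeholders; point them at the VS Code editor area):

```python
# Rough probe of which UIA element sits under a screen point; compare the
# result with the VS Code window focused vs. unfocused.
from pywinauto import Desktop

element = Desktop(backend="uia").from_point(600, 400)  # placeholder coords
info = element.element_info
print(info.control_type, repr(info.name))
```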

isidorn (Contributor) commented Dec 10, 2021

@fusentasticus thanks for your reply. Let me try to answer:

  1. I would not suggest voting on that; that is just testing infrastructure we use, and we do not have any plans to add this. I hope we can achieve this without using Playwright.
  2. I am not an expert in the UIAutomation framework, so I do not really know how best to answer this. If something can be written that interacts with VS Code, that would be great. VS Code is using Chromium underneath, so theoretically, if this UIAutomation works with Chrome or Edge, it should be possible for it to work with VS Code.
  3. I see how this UIAutomation would enable a lot of scenarios, and that sounds great!
  4. Word wrapping being disabled is covered by this issue: Word wrap should not be disabled when accessibility is turned on #95428. We can fine-tune this behaviour. And yes, I believe VS Code simply bubbles up the underlying automation work on Chromium/Electron.

meganrogge (Contributor)

Exploring a different, though related idea in #170554. Please let us know what you think there.

meganrogge (Contributor)

Hi @JoleCameron, it has been a while since we last touched base with you. How are you finding the dictation support in VS Code these days? Is there anything we can do to help?

bpasero (Member) commented Feb 15, 2024

FYI, I am splitting this issue into the part that is actually being worked on: dictation support in the editor (#205263).

I think this issue here in particular asks for voice-to-text support in all locations that accept textual input, which is not in scope for February.

@bpasero bpasero removed the editor-input label Feb 15, 2024
bpasero (Member) commented Mar 7, 2024

With our February release, there is now support to use your voice to dictate into the editor: https://code.visualstudio.com/updates/v1_87#_use-dictation-in-the-editor

[animated GIF: dictating into the editor]

After installing the VS Code Speech extension you can use the keyboard shortcut Ctrl+Alt+V (Cmd+Alt+V on macOS) to start it.

Can people in this issue try it out and report back how it goes? Thanks!

@meganrogge meganrogge modified the milestones: Backlog, April 2024 Mar 7, 2024
meganrogge (Contributor) commented Apr 16, 2024

In reading through this issue, here are my findings:

  • Dragon does work with VS Code, though beginner dictation users can find configuring their setups for it challenging.
  • There is interest in using voice commands to write code. We now support that with Copilot Chat and with editor dictation.
  • It would improve the experience to expose VS Code when clauses so extensions could know context/focus and tell Caster, a Dragonfly-based programming toolkit that enables running commands/writing code. However, we now have Copilot Chat and "Hey Code" for those scenarios.

cc @isidorn, I think this issue can be closed given these findings.

isidorn (Contributor) commented Apr 18, 2024

Thank you very much for those insights.

I agree that we can go ahead and close this issue. But I think we should create a follow-up feature request for using voice to trigger VS Code commands, something that we currently do not support well; it would be good to understand the need better.

For the other requests (exposing when clauses through the API), there are already issues capturing this.

Users of the voice extensions: we plan to do a user study at the end of May. If you would like to help, more details can be found here: microsoft/vscode-discussions#1144

bpasero (Member) commented Apr 18, 2024

I think we have that as #209906

meganrogge (Contributor)

I have assigned #209906 to myself and added the accessibility label

@meganrogge meganrogge removed this from the April 2024 milestone Apr 18, 2024