Add voxtype modifier suppression during transcription #4178
Adds a Hyprland submap that blocks modifier keys (SUPER, CTRL, ALT) while voxtype is typing transcribed text. This prevents held modifier keys from triggering window manager shortcuts during output. The fix uses voxtype's pre_output_command/post_output_command hooks to activate/deactivate the submap automatically. Includes F12 as emergency escape if voxtype fails to reset the submap. Fixes: basecamp#4159
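For readers unfamiliar with Hyprland submaps, here is a minimal sketch of the approach described above. It is not the PR's actual diff. It writes an illustrative submap fragment from a shell heredoc so the shape can be inspected without a running compositor. The helper name `write_suppress_submap` and the exact list of keysyms are assumptions; only the submap idea, the no-op `exec, true` binds, and the F12 escape come from the description.

```bash
# Sketch only: emit a Hyprland submap that binds modifier keys to a no-op.
# write_suppress_submap is a hypothetical helper, not part of Omarchy or voxtype.
write_suppress_submap() {
  out="$1"
  cat > "$out" <<'EOF'
submap = voxtype_suppress
# Bind each modifier key to a no-op while the submap is active
bind = , Super_L, exec, true
bind = , Super_R, exec, true
bind = , Control_L, exec, true
bind = , Control_R, exec, true
bind = , Alt_L, exec, true
bind = , Alt_R, exec, true
# Emergency escape if the post-output hook never runs
bind = , F12, submap, reset
submap = reset
EOF
}
```

In this sketch the hooks would enter the submap with `hyprctl dispatch submap voxtype_suppress` and leave it with `hyprctl dispatch submap reset`.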
I am about to tag 0.4.10 which will have the upstream fix. I added a setup command that might be useful: although it writes to hypr/conf.d instead of hypr/bindings. It does take a --show switch that just prints the submap out.
Thanks for the heads up on 0.4.10! I'll keep the submap in utilities.conf rather than using
Yep that sounds good! I would consider adding a
@konnsim 0.4.10 is in the AUR, should reach all of the mirrors soon. Thanks for your collaboration!
@peteonrails Do we have to change anything on the Omarchy side? Or do the same bindd + binddr combos work?
@dhh The same bindd + bindr combos will work. Long term, I won't break backward compatibility without a discussion and coordination with distro owners. As far as the "typing has started but I am holding the SUPER key" problem goes: the submap introduced in this PR does the trick. Since @konnsim and I coordinated on this response, there's nothing else that needs to change on Omarchy beyond this PR. However, there is a related issue that is not addressed by this PR: #4159 The workaround suggests that using
I had a good crack at fixing the issue of releasing In the end I decided that it's actually kind of a hidden feature that allows you to do "hold to dictate" if you hold all 3 then release
bind = , Control_R, exec, true
bind = , Alt_L, exec, true
bind = , Alt_R, exec, true
bind = , ESCAPE, submap, reset # Emergency escape if voxtype fails to call post_output_command
As a heads up - Voxtype 0.4.11 will include a voxtype record cancel command that aborts a recording or transcription in progress without injecting any text.
In my test build environment I have it bound to ESC, and it also resets the submap. Food for thought.
Does pre_output_command fire right before wtype/ydotool starts outputting?
I could update the submap binding to: bind = , ESCAPE, exec, voxtype record cancel; hyprctl dispatch submap reset
That way ESCAPE works for intentional cancels and also as an emergency escape in case of a mid-output crash.
> does pre_output_command fire right before wtype/ydotool starts outputting?
It does, you can see the chain here if you are curious: https://github.com/peteonrails/voxtype/blob/9dd3a3c635b0d0585280c8ed2973698888efa987/src/output/mod.rs#L115
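The ordering confirmed here can be pictured with a small shell sketch. This is not voxtype's implementation, which lives in the linked Rust source. `run_output_with_hooks` and its arguments are hypothetical, and running the post hook even when the output step fails is an assumption of this sketch rather than something stated in the thread.

```bash
# Sketch only: run a pre hook, the output step, then a post hook, in that order.
# run_output_with_hooks is a hypothetical helper, not a voxtype command.
run_output_with_hooks() {
  pre_cmd="$1"
  output_cmd="$2"
  post_cmd="$3"

  # Pre hook, e.g. "hyprctl dispatch submap voxtype_suppress"
  sh -c "$pre_cmd" || true

  # Output step, e.g. a wtype or ydotool invocation; remember its exit status
  status=0
  sh -c "$output_cmd" || status=$?

  # Post hook, e.g. "hyprctl dispatch submap reset" (assumed to run regardless)
  sh -c "$post_cmd" || true

  return "$status"
}
```

If the process dies between the first and last step, the post hook never runs, which is the case the emergency escape binding is meant to cover.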
We could even add another hook for on_recording_start that could be used to drop into a voxtype_recording submap that binds only ESCAPE so it can be used to cancel the dictation anywhere in the pipeline from recording -> transcription -> output without affecting modifiers until output begins.
Yeah that could be handy. I'm working on testing 0.4.11 but a couple extra hooks are a pretty light lift.
No rush, of course. I'm happy to open it as an enhancement in another PR if it doesn't make it into this one.
I thought about this some more: adding the voxtype record cancel command to my submap in this PR was pointless, as the submap is only entered immediately before output begins (which that new command doesn't cancel).
The idea of a pre_recording_command hook (or whatever you want to call it) with its own voxtype_recording submap and an ESCAPE binding calling voxtype record cancel is still valid though.
We would just need to move from the new voxtype_recording submap to the existing voxtype_suppress submap on the pre_output_command hook. Otherwise the changes in this PR would all stand as-is.
See issue peteonrails/voxtype#59 for details.
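The proposed two-stage flow could be sketched as a mapping from pipeline stage to the submap command each hook would run. Everything here is hypothetical: `submap_command_for_stage` is an illustrative helper, the stage names are made up for the example, and pre_recording_command is only a proposal at this point in the thread. The function prints the command instead of running it, so it can be inspected without Hyprland.

```bash
# Sketch only: which submap command each hook would dispatch.
# submap_command_for_stage and its stage names are hypothetical.
submap_command_for_stage() {
  case "$1" in
    # Proposed pre_recording_command: a submap that binds only ESCAPE
    recording) echo "hyprctl dispatch submap voxtype_recording" ;;
    # Existing pre_output_command: switch to modifier suppression
    output)    echo "hyprctl dispatch submap voxtype_suppress" ;;
    # Existing post_output_command: back to the normal keymap
    finished)  echo "hyprctl dispatch submap reset" ;;
    *)         echo "unknown stage: $1" >&2; return 1 ;;
  esac
}
```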
I like this idea quite a bit - I wasn't able to get it into 0.4.11, but I'm going to work on it this weekend for a possible 0.4.12.
Tested changes with AUR voxtype 0.4.10; all working as expected.
372be24 to e54022a
@konnsim I'm tracking a related issue here: peteonrails/voxtype#61, which a user with the submap reported. They report that during text injection the first character is dropped when using the voxtype_suppress submap. I also ran into this last night while I was testing 0.4.11. I was able to reproduce it in 0.4.10, so I'm concerned that we might have a race condition. I'm going to spend a little more time getting good data on the problem this weekend.
Kurtis, heads up: the issue turned out to be with binding the ESC key into the submap, so I led you a little bit astray there. Sorry about that! I'm updating the submap output in a bug fix release, and I think we might want to coordinate changing it on this PR so that it doesn't present to Omarchy users across the board. --Pete
Binding ESCAPE in the submap causes wtype to drop the first character of transcribed text. ESCAPE appears to clear compositor state during submap transitions. See: peteonrails/voxtype#61
All good. I just took a look and yes, I could reproduce the first character dropping, but interestingly it was not happening when I continued to hold I've just adjusted back to
…order quirk
- Add pre_recording_command hook to enter voxtype_recording submap
- Move dictation bindings to user's bindings.conf as adjusting binds will require adjusting new submap (install + migration)
- Document hold-to-record (release X to stop) and toggle modes through key release ordering
- ESCAPE cancels recording while in submap
af69575 to 908dc9d
I've added support for the Unfortunately because we need to drop into a submap to bind I also added a comment to document how the release order of
voxtype setup systemd

# Add voxtype bindings to hyprland config if not present
if [[ -f ~/.config/hypr/bindings.conf ]] && ! grep -q "voxtype_recording" ~/.config/hypr/bindings.conf; then
We don't need this. The default utilities.conf is included for everyone.
On lines 54-56 of utilities.conf I remove the voxtype bindings.
I've lifted it up to bindings.conf so it can be changed without modifying Omarchy files. The binddr needs to be in the voxtype_recording submap, and that can't be changed just by a top-level unbind/binddr; it needs to be changed inside the submap.
@konnsim Yeah, I don't like the idea of exposing this big lump of bindings directly. Folks can always map whatever they want in there, but we have to find a setup that works great out of the box and that can live inside the default bindings.
Remove voxtype_recording submap and associated bindings - the added complexity doesn't fit well in defaults and users wanting different keybindings would need to modify the submap too. Keep only the voxtype_suppress submap for modifier key suppression during text output, which is the core fix for the original issue.
@peteonrails I can't think of another way to get the I'll leave this at just the modifier suppression fix now so it can get merged in to solve the immediate issue.
Fair enough -- the cancel button may not be something a broad swath of users need or want. I'll leave the feature in voxtype, but am happy to set this detail down until we figure out whether Omarchy users will or will not need it. I'm going to focus on improving the documentation of 'troubleshooting potential keybinding issues' so that when/if an Omarchy user runs into one of the issues we've seen, they have a playbook to follow. If you need any other support in getting this PR approved by the maintainers, I'm ready to help.
Turns out there's an Just one catch though (and this is likely related to the I think that's the best we're going to get with
I think we can't do this to users. I think it's better to just leave
I agree, this is good to go then. Did you want me to squash my commits, or are you happy to do it on merge?
@dhh just a heads up that changing to toggle keybinds doesn't resolve the issue this PR is solving.
Summary
Adds a Hyprland submap that blocks modifier keys (SUPER, CTRL, ALT) while voxtype is typing transcribed text. This prevents held modifier keys from triggering window manager shortcuts during output.
Problem
When using voxtype with push-to-talk (SUPER+CTRL+X), if the user releases keys slowly or in the wrong order, modifiers may still be held while voxtype types the transcription. This causes typed characters to trigger shortcuts instead of inserting text.
Solution
- Add voxtype_suppress submap to Hyprland bindings that blocks modifier keys
- Use voxtype's pre_output_command/post_output_command hooks to activate/deactivate the submap
Dependencies
- A voxtype release that provides the pre_output_command/post_output_command hooks.
Changes
- default/hypr/bindings/utilities.conf - Add voxtype_suppress submap
- default/voxtype/config.toml - Add hook configuration
- migrations/1767939322.sh - Add hooks to existing voxtype configs
Test plan
Related: #4159
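As an illustration of the "add hooks to existing voxtype configs" migration listed under Changes, here is a minimal sketch of an idempotent update. It is not the contents of migrations/1767939322.sh. The helper name `add_voxtype_hooks` is hypothetical, and placing the hooks under an `[output]` table is an assumption about the config layout; the hook names and the submap name come from the PR description.

```bash
# Sketch only: add the two hooks to an existing voxtype config, at most once.
# add_voxtype_hooks is hypothetical; the [output] table is an assumption.
add_voxtype_hooks() {
  config="$1"

  # Nothing to migrate if the user has no config file
  [ -f "$config" ] || return 0

  # Already migrated (or set by the user): leave the file alone
  if grep -q '^pre_output_command' "$config"; then
    return 0
  fi

  pre='pre_output_command = "hyprctl dispatch submap voxtype_suppress"'
  post='post_output_command = "hyprctl dispatch submap reset"'

  if grep -q '^\[output]' "$config"; then
    # Insert the hooks directly after the first [output] header
    tmp="$(mktemp)"
    awk -v pre="$pre" -v post="$post" '
      { print }
      /^\[output]/ && !inserted { print pre; print post; inserted = 1 }
    ' "$config" > "$tmp"
    # Write back in place so the original file permissions are kept
    cat "$tmp" > "$config"
    rm -f "$tmp"
  else
    # No [output] table yet: append one containing the hooks
    printf '\n[output]\n%s\n%s\n' "$pre" "$post" >> "$config"
  fi
}
```

Running the function a second time is a no-op, which matters for a migration that may meet configs a user has already edited by hand.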