I wonder how we could make it possible to write a kanata config in natural language and let an LLM write the config.
Now the trickier parts.
Maybe a spec-driven development framework like open specs would be advisable (no experience). After writing this down I realize I still assume the user is a developer with an IDE like VS Code installed and vibe coding experience.
Karabiner-Elements 16.0.0 shipped yesterday. Congrats to the KE team — it's a solid release. I've been following KE's evolution as someone who uses kanata daily and contributes to the project, and there are some things worth discussing about where keyboard remappers are heading.
What caught my eye in KE 16
JavaScript config generation. Users can now write JavaScript that generates KE's Complex Modifications JSON, with an in-app editor. KE's raw JSON configs are notoriously hard to write, and JavaScript is the most widely-known programming language in the world. This dramatically lowers the barrier to writing complex rules — and in a world where LLMs are increasingly writing configs, JS's massive presence in training data is a pragmatic advantage.
`send_user_command` and external server integration. KE now has a formalized way to dispatch actions to an external process via IPC, with a companion "user-command-receiver" server. Rather than building every possible action into core, they've created a lightweight extension point.
Accessibility API integration. `frontmost_application_if` can now detect overlay windows like Spotlight via the Accessibility API, and `variable_if` gained awareness of UI element properties: role, subrole, title, window geometry. This expands context-aware remapping beyond "which app is frontmost" to "what UI element is focused." It moves KE further along the spectrum toward tools like Hammerspoon.
What I think this means for kanata
KE's `send_user_command` is something kanata already has, and arguably does better. The `push-msg` action (#854) already provides lightweight outbound dispatch over TCP without the latency and security concerns of `cmd`. And kanata's TCP server is bidirectional: external tools can push state in as well as receive events. jtroo has been pretty clear this is by design, not a workaround (#1304).

What strikes me is that the pieces for a real extension ecosystem are mostly already here. jtroo explored a `custom-behaviour` extension point in #797. The community built `push-msg` as a practical answer to the latency and security issues with `cmd`. And on context-aware remapping, jtroo's position has been consistent: it belongs outside kanata, mediated by TCP (#40). Community tools like komokana (Windows), kanata-vk-agent (macOS), and qanata (Linux/Sway) already prove this pattern works.

I've been building KeyPath, a macOS app for kanata config and device management, and it already uses both sides of this pattern: receiving `push-msg` events for native macOS actions (action handler) and detecting the frontmost app to push state into kanata via TCP (context provider). Next I'm exploring feeding macOS Accessibility API state through the same channel. That is the same thing KE 16 built into its core, but delivered as an external tool, which is more in line with how jtroo has said this should work.

The pattern that's emerging might be worth naming: action handlers receive `push-msg` events and perform platform-specific actions (window management, notifications, app launching), while context providers detect system state such as the frontmost app and push it into kanata over TCP.

This mostly works today with existing features. What's missing isn't so much new functionality as visibility: jtroo has noted that TCP protocol documentation is scattered across issues and source code (#1304). I wonder if the highest-leverage investment is making the existing integration pattern more discoverable by documenting it, linking to community tools, and positioning `push-msg` + `defvar`/`switch` as the extension surface. That way kanata's core stays lean while the community builds the platform-specific integrations that KE is choosing to absorb into its core.

Worth noting: KE's JavaScript move also makes their configs easier for LLMs to generate. Kanata's S-expression syntax is LLM-friendly (simple, consistent, unambiguous) but there's far less of it in training data. Curated example configs, goal-oriented recipes in the docs, and machine-readable validation output could help close that gap.

The remapping engine remains kanata's deepest advantage. KE is catching up on conditional logic, but kanata's `switch`, tap-hold variants, `defvar`, `deflocalkeys`, and chords are still ahead. The risk isn't that KE overtakes kanata on remapping; it's that kanata's strength becomes a ceiling if users can't easily integrate with the broader system. The TCP extension pattern could prevent that, and most of the pieces are already in place.

Curious what others think, especially whether there's appetite for better documenting the TCP integration pattern and making the community tools more discoverable. And if anyone else is building integrations on the TCP server, I'd love to hear what's working and what's missing.
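
For anyone who wants to try the pattern, here is a minimal Python sketch of both roles. It is not KeyPath's code and not an official client. It assumes the TCP server exchanges newline-delimited JSON, that `push-msg` arrives as `{"MessagePush": {"message": [...]}}`, and that a layer switch can be requested with `{"ChangeLayer": {"new": "..."}}`; check those shapes against the kanata source for your version. The port, the handler names, and the app-to-layer table are all hypothetical.

```python
import json
import socket
from typing import Callable, Dict, Optional


def parse_server_line(line: str) -> Optional[dict]:
    """Decode one newline-delimited JSON message; None if blank or malformed."""
    line = line.strip()
    if not line:
        return None
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None
    return msg if isinstance(msg, dict) else None


def dispatch(msg: dict, handlers: Dict[str, Callable[..., None]]) -> bool:
    """Action-handler side: route a pushed message to a handler by its first item.

    Assumes the payload shape {"MessagePush": {"message": ["name", arg, ...]}}.
    Returns True only if a handler was found and called.
    """
    body = msg.get("MessagePush")
    if not isinstance(body, dict):
        return False
    payload = body.get("message")
    if not isinstance(payload, list) or not payload or not isinstance(payload[0], str):
        return False
    handler = handlers.get(payload[0])
    if handler is None:
        return False
    handler(*payload[1:])
    return True


def layer_change_request(app_name: str, app_layers: Dict[str, str],
                         default: str = "base") -> str:
    """Context-provider side: build a request line for the frontmost app.

    Assumes the request shape {"ChangeLayer": {"new": "<layer>"}}.
    """
    layer = app_layers.get(app_name, default)
    return json.dumps({"ChangeLayer": {"new": layer}}) + "\n"


def run(handlers: Dict[str, Callable[..., None]],
        host: str = "127.0.0.1", port: int = 5829) -> None:
    """Connect and dispatch pushed messages until the server closes the socket.

    The port is a placeholder; use whatever port kanata was started with.
    """
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                msg = parse_server_line(line)
                if msg is not None:
                    dispatch(msg, handlers)
```

Keeping parsing and dispatch separate from the socket loop means the handler table can be tested without a running kanata instance. A context provider would write the string from `layer_change_request` to the same socket whenever the frontmost app changes.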