Plugin capability #846

Open
Timoses opened this issue Sep 2, 2020 · 10 comments
Labels
Caster Issues pertaining to primarily the Caster project. Investigation Required Issue needs to be investigated.

Comments

@Timoses

Timoses commented Sep 2, 2020

Requirements for a plugin framework:

  • Plugins may depend on other Python libraries -> Caster should offer some form of dependency resolution
  • Plugins should be configurable by the user of a plugin
  • Plugins must be able to store a plugin-specific state (e.g. bring me stores user's programs to bring)
  • Some form of developer mode to hot-reload plugins
  • Plugins should define a default pronunciation which can be overridden by the user
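
A minimal sketch of the last requirement, assuming a plugin ships its default spoken phrases as a plain mapping and the user supplies overrides keyed by action name. The names `DEFAULT_SPECS` and `apply_spec_overrides` are hypothetical and not part of Caster or Dragonfly:

```python
from typing import Dict

# Hypothetical defaults a plugin might ship: spoken phrase -> action name.
DEFAULT_SPECS: Dict[str, str] = {
    "new tab": "open_tab",
    "close tab": "close_tab",
}


def apply_spec_overrides(defaults: Dict[str, str],
                         overrides: Dict[str, str]) -> Dict[str, str]:
    """Return a phrase -> action mapping with user overrides applied.

    `overrides` maps an action name to the phrase the user prefers.
    Actions without an override keep the plugin's default phrase.
    """
    merged = {}
    for phrase, action in defaults.items():
        merged[overrides.get(action, phrase)] = action
    return merged


# The user prefers "fresh tab" over the default "new tab".
print(apply_spec_overrides(DEFAULT_SPECS, {"open_tab": "fresh tab"}))
```

Keying overrides by action rather than by phrase would keep a user's configuration valid even if a plugin later changes its default wording.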

Ideas:

  • Plugins could provide deeper context information to Caster
    • Caster (or Dragonfly) may (and should probably) never implement all possible situations of application-specific contexts.
    • For example, the plugin firefox could provide Caster with information about whether the cursor is currently in the address bar of the browser
    • This capability could be used to further separate plugins from each other. For example, the basic firefox plugin could provide context information to Caster. A user could then configure a plugin which is only used within this special context of the firefox browser (e.g. a "firefox: address bar" plugin, or even an "address bar search" plugin which is not limited to firefox but works in other browsers or wherever such a capability is required).
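
One way to picture this idea is a small registry where plugins publish named contexts and others query them. This is only a sketch under assumptions: the registry, the function names and the `firefox.address_bar` context name are hypothetical, and `browser_state` stands in for whatever a real firefox plugin would inspect:

```python
from typing import Callable, Dict

# Hypothetical registry: context name -> predicate reporting whether it is active.
_providers: Dict[str, Callable[[], bool]] = {}


def provide_context(name: str, is_active: Callable[[], bool]) -> None:
    """Called by a plugin to publish a named context."""
    _providers[name] = is_active


def context_active(name: str) -> bool:
    """Called by the framework or other plugins; unknown contexts are inactive."""
    is_active = _providers.get(name)
    return is_active() if is_active is not None else False


# Stand-in for the state a firefox plugin would really have to detect.
browser_state = {"focused_element": "address_bar"}

provide_context(
    "firefox.address_bar",
    lambda: browser_state["focused_element"] == "address_bar",
)

print(context_active("firefox.address_bar"))  # True while the address bar has focus
```

An "address bar search" plugin could then be activated on `context_active("firefox.address_bar")` without knowing how firefox detects that state.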

Initial issue content

Is your feature request related to a problem? Please describe.
As I currently understand it, a user has to manually import external rules into their user directory. From my perspective this is a real hindrance for users who want to use external rules. Further, there is currently no way to configure rules nicely from a user perspective.

Describe the solution you'd like
It would be very helpful if a user were able to configure which external rules should be loaded, with Caster taking care of fetching and loading them. Additionally, the user should be able to configure a specific externally loaded plugin (rule).

Configuration of external rules may, for example, be helpful when a rule can be used in different application contexts. For example, the terminal multiplexer tmux can be used in a variety of shell or terminal applications. In the tmux example, the configuration could also specify whether the client should connect to a specific tmux server, or even let the user configure multiple tmux targets. The plugin developer should be free to decide which configuration options are offered and how they are implemented.

What plugins are loaded and their configuration could be specified in the user's caster directory:

<user_dir>/
    plugins.yaml   # this is where the plugins are defined by the user

(I already mentioned once that another name for user_dir may be more descriptive (#842)).

plugins.yaml could, for example, have the following format:

- dictation-toolbox/caster-plugins//base
- timoses/caster-tmux:
    application_context: alacritty

I used YAML just as an example. Above, dictation-toolbox/caster-plugins would be the GitHub repo and base a subfolder within it in which the plugin is located. Caster would make sure that the configuration a user makes for a plugin is handed over to that plugin.
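
To illustrate how such entries could be interpreted, here is a minimal sketch that works on the already-parsed YAML (a list whose items are either a string or a single-key mapping). `PluginEntry` and `parse_entry` are hypothetical names, and the `//` separator is taken from the example above:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Union


@dataclass
class PluginEntry:
    repo: str                      # e.g. "dictation-toolbox/caster-plugins"
    subdir: str = ""               # e.g. "base"; empty if the repo root is the plugin
    config: Dict[str, Any] = field(default_factory=dict)


def parse_entry(raw: Union[str, Dict[str, Any]]) -> PluginEntry:
    """Turn one item of the plugin list into a PluginEntry."""
    if isinstance(raw, str):
        source, config = raw, {}
    elif isinstance(raw, dict) and len(raw) == 1:
        source, config = next(iter(raw.items()))
        config = config or {}      # "name:" with no options parses to None
    else:
        raise ValueError(f"unsupported plugin entry: {raw!r}")
    repo, _, subdir = source.partition("//")
    return PluginEntry(repo=repo, subdir=subdir, config=dict(config))


# What a YAML parser would produce for the example above.
EXAMPLE = [
    "dictation-toolbox/caster-plugins//base",
    {"timoses/caster-tmux": {"application_context": "alacritty"}},
]

for item in EXAMPLE:
    print(parse_entry(item))
```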

The plugin could be loaded into the user's Caster directory under a plugins directory. The downloaded plugin directories should never be manipulated by the user. It should still be possible to override specific rules from the user's own rules directory.

The user's caster directory would then look something like this:

<caster_directory>/
    plugins/
        dictation-toolbox/
            caster-plugins/
                base/
                    rules/
        timoses/
            caster-tmux/
                rules/
    rules/    # User's rules
    plugins.yaml    # User defined list of plugins to be fetched

A remote plugin source directory in a git repository would contain the rules. Additionally, it could contain a plugin.conf defining its configuration options (such as application_context), which Caster reads in order to, for example, validate the user's configuration for the plugin.
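
A rough sketch of such a validation step, assuming plugin.conf has already been parsed into a mapping from option name to expected type. The format of plugin.conf is not defined in this issue, so `DECLARED_OPTIONS` and `validate_plugin_config` are purely illustrative:

```python
from typing import Any, Dict, List

# Hypothetical parsed plugin.conf: option name -> expected Python type.
DECLARED_OPTIONS: Dict[str, type] = {
    "application_context": str,
}


def validate_plugin_config(declared: Dict[str, type],
                           user_config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, value in user_config.items():
        if name not in declared:
            problems.append(f"unknown option: {name}")
        elif not isinstance(value, declared[name]):
            problems.append(
                f"option {name} expects {declared[name].__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


print(validate_plugin_config(DECLARED_OPTIONS, {"application_context": "alacritty"}))
print(validate_plugin_config(DECLARED_OPTIONS, {"colour": "red"}))
```

Returning a list of problems instead of raising on the first one would let Caster report everything wrong with a user's configuration in one go.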

Going one step further, the Caster rules as they are currently implemented within the project (such as navigation, bring_me, ...) could be moved out of the main Caster project and used as external plugins. This would focus Caster on extending Dragonfly in a way that gives plugin developers a much cleaner surface and users even more flexibility in choosing which plugins they would like to use. The Caster-Plugins could be loaded into the user's plugin.conf in the Caster directory by default, giving her or him the same experience as is currently the case (however, with a very easy option to go in another direction if she or he so desires).

Following this pattern Caster as an extension to dragonfly would focus on providing improvements to the potential of voice coding as a base to plugin developers. Plugin developers would have available the full spectrum of the dragonfly and caster rule framework (with actions, etc) including a plugin framework to work with and offer their plugins to users.

Describe alternatives you've considered
Of course, a plugin framework could be molded onto the current project as is. However, it is important to consider how a plugin framework could be implemented and what effect it may have on the project.

Additional context
My first impression of the project is that there is little separation of concerns. I believe that separating the project into Caster as an extension to Dragonfly, with Caster-Plugins offered as a starting point for users, would benefit both concerns (extending Dragonfly, and the actual usage and application of the Dragonfly/Caster framework).

This is only an initial idea and an inquiry for feedback and brainstorming; especially also for evaluating feasibility.

@LexiconCode LexiconCode added the Investigation Required Issue needs to be investigated. label Sep 3, 2020
@Timoses
Author

Timoses commented Sep 8, 2020

Just noting down an idea:

  • Plugins themselves may be Python packages which can be loaded with pip by caster. This solves handling of dependencies which any plugin may require.
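
As a sketch of what "loaded with pip" might look like, the helper below only builds the pip command for a plugin hosted on GitHub. It uses pip's `git+https://...@ref#subdirectory=...` URL form, the same form as the install command in the "Try it yourself" section of this issue. The function name is hypothetical, and hosting on github.com is an assumption:

```python
import sys
from typing import List


def pip_install_command(repo: str, subdir: str = "", ref: str = "") -> List[str]:
    """Build a pip command that installs a plugin package from a GitHub repo.

    repo   -- "owner/name" of the repository (assumed to be hosted on github.com)
    subdir -- optional subdirectory containing the package
    ref    -- optional branch, tag or commit
    """
    url = f"git+https://github.com/{repo}.git"
    if ref:
        url += f"@{ref}"
    if subdir:
        url += f"#subdirectory={subdir}"
    # Using the running interpreter ensures the plugin lands in Caster's environment.
    return [sys.executable, "-m", "pip", "install", url]


print(pip_install_command("Timoses/Caster", subdir="mycaster", ref="mycaster"))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`, and pip would then resolve whatever dependencies the plugin declares.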

@LexiconCode LexiconCode added the Caster Issues pertaining to primarily the Caster project. label Sep 10, 2020
@Timoses
Author

Timoses commented Sep 10, 2020

I have been trying to sum up some requirements and potential ideas for a plugin framework in the initial description of this issue. This will be an ongoing effort.

So far I have presumed that this would only concern grammar plugins. However, one could also think about other types of plugins. For example, a model plugin would allow a user to specify which language model to use, and a word plugin would load words to be recognized during dictation. A word plugin could be very useful for plugins which deal with specific contexts where potentially unusual words are used.

I am not sure how feasible these kinds of plugins are as I have not looked into implementation details.

However, I think it would be a good idea to discuss further plugin opportunities. As I use Caster more and more, I keep discovering new ideas for plugins and for the kind of interface Caster would have to offer.

@Timoses
Author

Timoses commented Sep 16, 2020

I have been prototyping a little in my fork of caster https://github.com/Timoses/Caster/tree/mycaster/mycaster .

The main goals are:

  • Be lean and simple
  • Focus on extensibility
  • Ease of use
  • Be agnostic in regard to which SR engine or framework is used
  • Remove complexity and unwieldiness of current Caster project
  • Find a fitting modular pattern:
    • The core of caster is a plugin framework which can potentially support various speech recognition engines or even frameworks.
    • Utility libraries for plugin development.
    • Plugins: These bring the actual functionality towards controlling other programs and/or the operating system environment itself.

The advantage of this is that Caster Core has neither dependencies on any plugins nor cares how they are implemented.

Further, it would make it much easier to implement further functionality such as:

  • Controlling caster itself via commands to e.g. turn off one or all plugins and contexts, switch language models, ...
  • Provide functionality towards other (sub-)projects to support GUI or other extensive functionality
  • Automatic (and smart?) unloading of unused grammars or complete contexts and reloading of these once the context reactivates

Personally, I would move plugins into their own repository to clearly limit the scope of the project. The problem I see is that there are so many applications that may be supported now and in the future that the project may easily become unwieldy (as it already is, in my opinion). Caster could provide an opinionated plugin collection, for example under a project repository caster-plugins.

Configuration

Currently the project uses a static user configuration file. The user can specify which plugins should be loaded (Caster could provide an initial set of plugins when the file is initially created). It is then possible to specify which plugins should be active in which contexts.

Context configuration offers the original dragonfly AppContext's executable and title and also allows specification of plugin specific contexts.
For example, here I ask the tmux plugin to provide a context which matches when the command in the current pane equals nvim. Technically, the plugin receives the complete data structure which is specified by the user and is free to interpret it and return a context as it sees fit.
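
A simplified sketch of that hand-over, assuming the user's structure is passed to the plugin as a plain dictionary and the returned "context" is just a callable. The key `pane_command` and the function `make_tmux_context` are hypothetical, and the lookup of the pane's command is injected so the example stays runnable without tmux:

```python
from typing import Any, Callable, Dict


def make_tmux_context(spec: Dict[str, Any],
                      current_pane_command: Callable[[], str]) -> Callable[[], bool]:
    """Interpret the user's context spec and return a predicate.

    spec                 -- the user's data structure, handed over unchanged
    current_pane_command -- callable returning the command running in the active pane
    """
    wanted = spec.get("pane_command")  # hypothetical key chosen for this sketch

    def matches() -> bool:
        # Without a constraint the context matches whenever the plugin is active.
        return wanted is None or current_pane_command() == wanted

    return matches


# Stand-in for querying tmux; a real plugin would ask the tmux server instead.
pane = {"command": "nvim"}
in_nvim = make_tmux_context({"pane_command": "nvim"}, lambda: pane["command"])
print(in_nvim())
```

In a real plugin the injected callable might, for instance, read tmux's `#{pane_current_command}` format variable; the framework itself would not need to know about that.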

Two more ideas currently not implemented

  • The plugin framework could provide the plugin with a persistent data store which the plugin can use to store and fetch data that persists between reboots. This could be helpful for a plugin implementing the bring me functionality.
    • Data storage should be separated from configuration (e.g. in separate files).
  • The user should also be able to configure each plugin. This could also be used to let the user override speech command triggers or complete specs (removing the need for merge rules).
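
A minimal sketch of the persistent data store idea, assuming one JSON file per plugin in a data directory that is kept apart from the configuration. The class name `PluginStore` and the file layout are assumptions, not something the prototype implements:

```python
import json
from pathlib import Path
from typing import Any, Dict


class PluginStore:
    """Per-plugin JSON store, kept separate from user configuration."""

    def __init__(self, data_dir: Path, plugin_name: str) -> None:
        self._path = Path(data_dir) / f"{plugin_name}.json"

    def load(self) -> Dict[str, Any]:
        """Return the stored data, or an empty dict if nothing was saved yet."""
        if not self._path.exists():
            return {}
        with self._path.open(encoding="utf-8") as handle:
            return json.load(handle)

    def save(self, data: Dict[str, Any]) -> None:
        """Write data to a temporary file first, then swap it into place."""
        self._path.parent.mkdir(parents=True, exist_ok=True)
        tmp = self._path.with_name(self._path.name + ".tmp")
        tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
        tmp.replace(self._path)
```

A bring me plugin could then call something like `PluginStore(data_dir, "bring_me").save({"programs": {...}})` and read the same data back with `load()` after a restart.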

Plugins

Plugins are completely free in how they implement grammars. A plugin could, for example, also use the BreatheAPI without touching any of Caster's utilities.

The current behaviour of Caster could be provided by a collection of plugins which may be loaded into the configuration file automatically at initial creation. This should be a very reduced set of plugins, though. It does not make sense to provide a user with a plugin which she or he never uses. A good candidate for example for initial activation is the dictation plugin.

As an example usage the current branch contains two activated plugins:

  • A simple alphabet (copied and pasted from Caster)
  • My own tmux plugin (which is still very basic)

Plugins can return a default context (for example the firefox plugin could set a default context for the application firefox). Again the plugin is free in how the context is implemented.

Project Structure

To sum up the previous points this is a suggestion for a project structure:

  • Project Caster:
    • Namespace: castervoice
    • Contains:
      • Core: The core of caster providing the plugin framework, support for various speech recognition engines and frameworks and context management.
      • Utility: Utilities for plugin development
  • Project Caster-Plugins:
    • Namespace: casterplugin
    • Contains:
      • Plugins to support various applications and operating system control
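
With a dedicated casterplugin namespace, the core could locate a plugin by name using nothing but the standard import machinery. The sketch below assumes each plugin module exposes a `get_grammars()` function; that hook name is hypothetical and not an existing Caster API:

```python
import importlib
from types import ModuleType

PLUGIN_NAMESPACE = "casterplugin"


def load_plugin(name: str) -> ModuleType:
    """Import `casterplugin.<name>` and check that it exposes the expected hook."""
    module = importlib.import_module(f"{PLUGIN_NAMESPACE}.{name}")
    # `get_grammars` is a hypothetical hook name chosen for this sketch.
    if not callable(getattr(module, "get_grammars", None)):
        raise ImportError(f"{module.__name__} does not define get_grammars()")
    return module
```

Because the core only imports by name, it would not depend on any particular plugin being installed, which matches the goal that Caster Core has no dependencies on plugins.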

I drew a rough class diagram for the core of Caster (https://github.com/Timoses/Caster/blob/mycaster/mycaster/class_diagram.png):

Organisation

In my opinion this would be a candidate for a Caster 2 project. It would involve a significant amount of changes to current code and would perhaps be better off in a separate project.

Try it yourself

Only Python 3:

python -m venv venv
source venv/bin/activate
pip install 'git+https://github.com/Timoses/Caster.git@mycaster#subdirectory=mycaster'

# Fetch configuration file
mkdir user_config_dir
curl -L -o user_config_dir/caster.yml https://raw.githubusercontent.com/Timoses/Caster/mycaster/mycaster/user_config_dir/caster.yml

# Kaldi
# You should probably adjust the model_dir argument in the configuration file and then
pip install 'dragonfly2[kaldi]'

# Go nuts
python -m mycastervoice

@LexiconCode
Member

LexiconCode commented Sep 21, 2020

@Timoses Tomorrow or midweek I will get around to trying out your branch!

Would you be available to chat over voice, possibly Wednesday or Thursday? Perhaps we can arrange a time over Gitter.

I am continuing to think through the scope, definition and barriers of plug-ins in relation to the current system. The overarching question here comes down to the level of abstraction. Some related subtopics:

  • Spec/grammar API and its impact on current framework and plug-in implementation
  • Contexts
  • Modularization of rules: Python vs configuration files such as TOML/YAML formats for plug-ins
  • Plug-in discovery
  • Framework Fragmentation

There's more to discuss than the points above, but that is quite a bit already. The goal is to summarize what we discuss in our conversations.

@Timoses
Author

Timoses commented Sep 21, 2020

The overarching question here comes down to the level of abstraction.

Yes. I have found that Caster Core can really just focus on pluggability and facilitating frameworks/engines. Even creating a plugin to control the Caster controller itself seems feasible (here an example to put all plugins except the controller to sleep).

@LexiconCode
Member

It would be nice to make actions a plug-in as well.

@Timoses
Author

Timoses commented Nov 27, 2020

It would be nice to make actions a plug-in as well.

Imo a plugin has a very specific scope and should be ready to be used by a user. E.g. you'd maybe want to use Keepass with Caster: you'd use the Keepass plugin. If you want to use Office 365, you use the Office 365 plugin.
Thus, a plugin should/could interpret commands on a higher level. For example, a dictation plugin would let you say 'capital r' (R) instead of having to say 'shift r'. Plugins are therefore very context-specific (which makes sense as you select which plugins are active based on the context your digital platform/computer is in).

Actions I would see more as a library. E.g. like this one: https://github.com/dictation-toolbox/dtactions .
Those could offer a plugin developer easier ways to implement their context-specific logic by using those action libraries. But the plugin developer is not limited to it and can use any other libraries (such as the keepass library; I really want a Keepass plugin : D).

@Timoses
Author

Timoses commented Mar 16, 2021

As a side note:
This is implemented in a side project: https://github.com/CasterVoice/caster-core

As Caster in its current form does not provide plugin capability, I'll leave this issue open. Options would be to:

  • Transfer Caster functionalities to plugins usable with Caster-Core (while also building upon Caster-Core to support more of the features which Caster currently provides)
  • Implement plugin capability in this project

@masisley

Curious to know if we've made any progress on this. You've both clearly put a lot of thought into it. :-)

I've coded a bunch of useful grammars for smart home devices (hue light control, sleep number bed, LG TV, and soon twinkly Christmas lights) and even for controlling videogames by voice (primarily game boy advance games like Pokémon and advance wars that are not time sensitive), and would love to upload them somewhere useful, but I don't want to subject all Caster users to them. Several of them rely on command line tools, though some of them could potentially be refactored to Python libraries. Very useful though. The iOS apps to control most of these are terrible in terms of voice accessibility.

I might try https://github.com/dictation-toolbox/dragonfly-scripts if there isn't anywhere better.

@Timoses
Author

Timoses commented Jan 15, 2024

As mentioned here there's a caster-core repository. There's also a working collection of some commonly used functionalities in the caster-plugins repository (dictation and bringme functionality).

I maintain some plugins for myself here.

The base/core functionality of the core repository was good enough for me to work on my own plugins. It was intended as a parallel exploration project but lost focus due to a lack of time on my side. Of course, many functionalities that the current "Mono-Repo" Caster (this repository) provides are not implemented as plugins yet.

If you'd like to have a look, I'd be happy to guide you along.


3 participants