idiolect


A general-purpose voice user interface for the IntelliJ Platform, inspired by Tavis Rudd. Possible use cases include supporting visually impaired users and users with RSI. Originally developed as part of a JetBrains hackathon, it is now a community-supported project. For background information, check out this presentation.

Usage

To get started, press the Voice control button in the toolbar, then speak a command, e.g. "Hi, IDEA!" Idiolect supports a simple grammar. For a complete list of commands, please refer to the wiki. Click the button once more to deactivate.

Building

For Linux or macOS users:

git clone https://github.com/OpenASR/idiolect && cd idiolect && ./gradlew runIde

For Windows users:

git clone https://github.com/OpenASR/idiolect & cd idiolect & gradlew.bat runIde

Recognition works with most popular microphones (preferably 16kHz, 16-bit). For best results, minimize background noise.

Contributing

Contributors who have IntelliJ IDEA installed can simply open the project. Otherwise, run the following command from the project's root directory:

./gradlew runIde -PluginDev

Architecture

Idiolect is implemented using the IntelliJ Platform SDK. For more information about the plugin architecture, please refer to the wiki page.

Integration with Idiolect

plugin.xml defines a number of <extensionPoint>s which allow other plugins to integrate with, extend, or customise the capabilities of Idiolect.
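For illustration, an extension point declaration in plugin.xml generally takes the following shape (the name and fully-qualified interface below are placeholders; consult the actual plugin.xml for the real declarations):

```xml
<!-- Illustrative shape only; see idiolect's plugin.xml for the real entries. -->
<extensionPoints>
    <extensionPoint name="asrProvider"
                    interface="org.openasr.idiolect.asr.AsrProvider"/>
</extensionPoints>
```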

AsrProvider

Listens for audio input, recognises speech to text and returns an NlpRequest with possible utterances. Does not resolve the intent.

Possible alternative implementations could:

  • integrate with Windows SAPI 5 Speech API
  • integrate with Dragon/Nuance API

NlpProvider

Processes an NlpRequest. The default implementation invokes IdeService.invokeAction(ExecuteVoiceCommandAction, nlpRequest); the action is then handled by ExecuteVoiceCommandAction and ActionRecognizerManager.handleNlpRequest().

AsrSystem

Processes audio input, recognises speech to text and executes actions. The default implementation AsrControlLoop uses the AsrProvider and NlpProvider.

Some APIs such as AWS Lex implement the functionality of AsrProvider and NlpProvider in a single call.
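The relationship between the three interfaces can be sketched as follows. These are hypothetical, simplified signatures for illustration only; the real types live in the idiolect source:

```java
import java.util.List;

// Hypothetical, simplified shapes of the real idiolect types.
record NlpRequest(String utterance, List<String> alternatives) {}

interface AsrProvider {
    // Blocks until speech is detected, then returns the recognised alternatives.
    // Does not resolve the intent.
    NlpRequest waitForUtterance();
}

interface NlpProvider {
    // Takes the recognised text and drives intent resolution downstream.
    void processNlpRequest(NlpRequest request);
}

// Sketch of an AsrControlLoop-style AsrSystem: recognise, then hand off.
class AsrControlLoopSketch {
    private final AsrProvider asr;
    private final NlpProvider nlp;

    AsrControlLoopSketch(AsrProvider asr, NlpProvider nlp) {
        this.asr = asr;
        this.nlp = nlp;
    }

    // One iteration of the loop: speech -> text -> action.
    void runOnce() {
        NlpRequest request = asr.waitForUtterance();
        nlp.processNlpRequest(request);
    }
}
```

An AWS Lex-backed implementation would replace both collaborators with a single remote call, which is why the AsrSystem abstraction sits above AsrProvider and NlpProvider.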

IntentResolver

Processes an NlpRequest (utterance/alternatives) and resolves an NlpResponse with intentName and slots. ActionRecognizerManager.handleNlpRequest() iterates through the IntentResolvers until it finds a match.

The Idiolect implementations use either exact matching or regular expressions on the recognised text. Alternative implementations could use AI to resolve the intent.

CustomPhraseRecognizer

Many of the auto-generated trigger phrases are not suitable for voice activation. You can add your own easier-to-say and easier-to-remember phrases in ~/.idea/phrases.properties.

IntentHandler

Fulfills an NlpResponse (intent + slots), performing desired actions. ActionRecognizerManager.handleNlpRequest() iterates through the IntentHandlers until the intent is actioned.
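The resolve-then-handle pipeline can be sketched as below. The types and method names are hypothetical simplifications; the real ActionRecognizerManager is in the idiolect source:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified shapes of the real idiolect types.
record NlpResponse(String intentName, Map<String, String> slots) {}

interface IntentResolver {
    // Returns a response if this resolver matches the utterance, else empty.
    Optional<NlpResponse> tryResolve(String utterance);
}

interface IntentHandler {
    // Returns true if this handler actioned the intent.
    boolean tryFulfil(NlpResponse response);
}

class RecognizerManagerSketch {
    // Iterate resolvers until one matches, then iterate handlers until
    // one actions the resolved intent.
    static boolean handle(String utterance,
                          List<IntentResolver> resolvers,
                          List<IntentHandler> handlers) {
        for (IntentResolver resolver : resolvers) {
            Optional<NlpResponse> response = resolver.tryResolve(utterance);
            if (response.isPresent()) {
                for (IntentHandler handler : handlers) {
                    if (handler.tryFulfil(response.get())) return true;
                }
            }
        }
        return false;
    }
}
```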

TemplateIntentHandler

Handles two flavours of intent prefix:

  • Template.id.${template.id}, e.g. Template.id.maven-dependency
  • Template.${template.groupName}.${template.key}, e.g. Template.Maven.dep

template.id is often null. template.key is the "Abbreviation" that you would normally type before pressing TAB.

The default trigger phrases are generated from the template description or key and are often not suitable for voice activation. You can add your own trigger phrase -> live template mapping in ~/.idea/phrases.properties and it will be resolved by CustomPhraseRecognizer.
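For example, entries in ~/.idea/phrases.properties might look like the following. The phrases and template names here are made up for illustration, and the exact key/value format is defined by CustomPhraseRecognizer:

```properties
# Hypothetical entries: spoken phrase on the left, intent on the right.
create main method=Template.Java.main
add maven dependency=Template.Maven.dep
```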

TtsProvider

Reads audio prompts/feedback aloud to the user.

org.openasr.idear.nlp.NlpResultListener

Any listeners registered to this topic in plugin.xml under <applicationListeners> will be notified when:

  • the listening state changes
  • recognition is returned by the AsrProvider
  • a request is fulfilled by an IntentHandler
  • there is a failure
  • a prompt/message is provided for the user
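A listener implementation might look roughly like this. The interface below is a hypothetical simplification; the real callbacks are declared on org.openasr.idear.nlp.NlpResultListener:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified version of the NlpResultListener callbacks.
interface NlpResultListenerSketch {
    void onListeningStateChanged(boolean listening);
    void onRecognition(String utterance);
    void onFulfilled(String intentName);
    void onFailure(String reason);
    void onPrompt(String message);
}

// Example implementation that records each notification it receives.
class RecordingListener implements NlpResultListenerSketch {
    final List<String> events = new ArrayList<>();

    @Override public void onListeningStateChanged(boolean listening) { events.add("listening:" + listening); }
    @Override public void onRecognition(String utterance) { events.add("heard:" + utterance); }
    @Override public void onFulfilled(String intentName) { events.add("fulfilled:" + intentName); }
    @Override public void onFailure(String reason) { events.add("failed:" + reason); }
    @Override public void onPrompt(String message) { events.add("prompt:" + message); }
}
```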

Plugin Actions

plugin.xml defines <action>s:

VoiceRecordControllerAction

This action is invoked when the user clicks the Voice control button in the toolbar. It simply tells the AsrService to activate or stand by. While the AsrService is active, audio is processed by the AsrSystem, by default AsrControlLoop (see below).

ExecuteActionFromPredefinedText

A debugging aid which uses one of the ActionRecognizer extension classes configured in plugin.xml to generate an ActionCallInfo, which is then executed via runInEditor().

ExecuteVoiceCommandAction

Similar to ExecuteActionFromPredefinedText but uses the Idiolect.VoiceCommand.Text data attached to the invoking AnActionEvent.

IDEA Actions

There are many Actions (classes which extend AnAction) provided by IDEA.

AsrControlLoop

When AsrControlLoop detects an utterance, it invokes PatternBasedNlpProvider.processUtterance(), which typically calls invokeAction() and/or one or more of the methods of IdeService.

Programming By Voice

Maintainers