# Rhasspy Command-Line Tools

Rhasspy's services can be controlled with command-line tools. Many of these tools are designed for use in Unix pipelines, so they are strict about what they read from `stdin` and write to `stdout`.

The available tools can be grouped by their function:

* Wake word
    * Detect wake word in an audio stream
* Voice command
    * Detect start and stop of voice commands in an audio stream
* Training
    * Generate speech/intent recognition artifacts
* Speech to text
    * Transcribe an audio segment
* Intent recognition
    * Convert text to structured JSON event

## GStreamer

GStreamer provides tools and plugins for constructing audio/video transformation pipelines.

Rhasspy's audio tools (wake word, voice command, etc.) expect a precise audio format (16-bit, 16 kHz, mono PCM), which GStreamer can produce from a variety of sources. Audio can even be streamed over a network, allowing Rhasspy to receive microphone input remotely.
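
Before sending a file to one of the audio tools, it can be useful to check whether it already has the expected format. The following is a minimal sketch using Python's standard `wave` module. The helper name `is_rhasspy_format` is made up for illustration, and the check only works for uncompressed WAV files.

```python
import wave


def is_rhasspy_format(wav_path):
    """Return True if the WAV file is 16-bit, 16 kHz, mono PCM."""
    with wave.open(wav_path, "rb") as wav_file:
        return (
            wav_file.getsampwidth() == 2  # 2 bytes per sample = 16-bit
            and wav_file.getframerate() == 16000  # 16 kHz sample rate
            and wav_file.getnchannels() == 1  # mono
        )
```

For example, `is_rhasspy_format("wav/porcupine.wav")` reports whether that file needs conversion first.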

In [1]:
# Make sure you have gstreamer installed.
# If not, run:
# sudo apt-get install gstreamer1.0-pulseaudio gstreamer1.0-tools gstreamer1.0-plugins-good

!which gst-launch-1.0

/usr/bin/gst-launch-1.0


In [3]:
# Get a list of installed plugins

! gst-inspect-1.0 | head

kate:  katedec: Kate stream text decoder
kate:  kateenc: Kate stream encoder
kate:  kateparse: Kate stream parser
kate:  katetag: Kate stream tagger
x265:  x265enc: x265enc
resindvd:  rsndvdbin: rsndvdbin
uvch264:  uvch264mjpgdemux: UVC H264 MJPG Demuxer
uvch264:  uvch264src: UVC H264 Source
libvisual:  libvisual_jess: libvisual jess plugin plugin v.0.1
libvisual:  libvisual_bumpscope: libvisual Bumpscope plugin plugin v.0.0.1


In [4]:
# See details for a specific plugin

! gst-inspect-1.0 filesrc | head

Factory Details:
  Rank                     primary (256)
  Long-name                File Source
  Klass                    Source/File
  Description              Read from arbitrary point in a file
  Author                   Erik Walthinsen <omega@cse.ogi.edu>

Plugin Details:
  Name                     coreelements
  Description              GStreamer core elements


In [5]:
# Play an audio file using a pipeline.
# You can also just use gst-play-1.0 <FILE>

! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    autoaudiosink

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstPulseSinkClock
Got EOS from element "pipeline0".
Execution ended after 0:00:02.404596137
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...


## Wake Word (Porcupine)

Rhasspy uses [porcupine](https://github.com/Picovoice/Porcupine) to detect wake words. By default, the `rhasspy-porcupine` command expects an audio stream on `stdin` and listens for the word "porcupine".

In [7]:
# Play wake word sample
! gst-play-1.0 wav/porcupine.wav

Press 'k' to see a list of keyboard shortcuts.
Now playing /home/hansenm/opt/rhasspy-services/docs/notebooks/wav/porcupine.wav
Redistribute latency...
0:00:01.2 / 0:00:01.2       
Reached end of play list.



In [10]:
# Detect wake word.
# Requires 16-bit, 16 kHz mono audio.
! gst-launch-1.0 \
    filesrc location=wav/porcupine.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-porcupine | \
  jq .

{
  "index": 0,
  "keyword": "/home/hansenm/opt/rhasspy-services/wake_word/porcupine/resources/keyword_files/linux/porcupine_linux.ppn"
}


## Voice Commands (webrtcvad)

Rhasspy uses [webrtcvad](https://github.com/wiseman/py-webrtcvad) to detect speech. Combined with some heuristics, `rhasspy-webrtcvad` will detect when a voice command starts and stops. This command expects an audio stream on `stdin`.

In [14]:
# Requires 16-bit, 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-webrtcvad

rhasspy/voice-command/speech {"seconds": 0.06}
rhasspy/voice-command/command-started {"seconds": 0.6000000000000001}
rhasspy/voice-command/silence {"seconds": 3.3000000000000025}
rhasspy/voice-command/command-stopped {"seconds": 3.900000000000003}
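
Each output line above is a topic followed by a JSON payload. A small sketch of how these lines could be parsed in Python to measure the length of the voice command (the helper name `parse_event` is an assumption, not part of Rhasspy):

```python
import json


def parse_event(line):
    """Split a '<topic> <json payload>' line into (topic, payload dict)."""
    topic, payload = line.strip().split(" ", 1)
    return topic, json.loads(payload)


# Lines copied from the output above
lines = [
    'rhasspy/voice-command/speech {"seconds": 0.06}',
    'rhasspy/voice-command/command-started {"seconds": 0.6000000000000001}',
    'rhasspy/voice-command/silence {"seconds": 3.3000000000000025}',
    'rhasspy/voice-command/command-stopped {"seconds": 3.900000000000003}',
]

events = dict(parse_event(line) for line in lines)
started = events["rhasspy/voice-command/command-started"]["seconds"]
stopped = events["rhasspy/voice-command/command-stopped"]["seconds"]
print(f"Voice command lasted {stopped - started:.1f} seconds")
# prints: Voice command lasted 3.3 seconds
```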


## Training

Rhasspy's voice commands are pre-specified in a file named `sentences.ini`, which contains simplified [JSGF grammars](https://www.w3.org/TR/jsgf/) grouped by intent. The training process involves:

1. Extracting the JSGF grammars
2. Converting them to finite state transducers (FSTs)
3. Merging the FSTs into `intent.fst`
4. Converting `intent.fst` to an [ARPA language model](https://cmusphinx.github.io/wiki/arpaformat/)

Assumes the `profile` directory contains:

* `sentences.ini`
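
To illustrate the "grouped by intent" layout, here is a sketch that reads such a file with Python's standard `configparser`. The sample content is hypothetical (only the intent names appear elsewhere in this notebook), and it assumes one `[IntentName]` section per intent with one sentence template per line. How `rhasspy-ini_jsgf` actually parses the file is not shown here.

```python
import configparser

# Hypothetical sentences.ini content: one section per intent,
# one sentence template per line (no key=value pairs).
SAMPLE_INI = """
[GetTime]
what time is it
tell me the time

[ChangeLightState]
turn on the living room lamp
"""

# allow_no_value lets each sentence be stored as a key without a value
parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
parser.optionxform = str  # keep the sentences' original case
parser.read_string(SAMPLE_INI)

sentences_by_intent = {
    intent: list(parser[intent]) for intent in parser.sections()
}
for intent, sentences in sentences_by_intent.items():
    print(intent, len(sentences))
```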

In [15]:
# Generate grammars from sentences.ini
! rhasspy-ini_jsgf \
    --ini-file profile/sentences.ini \
    --grammar-dir profile/grammars \
    --debug

DEBUG:root:Loaded ini file
DEBUG:root:Wrote profile/grammars/GetTime.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/GetTemperature.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/GetGarageState.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/ChangeLightState.gram (3 rule(s))
DEBUG:root:Wrote profile/grammars/ChangeLightColor.gram (3 rule(s))


In [16]:
# Show generated grammars
! ls profile/grammars/

ChangeLightColor.gram  GetGarageState.gram  GetTime.gram
ChangeLightState.gram  GetTemperature.gram


In [13]:
# Convert grammars to FSTs and a language model
! rhasspy-jsgf_fst_arpa \
    --grammar-dir profile/grammars \
    --fst-dir profile/fsts \
    --fst profile/intent.fst \
    --vocab profile/vocab.txt \
    --arpa profile/language_model.txt \
    --debug

DEBUG:root:Parsing JSGF grammar GetTime.gram
DEBUG:root:Parsing JSGF grammar ChangeLightState.gram
DEBUG:root:Parsing JSGF grammar GetTemperature.gram
DEBUG:root:Parsing JSGF grammar ChangeLightColor.gram
DEBUG:root:Parsing JSGF grammar GetGarageState.gram
DEBUG:root:Processing GetTime
DEBUG:root:Processing ChangeLightState
DEBUG:root:Processing GetTemperature
DEBUG:root:Processing ChangeLightColor
DEBUG:root:Processing GetGarageState
DEBUG:root:Wrote intent FST to profile/intent.fst
DEBUG:root:Generated FSTs in 0.07654190063476562 second(s)
DEBUG:root:['ngramcount', 'profile/intent.fst', '/tmp/tmpj_j5fyei']
DEBUG:root:['ngrammake', '/tmp/tmpj_j5fyei', '/tmp/tmp87fbnrta']
DEBUG:root:['ngramprint', '--ARPA', '/tmp/tmp87fbnrta']
DEBUG:root:Wrote ARPA language model to profile/language_model.txt
DEBUG:root:Wrote vocabulary to profile/vocab.txt


In [14]:
# Show generated FSTs
! ls profile/fsts/

ChangeLightColor.fst  GetGarageState.fst  GetTime.fst
ChangeLightState.fst  GetTemperature.fst


In [15]:
# Custom vocabulary
! head profile/vocab.txt

tell
what
me
the
time
is
it
turn
bedroom
light


In [16]:
# Custom ARPA language model
! head profile/language_model.txt


\data\
ngram 1=32
ngram 2=72
ngram 3=94

\1-grams:
-99	<s>	-2.911084
-0.7585103	</s>
-2.448706	tell	-2.998452
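
The `\data\` header lists how many n-grams of each order the model contains. A minimal sketch for reading those counts in Python (the function name `ngram_counts` is made up for this example):

```python
import re


def ngram_counts(arpa_lines):
    """Return {order: count} from the \\data\\ header of an ARPA file."""
    counts = {}
    in_header = False
    for line in arpa_lines:
        line = line.strip()
        if line == "\\data\\":
            in_header = True
        elif in_header:
            match = re.fullmatch(r"ngram (\d+)=(\d+)", line)
            if match:
                counts[int(match.group(1))] = int(match.group(2))
            elif line:
                break  # first non-empty, non-count line ends the header
    return counts


# Header lines copied from the output above
header = ["", "\\data\\", "ngram 1=32", "ngram 2=72", "ngram 3=94", "", "\\1-grams:"]
print(ngram_counts(header))
# prints: {1: 32, 2: 72, 3: 94}
```

An open file object can be passed directly, e.g. `ngram_counts(open("profile/language_model.txt"))`.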


## Speech Training (Pocketsphinx)

The CMU English model for [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) uses [ARPABET](https://en.wikipedia.org/wiki/Arpabet) phonemes to describe word pronunciations. A large pronunciation dictionary has been provided, and was used to generate a grapheme-to-phoneme model with [Phonetisaurus](https://github.com/AdolfVonKleist/Phonetisaurus).

Assumes the `profile` directory contains:

* `base_dictionary.txt` (pronunciation dictionary)
* `g2p.fst` (grapheme-to-phoneme model)

[Download Link](https://github.com/synesthesiam/rhasspy-profiles/releases/download/v1.0-en/)

In [18]:
# Generate custom dictionary for vocabulary
! rhasspy-vocab_dict \
    --vocab profile/vocab.txt \
    --dictionary profile/base_dictionary.txt \
    --debug > profile/dictionary.txt

DEBUG:root:Loading dictionary from profile/base_dictionary.txt
DEBUG:root:Loaded 30 word(s) from profile/vocab.txt


In [19]:
# Custom dictionary
! head profile/dictionary.txt

bedroom B EH D R UW M
blue B L UW
closed K L OW Z D
cold K OW L D
door D AO R
garage G ER AA ZH
green G R IY N
hot HH AA T
how HH AW
is IH Z
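
Each dictionary line is a word followed by its space-separated phonemes. As a sketch, such a file could be loaded into a Python mapping like this (`load_dictionary` is a hypothetical helper, and it does not handle alternate-pronunciation markers such as `word(2)` that some CMU-style dictionaries use):

```python
def load_dictionary(lines):
    """Map each word to a list of pronunciations (each a list of phonemes)."""
    pronunciations = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, phonemes = parts[0], parts[1:]
        pronunciations.setdefault(word, []).append(phonemes)
    return pronunciations


# Entries copied from the output above
sample = ["bedroom B EH D R UW M", "blue B L UW", "is IH Z"]
dictionary = load_dictionary(sample)
print(dictionary["blue"])
# prints: [['B', 'L', 'UW']]
```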


Guess unknown word pronunciations

In [20]:
%%file profile/unknown_words.txt
test
ploop
raxacoricofallipatorius

Writing profile/unknown_words.txt


In [22]:
# Guesses can be added to dictionary as:
# <WORD> <PHONEMES>
! rhasspy-vocab_g2p \
    --model profile/g2p.fst \
    < profile/unknown_words.txt | \
  jq .

{
  "test": [
    "T EH S T",
    "T AH S T",
    "T IH S T",
    "T S T",
    "T IY S T"
  ],
  "ploop": [
    "P L UW P",
    "P L OW AO P",
    "P L OW AA P",
    "P L AA AO P",
    "P L UW P IY"
  ],
  "raxacoricofallipatorius": [
    "R AE K S AH K AO R IH K AO F AE L AH P AH T AO R IY IH S",
    "R AE K S AH K AO R IY K OW F AE L AH P AH T AO R IY IH S",
    "R AE K S AH K AO R AH K OW F AE L AH P AH T AO R IY IH S",
    "R AE K S AH K AA R IH K AO F AE L AH P AH T AO R IY IH S",
    "R AE K S AH K AO R IH K OW F AE L AH P AH T AO R IY IH S"
  ]

## Speech to Text (Pocketsphinx)

Transcription of an audio segment with [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) requires three artifacts:

1. An acoustic model
    * Provided by CMU for English
2. A pronunciation dictionary
    * Generated by extracting custom vocabulary from base dictionary
3. An ARPA language model
    * Generated using [Opengrm](http://www.opengrm.org/twiki/bin/view/GRM/NGramLibrary)

Assumes `profile` contains:

* `acoustic_model` (cmusphinx-en-us-5.2, 16 kHz)
* `dictionary.txt` (from training)
* `language_model.txt` (from training)

[Download Link](https://github.com/synesthesiam/rhasspy-profiles/releases/download/v1.0-en/)

In [23]:
# Requires 16-bit, 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-pocketsphinx \
    --acoustic-model profile/acoustic_model \
    --dictionary profile/dictionary.txt \
    --language-model profile/language_model.txt | \
  jq .

{
  "text": "turn on the living room lamp",
  "transcribe_seconds": 0.11563658714294434,
  "likelihood": 0.5971662313418215
}


Decode multiple WAV files ([jsonl](http://jsonlines.org/) output)

In [62]:
%%file wav_files.txt
wav/turn_on_living_room_lamp.wav
wav/what_time_is_it.wav

Overwriting wav_files.txt


In [63]:
# Reads list of WAV files to decode from stdin
! rhasspy-pocketsphinx_wavs2text \
    --acoustic-model profile/acoustic_model \
    --dictionary profile/dictionary.txt \
    --language-model profile/language_model.txt \
    < wav_files.txt | \
  jq .

{
  "text": "turn on the living room lamp",
  "transcribe_seconds": 0.11560416221618652,
  "likelihood": 0.5345806918647441,
  "wav_name": "turn_on_living_room_lamp.wav",
  "wav_seconds": 2.402375
}
{
  "text": "what time is it",
  "transcribe_seconds": 0.17764925956726074,
  "likelihood": 0.019605291722447564,
  "wav_name": "what_time_is_it.wav",
  "wav_seconds": 2.218667
}


## Speech Training (Kaldi)

Rhasspy supports the [Kaldi](https://kaldi-asr.org) speech recognition toolkit for audio segment transcription. Both `nnet3` and `gmm` model types can be trained and used for decoding.

The [zamia](https://github.com/gooofy/zamia-speech) TDNN English model has been tested, and uses the [International Phonetic Alphabet](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) for its pronunciation dictionary.

Assumes `profile/kaldi` contains:

* `model` (nnet3)
    * `conf`
        * `mfcc_hires.conf`
    * `phones`
        * `phones.txt`
        * `nonsilence_phones.txt`
        * `silence_phones.txt`
        * `optional_silence.txt`
        * `extra_questions.txt`
    * `model`
        * `final.mdl`
        * `tree`
* `base_dictionary.txt`
* `g2p.fst`

In [34]:
# Generate custom dictionary for vocabulary
! rhasspy-vocab_dict \
    --vocab profile/vocab.txt \
    --dictionary profile/kaldi/base_dictionary.txt \
    --debug > profile/kaldi/dictionary.txt

DEBUG:root:Loading dictionary from profile/kaldi/base_dictionary.txt
DEBUG:root:Loaded 30 word(s) from profile/vocab.txt


In [35]:
# Custom dictionary
! head profile/kaldi/dictionary.txt

bedroom b 'E d r u m
blue b l 'u
closed k l 'o U z d
cold k 'o U l d
door d 'O r
garage g 3 'A Z
green g r 'i n
hot h 'A t
how h 'aU
is 'I z


Guess unknown word pronunciations

In [36]:
%%file profile/unknown_words.txt
test
ploop
raxacoricofallipatorius

Overwriting profile/unknown_words.txt


In [37]:
# Guesses can be added to dictionary as:
# <WORD> <PHONEMES>
! rhasspy-vocab_g2p \
    --model profile/kaldi/g2p.fst \
    < profile/unknown_words.txt | \
  jq .

{
  "test": [
    "t 'E s t",
    "t E s t",
    "t V s t",
    "t I s t",
    "t 'i s t"
  ],
  "ploop": [
    "p l 'u p",
    "p l u p",
    "p l o U 'O p",
    "p l 'u p 'i",
    "p l 'A A p"
  ],
  "raxacoricofallipatorius": [
    "r '{ k s V k 'A r I k O f '{ l V p V t O r i I s",
    "r '{ k s V k 'A r I k O f '{ l V p V t 'O r i I s",
    "r '{ k s V k O r 'i k o U f '{ l V p V t O r i I s",
    "r '{ k s V k O r 'i k o U f '{ l V p V t 'O r i I s",
    "r '{ k s V k 'O r I k O f '{ l V p V t O r i I s"
  ]
}


### Generate HCLG.fst

Kaldi models require an additional training step to generate a special finite state transducer (FST) named `HCLG.fst`. This FST merges acoustic, pronunciation, and language model information, allowing Kaldi to do fast transcriptions.

Assumes `profile/kaldi` contains:

* `model` (nnet3)
* `dictionary.txt` (from training)
* `language_model.txt` (from training)

In [59]:
# Generate HCLG.fst in profile/kaldi/model/graph
! rhasspy-kaldi-train \
    --model-dir profile/kaldi/model \
    --model-type nnet3 \
    --dictionary profile/kaldi/dictionary.txt \
    --language-model profile/language_model.txt \
  2>&1 | \
  tail

fstisstochastic /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/graph/HCLGa.fst 
-0.660182 -2.19459
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/model/final.mdl /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/graph/HCLGa.fst 
Preparing online decoding
/home/hansenm/opt/rhasspy-services/build/kaldi-master/egs/wsj/s5/steps/online/nnet3/prepare_online_decoding.sh --mfcc-config /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/conf/mfcc_hires.conf /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/extractor /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/model /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/online
/home/hansenm/opt/rhasspy-services/build/kaldi-master/egs/wsj/s5/ste

## Speech to Text (Kaldi)

Transcribing an audio segment with [Kaldi](https://kaldi-asr.org) requires a trained model with the following artifacts:

1. An acoustic model (`final.mdl`)
2. An acoustic-phonetic-linguistic FST (`HCLG.fst`)
3. A symbol table (`words.txt`)

Assumes `profile/kaldi` contains:

* `model` (nnet3)
    * `graph`
        * `HCLG.fst`
        * `words.txt`
    * `model`
        * `final.mdl`
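
Before decoding, it can help to confirm that training produced every file in the listing above. A minimal sketch using `pathlib`; the helper name `missing_artifacts` is hypothetical, and the relative paths are taken from that listing.

```python
from pathlib import Path

# Relative paths taken from the directory listing above
REQUIRED_FILES = ["graph/HCLG.fst", "graph/words.txt", "model/final.mdl"]


def missing_artifacts(model_dir):
    """Return the required files that do not exist under model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


missing = missing_artifacts("profile/kaldi/model")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All decoding artifacts found")
```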

In [64]:
# Requires 16-bit, 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-kaldi \
    --model-dir profile/kaldi/model \
    --model-type nnet3 | \
  jq .

{
  "text": "turn on the living room lamp",
  "wav_name": "tmpzg069bi0.wav",
  "wav_seconds": 2.412562,
  "transcribe_seconds": 0.581928
}


Decode multiple WAV files (jsonl output)

In [82]:
%%file wav_files.txt
wav/turn_on_living_room_lamp.wav
wav/what_time_is_it.wav

Overwriting wav_files.txt


In [66]:
# Reads list of WAV files to decode from stdin
! rhasspy-kaldi-decode \
    --model-dir profile/kaldi/model \
    --model-type nnet3 \
    < wav_files.txt | \
  jq .

{
  "text": "turn on the living room lamp",
  "wav_name": "turn_on_living_room_lamp.wav",
  "wav_seconds": 2.402375,
  "transcribe_seconds": 0.47668
}
{
  "text": "what time is it",
  "wav_name": "what_time_is_it.wav",
  "wav_seconds": 2.218667,
  "transcribe_seconds": 0.440229
}


## Intent Recognition (fsticuffs)

Assumes `profile` contains:

* `intent.fst` (from training)

In [51]:
# Sentences are read line-by-line from stdin
# when --text-input is given.
! echo 'turn on the living room lamp' | \
  rhasspy-fsticuffs \
    --intent-fst profile/intent.fst \
    --text-input | \
  cut -d' ' -f2- | \
  jq .

{
  "text": "turn on the living room lamp",
  "intent": {
    "name": "ChangeLightState",
    "confidence": 1
  },
  "entities": [
    {
      "entity": "state",
      "value": "on",
      "raw_value": "on",
      "start": 5,
      "end": 7
    },
    {
      "entity": "name",
      "value": "living room lamp",
      "raw_value": "living room lamp",
      "start

In [53]:
# Do fuzzy matching (usually slower).
# Skip over any unknown words.
! echo "would you please turn on that good ol' living room lamp of mine" | \
  rhasspy-fsticuffs \
    --intent-fst profile/intent.fst \
    --skip-unknown \
    --fuzzy \
    --text-input | \
  cut -d' ' -f2- | \
  jq .

{
  "text": "turn on living room lamp",
  "intent": {
    "name": "ChangeLightState",
    "confidence": 1
  },
  "entities": [
    {
      "entity": "state",
      "value": "on",
      "raw_value": "on",
      "start": 5,
      "end": 7
    },
    {
      "entity": "name",
      "value": "living room lamp",
      "raw_value": "living room lamp",
      "start"