# Rhasspy Command-Line Tools

Rhasspy's services can be controlled with command-line tools. Many of these tools are designed for Unix pipelines, so they are strict about what they read from `stdin` and write to `stdout`.

The available tools can be grouped by their function:

* Wake word
    * Detect wake word in an audio stream
* Voice command
    * Detect start and stop of voice commands in an audio stream
* Training
    * Generate speech/intent recognition artifacts
* Speech to text
    * Transcribe an audio segment
* Intent recognition
    * Convert text to structured JSON event

## GStreamer

GStreamer provides tools and plugins for constructing audio/video transformation pipelines.

Rhasspy's audio tools (wake word, voice command, etc.) expect a specific audio format (16-bit, 16 kHz, mono PCM). GStreamer can convert audio from a variety of sources into this format. Audio can even be streamed over a network, allowing Rhasspy to receive microphone input remotely.
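
To check whether a WAV file already matches this format before building a conversion pipeline, Python's standard `wave` module is enough. This is a minimal sketch and not part of Rhasspy: the helper names `is_rhasspy_format` and `make_silence` are hypothetical, and the in-memory WAV data only stands in for a real recording.

```python
import io
import wave


def is_rhasspy_format(wav_file) -> bool:
    """Return True if the WAV data is 16-bit, 16 kHz, mono PCM."""
    with wave.open(wav_file, "rb") as wav:
        return (
            wav.getsampwidth() == 2  # 2 bytes per sample = 16-bit
            and wav.getframerate() == 16000  # 16 kHz
            and wav.getnchannels() == 1  # mono
        )


def make_silence(rate: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a small in-memory 16-bit WAV of silence for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))
    buffer.seek(0)
    return buffer


print(is_rhasspy_format(make_silence(16000, 1)))  # 16 kHz mono
print(is_rhasspy_format(make_silence(44100, 2)))  # 44.1 kHz stereo
```

A file that fails this check needs the `audioconvert` and `audioresample` steps used in the pipelines in this document.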

In [1]:
# Make sure you have gstreamer installed.
# If not, run:
# sudo apt-get install gstreamer1.0-pulseaudio gstreamer1.0-tools gstreamer1.0-plugins-good

!which gst-launch-1.0

/usr/bin/gst-launch-1.0


In [2]:
# Get a list of installed plugins

! gst-inspect-1.0

video4linux2:  v4l2deviceprovider (GstDeviceProviderFactory)
video4linux2:  v4l2radio: Radio (video4linux2) Tuner
video4linux2:  v4l2sink: Video (video4linux2) Sink
video4linux2:  v4l2src: Video (video4linux2) Source
pocketsphinx:  pocketsphinx: PocketSphinx
gtk:  gtkglsink: Gtk GL Video Sink
gtk:  gtksink: Gtk Video Sink
transcode:  uritranscodebin: Pipeline object
transcode:  transcodebin: Generic bin
nle:  nleurisource: GNonLin URI Source
nle:  nleoperation: GNonLin Operation
nle:  nlecomposition: GNonLin Composition
nle:  nlesource: GNonLin Source
libvisual:  libvisual_oinksie: libvisual oinksie plugin plugin v.0.1
libvisual:  libvisual_lv_scope: libvisual libvisual scope plugin v.0.1
libvisual:  libvisual_lv_analyzer: libvisual libvisual analyzer plugin v.1.0
libvisual:  libvisual_jakdaw: libvisual Jakdaw plugin plugin v.0.0.1
libvisual:  libvisual_infinite: libvisual infinite plugin plugin v.0.1
libvisual:  libvisual_corona: libvisual libvisual corona plugin plu

In [3]:
# See details for a specific plugin

! gst-inspect-1.0 filesrc

Factory Details:
  Rank                     primary (256)
  Long-name                File Source
  Klass                    Source/File
  Description              Read from arbitrary point in a file
  Author                   Erik Walthinsen <omega@cse.ogi.edu>

Plugin Details:
  Name                     coreelements
  Description              GStreamer core elements
  Filename                 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstcoreelements.so
  Version                  1.14.1
  License                  LGPL
  Source module            gstreamer
  Source release date      2018-05-17
  Binary package           GStreamer (Ubuntu)
  Origin URL               https://launchpad.net/distros/ubuntu/+source/gstreamer1.0

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseSrc
                         +----GstFileSrc

Implemented Interfaces:
  GstURIHandler

Pad Templates:
  SRC template: 'src'


In [11]:
# Play an audio file using a pipeline.
# You can also just use gst-play-1.0 <FILE>

! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    autoaudiosink

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstPulseSinkClock
Got EOS from element "pipeline0".
Execution ended after 0:00:01.889378819
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...


## Wake Word (Porcupine)

Rhasspy uses [porcupine](https://github.com/Picovoice/Porcupine) to detect wake words. By default, the `rhasspy-porcupine` command expects an audio stream on `stdin` and listens for the word "porcupine".

In [14]:
# Requires 16-bit 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/porcupine.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-porcupine

{"index": 0, "keyword": "/etc/porcupine/porcupine.ppn"}
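
Each detection is printed as one JSON object. The following is a minimal sketch of consuming that line in Python. It assumes, based only on the single example above, that `keyword` holds the path of the matched keyword file.

```python
import json
from pathlib import PurePosixPath

# Output line copied from the example above
line = '{"index": 0, "keyword": "/etc/porcupine/porcupine.ppn"}'

event = json.loads(line)
# Use the keyword file name without its ".ppn" suffix as a readable label
keyword_name = PurePosixPath(event["keyword"]).stem
print(event["index"], keyword_name)
```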


## Voice Commands (webrtcvad)

Rhasspy uses [webrtcvad](https://github.com/wiseman/py-webrtcvad) to detect speech. Combined with some heuristics, `rhasspy-webrtcvad` will detect when a voice command starts and stops. This command expects an audio stream on `stdin`.

In [16]:
# Requires 16-bit 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-webrtcvad

speech {"seconds": 0.06}
command-start {"seconds": 0.36}
silence {"seconds": 4.5}
speech {"seconds": 4.619999999999999}
silence {"seconds": 4.859999999999998}
command-stop {"seconds": 5.3399999999999945}
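
Each output line is an event name followed by a JSON payload. As a sketch of how a downstream script might use these events, the snippet below parses the lines shown above and computes the length of the detected command. The function name `parse_events` is hypothetical.

```python
import json

# Output lines copied from the example above
OUTPUT = """\
speech {"seconds": 0.06}
command-start {"seconds": 0.36}
silence {"seconds": 4.5}
speech {"seconds": 4.619999999999999}
silence {"seconds": 4.859999999999998}
command-stop {"seconds": 5.3399999999999945}
"""


def parse_events(text):
    """Yield (event_name, seconds) pairs from '<name> <json>' lines."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, payload = line.split(" ", maxsplit=1)
        yield name, json.loads(payload)["seconds"]


events = list(parse_events(OUTPUT))
start = next(seconds for name, seconds in events if name == "command-start")
stop = next(seconds for name, seconds in events if name == "command-stop")
print(f"command lasted {stop - start:.2f} s")
```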


## Training

Rhasspy's voice commands are specified in advance in a file named `sentences.ini`, which contains simplified [JSGF grammars](https://www.w3.org/TR/jsgf/) grouped by intent. The training process involves:

1. Extracting the JSGF grammars
2. Converting them to finite state transducers (FSTs)
3. Merging the FSTs into `intent.fst`
4. Converting `intent.fst` to an [ARPA language model](https://cmusphinx.github.io/wiki/arpaformat/)

Assumes the `profile` directory contains:

* `sentences.ini`
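
To illustrate what "grouped by intent" means, here is a minimal sketch that reads an ini-style file with one section per intent. The file content is hypothetical (it reuses intent names and sentences that appear elsewhere in this document), and parsing with Python's `configparser` is an assumption made for illustration, not a description of how `rhasspy-ini_jsgf` works internally.

```python
import configparser

# Hypothetical sentences.ini content: one [Intent] section per intent,
# one sentence template per line
SENTENCES_INI = """
[GetTime]
what time is it

[ChangeLightState]
turn on the living room lamp
"""

# allow_no_value lets each line be a bare key with no "=" or ":" part
parser = configparser.ConfigParser(allow_no_value=True)
parser.optionxform = str  # keep sentence text as written instead of lower-casing it
parser.read_string(SENTENCES_INI)

sentences_by_intent = {
    intent: list(parser[intent]) for intent in parser.sections()
}
print(sentences_by_intent)
```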

In [19]:
# Generate grammars from sentences.ini
! rhasspy-ini_jsgf \
    --ini-file profile/sentences.ini \
    --grammar-dir profile/grammars \
    --debug

DEBUG:root:Loaded ini file
DEBUG:root:Wrote profile/grammars/GetTime.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/GetTemperature.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/GetGarageState.gram (1 rule(s))
DEBUG:root:Wrote profile/grammars/ChangeLightState.gram (3 rule(s))
DEBUG:root:Wrote profile/grammars/ChangeLightColor.gram (3 rule(s))


In [84]:
# Show generated grammars
! ls profile/grammars/

ChangeLightColor.gram  GetGarageState.gram  GetTime.gram
ChangeLightState.gram  GetTemperature.gram


In [22]:
# Convert grammars to FSTs and a language model
! rhasspy-jsgf_fst_arpa \
    --grammar-dir profile/grammars \
    --fst-dir profile/fsts \
    --fst profile/intent.fst \
    --vocab profile/vocab.txt \
    --arpa profile/language_model.txt \
    --debug

DEBUG:root:Parsing JSGF grammar ChangeLightColor.gram
DEBUG:root:Parsing JSGF grammar ChangeLightState.gram
DEBUG:root:Parsing JSGF grammar GetTemperature.gram
DEBUG:root:Parsing JSGF grammar GetGarageState.gram
DEBUG:root:Parsing JSGF grammar GetTime.gram
DEBUG:root:Processing ChangeLightColor
DEBUG:root:Processing ChangeLightState
DEBUG:root:Processing GetTemperature
DEBUG:root:Processing GetGarageState
DEBUG:root:Processing GetTime
DEBUG:root:Wrote intent FST to profile/intent.fst
DEBUG:root:Generated FSTs in 0.08637738227844238 second(s)
DEBUG:root:['ngramcount', 'profile/intent.fst', '/tmp/tmp_szqhjlh']
DEBUG:root:['ngrammake', '/tmp/tmp_szqhjlh', '/tmp/tmpx5o362es']
DEBUG:root:['ngramprint', '--ARPA', '/tmp/tmpx5o362es']
DEBUG:root:Wrote ARPA language model to profile/language_model.txt
DEBUG:root:Wrote vocabulary to profile/vocab.txt


In [85]:
# Show generated FSTs
! ls profile/fsts/

ChangeLightColor.fst  GetGarageState.fst  GetTime.fst
ChangeLightState.fst  GetTemperature.fst


In [86]:
# Custom vocabulary
! head profile/vocab.txt

make
set
the
bedroom
light
red
green
blue
to
turn


In [88]:
# Custom ARPA language model
! head profile/language_model.txt


\data\
ngram 1=32
ngram 2=72
ngram 3=94

\1-grams:
-99	<s>	-2.911084
-0.7585103	</s>
-1.670555	make	-2.913011
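
The `\data\` header of an ARPA file lists how many n-grams of each order the model contains. A small sketch of reading those counts from the header shown above (the function name `ngram_counts` is hypothetical):

```python
import re

# Header copied from the `head` output above
ARPA_HEAD = """\
\\data\\
ngram 1=32
ngram 2=72
ngram 3=94

\\1-grams:
"""


def ngram_counts(arpa_text):
    """Return {order: count} from the \\data\\ section of an ARPA model."""
    counts = {}
    in_data = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line == "\\data\\":
            in_data = True
        elif in_data:
            match = re.fullmatch(r"ngram (\d+)=(\d+)", line)
            if match:
                counts[int(match.group(1))] = int(match.group(2))
            elif line:
                break  # first non-empty, non-count line ends the header
    return counts


print(ngram_counts(ARPA_HEAD))
```

For the model above this reports 32 unigrams, 72 bigrams, and 94 trigrams.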


## Speech Training (Pocketsphinx)

The CMU English model for [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) uses [ARPABET](https://en.wikipedia.org/wiki/Arpabet) phonemes to describe word pronunciations. A large pronunciation dictionary has been provided, and was used to generate a grapheme-to-phoneme model with [Phonetisaurus](https://github.com/AdolfVonKleist/Phonetisaurus).

Assumes the `profile` directory contains:

* `base_dictionary.txt` (pronunciation dictionary)
* `g2p.fst` (grapheme-to-phoneme model)

[Download Link](https://github.com/synesthesiam/rhasspy-profiles/releases/download/v1.0-en/)

In [27]:
# Generate custom dictionary for vocabulary
! rhasspy-vocab_dict \
    --vocab profile/vocab.txt \
    --dictionary profile/base_dictionary.txt \
    --debug > profile/dictionary.txt

DEBUG:root:Loading dictionary from profile/base_dictionary.txt
DEBUG:root:Loaded 30 word(s) from profile/vocab.txt


In [89]:
# Custom dictionary
! head profile/dictionary.txt

bedroom B EH D R UW M
blue B L UW
closed K L OW Z D
cold K OW L D
door D AO R
garage G ER AA ZH
green G R IY N
hot HH AA T
how HH AW
is IH Z
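
Each dictionary line is a word followed by its space-separated phonemes. The sketch below loads such text into a mapping and formats an entry back into the same `<WORD> <PHONEMES>` layout. The function names are hypothetical, and storing a list of pronunciations per word is a design choice of this example, so that a word may have more than one pronunciation.

```python
from collections import defaultdict

# First lines of the dictionary, copied from the output above
DICTIONARY = """\
bedroom B EH D R UW M
blue B L UW
closed K L OW Z D
"""


def load_dictionary(text):
    """Map each word to a list of pronunciations (phoneme lists)."""
    pronunciations = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if parts:
            word, phonemes = parts[0], parts[1:]
            pronunciations[word].append(phonemes)
    return pronunciations


def format_entry(word, phonemes):
    """Format one dictionary line as '<WORD> <PHONEMES>'."""
    return f"{word} {' '.join(phonemes)}"


dictionary = load_dictionary(DICTIONARY)
print(dictionary["blue"])
print(format_entry("blue", dictionary["blue"][0]))
```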


Guess unknown word pronunciations

In [39]:
%%file profile/unknown_words.txt
test
ploop
raxacoricofallipatorius

Writing profile/unknown_words.txt


In [41]:
# Guesses can be added to dictionary as:
# <WORD> <PHONEMES>
! rhasspy-vocab_g2p \
    --model profile/g2p.fst \
    < profile/unknown_words.txt | \
  jq .

{
  "test": [
    "T EH S T"
  ],
  "ploop": [
    "P L UW P"
  ],
  "raxacoricofallipatorius": [
    "R AE K S AH K AO R IH K AO F AE L AH P AH T AO R IY IH S"
  ]
}


## Speech to Text (Pocketsphinx)

Transcription of an audio segment with [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) requires three artifacts:

1. An acoustic model
    * Provided by CMU for English
2. A pronunciation dictionary
    * Generated by extracting custom vocabulary from base dictionary
3. An ARPA language model
    * Generated using [Opengrm](http://www.opengrm.org/twiki/bin/view/GRM/NGramLibrary)

Assumes `profile` contains:

* `acoustic_model` (cmusphinx-en-us-5.2 16 kHz)
* `dictionary.txt` (from training)
* `language_model.txt` (from training)

[Download Link](https://github.com/synesthesiam/rhasspy-profiles/releases/download/v1.0-en/)

In [76]:
# Requires 16-bit 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-pocketsphinx \
    --acoustic-model profile/acoustic_model \
    --dictionary profile/dictionary.txt \
    --language-model profile/language_model.txt | \
  jq .

{
  "text": "turn on the living room lamp",
  "transcribe_seconds": 0.25699281692504883,
  "likelihood": 0.014070227820922574
}


Decode multiple WAV files (jsonl output)

In [80]:
%%file wav_files.txt
wav/turn_on_living_room_lamp.wav
wav/what_time_is_it.wav

Writing wav_files.txt


In [81]:
# Reads list of WAV files to decode from stdin
! rhasspy-pocketsphinx_wavs2text \
    --acoustic-model profile/acoustic_model \
    --dictionary profile/dictionary.txt \
    --language-model profile/language_model.txt \
    < wav_files.txt

{"text": "turn on the living room lamp", "transcribe_seconds": 0.26385045051574707, "likelihood": 0.01129020513516577, "wav_name": "turn_on_living_room_lamp.wav", "wav_seconds": 4.206563}
{"text": "what time is it", "transcribe_seconds": 0.09502935409545898, "likelihood": 0.5766246077811629, "wav_name": "what_time_is_it.wav", "wav_seconds": 1.225}
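
Because the output is JSON Lines (one JSON object per line), it is easy to post-process. As a sketch, the snippet below parses the two lines shown above and computes a real-time factor for each file, that is, transcription time divided by audio duration. The real-time factor is a metric added for this example, not something the tool reports.

```python
import json

# Output lines copied from the example above
JSONL = """\
{"text": "turn on the living room lamp", "transcribe_seconds": 0.26385045051574707, "likelihood": 0.01129020513516577, "wav_name": "turn_on_living_room_lamp.wav", "wav_seconds": 4.206563}
{"text": "what time is it", "transcribe_seconds": 0.09502935409545898, "likelihood": 0.5766246077811629, "wav_name": "what_time_is_it.wav", "wav_seconds": 1.225}
"""

results = [json.loads(line) for line in JSONL.splitlines() if line.strip()]

for result in results:
    # Real-time factor below 1 means decoding ran faster than the audio plays
    rtf = result["transcribe_seconds"] / result["wav_seconds"]
    print(f'{result["wav_name"]}: "{result["text"]}" (RTF {rtf:.2f})')
```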


## Speech Training (Kaldi)

Rhasspy supports the [Kaldi](https://kaldi-asr.org) speech recognition toolkit for audio segment transcription. Both `nnet3` and `gmm` model types can be trained and used for decoding.

The [zamia](https://github.com/gooofy/zamia-speech) TDNN English model has been tested, and uses the [International Phonetic Alphabet](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) for its pronunciation dictionary.

Assumes `profile/kaldi` contains:

* `model` (nnet3)
    * `conf`
        * `mfcc_hires.conf`
    * `phones`
        * `nonsilence_phones.txt`
        * `silence_phones.txt`
        * `optional_silence.txt`
        * `extra_questions.txt`
    * `model`
        * `final.mdl`
        * `tree`
* `base_dictionary.txt`
* `g2p.fst`

In [51]:
# Generate custom dictionary for vocabulary
! rhasspy-vocab_dict \
    --vocab profile/vocab.txt \
    --dictionary profile/kaldi/base_dictionary.txt \
    --debug > profile/kaldi/dictionary.txt

DEBUG:root:Loading dictionary from profile/kaldi/base_dictionary.txt
DEBUG:root:Loaded 30 word(s) from profile/vocab.txt


In [52]:
# Custom dictionary
! head profile/kaldi/dictionary.txt

bedroom b 'E d r u m
blue b l 'u
closed k l 'o U z d
cold k 'o U l d
door d 'O r
garage g 3 'A Z
green g r 'i n
hot h 'A t
how h 'aU
is 'I z


Guess unknown word pronunciations

In [53]:
%%file profile/unknown_words.txt
test
ploop
raxacoricofallipatorius

Overwriting profile/unknown_words.txt


In [54]:
# Guesses can be added to dictionary as:
# <WORD> <PHONEMES>
! rhasspy-vocab_g2p \
    --model profile/kaldi/g2p.fst \
    < profile/unknown_words.txt | \
  jq .

{
  "test": [
    "t 'E s t"
  ],
  "ploop": [
    "p l 'u p"
  ],
  "raxacoricofallipatorius": [
    "r '{ k s V k 'A r I k O f '{ l V p V t O r i I s"
  ]
}


### Generate HCLG.fst

Kaldi models require an additional training step to generate a special finite state transducer (FST) named `HCLG.fst`. This FST merges acoustic, pronunciation, and language model information, allowing Kaldi to do fast transcriptions.

Assumes `profile/kaldi` contains:

* `model` (nnet3)
* `dictionary.txt` (from training)
* `language_model.txt` (from training)

In [66]:
# Generate HCLG.fst in profile/kaldi/model/graph
! rhasspy-kaldi-train \
    --model-dir profile/kaldi/model \
    --model-type nnet3 \
    --dictionary profile/kaldi/dictionary.txt \
    --language-model profile/language_model.txt

Cleaning up
Generating lexicon
/kaldi/egs/wsj/s5/utils/prepare_lang.sh /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict  /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/lang /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang
Checking /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict/silence_phones.txt ...
--> reading /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict/silence_phones.txt is OK

Checking /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict/optional_silence.txt ...
--> reading /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/local/dict/option

--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OK

Checking /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/oov.txt
--> /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/oov.int corresponds to /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/oov.txt
--> /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/oov.{txt, int} are OK

--> /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/L.fst is olabel sorted
--> /home/hansenm/opt/rhasspy-services/docs/notebooks/profile/kaldi/model/data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory /home/ha

## Speech to Text (Kaldi)

Transcribing an audio segment with [Kaldi](https://kaldi-asr.org) requires a trained model with the following artifacts:

1. An acoustic model (`final.mdl`)
2. An acoustic-phonetic-linguistic FST (`HCLG.fst`)
3. A symbol table (`words.txt`)

Assumes `profile/kaldi` contains:

* `model` (nnet3)
    * `graph`
        * `HCLG.fst`
        * `words.txt`
    * `model`
        * `final.mdl`

In [69]:
# Requires 16-bit 16 kHz mono audio
! gst-launch-1.0 \
    filesrc location=wav/turn_on_living_room_lamp.wav ! \
    decodebin ! \
    audioconvert ! \
    audioresample ! \
    audio/x-raw, rate=16000, channels=1, format=S16LE ! \
    filesink location=/dev/stdout | \
  rhasspy-kaldi \
    --model-dir profile/kaldi/model \
    --model-type nnet3 | \
  jq .

{
  "text": "turn on the living room lamp",
  "wav_name": "tmpr8rxbd3e.wav",
  "wav_seconds": 4.21675,
  "transcribe_seconds": 0.747169
}


Decode multiple WAV files (jsonl output)

In [82]:
%%file wav_files.txt
wav/turn_on_living_room_lamp.wav
wav/what_time_is_it.wav

Overwriting wav_files.txt


In [83]:
# Reads list of WAV files to decode from stdin
! rhasspy-kaldi-decode \
    --model-dir profile/kaldi/model \
    --model-type nnet3 \
    < wav_files.txt

{"text":"turn on the living room lamp","wav_name":"turn_on_living_room_lamp.wav","wav_seconds":4.206563,"transcribe_seconds":0.715335}
{"text":"what time is it","wav_name":"what_time_is_it.wav","wav_seconds":1.225,"transcribe_seconds":0.208314}


## Intent Recognition (fsticuffs)

Assumes `profile` contains:

* `intent.fst` (from training)

In [71]:
# Sentences are read line-by-line from stdin
! echo 'turn on the living room lamp' | \
  rhasspy-fsticuffs \
    --intent-fst profile/intent.fst | \
  jq .

{
  "text": "turn on the living room lamp",
  "intent": {
    "name": "ChangeLightState",
    "confidence": 1
  },
  "entities": [
    {
      "entity": "state",
      "value": "on",
      "raw_value": "on",
      "start": 5,
      "end": 7
    },
    {
      "entity": "name",
      "value": "living room lamp",
      "raw_value": "living room lamp",
      "start

In [70]:
# Do fuzzy matching (usually slower).
# Skip over any unknown words.
! echo "would you please turn on that good ol' living room lamp of mine" | \
  rhasspy-fsticuffs \
    --intent-fst profile/intent.fst \
    --skip-unknown \
    --fuzzy | \
  jq .

{
  "text": "turn on living room lamp",
  "intent": {
    "name": "ChangeLightState",
    "confidence": 1
  },
  "entities": [
    {
      "entity": "state",
      "value": "on",
      "raw_value": "on",
      "start": 5,
      "end": 7
    },
    {
      "entity": "name",
      "value": "living room lamp",
      "raw_value": "living room lamp",
      "start"