Webapp using a node-tree structure to interact with and teach users how to floss, with Coqui-ai STT and Google TTS. Built for the MäRI project at Arcada, 2022.
The application runs on the Arcada robot Snow. Its purpose is to instruct users about dental hygiene, covering flossing and brushing techniques with both text and speech. The speech is generated using Google TTS, and for flossing a video is also shown for clarity.
The user may select which technique to learn about using touch input or speech. Speech input is captured with WebRTC and converted to plain text using Coqui-ai.
After the instructions, the user is prompted with questions that should be answered by touch or speech. Finally, the user may choose to learn more or exit the application.
```mermaid
flowchart LR
    subgraph Flossing
        direction LR
        f1[Monologue] --> f2[Video]
        f2 --> f3[Monologue]
        f3 --> f4{trickQuestion}
        f4 -->|A| f5
        f4 -->|B| f5
        f4 -->|C| f5
        f5[Monologue] --> f6
        f6{Question}
        f6 -->|A| f7[Monologue]
        f6 -->|B| f7[Monologue]
        f6 -->|C| f8[Monologue]
        f9[Monologue]
        f7 --> f9
        f8 --> f9
    end
    subgraph Brushing
        direction LR
        b1[Monologue] --> b2[Monologue]
        b2 --> b3[Monologue]
        b3 --> b4[Monologue]
        b4 --> b5[Monologue]
        b5 --> b6[Monologue]
        b6 --> b7[Question]
        b7 -->|A| b9[Monologue]
        b7 -->|B| b10[Monologue]
        b9 --> b11
        b10 --> b11
        b11[Monologue] --> b12[Monologue]
    end
    a[Start] --> b[Presentation]
    b --> c{Question}
    c --> Flossing --> c
    c --> Brushing --> c
    c --> exit[EndTree]
```
- Framework holding the application
- Holds the "wavey" design in the footer
- Imports all scripts
- Objects holding the application's text, used primarily in tree.js
- Available manuscripts are in English and Swedish
- Stores all main functions, outside of STT
Functions of note:
- The TTS API uses SSML, so the text should be within `<speak>` tags
- textToSpeech uses the variable `context` and calls the function playAudio, which it "remembers", i.e. a closure
- Returns a function called textToSpeech that we can save to a variable and call when needed
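The closure pattern above can be sketched as follows. This is a minimal illustration, not the actual implementation: `makeTextToSpeech`, the `playAudio` callback, and the `lang` argument are assumed names.

```javascript
// Sketch of the closure: makeTextToSpeech captures a playAudio callback
// (and a language) and returns a textToSpeech function that "remembers"
// them. Names are assumptions for illustration.
function makeTextToSpeech(playAudio, lang) {
  return function textToSpeech(text) {
    // The TTS API uses SSML, so wrap the text in <speak> tags.
    const ssml = `<speak>${text}</speak>`;
    // In the real app this would query for an audio file; here we just
    // hand the SSML to the remembered playAudio callback.
    return playAudio(ssml, lang);
  };
}

// Usage: save the returned function to a variable and call it when needed.
const spoken = [];
const textToSpeech = makeTextToSpeech((ssml) => spoken.push(ssml), "en");
textToSpeech("Floss once a day.");
// spoken[0] === "<speak>Floss once a day.</speak>"
```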
- `currentNode` keeps track of the active node
- Child nodes are set on the parent node to progress the interaction
- Each dialogue goes through `textToSpeech()` in speech.js to query for audio files
- Audio user input is handled with WebRTC and Coqui-ai
- User input outside the scope of the child nodes gets passed through `nodeStart()` with `currentNode`
- Depending on class and user input, a child node is set to `currentNode`
- The new `currentNode` is activated with `nodeStart()`
- The tree ends with the node class `EndTree`; `EndTree` refreshes the page with a reload
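A minimal sketch of this traversal, under assumed names and shapes (the real `nodeStart` also triggers TTS and class-based interaction, which are omitted here):

```javascript
// currentNode keeps track of the active node.
let currentNode = null;

// Activating a node: in the real app, textToSpeech(node.tts) and the
// interaction step would run here as well.
function nodeStart(node) {
  currentNode = node;
  return node;
}

// An answer selects which child becomes the new currentNode; input
// outside the scope of the child nodes re-runs the current node.
function answer(choice) {
  const child = currentNode.children[choice];
  if (child) return nodeStart(child);
  return nodeStart(currentNode);
}

// Hypothetical two-node tree for illustration.
const q = { tts: "Floss daily?", children: { A: { tts: "Correct!", children: {} } } };
nodeStart(q);
answer("A");
console.log(currentNode.tts); // "Correct!"
```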
- When a node has passed through TTS, it goes to interaction, which handles the node based on its class and forwards it to the appropriate functionality
- Gets the result from STT or user touch input and sets the next node depending on the answer
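The class-based forwarding can be sketched as a dispatch on the node's class; the five class names come from the class diagram below, but the return values here are stand-ins, not the real handlers.

```javascript
// Hedged sketch of the interaction step: after TTS, the node is forwarded
// based on its class. Return values are placeholders for the real handlers.
function interaction(node) {
  switch (node.constructor.name) {
    case "Question":
    case "trickQuestion":
      return "await-answer"; // wait for STT or touch input
    case "Monologue":
      return "next";         // proceed straight to the child node
    case "Video":
      return "play-video";   // open the modal, start/stop with timers
    case "EndTree":
      return "reload";       // EndTree refreshes the page
    default:
      return "ignore";
  }
}

// Minimal stand-in classes for illustration.
class Monologue {}
class EndTree {}
console.log(interaction(new Monologue())); // "next"
console.log(interaction(new EndTree()));   // "reload"
```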
Code for the interaction tree with five node classes.

```mermaid
classDiagram
    direction LR
    class Question {
        2-3 answers
        2-3 child nodes
    }
    class trickQuestion {
        2-3 answers
    }
    class Monologue {
        Only TTS
    }
    class Video {
        Plays a video in modal
        Start & stop video with timers
        Mutes video if given TTS
    }
    class EndTree {
        Ends interaction
    }
    Question .. trickQuestion
    trickQuestion .. Monologue
    Monologue .. Video
    Video .. EndTree
```
- First make a parent node
- Then create the child node
- Finally, set the child node on the parent node
- Additional parameters, like video, are set as needed
- All node classes but `Video` require a string for TTS
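The steps above can be sketched with a generic node; the real constructors (Monologue, Question, Video, ...) and their exact signatures may differ, so `Node` and `setChild` here are assumptions for illustration.

```javascript
// Minimal sketch of building the tree, assuming a generic Node class
// with a setChild helper. Not the project's actual constructors.
class Node {
  constructor(cls, tts) {
    this.cls = cls;      // node class, e.g. "Monologue" or "Video"
    this.tts = tts;      // TTS string (required for all classes but Video)
    this.children = [];
  }
  setChild(child) {
    this.children.push(child);
    return child;
  }
}

// 1. First make a parent node.
const intro = new Node("Monologue", "Welcome! Let's learn to floss.");
// 2. Then create the child node (Video needs no TTS string).
const video = new Node("Video");
// 3. Finally, set the child node on the parent node.
intro.setChild(video);

console.log(intro.children.length); // 1
```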
- Initiates and sets up the application
- Import after the other scripts but before the WebRTC module
- soundmeter.js to gauge sound volume
- audio.js for handling the mic and STT
- socket.io implementation to stream audio to the interpreting server
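The streaming pattern can be illustrated as follows; the event names (`audio-chunk`, `audio-end`) and the socket object are stand-ins for the project's actual socket.io setup, which also involves WebRTC's microphone capture.

```javascript
// Hedged sketch of the STT streaming flow: microphone audio arrives in
// chunks and each chunk is emitted over a socket to the interpreting
// server, followed by an end marker. Event names are assumptions.
function streamAudio(socket, chunks) {
  for (const chunk of chunks) {
    socket.emit("audio-chunk", chunk); // forward each audio chunk
  }
  socket.emit("audio-end");            // tell the server we are done
}

// Fake socket that records emitted events, to illustrate the flow
// without a real socket.io connection.
const sent = [];
const fakeSocket = { emit: (event, data) => sent.push([event, data]) };
streamAudio(fakeSocket, [new Uint8Array([1, 2]), new Uint8Array([3, 4])]);
console.log(sent.length); // 3
```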
- video tutorial
- images
- Styling and animations
Finalized
- V1 for robot Alf with gestures
- V2 for robot Snow (current)


