## Introduction

`kaia` is an assistant that relies on other projects for its end-user functionality. All code in the `kaia` module runs on the assistant device: a Raspberry Pi 4 with a connected medium-sized LED display and a conference mic. It can, of course, also run on a regular PC.

The backbone of `kaia` is `eaglesong`. `kaia.core` defines Driver and Interpreter, a standard way to connect `eaglesong` to the medium. In particular, the driver connects to:
* Either a full-fledged Rhasspy that has access to the sound devices
* Or an alternative `audio_control` service, my own implementation that controls the mic and the headphones. Compared with Rhasspy, AudioControl is much smaller and hence makes it easier to experiment with additional functionality, e.g. speaker recognition.

AudioControl/Rhasspy feeds every recognized utterance into the `eaglesong` loop. Interpreter sends audio outputs to AudioControl/Rhasspy for playback, while image and text outputs are sent to the internal `KaiaGuiServer`. All this runs in a Docker container on the assistant device. Outside the container, Firefox runs in kiosk mode. It fetches the non-audio messages from the web server and presents them to the user.

To prevent Driver and Interpreter from turning into oversized catch-all classes, lots of functionality is detached from them and placed into Translators. Translators are `eaglesong` wrappers placed between the "main" routine `KaiaAssistant` and the driver/interpreter; their function is to pre- and post-process the inputs and outputs of `KaiaAssistant`. Voiceover, for example, is done exactly this way: if `KaiaAssistant` produces a string, the translator intercepts it, sends it to `avatar` and turns it into audio.
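
The wrapping idea can be illustrated with a minimal sketch in plain Python. It does not use the real `eaglesong` or `avatar` APIs: `Audio`, `fake_voiceover`, `voiceover_translator` and `assistant` are all hypothetical names chosen for illustration, and the routine is modelled as a simple generator function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Union


# Hypothetical stand-ins: none of these names come from `eaglesong` or `kaia`.
@dataclass
class Audio:
    """Placeholder for synthesized speech."""
    text: str


Output = Union[str, Audio]
Routine = Callable[[str], Iterable[Output]]


def fake_voiceover(text: str) -> Audio:
    # Stands in for a request to `avatar`; the real call is not shown here.
    return Audio(text=text)


def voiceover_translator(inner: Routine, voiceover: Callable[[str], Audio]) -> Routine:
    """Wrap `inner` so that every string it produces is replaced by audio."""
    def wrapped(message: str) -> Iterable[Output]:
        for output in inner(message):
            if isinstance(output, str):
                # Post-process: intercept the string and voice it over.
                yield voiceover(output)
            else:
                # Anything else (e.g. already audio) passes through untouched.
                yield output
    return wrapped


def assistant(message: str) -> Iterable[Output]:
    # A trivial "main" routine that knows nothing about voiceover.
    yield f"You said: {message}"


translated = voiceover_translator(assistant, fake_voiceover)
result = list(translated("hello"))
print(result)
```

The point of the design is that `assistant` stays unaware of voiceover, so the same routine can be run with or without the wrapper.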

`KaiaAssistant` is an `eaglesong` routine that does not depend on `avatar`, `brainbox` or anything else. The main job of `KaiaAssistant` is to dispatch the incoming messages to the _skills_, such as the current time skill, the weather skill etc. So the skills are the main content producers of the system, while `KaiaAssistant`, translators, driver, interpreter, web server and the web page inside Firefox are infrastructural means to deliver this content to the user.

The simplest skill is `SingleLineKaiaSkill`: such skills process one input, return one output and that's all, so there are no lasting conversations. They are the easiest to write, and lots of examples are given in the `kaia.skills` folder.
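
As a rough illustration of dispatching to one-input, one-output skills, here is a minimal sketch in plain Python. The classes `SingleLineSkill` and `Assistant` are hypothetical simplifications, not the real `SingleLineKaiaSkill` or `KaiaAssistant`, and the keyword matching is only an assumption made to keep the example short.

```python
from typing import Callable, List, Optional


class SingleLineSkill:
    """Hypothetical one-input, one-output skill (not the real `SingleLineKaiaSkill`)."""

    def __init__(self, keyword: str, reply: Callable[[], str]):
        self.keyword = keyword
        self.reply = reply

    def try_handle(self, utterance: str) -> Optional[str]:
        # Answer only if the utterance mentions this skill's keyword.
        if self.keyword in utterance.lower():
            return self.reply()
        return None


class Assistant:
    """Hypothetical dispatcher: the first skill that accepts the utterance answers."""

    def __init__(self, skills: List[SingleLineSkill]):
        self.skills = skills

    def handle(self, utterance: str) -> str:
        for skill in self.skills:
            answer = skill.try_handle(utterance)
            if answer is not None:
                return answer
        # No skill recognized the utterance.
        return "Sorry, I did not understand that."


assistant = Assistant([
    SingleLineSkill("time", lambda: "It is 12:30."),
    SingleLineSkill("weather", lambda: "It is sunny."),
])

print(assistant.handle("What time is it?"))
print(assistant.handle("How is the weather?"))
```

Because each skill keeps no state between calls, adding a new one only means appending it to the list.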

Sometimes, however, we do want a lasting conversation. This is possible, and there is already a pilot implementation in `NotificationSkill`, but it is still very new and leaves many open questions, such as "what to do if one lasting conversation is interrupted by another". The proper orchestration of such long-lasting conversations is still pending.

Now, to the testing of all this. It can be done on several levels:
* Each skill can be tested with the standard `eaglesong` testing means.
* `KaiaAssistant` with some skills can also be tested with the standard `eaglesong` means, to check that the skills interact properly.
* `KaiaAssistant` with skills and translators can be tested with `eaglesong` means if mock web servers for Avatar and Brainbox are set up. Both services come with a `*TestApi` that allows running them in place within tests.
* The system as a whole, including Brainbox, Avatar and AudioControl, can be tested thanks to the mock functionality of AudioControl, which fakes a mic and processes WAV files.
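
The first level, testing a single skill, can be sketched as a scenario of input/expected-output pairs. This example uses Python's standard `unittest` instead of the `eaglesong` testing means, whose API is not shown here. The function `greeting_skill` and the `SCENARIO` list are hypothetical and exist only for illustration.

```python
import unittest
from typing import List, Optional, Tuple


def greeting_skill(utterance: str) -> Optional[str]:
    """Hypothetical single-line skill: one input, at most one output."""
    if "hello" in utterance.lower():
        return "Hello to you too!"
    return None


# Each scenario step pairs an input utterance with the expected reply.
SCENARIO: List[Tuple[str, Optional[str]]] = [
    ("Hello, Kaia", "Hello to you too!"),
    ("What time is it?", None),
]


class GreetingSkillTest(unittest.TestCase):
    def test_scenario(self):
        for utterance, expected in SCENARIO:
            # subTest reports each failing step separately.
            with self.subTest(utterance=utterance):
                self.assertEqual(greeting_skill(utterance), expected)


def run_tests() -> unittest.TestResult:
    # Run the test case programmatically and collect the result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreetingSkillTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


outcome = run_tests()
print(outcome.wasSuccessful())
```

The same scenario shape (say this, expect that) carries over to the higher levels. Only the thing under test changes.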

### Demo

A tested and runnable variant of the assistant is placed in the `my/demo` folder. It contains a minimal working example of Kaia.