This notebook demonstrates how to create paraphrases for Kaia. Essentially all the utterances Kaia emits can be paraphrased, which greatly increases the "feel alive" factor.

First, you need to extract the templates from your assistant. I will extract them from the demo version and take three of them:
* One without parameters
* One with parameters, but without words in agreement with parameters
* One with parameters and words in agreement

In [1]:
from kaia.app.kaia_driver_settings import AssistantFactory
from foundation_kaia.misc import Loc

factory = AssistantFactory(None)
assistant = factory.create_assistant(None)
all_templates = assistant.get_replies()

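# Keep one template of each kind; later matches overwrite earlier ones.
templates = [None, None, None]
for template in all_templates:
    s = str(template)
    if '{' in s:  # the template has parameters
        if '[' in s:  # ... and words in agreement with them
            templates[2] = template
        else:
            templates[1] = template
    else:  # no parameters at all
        templates[0] = template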

for template in templates:
    print(template)

Template: Exception has occured. The skill was interrupted, but assistant runs normally
Template: The characters available are: {character_list}
Template: You need to wait for the timer. {minutes} [minute|minutes] remaining
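
To make the third kind of template concrete, here is a minimal sketch of how `{minutes} [minute|minutes]` could be rendered. This is not Kaia's template engine: the `render` helper is hypothetical, and it assumes that a `[singular|plural]` group agrees with the closest preceding parameter.

```python
import re

# Hypothetical helper, not part of Kaia: fills {name} placeholders and
# resolves [singular|plural] groups against the closest preceding placeholder.
TOKEN = re.compile(r"\{(\w+)\}|\[([^|\]]+)\|([^\]]+)\]")


def render(template: str, **values) -> str:
    last_value = None

    def substitute(match: re.Match) -> str:
        nonlocal last_value
        name, singular, plural = match.groups()
        if name is not None:
            # A {name} placeholder: remember its value for later agreement.
            last_value = values[name]
            return str(last_value)
        # An agreement group: pick the form matching the last value seen.
        return singular if last_value == 1 else plural

    return TOKEN.sub(substitute, template)


template = "You need to wait for the timer. {minutes} [minute|minutes] remaining"
print(render(template, minutes=1))
print(render(template, minutes=5))
```

The point of the sketch is that a paraphrase has to preserve both the parameter and the agreement group, otherwise the reply cannot be rendered correctly for every value.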


Then, you need to define your characters. 

*Note*: Currently, we have several `Character` classes across Avatar and Chara. At some point, I hope to reorganize the code and have only one `Character` class that defines a character completely: the personality, the samples for voice cloning, the samples for LoRA images, etc. But right now we have what we have.

In [2]:
from avatar.daemon import Character

mountain = Character(
    "Mountain",
    Character.Gender.Masculine,
    "Mountain is the spirit of Mountains. He is old and full of knowledge of the word. His replies are short and sharp"
)

ocean = Character(
    "Ocean",
    Character.Gender.Feminine,
    "Ocean is the spirit of the Ocean. She is old, kind and enigmatic, and her replies are wordy and emotional"
)

You also need to define the users. Note that if you plan to have several users, you will also need to set up SpeakerIdentification so that Kaia knows who is talking to the system.

In [3]:
eagle = Character(
    "Eagle",
    Character.Gender.Masculine,
    "Eagle is an eagle. He likes spending time in the mountains, enjoying the view from above"
)

salmon = Character(
    "Salmon",
    Character.Gender.Feminine,
    "Salmon is a salmon. She likes spending time in the deeps of the ocean, exploring the ancient wisdom"
)

Now, you may also want to define the relationship between the characters and users:

In [4]:
from chara.paraphrasing import ParaphraseCase
from foundation_kaia.prompters import Referrer, Prompter

o = Referrer[ParaphraseCase]()

relationships = {
    "warm": Prompter(f'{o.ref.user} and {o.ref.character} have a warm relationship'),
    "cold": Prompter(f'{o.ref.user} and {o.ref.character} just met very recently and their relationship is a bit cold')
}

user_to_character_to_relationships = {
    'Eagle': {
        'Mountain': 'warm',
        'Ocean': 'cold'
    },
    'Salmon': {
        'Mountain': 'cold',
        'Ocean': 'warm'
    }
}

That is enough to build _cases_: all the tasks for paraphrasing. Each case contains a template, a user, a character, and their relationship.


If you already have some generated paraphrases, you may do _complementary generation_: generate paraphrases for the templates you recently added to Kaia, as well as more paraphrases for templates that were heavily used. In this case, set the variables `existing_paraphrases` and `paraphrases_feedback` to the contents of your `paraphrases.pkl` and `paraphrases-feedback.json` files respectively.

In [5]:
from chara.paraphrasing import ParaphraseSetup


existing_paraphrases = []
paraphrases_feedback = {}
setup = ParaphraseSetup(templates, [salmon, eagle], [ocean, mountain], user_to_character_to_relationships, relationships)
source_cases = setup.create_new_cases()
prehistory = setup.create_prehistoric_cases(existing_paraphrases, paraphrases_feedback)
len(source_cases)

12
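
The number 12 is simply 3 templates × 2 users × 2 characters. The sketch below reproduces that count in plain Python, with strings and dictionaries standing in for the real `Template`, `Character` and `Prompter` objects. It only illustrates the idea and is not how `ParaphraseSetup` is implemented.

```python
from itertools import product

# Plain-string stand-ins for the real objects (assumption for illustration).
templates = ["no parameters", "with parameters", "with parameters and agreement"]
users = ["Salmon", "Eagle"]
characters = ["Ocean", "Mountain"]

relationship_texts = {
    "warm": "{user} and {character} have a warm relationship",
    "cold": "{user} and {character} just met very recently and their relationship is a bit cold",
}
user_to_character_to_relationships = {
    "Eagle": {"Mountain": "warm", "Ocean": "cold"},
    "Salmon": {"Mountain": "cold", "Ocean": "warm"},
}

cases = []
for template, user, character in product(templates, users, characters):
    # Look up the relationship key for this pair and fill in the names.
    key = user_to_character_to_relationships[user][character]
    cases.append({
        "template": template,
        "user": user,
        "character": character,
        "relationship": relationship_texts[key].format(user=user, character=character),
    })

print(len(cases))
print(cases[-1]["relationship"])
```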

In [6]:
prompt_template = ParaphraseCase.get_paraphrase_template()
print(prompt_template(source_cases[-1]))

You work on helping with the deep, artistic personalization of kitchen voice assistant. The core feature of this assistant is stylization for visual novels: assistant acts like a character, and interactions with the user need to follow the personaliries of the character and the user, as well as their current relationship.

The character's name is Mountain. Mountain is the spirit of Mountains. He is old and full of knowledge of the word. His replies are short and sharp

The user's name is Eagle. Eagle is an eagle. He likes spending time in the mountains, enjoying the view from above

Eagle and Mountain have a warm relationship

You're currently working on one of the assistant replies.

# Context

The circumstances are following: Eagle is asking what is the next step in the recipe, but currently Eagle needs to wait for the timer to finish the previous step, and Mountain informs him about this fact.

# The Format

The current, default answers are:

* You need to wait for the timer. {minut

Now we need a functioning BrainBox API:

In [7]:
from brainbox import BrainBox

api = BrainBox.Api('127.0.0.1:8090')

We also need to define a pipeline for paraphrasing, where:
1. `initial_step` just contains all the cases
2. `choosing_step` selects the cases that are currently least represented
3. `mapping_step` runs the LLM with the corresponding prompts
4. `parsing_step` tries to restore the template from the LLM answer: some outputs will fail, as the LLM sometimes doesn't obey the restrictions regarding variables.
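
The idea behind the choosing step can be sketched in a few lines of plain Python. This is an illustration only: the function name is hypothetical, and it assumes that representation is counted per (user, character) pair, which may differ from what `create_representation_step` actually does.

```python
from collections import Counter


def choose_least_represented(cases, already_done, count):
    """Return `count` cases that have the fewest existing paraphrases."""
    seen = Counter(already_done)
    # sorted() is stable, so cases with equal counts keep their original order.
    return sorted(cases, key=lambda case: seen[case])[:count]


cases = [
    ("Eagle", "Mountain"),
    ("Eagle", "Ocean"),
    ("Salmon", "Mountain"),
    ("Salmon", "Ocean"),
]
# Pairs for which paraphrases were generated in earlier iterations.
already_done = [
    ("Eagle", "Mountain"),
    ("Eagle", "Mountain"),
    ("Eagle", "Ocean"),
    ("Salmon", "Mountain"),
]

print(choose_least_represented(cases, already_done, 1))
```

Choosing this way keeps the generated paraphrases evenly spread across cases, even when each iteration only processes a handful of them.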

As running the LLM takes quite some time, I will set the number of cases in each iteration to 1. You may set as many as you are comfortable with.

In [8]:
from brainbox.flow import *
from chara.paraphrasing import LLMParsingStep
from pathlib import Path

initial_step = ConstantStep(source_cases)
choosing_step = ParaphraseSetup.create_representation_step(1)
prompt = ParaphraseCase.get_paraphrase_template()
mapping_step = BrainBoxMappingStep(
    api,
    BrainBoxMapping(
        PromptBasedObjectConverter(
            prompt,
            'mistral-small',
        ),
        SimpleApplicator('answer'),
    )
)
parsing_step = LLMParsingStep()
temp_path = Path('temp/paraphrases')

flow = Flow(
    temp_path,
    [
        initial_step,
        choosing_step,
        mapping_step,
        parsing_step
    ],
    prehistory = prehistory
)
flow.reset()
flow.run(1)

TOTAL 0
0 records -> Start ConstantStep, 0/4 -> 12 records
12 records -> Start SortedRepresentationStep, 1/4 -> 1 records
1 records -> Start BrainBoxMappingStep, 2/4 -> 1 records
1 records -> Start LLMParsingStep, 3/4 -> 1 records
TOTAL 1
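
In this run the single record passed the parsing step. A record may be dropped there when the LLM renames or loses a variable. A minimal sketch of such a check is shown below, assuming variables are written as `{name}`; the `keeps_variables` helper is hypothetical and is not the implementation of `LLMParsingStep`.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def keeps_variables(original: str, paraphrase: str) -> bool:
    """True if the paraphrase uses exactly the same {variables} as the original."""
    return sorted(PLACEHOLDER.findall(original)) == sorted(PLACEHOLDER.findall(paraphrase))


original = "The characters available are: {character_list}"
good = "Here is everyone you can talk to: {character_list}"
bad = "Here is everyone you can talk to: {characters}"

print(keeps_variables(original, good))
print(keeps_variables(original, bad))
```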


Great! Now export the paraphrases:

In [11]:
output = Path('temp/paraphrases.pkl')
ParaphraseSetup.export(temp_path, output)

You may read the records from the file:

In [16]:
from yo_fluq import FileIO

records = FileIO.read_pickle(output)
print(records[0].template)

Template: *Ah, dear Salmon, it seems our conversation has hit a snag, but fear not, for I am still here, ready to guide you through the depths of wisdom.*
