## Preparing samples for generation

First, you need to find samples of the voice you want to clone: 8-10 files of 3-4 seconds each are more than enough. In fact, even 3-4 files may suffice, because the output stabilizes quickly.
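
To check candidate files against the 3-4 second guideline, something like the following may help. It is a minimal sketch using only the standard library, assuming the samples are uncompressed WAV files; `wav_duration_seconds` and `pick_samples` are hypothetical helper names and not part of `kaia`.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def pick_samples(folder: Path, min_seconds: float = 3.0, max_seconds: float = 4.0) -> list[Path]:
    """Return the WAV files in `folder` whose duration lies within the given range."""
    return [
        path
        for path in sorted(folder.glob("*.wav"))
        if min_seconds <= wav_duration_seconds(path) <= max_seconds
    ]
```

Calling `pick_samples(Path('some/folder'))` on a folder of candidates returns only the files of suitable length; the bounds are parameters, so they can be loosened if too few files pass.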

Guidelines for selecting samples:
* Try to use voicelines from computer games. They have no background noise or music.
* Don't pick the most expressive samples. For some reason, TortoiseTTS exaggerates expressiveness, so you may end up with something absurd.
* A little background noise, such as a single footstep, is okay.
* Voice effects, such as those used for Death Prophet in Dota, are not okay. Tortoise wasn't really able to pick up the voice in this case.
* You must normalize the volume. Samples that are too loud cause TortoiseTTS to produce white noise, screaming or other irrelevant output. You can use `kaia.ml.voice_cloning.data_prep.volume_normalization` to do this.
* For samples with heavy background noise or background music, you might be successful if you first clean them (source separation), using, e.g.
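
To illustrate the volume normalization mentioned in the list above, here is a minimal peak-normalization sketch built on the standard library. It is not the implementation of `kaia.ml.voice_cloning.data_prep.volume_normalization`; the function names, the restriction to 16-bit PCM WAV files, and the default `target_peak` of 0.5 are all assumptions made for the example.

```python
import array
import sys
import wave
from pathlib import Path


def normalize_peak(samples: array.array, target_peak: float = 0.5) -> array.array:
    """Scale 16-bit samples so the loudest one sits at `target_peak` of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array.array("h", samples)  # pure silence: nothing to scale
    gain = target_peak * 32767 / peak
    # Clamp to the int16 range in case target_peak is set above 1.0
    return array.array("h", (max(-32768, min(32767, round(s * gain))) for s in samples))


def normalize_wav(source: Path, target: Path, target_peak: float = 0.5) -> None:
    """Read a 16-bit PCM WAV file, peak-normalize it and write the result to `target`."""
    with wave.open(str(source), "rb") as wav:
        params = wav.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    normalized = normalize_peak(samples, target_peak)
    if sys.byteorder == "big":
        normalized.byteswap()
    with wave.open(str(target), "wb") as wav:
        wav.setparams(params)
        wav.writeframes(normalized.tobytes())
```

Peak normalization is the simplest option: it guarantees that no sample exceeds the chosen level, but it does not equalize perceived loudness between files.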

I will use Lina's voice from Dota 2. I was extremely lucky with this voice: everything about it is perfect, including intonation and volume. Unfortunately, my experience shows that not all voices, even those from computer games, are that good, and many of them need some hacking. Moreover, for some voices Tortoise seems to work worse in general, no matter what you do with the samples.

If you're preparing samples from, e.g., YouTube video files: I couldn't quickly find a free utility that lets you select a fragment of a video file and export its audio to an audio file in one click. So I designed the following pipeline instead:
1. Use a video editor, e.g. Kdenlive, to cut out the good fragments and remove the bad ones.
2. Keep the good fragments separated by pauses of more than 2 seconds.
3. Extract a `wav` file from the resulting video file.
4. Use `kaia.ml.voice_cloning.data_prep.audio_cutter` to cut this audio file into individual files.
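
The reason for the pauses of more than 2 seconds is that they make the fragment boundaries easy to detect automatically. The idea can be sketched as follows; this is not the code of `kaia.ml.voice_cloning.data_prep.audio_cutter`, and the function name and the amplitude `threshold` of 500 (on a 16-bit scale) are assumptions chosen for illustration.

```python
def split_on_silence(samples, sample_rate, min_pause_seconds=2.0, threshold=500):
    """Return (start, end) sample index pairs of the non-silent fragments.

    A stretch of audio separates two fragments only if every sample in it
    stays below `threshold` for at least `min_pause_seconds`.
    """
    min_pause = int(min_pause_seconds * sample_rate)
    fragments = []
    start = None      # index where the current fragment begins, if any
    last_loud = None  # index of the most recent loud sample
    for i, sample in enumerate(samples):
        if abs(sample) < threshold:
            continue
        if start is None:
            start = i
        elif i - last_loud > min_pause:
            # The silent gap was long enough: close the previous fragment
            fragments.append((start, last_loud + 1))
            start = i
        last_loud = i
    if start is not None:
        fragments.append((start, last_loud + 1))
    return fragments
```

Each returned pair can be used to slice the sample sequence, and the slices can then be written to separate files, for instance with the standard `wave` module. Shorter pauses inside a fragment, such as breaths between words, do not split it.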

Now, let's assume you already have the samples and move on to upsampling: generating voicelines with TortoiseTTS.

First, you need to export the samples to Tortoise.

```python
from kaia.brainbox.deciders.docker_based import TortoiseTTSSettings
from pathlib import Path

settings = TortoiseTTSSettings()
settings.export_voice_for_tortoise('lina', Path('files/voice'))
```

```python
from kaia.ml.voice_cloning.data_prep.task_generator import generate_tasks
from kaia.infra import Loc, FileIO

golden_set = FileIO.read_json('files/golden_set.json')
texts = [s['text'] for s in golden_set]
tasks = generate_tasks(texts[:3], ['lina'])

tasks.intermediate_tasks[:2]
```

```
({'id': 'id_0a62e2ce86124bbf80a5397af2d6a61a', 'decider': 'TortoiseTTS', 'arguments': {'voice': 'lina', 'text': 'Trollocs were usually cowards in their way, preferring strong odds and easy kills.'}, 'dependencies': None, 'back_track': None, 'batch': None, 'decider_method': None, 'decider_parameters': None},
 {'id': 'id_0063c929b8ab424b8872808f22393f06', 'decider': 'TortoiseTTS', 'arguments': {'voice': 'lina', 'text': 'The Deathwatch Guard has charge of my safety, but you have charge of the defense of this camp.'}, 'dependencies': None, 'back_track': None, 'batch': None, 'decider_method': None, 'decider_parameters': None})
```

Afterwards, you need to start the BrainBox server, add the tasks, wait for the results (this can take days for large batches) and download them.

```python
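from pathlib import Path

from kaia.brainbox import BrainBox

# Connect to the BrainBox server on the local machine and queue the tasks
api = BrainBox().create_api('127.0.0.1')
api.add(tasks)

# Once the tasks are finished, paste the ID of the Collector's task here
api.download(api.get_result('id_09fa356482dd442ca0a5697b3efa9294'), Path('files/voicelines.zip'))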
```
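
Once the archive is on disk, it can be unpacked with the standard library. The sketch below assumes that `files/voicelines.zip` is a regular zip archive; its internal layout is not described here, so the function simply extracts everything and reports the file names. `unpack_voicelines` is a hypothetical helper, not part of `kaia`.

```python
import zipfile
from pathlib import Path


def unpack_voicelines(archive: Path, target: Path) -> list[str]:
    """Extract `archive` into `target` and return the names of the extracted files."""
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return [info.filename for info in zf.infolist() if not info.is_dir()]
```

For example, `unpack_voicelines(Path('files/voicelines.zip'), Path('files/voicelines'))` would place the contents next to the archive, where the target folder name is again only an illustration.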