## Introduction

BrainBox is a service that:

* hosts _Deciders_, representations of ML services such as Oobabooga or Automatic1111;
* provides unified API access to the deciders;
* starts and stops the deciders, managing the GPU load;
* queues incoming tasks and executes them asynchronously, storing the results in a database.

In this scenario, BrainBox is expected to run constantly on a GPU machine and to execute the tasks whenever the machine is not busy.

In an alternative scenario, BrainBox proxies the models without starting or stopping them. This provides logging and caching of all incoming and outgoing data, as well as a unified interface.

Let's configure a local BrainBox and run some tasks on it. 

In [1]:
from kaia.brainbox import BrainBoxSettings, BrainBoxTestApi, BrainBoxTask
from kaia.brainbox.deciders.fake_image_generator import FakeImageDecider

#A dict of deciders maps each decider name to its decider object
deciders = {'FakeImageDecider': FakeImageDecider()}

#Settings hold the configuration of everything ML-related, including the settings of the supported models. We can alter them
settings = BrainBoxSettings()
settings.brain_box_web_port = 8091

#This line boots up BrainBox in test mode. It works as a real web server and is brought down when the `with` block exits.
with BrainBoxTestApi(deciders) as api:
    task = BrainBoxTask(id = BrainBoxTask.safe_id(), decider = 'FakeImageDecider', arguments = dict(prompt='Some prompt'), decider_method = None)
    result = api.execute(task)
    #Fetch the content of the first file into the file object
    api.pull_content(result[0])
    #Copy the second file to a local folder and keep its local path
    downloaded_path = api.download(result[1])

To run the task, BrainBox determines the appropriate decider and calls either `__call__` (if `decider_method` is `None`, which is the default) or the specified method. The `arguments` are passed to this method as keyword arguments.
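The dispatch rule can be illustrated with a minimal sketch. It is not BrainBox's actual implementation: `EchoDecider` and `dispatch` are hypothetical names introduced only for this example, and the sketch assumes that `arguments` are unpacked as keyword arguments.

```python
class EchoDecider:
    """Hypothetical stand-in for a decider; not part of BrainBox."""

    def __call__(self, prompt):
        return f"called with {prompt}"

    def shout(self, prompt):
        return prompt.upper()


def dispatch(decider, decider_method, arguments):
    """Use `__call__` when no method name is given, else the named method."""
    method = decider if decider_method is None else getattr(decider, decider_method)
    return method(**arguments)


default_result = dispatch(EchoDecider(), None, dict(prompt="Some prompt"))
named_result = dispatch(EchoDecider(), "shout", dict(prompt="Some prompt"))
print(default_result)  # called with Some prompt
print(named_result)    # SOME PROMPT
```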


### Returned files


`FakeImageDecider`, like Automatic1111, returns a list of file objects. These objects may or may not carry content; by default, BrainBox doesn't return content. If content is needed, retrieve it with `api.pull_content`.

In [2]:
print(result[0].content is None, result[1].content is None)

False True


Alternatively, you may use `api.download` to copy a file from the server to a local folder; the method returns the path of the file on the local machine.

In [3]:
from kaia.infra import FileIO

FileIO.read_bytes(downloaded_path)

b'{"prompt": "Some prompt", "option_index": 1, "model": null}'

### Multiple tasks and asynchronous execution

The `execute` method also accepts a list of tasks; in this case `result` is a list in which each element holds the result of the corresponding task.

We can also execute tasks asynchronously:

In [4]:
with BrainBoxTestApi(deciders) as api:
    task = BrainBoxTask(id = BrainBoxTask.safe_id(), decider = 'FakeImageDecider', arguments = dict(prompt='Some prompt'))
    api.add(task)
    print('Do something while BrainBox is executing our task')
    result = api.join(task)

result

Do something while BrainBox is executing our task


[File(id_c5b48a21613e4a32bb76e059c4b53c82.output.0.json),
 File(id_c5b48a21613e4a32bb76e059c4b53c82.output.1.json),
 File(id_c5b48a21613e4a32bb76e059c4b53c82.output.2.json),
 File(id_c5b48a21613e4a32bb76e059c4b53c82.output.3.json)]
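
The `add` / `join` pattern itself can be shown without a BrainBox server. The sketch below is only an analogy built on the standard library: `TinyQueueApi` and `slow_square` are hypothetical names, and the real service queues tasks in a database rather than in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import time


class TinyQueueApi:
    """Hypothetical stand-in: `add` returns at once, `join` waits for the result."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._futures = {}

    def add(self, task_id, func, **arguments):
        # Submitting does not block; the work runs in a background thread
        self._futures[task_id] = self._pool.submit(func, **arguments)

    def join(self, task_id):
        # Blocks until the task has finished, then returns its result
        return self._futures[task_id].result()

    def close(self):
        self._pool.shutdown(wait=True)


def slow_square(x):
    time.sleep(0.1)
    return x * x


queue_api = TinyQueueApi()
queue_api.add("task-1", slow_square, x=7)
# The caller is free to do other work here
joined = queue_api.join("task-1")
queue_api.close()
print(joined)  # 49
```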

We can poll the task status:

In [5]:
from pprint import pformat

with BrainBoxTestApi(deciders) as api:
    task = BrainBoxTask(id = BrainBoxTask.safe_id(), decider = 'FakeImageDecider', arguments = dict(prompt='Some prompt'))
    api.add(task)
    job_state = api.get_job(task.id)
    result = api.join(task)

print(pformat({key:value for key, value in job_state.__dict__.items() if not key.startswith('_')}))

{'accepted': False,
 'accepted_timestamp': None,
 'arguments': {'prompt': 'Some prompt'},
 'assigned': False,
 'assigned_timestamp': None,
 'back_track': None,
 'batch': None,
 'decider': 'FakeImageDecider',
 'decider_parameters': None,
 'dependencies': None,
 'error': None,
 'finished': False,
 'finished_timestamp': None,
 'has_dependencies': False,
 'id': 'id_04122acdf5ea415ba7d79e0566c73abd',
 'log': None,
 'method': None,
 'progress': None,
 'ready': False,
 'ready_timestamp': None,
 'received_timestamp': datetime.datetime(2024, 8, 7, 11, 30, 18, 412337),
 'result': None,
 'success': False}


## Warming up and cooling down

When a decider warms up, it accepts a string parameter that may affect its behavior. The most obvious use case for the parameter is the base model for Automatic1111 or Oobabooga.

In this implementation, the parameter of a running decider cannot be changed: the decider has to be cooled down and then warmed up again. Since most of the time is spent loading the model anyway, this is not a big problem.

A `Planer` determines which deciders to warm up or cool down, and which tasks to execute on the deciders that are currently up. The simple planner used in `BrainBoxTestApi` selects the decider for the longest-waiting task, warms the decider up, executes all tasks that match this decider, and then cools the decider down. Other planners with more sophisticated logic can be implemented.

Let's see how parameters work in action:
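
The planning strategy described above can be sketched in plain Python. This is a simplified model under stated assumptions, not the planner shipped with BrainBox: `QueuedTask` and `plan` are hypothetical names, and the sketch assumes that a task "matches" a warmed-up decider when both the decider name and the warm-up parameter are equal.

```python
from dataclasses import dataclass


@dataclass
class QueuedTask:
    id: str
    decider: str
    parameter: str
    received: int  # smaller value means the task has been waiting longer


def plan(queue):
    """Return an event log of warm-ups, runs and cool-downs."""
    pending = sorted(queue, key=lambda t: t.received)
    log = []
    while pending:
        # The longest-waiting task decides what to warm up next
        oldest = pending[0]
        key = (oldest.decider, oldest.parameter)
        log.append(("warmup",) + key)
        # Run every pending task that matches the warmed-up decider
        for task in [t for t in pending if (t.decider, t.parameter) == key]:
            log.append(("run", task.id))
        log.append(("cooldown",) + key)
        pending = [t for t in pending if (t.decider, t.parameter) != key]
    return log


queue = [
    QueuedTask("t1", "FakeImageDecider", "a", received=1),
    QueuedTask("t2", "FakeImageDecider", "b", received=2),
    QueuedTask("t3", "FakeImageDecider", "a", received=3),
]
events = plan(queue)
for event in events:
    print(event)
```

In this sketch `t3` runs before `t2` although it arrived later, because it shares the warmed-up model `a` with `t1`. Grouping tasks this way keeps the number of expensive warm-ups low.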

In [6]:
test_api = BrainBoxTestApi(deciders)
with test_api as api:
    tasks = []
    for index in range(3):
        for model in ['a','b','c']:
            tasks.append(BrainBoxTask(
                id = BrainBoxTask.safe_id(), 
                decider = 'FakeImageDecider', 
                arguments = dict(prompt=f'{model}/{index}'), 
                decider_parameters=model))
    results = api.execute(tasks)
    files = [api.download(result[0]) for result in results]

In [7]:
import pandas as pd

pd.DataFrame([FileIO.read_json(file) for file in files])

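| | prompt | option_index | model |
|---|---|---|---|
| 0 | a/0 | 0 | a |
| 1 | b/0 | 0 | b |
| 2 | c/0 | 0 | c |
| 3 | a/1 | 0 | a |
| 4 | b/1 | 0 | b |
| 5 | c/1 | 0 | c |
| 6 | a/2 | 0 | a |
| 7 | b/2 | 0 | b |
| 8 | c/2 | 0 | c |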


We can see that indeed the tasks were called with different models. 

## Better task definitions

Brainbox API provides several ways to define a `BrainBoxTask` object. Let's create a custom decider and see how we can define the corresponding tasks:

In [8]:
from kaia.brainbox.core import IDecider

class Test(IDecider):
    def __init__(self):
        self.model = None
    
    def warmup(self, parameters):
        self.model = parameters

    def cooldown(self, parameters):
        self.model = parameters

    def run(self, a, b=1):
        #Use the model stored by `warmup`, not a global variable named `model`
        return dict(a=a, b=b, model=self.model)

tasks = [
    BrainBoxTask(decider='Test', decider_method='run', decider_parameters='my_model', arguments = dict(a=5)),
    BrainBoxTask(decider=Test, decider_method='run', decider_parameters='my_model', arguments = dict(a=6)),
    BrainBoxTask(decider=Test.run, decider_parameters='my_model', arguments = dict(a=5)),
    BrainBoxTask.call(Test).run(a=5).to_task('my_model'),
]

results, _ = BrainBoxTestApi.execute_serverless(tasks,dict(Test=Test()))
    
for result in results:
    print(result.result)

*** 0
*** 4
{'a': 5, 'b': 1, 'model': 'my_model'}
{'a': 6, 'b': 1, 'model': 'my_model'}
{'a': 5, 'b': 1, 'model': 'my_model'}
{'a': 5, 'b': 1, 'model': 'my_model'}


All four forms define the same kind of task (only the value of `a` differs in the second one). The last two variants also check that the arguments are correct:

In [9]:
import traceback

print(BrainBoxTask.call(Test).run(a=5).to_task().arguments)
print(BrainBoxTask.call(Test).run(a=5, b=3).to_task().arguments)
try:
    print(BrainBoxTask.call(Test).run(b=1).to_task().arguments)
except ValueError as e:
    print(e)
try:
    print(BrainBoxTask.call(Test).run(x=5, a=5).to_task().arguments)
except ValueError as e:
    print(e)

{'a': 5}
{'a': 5, 'b': 3}
The following arguments are missing: a
Unexpected argument x
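
A check of this kind can be built on `inspect.signature` from the standard library. The sketch below is an assumption about how such validation could work, not BrainBox's actual code: `check_arguments` and `Example` are hypothetical names, the error messages merely imitate the output above, and methods with `*args` or `**kwargs` are not handled.

```python
import inspect


def check_arguments(method, arguments):
    """Raise ValueError if `arguments` cannot be used to call `method`."""
    parameters = [
        p for name, p in inspect.signature(method).parameters.items()
        if name != "self"
    ]
    known = {p.name for p in parameters}
    unexpected = [name for name in arguments if name not in known]
    if unexpected:
        raise ValueError("Unexpected argument " + ", ".join(unexpected))
    # A parameter without a default value must be supplied explicitly
    missing = [
        p.name for p in parameters
        if p.default is inspect.Parameter.empty and p.name not in arguments
    ]
    if missing:
        raise ValueError("The following arguments are missing: " + ", ".join(missing))
    return dict(arguments)


class Example:
    def run(self, a, b=1):
        return dict(a=a, b=b)


def try_check(arguments):
    """Return the validated arguments, or the error message on failure."""
    try:
        return check_arguments(Example.run, arguments)
    except ValueError as e:
        return str(e)


ok = try_check(dict(a=5, b=3))
missing_error = try_check(dict(b=1))
unexpected_error = try_check(dict(x=5, a=5))
print(ok)
print(missing_error)
print(unexpected_error)
```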


In most cases, you don't need to write `to_task` at the end unless you want to pass additional parameters. When I was developing BrainBox, I kept forgetting to write it, so I augmented the `BrainBoxApi` methods to forgive this mistake. However, this is not a standard, and code outside of the `BrainBoxApi` methods may fail if you omit `to_task`.

## Media libraries and task dependencies

In an on-premise hosting setup, we can't count on continuous availability of the GPU. Hence, one strategy is to generate an excess amount of content in advance. BrainBox supports this scenario with the `Collector` decider and the `MediaLibrary` format.

In [10]:
from kaia.brainbox.deciders.collector import Collector

deciders = {'FakeImageDecider': FakeImageDecider(), 'Collector': Collector()}

with BrainBoxTestApi(deciders) as api:
    tags = {}
    dependencies = {}
    for i in range(5):
        task = BrainBoxTask.call(FakeImageDecider)(prompt='Some prompt').to_task()
        tags[task.id] = dict(index=i)
        dependencies[task.id] = task.id
        api.add(task)

    collection_task = BrainBoxTask.call(Collector).to_media_library(tags=tags).to_task(dependencies = dependencies)
    result = api.execute([collection_task])
    library = api.download(result[0])
           

What happens here? BrainBox processes the key-value pairs in `dependencies` by assigning the result of the task with id `value` to the argument named `key` of the decider's method. In our case, `key` equals `value`, but in other scenarios the result of one decider must be injected as a particular argument of another decider, hence the dictionary.

`Collector` then assembles all the outputs into one zip file, a media library. This media library also keeps the tags associated with the files. `Collector` can process any decider's output as long as it is a list of files.
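
The dependency mechanism can be sketched as a plain dictionary merge. This is an illustration only, under the assumption that injected results are simply added to the explicit arguments; `resolve_arguments`, the task ids and the `image` / `scale` arguments are hypothetical.

```python
def resolve_arguments(arguments, dependencies, finished_results):
    """Merge explicit arguments with the results of the tasks in `dependencies`.

    `dependencies` maps an argument name to the id of the task
    whose result is injected under that name.
    """
    resolved = dict(arguments)
    for argument_name, task_id in dependencies.items():
        if task_id not in finished_results:
            raise KeyError(f"Task {task_id} has not finished yet")
        resolved[argument_name] = finished_results[task_id]
    return resolved


finished_results = {"id_1": ["file_1.json"], "id_2": ["file_2.json"]}

# Same shape as in the collector example: every key equals its value
collector_arguments = resolve_arguments(
    arguments=dict(tags={"id_1": dict(index=0), "id_2": dict(index=1)}),
    dependencies={"id_1": "id_1", "id_2": "id_2"},
    finished_results=finished_results,
)

# A different shape: the result of task `id_1` becomes the argument `image`
upscaler_arguments = resolve_arguments(
    arguments=dict(scale=2),
    dependencies={"image": "id_1"},
    finished_results=finished_results,
)
print(upscaler_arguments)  # {'scale': 2, 'image': ['file_1.json']}
```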

In [11]:
from kaia.brainbox import MediaLibrary

lib = MediaLibrary.read(library)
lib.to_df().head(5)

| | index | option_index | filename | timestamp | job_id |
|---|---|---|---|---|---|
| 0 | 0 | 0 | id_791917168c79486aafe900cf69f09384.output.0.json | 2024-08-07 11:30:26.825840 | id_791917168c79486aafe900cf69f09384 |
| 1 | 0 | 1 | id_791917168c79486aafe900cf69f09384.output.1.json | 2024-08-07 11:30:26.825840 | id_791917168c79486aafe900cf69f09384 |
| 2 | 0 | 2 | id_791917168c79486aafe900cf69f09384.output.2.json | 2024-08-07 11:30:26.825840 | id_791917168c79486aafe900cf69f09384 |
| 3 | 0 | 3 | id_791917168c79486aafe900cf69f09384.output.3.json | 2024-08-07 11:30:26.825840 | id_791917168c79486aafe900cf69f09384 |
| 4 | 1 | 0 | id_29999e384ca74319b5d8247e1cb58f3a.output.0.json | 2024-08-07 11:30:26.825840 | id_29999e384ca74319b5d8247e1cb58f3a |


The files can be extracted from the library:

In [12]:
lib.records[0].get_content()

b'{"prompt": "Some prompt", "option_index": 0, "model": null}'

## TaskPack

The approach with `Collector` and `MediaLibrary` is very handy for generating content, so some syntactic sugar was added to make it as easy as possible in code. `BrainBoxTaskPack` allows you:
* to define the intermediate tasks that are required to complete the main task, `resulting_task`;
* to define a postprocessor that is applied to the result on the API side, e.g., to download the file and open it with `MediaLibrary.read`.

It is important to remember that all the pack-related functionality is performed by API, not by the BrainBox web-server.

Also, `Collector.PackBuilder` allows you to define packs for the collection tasks easily.

In [18]:
from kaia.brainbox import BrainBoxTaskPack, DownloadingPostprocessor

builder = Collector.PackBuilder()

for i in range(5):
    builder.append(BrainBoxTask.call(FakeImageDecider)(prompt='Some prompt').to_task(), dict(index=i))

pack = builder.to_collector_pack('to_media_library')
pack.postprocessor = DownloadingPostprocessor(opener=MediaLibrary.read)

with BrainBoxTestApi(deciders) as api:
    lib = api.execute(pack)

lib.to_df().head()

| | index | option_index | filename | timestamp | job_id |
|---|---|---|---|---|---|
| 0 | 0 | 0 | id_abeffff972af4814a9b6b93ec2dc66d1.output.0.json | 2024-08-07 11:31:23.522448 | id_abeffff972af4814a9b6b93ec2dc66d1 |
| 1 | 0 | 1 | id_abeffff972af4814a9b6b93ec2dc66d1.output.1.json | 2024-08-07 11:31:23.522448 | id_abeffff972af4814a9b6b93ec2dc66d1 |
| 2 | 0 | 2 | id_abeffff972af4814a9b6b93ec2dc66d1.output.2.json | 2024-08-07 11:31:23.522448 | id_abeffff972af4814a9b6b93ec2dc66d1 |
| 3 | 0 | 3 | id_abeffff972af4814a9b6b93ec2dc66d1.output.3.json | 2024-08-07 11:31:23.522448 | id_abeffff972af4814a9b6b93ec2dc66d1 |
| 4 | 1 | 0 | id_21914e73d6fc48ecaf4b02718950089b.output.0.json | 2024-08-07 11:31:23.522448 | id_21914e73d6fc48ecaf4b02718950089b |


Here, `Collector.to_media_library` originally returns a filename, but the postprocessor automatically downloads the file and opens it with the provided method. `BrainBoxTaskPack` is a handy way to represent complex task networks in your application: a single `api.execute` call then returns the result as a ready-to-use object.
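
The division of labor inside a pack can be sketched as follows. This is a toy model, not the `BrainBoxTaskPack` implementation: `SketchPack` and `execute_pack` are hypothetical names, and tasks are represented by plain callables instead of `BrainBoxTask` objects.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class SketchPack:
    """Hypothetical stand-in for a task pack."""
    resulting_task: Callable[[list], Any]
    intermediate_tasks: list = field(default_factory=list)
    postprocessor: Optional[Callable[[Any], Any]] = None


def execute_pack(pack):
    # 1. Run the intermediate tasks (done by the server in the real system)
    intermediate_results = [task() for task in pack.intermediate_tasks]
    # 2. Run the resulting task on their outputs
    raw = pack.resulting_task(intermediate_results)
    # 3. Apply the postprocessor, if any (done by the API client)
    return raw if pack.postprocessor is None else pack.postprocessor(raw)


sketch_pack = SketchPack(
    resulting_task=lambda results: ",".join(results),
    intermediate_tasks=[lambda: "a", lambda: "b", lambda: "c"],
    postprocessor=lambda raw: raw.split(","),
)
processed = execute_pack(sketch_pack)
print(processed)  # ['a', 'b', 'c']
```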