In [None]:
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

from intelligence_layer.core import NoOpTracer, Task, TaskSpan

# How to get more done in less time
This notebook contains tips for the following situations:
- A single task that takes a long time to complete
- Running one task multiple times
- Running several different tasks at the same time

## A single long running task
With a single long-running task, consider the following:
 - If there is other work to do in the meantime, submit the task with `ThreadPoolExecutor.submit` and collect the output later with `Future.result`
   - See [here](#submit_example) for an example
 - If there is nothing else to do while waiting, consider:
   - Choosing a faster model. The `base` model is faster than `extended`, and `extended` is faster than `supreme`
   - Choosing tasks that perform fewer LLM operations, e.g. `MultiChunkQa` usually takes longer than `SingleChunkQa`

## Running one task multiple times
When a single task should process multiple inputs, use `task.run_concurrently` to process the inputs concurrently instead of one after another.

**Example:**

In [None]:
class DummyTask(Task):
    def do_run(self, input: str, task_span: TaskSpan) -> str:
        time.sleep(2)
        print(f"Task1 complete with input: {input}")
        return input.upper()


tracer = NoOpTracer()

multiple_task_inputs = [f"input-{i}" for i in range(4)]
task = DummyTask()


# With 4 inputs that each sleep for 2 seconds, this finishes in about 2 seconds
# instead of the roughly 8 seconds a sequential loop over the inputs would need.
result = task.run_concurrently(multiple_task_inputs, tracer)
result
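
The speed-up described above can be reproduced without the library. The following is a minimal sketch that assumes the work is dominated by waiting (here simulated with a shortened `time.sleep`); `slow_upper` is a hypothetical stand-in for a task, and the thread pool only illustrates the general idea, not necessarily how `run_concurrently` is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_upper(text: str) -> str:
    """Hypothetical stand-in for a task that mostly waits (e.g. on an API call)."""
    time.sleep(0.2)
    return text.upper()


inputs = [f"input-{i}" for i in range(4)]

# Sequential loop: the waiting times add up (about 4 * 0.2 seconds).
start = time.perf_counter()
sequential_results = [slow_upper(text) for text in inputs]
sequential_seconds = time.perf_counter() - start

# Thread pool: the waits overlap, so the total is close to a single wait.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(inputs)) as executor:
    concurrent_results = list(executor.map(slow_upper, inputs))
concurrent_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, concurrent: {concurrent_seconds:.2f}s")
```

Both variants return the results in input order; only the elapsed time differs.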

## Running several tasks at the same time
To run several distinct tasks at the same time, use Python's standard `concurrent.futures` library.
The following examples show how this can be done.

In [None]:
# Second long-running task


class DummyTask2(Task):
    def do_run(self, input: str, task_span: TaskSpan) -> str:
        time.sleep(2)
        print(f"Task2 complete with input: {input}")
        return input.upper()


# initialize all tasks and inputs
task_1 = DummyTask()
task_2 = DummyTask2()

task_input_1 = [f"input-{i}" for i in range(10)]
task_input_2 = [f"input-{i}" for i in range(20)]

<a id='submit_example'></a>
The individual tasks can then be submitted to a `ThreadPoolExecutor`.  
This is especially useful when there is other work to do while the tasks are running.

In [None]:
with ThreadPoolExecutor(max_workers=2) as executor:
    task_1_result = executor.submit(task_1.run_concurrently, task_input_1, tracer)
    task_2_result = executor.submit(task_2.run_concurrently, task_input_2, tracer)
    # ...other important code here; each `result()` call below blocks until that task has finished
    print("Task 1 result:", task_1_result.result())
    print("Task 2 result:", task_2_result.result())

`ThreadPoolExecutor` can also be used via its `map` method. It processes a list of jobs and returns the results in the order of the inputs; wrapping the call in `list` waits until all jobs are done.  
This is especially useful if there are many diverse jobs that take a varying amount of time.  
Since the function passed to `map` here takes a single argument, each task and its input are bundled into a tuple beforehand.

In [None]:
jobs = list(zip(repeat(task_1), task_input_1)) + list(zip(repeat(task_2), task_input_2))

with ThreadPoolExecutor(max_workers=20) as executor:
    result = list(executor.map(lambda job: job[0].run(job[1], tracer), jobs))
    print("Task 1 result:", result[: len(task_input_1)])
    print("Task 2 result:", result[len(task_input_1) :])
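
As an alternative to bundling tuples, `Executor.map` also accepts several iterables and passes one element of each to the function per call. The following sketch uses plain functions as hypothetical stand-ins for tasks (`shout`, `whisper`, and `run_job` are illustrative names, not part of the library).

```python
from concurrent.futures import ThreadPoolExecutor


def shout(text: str) -> str:
    """Hypothetical stand-in for a first task."""
    return text.upper()


def whisper(text: str) -> str:
    """Hypothetical stand-in for a second task."""
    return text.lower()


def run_job(func, text: str) -> str:
    """Apply one function to one input; called once per (func, text) pair."""
    return func(text)


funcs = [shout, shout, whisper]
texts = ["a", "b", "C"]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map pairs funcs[i] with texts[i]; results come back in input order.
    paired_results = list(executor.map(run_job, funcs, texts))

print(paired_results)
```

Whether tuples or parallel iterables read better is a matter of taste; the tuple form keeps each job in a single object, which is convenient when jobs are built up from several sources.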

`ThreadPoolExecutor.map` can also be used with `Task.run_concurrently()`, in which case creating the jobs becomes slightly easier.

In [None]:
with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(
        executor.map(
            lambda job: job[0].run_concurrently(job[1], tracer),
            [(task_1, task_input_1), (task_2, task_input_2)],
        )
    )
    # `results` holds one list per job, in the order the jobs were passed to `map`
    print("Task 1 result:", results[0])
    print("Task 2 result:", results[1])

<div class="alert alert-warning">
Note
</div>

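If tasks are CPU-bound, the code above will not help, because in standard CPython the global interpreter lock lets only one thread execute Python code at a time. In that case, replace the `ThreadPoolExecutor` with a `ProcessPoolExecutor`. Note that a `ProcessPoolExecutor` has to pickle the submitted function and its arguments, so the `lambda` jobs used above need to be replaced by module-level functions.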
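
A minimal sketch of the process-based variant, assuming a purely CPU-bound function. `count_primes_below` and `count_in_processes` are hypothetical names chosen for illustration. The worker function is defined at module level so that it can be pickled, and the pool is started under a `__main__` guard, which is required on platforms that start worker processes by spawning a fresh interpreter.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes_below(limit: int) -> int:
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate**0.5) + 1)):
            count += 1
    return count


def count_in_processes(limits: list[int]) -> list[int]:
    """Run count_primes_below for each limit in separate worker processes."""
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(count_primes_below, limits))


if __name__ == "__main__":
    print(count_in_processes([10, 100, 1_000]))
```

When working in a notebook, it may be necessary to move the worker function into an importable module, since worker processes cannot always access functions defined interactively.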