# Parallel Execution of Same Event Example

In this example, we'll demonstrate how to use workflows to process multiple events of the same type in parallel.  
By setting the `num_workers` parameter in the `@step` decorator, we can control how many instances of a step run at the same time, enabling efficient parallel processing.




# Installing Dependencies

First, we need to install the necessary dependencies:

* `llama-index-core` for the core functionality, including the workflow classes imported below
* `llama-index-utils-workflow` for workflow utilities

In [None]:
!pip install llama-index-core llama-index-utils-workflow

# Importing Required Libraries

After installing the dependencies, we can import the required libraries:

In [None]:
import asyncio
from llama_index.core.workflow import (
    step,
    Context,
    Workflow,
    Event,
    StartEvent,
    StopEvent,
)

In this example, we will create a workflow that processes multiple data items in parallel. The `@step(num_workers=N)` decorator limits how many instances of a step run simultaneously, which controls the level of parallelism. This approach suits scenarios that involve many similar tasks and need to keep resource usage in check.  
For example, you can execute multiple sub-queries at once. Note that `num_workers` should not be set arbitrarily high: a suitable value depends on your workload and on constraints such as token limits.
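
The idea of capping concurrency can be sketched with the standard library alone. The snippet below is an analogy in plain `asyncio`, not the workflow library's implementation: `run_bounded` and its `delay` parameter are hypothetical names introduced for illustration, and a semaphore stands in for the worker limit.

```python
import asyncio


async def run_bounded(items, num_workers, delay=0.01):
    """Process items with at most `num_workers` running at once (illustrative only)."""
    semaphore = asyncio.Semaphore(num_workers)
    running = 0  # tasks currently inside the semaphore
    peak = 0     # highest value `running` ever reached

    async def worker(item):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(delay)  # stand-in for real work
            running -= 1
            return f"Processed: {item}"

    # gather returns results in input order, regardless of completion order
    results = await asyncio.gather(*(worker(i) for i in items))
    return results, peak


results, peak = asyncio.run(run_bounded(["A", "B", "C", "D", "E"], num_workers=2))
print(results)
print(f"Peak concurrency: {peak}")
```

Five items are submitted, but no more than two are ever in flight. In a notebook, where an event loop is already running, replace `asyncio.run(...)` with `await run_bounded(...)`.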

# Defining Event Types

We'll define two event types: one for input events to be processed in parallel, and another for processing results:

In [None]:
class ParallelEvent(Event):
    data: str


class ResultEvent(Event):
    result: str

# Creating the Parallel Workflow

Now, we'll create a ParallelWorkflow class that includes three main steps:

- start: Initialize and send multiple parallel events
- process_data: Process data in parallel
- combine_results: Collect and merge all processing results

In [None]:
import random


class ParallelWorkflow(Workflow):
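    @step(pass_context=True)
    async def start(self, ctx: Context, ev: StartEvent) -> ParallelEvent:
        data_list = ["A", "B", "C"]
        # Record how many results combine_results should wait for
        ctx.data["num_to_collect"] = len(data_list)
        # Emit one ParallelEvent per item
        for item in data_list:
            self.send_event(ParallelEvent(data=item))
        # The events were sent explicitly above, so nothing is returned here
        return None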

    @step(num_workers=3)
    async def process_data(self, ev: ParallelEvent) -> ResultEvent:
        # Simulate some time-consuming processing
        await asyncio.sleep(random.randint(1, 2))
        result = f"Processed: {ev.data}"
        print(f"Completed processing: {ev.data}")
        return ResultEvent(result=result)

    @step(pass_context=True)
    async def combine_results(
        self, ctx: Context, ev: ResultEvent
    ) -> StopEvent | None:
        num_to_collect = ctx.data["num_to_collect"]
        # Buffer incoming ResultEvents until the expected number has arrived
        results = ctx.collect_events(ev, [ResultEvent] * num_to_collect)
        if results is None:
            # Still waiting for more results
            return None

        combined_result = ", ".join([event.result for event in results])
        return StopEvent(result=combined_result)

In this workflow:

- The `start` method initializes and sends multiple `ParallelEvent`s.
- The `process_data` method uses the `@step(num_workers=3)` decorator to limit the number of simultaneously executing workers to 3.
- The `combine_results` method collects all processing results and merges them.
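
The "wait until everything has arrived" behavior of `combine_results` can be illustrated without the library. The sketch below only mimics the buffering pattern that the code above relies on (returning `None` until the expected number of results is in). `ResultCollector` is a hypothetical helper, not part of LlamaIndex.

```python
from typing import List, Optional


class ResultCollector:
    """Hypothetical helper: buffers results until `expected` of them have arrived."""

    def __init__(self, expected: int):
        self.expected = expected
        self._buffer: List[str] = []

    def collect(self, result: str) -> Optional[List[str]]:
        self._buffer.append(result)
        if len(self._buffer) < self.expected:
            return None  # still waiting for more results
        collected, self._buffer = self._buffer, []
        return collected


collector = ResultCollector(expected=3)
first = collector.collect("Processed: A")   # None: 1 of 3
second = collector.collect("Processed: C")  # None: 2 of 3
third = collector.collect("Processed: B")   # list of all three, in arrival order
combined = ", ".join(third) if third is not None else None
print(first, second, combined)
```

Only the third call yields a list, which is why the step returns `None` twice before it emits the final `StopEvent`.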

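# Running the Workflow

Finally, we can run the workflow and time it. The top-level `await` below works in a notebook; in a standalone script, wrap the call in an `async` function and pass it to `asyncio.run`.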

In [None]:
import time

workflow = ParallelWorkflow()

start_time = time.time()
result = await workflow.run()
end_time = time.time()
print(f"Workflow result: {result}")
print(f"Time taken: {end_time - start_time} seconds")

Completed processing: A
Completed processing: C
Completed processing: B
Workflow result: Processed: A, Processed: C, Processed: B
Time taken: 2.0068275928497314 seconds


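# Note

- Processing occurs in parallel: all 3 items run at once, so this run takes about 2 seconds, the duration of the slowest task. Each task sleeps for 1 or 2 seconds, so with a single worker (`num_workers=1`) the three items would run one after another and take 3 to 6 seconds.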
- The order of the completed results may differ from the input order, depending on the completion time of the tasks.
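
The timing figures above follow from simple scheduling arithmetic. As a rough check, the sketch below computes the total wall-clock time for a list of task durations, assuming each free worker picks up the next task in order. `makespan` is a hypothetical helper written for this note, and it ignores scheduling overhead.

```python
import heapq


def makespan(durations, num_workers):
    """Wall-clock time when each free worker takes the next task in order."""
    workers = [0.0] * num_workers  # time at which each worker becomes free
    heapq.heapify(workers)
    for d in durations:
        free_at = heapq.heappop(workers)  # earliest available worker
        heapq.heappush(workers, free_at + d)
    return max(workers)


best, worst = [1, 1, 1], [2, 2, 2]  # every task sleeps 1 s, or every task sleeps 2 s
print(makespan(best, 1), makespan(worst, 1))  # one worker: 3.0 6.0
print(makespan(best, 3), makespan(worst, 3))  # three workers: 1.0 2.0
```

With three workers the total equals the slowest single task, which matches the roughly 2 seconds observed in the run above.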


This example demonstrates how to implement controlled parallel processing in a workflow. By setting num_workers, we can control the degree of parallelism, which is very useful for scenarios that need to balance performance and resource usage.