# Implementing peer-to-peer (P2P) communication


In the previous notebooks we saw how to build custom components with NVFlare, and how the server/controller can send tasks and data to the clients/executors and get their responses back.

In this tutorial we'll explore how to implement peer-to-peer communication between clients. This is useful in many settings, including distributed optimization, fully decentralized FL and swarm learning.

## Aux channels

[Aux channels](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.private.aux_runner.html#nvflare.private.aux_runner.AuxRunner.send_aux_request) are a mechanism in NVFlare that allows parties to communicate and directly send `Shareable`s to each other. They are used to send messages that are not part of the main task flow and can be very useful for implementing peer-to-peer communication between clients.

The main method for sending a request to one or more clients is [`send_aux_request`](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.private.fed.client.client_engine_executor_spec.html#nvflare.private.fed.client.client_engine_executor_spec.ClientEngineExecutorSpec.send_aux_request), which is part of NVFlare's communication mechanism for client-to-client or client-to-server messages outside the main task flow. It takes four main arguments:

- `targets`: a list of client names to send the request to. If `None` the request is sent to all clients.
- `topic`: a string that identifies the type of message being sent.
- `request`: a `Shareable` object that contains the message to be sent.
- `timeout`: an integer that specifies the maximum number of seconds to wait for a response from each client.

For the target clients to handle messages received via aux channels, they need to register a callback. This is similar to what we saw in the previous notebooks, where the server had to handle responses from the clients. Here, the clients use the `register_aux_message_handler` method to register a callback function, one for each `topic` they want to handle. The callback determines how a client responds to messages sent via `send_aux_request`. The method requires two arguments:

- `topic`: the topic to be handled.
- `message_handle_func`: a callable that handles the message.

> Both `send_aux_request` and `register_aux_message_handler` are called on the `engine`, which is obtained from the `FLContext` via `fl_ctx.get_engine()`.
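To make the topic-based request/handler pattern concrete, here is a minimal in-memory sketch in plain Python. It is a toy model and not the NVFlare API: `ToyAuxNetwork`, `add_party`, `register_handler`, `send_request` and the `"status"` field are all hypothetical names introduced for illustration, and real aux channels may behave differently when no handler is registered.

```python
from typing import Callable, Dict, List, Tuple

# (topic, request, sender) -> reply; all plain dicts in this toy model
Handler = Callable[[str, dict, str], dict]


class ToyAuxNetwork:
    """In-memory stand-in for aux channels (hypothetical, NOT the NVFlare API)."""

    def __init__(self) -> None:
        # party name -> {topic -> handler}
        self._handlers: Dict[str, Dict[str, Handler]] = {}

    def add_party(self, name: str) -> None:
        self._handlers.setdefault(name, {})

    def register_handler(self, party: str, topic: str, handler: Handler) -> None:
        # One handler per (party, topic) pair
        self._handlers[party][topic] = handler

    def send_request(
        self, sender: str, targets: List[str], topic: str, request: dict
    ) -> Dict[str, dict]:
        # Deliver the request to each target and collect one reply per target
        replies: Dict[str, dict] = {}
        for target in targets:
            handler = self._handlers[target].get(topic)
            if handler is None:
                replies[target] = {"status": "no_handler"}
            else:
                replies[target] = handler(topic, request, sender)
        return replies


network = ToyAuxNetwork()
for name in ("site-0", "site-1"):
    network.add_party(name)

inbox: List[Tuple[str, str]] = []


def on_hello(topic: str, request: dict, sender: str) -> dict:
    inbox.append((sender, request["message"]))
    return {"status": "ok"}


# Only site-1 registers a handler for the "hello" topic
network.register_handler("site-1", "hello", on_hello)

replies = network.send_request(
    "site-0", ["site-1"], "hello", {"message": "Hello from site-0"}
)
unhandled = network.send_request(
    "site-1", ["site-0"], "hello", {"message": "Hello from site-1"}
)
```

In this sketch, the request to `site-1` reaches `on_hello`, while the request to `site-0` is not handled because no callback was registered for that topic on that party.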

## Executors with P2P communication

Let's see the aux channels in action by implementing a new executor with P2P communication enabled via aux channels. Before diving into the executor, let's create a very simple controller that orchestrates the clients by sending them a `Task` named `"talk"`, asking the clients to send messages to each other.

```python
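from nvflare.apis.fl_context import FLContext
from nvflare.apis.impl.controller import Controller, Task
from nvflare.apis.shareable import Shareable
from nvflare.apis.signal import Signal


class SimpleController(Controller):

    def control_flow(self, abort_signal: Signal, fl_ctx: FLContext):
        # Send the "talk" task to the clients and wait for their replies
        self.broadcast_and_wait(
            task=Task(name="talk", data=Shareable()),
            targets=None,  # broadcast to all clients
            min_responses=0,
            fl_ctx=fl_ctx,
        )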
```

Now that we have that, let's implement the executor - we'll call it `P2PExecutor`. In the `execute` method, we use the `send_aux_request` method to send a message to all other clients with the topic `"hello"` and a `Shareable` containing a message (`"Hello from {client_name}"`). We assume there are 3 clients, named `site-0` to `site-2`, and that each one sends a message to the other two.

```python
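from nvflare.apis.dxo import DXO, DataKind
from nvflare.apis.executor import Executor
from nvflare.apis.fl_constant import ReturnCode
from nvflare.apis.fl_context import FLContext
from nvflare.apis.shareable import Shareable, make_reply
from nvflare.apis.signal import Signal


class P2PExecutor(Executor):

    def execute(
        self,
        task_name: str,
        shareable: Shareable,
        fl_ctx: FLContext,
        abort_signal: Signal,
    ):

        if task_name == "talk":
            engine = fl_ctx.get_engine()
            identity_name = fl_ctx.get_identity_name()

            # Send a "hello" message to every client except ourselves
            engine.send_aux_request(
                targets=[f"site-{i}" for i in range(3) if f"site-{i}" != identity_name],
                topic="hello",
                request=DXO(
                    data_kind=DataKind.APP_DEFINED,
                    data={
                        "message": f"Hello from {identity_name}",
                    },
                ).to_shareable(),
                timeout=10,
                fl_ctx=fl_ctx,
            )
            return make_reply(ReturnCode.OK)

        # Any other task is not supported by this executor
        return make_reply(ReturnCode.TASK_UNKNOWN)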
```
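The `targets` list above hardcodes 3 clients. As a small sketch of how it could be computed for any number of clients, the helper below builds the list of peers for a given client. `peer_targets` is a hypothetical helper, not part of NVFlare, and it assumes clients follow the `site-{i}` naming used in this tutorial.

```python
from typing import List


def peer_targets(identity_name: str, num_clients: int, prefix: str = "site-") -> List[str]:
    """Return the names of all clients except `identity_name`.

    Assumes clients are named f"{prefix}{i}" for i in range(num_clients).
    """
    all_sites = [f"{prefix}{i}" for i in range(num_clients)]
    return [site for site in all_sites if site != identity_name]


targets = peer_targets("site-1", num_clients=3)
print(targets)
```

The result could then be passed as the `targets` argument of `send_aux_request` instead of the inline list comprehension.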

Now, try to run the controller and executors as they are!

You'll see that aux messages are being sent, but they are never received - as noted above, we need to register a callback to handle them. Let's do that by registering a callback for the `"hello"` topic. The best practice is to register it in the `handle_event` method, when the `EventType.START_RUN` event is fired.

```python
from nvflare.apis.dxo import from_shareable
from nvflare.apis.event_type import EventType
from nvflare.apis.fl_constant import ReservedKey


class P2PExecutor(Executor):

    def execute(self, task_name, shareable, fl_ctx, abort_signal):
        ...  # same body as in the previous snippet

    def handle_event(self, event_type: str, fl_ctx: FLContext):
        if event_type == EventType.START_RUN:
            engine = fl_ctx.get_engine()

            # Register the aux message handler for the "hello" topic
            engine.register_aux_message_handler(topic="hello", message_handle_func=self._handle_aux_request)

    def _handle_aux_request(self, topic: str, request: Shareable, fl_ctx: FLContext) -> Shareable:
        # Extract the sender name from the peer properties of the request
        sender = request.get_peer_prop(key=ReservedKey.IDENTITY_NAME, default=None)
        received_message = from_shareable(request).data["message"]

        # Log the received message
        self.log_info(fl_ctx, f"Received message from {sender}: {received_message}")

        return make_reply(ReturnCode.OK)
```

Let's see this in action! Try to implement it yourself first - the final implementation is provided in `modules.py`.     

```python
from nvflare.job_config.api import FedJob
from modules import BasicController, P2PExecutor

job = FedJob(name="p2p_job")

controller = BasicController()
job.to_server(controller)

num_clients = 3
for i in range(num_clients):
    executor = P2PExecutor()
    job.to(executor, f"site-{i}")

job.simulator_run("./tmp/")
```

If you inspect the logs, you'll see exactly 6 lines like the one below: each of the 3 clients receives one message from each of the other 2.

```
2025-02-07 10:23:08,625 - P2PExecutor - INFO - [identity=site-2, run=simulate_job, peer=site-1, peer_run=simulate_job] - Received message from site-1: Hello from site-1
```

Looks like we're successfully sending P2P messages!

## Exercise

What happens if you set the `targets` argument of `send_aux_request` to `None`? This will automatically send the message to all the parties in the network. Try it out without changing anything else and see what happens - you'll probably see an error on the server side. How can you solve it?

> HINT: you might need to register the aux message handler on the server side as well!