# NVIDIA FLARE Overview

* Apache License 2.0 to catalyze FL research and development

* Designed for enterprise production

* Runs on CPU, GPU, and multi-GPU systems

* Enables cross-country, distributed, multi-party collaborative learning

* Production scalability with high availability and multi-task execution

* Framework, model, domain, and task agnostic

* Layered, pluggable, customizable federated compute architecture


# NVIDIA FLARE Architecture Overview

* Layered, pluggable, open architecture

* Each layer's components are composable and pluggable

* Network: communication and messaging layer

  * Drivers: gRPC, HTTP + WebSocket, TCP, or any plugin driver
  * CellNet: logical endpoint-to-endpoint (cell-to-cell) network
  * Message: reliable streaming messages

* Federated computing layer

  * Resource-based job scheduling, job monitoring, concurrent job lifecycle management, and high-availability management
  * Plugin component management
  * Configuration management
  * Local event and federated event handling

* Federated workflows

  * SAG (Scatter and Gather), Cyclic, Cross-site Evaluation, Swarm Learning, Federated Analytics

* Federated learning algorithms

<img src="./flare_overview.png" alt="FLARE Architecture" width="700" height="400">



NVIDIA FLARE is built in layers, each one building on the layer below it. At the bottom is the network communication layer.


## FCI - Flare Communication Interface

FCI is a logical network framework that supports asynchronous, two-way communication over multiple transports. It has the following properties:

* **Pluggable**. Its pluggable architecture supports different messaging patterns (request-response, broadcast, pub/sub) and, through drivers, different transports such as TCP, pipes, HTTP/WS, and gRPC.

* **Streamable**. Large binary data can be streamed in small chunks to minimize memory usage.

* **Full-duplex**. Both sides can send messages to each other without polling, if the transport supports it.

* **Multiplexed**. Multiple conversations can be conducted over the same connection at the same time using stream IDs.

* **Asynchronous**. Messages can be sent and received asynchronously, for example by fire-and-forget sends or by listening for incoming messages.

* **One-way connection** for remote communications. All TCP-based connections can be initiated from the clients, so clients do not need to expose any ports.

* **Supports IPC**. It can communicate between processes through pipes or sockets.

* **Native heartbeats**. FCI supports heartbeats to keep connections alive.

From top to bottom, FCI has the following layers:

* **API Layer**: This is the API exposed to application developers, such as Communicator and CellNet.
* **Streamable Framed Message (SFM)**: This is the core of FCI. It provides an abstraction on top of the different communication protocols and manages endpoints and connections.
* **Transport Drivers**: This layer is responsible for sending frames to other endpoints. It treats each frame as opaque bytes.

<img src="./fci.png" alt="FLARE Communication Interface" width="300" height="400">
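To make the streamable and multiplexed properties concrete, here is a minimal sketch of chunked framing with stream IDs. It is not FLARE's wire format or API: the header layout and the helper names `make_frames`, `interleave`, and `reassemble` are assumptions made up for illustration. The sketch splits two payloads into frames, interleaves them as if they shared one connection, and rebuilds each payload on the receiving side.

```python
import struct
from itertools import zip_longest

# Hypothetical frame header, for illustration only (not FLARE's format):
# 4-byte stream ID, 1-byte "last frame" flag, 4-byte payload length.
HEADER = struct.Struct("!IBI")


def make_frames(stream_id, data, chunk_size):
    """Split data into frames that carry at most chunk_size payload bytes."""
    frames = []
    # max(..., 1) makes sure an empty payload still produces one final frame.
    for start in range(0, max(len(data), 1), chunk_size):
        chunk = data[start:start + chunk_size]
        last = start + chunk_size >= len(data)
        frames.append(HEADER.pack(stream_id, int(last), len(chunk)) + chunk)
    return frames


def interleave(*frame_lists):
    """Simulate multiplexing by alternating frames from several streams."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Group frames by stream ID and join payloads when the last frame arrives."""
    partial, complete = {}, {}
    for frame in wire:
        stream_id, last, length = HEADER.unpack_from(frame)
        payload = frame[HEADER.size:HEADER.size + length]
        partial.setdefault(stream_id, bytearray()).extend(payload)
        if last:
            complete[stream_id] = bytes(partial.pop(stream_id))
    return complete


model_bytes = bytes(range(256)) * 4  # a 1024-byte stand-in for a large payload
control_msg = b"heartbeat"
wire = interleave(make_frames(1, model_bytes, 300), make_frames(2, control_msg, 300))
received = reassemble(wire)
```

Only the framing code looks inside the header. A transport driver in this sketch would simply send each element of `wire` as opaque bytes, which mirrors the split between the SFM layer and the driver layer described above.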


## Federated Computing Architecture

There are two kinds of parent control processes, one on the server and one on each client site. For each job, a parent control process starts a corresponding job process on its site. This design enables concurrent, multi-job execution.

<img src="./system_architecture.png" alt="FLARE System Architecture" width="700" height="400">
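The parent-and-job-process pattern can be sketched with the Python standard library. This is a simplified illustration, not FLARE code: `JOB_SCRIPT`, `launch_job`, and `run_jobs` are hypothetical names, and the "job" is just a one-line script. The point is that the parent starts every job process before waiting on any of them, so the jobs run concurrently.

```python
import subprocess
import sys

# Hypothetical stand-in for a job process: a tiny script that reports its job ID.
JOB_SCRIPT = "import sys; print('finished', sys.argv[1])"


def launch_job(job_id):
    """Start one child process for a job and return its handle."""
    return subprocess.Popen(
        [sys.executable, "-c", JOB_SCRIPT, job_id],
        stdout=subprocess.PIPE,
        text=True,
    )


def run_jobs(job_ids):
    """Parent control loop: start all jobs first, then collect their results."""
    children = {job_id: launch_job(job_id) for job_id in job_ids}
    results = {}
    for job_id, child in children.items():
        output, _ = child.communicate()  # wait for this job process to exit
        results[job_id] = (child.returncode, output.strip())
    return results


results = run_jobs(["job-a", "job-b"])
```

Because each job lives in its own process, a crash in one job does not take down the parent or the other jobs. That isolation is a property of this sketch's design and is stated here as a general observation, not as a description of FLARE internals.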


## Federated Learning Framework

Building on these core concepts, we have implemented many federated learning workflows and algorithms, including FedAvg, FedOpt, FedProx, SCAFFOLD, cyclic, swarm learning, and split learning. Many examples are available on the [NVIDIA FLARE website](https://nvidia.github.io/NVFlare/) and in its [tutorial catalog](https://nvidia.github.io/NVFlare/catalog/).
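As a reminder of what the simplest of these algorithms computes, the following is a minimal sketch of the FedAvg aggregation step: a weighted average of client model weights, where each client's weight is proportional to its number of training samples. It uses plain Python lists and a hypothetical `fed_avg` helper; it does not use or represent the NVIDIA FLARE API.

```python
def fed_avg(client_updates):
    """Return the sample-weighted average of client weights.

    client_updates is a list of (num_samples, weights) pairs, where weights
    maps a parameter name to a list of floats. All clients are assumed to
    report the same parameter names and shapes.
    """
    total = sum(num_samples for num_samples, _ in client_updates)
    if total <= 0:
        raise ValueError("need at least one training sample")
    _, first = client_updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first.items()}
    for num_samples, weights in client_updates:
        share = num_samples / total  # this client's contribution to the average
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


# Two hypothetical clients: the second has three times as much data.
updates = [
    (100, {"w": [1.0, 2.0], "b": [0.0]}),
    (300, {"w": [3.0, 4.0], "b": [4.0]}),
]
global_weights = fed_avg(updates)
```

With these inputs the second client contributes three quarters of each averaged value, so the result sits closer to its weights than to the first client's.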


## Enterprise Security and Privacy 

NVIDIA FLARE has many features that support enterprise security as well as privacy-enhancing technologies (PETs). Please refer to [Part 3: Security and Privacy](../../../part-3_security_and_privacy/part-3_introduction.ipynb).


## Simulations

We have built several simulation tools, including a Python API and a CLI. You have already seen the Job API and the simulator CLI in [Chapter 1](../../../part-1_federated_learning_introduction/Chapter-1_running_federated_learning_applications/01.0_introduction/introduction.ipynb).

In [Chapter 4](../../chapter-4_setup_federated_system/04.0_introduction/introduction.ipynb), we will also discuss how to simulate a deployment on a local machine.


## Setup and Deployment

Setting up a federated computing system is not a trivial task, so we have built tools to simplify the process. We will discuss them in [Chapter 4](../../chapter-4_setup_federated_system/04.0_introduction/introduction.ipynb).


## Configuration

NVFLARE supports several configuration formats: JSON, pyhocon, and YAML. For details, see [Configuration Files](https://nvflare.readthedocs.io/en/main/user_guide/configurations.html).
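As a rough illustration of the JSON option, the sketch below parses a small configuration fragment with Python's standard `json` module. The keys, the component ID, and the class path `my_package.MyAggregator` are hypothetical values chosen for this example; consult the linked documentation for the actual schema.

```python
import json

# Hypothetical configuration fragment; keys and values are illustrative only.
CONFIG_TEXT = """
{
  "format_version": 2,
  "num_rounds": 5,
  "components": [
    {
      "id": "aggregator",
      "path": "my_package.MyAggregator",
      "args": {"expected_clients": 2}
    }
  ]
}
"""

config = json.loads(CONFIG_TEXT)

# Index the component entries by ID so they are easy to look up.
components_by_id = {entry["id"]: entry for entry in config["components"]}
aggregator_args = components_by_id["aggregator"]["args"]
```

The same nested structure could be written in pyhocon or YAML. Only the parsing step would change, and it would require a third-party parser instead of the standard library.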

