# Privacy Preservation using NVFlare's Filters

[Filters](https://nvflare.readthedocs.io/en/main/programming_guide/filters.html) in NVIDIA FLARE are a type of `FLComponent` with a `process` method that transforms the `Shareable` object exchanged between communicating parties. A filter can apply additional processing to shareable data before it is sent to, or after it is received from, the peer.

The `FLContext` is available for the `Filter` to use. Filters can be added to your NVFlare job using the [FedJob API](https://nvflare.readthedocs.io/en/main/programming_guide/fed_job_api.html#fedjob-api), which you should be familiar with from previous chapters.

#### Filters
In NVFlare, filters are used for the pre- and post-processing of a task's data.

Before sending a task to the `Executor`, the `Controller` applies any available “task data filters” to the task data, ensuring only the filtered data is transmitted. Likewise, when the task result is received from the `Executor`, “task result filters” are applied before it is passed to the `Controller`. On the `Executor` side, similar filtering occurs: “task data filters” process incoming task data before execution, and “task result filters” refine the computed result before it is sent back to the `Controller`.

![NVFlare's Filter Concept](https://nvflare.readthedocs.io/en/main/_images/Filters.png)
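
The order in which the four filter points run can be illustrated with a small simulation. This is a minimal sketch in plain Python, not NVFlare code: `apply_filters`, `make_tagger`, and `run_round` are hypothetical helpers, and a dictionary stands in for the `Shareable`.

```python
from typing import Callable, Dict, List

Payload = Dict[str, float]
Filter = Callable[[Payload], Payload]


def apply_filters(filters: List[Filter], payload: Payload) -> Payload:
    """Run the payload through each filter in order."""
    for f in filters:
        payload = f(payload)
    return payload


def make_tagger(tag: str, trace: List[str]) -> Filter:
    """Hypothetical filter that only records when it ran."""
    def _filter(payload: Payload) -> Payload:
        trace.append(tag)
        return dict(payload)
    return _filter


def run_round(task_data: Payload, trace: List[str]) -> Payload:
    # Controller side: filter task data before it leaves the server.
    sent = apply_filters([make_tagger("server:task_data", trace)], task_data)
    # Executor side: filter incoming task data before execution.
    received = apply_filters([make_tagger("client:task_data", trace)], sent)
    # Stand-in for local training.
    result = {name: value + 1.0 for name, value in received.items()}
    # Executor side: filter the result before sending it back.
    outgoing = apply_filters([make_tagger("client:task_result", trace)], result)
    # Controller side: filter the received result before the Controller uses it.
    return apply_filters([make_tagger("server:task_result", trace)], outgoing)


trace: List[str] = []
final = run_round({"w": 0.5}, trace)
print(trace)
print(final)
```

The recorded trace lists the filter points in the order described above: task data filters on the way out, task result filters on the way back.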


#### Examples of Filters
Filters are the primary technique for data privacy protection.

Filters can convert data formats and apply any other transformation the data needs for security purposes. In fact, NVFlare's privacy and homomorphic encryption techniques are all implemented as filters:

- `ExcludeVars` to exclude variables from the shareable (`nvflare.app_common.filters.exclude_vars`)
- `PercentilePrivacy` for truncation of weights by percentile (`nvflare.app_common.filters.percentile_privacy`)
- `SVTPrivacy` for differential privacy through sparse vector techniques (`nvflare.app_common.filters.svt_privacy`)
- Homomorphic encryption filters to encrypt data before sharing (`nvflare.app_common.homomorphic_encryption.he_model_encryptor` and `nvflare.app_common.homomorphic_encryption.he_model_decryptor`)
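
To make the idea of excluding variables concrete, here is a minimal sketch in plain Python. It is not the `ExcludeVars` implementation, and the built-in filter may behave differently. The `exclude_vars` function, the variable names, and the regular expression are hypothetical, and lists of floats stand in for weight arrays.

```python
import re
from typing import Dict, List


def exclude_vars(weights: Dict[str, List[float]], pattern: str) -> Dict[str, List[float]]:
    """Return a copy of `weights` without the entries whose name matches `pattern`."""
    regex = re.compile(pattern)
    return {name: values for name, values in weights.items() if not regex.search(name)}


# Hypothetical model update: the classifier head should stay local.
model_update = {
    "encoder.weight": [0.1, -0.2],
    "encoder.bias": [0.0],
    "classifier.weight": [0.3, 0.4],
}
shared = exclude_vars(model_update, r"^classifier\.")
print(sorted(shared))
```

Only the encoder variables remain in `shared`; the original `model_update` is left untouched.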

#### Adding a Filter with the FedJob API
You can add `Filters` to an NVFlare job using the `job.to()` method by specifying which tasks the filter applies to and when to apply it, **before** or **after** the task.

The behavior can be selected by using the [FilterType](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.job_config.defs.html#nvflare.job_config.defs.FilterType). Users must specify the filter type as either `FilterType.TASK_RESULT` (flow from executor to controller) or `FilterType.TASK_DATA` (flow from controller to executor).

The filter is added to `task_data_filters` or `task_result_filters` accordingly and applied to the specified tasks (defaults to “[*]”, meaning all tasks).

For example, you can add a privacy filter as follows:
```python
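from nvflare import FilterType
from nvflare.app_common.filters.percentile_privacy import PercentilePrivacy

# `job` is an existing FedJob; the filter runs on site-1's "train" results.
pp_filter = PercentilePrivacy(percentile=10, gamma=0.01)
job.to(pp_filter, "site-1", tasks=["train"], filter_type=FilterType.TASK_RESULT)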
```
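
To give an intuition for what percentile-based truncation does to a model update, here is a simplified sketch in plain Python. It is not the `PercentilePrivacy` implementation: the `percentile_truncate` function, the nearest-rank percentile rule, and the exact roles of `percentile` and `gamma` are assumptions made for illustration.

```python
import math
from typing import List


def percentile_truncate(diffs: List[float], percentile: float, gamma: float) -> List[float]:
    """Zero out small-magnitude entries and clip the rest to [-gamma, gamma]."""
    if not diffs:
        return []
    magnitudes = sorted(abs(d) for d in diffs)
    # Nearest-rank percentile of the absolute values (an illustrative choice).
    rank = max(1, math.ceil(percentile / 100 * len(magnitudes)))
    cutoff = magnitudes[rank - 1]
    truncated = []
    for d in diffs:
        if abs(d) < cutoff:
            # Entries below the cutoff are not shared.
            truncated.append(0.0)
        else:
            # Remaining entries are clipped so no single value exceeds gamma.
            truncated.append(max(-gamma, min(gamma, d)))
    return truncated


weight_diff = [0.001, -0.5, 0.02, -0.003, 0.2]
print(percentile_truncate(weight_diff, percentile=50, gamma=0.1))
```

In this sketch, a higher `percentile` shares fewer entries and a smaller `gamma` bounds how much any shared entry can reveal.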

#### Enforcing Filters
Data owners can enforce filters on every job they execute, which lets them keep control over privacy and compliance regardless of how a job is configured. This is useful for several reasons:

- **Consistent Privacy Protection:** Ensures that every model update follows predefined privacy policies, reducing the risk of accidental data leakage.  
- **Regulatory Compliance:** Helps meet legal and ethical standards (e.g., HIPAA, GDPR) by enforcing data anonymization or masking sensitive information.  
- **Defense Against Emerging Threats:** Provides a safeguard against evolving attack techniques, such as model inversion and membership inference, and allows incoming model weights to be checked for malicious content.
- **Customization for Sensitive Data:** Allows data owners to tailor privacy mechanisms to their specific data types, ensuring that only necessary information is shared.  
- **Trust and Collaboration:** Encourages participation in Federated Learning by reassuring institutions that their data remains secure throughout the process.  

By enforcing privacy filters in NVFlare, data owners can ensure a reliable and secure FL environment without relying solely on external safeguards. For more details, see the [documentation](https://nvflare.readthedocs.io/en/main/user_guide/security/site_policy_management.html#privacy-management).
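
As a rough illustration, a site-level privacy policy could be drafted in Python and written out as JSON. The layout below (`scopes`, `default_scope`, and filter entries with `path` and `args`) and the scope name `public` are assumptions; check the linked documentation for the exact schema and file location before using it.

```python
import json

# Assumed layout of a site-level privacy policy; verify against the linked docs.
privacy_policy = {
    "scopes": [
        {
            "name": "public",  # hypothetical scope name
            "task_result_filters": [
                {
                    # Class path taken from the module listed earlier in this section.
                    "path": "nvflare.app_common.filters.percentile_privacy.PercentilePrivacy",
                    "args": {"percentile": 10, "gamma": 0.01},
                }
            ],
        }
    ],
    "default_scope": "public",
}

policy_json = json.dumps(privacy_policy, indent=2)
print(policy_json)
```

The point of such a policy is that the listed filters apply to jobs running on the site whether or not the job itself configures them.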

#### Writing Your Own Filter
To write your own filter, you can use the [DXOFilter](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.apis.dxo_filter.html#nvflare.apis.dxo_filter.DXOFilter) base class. For details, see the [documentation](https://nvflare.readthedocs.io/en/main/programming_guide/filters.html).

First, we define a simple `Filter` that prints the message content without modifying it. All that's needed is to write a `process_dxo()` method. The constructor specifies the data kinds the filter should be applied to; in this case, the filter can process both `WEIGHTS` and `WEIGHT_DIFF`.

```python
from nvflare.apis.dxo import DXO, DataKind
from nvflare.apis.dxo_filter import DXOFilter
from nvflare.apis.fl_context import FLContext
from nvflare.apis.shareable import Shareable


class DummyFilter(DXOFilter):
    def __init__(self):
        data_kinds = [DataKind.WEIGHTS, DataKind.WEIGHT_DIFF]
        super().__init__(supported_data_kinds=data_kinds, data_kinds_to_filter=data_kinds)

    def process_dxo(self, dxo: DXO, shareable: Shareable, fl_ctx: FLContext):
        self.log_info(fl_ctx, f"Filtering DXO: {dxo}")

        return dxo
```
To package this code as part of the NVFlare job, we include this class as a custom file [dummy_filter.py](src/dummy_filter.py).

Now you can test this filter using a simple NVFlare job script. For this, we use a numpy controller and executors.

```python
from nvflare.app_common.workflows.fedavg import FedAvg
from nvflare.app_opt.pt.job_config.base_fed_job import BaseFedJob
from nvflare.job_config.script_runner import ScriptRunner
from nvflare import FilterType

job = BaseFedJob(
    name="dummy_filter",
)
```

Using the `job.to()` method, we can now add a filter to each client that is applied before the message is sent back to the server.

> **Note:** Use `filter_type=FilterType.TASK_RESULT` as we add the filter on top of the model after the `Executor` (in this case `ScriptRunner`) has completed the task.

```python
from src.dummy_filter import DummyFilter

n_clients = 1

controller = FedAvg(
    num_clients=n_clients,
    num_rounds=1,
)
job.to(controller, "server")

for i in range(n_clients):
    runner = ScriptRunner(script="src/dummy_script.py")
    job.to(runner, f"site-{i+1}")

    # Add the dummy filter to the results of the "train" task.
    job.to(DummyFilter(), f"site-{i+1}", tasks=["train"], filter_type=FilterType.TASK_RESULT)
```

Now, we can simply run the job with the simulator.

```python
job.simulator_run("/tmp/nvflare/dummy_output")
```
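
A filter that changes the data, rather than only logging it, follows the same pattern. The sketch below shows, in plain Python, the kind of transformation a `process_dxo()` method could apply to the weights carried by the DXO (assumed here to be available as `dxo.data`). The `add_gaussian_noise` function and its parameters are hypothetical, lists of floats stand in for weight arrays, and the noise is not calibrated to any formal privacy guarantee.

```python
import random
from typing import Dict, List


def add_gaussian_noise(
    weights: Dict[str, List[float]], sigma: float, seed: int = 0
) -> Dict[str, List[float]]:
    """Return a noisy copy of `weights`; the input is left unchanged."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return {
        name: [value + rng.gauss(0.0, sigma) for value in values]
        for name, values in weights.items()
    }


# Hypothetical model update with one weight vector and one bias.
update = {"layer.weight": [0.1, 0.2, 0.3], "layer.bias": [0.0]}
noisy = add_gaussian_noise(update, sigma=0.01)
print(noisy)
```

Returning a new dictionary instead of modifying the input in place keeps the original update available, for example for logging or comparison.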

Next, we'll learn how to use `Filters` and other techniques to introduce [Differential Privacy (DP)](../05.2_differential_privacy/privacy_with_differential_privacy.ipynb) into your model training with NVFlare.