# Calculate Image Histogram

In this example, we will compute local (per-site) and global image statistics.

## Setup NVFLARE

Follow [Getting Started](https://nvflare.readthedocs.io/en/main/getting_started.html) to set up a virtual environment and install NVFLARE.


## Install requirements

In [None]:
%pip install -r image_stats/requirements.txt

## Download data

As an example, we use the dataset from the ["COVID-19 Radiography Database"](https://www.kaggle.com/tawsifurrahman/covid19-radiography-database).
It contains PNG image files in four different classes: `COVID`, `Lung_Opacity`, `Normal`, and `Viral Pneumonia`.
First create a temporary directory, then download and extract the data to `/tmp/nvflare/image_stats/data/`.

In [None]:
! python download_data.py

Once you have extracted the data from the zip file, you can check the directory:

In [None]:
! ls -al /tmp/nvflare/image_stats/data/*


## Prepare data

Next, create the data lists simulating different clients with varying amounts and types of images.
The downloaded archive contains subfolders for four different classes: `COVID`, `Lung_Opacity`, `Normal`, and `Viral Pneumonia`.
Here we assume each class of image corresponds to a different site.

```
from prepare_data import prepare_data

prepare_data(
    input_dir="/tmp/nvflare/image_stats/data",
    input_ext=".png",
    output_dir="/tmp/nvflare/image_stats/data",
)
```

In [None]:

# or simply 
! python prepare_data.py 
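
To illustrate the "one class per site" assumption, here is a minimal sketch of how class subfolders could be turned into per-site file lists. It is not the actual `prepare_data.py`: the helper name `build_site_lists`, the `site-N` naming, and the alphabetical class-to-site order are all assumptions made for illustration.

```python
import os
from typing import Dict, List


def build_site_lists(input_dir: str, input_ext: str = ".png") -> Dict[str, List[str]]:
    """Hypothetical helper: assign each class subfolder to one simulated site."""
    # Each immediate subfolder of input_dir is treated as one image class.
    class_dirs = sorted(
        d for d in os.listdir(input_dir) if os.path.isdir(os.path.join(input_dir, d))
    )
    site_lists: Dict[str, List[str]] = {}
    for i, class_name in enumerate(class_dirs):
        files: List[str] = []
        # Collect every file with the requested extension below the class folder.
        for root, _, names in os.walk(os.path.join(input_dir, class_name)):
            for name in names:
                if name.lower().endswith(input_ext):
                    files.append(os.path.join(root, name))
        site_lists[f"site-{i + 1}"] = sorted(files)
    return site_lists
```

Calling `build_site_lists("/tmp/nvflare/image_stats/data")` would return one list of image paths per class folder, keyed by a site name.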



## Run Job

**Run Job with Job Recipe in Simulated Env.**



In [None]:
from client import ImageStatistics

from nvflare.recipe import SimEnv
from nvflare.recipe.fedstats import FedStatsRecipe

data_root_dir = "/tmp/nvflare/image_stats/data"
n_clients = 3
output_path = "statistics/image_statistics.json"


statistic_configs = {"count": {}, "histogram": {"*": {"bins": 20, "range": [0, 256]}}}
# define local stats generator
stats_generator = ImageStatistics(data_root_dir)

sites = [f"site-{i + 1}" for i in range(n_clients)]
recipe = FedStatsRecipe(
    name="stats_image",
    stats_output_path=output_path,
    sites=sites,
    statistic_configs=statistic_configs,
    stats_generator=stats_generator,
    min_count=10,
)

env = SimEnv(clients=sites, num_threads=n_clients)
recipe.execute(env=env)


The results are stored in the simulation workspace under `/tmp/nvflare/simulation/stats_image`:

In [None]:
! ls -al  /tmp/nvflare/simulation/stats_image/server/simulate_job/statistics/image_statistics.json
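
The `histogram` configuration in this job uses 20 bins over the range `[0, 256]` for every site. Because all sites share the same bin edges, per-site histograms can be combined into a global histogram by summing the counts bin by bin. The sketch below shows that idea in plain Python. It is not NVFLARE's internal implementation, and the function names `local_histogram` and `global_histogram` are hypothetical.

```python
from typing import List, Sequence

N_BINS = 20          # matches "bins": 20 in statistic_configs
LOW, HIGH = 0, 256   # matches "range": [0, 256] in statistic_configs


def local_histogram(
    pixels: Sequence[int], n_bins: int = N_BINS, low: int = LOW, high: int = HIGH
) -> List[int]:
    """Count pixel values into equal-width bins over [low, high]."""
    counts = [0] * n_bins
    for value in pixels:
        if value < low or value > high:
            continue  # values outside the configured range are ignored
        # Integer arithmetic avoids floating-point rounding at bin edges.
        index = int((value - low) * n_bins // (high - low))
        counts[min(index, n_bins - 1)] += 1  # value == high goes to the last bin
    return counts


def global_histogram(local_counts: Sequence[Sequence[int]]) -> List[int]:
    """Sum per-site histograms that share identical bin edges."""
    return [sum(bin_counts) for bin_counts in zip(*local_counts)]


# Two toy "sites" with a handful of pixel intensities each.
site_a = local_histogram([0, 12, 13, 255])
site_b = local_histogram([0, 100, 255])
combined = global_histogram([site_a, site_b])
```

Only the bin counts leave each site in this sketch, never the pixel values themselves, which is what makes a fixed, shared bin layout convenient in a federated setting.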

## Visualization
We can easily visualize the results with the visualization notebook. Before we do that, we need to copy the results file to the notebook directory.


In [None]:
! cp  /tmp/nvflare/simulation/stats_image/server/simulate_job/statistics/image_statistics.json image_stats/demo/.

Now we can visualize the results via the [visualization notebook](demo/visualization.ipynb).
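
For a quick look at the copied results without opening the notebook, the following sketch lists the nested keys of the JSON file. It makes no assumption about the file's schema. The helper name `describe` is hypothetical, and the path is the copy destination used in the cell above.

```python
import json
import os
from typing import Any, List


def describe(node: Any, prefix: str = "", max_depth: int = 3) -> List[str]:
    """Return one 'path: type' line per leaf, descending at most max_depth levels."""
    lines: List[str] = []
    if isinstance(node, dict) and max_depth > 0:
        for key, value in node.items():
            path = f"{prefix}/{key}"
            if isinstance(value, dict):
                lines.extend(describe(value, path, max_depth - 1))
            else:
                lines.append(f"{path}: {type(value).__name__}")
    else:
        # Either a non-dict value or the depth limit was reached.
        lines.append(f"{prefix or '/'}: {type(node).__name__}")
    return lines


# Path taken from the copy step above; nothing is printed if the file is absent.
results_path = "image_stats/demo/image_statistics.json"
if os.path.exists(results_path):
    with open(results_path) as f:
        for line in describe(json.load(f)):
            print(line)
```

Raising `max_depth` shows deeper levels of the structure at the cost of a longer listing.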


## We are done!
Congratulations, you have just completed the federated statistics image histogram calculation.
