Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
MXNet example as a plugin to OpenFL (#349)
* example how to add framework support added mxnet tutorial and adapter
* add readme
* Remove 'os' library from the shared descriptor
* Minor change
* changed the way set and get optimizer state in mxnet_adapter edited README add cuda monitor plugin edited list libraries in sd_requirements and requirements edited shard_descriptor
* edited README edited list libraries requirements Minor fixes shard descriptor and mxnet adapter
* shard descriptor minor changes
* Minor changes
* Update openfl-tutorials/interactive_api/MXNet_landmarks/README.md

Co-authored-by: Igor Davidyuk <igor.davidyuk@intel.com>
1 parent
156d0ca
commit df22c83
Showing 11 changed files with 1,025 additions and 0 deletions.
122 changes: 122 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
# MXNet Facial Keypoints Detection tutorial | ||
--- | ||
**Note:** | ||
|
||
This task uses a dataset from Kaggle. To get the dataset you | ||
will need a Kaggle account and must accept the "Facial Keypoints Detection" [competition rules](https://www.kaggle.com/c/facial-keypoints-detection/rules). | ||
|
||
--- | ||
|
||
This tutorial shows how to use OpenFL with a framework other than the already supported PyTorch and TensorFlow, using MXNet as the example. | ||
|
||
## Installation of Kaggle API credentials | ||
|
||
**Before you start, make sure that you have installed the packages from `sd_requirements.txt` in the virtual | ||
environment on each envoy machine.** | ||
|
||
To use the [Kaggle API](https://github.com/Kaggle/kaggle-api), sign up for | ||
a [Kaggle account](https://www.kaggle.com). Then go to the `'Account'` tab of your user | ||
profile `(https://www.kaggle.com/<username>/account)` and select `'Create API Token'`. This will | ||
trigger the download of `kaggle.json`, a file containing your API credentials. Place this file in | ||
the location `~/.kaggle/kaggle.json` | ||
|
||
For your security, ensure that other users of your computer do not have read access to your | ||
credentials. On Unix-based systems you can do this with the following command: | ||
|
||
`chmod 600 ~/.kaggle/kaggle.json` | ||
|
||
If you need a proxy, add `"proxy": "http://<ip_addr:port>"` to `kaggle.json`. It should look like | ||
this: `{"username":"your_username","key":"token", "proxy": "http://<ip_addr:port>"}` | ||
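The credentials file described above can also be written from Python. This is a minimal sketch, not part of the Kaggle API: `write_kaggle_credentials` is a hypothetical helper, the user name, token, and proxy values are placeholders, and the demo writes to a temporary directory rather than to `~/.kaggle/kaggle.json`.

```python
import json
import stat
import tempfile
from pathlib import Path
from typing import Optional


def write_kaggle_credentials(path: Path, username: str, key: str,
                             proxy: Optional[str] = None) -> dict:
    """Write Kaggle credentials to `path` and restrict access to the owner."""
    credentials = {'username': username, 'key': key}
    if proxy is not None:
        # The proxy entry is optional; add it only when one is needed
        credentials['proxy'] = proxy
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(credentials))
    # Same intent as `chmod 600` on Unix-based systems
    path.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return credentials


# Demo in a temporary directory; the real file belongs at ~/.kaggle/kaggle.json
demo_path = Path(tempfile.mkdtemp()) / '.kaggle' / 'kaggle.json'
written = write_kaggle_credentials(
    demo_path, 'your_username', 'token', proxy='http://<ip_addr:port>')
print(demo_path.read_text())
```

Passing `proxy=None` leaves the `proxy` key out of the file entirely.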
|
||
*Information about Kaggle API settings has been taken from the kaggle-api [readme](https://github.com/Kaggle/kaggle-api).* | ||
|
||
*Useful [link](https://github.com/Kaggle/kaggle-api/issues/6) for a problem with proxy settings.* | ||
|
||
### 1. About dataset | ||
|
||
All information about the dataset can be found | ||
on the [competition data page](https://www.kaggle.com/c/facial-keypoints-detection/data). | ||
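Each envoy in this tutorial serves only its own shard of this dataset, selected by the `rank_worldsize` parameter in its envoy config (`1, 2` and `2, 2` here). The following is a minimal sketch of that split, assuming the strided slicing used by the tutorial's shard descriptor. `parse_rank_worldsize`, `take_shard`, and the file names are illustrative helpers, not part of OpenFL.

```python
from typing import List, Tuple


def parse_rank_worldsize(rank_worldsize: str) -> Tuple[int, int]:
    """Turn a config string such as '1, 2' into (rank, worldsize)."""
    rank, worldsize = map(int, rank_worldsize.split(','))
    return rank, worldsize


def take_shard(items: List[str], rank: int, worldsize: int) -> List[str]:
    """Keep every `worldsize`-th item, starting at position `rank - 1`."""
    return items[rank - 1::worldsize]


# Hypothetical file names standing in for the converted dataset files
file_names = [f'img_{i}.npy' for i in range(7)]
shard_one = take_shard(file_names, *parse_rank_worldsize('1, 2'))
shard_two = take_shard(file_names, *parse_rank_worldsize('2, 2'))
print(shard_one)
print(shard_two)
```

With this scheme the shards are disjoint and together cover the whole list, provided every envoy sees the files in the same order.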
|
||
### 2. Adding support for a third-party framework | ||
|
||
You need to write your own adapter class which is based on `FrameworkAdapterPluginInterface` [class](https://github.com/intel/openfl/blob/develop/openfl/plugins/frameworks_adapters/framework_adapter_interface.py). This class should contain at least two methods: | ||
|
||
- `get_tensor_dict(model, optimizer=None)` - extracts tensor dict from a model and optionally[^1] an optimizer. The resulting tensors must be converted to **dict{str: numpy.array}** for forwarding and aggregation. | ||
|
||
- `set_tensor_dict(model, tensor_dict, optimizer=None, device=None)` - loads aggregated numpy arrays into the model, or into the model and the optimizer. It receives `tensor_dict` as **dict{str: numpy.array}**, converts the arrays into tensors suitable for your model (and optimizer), and then loads the prepared parameters into them. | ||
|
||
Your adapter should be placed in the workspace directory. When you create a `ModelInterface` object in the `'***.ipynb'` notebook, pass the name of your adapter as the `framework_plugin` parameter. Example: | ||
```py | ||
framework_adapter = 'mxnet_adapter.FrameworkAdapterPlugin' | ||
|
||
MI = ModelInterface(model=model, optimizer=optimizer, | ||
                    framework_plugin=framework_adapter) | ||
``` | ||
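To make the two adapter methods concrete, here is a minimal sketch on a toy model. It is not the MXNet adapter from this commit: `ToyModel` and `ToyFrameworkAdapter` are hypothetical names, the sketch ignores the optimizer, and it does not import OpenFL or MXNet. A real adapter would derive from `FrameworkAdapterPluginInterface` and convert between numpy arrays and the framework's own tensors.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for a framework model: named parameters in a dict."""

    def __init__(self):
        self.params = {'weight': np.zeros((2, 2), dtype='float32'),
                       'bias': np.zeros(2, dtype='float32')}


class ToyFrameworkAdapter:
    """Sketch of the two methods an adapter is expected to provide."""

    @staticmethod
    def get_tensor_dict(model, optimizer=None):
        # Return dict{str: numpy.array}; copy so later training does not alter it
        return {name: np.array(value, copy=True)
                for name, value in model.params.items()}

    @staticmethod
    def set_tensor_dict(model, tensor_dict, optimizer=None, device=None):
        # Convert each incoming array to the model's dtype and load it
        for name, value in tensor_dict.items():
            model.params[name] = np.asarray(value, dtype=model.params[name].dtype)


model = ToyModel()
tensor_dict = ToyFrameworkAdapter.get_tensor_dict(model)
# Pretend the aggregator returned updated values
aggregated = {name: value + 1.0 for name, value in tensor_dict.items()}
ToyFrameworkAdapter.set_tensor_dict(model, aggregated)
print(model.params['bias'])
```

The copy in `get_tensor_dict` is a design choice of this sketch: it keeps the extracted arrays independent of the model's live parameters.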
|
||
[^1]: Whether or not to forward the optimizer parameters is set in the `start` method (FLExperiment [class](https://github.com/intel/openfl/blob/develop/openfl/interface/interactive_api/experiment.py) object, parameter `opt_treatment`). | ||
|
||
### Run experiment | ||
|
||
1. Create a folder for each `envoy`. | ||
2. Put a relevant envoy_config in each of the n folders (n is the number of envoys you would like | ||
to use; this tutorial uses two, but you may use any number of envoys) and copy | ||
the other files from the `envoy` folder there as well. | ||
3. Modify each `envoy` accordingly: | ||
|
||
- In `start_envoy.sh`, change env_one to env_two (or any unique `envoy` names you like) | ||
|
||
- Put a relevant envoy_config `envoy_config_one.yaml` or `envoy_config_two.yaml` (or any other | ||
config file name consistent with the configuration file that is called in `start_envoy.sh`). | ||
4. Make sure that you have installed the requirements for each `envoy` in your virtual | ||
environment: `pip install -r sd_requirements.txt` | ||
5. Run the `director`: | ||
```sh | ||
cd director_folder | ||
./start_director.sh | ||
``` | ||
|
||
6. Run the `envoys`: | ||
```sh | ||
cd envoy_folder | ||
./start_envoy.sh env_one envoy_config_one.yaml | ||
``` | ||
If the Kaggle API settings are | ||
correct, the dataset download will start. If this is not the first `envoy` launch, | ||
the dataset will be redownloaded only if some part of the data is missing or damaged. | ||
|
||
7. Run the [MXNet_landmarks.ipynb](workspace/MXNet_landmarks.ipynb) notebook using | ||
Jupyter lab in a prepared virtual environment. For more information about preparing the virtual | ||
environment, see **[Preparation virtual environment](#preparation-virtual-environment)**. | ||
|
||
* Install [MXNet 1.9.0](https://pypi.org/project/mxnet/1.9.0/) framework with CPU or GPU (preferred) support and [verify](https://mxnet.apache.org/versions/1.4.1/install/validate_mxnet.html) it: | ||
```bash | ||
pip install mxnet-cuXXX==1.9.0 | ||
``` | ||
|
||
* Run jupyter-lab: | ||
```bash | ||
cd workspace | ||
jupyter-lab | ||
``` | ||
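Step 6 says that the dataset is downloaded again only when data is missing or damaged. The shard descriptor in this tutorial decides this by comparing MD5 hashes of the `*.npy` files against a saved `dataset.json` manifest. The following is a minimal standalone sketch of that idea, using only the standard library. `calc_md5` and `is_complete` are illustrative names, and the files are dummy bytes in a temporary folder rather than the real dataset.

```python
import json
import tempfile
from hashlib import md5
from pathlib import Path


def calc_md5(folder: Path, pattern: str = '*.npy') -> dict:
    """Map each matching file name to the MD5 hash of its contents."""
    hashes = {}
    for file_path in sorted(folder.glob(pattern)):
        digest = md5()
        with open(file_path, 'rb') as f:
            # Read in chunks so large files do not need to fit in memory
            for chunk in iter(lambda: f.read(4096), b''):
                digest.update(chunk)
        hashes[file_path.name] = digest.hexdigest()
    return hashes


def is_complete(folder: Path, manifest_name: str = 'dataset.json') -> bool:
    """Return True only if a manifest exists and matches the current files."""
    manifest = folder / manifest_name
    if not manifest.exists():
        return False
    return json.loads(manifest.read_text()) == calc_md5(folder)


folder = Path(tempfile.mkdtemp())
(folder / 'img_0.npy').write_bytes(b'pixels')
(folder / 'keypoints_0.npy').write_bytes(b'points')
before_manifest = is_complete(folder)   # no manifest saved yet
(folder / 'dataset.json').write_text(json.dumps(calc_md5(folder)))
after_manifest = is_complete(folder)    # manifest matches the files
(folder / 'keypoints_0.npy').unlink()
after_removal = is_complete(folder)     # one file is now missing
print(before_manifest, after_manifest, after_removal)
```

A changed file fails the check in the same way as a missing one, because its hash no longer matches the manifest entry.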
|
||
### Preparation virtual environment | ||
|
||
* Create virtual environment | ||
|
||
```sh | ||
python3 -m venv venv | ||
``` | ||
|
||
* To activate virtual environment | ||
|
||
```sh | ||
source venv/bin/activate | ||
``` | ||
|
||
* To deactivate virtual environment | ||
|
||
```sh | ||
deactivate | ||
``` |
5 changes: 5 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/director/director_config.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
settings: | ||
  listen_host: localhost | ||
  listen_port: 50051 | ||
  sample_shape: ['96', '96'] | ||
  target_shape: ['1'] |
4 changes: 4 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/director/start_director.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
fx director start --disable-tls -c director_config.yaml |
12 changes: 12 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/envoy/envoy_config_one.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
params: | ||
  cuda_devices: [0] | ||
|
||
optional_plugin_components: | ||
  cuda_device_monitor: | ||
    template: openfl.plugins.processing_units_monitor.pynvml_monitor.PynvmlCUDADeviceMonitor | ||
    settings: [] | ||
|
||
shard_descriptor: | ||
  template: landmark_shard_descriptor.LandmarkShardDescriptor | ||
  params: | ||
    rank_worldsize: 1, 2 |
12 changes: 12 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/envoy/envoy_config_two.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
params: | ||
  cuda_devices: [1] | ||
|
||
optional_plugin_components: | ||
  cuda_device_monitor: | ||
    template: openfl.plugins.processing_units_monitor.pynvml_monitor.PynvmlCUDADeviceMonitor | ||
    settings: [] | ||
|
||
shard_descriptor: | ||
  template: landmark_shard_descriptor.LandmarkShardDescriptor | ||
  params: | ||
    rank_worldsize: 2, 2 |
170 changes: 170 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/envoy/landmark_shard_descriptor.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
# Copyright (C) 2021-2022 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
"""Landmarks Shard Descriptor.""" | ||
|
||
import json | ||
import shutil | ||
from hashlib import md5 | ||
from logging import getLogger | ||
from pathlib import Path | ||
from random import shuffle | ||
from typing import Dict | ||
from typing import List | ||
from zipfile import ZipFile | ||
|
||
import numpy as np | ||
import pandas as pd | ||
from kaggle.api.kaggle_api_extended import KaggleApi | ||
|
||
from openfl.interface.interactive_api.shard_descriptor import ShardDataset | ||
from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor | ||
|
||
logger = getLogger(__name__) | ||
|
||
|
||
class LandmarkShardDataset(ShardDataset): | ||
    """Landmark Shard dataset class.""" | ||
|
||
    def __init__(self, dataset_dir: Path, | ||
                 rank: int = 1, worldsize: int = 1) -> None: | ||
        """Initialize LandmarkShardDataset.""" | ||
        self.rank = rank | ||
        self.worldsize = worldsize | ||
        self.dataset_dir = dataset_dir | ||
        self.img_names = list(self.dataset_dir.glob('img_*.npy')) | ||
|
||
        # Sharding | ||
        self.img_names = self.img_names[self.rank - 1::self.worldsize] | ||
        # Shuffle the samples of this shard | ||
        shuffle(self.img_names) | ||
|
||
    def __getitem__(self, index) -> tuple: | ||
        """Return an (image, key points) pair by the index.""" | ||
        # Get the name of the key points file, | ||
        # e.g. image name: 'img_123.npy', corresponding key points name: 'keypoints_123.npy' | ||
        # Only the file name is rewritten, so an 'img' substring in the folder path is left alone | ||
        kp_name = self.img_names[index].name.replace('img', 'keypoints') | ||
        return np.load(self.img_names[index]), np.load(self.dataset_dir / kp_name) | ||
|
||
    def __len__(self) -> int: | ||
        """Return the len of the dataset.""" | ||
        return len(self.img_names) | ||
|
||
|
||
class LandmarkShardDescriptor(ShardDescriptor): | ||
    """Landmark Shard descriptor class.""" | ||
|
||
    def __init__(self, data_folder: str = 'data', | ||
                 rank_worldsize: str = '1, 1', | ||
                 **kwargs) -> None: | ||
        """Initialize LandmarkShardDescriptor.""" | ||
        super().__init__() | ||
        # Settings for sharding the dataset | ||
        self.rank, self.worldsize = map(int, rank_worldsize.split(',')) | ||
|
||
        self.data_folder = Path.cwd() / data_folder | ||
        self.download_data() | ||
|
||
        # Calculating data and target shapes | ||
        ds = self.get_dataset() | ||
        sample, target = ds[0] | ||
        self._sample_shape = [str(dim) for dim in sample.shape] | ||
        self._target_shape = [str(len(target.shape))] | ||
|
||
        if self._target_shape[0] != '1': | ||
            raise ValueError('Target has a wrong shape') | ||
|
||
    def process_data(self, name_csv_file) -> None: | ||
        """Process data from csv to numpy format and save it in the same folder.""" | ||
        data_df = pd.read_csv(self.data_folder / name_csv_file) | ||
        # Forward-fill missing key points (same behavior as fillna(method='ffill')) | ||
        data_df.ffill(inplace=True) | ||
        keypoints = data_df.drop('Image', axis=1) | ||
        cur_folder = self.data_folder.relative_to(Path.cwd()) | ||
|
||
        for i in range(data_df.shape[0]): | ||
            img = data_df['Image'][i].split(' ') | ||
            img = np.array(['0' if x == '' else x for x in img], dtype='float32').reshape(96, 96) | ||
            np.save(str(cur_folder / f'img_{i}.npy'), img) | ||
            y = np.array(keypoints.iloc[i, :], dtype='float32') | ||
            np.save(str(cur_folder / f'keypoints_{i}.npy'), y) | ||
|
||
    def download_data(self) -> None: | ||
        """Download dataset from Kaggle.""" | ||
        if self.is_dataset_complete(): | ||
            return | ||
|
||
        logger.info('Your dataset is absent or damaged. Downloading ... ') | ||
        api = KaggleApi() | ||
        api.authenticate() | ||
|
||
        # Remove any stale or partial copy of the dataset, then recreate the folder | ||
        if self.data_folder.exists(): | ||
            shutil.rmtree(self.data_folder) | ||
        self.data_folder.mkdir(parents=True, exist_ok=True) | ||
|
||
        api.competition_download_file( | ||
            'facial-keypoints-detection', | ||
            'training.zip', path=self.data_folder | ||
        ) | ||
|
||
        with ZipFile(self.data_folder / 'training.zip', 'r') as zipobj: | ||
            zipobj.extractall(self.data_folder) | ||
|
||
        (self.data_folder / 'training.zip').unlink() | ||
|
||
        self.process_data('training.csv') | ||
        (self.data_folder / 'training.csv').unlink() | ||
        self.save_all_md5() | ||
|
||
    def get_dataset(self, dataset_type='train') -> LandmarkShardDataset: | ||
        """Return a shard dataset by type.""" | ||
        return LandmarkShardDataset( | ||
            dataset_dir=self.data_folder, | ||
            rank=self.rank, | ||
            worldsize=self.worldsize | ||
        ) | ||
|
||
    def calc_all_md5(self) -> Dict[str, str]: | ||
        """Calculate hash of all dataset.""" | ||
        md5_dict = {} | ||
        for root in self.data_folder.glob('*.npy'): | ||
            md5_calc = md5() | ||
            rel_file = root.relative_to(self.data_folder) | ||
|
||
            with open(self.data_folder / rel_file, 'rb') as f: | ||
                for chunk in iter(lambda: f.read(4096), b''): | ||
                    md5_calc.update(chunk) | ||
                md5_dict[str(rel_file)] = md5_calc.hexdigest() | ||
        return md5_dict | ||
|
||
    def save_all_md5(self) -> None: | ||
        """Save dataset hash.""" | ||
        all_md5 = self.calc_all_md5() | ||
        with open(self.data_folder / 'dataset.json', 'w') as f: | ||
            json.dump(all_md5, f) | ||
|
||
    def is_dataset_complete(self) -> bool: | ||
        """Check dataset integrity.""" | ||
        dataset_md5_path = self.data_folder / 'dataset.json' | ||
        if dataset_md5_path.exists(): | ||
            with open(dataset_md5_path, 'r') as f: | ||
                old_md5 = json.load(f) | ||
            new_md5 = self.calc_all_md5() | ||
            return new_md5 == old_md5 | ||
        return False | ||
|
||
    @property | ||
    def sample_shape(self) -> List[str]: | ||
        """Return the sample shape info.""" | ||
        return self._sample_shape | ||
|
||
    @property | ||
    def target_shape(self) -> List[str]: | ||
        """Return the target shape info.""" | ||
        return self._target_shape | ||
|
||
    @property | ||
    def dataset_description(self) -> str: | ||
        """Return the dataset description.""" | ||
        return (f'Facial Keypoints Detection dataset, shard number {self.rank} ' | ||
                f'out of {self.worldsize}') |
2 changes: 2 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/envoy/sd_requirements.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
pynvml | ||
kaggle |
6 changes: 6 additions & 0 deletions
openfl-tutorials/interactive_api/MXNet_landmarks/envoy/start_envoy.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/bin/bash | ||
set -e | ||
ENVOY_NAME=$1 | ||
SHARD_CONF=$2 | ||
|
||
fx envoy start -n "$ENVOY_NAME" --disable-tls --envoy-config-path "$SHARD_CONF" -dh localhost -dp 50051 |