3D U-Net TensorFlow workspace for BraTS 2020 (#108)

* 3D U-Net for BraTS
* Remove
* Update moving UNet
* Refactoring define_model
* Working copy of 3D U-Net for BraTS
* Update data.yaml
* Update tf_3dunet_model.py
* Update define_model.py
* Crop dimension, input and output dimensions
* Update plan.yaml
* Bug fix for num_classes
* Update README.md
* Adding split_directory script
* Update define_model.py
* Update split_directory.sh
* Update README.md
* Update split_directory.sh
* More parameters passed to model definition from plan.yaml
* More parameters passed to model definition from plan.yaml
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update plan
* Updating check
* Updating check
* Bug fix in define_model. Changed some defaults
* Update cols.yaml
* Update data.yaml
* Update plan.yaml
* Modifiying data directory
* Update tasks_fast_estimator.yaml
* Update tf_brats_dataloader.py
* Update plan.yaml
* Possible bug with db_stored_rounds. If it is 1, then it causes error on second round because graph cannot be found. Also probably want to check the input to make sure it is within a good range (e.g. max(2, db_stored_rounds) )
* Update tf_3dunet_model.py
* Update __init__.py
* Update tf_brats_dataloader.py
* _init__.py updated
* _init__.py updated
* Trying to get __init__ to work
* Missing files
* Update test_hello_federation.sh
* Update runner_keras.py
* Update runner_keras.py
* Flake8 fixes
* Update cert check on collaborator
* Flake8 fixes
* Flake8
* Requested changes: tensorflow->tf; add check for existing directory; self.numFiles->self.numfiles
* Adding error exception if no BraTS files are found.
* Adding error exception if no BraTS files are found.
* Bash shell correction for shard directories.
* Update test_hello_federation.sh

Co-authored-by: Ilya Trushkin <ilya.trushkin@intel.com>
securefederatedai · Aug 5, 2021 · e5bf981
1 parent c261d26
commit e5bf981

Showing 26 changed files with 1,243 additions and 7 deletions.
diff --git a/openfl-workspace/tf_3dunet_brats/.workspace b/openfl-workspace/tf_3dunet_brats/.workspace
@@ -0,0 +1,2 @@
+current_plan_name: default
+
diff --git a/openfl-workspace/tf_3dunet_brats/README.md b/openfl-workspace/tf_3dunet_brats/README.md
@@ -0,0 +1,166 @@
+# TensorFlow 3D U-Net for the BraTS dataset
+
+This is a full example of training a model on the Brain Tumor Segmentation 2020 ([BraTS2020](https://www.med.upenn.edu/cbica/brats2020/data.html)) dataset with OpenFL.
+
+*Note: This is **not** the 3D U-Net model that was used in the paper and not the sharding used. Nevertheless, it should make a good template for how to train using OpenFL.*
+
+The files `src/dataloader.py` and `src/define_model.py` are where we define the TensorFlow [dataset loader](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) and the 3D U-Net model. In `src/dataloader.py` we demonstrate an out-of-core data loader, one that pulls batches of data from files as needed instead of loading the whole dataset into memory.
+
+## Steps to run
+
+1. Download the [BraTS 2020 dataset](https://www.med.upenn.edu/cbica/brats2020/registration.html). It should be the one labeled **BraTS'20 Training Data: Segmentation Task**. 
+
+2. Extract the `MICCAI_BraTS2020_TrainingData.zip` zip file to any folder. Let's call that folder `${DATA_PATH}`. The file structure of `${DATA_PATH}` should look like this: 
+
+```bash
+user@localhost ~$ tree ${DATA_PATH} -L 2
+${DATA_PATH}/MICCAI_BraTS2020_TrainingData
+├── BraTS20_Training_001
+│   ├── BraTS20_Training_001_flair.nii.gz    <── The MRI FLAIR channel (best one for prediction)
+│   ├── BraTS20_Training_001_seg.nii.gz      <── The ground truth label
+│   ├── BraTS20_Training_001_t1.nii.gz       <── The T1-weighted MRI channel
+│   ├── BraTS20_Training_001_t1ce.nii.gz     <── The T1-Contrast Enhanced-weighted MRI channel
+│   └── BraTS20_Training_001_t2.nii.gz       <── The T2-weighted MRI channel
+├── BraTS20_Training_002
+│   ├── BraTS20_Training_002_flair.nii.gz
+│   ├── BraTS20_Training_002_seg.nii.gz
+│   ├── BraTS20_Training_002_t1.nii.gz
+│   ├── BraTS20_Training_002_t1ce.nii.gz
+│   └── BraTS20_Training_002_t2.nii.gz
+├── ...
+├── BraTS20_Training_369
+│   ├── BraTS20_Training_369_flair.nii.gz
+│   ├── BraTS20_Training_369_seg.nii.gz
+│   ├── BraTS20_Training_369_t1.nii.gz
+│   ├── BraTS20_Training_369_t1ce.nii.gz
+│   └── BraTS20_Training_369_t2.nii.gz
+├── name_mapping.csv
+└── survival_info.csv
+```
+If `tree` is not installed, then run `sudo apt-get install tree` to install it (Ubuntu).
+
+3. So that each collaborator uses a separate slice of the data, we split the main folder into subfolders, one for each collaborator. **NOTE:** In the real world each collaborator will have its own data, so this split already exists. We're splitting here to simulate a federation with different participants.
+
+#### `split_directory.sh`
+```bash 
+#!/bin/bash
+# Copyright (C) 2020-2021 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Split the BraTS data directory into NUM_COLLABORATORS subdirectories
+
+SOURCE=${1}  # The directory where the BraTS dataset is located (e.g. ~/data/MICCAI_BraTS2020_TrainingData)
+DESTINATION=${2}   # The destination directory for the randomized, split training data folders
+NUM_COLLABORATORS=${3:-2}  # The number of collaborator splits for the subdirectories
+
+help() {
+    echo
+    echo "==========================================================================================="
+    echo "~$ split_directory.sh BRATS_DATA_SOURCE_DIRECTORY DESTINATION_DIRECTORY [NUM_COLLABORATORS]"
+    echo "==========================================================================================="
+    echo
+    echo "BRATS_DATA_SOURCE_DIRECTORY: The directory where the BraTS dataset is located (e.g. ~/data/MICCAI_BraTS2020_TrainingData)"
+    echo "DESTINATION_DIRECTORY: The destination directory for the randomized, split training data folders (e.g. ~/brats_data_split)"
+    echo "NUM_COLLABORATORS: The number of collaborator splits for the subdirectories (default: 2)"
+    echo "-h, --help            display this help and exit"
+    echo
+    echo
+}
+
+if [ "$#" -lt 2 ] || ! [ -d "${1}" ]; then
+    help
+    exit 1
+fi
+
+get_seeded_random()
+{
+  seed="$1"
+  openssl enc -aes-256-ctr -pass pass:"$seed" -nosalt \
+    </dev/zero 2>/dev/null
+}
+
+# Remove the destination directory if it exists
+if [ -d "${DESTINATION}" ]
+then
+    echo "Removing existing directory."
+    rm -r -- "${DESTINATION}"
+fi
+
+printf "Sharding into %s directories under %s." "${NUM_COLLABORATORS}" "${DESTINATION}"
+echo ' '
+spin='-\|/'
+
+n=0
+i=0
+# Find the subdirectories under the SOURCE directory and randomly shuffle them (seed is the same)
+for f in `find ${SOURCE} -mindepth 1 -maxdepth 2 -type d | shuf --random-source=<(get_seeded_random 816)`; do
+
+  ((n++))
+
+  # Index of the split directory that receives this folder
+  idx=$((n % ${NUM_COLLABORATORS}))
+
+  i=$(( (i+1) %4 ))
+  printf "\r%s %s" "${spin:$i:1}" "${f}"
+
+  d=${DESTINATION}/split_${idx}/
+
+  # Make the directory (if it doesn't exist) and copy the folder to it.
+  mkdir -p "${d}"
+  cp -r "${f}" "${d}"
+
+done
+
+echo ' '
+echo ' '
+```
+
+`~$ bash split_directory.sh ${DATA_PATH}/MICCAI_BraTS2020_TrainingData ${NEW_PATH} ${NUM_COLLABORATORS}`
+
+where `${NEW_PATH}` is where you want to copy the original data (and split it randomly into subdirectories). The default is 2 collaborators (so 2 splits).
+
+The new directories for the data are:
+```
+${NEW_PATH}
+├── split_0
+│   ├── BraTS20_Training_001
+│   ├── BraTS20_Training_002
+│   ├── BraTS20_Training_003
+│   ├── ...
+└── split_1
+    ├── BraTS20_Training_009
+    ├── BraTS20_Training_014
+    ├── BraTS20_Training_015
+    ├── ...
+```
+
+4. Now update the `plan/data.yaml` file to reflect the new data directories:
+
+```
+$ cat plan/data.yaml
+# Copyright (C) 2020 Intel Corporation
+# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.
+
+# Each key under 'collaborators' corresponds to a specific collaborator name; the corresponding dictionary has data_name, data_path pairs.
+# Note that in the mnist case we do not store the data locally, and the data_path is used to pass an integer that helps the data object
+# construct the shard of the mnist dataset to be used for this collaborator.
+#
+# collaborator_name,data_directory_path
+
+# You'll need to shard as necessary
+# Symbolically link the ./data directory to wherever you have BraTS stored.
+# e.g. ln -s ~/data/MICCAI_BraTS2020_TrainingData ./data/one
+
+one,${NEW_PATH}/split_0
+two,${NEW_PATH}/split_1
+
+```
+
+where you replace `${NEW_PATH}` with the new directory path.
+
+5. We are ready to train! Try executing the [Hello Federation](https://openfl.readthedocs.io/en/latest/running_the_federation.baremetal.html#hello-federation-your-first-federated-learning-training) steps. Make sure you have `openfl` installed in your Python virtual environment. All you have to do is point each collaborator's data path at its split folder. We have combined all 'Hello Federation' steps into a single bash script, so it is easier to test:
+
+```bash
+bash tests/github/test_hello_federation.sh tf_3dunet_brats fed_work12345alpha81671 one123dragons beta34unicorns localhost --col1-data-path $NEW_PATH/split_0 --col2-data-path $NEW_PATH/split_1 --rounds-to-train 5
+```
+Running the command above completes 5 training rounds.
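Illustrative sketch, not part of the commit: the sharding rule in `split_directory.sh` (a seeded shuffle followed by `n % NUM_COLLABORATORS`) can be expressed in a few lines of Python. The function names `assign_splits` and `list_subject_dirs` are hypothetical, and Python's `random` module will not reproduce the exact order produced by `shuf` with the OpenSSL-based random source, so the split membership will differ from the shell script's even with the same seed value.

```python
import random
from pathlib import Path


def assign_splits(subject_dirs, num_collaborators=2, seed=816):
    """Return {split_index: [subject_dir, ...]} using a seeded shuffle."""
    if num_collaborators < 1:
        raise ValueError("num_collaborators must be at least 1")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(subject_dirs)
    # The same seed gives the same order on every run.
    random.Random(seed).shuffle(shuffled)
    splits = {idx: [] for idx in range(num_collaborators)}
    for n, subject in enumerate(shuffled, start=1):
        # Same modulo rule as the shell script: idx = n % NUM_COLLABORATORS.
        splits[n % num_collaborators].append(subject)
    return splits


def list_subject_dirs(source):
    """List the immediate subdirectories of `source` (one per BraTS subject)."""
    return [p.name for p in Path(source).iterdir() if p.is_dir()]


if __name__ == "__main__":
    names = [f"BraTS20_Training_{i:03d}" for i in range(1, 11)]
    for idx, members in assign_splits(names).items():
        print(f"split_{idx}: {len(members)} subjects")
```

Because the assignment is a pure function of the directory names and the seed, every participant in a simulation can recompute the same split without copying any data first.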
diff --git a/openfl-workspace/tf_3dunet_brats/plan/cols.yaml b/openfl-workspace/tf_3dunet_brats/plan/cols.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2020 Intel Corporation
+# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.
+
+collaborators:
+   - one
diff --git a/openfl-workspace/tf_3dunet_brats/plan/data.yaml b/openfl-workspace/tf_3dunet_brats/plan/data.yaml
@@ -0,0 +1,16 @@
+# Copyright (C) 2020 Intel Corporation
+# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.
+
+# Each key under 'collaborators' corresponds to a specific collaborator name; the corresponding dictionary has data_name, data_path pairs.
+# Note that in the mnist case we do not store the data locally, and the data_path is used to pass an integer that helps the data object
+# construct the shard of the mnist dataset to be used for this collaborator.
+#
+# collaborator_name,data_directory_path
+
+# You'll need to shard as necessary
+# Symbolically link the ./data directory to wherever you have BraTS stored.
+# e.g. ln -s ~/data/MICCAI_BraTS2020_TrainingData ./data/one
+
+one,~/MICCAI_BraTS2020_TrainingData/split_0
+two,~/MICCAI_BraTS2020_TrainingData/split_1
+
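Illustrative sketch, not part of the commit: despite its `.yaml` extension, `plan/data.yaml` holds plain `collaborator_name,data_directory_path` lines with `#` comments. A minimal reader for that layout might look like the following. The function `parse_collaborator_paths` is hypothetical and is not OpenFL's own parser.

```python
def parse_collaborator_paths(text):
    """Map collaborator name -> data path from 'name,path' lines."""
    paths = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, path = line.partition(",")
        if not sep or not name.strip() or not path.strip():
            raise ValueError(f"expected 'name,path', got: {line!r}")
        paths[name.strip()] = path.strip()
    return paths


example = """
# collaborator_name,data_directory_path
one,~/MICCAI_BraTS2020_TrainingData/split_0
two,~/MICCAI_BraTS2020_TrainingData/split_1
"""

if __name__ == "__main__":
    print(parse_collaborator_paths(example))
```

The sketch returns paths exactly as written. Whether a leading `~` gets expanded depends on the code that consumes the path; `os.path.expanduser` is one way to do it explicitly.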
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/aggregator.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/aggregator.yaml
@@ -0,0 +1,4 @@
+template : openfl.component.Aggregator
+settings :
+    db_store_rounds   : 1
+
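Illustrative sketch, not part of the commit: the commit message flags a possible problem when the stored-rounds setting is 1 and suggests a range check along the lines of `max(2, ...)`. A guard of that shape could look like this. The helper `safe_db_store_rounds` is hypothetical, and whether a value of 1 actually fails in a given OpenFL version is not verified here.

```python
def safe_db_store_rounds(value, minimum=2):
    """Clamp the configured number of stored rounds to at least `minimum`."""
    rounds = int(value)  # accepts ints or numeric strings read from a plan file
    return max(minimum, rounds)


if __name__ == "__main__":
    for configured in (1, 2, 5):
        print(configured, "->", safe_db_store_rounds(configured))
```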
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/assigner.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/assigner.yaml
@@ -0,0 +1,9 @@
+template : openfl.component.RandomGroupedAssigner
+settings :
+  task_groups  :
+    - name       : train_and_validate
+      percentage : 1.0
+      tasks      :
+        - aggregated_model_validation
+        - train
+        - locally_tuned_model_validation
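Illustrative sketch, not part of the commit: one way to read the `task_groups` block above is that collaborators are shuffled and then divided among groups according to each group's `percentage`, with every collaborator in a group receiving that group's task list. The function `assign_task_groups` below is a hypothetical model of that idea under those assumptions, not the `RandomGroupedAssigner` implementation.

```python
import random


def assign_task_groups(collaborators, task_groups, seed=None):
    """Return {collaborator: [task, ...]} by slicing a shuffled list by percentage."""
    total = sum(group["percentage"] for group in task_groups)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("task group percentages must sum to 1.0")
    shuffled = list(collaborators)
    random.Random(seed).shuffle(shuffled)
    assignments = {}
    start = 0
    cumulative = 0.0
    for group in task_groups:
        cumulative += group["percentage"]
        end = round(cumulative * len(shuffled))
        for name in shuffled[start:end]:
            assignments[name] = list(group["tasks"])
        start = end
    return assignments


# Mirrors the single group defined in assigner.yaml.
groups = [
    {
        "name": "train_and_validate",
        "percentage": 1.0,
        "tasks": [
            "aggregated_model_validation",
            "train",
            "locally_tuned_model_validation",
        ],
    }
]

if __name__ == "__main__":
    print(assign_task_groups(["one", "two"], groups, seed=0))
```

With a single group at `percentage: 1.0`, every collaborator ends up with all three tasks, which matches the intent of the configuration shown in this file.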
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/collaborator.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/collaborator.yaml
@@ -0,0 +1,5 @@
+template : openfl.component.Collaborator
+settings :
+    opt_treatment     : 'CONTINUE_LOCAL'
+    delta_updates     : True
+    db_store_rounds   : 1
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/compression_pipeline.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/compression_pipeline.yaml
@@ -0,0 +1 @@
+template: openfl.pipelines.NoCompressionPipeline
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/data_loader.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/data_loader.yaml
@@ -0,0 +1 @@
+template: openfl.federated.DataLoader
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/defaults b/openfl-workspace/tf_3dunet_brats/plan/defaults/defaults
@@ -0,0 +1,2 @@
+../../workspace/plan/defaults
+
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/network.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/network.yaml
@@ -0,0 +1,9 @@
+template: openfl.federation.Network
+settings:
+    agg_addr                   : auto
+    agg_port                   : auto
+    hash_salt                  : auto
+    disable_tls                : False
+    client_reconnect_interval  : 5
+    disable_client_auth        : False
+    cert_folder                : cert
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/task_runner.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/task_runner.yaml
@@ -0,0 +1 @@
+template: openfl.federated.task_runner.CoreTaskRunner
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_fast_estimator.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_fast_estimator.yaml
@@ -0,0 +1,22 @@
+aggregated_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : global
+    metrics    :
+      - accuracy
+
+locally_tuned_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : local
+    metrics    :
+      - accuracy
+train:
+  function : train
+  kwargs   :
+    batch_size : 32
+    epochs     : 1
+    metrics    :
+    - loss
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_keras.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_keras.yaml
@@ -0,0 +1,23 @@
+aggregated_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : global
+    metrics    :
+      - accuracy
+
+locally_tuned_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : local
+    metrics    :
+      - accuracy
+
+train:
+  function : train
+  kwargs   :
+    batch_size : 32
+    epochs     : 1
+    metrics    :
+    - loss
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_tensorflow.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_tensorflow.yaml
@@ -0,0 +1,23 @@
+aggregated_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : global
+    metrics    :
+      - acc
+
+locally_tuned_model_validation:
+  function : validate
+  kwargs   :
+    batch_size : 32
+    apply      : local
+    metrics    :
+      - acc
+
+train:
+  function : train_batches
+  kwargs   :
+    batch_size  : 32
+    num_batches : 1
+    metrics     :
+    - loss
diff --git a/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_torch.yaml b/openfl-workspace/tf_3dunet_brats/plan/defaults/tasks_torch.yaml
@@ -0,0 +1,19 @@
+aggregated_model_validation:
+  function : validate
+  kwargs   :
+    apply   : global
+    metrics :
+      - acc
+
+locally_tuned_model_validation:
+  function  : validate
+  kwargs    :
+    apply: local
+    metrics :
+      - acc
+
+train:
+  function : train_batches
+  kwargs   :
+    metrics     :
+    - loss