This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Adding pre-commit hooks and black formatting #560

Merged: 11 commits, Sep 30, 2021
41 changes: 41 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,41 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
# See https://github.com/pre-commit/pre-commit-hooks/blob/master/.pre-commit-config.yaml for an example with more hooks

exclude: '^excluded_files_regex$'
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.0.1
  hooks:
  - id: trailing-whitespace
  - id: end-of-file-fixer
  - id: check-yaml
  - id: check-added-large-files
  - id: check-ast
  - id: check-merge-conflict
  - id: debug-statements
  - id: mixed-line-ending
    args: [--fix=lf]

- repo: https://github.com/PyCQA/flake8
  rev: 3.9.2
  hooks:
  - id: flake8
    additional_dependencies: [flake8-typing-imports==1.7.0]

- repo: https://github.com/pre-commit/mirrors-autopep8
  rev: v1.5.7
  hooks:
  - id: autopep8

- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v0.910
  hooks:
  - id: mypy
    additional_dependencies: [types-all]

- repo: https://github.com/ambv/black
  rev: 21.9b0
  hooks:
  - id: black
    language_version: python3.7
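
Once `pre-commit install` has been run in a clone, the hooks above run on every `git commit`; `pre-commit run --all-files` applies them to the whole repository at once. Most of the churn in the rest of this PR comes from the `black` hook. The snippet below is an illustrative sketch, not code from this PR, showing the rewrite style that black 21.9b0 enforces once a call or collection no longer fits on one line:

```python
# Illustrative only -- not code from this PR.
from torchvision.transforms import CenterCrop, RandomAffine, RandomErasing

# Pre-black style, as the test file below was written before this PR:
#     transforms = [CenterCrop(24),
#                   RandomErasing(),
#                   RandomAffine(degrees=(10, 12), shear=15, translate=(0.1, 0.3))
#                   ]
# black 21.9b0 rewrites this so that every element sits on its own line,
# with a trailing "magic" comma that keeps the exploded layout stable:
transforms = [
    CenterCrop(24),
    RandomErasing(),
    RandomAffine(degrees=(10, 12), shear=15, translate=(0.1, 0.3)),
]
```

Note that black runs after autopep8 in this configuration (pre-commit executes hooks in the order they appear in the file), so where the two formatters disagree, black's style is the one that lands.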
31 changes: 16 additions & 15 deletions CHANGELOG.md
@@ -22,6 +22,7 @@ jobs that run in AzureML.
ensemble) using the parameter `model_id`.
- ([#554](https://github.com/microsoft/InnerEye-DeepLearning/pull/554)) Added a parameter `pretraining_dataset_id` to
`NIH_COVID_BYOL` to specify the name of the SSL training dataset.
- ([#560](https://github.com/microsoft/InnerEye-DeepLearning/pull/560)) Added pre-commit hooks.

### Changed
- ([#531](https://github.com/microsoft/InnerEye-DeepLearning/pull/531)) Updated PL to 1.3.8, torchmetrics and pl-bolts and changed relevant metrics and SSL code API.
@@ -47,7 +48,7 @@ gets uploaded to AzureML, by skipping all test folders.
- ([#546](https://github.com/microsoft/InnerEye-DeepLearning/pull/546)) Environment and hello_world_model documentation updated
- ([#525](https://github.com/microsoft/InnerEye-DeepLearning/pull/525)) Enable --store_dataset_sample
- ([#495](https://github.com/microsoft/InnerEye-DeepLearning/pull/495)) Fix model comparison.
- ([#547](https://github.com/microsoft/InnerEye-DeepLearning/pull/547)) The parameter pl_find_unused_parameters was no longer used
to initialize the DDP Plugin.
- ([#482](https://github.com/microsoft/InnerEye-DeepLearning/pull/482)) Check bool parameter is either true or false.
- ([#475](https://github.com/microsoft/InnerEye-DeepLearning/pull/475)) Bug in AML SDK meant that we could not train
@@ -91,8 +92,8 @@ in inference-only runs when using lightning containers.
- ([#454](https://github.com/microsoft/InnerEye-DeepLearning/pull/454)) Checking that labels are mutually exclusive.
- ([#447](https://github.com/microsoft/InnerEye-DeepLearning/pull/447/)) Added a sanity check to ensure there are no
missing channels or missing files. If channels are missing from the csv file or the filenames associated with
channels are incorrect, the pipeline exits with an error report before running training or inference.
- ([#446](https://github.com/microsoft/InnerEye-DeepLearning/pull/446)) Guarding `save_outlier` so that it works when
institution id and series id columns are missing.
- ([#441](https://github.com/microsoft/InnerEye-DeepLearning/pull/441)) Add script to move models from one AzureML workspace to another: `python InnerEye/Scripts/move_model.py`
- ([#417](https://github.com/microsoft/InnerEye-DeepLearning/pull/417)) Added a generic way of adding PyTorch Lightning
@@ -143,8 +144,8 @@ with the FastMRI challenge datasets.
console for easier diagnostics.
- ([#445](https://github.com/microsoft/InnerEye-DeepLearning/pull/445)) Adding test coverage for the `HelloContainer`
model with multiple GPUs
- ([#450](https://github.com/microsoft/InnerEye-DeepLearning/pull/450)) Adds the metric "Accuracy at threshold 0.5" to the classification report (`classification_crossval_report.ipynb`).
- ([#451](https://github.com/microsoft/InnerEye-DeepLearning/pull/451)) Write a file `model_outputs.csv` with columns
`subject`, `prediction_target`, `label`, `model_output` and `cross_validation_split_index`. This file is not written out for sequence models.
- ([#440](https://github.com/microsoft/InnerEye-DeepLearning/pull/440)) Added support for training of self-supervised
models (BYOL and SimCLR) based on the bring-your-own-model framework. Providing examples configurations for training
@@ -178,22 +179,22 @@ console for easier diagnostics.
- ([#437](https://github.com/microsoft/InnerEye-DeepLearning/pull/437)) Upgrade to PyTorch-Lightning 1.2.8.
- ([#439](https://github.com/microsoft/InnerEye-DeepLearning/pull/439)) Recovery checkpoints are now
named `recovery_epoch=x.ckpt` instead of `recovery.ckpt` or `recovery-v0.ckpt`.
- ([#451](https://github.com/microsoft/InnerEye-DeepLearning/pull/451)) Change the signature for function `generate_custom_report`
in `ModelConfigBase` to take only the path to the reports folder and a `ModelProcessing` object.
- ([#444](https://github.com/microsoft/InnerEye-DeepLearning/pull/444)) The method `before_training_on_rank_zero` of
the `LightningContainer` class has been renamed to `before_training_on_global_rank_zero`. The order in which the
hooks are called has been changed.
- ([#458](https://github.com/microsoft/InnerEye-DeepLearning/pull/458)) Simplifying and generalizing the way we handle
data augmentations for classification models. The pipelining logic is now taken care of by an `ImageTransformationPipeline`
class that takes as input a list of transforms to chain together. This pipeline takes care of applying transforms to 3D or
2D images. The user can choose to apply the same transformation to all channels (RGB example) or to apply a
different transformation to each channel (if each channel represents a different
modality / time point, for example). The pipeline can now work directly with out-of-the-box torchvision transforms
(as long as they support [..., C, H, W] inputs). This allows us to get rid of nearly all of our custom augmentation
functions. The conversion from a pipeline of image transformations to ScalarItemAugmentation is now taken care of under
the hood; the user does not need to call this wrapper for each config class. In models derived from ScalarModelConfig,
to change which augmentations are applied to the image inputs (resp. segmentation inputs), users can override
`get_image_transform` (resp. `get_segmentation_transform`). These two functions replace the old
`get_image_sample_transforms` method. See `docs/building_models.md` for more information on augmentations.

### Fixed
@@ -215,7 +216,7 @@ console for easier diagnostics.
- ([#450](https://github.com/microsoft/InnerEye-DeepLearning/pull/450)) Delete unused `classification_report.ipynb`.
- ([#455](https://github.com/microsoft/InnerEye-DeepLearning/pull/455)) Removed the AzureRunner conda environment.
The full InnerEye conda environment is needed to submit a training job to AzureML.
- ([#458](https://github.com/microsoft/InnerEye-DeepLearning/pull/458)) Getting rid of all the unused code for
RandAugment & Co. The user now has complete freedom to specify the set of augmentations to use.
- ([#468](https://github.com/microsoft/InnerEye-DeepLearning/pull/468)) Removed the `KneeSinglecoil` example model

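The CHANGELOG diff above is almost entirely mechanical: all 15 deletions were lines that differed from their replacements only by trailing whitespace, and the one extra addition is the new #560 entry. A minimal sketch of the normalization that the `trailing-whitespace`, `mixed-line-ending --fix=lf`, and `end-of-file-fixer` hooks perform, assuming a simplified all-in-one implementation rather than the hooks' actual code:

```python
from pathlib import Path


def normalize_whitespace(path: Path) -> bool:
    """Rough sketch of what the trailing-whitespace, mixed-line-ending and
    end-of-file-fixer hooks do; not their actual implementation.
    Returns True if the file needed fixing."""
    original = path.read_bytes()
    if not original:
        return False
    # mixed-line-ending --fix=lf: force LF line endings
    text = original.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # trailing-whitespace: strip spaces/tabs at the end of each line
    lines = [line.rstrip() for line in text.split(b"\n")]
    # end-of-file-fixer: exactly one newline at the end of the file
    fixed = b"\n".join(lines).rstrip(b"\n") + b"\n"
    if fixed != original:
        path.write_bytes(fixed)
        return True
    return False
```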
106 changes: 70 additions & 36 deletions Tests/ML/augmentations/test_transform_pipeline.py
@@ -7,45 +7,64 @@
import PIL
import pytest
import torch
from torchvision.transforms import (
    CenterCrop,
    ColorJitter,
    RandomAffine,
    RandomErasing,
    RandomHorizontalFlip,
    RandomResizedCrop,
    Resize,
    ToTensor,
)
from torchvision.transforms.functional import to_tensor

from InnerEye.ML.augmentations.image_transforms import (
    AddGaussianNoise,
    ElasticTransform,
    ExpandChannels,
    RandomGamma,
)
from InnerEye.ML.augmentations.transform_pipeline import (
    ImageTransformationPipeline,
    create_cxr_transforms_from_config,
)

from Tests.SSL.test_data_modules import cxr_augmentation_config

import numpy as np

image_size = (32, 32)
crop_size = 24
test_image_as_array = np.ones(list(image_size)) * 255.0
test_image_as_array[10:15, 10:20] = 1
test_image_as_pil = PIL.Image.fromarray(test_image_as_array).convert("L")
test_2d_image_as_CHW_tensor = to_tensor(test_image_as_array)

test_2d_image_as_ZCHW_tensor = test_2d_image_as_CHW_tensor.unsqueeze(0)

test_4d_scan_as_tensor = torch.ones([5, 4, *image_size]) * 255.0
test_4d_scan_as_tensor[..., 10:15, 10:20] = 1
@pytest.mark.parametrize("use_different_transformation_per_channel", [True, False])
def test_torchvision_on_various_input(
    use_different_transformation_per_channel: bool,
) -> None:
    """
    This tests that we can run the transformation pipeline with out-of-the-box torchvision transforms on various
    types of input: PIL image, 3D tensor, 4D tensors. Tests that use_different_transformation_per_channel has the
    correct behavior.
    """

    transform = ImageTransformationPipeline(
        [
            CenterCrop(crop_size),
            RandomErasing(),
            RandomAffine(degrees=(10, 12), shear=15, translate=(0.1, 0.3)),
        ],
        use_different_transformation_per_channel,
    )

    # Test PIL image input
    transformed = transform(test_image_as_pil)
@@ -68,22 +87,29 @@ def test_torchvision_on_various_input(use_different_transformation_per_channel:
    assert transformed.shape == torch.Size([5, 4, crop_size, crop_size])

    # Same transformation should be applied to all slices and channels.
    assert (
        torch.isclose(transformed[0, 0], transformed[1, 1]).all()
        != use_different_transformation_per_channel
    )


@pytest.mark.parametrize("use_different_transformation_per_channel", [True, False])
def test_custom_tf_on_various_input(
    use_different_transformation_per_channel: bool,
) -> None:
    """
    This tests that we can run the transformation pipeline with our custom transforms on various types
    of input: PIL image, 3D tensor, 4D tensors. Tests that use_different_transformation_per_channel has the
    correct behavior. The transforms are tested individually in test_image_transforms.py.
    """
    pipeline = ImageTransformationPipeline(
        [
            ElasticTransform(sigma=4, alpha=34, p_apply=1),
            AddGaussianNoise(p_apply=1, std=0.05),
            RandomGamma(scale=(0.3, 3)),
        ],
        use_different_transformation_per_channel,
    )

    # Test PIL image input
    transformed = pipeline(test_image_as_pil)
@@ -104,29 +130,35 @@ def test_custom_tf_on_various_input(use_different_transformation_per_channel: bo
    assert transformed.shape == test_4d_scan_as_tensor.shape

    # Same transformation should be applied to all slices and channels.
    assert (
        torch.isclose(transformed[0, 0], transformed[1, 1]).all()
        != use_different_transformation_per_channel
    )


def test_create_transform_pipeline_from_config() -> None:
    """
    Tests that the pipeline returned by create_cxr_transforms_from_config returns the expected transformation.
    """
    transformation_pipeline = create_cxr_transforms_from_config(
        cxr_augmentation_config, apply_augmentations=True
    )
    fake_cxr_as_array = np.ones([256, 256]) * 255.0
    fake_cxr_as_array[100:150, 100:200] = 1
    fake_cxr_image = PIL.Image.fromarray(fake_cxr_as_array).convert("L")

    all_transforms = [
        ExpandChannels(),
        RandomAffine(degrees=180, translate=(0, 0), shear=40),
        RandomResizedCrop(scale=(0.4, 1.0), size=256),
        RandomHorizontalFlip(p=0.5),
        RandomGamma(scale=(0.5, 1.5)),
        ColorJitter(saturation=0, brightness=0.2, contrast=0.2),
        ElasticTransform(sigma=4, alpha=34, p_apply=0.4),
        CenterCrop(size=224),
        RandomErasing(scale=(0.15, 0.4), ratio=(0.33, 3)),
        AddGaussianNoise(std=0.05, p_apply=0.5),
    ]

    np.random.seed(3)
    torch.manual_seed(3)
@@ -135,7 +167,7 @@ def test_create_transform_pipeline_from_config() -> None:
    transformed_image = transformation_pipeline(fake_cxr_image)
    assert isinstance(transformed_image, torch.Tensor)
    # Expected pipeline
    image = np.ones([256, 256]) * 255.0
    image[100:150, 100:200] = 1
    image = PIL.Image.fromarray(image).convert("L")
    # In the pipeline the image is converted to tensor before applying the transformations. Do the same here.
@@ -154,7 +186,9 @@ def test_create_transform_pipeline_from_config() -> None:
    assert torch.isclose(expected_transformed, transformed_image).all()

    # Test the evaluation pipeline
    transformation_pipeline = create_cxr_transforms_from_config(
        cxr_augmentation_config, apply_augmentations=False
    )
    transformed_image = transformation_pipeline(image)
    assert isinstance(transformed_image, torch.Tensor)
    all_transforms = [ExpandChannels(), Resize(size=256), CenterCrop(size=224)]
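
The tests above drive `ImageTransformationPipeline` directly. In a model configuration, the CHANGELOG entry for #458 says the pipeline is wired in by overriding `get_image_transform` on a config derived from `ScalarModelConfig`. The sketch below is a hedged illustration, not code from this PR: `MyClassificationConfig` is hypothetical, the import path for `ScalarModelConfig` is assumed from the repository layout, and the exact signature and return type of `get_image_transform` should be checked against `docs/building_models.md`.

```python
# Hypothetical config for illustration only; verify the exact hook signature
# and return type against docs/building_models.md in this repository.
from torchvision.transforms import CenterCrop, RandomAffine, RandomHorizontalFlip

from InnerEye.ML.augmentations.transform_pipeline import ImageTransformationPipeline
from InnerEye.ML.scalar_config import ScalarModelConfig


class MyClassificationConfig(ScalarModelConfig):
    def get_image_transform(self):
        # Out-of-the-box torchvision transforms can be chained directly; the
        # pipeline accepts PIL images, 3D [C, H, W] and 4D [Z, C, H, W] tensors.
        return ImageTransformationPipeline(
            [
                RandomAffine(degrees=30, translate=(0.1, 0.1)),
                RandomHorizontalFlip(p=0.5),
                CenterCrop(size=224),
            ],
            # False: one random transformation drawn per image, shared across
            # channels; True redraws per channel (name taken from the tests above).
            use_different_transformation_per_channel=False,
        )
```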