
Unable to perform data augmentations using transform function. #815

Closed
anacis opened this issue Jun 1, 2021 · 5 comments · Fixed by #818 or #826
Labels: bug (category: fixes an error in the code), priority:high


anacis commented Jun 1, 2021

Issue description

Current behavior

Whenever I try to train an ivadomed model with the RandomAffine and ElasticTransform transforms, I receive the following error:

Traceback (most recent call last):
  File "/home/acismaru/.local/bin/ivadomed", line 8, in <module>
    sys.exit(run_main())
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/main.py", line 496, in run_main
    resume_training=bool(args.resume_training))
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/main.py", line 275, in run_command
    debugging=context["debugging"])
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/training.py", line 159, in train
    for i, batch in enumerate(train_loader):
  File "/home/acismaru/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/acismaru/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/acismaru/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/acismaru/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/loader/loader.py", line 526, in __getitem__
    data_type="im")
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/transforms.py", line 150, in __call__
    sample, metadata = tr(sample, metadata)
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/transforms.py", line 49, in wrapper
    return wrapped(self, sample, metadata)
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/transforms.py", line 75, in wrapper
    return wrapped(self, sample, metadata)
  File "/home/acismaru/.local/lib/python3.6/site-packages/ivadomed/transforms.py", line 710, in __call__
    metadata['rotation'] = [angle, axes]
TypeError: list indices must be integers or slices, not str

It seems like metadata is wrapped in a list rather than the dictionary that RandomAffine's __call__ method expects. I tried extracting the SampleMetadata object from that list and passing it in directly, but that still yielded numerous errors.
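The failure mode can be reproduced in isolation with plain Python (a minimal sketch, not ivadomed's actual classes): a string-keyed assignment works on a dict but raises the reported TypeError when the metadata arrives wrapped in a list.

```python
# Minimal sketch (plain Python, not ivadomed's classes) of the failure:
# RandomAffine writes metadata['rotation'] assuming a dict, but receives a list.
angle, axes = 4.6, (0, 1)

metadata_dict = {}                          # what RandomAffine expects
metadata_dict['rotation'] = [angle, axes]   # fine: dicts take string keys

metadata_list = [metadata_dict]             # what it actually receives
try:
    metadata_list['rotation'] = [angle, axes]
except TypeError as err:
    print(err)  # list indices must be integers or slices, not str
```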

I have been successful with training models on my dataset without the augmentations.

Steps to reproduce

I ran ivadomed --train -c aug_config.json and used the config file below:

{
  "command": "train",
  "gpu_ids": [0],
  "path_output": "bids_model_aug_thresh_05",
  "model_name": "aug_0.1_thresh_05",
  "debugging": false,
  "object_detection_params": {
    "object_detection_path": null,
    "safety_factor": [1.0, 1.0, 1.0]
  },
  "loader_parameters": {
    "path_data": ["bids_dataset_3D_thresh_05"],
    "subject_selection": {"n": [], "metadata": [], "value": []},
    "target_suffix": ["_seg-manual"],
    "roi_params": {"suffix": null, "slice_filter_roi": null},
    "contrast_params": {
      "training_validation": ["T2star"],
      "testing": ["T2star"],
      "balance": {}
    },
    "slice_filter_params": {"filter_empty_mask": true, "filter_empty_input": true},
    "slice_axis": "axial",
    "multichannel": false,
    "soft_gt": false
  },
  "split_dataset": {
    "fname_split": null,
    "random_seed": 6,
    "center_test": [],
    "method": "per_patient",
    "balance": null,
    "train_fraction": 0.5,
    "test_fraction": 0.5
  },
  "training_parameters": {
    "batch_size": 11,
    "loss": {"name": "DiceLoss"},
    "training_time": {
      "num_epochs": 1000,
      "early_stopping_patience": 50,
      "early_stopping_epsilon": 0.001
    },
    "scheduler": {
      "initial_lr": 0.001,
      "lr_scheduler": {"name": "CosineAnnealingLR", "base_lr": 1e-5, "max_lr": 1e-2}
    },
    "balance_samples": {"applied": false, "type": "gt"},
    "mixup_alpha": null,
    "transfer_learning": {"retrain_model": null, "retrain_fraction": 1.0, "reset": true}
  },
  "default_model": {
    "name": "Unet",
    "dropout_rate": 0.4,
    "bn_momentum": 0.1,
    "final_activation": "sigmoid",
    "depth": 3
  },
  "FiLMedUnet": {
    "applied": false,
    "metadata": "contrasts",
    "film_layers": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
  },
  "Modified3DUNet": {
    "applied": false,
    "length_3D": [128, 128, 16],
    "stride_3D": [128, 128, 16],
    "attention": false,
    "n_filters": 8
  },
  "uncertainty": {"epistemic": false, "aleatoric": false, "n_it": 0},
  "postprocessing": {
    "remove_noise": {"thr": -1},
    "keep_largest": {},
    "binarize_prediction": {"thr": 0.5},
    "uncertainty": {"thr": -1, "suffix": "_unc-vox.nii.gz"},
    "fill_holes": {},
    "remove_small": {"unit": "vox", "thr": 3}
  },
  "evaluation_parameters": {
    "target_size": {"unit": "vox", "thr": [20, 100]},
    "overlap": {"unit": "vox", "thr": 3}
  },
  "transformation": {
    "NumpyToTensor": {},
    "RandomAffine": {
      "degrees": 4.6,
      "scale": [0.02, 0.02],
      "translate": [0.03, 0.03],
      "applied_to": ["im", "gt"],
      "dataset_type": ["training"]
    },
    "ElasticTransform": {
      "alpha_range": [1.0, 1.0],
      "sigma_range": [10, 10],
      "p": 0.5,
      "applied_to": ["im", "gt"],
      "dataset_type": ["training"]
    },
    "NormalizeInstance": {"applied_to": ["im"]}
  }
}
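For anyone reproducing this, the order of the transforms in the "transformation" section can be checked with a quick, generic JSON inspection (nothing ivadomed-specific; the inline string below is a trimmed stand-in for aug_config.json):

```python
import json

# Trimmed stand-in for the "transformation" section of aug_config.json.
# Python dicts preserve JSON key order (3.7+), so this reflects the order
# in which the transforms are listed in the config.
config_text = '''{
  "transformation": {
    "NumpyToTensor": {},
    "RandomAffine": {"degrees": 4.6},
    "ElasticTransform": {"p": 0.5},
    "NormalizeInstance": {"applied_to": ["im"]}
  }
}'''
config = json.loads(config_text)
print(list(config["transformation"].keys()))
# ['NumpyToTensor', 'RandomAffine', 'ElasticTransform', 'NormalizeInstance']
```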

Environment

System description

I ran this on Ubuntu 18.04.5.

charleygros (Member) commented:
Thank you for reporting this issue @anacis !

I'll try to reproduce your error and get back to you shortly. Cheers

@charleygros charleygros self-assigned this Jun 1, 2021
@charleygros charleygros added the bug category: fixes an error in the code label Jun 1, 2021

charleygros commented Jun 1, 2021

I was able to reproduce the error using your config file (on the ivadomed tutorial dataset) 👍

The bug comes from how the metadata is handled by NumpyToTensor. The ivadomed team will work on a fix shortly.

@anacis: As a workaround, could you please move the "NumpyToTensor" transform from the first to the second-to-last position? Like this:

"transformation": {
	"RandomAffine": { "degrees": 4.6, "scale": [0.02, 0.02], "translate": [0.03, 0.03], "applied_to": ["im", "gt"], "dataset_type": ["training"] },
	"ElasticTransform": { "alpha_range": [1.0, 1.0], "sigma_range": [10, 10], "p": 0.5, "applied_to": ["im", "gt"], "dataset_type": ["training"] },
	"NumpyToTensor": {},
	"NormalizeInstance": {"applied_to": ["im"]}
}

This allowed me to run a training without encountering the bug you reported, and hopefully it will let you run your training with data augmentation.

Let us know how it goes. Cheers

jcohenadad (Member) commented:

Thank you for reporting this issue @anacis, and thank you for looking at it @charleygros 🙏


charleygros commented Jun 2, 2021

Thank you @dyt811 for agreeing to work on this.

The idea is to:

  • Remove the NumpyToTensor transform from the config files
  • Force its application after the data augmentations (e.g. ElasticTransform, RandomAffine) and before NormalizeInstance
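A rough sketch of that idea, assuming a hypothetical helper (this is not the actual ivadomed implementation): strip any user-supplied "NumpyToTensor" entry from the transformation dict, then re-insert it just before "NormalizeInstance".

```python
# Hypothetical sketch of the proposed fix, not the actual ivadomed code:
# drop any user-supplied "NumpyToTensor" entry, then re-insert it right
# before "NormalizeInstance" (or last, if no normalization is configured).
def reorder_transforms(transforms: dict) -> dict:
    filtered = {k: v for k, v in transforms.items() if k != "NumpyToTensor"}
    reordered = {}
    for name, params in filtered.items():
        if name == "NormalizeInstance":
            reordered["NumpyToTensor"] = {}
        reordered[name] = params
    reordered.setdefault("NumpyToTensor", {})  # fallback: append at the end
    return reordered

user_config = {
    "NumpyToTensor": {},
    "RandomAffine": {"degrees": 4.6},
    "ElasticTransform": {"p": 0.5},
    "NormalizeInstance": {"applied_to": ["im"]},
}
print(list(reorder_transforms(user_config)))
# ['RandomAffine', 'ElasticTransform', 'NumpyToTensor', 'NormalizeInstance']
```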
