Autorunner AlgoEnsemble improvements #5564

Closed
myron opened this issue Nov 22, 2022 · 9 comments

@myron
Collaborator

myron commented Nov 22, 2022

We need to make a few fixes to AutoRunner and the ensembler:

  1. BUG: AutoRunner().set_num_fold() does not update num_folds of the ensemble method. For example, if the default method is AlgoEnsembleBestByFold (initialized with 5 folds by default) and a user calls AutoRunner().set_num_fold(2), only the number of training folds is updated. The ensembling still looks for 5 folds, which results in some algorithms being None. I believe the error in "AutoRunner small updates #5523 (comment)" was due to this, and not only due to "Auto3DSeg ensembler module doesn't work with Bundle Algorithms generated by the HPO Module #5558".

  2. Change the default ensemble method to AlgoEnsembleBestByFold in AutoRunner().set_ensemble_method().

  2b. Improve AlgoEnsembleBestByFold to return the N best from each fold. Currently it returns only the single best result, but we need it to return, e.g., N=2 or N=3.

  3. VoteEnsemble() currently does not work in sigmoid==True mode.

  4. Remove pickling of algorithm class instances, and re-initialize them in the standard way.

  5. In class AlgoEnsembleBestN() (line 196), an error is raised if I select only 1 algorithm and 1 fold for training. The default n_best is 2, but since we trained only 1 fold/1 algorithm, len(ranks)==1.

  6. In the ensembling step we accumulate all results in memory and then save them to disk. This can cause OOM if RAM is small. I think we should save each result to disk as soon as we get it. https://github.com/Project-MONAI/MONAI/blob/dev/monai/apps/auto3dseg/ensemble_builder.py#L162
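
To make item 2b concrete, here is a minimal plain-Python sketch of selecting the N best algorithms within each fold. It is not the MONAI implementation: the function name `best_n_by_fold`, the `(name, fold, score)` tuple layout, the algorithm names, and the scores are all hypothetical, and a higher score is assumed to be better.

```python
def best_n_by_fold(results, n_best=1):
    """Return the names of the n_best highest-scoring algorithms for each fold.

    results: iterable of (algo_name, fold, score) tuples (hypothetical layout);
    a higher score is assumed to be better.
    """
    by_fold = {}
    for name, fold, score in results:
        by_fold.setdefault(fold, []).append((score, name))

    selected = {}
    for fold in sorted(by_fold):
        # Rank the candidates of this fold from best to worst score.
        ranked = sorted(by_fold[fold], key=lambda item: item[0], reverse=True)
        selected[fold] = [name for _, name in ranked[:n_best]]
    return selected


# Illustrative values only, not real validation results.
results = [
    ("algo_a", 0, 0.81), ("algo_b", 0, 0.85), ("algo_c", 0, 0.79),
    ("algo_a", 1, 0.88), ("algo_b", 1, 0.83), ("algo_c", 1, 0.86),
]
top1 = best_n_by_fold(results, n_best=1)
top2 = best_n_by_fold(results, n_best=2)
```

With `n_best=1` this reduces to the current "one best per fold" behaviour. Larger values simply keep more candidates from each fold's ranking.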

@mingxin-zheng

@myron
Collaborator Author

myron commented Dec 7, 2022

The first issue is addressed in #5667.
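
For readers following along, the fold-count mismatch from item 1 can be pictured with a small sketch. The classes `FoldEnsemble` and `Runner` below are hypothetical stand-ins, not the MONAI classes or the actual change in #5667. They only illustrate the idea that the setter should push the new fold count into the ensemble method as well.

```python
class FoldEnsemble:
    """Hypothetical stand-in for an ensemble method that expects one model per fold."""

    def __init__(self, num_fold=5):
        self.num_fold = num_fold

    def select(self, trained):
        # trained maps fold index -> model name; a fold that was never trained yields None.
        return [trained.get(fold) for fold in range(self.num_fold)]


class Runner:
    """Hypothetical stand-in for a runner that owns both training and ensembling settings."""

    def __init__(self):
        self.num_fold = 5
        self.ensemble = FoldEnsemble(num_fold=self.num_fold)

    def set_num_fold(self, num_fold):
        if num_fold <= 0:
            raise ValueError("num_fold must be positive")
        self.num_fold = num_fold
        # Propagate the value so training and ensembling agree on the fold count.
        self.ensemble.num_fold = num_fold


runner = Runner()
runner.set_num_fold(2)
trained = {0: "model_fold0", 1: "model_fold1"}
selected = runner.ensemble.select(trained)

# Without the propagation step, the ensemble would still look for 5 folds.
stale = FoldEnsemble(num_fold=5).select(trained)
```

Here `stale` contains `None` entries for folds 2 to 4, which mirrors the symptom described in the issue.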

@mingxin-zheng
Contributor

mingxin-zheng commented Dec 13, 2022

Hi @myron, for item 3, I am wondering where the change should happen. Do you think monai.transforms.VoteEnsemble should be changed to support one-hot encoded input?

wyli pushed a commit that referenced this issue Dec 13, 2022
#5722)

Change the default ensemble method

Signed-off-by: Mingxin Zheng
<18563433+mingxin-zheng@users.noreply.github.com>

Fixes Item 2 in #5564 .


### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
- [x] Integration tests passed locally by running `./runtests.sh -f -u
--net --coverage`.

Signed-off-by: Mingxin Zheng <18563433+mingxin-zheng@users.noreply.github.com>
@myron
Collaborator Author

myron commented Dec 13, 2022

VoteEnsemble

In AlgoEnsemble.ensemble_pred()

            classes = [prob2class(p, dim=0, keepdim=True, sigmoid=False) for p in preds]
            return VoteEnsemble(num_classes=preds[0].shape[0])(classes)

it converts to classes with sigmoid==False (hardcoded).
There should be an alternative for sigmoid==True:

  • get the classes as classes = [prob2class(p, dim=0, keepdim=True, sigmoid=True) for p in preds] (P tensors, each of size CxHxWxD), then rearrange and vote for each channel, maybe like this:
    votes = []
    for i in range(classes[0].shape[0]):
        classes_i = [c[[i]] for c in classes]  # channel i from every model, channel dim kept
        votes.append(VoteEnsemble(...)(classes_i))
    return concat/stack(votes)...
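
The per-channel idea above can be illustrated without MONAI or PyTorch. The sketch below uses nested Python lists instead of tensors, so it shows only the voting rule, not the `VoteEnsemble` API. The function name `vote_per_channel`, the flat-list voxel layout, and the tie rule (a voxel is foreground only when strictly more than half of the models agree) are assumptions made for illustration.

```python
def vote_per_channel(predictions):
    """Majority-vote binary masks independently for every channel.

    predictions: one entry per model; each entry is a list of channels;
    each channel is a flat list of 0/1 voxel labels (hypothetical layout).
    """
    num_models = len(predictions)
    num_channels = len(predictions[0])
    voted = []
    for ch in range(num_channels):
        # Collect channel `ch` from every model, then vote voxel by voxel.
        masks = [pred[ch] for pred in predictions]
        channel = [1 if 2 * sum(votes) > num_models else 0 for votes in zip(*masks)]
        voted.append(channel)
    return voted


# Three models, two channels, three voxels each (toy values).
preds = [
    [[1, 0, 1], [0, 0, 1]],
    [[1, 1, 0], [0, 1, 1]],
    [[0, 0, 1], [0, 0, 1]],
]
result = vote_per_channel(preds)
```

Because every channel is voted on separately, overlapping labels (the sigmoid case) are preserved instead of being collapsed into a single argmax class.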

@myron
Collaborator Author

myron commented Dec 13, 2022

PS: I also added issue 5 (above) @mingxin-zheng

@mingxin-zheng
Contributor

mingxin-zheng commented Dec 14, 2022

@myron Thanks for the explanation. Previously, I ran into issues using VoteEnsemble with one-hot encoded input. After checking the code in detail, I think the way VoteEnsemble is used in my PR should address this and give equivalent results.

@mingxin-zheng
Contributor

PS: I also added issue 5 (above) @mingxin-zheng

Issue 5 was also addressed. Now it would post a warning instead of raising an error.
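
As a rough illustration of the "warn instead of raise" behaviour, the following sketch clamps `n_best` to the number of available algorithms. It is a standalone approximation, not the code merged in the PR. The function name `pick_n_best` and the assumption that `ranked` is already sorted best-first are hypothetical.

```python
import warnings


def pick_n_best(ranked, n_best=5):
    """Return the first n_best entries of a best-first list, warning if fewer exist."""
    available = len(ranked)
    if available < n_best:
        warnings.warn(
            f"Only {available} trained algorithm(s) found, fewer than n_best={n_best}; "
            "using all of them."
        )
        n_best = available
    return ranked[:n_best]


# One algorithm trained on one fold, but n_best left at 2.
chosen = pick_n_best(["algo_a_fold0"], n_best=2)
```

The caller still gets a usable ensemble, and the warning tells the user that the requested `n_best` was not honoured.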

wyli pushed a commit that referenced this issue Dec 14, 2022
)

Signed-off-by: Mingxin Zheng
<18563433+mingxin-zheng@users.noreply.github.com>

Fixes items 3 and 5 in #5564.


### Description

The `vote` method in `ensemble_pred` currently does not work in
sigmoid mode, because the function overrides the argument to False
before calling `VoteEnsemble`.

Also, if the user trains only a small number of algorithms (1 fold for 1
algo) and forgets to update `n_best` (default is 5) in
`AlgoEnsembleBestN`, instead of throwing an error, the fix will
automatically use all available algos after posting a warning.

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
- [x] Integration tests passed locally by running `./runtests.sh -f -u
--net --coverage`.

Signed-off-by: Mingxin Zheng <18563433+mingxin-zheng@users.noreply.github.com>
@myron
Collaborator Author

myron commented Dec 14, 2022

@mingxin-zheng OK, thank you. I've added a 6th item (it's non-urgent).

@mingxin-zheng
Contributor

@myron Thanks for the suggestion. When should the saving happen, right after infer_instance.predict()?

Also, item 2 is fixed, but this enhancement isn't implemented yet, just fyi
"improve AlgoEnsembleBestByFold to return N best from each fold. Currently it returns only 1 best result, but we need it to return for e.g. N=2, or N=3)"

@myron
Collaborator Author

myron commented Dec 15, 2022

@myron Thanks for the suggestion. When should the saving happen, right after infer_instance.predict()?

Also, item 2 is fixed, but this enhancement isn't implemented yet, just fyi "improve AlgoEnsembleBestByFold to return N best from each fold. Currently it returns only 1 best result, but we need it to return for e.g. N=2, or N=3)"

No, after self.ensemble_pred():
https://github.com/Project-MONAI/MONAI/blob/dev/monai/apps/auto3dseg/ensemble_builder.py#L162

It will require some redesign so that a single loop handles each file completely: for each file (infer all models, ensemble, and save). Currently we have two loops: for each file (infer all models, ensemble), then again for each file (save).
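
The single-loop layout described above could look roughly like the sketch below. It is only a schematic: `ensemble_and_save`, `models`, `combine`, and `save` are hypothetical callables standing in for inference, `ensemble_pred`, and the image writer, and a dictionary plays the role of the disk in the toy usage.

```python
def ensemble_and_save(files, models, combine, save):
    """For each file: infer with every model, ensemble, and save right away.

    Only the predictions for the current file are held in memory; nothing is
    accumulated across files except the (small) list of saved identifiers.
    """
    saved = []
    for path in files:
        preds = [model(path) for model in models]  # infer all models on this file
        result = combine(preds)                    # ensemble the predictions
        saved.append(save(path, result))           # write out before the next file
    return saved


# Toy usage with stand-in callables.
store = {}


def save_to_store(path, result):
    store[path] = result  # a real writer would save to disk here
    return path


models = [lambda path: len(path), lambda path: len(path) + 2]


def mean(preds):
    return sum(preds) / len(preds)


written = ensemble_and_save(["a.nii", "bb.nii"], models, mean, save_to_store)
```

Peak memory then depends on the size of one file's predictions rather than on the number of files.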

@myron myron closed this as completed May 1, 2023