Notable performance difference between nnUNet v1 and v2 #1779
Replies: 3 comments 5 replies
-
Hey, we recently had a problem where predicted logits could overflow, resulting in erroneously predicted background in some locations. This manifested as holes in the segmentations. The issue was also related to the dataset having many classes (this was TotalSegmentator). We pushed a new release last week that addresses it. Can you please try that? Just rerun the validation/inference, no need to retrain!
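For anyone curious what "logits overflowing into background" can look like, here is a hypothetical NumPy illustration (not nnU-Net's actual code): when large logits are stored in float16, several class logits can saturate to inf, and argmax then returns the first tied index, i.e. class 0 (background).

```python
import numpy as np

# Illustrative logits for three classes at one voxel; class 1 should win.
logits_fp32 = np.array([66000.0, 70000.0, 68000.0], dtype=np.float32)

# float16 can only represent values up to ~65504, so all three
# logits overflow to inf when cast down.
logits_fp16 = logits_fp32.astype(np.float16)

print(np.argmax(logits_fp32))  # 1 -> correct foreground class
print(np.argmax(logits_fp16))  # 0 -> ties on inf, background "wins"
```

This is why the artifact shows up as background holes rather than random noise: the tie-break always favors index 0.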
-
Hi, apologies for the delay, I ran into some issues. Upon rereading I noticed that instead of pulling at v2.2, as I thought you meant, you actually meant the most recent commits in git (commit 2bc504d, for posterity). Unfortunately, the pipeline to rebuild the Docker->Singularity image takes a long while to complete, so I can't quite test this until next week.
-
Hey @MathijsdeBoer , apologies for the delay in our response. I have been on parental leave this past month. Have you been able to run additional experiments, or were you able to narrow down the problem in some other way?
-
Hey there!
I've been using nnUNet for a little while and am in the process of moving some earlier v1-based models to v2 for easier distribution. Virtually all my transferred datasets result in models with statistically identical performance, except one. This dataset consists of TOF-MRA scans with individual arterial segments labelled, resulting in some 21 classes.
Whereas the v1 model showed remarkably good performance, I have been unable to replicate this in the v2 environment. I got close by attempting a two-stage variation, where the first model predicts a binary foreground/background label that is fed into the next stage. When using a binary mask based on the manual GT in the second stage, I got excellent performance, roughly on par with v1, but when I swapped this binary mask for the predictions of the first-stage model, performance dropped back to the regular v2 model's level. Finally, I attempted to create a similar binary mask using some basic automatic seedpoint selection and region growing techniques. Unfortunately still no dice, if you'll excuse the pun.
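For context, the seedpoint + region-growing mask I mean is roughly the following sketch (the threshold, connectivity, and seed choice here are placeholders, not my actual values): bright TOF-MRA voxels are thresholded, then only the connected component containing the seed is kept as the a-priori vessel mask.

```python
import numpy as np
from scipy import ndimage

def grow_from_seed(image: np.ndarray, seed: tuple, threshold: float) -> np.ndarray:
    """Keep only the bright connected component containing the seed voxel."""
    candidates = image > threshold            # bright (vessel-like) voxels
    labeled, _ = ndimage.label(candidates)    # connected-component labelling
    return labeled == labeled[seed]           # component the seed falls in

# Toy volume with one bright 4x4x4 blob standing in for a vessel.
img = np.zeros((16, 16, 16), dtype=np.float32)
img[4:8, 4:8, 4:8] = 1.0
mask = grow_from_seed(img, seed=(5, 5, 5), threshold=0.5)
print(mask.sum())  # 64 voxels in the grown region
```

The resulting binary mask is then stacked with the image as the second input channel, analogous to the GT-based variant.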
- v1 Base: standard nnUNet v1 preprocessing and training
- v2 Base: standard nnUNet v2 preprocessing and training, data directly copied from the v1 Task
- v2 TwoStage w/ GT: model with a two-channel input: one channel is the image, the other is the manual GT binarized into one overall foreground class
- v2 TwoStage w/ previous model: same as w/ GT, but with the output of a single-class nnU-Net instead
- v2 a-priori: two-channel input, with a region-growing label in the second channel to mimic TwoStage w/ GT
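The two-channel input used in the TwoStage and a-priori variants above can be sketched like this (function and variable names are illustrative, not nnU-Net internals): the 21-class label map is collapsed into a single binary foreground mask and stacked with the image.

```python
import numpy as np

def make_two_channel_input(image: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Stack the image with a binarized label map as a second channel."""
    foreground = (labels > 0).astype(image.dtype)  # collapse all 21 classes to 1
    return np.stack([image, foreground], axis=0)   # shape: (2, *spatial)

# Toy example: an 8^3 volume with random labels in [0, 21].
img = np.random.rand(8, 8, 8).astype(np.float32)
lab = np.random.randint(0, 22, size=(8, 8, 8))
x = make_two_channel_input(img, lab)
print(x.shape)  # (2, 8, 8, 8)
```

In the GT variant the mask comes from the manual labels; in the other variants it is swapped for the stage-one prediction or the region-growing result.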
I'm running out of things to try, and my one remaining theory is that the v1 model simply had a very lucky initialization. I haven't tested this theory yet by retraining a v1 model and collecting the metrics. I was wondering whether people who are a little more intimately familiar with both codebases have any other ideas.