
Initialization with shared reference leads to averaging the losses after each epoch. #2

Closed
gabrielriqu3ti opened this issue Sep 9, 2021 · 1 comment

@gabrielriqu3ti

Hey,

I believe I have found a bug in this project.

When I train a GLIA-Net network, the total, local, and global average losses are all equal after each epoch.

An example follows:

2021-09-04 23:39:19 [MainThread] INFO [TaskAneurysmSegTrainer] - (Time epoch: 6081.78)train epoch 57/66 finished. total_loss_avg: 0.1874 local_loss_avg: 0.1874 global_loss_avg: 0.1874 ap: 0.1771 auc: 0.9058 precision: 0.6890 recall: 0.0313 dsc: 0.0598 hd95: 20.5659 per_target_precision: 0.0385 per_target_recall: 0.0057

I believe the problem originates from the initialization of the OrderedDicts avg_losses and eval_avg_losses, created in the following lines:

avg_losses = OrderedDict(zip(list(losses.keys()), [RunningAverage()] * len(losses)))

eval_avg_losses = OrderedDict(zip(list(losses.keys()), [RunningAverage()] * len(losses)))

The exact problem is that the list containing the 3 RunningAverages is created using the notation [a] * n. This notation replicates a reference to a single shared object rather than creating n independent objects, so all three entries point to the same RunningAverage. Therefore, when we update one of them, all three losses are updated together.

A solution for this problem would be to use the initialization [RunningAverage() for _ in range(len(losses))] instead of [RunningAverage()] * len(losses), since the list comprehension calls the constructor once per element.
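
Here is a minimal sketch of the behavior, using a simplified RunningAverage stand-in (the real class lives in this repository's utilities; its interface here is an assumption):

```python
from collections import OrderedDict


class RunningAverage:
    """Simplified stand-in for the project's RunningAverage (interface assumed)."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0

    def update(self, value):
        self.count += 1
        self.sum += value

    @property
    def avg(self):
        return self.sum / max(self.count, 1)


keys = ['total_loss', 'local_loss', 'global_loss']

# Buggy: [obj] * n repeats one reference, so every key aliases the same object.
buggy = OrderedDict(zip(keys, [RunningAverage()] * len(keys)))
buggy['local_loss'].update(1.0)
print([a.avg for a in buggy.values()])  # [1.0, 1.0, 1.0] -- all three updated together

# Fixed: the comprehension calls RunningAverage() once per key.
fixed = OrderedDict(zip(keys, [RunningAverage() for _ in range(len(keys))]))
fixed['local_loss'].update(1.0)
print([a.avg for a in fixed.values()])  # [0.0, 1.0, 0.0] -- independent objects
```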

This solution seems to work for me.

An example follows:

2021-09-08 20:58:40 [MainThread] INFO [TaskAneurysmSegTrainer] - (Time epoch: 6101.42)train epoch 86/86 finished. total_loss_avg: 0.2528 local_loss_avg: 0.2200 global_loss_avg: 0.0328 ap: 0.4419 auc: 0.9313 precision: 0.5942 recall: 0.3963 dsc: 0.4755 hd95: 12.1462 per_target_precision: 0.1321 per_target_recall: 0.1637

As we can see, the total average loss now equals the sum of the local and global average losses (0.2528 = 0.2200 + 0.0328).

Great work again!

Best regards,

Gabriel

MeteorsHub added a commit that referenced this issue Sep 9, 2021
@MeteorsHub
Owner

MeteorsHub commented Sep 9, 2021

Hi, Gabriel.

Thank you for the bug report. I have fixed this bug according to your suggestion.

Best regards
