
Retry error on loading validation set images in multiclass project #32

Closed
jenspete opened this issue Jun 17, 2023 · 13 comments

jenspete commented Jun 17, 2023

I believe I have tracked the error down to line 336 in im_utils.py:

"all_annot_fnames = set(cur_annot_fnames + prev_annot_fnames)"

Reading line 338 makes me assume that the variable "all_annot_fnames" is supposed to match index-wise with all_dirs, as cur_annot_fnames does. However, cur_annot_fnames has repeated elements, because it contains filenames across multiple directories corresponding to each of the multiple classes. Moreover, the set data structure does not preserve order, so any correspondence between all_annot_fnames and all_dirs is lost. The end result is that filenames are sometimes assumed to be in directories they are not actually present in, which triggers the retry error. If my interpretation is correct, I also assume that a much smaller validation set is actually used than what is present, since the set collapses filenames that repeat across classes into a single entry.
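The ordering problem can be sketched in a few lines of Python; the directory and file names below are hypothetical, not taken from an actual project:

```python
# Hypothetical data: the same scan filename appears once per class directory,
# so cur_annot_fnames contains duplicates that line up index-wise with dirs.
all_dirs = ['annot_heart/train', 'annot_liver/train', 'annot_heart/train']
cur_annot_fnames = ['scan_01.nii.gz', 'scan_01.nii.gz', 'scan_02.nii.gz']
prev_annot_fnames = []

# Converting to a set collapses the duplicates and discards the ordering,
# so any index-wise correspondence with all_dirs is lost.
all_annot_fnames = set(cur_annot_fnames + prev_annot_fnames)

print(len(cur_annot_fnames))   # 3 entries, aligned with all_dirs
print(len(all_annot_fnames))   # only 2 entries survive the set conversion
```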

jenspete (Author) commented

While I am still concerned about the above code, it seems the exception can be traced to lines 276-279 in datasets.py. In my interpretation, the loop runs over all classes and assumes a filename is present in each, which I guess is not the case if not every class was annotated? Or is this now an assumption?

Abe404 (Owner) commented Jun 18, 2023

Thanks for the extra details. The code is wrong as far as I can tell - I think it should not assume that every class is annotated.

Just for my reference, this is related to: #30

I will look into both soon and propose a fix.

Abe404 (Owner) commented Jun 21, 2023

I've managed to reproduce the issue now, or at least I am getting errors about missing validation annotations that I should not be getting. I'm working on an automated test and will update when I have more progress.

Abe404 (Owner) commented Jun 22, 2023

Note:

We may want to consider the implications of the following code:

all_annot_fnames = set(cur_annot_fnames + prev_annot_fnames)

See how this line assumes that dirs and fnames correspond:
https://github.com/Abe404/RootPainter3D/blob/bedb9dc66d0c7d8cb9bfde2d0b4c2eedfd99cd6a/trainer/im_utils.py#LL338C8-L338C8

But we converted the dirs to a set before this. Can we rely on consistent ordering?

Also, it looks like we only get a single instance of each scan if we use a set. So the validation set would only include a single class for each scan, and annotations for the other classes would be ignored?
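One possible direction (a sketch only, not the fix that was eventually committed) is to de-duplicate explicit (dir, fname) pairs rather than bare filenames, which preserves both the ordering and the per-class entries. All names here are illustrative:

```python
# Sketch: de-duplicate (annot_dir, fname) pairs instead of filenames alone,
# so each fname stays tied to its directory and per-class entries survive.
# Function name and data are illustrative, not taken from im_utils.py.
def unique_annot_pairs(cur_pairs, prev_pairs):
    """Return (dir, fname) pairs with duplicates removed, order preserved."""
    seen = set()
    result = []
    for pair in cur_pairs + prev_pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result

pairs = unique_annot_pairs(
    [('annot_heart/train', 'scan_01.nii.gz'),
     ('annot_liver/train', 'scan_01.nii.gz')],
    [('annot_heart/train', 'scan_01.nii.gz')])
# Both class annotations of scan_01 survive, in their original order.
```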

Abe404 self-assigned this Jul 16, 2023
Abe404 (Owner) commented Jul 17, 2023

I have written a failing integration test for this now:
67a18ff

The next step is a failing unit test that better isolates the problem in im_utils.py.

Abe404 (Owner) commented Jul 17, 2023

Failing unit test now implemented: 9552497

(It depends on the commit after it to run.)

It doesn't isolate the problem quite as much as I would like (that would require some refactoring, and I don't want to shave the yak too much), but it shows what is going wrong: the patch refs do not have the correct directories in a multi-class project.

Abe404 (Owner) commented Jul 17, 2023

Unit test now passing:
c528ba2

But it looks like there's still a problem with the integration test.

The validation code still assumes there's an annotation for every class for every image in the validation set.

def get_annots_for_image(self, annot_fname):
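A tolerant version might look roughly like the following. This is an illustrative sketch, not the repository's implementation; the real get_annots_for_image has a different signature, and the argument names and return shape here are assumptions:

```python
import os

# Sketch: collect annotations per class for one image, skipping classes
# that have no annotation file instead of assuming one exists for every
# class. annot_dirs and the return shape are illustrative assumptions.
def get_annots_for_image(annot_dirs, annot_fname):
    found_dirs = []
    annot_paths = []
    for annot_dir in annot_dirs:
        annot_path = os.path.join(annot_dir, annot_fname)
        if os.path.isfile(annot_path):  # tolerate missing class annotations
            found_dirs.append(annot_dir)
            annot_paths.append(annot_path)
    return found_dirs, annot_paths
```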

Abe404 (Owner) commented Jul 17, 2023

Integration test now passing also.
7b14aa0

I believe this is fixed but I will try some more manual testing with everything running connected to the GUI just to make sure.

Abe404 (Owner) commented Jul 17, 2023

I'm occasionally getting this issue when testing with the total segment dataset: #34

It is intermittent due to the test picking random images. I will aim to address this soon (as a separate issue).

Abe404 (Owner) commented Jul 20, 2023

#34 is almost finished. I will close both issues once all tests pass and I've tested it out manually.

I think #29 might also need a bit more work for the manual test to go well.

Abe404 (Owner) commented Jul 27, 2023

This is fixed now. Please let me know if you run into any more problems.

Abe404 closed this as completed Jul 27, 2023
Abe404 (Owner) commented Jul 30, 2023

I found another issue related to this, so I'm re-opening.

There is a problem with the function load_train_image_and_annot

def load_train_image_and_annot(dataset_dir, train_seg_dirs, train_annot_dirs, use_seg,

It is supposed to always return matching lists of class names and annotations. When force_fg is true, it may omit annotations that have no foreground (as these should not be used in training) while still returning the corresponding class, leading to a mismatch between the list of classes and the list of annotations, which causes an exception in the loss function.

This means training can crash for projects where only the background is annotated for some images, without any foreground annotated.
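The invariant can be sketched as follows: if an annotation is dropped, its class entry must be dropped with it. The function name and toy data below are illustrative (not the actual fix), assuming annotations are arrays where a truthy value marks foreground:

```python
# Sketch of the invariant described above: when force_fg drops an
# annotation with no foreground, the matching class entry must be
# dropped too, or the two lists go out of sync at the loss function.
def filter_annots(classes, annots, force_fg):
    """Filter classes and annotations together, preserving alignment."""
    if not force_fg:
        return classes, annots
    kept = [(c, a) for c, a in zip(classes, annots) if any(a)]
    kept_classes = [c for c, _ in kept]
    kept_annots = [a for _, a in kept]
    return kept_classes, kept_annots

classes = ['heart', 'liver']
annots = [[0, 0, 0], [0, 1, 0]]  # heart annotated with background only
classes, annots = filter_annots(classes, annots, force_fg=True)
assert len(classes) == len(annots)  # lists stay aligned
```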

This is the test that revealed the bug:
0a701df

(testing multi class training works on total segmentor dataset).

Abe404 reopened this Jul 30, 2023
Abe404 (Owner) commented Jul 30, 2023


Fixed with: 0449648

Abe404 closed this as completed Jul 30, 2023