Retry error on loading validation set images in multiclass project #32
Comments
While I am still concerned about the above code, it seems the exception can be traced to lines 276-279 in datasets.py. As I read it, the loop runs over all classes and assumes a filename is present in each, which I guess is not the case if not every class was annotated? Or is this an assumption now? |
Thanks for the extra details. The code is wrong as far as I can tell. I think it should not assume that every class is annotated. Just for my reference, this is related to #30. I will look into both soon and propose a fix. |
I've managed to reproduce the issue now, or at least I am getting errors with missing validation annotations that I should not be. I'm working on an automated test. Will update when I have more progress. |
Note: We may want to consider the implications of the following code: RootPainter3D/trainer/im_utils.py Line 336 in bedb9dc
See how this line assumes that dirs and fnames correspond: But we converted the dirs to a set before this. Can we rely on consistent ordering? Also, it looks like we only get a single instance of each scan if we use a set. So the validation set would only include a single class for each scan, and other annotations for that scan would be ignored? |
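To make the concern above concrete, here is a minimal sketch of the effect. It is not the code from im_utils.py: the directory names, filenames, and the helper `unique_dir_fname_pairs` are hypothetical, chosen only to show how a set of filenames drops per-class entries and how keeping (directory, filename) pairs avoids that.

```python
def unique_dir_fname_pairs(dirs, fnames):
    """Deduplicate (dir, fname) pairs while keeping first-seen order.

    Hypothetical helper for illustration; not part of RootPainter3D.
    """
    seen = set()
    pairs = []
    for d, f in zip(dirs, fnames):
        if (d, f) not in seen:
            seen.add((d, f))
            pairs.append((d, f))
    return pairs


# Made-up example: scan_01 is annotated for two classes, scan_02 for one.
dirs = ["annotations/liver", "annotations/spleen", "annotations/liver"]
fnames = ["scan_01.nii.gz", "scan_01.nii.gz", "scan_02.nii.gz"]

# A set of filenames keeps one entry per scan and has no defined order,
# so it can no longer be matched index-wise against dirs.
collapsed = set(fnames)

# Pairing each filename with its directory keeps every class annotation.
pairs = unique_dir_fname_pairs(dirs, fnames)

print(len(collapsed), len(pairs))
```

Keeping the directory and filename together in one tuple means no later step has to rely on two separate lists staying aligned.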
I have written a failing integration test for this now: Next step is a failing unit test that better isolates the problem in im_utils.py |
Failing unit test now implemented: 9552497 (depends on the commit after it to run). It doesn't isolate the problem quite as much as I would like (that would, I think, require some refactoring, and I don't want to shave the yak too much), but it shows what is going wrong: patch refs do not have the correct directories in a multiclass project. |
Unit test now passing: But it looks like there's still a problem with the integration test. The validation code still assumes there's an annotation for every class for every image in the validation set. RootPainter3D/trainer/datasets.py Line 286 in c528ba2
|
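For illustration, one way to avoid assuming an annotation exists for every class is to check each class directory and keep only the classes that have a file for the image. This is a sketch under assumptions, not the fix that was committed: `find_val_annots`, the directory layout, and the filenames are all hypothetical.

```python
import os
import tempfile


def find_val_annots(annot_dirs_by_class, fname):
    """Return {class_name: path} for classes that have an annotation for fname.

    Hypothetical helper for illustration; classes without a file are skipped
    instead of raising an error.
    """
    found = {}
    for class_name, annot_dir in annot_dirs_by_class.items():
        path = os.path.join(annot_dir, fname)
        if os.path.isfile(path):
            found[class_name] = path
    return found


# Made-up project layout: two classes, only one of them annotated for scan_01.
with tempfile.TemporaryDirectory() as root:
    annot_dirs = {}
    for name in ("liver", "spleen"):
        annot_dirs[name] = os.path.join(root, name, "val")
        os.makedirs(annot_dirs[name])
    open(os.path.join(annot_dirs["liver"], "scan_01.nii.gz"), "w").close()

    found = find_val_annots(annot_dirs, "scan_01.nii.gz")
    found_classes = sorted(found)

print(found_classes)
```

The caller then computes validation metrics only for the classes in the returned dictionary.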
Integration test now passing also. I believe this is fixed but I will try some more manual testing with everything running connected to the GUI just to make sure. |
I'm occasionally getting this issue when testing with the TotalSegmentator dataset: #34 It is intermittent because the test picks random images. I will aim to address this soon (as a separate issue). |
This is fixed now. Please let me know if you run into any more problems. |
I found another issue related to this so re-opening. There is a problem with the function load_train_image_and_annot RootPainter3D/trainer/im_utils.py Line 246 in 3bfdbe5
It is supposed to always return matching lists of class names and annotations. When force_fg is true, it may skip annotations that have no foreground (as these should not be used in training), but it still returns the corresponding class. This leads to a mismatch between the list of classes and the list of annotations, which causes an exception in the loss function. As a result, training can crash for projects where some images have only background annotated, with no foreground. This is the test that revealed the bug: (testing that multiclass training works on the TotalSegmentator dataset). |
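A minimal sketch of the invariant described above, assuming each annotation can be represented as a dict with flat "bg" and "fg" lists. That representation and the function `select_training_annots` are hypothetical and are not the real load_train_image_and_annot. The point is that a class name is dropped together with its annotation, so the two lists stay the same length.

```python
def select_training_annots(class_names, annots, force_fg):
    """Filter annotations and class names together so they stay aligned.

    Hypothetical illustration: each annot is a dict with 'bg' and 'fg'
    lists of 0/1 values. With force_fg, annotations without any foreground
    are skipped, and so is the corresponding class name.
    """
    kept_classes = []
    kept_annots = []
    for name, annot in zip(class_names, annots):
        has_fg = any(annot["fg"])
        if force_fg and not has_fg:
            continue  # skip the class and the annotation as a pair
        kept_classes.append(name)
        kept_annots.append(annot)
    return kept_classes, kept_annots


# Made-up data: liver has foreground, spleen has only background annotated.
liver = {"bg": [1, 0, 0], "fg": [0, 1, 0]}
spleen = {"bg": [1, 1, 0], "fg": [0, 0, 0]}

classes_fg, annots_fg = select_training_annots(
    ["liver", "spleen"], [liver, spleen], force_fg=True)
classes_all, annots_all = select_training_annots(
    ["liver", "spleen"], [liver, spleen], force_fg=False)

print(classes_fg, len(annots_fg), classes_all, len(annots_all))
```

Appending to both lists in the same branch makes a length mismatch impossible by construction.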
Fixed with: 0449648 |
I believe I have tracked the error down to line 336 in im_utils.py:
"all_annot_fnames = set(cur_annot_fnames + prev_annot_fnames)"
Reading line 338, I assume that the variable "all_annot_fnames" is supposed to match all_dirs index-wise, as cur_annot_fnames does. However, cur_annot_fnames has repeated elements, because it contains filenames from multiple directories, one for each class. Moreover, the set data structure does not preserve order, so any correspondence between cur_annot_fnames and all_dirs is lost. The end result is that filenames are sometimes assumed to be in directories where they are not actually present, which triggers the retry error. If my interpretation is correct, I also assume that a much smaller validation set is actually used than what is present, since the set keeps only one entry per filename even when that filename is annotated for several classes.