Random Failures #12

tchott · 2023-01-10T21:03:54Z

Hello!

I have been experiencing this random error with your code. If I run this code multiple times (this was cut from a pytest), it will randomly fail due to the len of keep being 0. Here are my fixed values:

import numpy  as np 
from lsnms import nms  #v0.3.1
nms_test_dict_failure = {
     'scores': np.array([0.03128776, 0.15489164, 0.05489164]) 
     'boxes': np.array([[ 623.47991943,  391.94015503,  675.83850098,  445.0836792 ],
         [  11.48574257,   15.99506855, 1053.84313965, 1074.78381348],
         [  11.48574257,   15.99506855, 1053.84313965, 1074.78381348]])
     'class_labels': np.array([ 1, 27, 23]),
     'score_thresh': 0.1} 

def test_lsnms_failure_scenario(): 
  keep = nms(
              boxes=nms_test_dict_fail["boxes"],
              scores=nms_test_dict_fail["scores"],
              score_threshold=nms_test_dict_fail["score_thresh"],
              class_ids=nms_test_dict_fail["class_labels"],
          )   
  assert len(keep)==1

Is there some randomness to how the tree is generated? Here is why I am confused. For this specific example:

Two of the bboxes are the exact same but are distinct in their class label.
Each of the three class ids are distinct

I am wondering if something is off in your offset_bboxes code that would create boxes where these examples would be overlapping. Regardless, based on the chosen score threshold alone, it seems I should always have a singular detection kept, regardless of the associated bboxes and their associated categories. It would be weird to have a singular detection make it successfully through the scores check of your code on

lsnms/lsnms/nms.py

Line 23 in 62139e6

boxes = boxes[scores > score_threshold]

and for that singular detection to get nms into 0 detections in the end.

Do you have any insights into what I might be failing to consider in this scenario?

Thanks!

Taylor Chott.

The text was updated successfully, but these errors were encountered:

tchott · 2023-01-10T21:52:46Z

I have further isolated this down to being purely dependent on the fact that there are 3 class ids leveraged. In the case that all detections have the same class id, this is NOT an issue and the number of detections returned from nms is correct.

remydubois · 2023-01-11T12:12:42Z

Hello,

thank you for your issue.
The code used to offset the bboxes class-wise is quite arbitrary, and some parameters were chosen ad-hoc (such as

lsnms/lsnms/util.py

Line 387 in 62139e6

max_offset = bboxes[:, dimensionality:].max() + 2

), maybe you can try tweaking the parameter here (I have no access to my laptop until sunday eow).
Let me know

remydubois · 2023-01-15T11:56:16Z

Hello,

Sorry for my previous answer which was not so on point. I managed to find the bug, even if I fail to completely understand where that randomness comes from (which I witnessed, regardless of the class labels).
Basically: when masking the boxes based on their score, I forgot to mask the scores as well. This is messy because right after, I order the boxes by their score, so there is a mismatch between the amount of boxes and the amount of scores to filter out, see the associated PR.

Thank you very much for reporting this issue which unfortunately hid under my radars !

remydubois · 2023-01-16T20:14:01Z

I took some time to dig into the randomness thing, and I have no explanation of the phenomenon. Basically it seems Numba rather returns random bits when trying to slice arrays out-of-index, (which happened before the fix) when there were more scores than boxes after the score thresholding.

Something as simple as this shows this faulty behaviour.

@njit
def f(x, i):
    return x[i]

x = np.arange(10)

>>> f(x, 10) # Works fine
>>> f(x, 11) # Will throw random garbage, while it should raise IndexError

Without this weird bug, lsnms.nsm would raise IndexError instead of throwing random results. The issue is now fixed on lsnms but probably something I should raise to the numba maintainers.

Edit: this is an expected behaviour of Numba, see bounds checking

tchott · 2023-01-20T18:48:37Z

Awesome! thanks for digging in Remy! Any idea on when your next release will be?

remydubois · 2023-01-21T12:54:26Z

I will try to merge and release 0.3.2 in the coming days.
Also, I will try and do the same for 0.4.0 which should implement compilation caching and bring significant speedup.

- Fixed issue #12, by masking scores as well as boxes. - Added torch and torchvision as proper dev dependencies - Fixed Pillow version (dev dep) to 9.3.0 in dev dependencies because 9.4.0 does not compile on my mbp (see python-pillow/Pillow#6862) - Removed deprecated arguments: `cutoff_distance` and `tree`. Removed associated tests. - Added sanity check to ensure `leaf_size` is strictly positive.

remydubois · 2023-01-21T17:05:06Z

0.3.2 should fix the issue and is released on pypi.
Closing the issue but feel free to re-open if problem arises again.

remydubois added the bug Something isn't working label Jan 11, 2023

remydubois mentioned this issue Jan 15, 2023

Fixed issue 12, updated changelog #13

Merged

remydubois closed this as completed Jan 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Random Failures #12

Random Failures #12

tchott commented Jan 10, 2023 •

edited

Loading

tchott commented Jan 10, 2023

remydubois commented Jan 11, 2023

remydubois commented Jan 15, 2023

remydubois commented Jan 16, 2023 •

edited

Loading

tchott commented Jan 20, 2023

remydubois commented Jan 21, 2023

remydubois commented Jan 21, 2023

Random Failures #12

Random Failures #12

Comments

tchott commented Jan 10, 2023 • edited Loading

tchott commented Jan 10, 2023

remydubois commented Jan 11, 2023

remydubois commented Jan 15, 2023

remydubois commented Jan 16, 2023 • edited Loading

tchott commented Jan 20, 2023

remydubois commented Jan 21, 2023

remydubois commented Jan 21, 2023

tchott commented Jan 10, 2023 •

edited

Loading

remydubois commented Jan 16, 2023 •

edited

Loading