Bug: Deduplication fails when oldest matching finding is mitigated
Description
Deduplication stops working when the oldest matching finding (by unique_id_from_tool or hash_code) has been mitigated. New findings are not marked as duplicates even though there are other active findings with the same identifier/hash_code.
Steps to Reproduce
1. Import a scan report containing a finding (you can test with the one attached to this issue).
2. Manually mitigate that finding.
3. Import the same report again. A new finding is created (expected, as it is a regression).
4. Import the same report a third time. Another new finding is created instead of being marked as a duplicate of the second one.
5. Reopen the first (mitigated) finding.
6. Import the report again. Now the finding is correctly marked as a duplicate.
Expected Behavior
In step 4, the third finding should be marked as a duplicate of the second finding (which is active).
Actual Behavior
The third finding is created as a new active finding, not a duplicate. Deduplication silently fails.
Debug Logs
With deduplication debug logging enabled, the logs show:
Finding 12699703: Found 30 findings with same unique_id_from_tool
new_finding.status(): 12119802 Active, Verified
existing_finding.status(): 1211700 Inactive, Mitigated
Found a regression. Ignore this so that a new duplicate chain can be made
The deduplication finds 30 candidates, but only tries the oldest one. When that fails because it's mitigated, it stops instead of trying the next candidate.
Possible Root Cause
Likely the recent refactoring of the deduplication logic in #13491.
The old code (`deduplicate_unique_id_from_tool(new_finding)` in `dojo/utils.py`, line 428 at commit 8f98d4e) had a single loop that would try the next candidate on failure.
The new code has no way to try the next candidate. I think the problem is that there is no `continue` when `set_duplicate` fails (the mitigated/regression case):
https://github.com/DefectDojo/django-DefectDojo/blob/87ff93ad596e30dc905d0c33147148865d528a5f/dojo/finding/deduplication.py#L474C1-L488C50
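The failure mode can be sketched as a plain Python loop (hypothetical names and structure, not DefectDojo's actual code). With the `continue`, a mitigated candidate is skipped and the next candidate is tried; without it, the loop gives up after the oldest candidate, which is exactly the behavior seen in the debug logs:

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Minimal stand-in for a finding; only the fields the sketch needs."""
    id: int
    mitigated: bool


def find_duplicate_target(candidates):
    """Return the first active candidate to use as the duplicate original.

    Hypothetical sketch of the deduplication candidate loop; `candidates`
    is assumed to be sorted oldest-first, as in the reported logs.
    """
    for existing in candidates:
        if existing.mitigated:
            # Regression case: a mitigated finding cannot anchor the
            # duplicate chain. Without this `continue`, the loop would stop
            # at the oldest candidate and the new finding would never be
            # matched against the later, still-active candidates.
            continue
        return existing  # first active candidate becomes the original
    return None  # no active candidate: keep the new finding as-is
```

In the repro above, the oldest candidate is mitigated and the second is active, so the sketch returns the second candidate; dropping the `continue` (or returning early on the mitigated case) reproduces the bug.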
Environment
- DefectDojo version: 2.53.1
- Scanner: Any (reproduced with AWS Prowler V3)
Impact
Users with mitigated findings will see duplicate findings accumulate instead of being properly deduplicated.
one_vuln.json