I loaded test findings with the same component_name and severity, and with the same host, port and path in the endpoint. I noticed that the hash_code is the same for both findings, but the finding is not marked as a duplicate after deduplication runs. Logs from dojo.specific-loggers.deduplication end after this line:
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:256] Starting deduplication by endpoint fields for finding 40958 with urls [DecodedURL(url=URL.from_text('tcp://10.20.197.218:6379'))] and finding 40957 with urls [DecodedURL(url=URL.from_text('10.20.197.218:6379'))]
The only difference was in the endpoints. While debugging, I discovered that the function are_urls_equal in django-DefectDojo/dojo/utils.py returns False. The endpoints passed into that function are parsed by the Python hyperlink module (hyperlink.parse(str(e))), and if an endpoint has no scheme, it is parsed incorrectly:
# endpoint with scheme
>>> import hyperlink
>>> e = hyperlink.parse("tcp://10.20.197.218:6379")
>>> e.scheme
'tcp'
>>> e.host
'10.20.197.218'
>>> e.port
6379
# endpoint without scheme
>>> e = hyperlink.parse("10.20.197.218:6379")
>>> e.scheme
'10.20.197.218'
>>> e.host
''
>>> e.port
>>> e.path
('6379',)
I think this can be fixed by prepending // to the endpoint when the scheme is missing in the get_endpoints_as_url function in django-DefectDojo/dojo/utils.py, as is already done in the tool parsers.
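The idea can be sketched as follows. This is a hypothetical helper, not the actual DefectDojo code, and it uses the standard-library urllib.parse instead of hyperlink purely to illustrate the effect of the // prefix: with it, host:port is parsed as an authority rather than as a scheme.

```python
from urllib.parse import urlsplit

def normalize_endpoint(value):
    """Prepend '//' when no scheme is present so that 'host:port'
    is parsed as an authority, not as a scheme.
    (Hypothetical helper mirroring what the tool parsers already do.)"""
    if "://" in value or value.startswith("//"):
        return value
    return "//" + value

url = urlsplit(normalize_endpoint("10.20.197.218:6379"))
print(url.hostname, url.port)  # 10.20.197.218 6379
```

An endpoint that already carries a scheme (e.g. tcp://10.20.197.218:6379) passes through unchanged, so the fix would only affect the scheme-less case.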
Steps to reproduce
Steps to reproduce the behavior:
Set up the config as in my description
Enable deduplication
Import the attached Redis sample reports from the Nuclei and Nessus scanners
Observe that deduplication does not mark the finding as a duplicate
Logs
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [titlecase:201] Redis - Default Logins
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2074] using HASHCODE_FIELDS_PER_SCANNER for test_type.name: Nuclei Scan
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2082] HASHCODE_FIELDS_PER_SCANNER is: ['severity', 'component_name']
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2091] using HASHCODE_ALLOWS_NULL_CWE for test_type.name: Nuclei Scan
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2099] HASHCODE_ALLOWS_NULL_CWE is: True
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2633] computing hash_code for finding id 40957 based on: severity, component_name
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2650] severity : Critical
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2650] component_name : redis
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2651] compute_hash_code - fields_to_hash = Criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.models:2734] fields_to_hash : Criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.models:2735] fields_to_hash lower: criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2999] Hash_code computed for finding: 56d70dd7468d3a76c5282c21dfb6d96dfc41e0e87b087946e20b612df22da60d
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.models:3028] Saving finding of id 40957 dedupe_option:True (self.pk is not None)
...
django-defectdojo-celeryworker-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:282] dedupe for: 40957:Redis - Default Logins
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.notifications.helper:78] creating personal notifications for event: test_added
django-defectdojo-celeryworker-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2057] using DEDUPLICATION_ALGORITHM_PER_PARSER for test_type.name: Nuclei Scan
django-defectdojo-celeryworker-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2065] DEDUPLICATION_ALGORITHM_PER_PARSER is: hash_code
django-defectdojo-celeryworker-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:285] deduplication algorithm: hash_code
django-defectdojo-uwsgi-1 | [11/May/2024 16:15:22] DEBUG [dojo.notifications.helper:93] Filtering users for the product Deduplication Test
django-defectdojo-celeryworker-1 | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:469] Found 0 findings with the same hash_code
...
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [titlecase:201] Redis Server Unprotected by Password Authentication
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2074] using HASHCODE_FIELDS_PER_SCANNER for test_type.name: Tenable Scan
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2082] HASHCODE_FIELDS_PER_SCANNER is: ['severity', 'component_name']
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2091] using HASHCODE_ALLOWS_NULL_CWE for test_type.name: Tenable Scan
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2099] HASHCODE_ALLOWS_NULL_CWE is: True
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2633] computing hash_code for finding id 40958 based on: severity, component_name
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2650] severity : Critical
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2650] component_name : redis
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2651] compute_hash_code - fields_to_hash = Criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.models:2734] fields_to_hash : Criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.models:2735] fields_to_hash lower: criticalredis
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2999] Hash_code computed for finding: 56d70dd7468d3a76c5282c21dfb6d96dfc41e0e87b087946e20b612df22da60d
django-defectdojo-uwsgi-1 | [11/May/2024 16:22:42] DEBUG [dojo.models:3028] Saving finding of id 40958 dedupe_option:True (self.pk is not None)
...
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:282] dedupe for: 40958:Redis Server Unprotected by Password Authentication
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2057] using DEDUPLICATION_ALGORITHM_PER_PARSER for test_type.name: Tenable Scan
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2065] DEDUPLICATION_ALGORITHM_PER_PARSER is: hash_code
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:285] deduplication algorithm: hash_code
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:469] Found 1 findings with the same hash_code
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:256] Starting deduplication by endpoint fields for finding 40958 with urls [DecodedURL(url=URL.from_text('tcp://10.20.197.218:6379'))] and finding 40957 with urls [DecodedURL(url=URL.from_text('10.20.197.218:6379'))]
django-defectdojo-celeryworker-1 | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:215] Check if url tcp://10.20.197.218:6379 and url 10.20.197.218:6379 are equal in terms of ['host', 'port', 'path'].
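The failing comparison from the last log line can be reproduced in miniature. The function below is a hypothetical simplification of are_urls_equal (the real one works on hyperlink objects; urlsplit from the standard library is used here only as a stand-in): it compares the two URLs on host, port and path, and the scheme-less string yields no hostname at all, so the check fails.

```python
from urllib.parse import urlsplit

def urls_equal(a, b):
    """Compare two URL strings on host, port and path only
    (a simplified stand-in for DefectDojo's are_urls_equal)."""
    ua, ub = urlsplit(a), urlsplit(b)
    return (ua.hostname, ua.port, ua.path) == (ub.hostname, ub.port, ub.path)

# The two URLs from the log above; the scheme-less one parses with
# hostname None and port None, so the comparison returns False.
print(urls_equal("tcp://10.20.197.218:6379", "10.20.197.218:6379"))  # False
```

With a scheme present on both sides the same comparison succeeds, which is why the problem only shows up when one scanner reports endpoints without a scheme.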
Bug description
I tried to set up deduplication between two scanners (in my case between Nessus and Nuclei).
My config
settings.dist.py
Sample scan files
redis_dedupe_examlpe.nuclei.json
redis_dedupe_example.nessus.csv