cross-scanner deduplication incorrect endpoint parsing #10215

Open
macsoun opened this issue May 15, 2024 · 0 comments
Bug description
I tried to set up deduplication between two scanners, in my case Nessus and Nuclei.
My configuration in settings.dist.py:

DEDUPLICATION_ALGORITHM_PER_PARSER = {
    'Tenable Scan': DEDUPE_ALGO_HASH_CODE,
    'Nuclei Scan': DEDUPE_ALGO_HASH_CODE,
}

HASHCODE_FIELDS_PER_SCANNER = {
    'Tenable Scan': ['component_name', 'severity'],
    'Nuclei Scan': ['component_name', 'severity'],
} 

HASH_CODE_FIELDS_ALWAYS = []
DEDUPE_ALGO_ENDPOINT_FIELDS = ['host', 'port', 'path']

I imported test findings that share the same component_name and severity, and whose endpoints have the same host, port, and path. Both findings get the same hash_code, but the second finding is not marked as a duplicate when deduplication runs. The output from dojo.specific-loggers.deduplication ends after this line:

django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:256] Starting deduplication by endpoint fields for finding 40958 with urls [DecodedURL(url=URL.from_text('tcp://10.20.197.218:6379'))] and finding 40957 with urls [DecodedURL(url=URL.from_text('10.20.197.218:6379'))]

The only difference was in the endpoints. While debugging, I found that the are_urls_equal function in django-DefectDojo/dojo/utils.py returns False. The endpoints are parsed with the Python hyperlink module (hyperlink.parse(str(e))) before being passed to that function, and an endpoint without a scheme is parsed incorrectly: the host is read as the scheme and the port as the path.

# endpoint with scheme
>>> import hyperlink
>>> e = hyperlink.parse("tcp://10.20.197.218:6379")
>>> e.scheme
'tcp'
>>> e.host
'10.20.197.218'
>>> e.port
6379

# endpoint without scheme
>>> e = hyperlink.parse("10.20.197.218:6379")
>>> e.scheme
'10.20.197.218'
>>> e.host
''
>>> e.port  # None, so the REPL prints nothing
>>> e.path
('6379',)

I think this can be fixed in the get_endpoints_as_url function in django-DefectDojo/dojo/utils.py by prepending // to the endpoint when the scheme is missing, as the tool parsers already do.
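
To illustrate the idea, here is a minimal sketch of the normalization. It is not DefectDojo's code: ensure_netloc_prefix, endpoint_fields and endpoints_equal are hypothetical helper names, and the standard library's urllib.parse stands in for hyperlink so the example runs without extra dependencies. The assumption is that a schemeless endpoint such as 10.20.197.218:6379 should be treated as host:port.

```python
from urllib.parse import urlsplit

# Mirrors DEDUPE_ALGO_ENDPOINT_FIELDS from the config in this issue.
ENDPOINT_FIELDS = ["host", "port", "path"]


def ensure_netloc_prefix(endpoint: str) -> str:
    """Prefix '//' when no scheme is present (hypothetical helper).

    Simple heuristic: an endpoint containing '://' or already starting
    with '//' is left alone; anything else is assumed to be host[:port][/path].
    """
    if "://" in endpoint or endpoint.startswith("//"):
        return endpoint
    return "//" + endpoint


def endpoint_fields(endpoint: str) -> dict:
    """Parse a normalized endpoint into the fields used for comparison."""
    parts = urlsplit(ensure_netloc_prefix(endpoint))
    return {"host": parts.hostname, "port": parts.port, "path": parts.path}


def endpoints_equal(a: str, b: str, fields=ENDPOINT_FIELDS) -> bool:
    """Compare two endpoints on the selected fields only (scheme is ignored)."""
    fields_a, fields_b = endpoint_fields(a), endpoint_fields(b)
    return all(fields_a[f] == fields_b[f] for f in fields)


# The two endpoints from the log line in this issue.
print(endpoints_equal("tcp://10.20.197.218:6379", "10.20.197.218:6379"))
```

With the prefix added, both endpoints yield the same host, port, and path, so a comparison on those fields succeeds. The same normalized string could presumably be handed to hyperlink.parse instead of urlsplit, but I have not tested that against the DefectDojo code path.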

Steps to reproduce
Steps to reproduce the behavior:

  1. Set up the config as shown in the bug description
  2. Enable deduplication
  3. Import the attached Redis sample files from the Nuclei and Nessus scanners
  4. The second finding is not marked as a duplicate

Logs

django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [titlecase:201] Redis - Default Logins
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2074] using HASHCODE_FIELDS_PER_SCANNER for test_type.name: Nuclei Scan
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2082] HASHCODE_FIELDS_PER_SCANNER is: ['severity', 'component_name']
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2091] using HASHCODE_ALLOWS_NULL_CWE for test_type.name: Nuclei Scan
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2099] HASHCODE_ALLOWS_NULL_CWE is: True
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2633] computing hash_code for finding id 40957 based on: severity, component_name
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2650] severity : Critical
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2650] component_name : redis
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2651] compute_hash_code - fields_to_hash = Criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.models:2734] fields_to_hash      : Criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.models:2735] fields_to_hash lower: criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2999] Hash_code computed for finding: 56d70dd7468d3a76c5282c21dfb6d96dfc41e0e87b087946e20b612df22da60d
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.models:3028] Saving finding of id 40957 dedupe_option:True (self.pk is not None)
...
django-defectdojo-celeryworker-1  | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:282] dedupe for: 40957:Redis - Default Logins
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.notifications.helper:78] creating personal notifications for event: test_added
django-defectdojo-celeryworker-1  | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2057] using DEDUPLICATION_ALGORITHM_PER_PARSER for test_type.name: Nuclei Scan
django-defectdojo-celeryworker-1  | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:2065] DEDUPLICATION_ALGORITHM_PER_PARSER is: hash_code
django-defectdojo-celeryworker-1  | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:285] deduplication algorithm: hash_code
django-defectdojo-uwsgi-1         | [11/May/2024 16:15:22] DEBUG [dojo.notifications.helper:93] Filtering users for the product Deduplication Test
django-defectdojo-celeryworker-1  | [11/May/2024 16:15:22] DEBUG [dojo.specific-loggers.deduplication:469] Found 0 findings with the same hash_code
...
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [titlecase:201] Redis Server Unprotected by Password Authentication
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2074] using HASHCODE_FIELDS_PER_SCANNER for test_type.name: Tenable Scan
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2082] HASHCODE_FIELDS_PER_SCANNER is: ['severity', 'component_name']
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2091] using HASHCODE_ALLOWS_NULL_CWE for test_type.name: Tenable Scan
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2099] HASHCODE_ALLOWS_NULL_CWE is: True
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2633] computing hash_code for finding id 40958 based on: severity, component_name
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2650] severity : Critical
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2650] component_name : redis
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2651] compute_hash_code - fields_to_hash = Criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.models:2734] fields_to_hash      : Criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.models:2735] fields_to_hash lower: criticalredis
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2999] Hash_code computed for finding: 56d70dd7468d3a76c5282c21dfb6d96dfc41e0e87b087946e20b612df22da60d
django-defectdojo-uwsgi-1         | [11/May/2024 16:22:42] DEBUG [dojo.models:3028] Saving finding of id 40958 dedupe_option:True (self.pk is not None)
...
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:282] dedupe for: 40958:Redis Server Unprotected by Password Authentication
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2057] using DEDUPLICATION_ALGORITHM_PER_PARSER for test_type.name: Tenable Scan
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:2065] DEDUPLICATION_ALGORITHM_PER_PARSER is: hash_code
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:285] deduplication algorithm: hash_code
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:469] Found 1 findings with the same hash_code
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:256] Starting deduplication by endpoint fields for finding 40958 with urls [DecodedURL(url=URL.from_text('tcp://10.20.197.218:6379'))] and finding 40957 with urls [DecodedURL(url=URL.from_text('10.20.197.218:6379'))]
django-defectdojo-celeryworker-1  | [11/May/2024 16:22:42] DEBUG [dojo.specific-loggers.deduplication:215] Check if url tcp://10.20.197.218:6379 and url 10.20.197.218:6379 are equal in terms of ['host', 'port', 'path'].
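
For context, the log lines above show why the hash_code matches even though the endpoints differ: only severity and component_name are hashed, and both findings produce the string criticalredis. A rough sketch of that step follows. The function name is hypothetical, and SHA-256 is an assumption inferred from the 64-hex-digit hash in the logs, not taken from the DefectDojo source.

```python
import hashlib


def sketch_hash_code(field_values):
    """Concatenate the configured field values, lowercase, and hash them.

    Assumption: SHA-256 over the lowercased concatenation, as suggested by
    the 'fields_to_hash lower' log line and the 64-character digest.
    """
    fields_to_hash = "".join(field_values)
    return hashlib.sha256(fields_to_hash.lower().encode("utf-8")).hexdigest()


# Values from the logs: severity, then component_name, for both scanners.
nuclei_hash = sketch_hash_code(["Critical", "redis"])
tenable_hash = sketch_hash_code(["Critical", "redis"])

# The endpoint never enters the hash, so the two findings collide here and
# the endpoint comparison afterwards is what decides the duplicate status.
print(nuclei_hash == tenable_hash)
```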

Sample scan files
redis_dedupe_examlpe.nuclei.json
redis_dedupe_example.nessus.csv

@macsoun macsoun added the bug label May 15, 2024