Nuclei parser: UnicodeDecodeError on special url characters (%c0) #9201

Closed
1 of 3 tasks
Tlafay1 opened this issue Dec 20, 2023 · 6 comments
Tlafay1 commented Dec 20, 2023

Bug description
When importing a Nuclei scan whose JSON report contains special URL characters in the matched-at field, a UnicodeDecodeError is thrown (see the stack trace below). To my understanding, this happens because hyperlink.parse (dojo/models.py:2543) percent-decodes the sequence and then tries to interpret the resulting byte as a UTF-8 character, when it should just be treated as ordinary characters.
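
The decoding failure described above can be shown without DefectDojo or hyperlink at all. This is a minimal sketch using only the Python standard library; it is not the code path from the traceback, just the same two steps (percent-decode to bytes, then decode the bytes as UTF-8):

```python
from urllib.parse import unquote_to_bytes

# Step 1: percent-decoding "%c0" yields the single raw byte 0xC0.
raw = unquote_to_bytes("%c0")

# Step 2: 0xC0 is never a valid start byte in UTF-8, so a strict decode fails.
try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The printed message matches the last line of the stack trace in the Logs section.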

Steps to reproduce
Steps to reproduce the behavior:

  1. Run a scan using nuclei: nuclei -target scanme.nmap.org -json-export /tmp/nuclei-poc.json.
  2. In the json report, modify any matched-at field by appending /%c0 at the end of the existing url.
  3. Import the json in any engagement.
  4. Notice the import failure, with an error message similar to the stack trace below.
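
Step 2 can also be scripted. The sketch below is only an illustration: the helper names load_findings and append_to_matched_at are hypothetical, and it assumes the report is either a JSON array of objects or one JSON object per line, each with a "matched-at" key as described above. It patches every finding rather than a single one.

```python
import json


def load_findings(text):
    """Parse a report given as a JSON array or as JSON Lines (assumed formats)."""
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def append_to_matched_at(findings, suffix="/%c0"):
    """Return copies of the findings with suffix appended to each matched-at URL."""
    patched = []
    for finding in findings:
        finding = dict(finding)  # shallow copy so the input is left untouched
        if "matched-at" in finding:
            # Strip a trailing slash first to avoid producing "//%c0".
            finding["matched-at"] = finding["matched-at"].rstrip("/") + suffix
        patched.append(finding)
    return patched


# Example with an in-memory report; for a real file, read /tmp/nuclei-poc.json,
# pass its text to load_findings, and write json.dumps(patched) back out.
sample = '[{"template-id": "demo", "matched-at": "https://example.com"}]'
patched = append_to_matched_at(load_findings(sample))
print(json.dumps(patched))
```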

Expected behavior
Import is successful

Deployment method

  • Docker Compose
  • Kubernetes
  • GoDojo

Environment information

  • Operating System: [Linux kali 6.5.0]
  • DefectDojo version (see footer) or commit message: [DefectDojo/release/2.29.3]

Logs
I removed the code from the try/except block in importer.py to backtrack the issue. I also purposefully removed a second error since fixing this one fixes everything:

----------------------------------
url: https://example.com/%c0
----------------------------------

[20/Dec/2023 13:35:22] ERROR [dojo.engagement.views:983] 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
Traceback (most recent call last):
  File "/app/dojo/engagement/views.py", line 945, in post
    test, finding_count, closed_finding_count, _ = importer.import_scan(
                                                   ^^^^^^^^^^^^^^^^^^^^^
  File "/app/dojo/importers/importer/importer.py", line 456, in import_scan
    parsed_findings = parser.get_findings(scan, test)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dojo/tools/nuclei/parser.py", line 63, in get_findings
    endpoint = Endpoint.from_uri(matched)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dojo/models.py", line 2546, in from_uri
    url = hyperlink.parse(url=uri)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hyperlink/_url.py", line 2447, in parse
    dec_url = DecodedURL(enc_url, lazy=lazy)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hyperlink/_url.py", line 2046, in __init__
    self.host, self.userinfo, self.path, self.query, self.fragment
                              ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hyperlink/_url.py", line 2177, in path
    [
  File "/usr/local/lib/python3.11/site-packages/hyperlink/_url.py", line 2178, in <listcomp>
    _percent_decode(p, raise_subencoding_exc=True)
  File "/usr/local/lib/python3.11/site-packages/hyperlink/_url.py", line 766, in _percent_decode
    return unquoted_bytes.decode(subencoding)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
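
One possible way to avoid this failure is to neutralize undecodable percent sequences before the URL reaches the parser. The sketch below is an assumption-laden illustration, not the fix that was merged for this issue: escape_undecodable_percent is a hypothetical helper, and it uses only the standard library. It re-escapes the "%" of any percent-encoded run that is not valid UTF-8, so a later percent-decode should yield the literal text "%c0" instead of the byte 0xC0.

```python
import re
from urllib.parse import unquote_to_bytes

# One or more consecutive percent-encoded bytes, e.g. "%C3%A9" or "%c0".
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def escape_undecodable_percent(url: str) -> str:
    """Hypothetical helper: turn '%' into '%25' in runs that are not valid UTF-8."""

    def fix(match):
        run = match.group(0)
        try:
            unquote_to_bytes(run).decode("utf-8")
        except UnicodeDecodeError:
            # The whole run is escaped, even if part of it was valid UTF-8.
            return run.replace("%", "%25")
        return run

    return _PCT_RUN.sub(fix, url)


print(escape_undecodable_percent("https://example.com/%c0"))
```

Valid sequences such as "%C3%A9" pass through unchanged. Whether this belongs in the parser or in Endpoint.from_uri is a design choice left open here.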
Tlafay1 added the bug label Dec 20, 2023

Tlafay1 commented Dec 20, 2023

If needed, I can fix this in a pull request relatively soon, as I understand the root cause and could find a fix pretty quickly.

@manuel-sommer
Contributor

@Tlafay1 could you please provide me a sample output? I will make a PR.

manuel-sommer added a commit to manuel-sommer/django-DefectDojo that referenced this issue Dec 20, 2023
@manuel-sommer
Contributor

See PR @Tlafay1


Tlafay1 commented Dec 21, 2023

> @Tlafay1 could you please provide me a sample output? I will make a PR.

I'm not sure what you mean by sample output. Are you talking about a nuclei scan that introduces the bug?

@manuel-sommer
Contributor

> > @Tlafay1 could you please provide me a sample output? I will make a PR.
>
> I'm not sure what you mean by sample output, are you talking about a nuclei scan that introduces the bug ?

Yes, I was talking about a scan that introduces the bug, but I was already able to reproduce it; see the PR.

Maffooch pushed a commit that referenced this issue Dec 22, 2023
* 🐛 fix issue #9201

* flake8
@manuel-sommer
Contributor

This can be closed.
