Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Trufflehog and change dedupe test file path #1491

Merged
merged 1 commit into from Aug 14, 2019

Conversation

@Maffooch
Copy link
Contributor

commented Aug 13, 2019

Note: DefectDojo is now on Python3 and Django 2.2.1 Please submit your pull requests to the 'dev' branch as the 'legacy-python2.7' branch is only for bug fixes. Any new features submitted to the legacy branch will be ignored and closed.

Trufflehog fix motivated from #1482
Dedupe path change is for obvious reasons.

  • Your code is flake8 compliant
  • If this is a new feature and not a bug fix, you've included the proper documentation in the ReadTheDocs documentation folder. https://github.com/DefectDojo/Documentation/tree/master/docs or provide feature documentation in the PR.
  • Model changes must include the necessary migrations in the dojo/dd_migrations folder.
  • Add applicable tests to the unit tests.
@@ -69,7 +69,10 @@ def __init__(self, filename, test):

def parse_json(self, json_output):
try:
json_data = json.loads(json_output)
try:
json_data = json.loads(str(json_output, 'utf-8'))

This comment has been minimized.

Copy link
@ptrovatelli

ptrovatelli Aug 14, 2019

Contributor

Edit: just found out this fixes #1482. Making a few corrections after testing!

What will happen for those bytes that are not utf-8?
If you have a doubt you're really reading utf-8, I believe the clean way is :
json_output.decode('utf-8')
by defaut this will throw an exception if it's not utf-8, but you can also :
json_output.decode('utf-8', errors='replace')
which will replace unknown characters by �
Or
json_output.decode('utf-8', errors='ignore')
which will strip out unknown characters

From that point on you're working with a <type 'str'> (json_output was a <type 'bytes'>)
when you're finished working with it and want a type bytes (to pass to the database for ex) then you'll need to
encode('utf8')

NB: the loads function in version 3.5, loads doesn't support bytes yet you need to pass str type:
https://docs.python.org/3/library/json.html

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.
(...)
Changed in version 3.6: s can now be of type bytes or bytearray. The input encoding should be UTF-8, UTF-16 or UTF-32.

this should be a good start but not working; i'm trying something else...
json_data = json_output.decode('utf-8', errors='replace')

(no exception handling needed)

This comment has been minimized.

Copy link
@Maffooch

Maffooch Aug 14, 2019

Author Contributor

@ptrovatelli Good point on the chars that are not utf-8. I had not thought about that. The reason for the exception handling is that the files are not always bytes or strings. For example, I tried a fix that handled imports correctly, but would fail the unit tests. They used the same function, so I was totally stumped. Unfortunately, there's about 25 other parsers that suffered from this issue. Here is the rest on #1467. I had forgotten Trufflehog in the cluster!

This comment has been minimized.

Copy link
@ptrovatelli

ptrovatelli Aug 14, 2019

Contributor

OK. it's just that json.load takes utf-8 strings in input so it seems strange to give a string that failed str(..., utf-8). However after a few tests it doesn't seem as easy as I expected, i'll need to dig it a little to see what's what.

This comment has been minimized.

Copy link
@Maffooch

Maffooch Aug 14, 2019

Author Contributor

Believe me, I know. It nearly took me three times to get it working! I've seen the import through GUI/API reads files as bytes. I suspect it's because of the temp file upload. The unit tests open the file to read so it auto converts to string. Thats why I did it in that order.

@devGregA devGregA merged commit 2a2dfc5 into DefectDojo:dev Aug 14, 2019
4 checks passed
4 checks passed
AccessLint Review complete
continuous-integration/travis-ci/pr The Travis CI build passed
Details
security/snyk - components/package.json (aaronweaver (GitHub marketplace)) No manifest changes detected
security/snyk - requirements.txt (aaronweaver (GitHub marketplace)) No manifest changes detected
@Maffooch Maffooch deleted the Maffooch:dev branch Aug 16, 2019
@Maffooch Maffooch restored the Maffooch:dev branch Aug 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.