Reporting feature #387

pablosnt · 2020-12-22T15:54:21Z

This feature generates reports that include the details of all detected secrets. It is run with the --report option of the audit command. The generated reports include the following fields for each detected secret:

class: class associated with the secret. It can be TRUE_POSITIVE, FALSE_POSITIVE or UNKNOWN.
filename: file where the secret was detected.
lines: list of lines where the secret was detected. It contains one entry for each occurrence of the secret in the file.
secret: secret value in plaintext.
types: list of secret types associated with the secret. Note that each secret can be detected by more than one plugin.
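
As a rough illustration of the fields above, one report entry could be modelled as follows. The `ReportEntry` name, the sample values, and the JSON layout are assumptions made for this sketch, not the actual output format of detect-secrets; in particular, the entries of `lines` are assumed here to be line numbers.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ReportEntry:
    """Hypothetical container mirroring the report fields listed above."""
    category: str      # reported as "class"; `class` is a Python keyword
    filename: str
    lines: List[int]   # assumption: 1-based line numbers
    secret: str
    types: List[str]

    def to_json(self) -> str:
        data = asdict(self)
        # Rename the field back to the name used in the report.
        data['class'] = data.pop('category')
        return json.dumps(data, sort_keys=True)


# Sample values are made up for illustration.
entry = ReportEntry(
    category='UNKNOWN',
    filename='config/settings.py',
    lines=[12, 48],
    secret='hunter2',
    types=['Secret Keyword', 'Basic Auth Credentials'],
)
print(entry.to_json())
```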

Moreover, this feature offers the following options:

--only-real: only includes secrets with class UNKNOWN or TRUE_POSITIVE in the report.
--only-false: only includes secrets with class FALSE_POSITIVE in the report.

Both flags are optional; if neither is given, the report includes secrets of all classes.
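
The filtering behaviour described above can be sketched like this. The `filter_entries` function and the dictionary layout are hypothetical, and rejecting both flags at once is an assumption of this sketch; the description does not say how the tool handles that case.

```python
from typing import Dict, Iterable, List

REAL_CLASSES = {'UNKNOWN', 'TRUE_POSITIVE'}
FALSE_CLASSES = {'FALSE_POSITIVE'}


def filter_entries(
    entries: Iterable[Dict[str, str]],
    only_real: bool = False,
    only_false: bool = False,
) -> List[Dict[str, str]]:
    """Keep only the entries whose class matches the selected filter."""
    if only_real and only_false:
        # Assumption: the two filters are treated as mutually exclusive.
        raise ValueError('choose at most one of only_real / only_false')
    if only_real:
        allowed = REAL_CLASSES
    elif only_false:
        allowed = FALSE_CLASSES
    else:
        # No filter selected: every class is reported.
        allowed = REAL_CLASSES | FALSE_CLASSES
    return [entry for entry in entries if entry['class'] in allowed]


entries = [
    {'class': 'TRUE_POSITIVE', 'secret': 'a'},
    {'class': 'FALSE_POSITIVE', 'secret': 'b'},
    {'class': 'UNKNOWN', 'secret': 'c'},
]
print([e['secret'] for e in filter_entries(entries, only_real=True)])  # ['a', 'c']
```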

Next, you can see an example of execution:

Thank you very much!
I hope that you like this feature

domanchi

Overall, an interesting concept. Thank you for adding the screenshot -- that makes the intent of this feature much clearer.

There are a lot of developer improvements that will be released in v1: most of my comments revolve around this fact. You should check them out (and the (much more thorough) docs/) and use them, rather than rewriting a lot of shared concepts.

Please also write tests if you intend to merge this. It also looks like linting may not have happened in this branch, so make sure you set up your development environment and run the configured pre-commit hooks.

Also, I am curious -- my assumption when writing this tool is that the secret value is rarely used multiple times within the same file. This is written as if that may not be the case. Have you encountered real secrets declared multiple times within the same file?

detect_secrets/audit/report.py

detect_secrets/core/usage/audit.py

detect_secrets/audit/report.py

pablosnt · 2020-12-23T10:05:54Z

Have you encountered real secrets declared multiple times within the same file?

Yes, this can happen in large projects. With the list of lines where a secret occurs, we can monitor how secrets are used across multiple systems and help developers remove every occurrence from the repository.

Thank you for all your comments! I'm working on them.

pablosnt · 2021-02-02T11:27:58Z

detect_secrets/audit/common.py

@@ -95,6 +95,43 @@ def get_raw_secret_from_file(
    raise SecretNotFoundOnSpecifiedLineError(secret.line_number)


+def get_all_secrets_from_file(


This method is very similar to get_raw_secret_from_file, with the difference that it searches for all occurrences of one secret in one file with one plugin. This makes it possible to offer a very complete report of all secrets found with detect-secrets. The method get_all_secrets_from_file has been added to the detect_secrets/audit/common.py file so that it can be reused by other features in the future.
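
To make the idea of collecting every occurrence concrete, here is a minimal sketch. It uses a plain substring match over a list of lines; the real function re-runs a plugin on each line, so `find_occurrences` and the sample data are illustrative assumptions only.

```python
from typing import Iterable, List


def find_occurrences(lines: Iterable[str], secret_value: str) -> List[int]:
    """Return the 1-based numbers of every line that contains secret_value."""
    return [
        line_number
        for line_number, line in enumerate(lines, start=1)
        if secret_value in line
    ]


sample = [
    'user = "admin"',
    'password = "hunter2"',
    '',
    'backup_password = "hunter2"',
]
print(find_occurrences(sample, 'hunter2'))  # [2, 4]
```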

pablosnt · 2021-02-02T11:29:34Z

detect_secrets/audit/report.py

+            continue
+        detections = get_all_secrets_from_file(secret)
+        identifier = hashlib.sha512((secret.secret_hash + filename).encode('utf-8')).hexdigest()
+        line_getter = line_getter_factory(filename)


The method generate_report uses the new LineGetter class to manage all the file lines.
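
The diff excerpt above derives an identifier from the secret hash and the filename. One plausible use of such a key, sketched here under that assumption, is to merge detections of the same secret in the same file that come from different plugins. `make_identifier`, `merge_detections`, and the tuple layout are hypothetical names for this example, not code from the pull request.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# (secret_hash, filename, secret_type, line_number)
Detection = Tuple[str, str, str, int]


def make_identifier(secret_hash: str, filename: str) -> str:
    """Same construction as in the diff: SHA-512 of hash plus filename."""
    return hashlib.sha512((secret_hash + filename).encode('utf-8')).hexdigest()


def merge_detections(detections: Iterable[Detection]) -> List[Dict[str, object]]:
    """Collapse detections that share an identifier into one entry."""
    merged: Dict[str, Dict[str, list]] = {}
    for secret_hash, filename, secret_type, line_number in detections:
        key = make_identifier(secret_hash, filename)
        entry = merged.setdefault(
            key, {'filename': [filename], 'types': [], 'lines': []},
        )
        if secret_type not in entry['types']:
            entry['types'].append(secret_type)
        if line_number not in entry['lines']:
            entry['lines'].append(line_number)
    # Unwrap the filename, which was stored in a list only for uniform typing.
    return [
        {'filename': e['filename'][0], 'types': e['types'], 'lines': e['lines']}
        for e in merged.values()
    ]


detections = [
    ('abc123', 'app.py', 'Secret Keyword', 3),
    ('abc123', 'app.py', 'Basic Auth Credentials', 3),
    ('abc123', 'other.py', 'Secret Keyword', 7),
]
report = merge_detections(detections)
print(len(report))
```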

pablosnt · 2021-02-02T11:31:12Z

docs/audit.md

@@ -140,3 +140,126 @@ There are times you want to extract the raw secret values to run further analysi
 so with the `--raw` flag.

 TODO: Example when this feature is written up.
+
+## Report generation


The audit documentation has been updated to include the new reporting feature information. It also includes some examples.

pablosnt · 2021-02-02T11:32:45Z

README.md

@@ -357,29 +357,34 @@ const secret = "hunter2";

 ```bash
 $ detect-secrets audit --help


The audit help output has been updated to include the new report generation options.

pablosnt · 2021-02-02T13:07:41Z

Hi @domanchi, I think that this new implementation of the reporting feature addresses your previous comments. With this implementation, the report feature is simpler and supports code reuse. I hope that you like it. Can you review it?

pablosnt · 2021-02-18T16:41:54Z

Hi @domanchi, do you think this implementation is good enough to be merged? Please let me know of any improvements you would like me to make. Thank you very much!

domanchi

This is looking a lot better!

Many of my comments are style-related; I think the core functionality for this report is sound. Tests look sane, with nice edge cases considered. I hope my code examples are illustrative and helpful, if Python isn't your primary development language.

Sorry it took me so long to get to this! We've been addressing other issues internally, but we're finally able to get back to this effort and bring this tool to a new shiny future.

detect_secrets/core/usage/audit.py

detect_secrets/audit/common.py

domanchi · 2021-02-24T21:04:10Z

detect_secrets/audit/common.py

@@ -95,6 +95,43 @@ def get_raw_secret_from_file(
    raise SecretNotFoundOnSpecifiedLineError(secret.line_number)


+def get_all_secrets_from_file(


Continuing the discussion from #387 (comment), in my side-by-side diff, I identify two distinct changes between this function and get_raw_secret_from_file:

This does not require a line_number to be associated with the secret. Instead, it looks like it enumerates the lines of the file to essentially find the secret again.

It returns multiple results, rather than just the first one.

Given this, it still makes sense to me to refactor these requirements into one function, for code reuse. For example:

    def get_raw_secret_line_from_file(secret, line_getter_factory) -> str:
        if not secret.line_number:
            raise NoLineNumberError

        for item in get_raw_secrets_from_file(secret, line_getter_factory):
            return item.secret_value


    def get_raw_secrets_from_file(secret, line_getter_factory) -> Iterator[PotentialSecret]:
        ...
        while True:
            if secret.line_number:
                # if we have a line number, let's use it.
                try:
                    lines_to_scan = [line_getter.lines[secret.line_number - 1]]
                    line_numbers = [secret.line_number]
                except IndexError:
                    raise SecretNotFoundOnSpecifiedLineError(secret.line_number)
            else:
                # otherwise, let's just scan the whole file
                lines_to_scan = line_getter.lines
                line_numbers = range(1, len(lines_to_scan) + 1)

            for line_number, line in zip(line_numbers, lines_to_scan):
                ...

detect_secrets/audit/common.py

tests/audit/report_test.py

pablosnt · 2021-02-25T19:10:38Z

Hi @domanchi, thank you very much for the comments and suggestions, I think that everything is done!

P.S. Congratulations on the new release of detect-secrets 💯

domanchi

Minor comments, but LGTM overall. Let's get this out in our next feature release (waiting for any other patch fixes that may need to be applied to 1.0.X lol)

domanchi · 2021-02-26T15:44:45Z

detect_secrets/audit/common.py

+        if secret.line_number:
+            try:
+                lines_to_scan = [line_getter.lines[secret.line_number - 1]]
+                line_numbers = [secret.line_number]


You might have an off-by-one error here.

detect_secrets/audit/report.py

pablosnt · 2021-02-26T16:52:16Z

detect_secrets/audit/common.py

+                plugin.analyze_line,
+                filename=secret.filename,
+                line=line,
+                line_number=line_number + 1,


    if secret.line_number:
        try:
            lines_to_scan = [line_getter.lines[secret.line_number - 1]]
            line_numbers = [secret.line_number]

I think that this code is correct; note that on line 98 the line_number value is increased by one.

Right, but secret.line_number is already 1-based. As such, when we obtain the target_line, we subtract 1 from it (to index into the line_getter.lines array).

So, if we add one to it again, I think we'll get an off-by-one error.

Okay, I get it. I have run the tests and I think that it's fine. Thank you!
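
The off-by-one concern in this thread can be reproduced with a small, self-contained sketch. The `lines` list stands in for line_getter.lines and `select_line` is a hypothetical helper; neither is code from the pull request.

```python
from typing import List, Tuple

lines = ['first', 'second', 'third']  # stand-in for line_getter.lines


def select_line(lines: List[str], line_number: int) -> List[Tuple[int, str]]:
    """Pair a 1-based line number with the matching line of a 0-based list."""
    lines_to_scan = [lines[line_number - 1]]  # convert to a 0-based index
    line_numbers = [line_number]              # keep the 1-based number as is
    return list(zip(line_numbers, lines_to_scan))


correct = select_line(lines, 2)
# Adding one again to an already 1-based number reports the wrong line.
shifted = [(number + 1, line) for number, line in correct]

print(correct)  # [(2, 'second')]
print(shifted)  # [(3, 'second')]
```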

domanchi · 2021-04-12T22:59:39Z

@pablosantiagolopez, can you run make test and fix the surfaced issues? Looks like there's some mypy issues, and detect-secrets findings too.

domanchi · 2021-04-13T14:26:36Z

detect_secrets/audit/common.py

@@ -44,7 +45,7 @@ def open_file(filename: str) -> 'LineGetter':
 def get_raw_secret_from_file(
    secret: PotentialSecret,
    line_getter_factory: Callable[[str], 'LineGetter'] = open_file,
-) -> str:
+) -> Any:


Uh, is there a reason why this is Any? str looks sound to me.

Hi @domanchi, we made this change because mypy recommended it. I'm going to check it right now. When I have something else, I will reply to this comment.

Oh, I think that I got it... We should change str to Optional[str]. I will test again, run the pre-commit hook and push this change. What do you think @domanchi?

Ohhh, I see the issue. It's because of https://github.com/Yelp/detect-secrets/blob/master/detect_secrets/core/potential_secret.py#L66, so mypy (rightly) expects this to return the same type.

Yeah, Optional[str] works for me.
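
A minimal sketch of the typing issue discussed here, assuming (as the thread implies) that the linked line declares the secret value as Optional[str]. `FakeSecret` and `first_secret_value` are hypothetical stand-ins, not the detect-secrets classes.

```python
from typing import Optional


class FakeSecret:
    """Hypothetical stand-in for PotentialSecret with only the relevant attribute."""

    def __init__(self, secret_value: Optional[str] = None) -> None:
        self.secret_value: Optional[str] = secret_value


def first_secret_value(secret: FakeSecret) -> Optional[str]:
    # Annotating this function as `-> str` would make mypy report an
    # incompatible return value, because the attribute may be None.
    # `-> Any` silences the error but gives up type checking for callers.
    return secret.secret_value


print(first_secret_value(FakeSecret('hunter2')))
print(first_secret_value(FakeSecret()))
```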

syn-4ck · 2021-04-13T15:18:35Z

Hi again @domanchi, with the last change (#387 (comment)) I think the mypy issues are fixed. Thanks for your feedback!

Prepare release of optional db2 feature

Reporting feature

adc835d

domanchi reviewed Dec 22, 2020

Reporting feature: first corrections

f895819

pablosnt requested a review from domanchi December 23, 2020 20:22

pablosnt added 7 commits January 8, 2021 14:29

Reporting feature: first test version

c9633de

Reporting feature optimization

46d0adb

Code correction

f2e2421

Reporting feature documentation

74614cb

Merge branch 'pre-v1-launch' into release/1.0.0

40a0630

Documentation corrections

0473257

Pre-commit errors fix

45ec641

pablosnt commented Feb 2, 2021

domanchi reviewed Feb 24, 2021

pablosnt added 3 commits February 25, 2021 14:59

Reporting feature optimization

8f45d25

Reporting test correction

efd9cda

Documentation upgrade

d1430e1

pablosnt changed the base branch from pre-v1-launch to master February 25, 2021 19:10

Merge branch 'master' into release/1.0.0

ac28287

pablosnt requested a review from domanchi February 25, 2021 19:15

domanchi approved these changes Feb 26, 2021

pablosnt commented Feb 26, 2021

pablosnt added 2 commits February 26, 2021 18:00

Corrections

c4e4a2c

Corrections

fcbee98

syn-4ck added 3 commits April 13, 2021 12:37

Merge branch 'version1' into release/1.0.0

57bffac

Merge branch 'release/1.0.0' of https://github.com/pablosantiagolopez…

8c10e4e

…/detect-secrets into release/1.0.0

Correct mypy issues

b4e9cc4

domanchi reviewed Apr 13, 2021

syn-4ck added 2 commits April 13, 2021 16:39

Reorder imports by precommit

14c964f

Improve mypy issue resolution

4001e8e

domanchi merged commit f4f7247 into Yelp:master Apr 13, 2021

jimmyhlee94 pushed a commit to jimmyhlee94/detect-secrets that referenced this pull request Aug 19, 2021

Bump version (Yelp#387)

a108ec7
