New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Input Engine with triage functionality #777
Conversation
Codecov Report
@@ Coverage Diff @@
## master #777 +/- ##
==========================================
- Coverage 88.32% 87.38% -0.94%
==========================================
Files 139 140 +1
Lines 2483 2474 -9
Branches 301 297 -4
==========================================
- Hits 2193 2162 -31
- Misses 226 243 +17
- Partials 64 69 +5
Continue to review full report at Codecov.
|
| from .version import VERSION | ||
| from .version_scanner import VersionScanner | ||
|
|
||
| ERROR_CODES = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom error codes
| def excepthook(exctype, exc, trace): | ||
| oldHook(exctype, exc, trace) | ||
| if ERROR_CODES.get(exctype): | ||
| sys.exit(ERROR_CODES[exctype]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
custom except hook to also throw exit code while exception raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really lovely, thanks. One of our users (I think someone running this on a Yocto based system?) was saying that they use the return codes to colourize their CI dashboard, so this may be very helpful to others as well.
cve_bin_tool/cli.py
Outdated
| parser.add_argument( | ||
| "-i", | ||
| "--input-file", | ||
| action="store", | ||
| default=None, | ||
| help="Provide input filename", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New option to specify input file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it be the actual path to the file? I guess it should be file-path
|
|
||
| LOGGER.info("") | ||
| LOGGER.info("Overall CVE summary: ") | ||
| LOGGER.info( | ||
| f"There are {scanner.files_with_cve} files with known CVEs detected" | ||
| f"There are {cve_scanner.products_with_cve} products with known CVEs detected" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a better name because files_with_cves won't make much sense while using it as csv2cve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thinking, though I wonder if products will be confusing for people scanning directories, as they may think of a "product" as a .deb or somesuch and we have a different definition. it's tempting to just say "things" instead of files or products, but I guess that doesn't sound quite as formal.
| else: | ||
| csv2cve.update_logLevel(log_level=args.log_level) | ||
| csv2cve.generate_output(args.output_file, args.format) | ||
| argv = argv or sys.argv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dummy csv2cve file
| class InvalidCsvError(Exception): | ||
| """ Given File is an Invalid CSV """ | ||
|
|
||
|
|
||
| class MissingFieldsError(Exception): | ||
| """ Missing needed fields """ | ||
|
|
||
|
|
||
| class InvalidJsonError(Exception): | ||
| """ Given File is an Invalid JSON """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
custom exceptions to raise exceptions
| class Remarks(OrderedEnum): | ||
| NewFound = 1, "1", "NewFound", "n", "N" | ||
| Unexplored = 2, "2", "Unexplored", "u", "U", "" | ||
| Mitigated = 4, "4", "Mitigated", "m", "M" | ||
| Confirmed = 3, "3", "Confirmed", "c", "C" | ||
| Ignored = 5, "5", "Ignored", "i", "I" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
custom enum class that can handle multiple aliases easily and can also be sorted.
| @@ -109,3 +110,25 @@ def pattern_match(text, patterns): | |||
| if fnmatch.fnmatch(text, pattern): | |||
| return True | |||
| return False | |||
|
|
|||
|
|
|||
| class OrderedEnum(Enum): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to provide comparison methods that allows sorting of list of Remarks.
| @@ -118,6 +118,7 @@ jobs: | |||
| ACTIONS: 1 | |||
| LONG_TESTS: 0 | |||
| NO_EXIT_CVE_NUM: 1 | |||
| PYTHONIOENCODING: 'utf8' | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixes problem with Windows CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YAY! Thanks for this
| test/test_cli.py | ||
| test/test_extractor.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed because of https://bugs.python.org/issue35621.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got a bunch of nitpicky comments about error codes and better PR splits, but given that @SinghHrmn I think is waitnig for this PR to be merged, I'm going to merge and file issues for my feedback rather than request changes.
@Niraj-Kamdar one big thing though is that you really need to get into the habit of grouping the changes in PRs by function, so the checker changes here should have been in a separate branch and a separate PR. Much of the minor refactoring stuff should also have been split out. We've let you get into bad habits because we had so much refactoring that needed doing that I was expecting to see giant PRs, and that's on me and @pdxjohnny as mentors, but it's something I think is worth working on going forwards.
Here's a couple of helpful articles on how to split up PRs that I think you might find helpful: https://unhashable.com/stacked-pull-requests-keeping-github-diffs-small/ https://www.thedroidsonroids.com/blog/splitting-pull-request
| CONTAINS_PATTERNS = [r"http://bugzilla.gnome.org/enter_bug.cgi?product=GStreamer"] | ||
| CONTAINS_PATTERNS = [ | ||
| r"http://bugzilla.gnome.org/enter_bug.cgi?product=GStreamer", | ||
| r"libgstreamer-((\d+\.)+\d+)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be altering the checker default so it scans for VERSION_PATTERNS as well as CONTAINS_PATTERNS by default? It seems silly to duplicate lines in each checker if we probably actually want version patterns to also be a valid detection string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that can be helpful.
| def excepthook(exctype, exc, trace): | ||
| oldHook(exctype, exc, trace) | ||
| if ERROR_CODES.get(exctype): | ||
| sys.exit(ERROR_CODES[exctype]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really lovely, thanks. One of our users (I think someone running this on a Yocto based system?) was saying that they use the return codes to colourize their CI dashboard, so this may be very helpful to others as well.
| @@ -37,8 +60,7 @@ def __call__(self, parser, namespace, value, option_string=None): | |||
|
|
|||
| def main(argv=None): | |||
| """ Scan a binary file for certain open source libraries that may have CVEs """ | |||
| if argv is None: | |||
| argv = sys.argv | |||
| argv = argv or sys.argv | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of prefer the clarity of the is None check in some cases (because we get so many new pythonistas looking at this project) but this isn't a thing I expect a lot of new contributors to look at so I think streamlining this is the right choice.
| default=None, | ||
| help="provide output filename (default: output to stdout)", | ||
| default="", | ||
| help="provide output filename", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the default is fine, but don't change the help message here: that's to clarify for users because the default messages from argparser didn't make the behaviour clear.
| help="provide output filename", | |
| help="provide output filename (default: output to stdout)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add that.
| "-x", | ||
| "--extract", | ||
| action="store_true", | ||
| default=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been wondering about this: should we change the default to auto-extract? I've opened an issue to discuss, since that's out of scope for this PR. #786
|
|
||
| LOGGER.info("") | ||
| LOGGER.info("Overall CVE summary: ") | ||
| LOGGER.info( | ||
| f"There are {scanner.files_with_cve} files with known CVEs detected" | ||
| f"There are {cve_scanner.products_with_cve} products with known CVEs detected" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thinking, though I wonder if products will be confusing for people scanning directories, as they may think of a "product" as a .deb or somesuch and we have a different definition. it's tempting to just say "things" instead of files or products, but I guess that doesn't sound quite as formal.
| argv = argv or sys.argv | ||
| if len(argv) < 2: | ||
| LOGGER.error("Invalid Usage: csv file required") | ||
| return -2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're importing cli, we might as well use the same error code list here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change that
| csv2cve.update_logLevel(log_level=args.log_level) | ||
| csv2cve.generate_output(args.output_file, args.format) | ||
| LOGGER.error("Invalid Usage: csv file required") | ||
| return -4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another error code spot.
|
|
||
| @staticmethod | ||
| def openssl_convert(version): | ||
| def openssl_convert(version: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yay type hints!
| @@ -4,7 +4,7 @@ | |||
| "version": "4.1.3", | |||
| "version_strings": [ | |||
| "%s version 4.1.3", | |||
| "Codec %s is not recognized by FFmpeg.", | |||
| "Codec '%s' is not recognized by FFmpeg.", | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In future, can you put unrelated changes to checkers in a separate PR? They could all be grouped together, but they shouldn't be grouped with the InputEngine changes. It's considered bad form by most open source projects to group them the way you have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have put it here because I have to make VersionScanner generator to use InputEngine parsed data as supplement and it was breaking the test_scanner because I have also included test for contains string previously but I have forget to clean up scanner and test was only passing due to previous scan has detected it. So, that's why when I converted that in generator this bug come to surface. I should have divide this PR in two parts: 1) Making version_scanner generator and 2) building input_engine. Thanks, for the feedback. I will try to keep PR size smaller.
New features:
cve-bin-tool -i=test.csvcve-bin-tool -i=test.csv /path/to/scan