Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent invalid CPE field values #514

Merged
merged 7 commits into from Sep 24, 2021
Merged

Conversation

luhring
Copy link
Contributor

@luhring luhring commented Sep 23, 2021

This is the second half of the solution to anchore/grype#417 (the first half being anchore/grype#425).

This PR prevents CPEs from being generated where the product or vendor field contains URL-like strings (e.g. http://jcp_org/en/jsr/detail?id=173), or more generically, invalid string values.

This PR takes the approach of considering any field value that contains a colon (:) to be invalid, since colons are the delimiter of CPE fields, and this covers the case of URLs that contain substrings like http://.

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
@luhring luhring requested a review from a team September 23, 2021 16:51
@github-actions
Copy link

github-actions bot commented Sep 23, 2021

Benchmark Test Results

Benchmark results from the latest changes vs base branch
name                                                   old time/op    new time/op    delta
ImagePackageCatalogers/ruby-gemspec-cataloger-2          1.00ms ± 3%    0.99ms ± 2%     ~     (p=0.841 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2        1.43ms ± 3%    1.57ms ± 3%   +9.82%  (p=0.008 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2     520µs ± 8%     504µs ± 2%     ~     (p=0.222 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                 494µs ± 3%     530µs ± 2%   +7.37%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpmdb-cataloger-2                  512µs ± 2%     541µs ± 2%   +5.61%  (p=0.008 n=5+5)
ImagePackageCatalogers/java-cataloger-2                  10.5ms ± 2%    10.8ms ± 2%   +3.32%  (p=0.032 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                  735µs ± 4%     827µs ± 4%  +12.48%  (p=0.008 n=5+5)
ImagePackageCatalogers/go-cataloger-2                     264µs ± 1%     254µs ± 2%   -3.86%  (p=0.008 n=5+5)
ImagePackageCatalogers/rust-cataloger-2                   432µs ± 1%     439µs ± 2%     ~     (p=0.151 n=5+5)

name                                                   old alloc/op   new alloc/op   delta
ImagePackageCatalogers/ruby-gemspec-cataloger-2           134kB ± 0%     144kB ± 0%   +7.58%  (p=0.008 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2         643kB ± 0%     715kB ± 0%  +11.27%  (p=0.008 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2     116kB ± 0%     118kB ± 0%   +2.06%  (p=0.008 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                 127kB ± 0%     132kB ± 0%   +4.41%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpmdb-cataloger-2                  138kB ± 0%     140kB ± 0%   +1.39%  (p=0.008 n=5+5)
ImagePackageCatalogers/java-cataloger-2                  2.64MB ± 0%    2.69MB ± 0%   +1.95%  (p=0.008 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                 1.16MB ± 0%    1.17MB ± 0%   +1.12%  (p=0.008 n=5+5)
ImagePackageCatalogers/go-cataloger-2                    52.9kB ± 0%    54.9kB ± 0%   +3.78%  (p=0.008 n=5+5)
ImagePackageCatalogers/rust-cataloger-2                   110kB ± 0%     121kB ± 0%   +9.44%  (p=0.008 n=5+5)

name                                                   old allocs/op  new allocs/op  delta
ImagePackageCatalogers/ruby-gemspec-cataloger-2           2.04k ± 0%     2.33k ± 0%  +14.11%  (p=0.008 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2         6.14k ± 0%     8.11k ± 0%  +32.05%  (p=0.016 n=4+5)
ImagePackageCatalogers/javascript-package-cataloger-2     1.93k ± 0%     1.99k ± 0%   +3.32%  (p=0.008 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                 2.41k ± 0%     2.53k ± 0%   +5.23%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpmdb-cataloger-2                  3.20k ± 0%     3.24k ± 0%   +1.31%  (p=0.008 n=5+5)
ImagePackageCatalogers/java-cataloger-2                   34.8k ± 0%     36.1k ± 0%   +3.66%  (p=0.008 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                  1.92k ± 0%     2.27k ± 0%  +18.12%  (p=0.008 n=5+5)
ImagePackageCatalogers/go-cataloger-2                     1.42k ± 0%     1.46k ± 0%   +2.96%  (p=0.008 n=5+5)
ImagePackageCatalogers/rust-cataloger-2                   2.83k ± 0%     3.09k ± 0%   +9.11%  (p=0.008 n=5+5)

@@ -129,6 +133,8 @@ func candidateVendors(p pkg.Package) []string {
// generate sub-selections of each candidate based on separators (e.g. jenkins-ci -> [jenkins, jenkins-ci])
addAllSubSelections(vendors)

vendors.removeByCondition(invalidCPEValue)
Copy link
Contributor

@kzantow kzantow Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would all clearly work for the case where there's as : in the vendor... are there other fields that could contain colons or other characters that would result in these being invalid? Would a better approach be validating the CPE can be re-parsed later on? Would that be too slow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting! I'm up for that approach. If I understand correctly: before returning all generated CPEs, we'd test each one by serializing it and then deserializing it. Is that right?

@wagoodman Curious for your thoughts here, too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I just remembered we have benchmark tests that could show us a substantial change in performance!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempted in 1a51bd5 — let me know what you think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach much better... instead of covering the single case of : we're getting to the root problem of "don't allow CPEs that will never work" 👍

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Copy link
Contributor

@wagoodman wagoodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: naming... fieldCandidateCondition > filterFieldCandidateFn 👍
Also, adding the optional conditions to value() and list() is very nice 🚀

@luhring luhring merged commit dd23d49 into main Sep 24, 2021
@luhring luhring deleted the prevent-invalid-cpe-field-values branch September 24, 2021 13:23
spiffcs added a commit that referenced this pull request Sep 30, 2021
* main:
  Add vendor + product known good CPE field values (#517)
  Add SBOM to releases (#500)
  Add announcement for KubeCon meetup (#515)
  Prevent invalid CPE field values (#514)
  Filter out CPE product candidates that are asterisks (#513)
  Use Anchore fork of packageurl lib without replace directive (#512)
  update log file permissions to 0644 (#511)

Signed-off-by: Christopher Angelo Phillips <christopher.phillips@anchore.com>
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
* Fix CPE set comparison mismatch

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Add failing test to assert CPE generation excludes URLs

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Add removeByCondition method to fieldCandidateSet

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Prevent invalid CPE values for products and vendors

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Introduce removeWhere and rename filter to condition

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Refactor fieldCandidateSet and condition logic

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* Move CPE parsing filter to end of CPE generation

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants