This repository has been archived by the owner on Jan 27, 2023. It is now read-only.

Unicode parsing failure #910

Closed
tonkapango opened this issue Feb 12, 2021 · 3 comments · Fixed by #923

Comments

@tonkapango

Is this a request for help?:
yes

Is this a BUG REPORT or a FEATURE REQUEST? (choose one):
BUG

Version of Anchore Engine and Anchore CLI if applicable:

anchore-cli, version 0.8.1
engine: v0.9.1
What happened:
Parsing fails on a Unicode field when Syft tries to catalog Python packages.

What did you expect to happen:
The file should parse correctly and the scan should complete without this uncaught failure.

Any relevant log output from /var/log/anchore:
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'ruby-gemspec-cataloger' discovered '0' packages","time":"2021-02-11 22:11:03"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"encountered package.json file without a name and/or version field, ignoring this file","time":"2021-02-11 22:11:05"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'javascript-package-cataloger' discovered '11' packages","time":"2021-02-11 22:11:05"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'dpkgdb-cataloger' discovered '332' packages","time":"2021-02-11 22:11:06"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'rpmdb-cataloger' discovered '0' packages","time":"2021-02-11 22:11:06"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'java-cataloger' discovered '2' packages","time":"2021-02-11 22:11:09"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'apkdb-cataloger' discovered '0' packages","time":"2021-02-11 22:11:09"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"debug","msg":"cataloger 'go-cataloger' discovered '0' packages","time":"2021-02-11 22:11:10"}
2021-02-11 22:11:11+0000 [-] [Thread-764] [anchore_engine.utils/run_check()] [ERROR] {"level":"error","msg":"failed to catalog input: 1 error occurred:\n\t* unable to catalog python package=/opt/mxnet/python/mxnet.egg-info/PKG-INFO: cannot parse field from line: '\u003c!--- or more contributor license agreements. See the NOTICE file --\u003e'\n\n","time":"2021-02-11 22:11:11"}
2021-02-11 22:11:13+0000 [-] Traceback (most recent call last):
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 1100, in analyze_image
2021-02-11 22:11:13+0000 [-] analyzer_report = run_anchore_analyzers(
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 903, in run_anchore_analyzers
2021-02-11 22:11:13+0000 [-] analyzer_report = analyzer_manager.run(
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/analyzers/manager.py", line 18, in run
2021-02-11 22:11:13+0000 [-] _run_syft(analyzer_report, copydir)
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/analyzers/manager.py", line 91, in _run_syft
2021-02-11 22:11:13+0000 [-] results = syft.catalog_image(imagedir=copydir)
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/analyzers/syft/__init__.py", line 16, in catalog_image
2021-02-11 22:11:13+0000 [-] all_results = run_syft(imagedir)
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/syft_wrapper.py", line 20, in run_syft
2021-02-11 22:11:13+0000 [-] stdout, _ = run_check(shlex.split(cmd), env=proc_env)
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/utils.py", line 277, in run_check
2021-02-11 22:11:13+0000 [-] raise CommandException(cmd, code, stdout, stderr)
2021-02-11 22:11:13+0000 [-] anchore_engine.utils.CommandException: Non-zero exit status code when running subprocess: cmd=syft -vv -o json oci-dir:/analysis_scratch/acbdc3f6-29e6-4756-be55-4e6752cf0ccf/raw, rc=1
2021-02-11 22:11:13+0000 [-]
2021-02-11 22:11:13+0000 [-] During handling of the above exception, another exception occurred:
2021-02-11 22:11:13+0000 [-]
2021-02-11 22:11:13+0000 [-] Traceback (most recent call last):
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/services/analyzer/analysis.py", line 333, in process_analyzer_job
2021-02-11 22:11:13+0000 [-] image_data = perform_analyze(
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/services/analyzer/analysis.py", line 193, in perform_analyze
2021-02-11 22:11:13+0000 [-] analyzed_image_report, manifest_raw = localanchore_standalone.analyze_image(
2021-02-11 22:11:13+0000 [-] File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 1125, in analyze_image
2021-02-11 22:11:13+0000 [-] raise AnalysisError(
2021-02-11 22:11:13+0000 [-] anchore_engine.clients.localanchore_standalone.AnalysisError: failed to download, unpack, analyze, and generate image export (nvcr.io/nvidian/prodsec/sridevi@sha256:104a0b3f3cc2f7f315b60390685450a0355dd6a76f43ea3ad8d046183658cc57) - exception: Non-zero exit status code when running subprocess: cmd=syft -vv -o json oci-dir:/analysis_scratch/acbdc3f6-29e6-4756-be55-4e6752cf0ccf/raw, rc=1
2021-02-11 22:11:13+0000 [-] [Thread-764] [anchore_engine.services.analyzer.analysis/process_analyzer_job()] [ERROR] problem analyzing image - exception: failed to download, unpack, analyze, and generate image export (nvcr.io/nvidian/prodsec/sridevi@sha256:104a0b3f3cc2f7f315b60390685450a0355dd6a76f43ea3ad8d046183658cc57) - exception: Non-zero exit status code when running subprocess: cmd=syft -vv -o json oci-dir:/analysis_scratch/acbdc3f6-29e6-4756-be55-4e6752cf0ccf/raw, rc=1
2021-02-11 22:11:17+0000 [-] [Thread-8] [anchore_engine.services.analyzer.service/handle_image_analyzer()] [INFO] worker thread completed
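
For readers following the traceback: the first exception comes from a subprocess wrapper that raises when `syft` exits with a non-zero status. The following is a minimal sketch of that pattern, not the actual `anchore_engine.utils` code. The names `run_check` and `CommandException` and the argument order `(cmd, code, stdout, stderr)` are taken from the traceback; everything else (the use of `subprocess.run`, the stand-in child process) is an assumption for illustration.

```python
import subprocess
import sys


class CommandException(Exception):
    """Illustrative stand-in: raised when a subprocess exits non-zero."""

    def __init__(self, cmd, code, stdout, stderr):
        super().__init__(
            f"Non-zero exit status code when running subprocess: cmd={cmd}, rc={code}"
        )
        self.cmd = cmd
        self.code = code
        self.stdout = stdout
        self.stderr = stderr


def run_check(cmd, env=None):
    """Run cmd (a list of arguments); return (stdout, stderr) or raise."""
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
    if proc.returncode != 0:
        # The caller sees only the exception; stdout/stderr travel with it.
        raise CommandException(" ".join(cmd), proc.returncode, proc.stdout, proc.stderr)
    return proc.stdout, proc.stderr


# Hypothetical stand-in for the failing `syft` call: a child that exits with rc=1.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
try:
    run_check(failing)
except CommandException as exc:
    print(exc.code, exc.stderr)
```

With a wrapper like this, any error inside the child process (here, one unparseable metadata file) surfaces as a failure of the whole analysis, which matches the `AnalysisError` that follows in the log.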

What docker images are you using:
anchore/anchore-engine v0.9.1 aec8da91351f 8 days ago 599MB

How to reproduce the issue:
anchore-cli image add basimentos/poc:anchore_unicode_bug

I added a slimmed-down image that reproduces the problem. The file can be found under /opt/python/egg-info.

Anything else we need to know:
thanks!!!!!!!!

@zhill
Member

zhill commented Feb 16, 2021

Hi @tonkapango, thanks for reporting this. We're looking into it. This looks like an issue within Syft itself. If that proves to be the case, we'll also open an issue in that repo to track the fix there. The fix will then be included in Engine once it has been released in Syft.

To reproduce it:

❯ syft docker:basimentos/poc:anchore_unicode_bug
 ✔ Pulled image
 ✔ Loaded image
 ✔ Parsed image
 ⠋ Cataloging image     [packages 0]

[0025] ERROR failed to catalog input: 1 error occurred:
	* unable to catalog python package=/opt/python/egg-info/PKG-INFO: cannot parse field from line: '<!--- or more contributor license agreements.  See the NOTICE file -->'
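
A side note on the "Unicode" in the title: the `\u003c` and `\u003e` sequences in the engine log are JSON escapes for `<` and `>`, so the offending line is the same plain-ASCII HTML comment that Syft prints directly here. The sketch below decodes a shortened copy of the logged record to show this. The record is abbreviated for readability, and the variable names are illustrative only.

```python
import json

# Shortened copy of the error record from the engine log; the "msg" text is
# abbreviated here for readability.
raw_record = (
    r'{"level":"error","msg":"cannot parse field from line: '
    r"'\u003c!--- or more contributor license agreements. See the NOTICE file --\u003e'"
    r'","time":"2021-02-11 22:11:11"}'
)

record = json.loads(raw_record)

# Pull out the quoted line that the parser rejected.
offending_line = record["msg"].split("line: ", 1)[1].strip("'")
print(offending_line)
print(offending_line.isascii())
```

This suggests the failure is about the shape of the line (it is not in `Key: value` form) rather than about non-ASCII characters.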

@zhill
Member

zhill commented Feb 17, 2021

The fix will be delivered by updating the Syft dependency version.

zhill added a commit to zhill/anchore-engine that referenced this issue Feb 26, 2021
…ue. Fixes anchore#910

Signed-off-by: Zach Hill <zach@anchore.com>
zhill added a commit that referenced this issue Feb 28, 2021
Fixes #910 by updating to syft 0.12.7
@zhill zhill closed this as completed Mar 4, 2021
@zhill zhill linked a pull request Mar 4, 2021 that will close this issue
dakaneye pushed a commit that referenced this issue Mar 10, 2021
…ue. Fixes #910

Signed-off-by: Zach Hill <zach@anchore.com>
Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>
robertp pushed a commit that referenced this issue Mar 11, 2021
* Improve the message and description for vulnerability_data_unavailable and stale_feed_data triggers in the vulnerabilities gate. Fixes #879

Signed-off-by: Zach Hill <zach@anchore.com>

* Bump version numbers for 0.9.1

Signed-off-by: Robert Prince <robert.prince@anchore.com>

* Multiple policy bundle dirs (#862)

* Allow for localconfig to read policy bundles from multiple dirs.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Expect fully-qualifed policy bundle dirs.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Reload policy bundle from file whenever a new bundle dir is added.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Linting

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Linting, again.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Linting commas

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Fix test.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Code review comments, add some extra logging and another test.

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Linting

Signed-off-by: Daniel Palmer <dan.palmer@anchore.com>

* Fix method name to match parent class

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Removed ui from swagger url

Signed-off-by: Zane Burstein <zane.burstein@anchore.com>

* Add ability to support multiple grant types for the oauth client

Signed-off-by: Zach Hill <zach@anchore.com>

* Update Dockerfile to use UBI 8.3. Fixes #888

Signed-off-by: Zach Hill <zach@anchore.com>

* Update CHANGELOG.md for 0.9.1

Signed-off-by: Zach Hill <zach@anchore.com>

* Fix confusing typo in changelog

Signed-off-by: Zach Hill <zach@anchore.com>

* Update syft to v0.12.5

Signed-off-by: Dan Luhring <dan.luhring@anchore.com>

* add bundles/ dir to anchore_service_dir

Signed-off-by: Brady Todhunter <bradyt@anchore.com>

* Updates to vulnerability listing dedup logic

- prioritize vulnerabilities from other namespaces over nvd out vulnerabilities
- filter duplicates

Fixes #893

Signed-off-by: Swathi Gangisetty <swathi@anchore.com>

* Set the python package location according to the package key, which is the absolute path (#895)

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Update the scanner config method in policy engine for providing overr… (#896)

* Update the scanner config method in policy engine for providing overridable functions for vuln and cpe results. Adds use of that in the vulnerability policy gate

Signed-off-by: Zach Hill <zach@anchore.com>

* first draft at a dedup pass

Signed-off-by: Swathi Gangisetty <swathi@anchore.com>

* Try to load Policy Engine ImageCpes from syft generated cpes, with fallback to fuzzy matching

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Add unit test for loader paths

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* include update and meta into cpe comparison

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* fix return type

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Unit tests for cpe comparisons used for vulnerability dedup

Signed-off-by: Swathi Gangisetty <swathi@anchore.com>

* Downgrade empty content log message to debug level

Signed-off-by: Swathi Gangisetty <swathi@anchore.com>

* tests: change previous invalid schema for integers/floats

According to draft-6 which the new jsonschema supports 1.0 is considered
an integer. Relevant doc from the draft:

In draft-04, "integer" is listed as a primitive type and defined as “a JSON number without a fraction or exponent part”; in draft-06, "integer" is not considered a primitive type and is only defined in the section for keyword "type" as “any number with a zero fractional part”; 1.0 is thus not a valid "integer" type in draft-04 and earlier, but is a valid "integer" type in draft-06 and later; note that both drafts say that integers SHOULD be encoded in JSON without fractional parts

Link https://json-schema.org/draft-06/json-schema-release-notes.html

Signed-off-by: Alfredo Deza <adeza@anchore.com>
(cherry picked from commit 2f859a1)

* requirements: bump jsonschema to avoid legacy validator import issues

Signed-off-by: Alfredo Deza <adeza@anchore.com>
(cherry picked from commit 99dcb10)

* Update syft to 0.12.7 to fix analysis failure due to syft parsing issue. Fixes #910

Signed-off-by: Zach Hill <zach@anchore.com>

* Update cryptography lib to 3.3.2 from 3.3.1. Fixes #909

Signed-off-by: Zach Hill <zach@anchore.com>

* add package filtering by relationships

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>

* Fix the client metadata merge process during oauth init. Fixes #931

Signed-off-by: Zach Hill <zach@anchore.com>

* Bump version

Signed-off-by: Robert Prince <robert.prince@anchore.com>

* Add default admin pw to e2e test values file

Signed-off-by: Robert Prince <robert.prince@anchore.com>

* Make sure to return content correctly for manifest and dockerfile content types

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* [docs] 0.9.2 release notes and changelog updates. includes missing release notes for 0.9.1 (#939)

Updates CHANGELOG for 0.9.2 and adds 0.9.1 and 0.9.2 release notes 

Also fixes ordering problem in release notes page

Signed-off-by: Zach Hill <zach@anchore.com>

* Update Quickstart Docker-Compose image tag to v0.9.2

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Iterate API patch version 0.1.16->0.1.17

Signed-off-by: Samuel Dacanay <sam.dacanay@anchore.com>

* Add distro mapping from "redhat" to "rhel" for vuln matching

Signed-off-by: Zach Hill <zach@anchore.com>

* Adds distro mapper in import path to ensure rhel instead of redhat distro name

Signed-off-by: Zach Hill <zach@anchore.com>

* Fix integration tests that used redhat as a negative test example

Signed-off-by: Zach Hill <zach@anchore.com>

Co-authored-by: Zach Hill <zach@anchore.com>
Co-authored-by: Dan Palmer <dan.palmer@anchore.com>
Co-authored-by: Samuel Dacanay <sam.dacanay@anchore.com>
Co-authored-by: Zane Burstein <zane.burstein@anchore.com>
Co-authored-by: Dan Luhring <dan.luhring@anchore.com>
Co-authored-by: Brady Todhunter <bradyt@anchore.com>
Co-authored-by: Swathi Gangisetty <swathi@anchore.com>
Co-authored-by: Alfredo Deza <adeza@anchore.com>
Co-authored-by: Alex Goodman <alex.goodman@anchore.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
@dland

dland commented Mar 15, 2021

A report from the field:

I was having a variant of the same problem: Python analysis was aborting on a multi-line string. I upgraded from 0.9.0 to 0.9.2 and the problem went away. Thanks!
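
To illustrate the multi-line case: a strict line-by-line parser that expects every line of PKG-INFO to look like `Key: value` will reject the second line of a multi-line value. The following is a rough sketch of a more tolerant approach that treats such lines as a continuation of the previous field. It is not how Syft implements its fix (this thread does not show that), and `parse_pkg_info`, `FIELD_RE`, and the sample text are hypothetical. It also simplifies: repeated fields overwrite each other, and a continuation line that happens to look like `Word: text` would be misread as a new field.

```python
import re

# Matches a header-style line such as "Name: example".
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s?(.*)$")


def parse_pkg_info(text):
    """Parse Key: value metadata, folding non-field lines into the last field."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None:
            # Not a "Key: value" line: treat it as more text for the last field
            # instead of raising a parse error.
            fields[current] += "\n" + line.strip()
    return fields


# Hypothetical sample; only the second Description line comes from the report.
SAMPLE = (
    "Metadata-Version: 2.1\n"
    "Name: example\n"
    "Version: 1.0\n"
    "Description: <!--- first line of a license header (made up)\n"
    "<!--- or more contributor license agreements.  See the NOTICE file -->\n"
)

parsed = parse_pkg_info(SAMPLE)
print(parsed["Name"], parsed["Version"])
print(parsed["Description"])
```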
