['Document cannot be parsed.'] - SPDX format file being used #28

Closed
RicardoAReyes opened this issue Dec 29, 2022 · 13 comments
Labels: bug (Something isn't working)

Comments

@RicardoAReyes

(ntia-conformance-checker) (base) ricardo@MB cli_tools % python checker.py
File name: /Users/ricardo/_git/spdx_sboms/us-demo-org-2_react.spdx
['Document cannot be parsed.']

Is there a way to get more detail on why the document cannot be parsed?

I'm referencing a standard SPDX-format file. I have passed the absolute path to the source file, and I have even moved the file into the cli_tools/ directory, where checker.py is located.

(ntia-conformance-checker) (base) ricardo@MC cli_tools % python checker.py
File name: us-demo-org-2_react.spdx
['Document cannot be parsed.']

Thanks.

@jspeed-meyers
Collaborator

@RicardoAReyes, thanks for the request.

It might indeed be possible to produce more details. I'll investigate.

It would help if I could use the actual SPDX SBOM document you are using. Are you willing and able to share it?

@RicardoAReyes
Author

RicardoAReyes commented Dec 29, 2022

Sure, more than happy to share one of the SPDX files, @jspeed-meyers. Our system generated an SPDX-format SBOM file for the React framework open source project.

react.spdx
https://drive.google.com/file/d/1qVTIbDVAXy9AJPg75rqoj7cRjreFSAU9/view?usp=share_link

@RicardoAReyes
Author

Here is a second example, the react_app.spdx file:

https://drive.google.com/file/d/11mirJpnD2PkWrQde0nhToOPr0YKRCz7S/view?usp=share_link

@jspeed-meyers
Collaborator

Excellent, thank you, @RicardoAReyes.

@jspeed-meyers
Collaborator

@RicardoAReyes, I pushed a branch that prints out more information. See PR #29.

When I run:

python3 ntia_conformance_checker/cli_tools/checker.py --file ~/Desktop/react.spdx

The error I now see is:

['Document cannot be parsed: FileType Not Supported/Users/johnspeedmeyers/Desktop/react.spdx']
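For readers wondering what "printing more information" can look like, here is a minimal sketch of the general pattern. It is not the code from PR #29: it only shows how a generic error can carry the underlying exception text. The names `parse_with_details` and `fake_parser` are hypothetical and stand in for whatever parsing call the checker actually makes.

```python
from typing import Callable, List, Optional, Tuple


def parse_with_details(
    path: str, parser: Callable[[str], object]
) -> Tuple[Optional[object], List[str]]:
    """Run `parser` on `path`; on failure, keep the exception text in the message."""
    try:
        return parser(path), []
    except Exception as exc:  # broad on purpose: any parser failure is reported
        return None, [f"Document cannot be parsed: {exc}"]


def fake_parser(path: str) -> object:
    """Hypothetical parser that rejects every file, for demonstration only."""
    raise ValueError(f"FileType Not Supported: {path}")


document, errors = parse_with_details("react.spdx", fake_parser)
print(errors)
```

Returning the messages as a list mirrors the list-style output shown in this thread, so the caller can print them unchanged.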

When I check filetype, I get:

file ~/Desktop/react.spdx
/Users/johnspeedmeyers/Desktop/react.spdx: ASCII text
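Note that `file` reporting "ASCII text" only says the file is plain text; it does not say which SPDX serialization it uses. As a rough heuristic sketch (not part of the checker or of tools-python), one can peek at the content for markers of common SPDX serializations: tag-value documents contain an `SPDXVersion:` line, JSON documents an `"spdxVersion"` key, and RDF/XML documents an `rdf:RDF` element. The function name `guess_spdx_format` and the sample text are hypothetical.

```python
def guess_spdx_format(text: str) -> str:
    """Heuristically guess the SPDX serialization from file content."""
    stripped = text.lstrip()
    if stripped.startswith("{") and '"spdxVersion"' in stripped:
        return "json"
    if "<rdf:RDF" in stripped:
        return "rdf-xml"
    # Tag-value files carry one "Tag: value" pair per line.
    for line in stripped.splitlines():
        if line.startswith("SPDXVersion:"):
            return "tag-value"
    return "unknown"


# Hypothetical sample content, not taken from the shared react.spdx file.
sample = "SPDXVersion: SPDX-2.0\nDataLicense: CC0-1.0\n"
print(guess_spdx_format(sample))
```

A content-based guess like this can help narrow down whether a "FileType Not Supported" style error comes from the file itself or from how the parser chooses a format.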

Hmm, I'm not an SPDX file type expert. I'm not able to quickly solve this. Let's call in reinforcements. @goneall or @linynjosh, any thoughts?

If nothing turns up, I can investigate more next week :)

@goneall
Member

goneall commented Dec 29, 2022

I just ran the SPDX Tools Online Validator on the file and it did parse without any exceptions. It did produce a few warnings, but I don't think they are related to this issue.

Below are the warnings:

The following warning(s) were raised: [Package at line 8345 invalid: Missing required package files for yocto-queue, Package at line 8345 invalid: Missing required package verification code for package yocto-queue, Missing required copyright text for yocto-queue in yocto-queue, Missing required license information from files for yocto-queue, Missing required package files for yocto-queue, Missing required package verification code for package yocto-queue]

@RicardoAReyes
Author

RicardoAReyes commented Dec 29, 2022

Thank you, @jspeed-meyers, for pushing that exception-handling fix, and @goneall, for validating the SPDX SBOM file with the online tool.

@jspeed-meyers, do you think checker.py needs further review to find out why it is not able to parse our files?

@jspeed-meyers
Collaborator

@RicardoAReyes, feel free to take a look. I'm not sure of the bug right now. It might be in the underlying python library doing the parsing. Feel free to investigate or submit a PR.

I will take a look early next week :)

@goneall
Member

goneall commented Dec 29, 2022

I looked at the dependencies and this library is using a fork of the tools-python library.

There have recently been a lot of improvements in the library that are not reflected in the fork.

I'm thinking we should migrate to the most recent release of the tools-python code. It looks like there are 3 commits we'll want to move over.

@jspeed-meyers
Collaborator

@goneall, ahh, good find. Let's do that migration and then come back to this issue. I suspect they're related.

For "moving over" these commits, do you mean submitting a PR to upstream of tools-python with those 3 additional commits?

@goneall
Member

goneall commented Dec 30, 2022

For "moving over" these commits, do you mean submitting a PR to upstream of tools-python with those 3 additional commits?

Yes, that's what I was thinking. I recall @linynjosh was going to make a PR to the Python libraries, but I'm not sure whether it was already done.

@licquia, let me know if you recall whether the upstream changes were made.

@jspeed-meyers
Collaborator

@RicardoAReyes, I believe this bug is fixed by the now-merged PR #31.

When I now run the tool:

python3 ntia_conformance_checker/cli_tools/checker.py --file react.spdx

I get this, which I have abridged to make it easier to read:

['us-demo-org-2/react: Document  version SPDX-2.0 not supported.', 'us-demo-org-2/react: Document has no author.', 'us-demo-org-2/react: abab has no supplier.', 'us-demo-org-2/react: abab has no supplier.',...

To get a machine-readable JSON output, you can use this command:

python3 ntia_conformance_checker/cli_tools/checker.py --file ../../../react.spdx --output json

You will get this output, which I have abridged to make it easier to read:

{
  "componentVersions": {
    "nonconformantComponents": [],
    "allProvided": true
  },
  "componentIdentifiers": {
    "nonconformantComponents": [],
    "allProvided": true
  },
  "componentSuppliers": {
    "nonconformantComponents": [
      "us-demo-org-2/react:",
     ...
    ],
    "allProvided": false
  },
  "componentNames": {
    "numNonconformantComponents": 0,
    "allProvided": true
  },
  "authorNameProvided": false,
  "timestampProvided": true,
  "dependencyRelationshipsProvided": true,
  "isNtiaConformant": false,
  "sbomName": "us-demo-org-2/react"
}
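A JSON report like the one above is straightforward to post-process. The sketch below uses only keys visible in the abridged output; the helper name `failed_checks` is hypothetical, and the real report may contain fields not shown here.

```python
import json
from typing import List

# Abridged report, built from keys shown in this thread.
report_text = """
{
  "componentSuppliers": {"nonconformantComponents": ["us-demo-org-2/react:"], "allProvided": false},
  "componentNames": {"numNonconformantComponents": 0, "allProvided": true},
  "authorNameProvided": false,
  "timestampProvided": true,
  "dependencyRelationshipsProvided": true,
  "isNtiaConformant": false,
  "sbomName": "us-demo-org-2/react"
}
"""


def failed_checks(report: dict) -> List[str]:
    """Return the names of top-level checks that did not pass."""
    failed = []
    for key, value in report.items():
        if key == "isNtiaConformant":
            continue  # overall verdict, not an individual check
        if isinstance(value, dict) and value.get("allProvided") is False:
            failed.append(key)
        elif value is False:
            failed.append(key)
    return failed


report = json.loads(report_text)
print(report["sbomName"], "conformant:", report["isNtiaConformant"])
print("failed checks:", failed_checks(report))
```

Treating the overall `isNtiaConformant` flag separately keeps the list focused on the individual elements that need attention.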

PTAL and let me know if this output looks right. Further bug reports are welcome, though please open separate issues (assuming the issue is different than the ['Document cannot be parsed.'] issue). Thank you again for bringing this bug to our attention!

I'll close this issue in a week. If you can't get to it by then and there is still an outstanding problem, feel free to re-open it.

jspeed-meyers self-assigned this on Jan 2, 2023
jspeed-meyers added the bug (Something isn't working) label on Jan 2, 2023
@RicardoAReyes
Author

Thank you, @jspeed-meyers and @goneall. I appreciate everyone's contribution to addressing the issue so rapidly. I will continue to test and will submit new issues should I run into more problems.


3 participants