Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changes necessary to validate against SPDX jsonschema #197

Closed
wants to merge 2 commits into from

Conversation

dholth
Copy link
Contributor

@dholth dholth commented Nov 12, 2021

These changes were necessary to validate against the SPDX jsonschema https://github.com/spdx/spdx-spec/blob/development/v2.2.2/schemas/spdx-schema.json and to include multiple checksums.

This is a work in progress. No attention has been given to parsing or non-json/yaml formats.

Let me know if this approach is acceptable for the project.

Fixes #184

@pombredanne
Copy link
Member

Thank you ++ This works. ... Great and honored to see you drop by!

@dholth
Copy link
Contributor Author

dholth commented Nov 15, 2021

@pombredanne OK to add Python 3 type annotations? OK to format with black?

@dholth
Copy link
Contributor Author

dholth commented Nov 16, 2021

@pombredanne I think this could be merged. It shoud be able to read/write json that is valid according to the spdx jsonschema, and it should be able to write multiple checksums.

Still open is to read multiple checksums, but that further complicates the code. It would also be nice to add a test that validates the output against the spdx spec using the jsonschema Python package.

@pombredanne
Copy link
Member

@pombredanne OK to add Python 3 type annotations? OK to format with black?

I am not too keen but OK wrt. Python 3 type annotations and yes black is fine, but it would be best in a completely separate PR as this would create a "wall" in the history

@pombredanne
Copy link
Member

I think this could be merged. It shoud be able to read/write json that is valid according to the spdx jsonschema, and it should be able to write multiple checksums.

Still open is to read multiple checksums, but that further complicates the code. It would also be nice to add a test that validates the output against the spdx spec using the jsonschema Python package.

Thanks! I will need a few days before I can review.

@pombredanne
Copy link
Member

I am not sure why the macOS tests fail... but Circle CI is a bit flaky at times

@RodneyRichardson
Copy link
Contributor

macos tests are failing due to using a now-unsupported version of xcode: See #202 for a fix

@nettrino
Copy link

@pombredanne let us know if you require any updates to this - it would be useful to get it merged! Thanks

@bjamesv bjamesv mentioned this pull request Aug 5, 2022
4 tasks
Signed-off-by: Daniel Holth <dholth@anaconda.com>
Signed-off-by: Daniel Holth <dholth@anaconda.com>
@goneall
Copy link
Member

goneall commented Aug 5, 2022

@dholth can you rebase this PR to the latest in the main branch? This should fix the CircleCI errors.

@nicoweidner
Copy link
Collaborator

CircleCI should be fixed once #226 is merged and this PR is rebased on top of the changes.

@nicoweidner
Copy link
Collaborator

@dholth Will you have time to rebase this PR on main? I'd also be happy to do so if you are short on time.

Copy link
Collaborator

@nicoweidner nicoweidner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think adding full package objects to documentDescribes is a good idea, since it contradicts the spec

package["sha1"] = checksum["checksumValue"]
break

document["documentDescribes"] = [{ "Package": package} for package in packages ]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not comply with the json schema, which just has an array of spdx ids here: https://github.com/spdx/spdx-spec/blob/development/v2.3.1/schemas/spdx-schema.json#L219-L226

Even if it's just a temporary internal data structure, I find this highly confusing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very intentional, this PR didn't invent flatten/unflatten as it is a pain to follow links around when the document is in memory.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point of flattening the structure in this case though. The only instance where the value is accessed is this part where the packages are parsed, and this can be done just as well without modifying the data structure first. In fact, it is already done this way (although the fact that this logic is technically duplicated should be fixed).

If resolving links becomes annoying, I would find it more natural to add a utility method that handles it instead of introducing an additional intermediate data structure. This will come with a performance impact if called many times, but I would be surprised if this was an issue.

Additional remark: The current structure only considers files that are included in packages. To my understanding, files should also be allowed to be independent of any package (this might be related to the python tools still being out of date though, it seems like snippets are not considered at all, for example).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional comment: My understanding so far is that in a json document, any SPDX-Document DESCRIBES SPDX-Package relationship would be omitted and instead the package id would appear in documentDescribes (similar for files and snippets). This would imply that during parsing, we have to create new relationships for elements of documentDesccribes. I don't see that happening anywhere.

Relevant discussion: #242. Let's see if I'm wrong here 😅

@nicoweidner nicoweidner removed the stale label Oct 25, 2022
@meretp
Copy link
Collaborator

meretp commented Oct 25, 2022

@dholth As you already pointed out this PR doesn't handle parsing or other file formats. Do you plan on working on these things more or are you okay with me working on them?

@dholth
Copy link
Contributor Author

dholth commented Oct 25, 2022

I would be happy for anyone to pick this up, thanks!

@meretp
Copy link
Collaborator

meretp commented Oct 27, 2022

Thanks for your work @dholth! I openend another PR which continues this one and will close this one.

@meretp meretp closed this Oct 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JSON is not copatible with v2.2 spec - "packages" and "files" lists should be outside "documentDescribes"
7 participants