Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Component __lt__ function can result in invalid components SortedSet #586

Closed
joe-dipilato opened this issue Apr 11, 2024 · 10 comments · Fixed by #587 or #599
Closed

The Component __lt__ function can result in invalid components SortedSet #586

joe-dipilato opened this issue Apr 11, 2024 · 10 comments · Fixed by #587 or #599
Assignees
Labels
bug Something isn't working

Comments

@joe-dipilato
Copy link

Since the lt function only compares type, group, name, version, the components SortedSet can have their ordering violated
This violates the constraints that containers use to track their elements.

Since adding or removing items from the sorted set will only be ordered based on the lt operator, items that have the same type/group/name/version are not ordered deterministically. this can result in checks for contains to pass for ._set but fail for _list.

An example use case is when an SBOM has multiple components that have the same type, group, name, and version, but DIFFERENT qualifiers etc..

The solution just requires updating the lt function to include additional criteria including the purl, cpe, etc...

An additional side-effect of the current lt operation is that the contents produced from the exact same code can produce different outputs.

@madpah madpah changed the title The Component __lt__ function can result in invalid components SortedSet The Component __lt__ function can result in invalid components SortedSet Apr 12, 2024
@madpah madpah self-assigned this Apr 12, 2024
@madpah
Copy link
Collaborator

madpah commented Apr 12, 2024

Whilst ordering is not important according to the CycloneDX schemas, the current implementation with it's use of SortedSet does attempt to provide ordering.

I see no reason to update the __lt__ method to consider all parts of the Component.

@madpah madpah added the bug Something isn't working label Apr 12, 2024
@joe-dipilato
Copy link
Author

If the bom has multiple components that have the same type/group/name/version, performing components.remove(component) can result in a ValueError since it is possible for the SortedSet to report:
component in components._set -> True
component in components._list -> False

And since the remove SortedSet remove function tries to remove from both _set and _list, the command will raise an Exception.

@jkowalleck
Copy link
Member

jkowalleck commented Apr 12, 2024

re: #586 (comment)
@joe-dipilato could you provide a practical example? Something that could e used in regression test?

madpah added a commit that referenced this issue Apr 12, 2024
Signed-off-by: Paul Horton <paul.horton@owasp.org>
@joe-dipilato
Copy link
Author

An example would be having packages for:

https://repo1.maven.org/maven2/com/google/googlejavaformat/google-java-format/1.19.1

Which contains jars for
-all-deps.jar -javadoc.jar -sources.jar .jar

This is distinguished via purl. e.g.

  • pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=all-deps
  • pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1
  • pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=javadoc
  • pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=sources

though the actual components have the same type/group/name/version

@joe-dipilato
Copy link
Author

    {
      "bom-ref": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1",
      "group": "com.google.googlejavaformat",
      "name": "google-java-format",
      "purl": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1",
      "type": "library",
      "version": "1.19.1"
    },
    {
      "bom-ref": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=all-deps",
      "group": "com.google.googlejavaformat",
      "name": "google-java-format",
      "purl": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=all-deps",
      "type": "library",
      "version": "1.19.1"
    },
    {
      "bom-ref": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=sources",
      "group": "com.google.googlejavaformat",
      "name": "google-java-format",
      "purl": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1?classifier=sources",
      "type": "library",
      "version": "1.19.1"
    },

@jkowalleck
Copy link
Member

jkowalleck commented Apr 12, 2024

re: #586 (comment)
re: #586 (comment)

@joe-dipilato I do not understand. Why do you have the exact same component used multiple times?

Is the following not the only correct form?

{
  "bom-ref": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1",
  "group": "com.google.googlejavaformat",
  "name": "google-java-format",
  "purl": "pkg:maven/com.google.googlejavaformat/google-java-format@1.19.1",
  "type": "library",
  "version": "1.19.1",
  "externalReferences": [
    {
     "type": "distribution",
      "url": "https://repo1.maven.org/maven2/com/google/googlejavaformat/google-java-format/1.19.1/google-java-format-1.19.1.jar"
    }
    // other external refs here ... maybe to `bom`, `sources`, etc...
  ]
}

what is the practical use case in which you would have 4 different versions of com.google.googlejavaformat.google-java-format in the path? Is this a thing in Java, that you install all and hope that the path resolution does not shadow the wrong one? Does this actually happen in the real world?

@joe-dipilato
Copy link
Author

@jkowalleck - Yeah this is based on a real-world example.

If you take a look at https://repo1.maven.org/maven2/com/google/googlejavaformat/google-java-format/1.19.1/
there are 4 different jars, so the qualifier is necessary to distinguish between them.

There are many packages that need a qualifier to distinguish.

@jkowalleck
Copy link
Member

@joe-dipilato and you have a CycloneDX which includes multiple of the com.google.googlejavaformat.google-java-format? how comes?

@joe-dipilato
Copy link
Author

In this case it's included in a bundled software artifact along several other software items that are distinguished by qualiers. As long as it's part of a real software distribution, I believe that the CycloneDx format is intended be able to represent the software dependencies.

@jkowalleck
Copy link
Member

thank you for clarification. 💡

madpah added a commit that referenced this issue Apr 22, 2024
…587)

Fixes #586.

Signed-off-by: Paul Horton <paul.horton@owasp.org>
jkowalleck added a commit that referenced this issue Apr 26, 2024
reverts #587 - as this one introduced errors
fixes #598
fixes #586

---------

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>
Signed-off-by: Paul Horton <paul.horton@owasp.org>
Co-authored-by: Paul Horton <paul.horton@owasp.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants