purl for apk packages missing when installed db file is not in root #1572

Closed
deitch opened this issue Feb 15, 2023 · 2 comments · Fixed by #1615
Labels
bug Something isn't working

deitch commented Feb 15, 2023

What happened:

  • If lib/apk/db/installed is in the root of the filesystem being scanned, then syft includes everything, including the purl.
  • If lib/apk/db/installed is below the root of the filesystem being scanned, then syft includes everything, except the purl

In both cases it reads the installed file and parses it, yet for some reason, if not in the root, it misses the purl.

Interestingly, if a copy of the package database is also present in the root (i.e. lib/apk/db/installed exists at the root in addition to the nested copies), then syft adds the purl for all of them, including the nested ones.

What you expected to happen:

Add the apk purl for all packages, wherever it finds them.

Steps to reproduce the issue:

  1. Create a Dockerfile with a simple package, e.g. curl:
FROM alpine:3.17
RUN apk add curl
  2. Build it to a tmpdir, e.g. /tmp/oci:
$ docker buildx build -t spdx-test --output type=local,dest=/tmp/oci .
  3. Run syft on the single dir, it works: syft packages -o spdx-json /tmp/oci - WORKS
  4. Make a directory for it one layer down and copy it: mkdir -p /tmp/single/one && tar -C /tmp/oci -cvf - . | (cd /tmp/single/one; tar -xvf - )
  5. Run syft on the upper layer, see that it all works except missing the purl: syft packages -o spdx-json /tmp/single - FAILS
  6. Repeat with two parallel deeper directories: mkdir -p /tmp/deep/one && mkdir -p /tmp/deep/two && tar -C /tmp/oci -cvf - . | (cd /tmp/deep/one; tar -xvf - ) && tar -C /tmp/oci -cvf - . | (cd /tmp/deep/two; tar -xvf - )
  7. Run syft on the upper layer, see that it captures both of the instances of the curl package, but still no purl: syft packages -o spdx-json /tmp/deep - FAILS
  8. Create a copy on the root level in addition to deeper, see that it works: mkdir -p /tmp/staggered && mkdir -p /tmp/staggered/one && mkdir -p /tmp/staggered/two && tar -C /tmp/oci -cvf - . | (cd /tmp/staggered/one; tar -xvf - ) && tar -C /tmp/oci -cvf - . | (cd /tmp/staggered/two; tar -xvf - ) && tar -C /tmp/oci -cvf - . | (cd /tmp/staggered; tar -xvf - )
  9. Run syft on the root level, see that it works: syft packages -o spdx-json /tmp/staggered - WORKS

Output from the first (limited to curl package for readability):

  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-262ddcbe57a87ed8",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    },
    {
     "referenceCategory": "PACKAGE-MANAGER",
     "referenceType": "purl",
     "referenceLocator": "pkg:apk/alpine/curl@7.87.0-r1?arch=aarch64&upstream=curl&distro=alpine-3.17.2"
    }
   ]
  },

From the single copy, one layer deep:

  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-d6e41f455208de3",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: oci/lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    }
   ]
  },

From the two parallel deeper directories:

  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-a9c664463a5e13cd",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: one/lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    }
   ]
  },
  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-f9087650e668eed7",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: two/lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    }
   ]
  },

From staggered, i.e. two deep and one in root:

  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-262ddcbe57a87ed8",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    },
    {
     "referenceCategory": "PACKAGE-MANAGER",
     "referenceType": "purl",
     "referenceLocator": "pkg:apk/alpine/curl@7.87.0-r1?arch=aarch64&upstream=curl&distro=alpine-3.17.2"
    }
   ]
  },
  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-a9c664463a5e13cd",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: one/lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    },
    {
     "referenceCategory": "PACKAGE-MANAGER",
     "referenceType": "purl",
     "referenceLocator": "pkg:apk/alpine/curl@7.87.0-r1?arch=aarch64&upstream=curl&distro=alpine-3.17.2"
    }
   ]
  },
  {
   "name": "curl",
   "SPDXID": "SPDXRef-Package-apk-curl-f9087650e668eed7",
   "versionInfo": "7.87.0-r1",
   "originator": "Person: Natanael Copa \u003cncopa@alpinelinux.org\u003e",
   "downloadLocation": "https://curl.se/",
   "sourceInfo": "acquired package info from APK DB: two/lib/apk/db/installed",
   "licenseConcluded": "curl",
   "licenseDeclared": "curl",
   "copyrightText": "NOASSERTION",
   "description": "URL retrival utility and library",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:curl:curl:7.87.0-r1:*:*:*:*:*:*:*"
    },
    {
     "referenceCategory": "PACKAGE-MANAGER",
     "referenceType": "purl",
     "referenceLocator": "pkg:apk/alpine/curl@7.87.0-r1?arch=aarch64&upstream=curl&distro=alpine-3.17.2"
    }
   ]
  },
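
The comparison above can also be done mechanically instead of by eye. The following is a minimal sketch, assuming the SPDX JSON document exposes its packages under a top-level "packages" key and that each package carries the "externalRefs" list shown in the outputs. The helper name packages_missing_purl is made up for illustration and is not part of syft.

```python
def packages_missing_purl(spdx_doc):
    """Return (name, sourceInfo) for every package without a purl externalRef."""
    missing = []
    for pkg in spdx_doc.get("packages", []):
        refs = pkg.get("externalRefs", [])
        # A package "has a purl" if any of its external references is of type purl.
        if not any(ref.get("referenceType") == "purl" for ref in refs):
            missing.append((pkg.get("name"), pkg.get("sourceInfo")))
    return missing


# Trimmed stand-in for real output: one package with a purl, one without.
sample = {
    "packages": [
        {
            "name": "curl",
            "sourceInfo": "acquired package info from APK DB: lib/apk/db/installed",
            "externalRefs": [
                {"referenceType": "cpe23Type"},
                {"referenceType": "purl"},
            ],
        },
        {
            "name": "curl",
            "sourceInfo": "acquired package info from APK DB: one/lib/apk/db/installed",
            "externalRefs": [{"referenceType": "cpe23Type"}],
        },
    ]
}

for name, source in packages_missing_purl(sample):
    print(f"no purl: {name} ({source})")
```

To use it on real output, redirect the syft output from any of the steps above to a file, load it with json.load, and pass the result to the helper.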

Anything else we need to know

It was @dautovri who first figured this out and explained it. Credit where it is due.

Environment:

  • Output of syft version: latest commit
  • OS (e.g: cat /etc/os-release or similar): ran on macOS and Linux
@deitch deitch added the bug Something isn't working label Feb 15, 2023

deitch commented Feb 15, 2023

I think I figured this out. The problem isn't the depth, or having the package at root. The issue is what it actually has in that package.

The cataloger looks for a Linux release identifier before it starts parsing packages (any packages), whether they are in root or below. It does so by calling linux.IdentifyRelease, which checks a standard list of files: /etc/os-release, /usr/lib/os-release, /etc/system-release-cpe, /etc/redhat-release, /bin/busybox.

If none of these is available in the root of the directory (or tar, or image, etc.) being scanned, then it cannot identify the release, and gives up on generating a purl for any package whose type requires it. So apk packages lose it, but golang packages do not (I didn't check any others).
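
To make the behaviour concrete, here is a rough sketch of a root-only lookup in Python. It is not syft's implementation (syft is written in Go and its actual logic lives behind linux.IdentifyRelease); the function name find_release_file and the directory layout are assumptions made for illustration. It shows why a tree whose release file sits under one/ yields nothing when the scan starts one level higher.

```python
from pathlib import Path
import tempfile

# The file list from the comment above, expressed relative to the scan root.
RELEASE_FILES = [
    "etc/os-release",
    "usr/lib/os-release",
    "etc/system-release-cpe",
    "etc/redhat-release",
    "bin/busybox",
]


def find_release_file(scan_root):
    """Return the first release file found directly under scan_root, or None."""
    root = Path(scan_root)
    for rel in RELEASE_FILES:
        candidate = root / rel
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Mimic /tmp/single: the whole filesystem lives one level down, under one/.
    nested_etc = Path(tmp) / "one" / "etc"
    nested_etc.mkdir(parents=True)
    (nested_etc / "os-release").write_text("ID=alpine\nVERSION_ID=3.17.2\n")

    found_at_root = find_release_file(tmp)                  # scan starts above one/
    found_at_nested = find_release_file(Path(tmp) / "one")  # scan starts at one/

print("scan of parent dir:", found_at_root)
print("scan of one/:", found_at_nested.name if found_at_nested else None)
```

With a lookup like this, the scan of the parent directory finds no release file even though the nested tree contains one, which matches the missing-purl symptom.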

This leads to a question and two suggestions:

  1. Why does that matter for apk packages? I see it in the purl, but something could happily run apk while on a different OS. Isn't the pkg:apk/alpine/ enough?
  2. Can we provide an ability to override those files? E.g. syft packages --apk-release-file=/etc/custom-release
  3. Can we provide an ability to just pass it a release explicitly? E.g. syft packages --apk-release-id="cool-linux". Or maybe it should be generic --os-release-id? 🤷‍♂️

I am not wholly sure that any of these fully addresses it. Consider a situation where I am running on some other OS (my own odd custom distribution or high-level packaging system) but am taking the packages from the official alpine package repository, by running apk --init-db -p /customdir and then apk add -p /customdir curl. According to the spec for apk purls:

There is no default package repository: this should be implied either from the distro qualifiers key or using a repository base url as repository_url qualifiers key.

So I do want the distro part of the purl to be alpine, even though my base OS does not have that in the discoverable files, e.g. /etc/os-release, or even in any discoverable files. That is partially why I wrote --apk-release-id instead of --os-release-id.
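
As an illustration of the two qualifier options the spec mentions, here is a small sketch that assembles an apk purl string either with distro or with repository_url. The function apk_purl is hypothetical and is not syft's code. It keeps qualifiers in the order given (to match the purl shown in the output above) and percent-encodes values conservatively; the example.com repository URL is a placeholder.

```python
from urllib.parse import quote


def apk_purl(name, version, namespace="alpine", qualifiers=None):
    """Build a pkg:apk purl string; qualifiers keep their insertion order."""
    purl = f"pkg:apk/{namespace}/{name}@{version}"
    # Skip empty values and percent-encode everything that is not unreserved.
    pairs = [
        f"{key}={quote(str(value), safe='')}"
        for key, value in (qualifiers or {}).items()
        if value
    ]
    if pairs:
        purl += "?" + "&".join(pairs)
    return purl


# Variant 1: distro qualifier, as in the purl syft emitted above.
with_distro = apk_purl(
    "curl",
    "7.87.0-r1",
    qualifiers={"arch": "aarch64", "upstream": "curl", "distro": "alpine-3.17.2"},
)

# Variant 2: repository_url instead of distro (placeholder URL).
with_repo_url = apk_purl(
    "curl",
    "7.87.0-r1",
    qualifiers={
        "arch": "aarch64",
        "upstream": "curl",
        "repository_url": "https://example.com/alpine/v3.17/main",
    },
)

print(with_distro)
print(with_repo_url)
```

Either variant gives a consumer of the purl a way to infer the repository, which is the property the spec asks for.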

Some alternatives:

  • The spec provides for repository_url as an alternative to distro; have an ability to override it, e.g. syft packages --alpine-repository-url? Bypasses some of the nice scanning, unfortunately, so not really enamoured of it.
  • Read the distro from /etc/apk/repositories but relative to wherever the lib/apk/db/installed is found. This has the benefit of working well with multiple parallel entries. E.g. /foo/one packages come from the official alpine repositories, but /foo/two packages come from my own private repositories. Both are apk packages, both are in one dir I am scanning /foo, but are distinct.
  • Combine the two: read etc/apk/repositories, but use it to populate not distro but repository_url. This at least gets around trying to parse the URLs and convert them to something canonical.

I think I like the last one the best, as it avoids global settings that might break other things.
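
A rough sketch of the "relative to wherever lib/apk/db/installed is found" idea, in Python for brevity. The helper repositories_for and the example.com URLs are hypothetical. It treats every non-blank, non-comment line of etc/apk/repositories as one opaque entry and makes no attempt to decide which repository a given package came from.

```python
from pathlib import Path
import tempfile


def repositories_for(installed_db):
    """List repository entries next to a given lib/apk/db/installed file."""
    # <prefix>/lib/apk/db/installed -> <prefix> is four path components up.
    prefix = Path(installed_db).parents[3]
    repo_file = prefix / "etc" / "apk" / "repositories"
    if not repo_file.is_file():
        return []
    entries = []
    for line in repo_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def make_tree(prefix, repo_lines):
    """Create a minimal apk-style tree under prefix; return the installed db path."""
    db = prefix / "lib" / "apk" / "db"
    db.mkdir(parents=True)
    (db / "installed").write_text("")
    etc_apk = prefix / "etc" / "apk"
    etc_apk.mkdir(parents=True)
    (etc_apk / "repositories").write_text("\n".join(repo_lines) + "\n")
    return db / "installed"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    one = make_tree(root / "one", ["https://example.com/alpine/v3.17/main"])
    two = make_tree(
        root / "two",
        [
            "# private mirrors",
            "https://example.com/private/main",
            "https://example.com/private/community",
        ],
    )
    repos_one = repositories_for(one)
    repos_two = repositories_for(two)

print(repos_one)
print(repos_two)
```

Because the lookup is anchored at each installed file rather than at the scan root, two parallel trees under the same scanned directory keep their own repository lists.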

This would be so much easier if lib/apk/db/installed just included where the package came from. But it doesn't.

deitch commented Feb 21, 2023

It occurs to me that using etc/apk/repositories might have issues too. What do you do when there are 4 URLs in there? Nothing in lib/apk/db/installed indicates which repository it came from (if it did, you wouldn't need linux.IdentifyRelease at all).

I believe that the apk logic for installation is to go through each repository in order, until it finds a matching package. I'm pretty sure we do not want syft to go do all of that.

Looking for some direction to help fix this.
