Running Tern with -r and -c gives different results #999

Closed
JustinWonjaePark opened this issue Jul 13, 2021 · 16 comments · Fixed by #1031

@JustinWonjaePark

Problem Statement

When you run Tern without the Scancode extension to create a report,
you get installed package information under "packages":
tern report -f json -i debian:buster -o debian.json
But you usually don't get file-level license information without the Scancode extension.

So, if you want file-level information, you need to run it with Scancode:
tern report -f json -x scancode -i debian:buster -o debian_scancode.json
But in the report from the run with the Scancode extension, package information is (usually) missing.

  • The same thing happened for 'debian:buster', 'ubuntu:18.04', etc.
  • However, scanning fossology/fossology:3.10.0 with the Scancode extension provided package information for some layer(s).

So, if you want both package information and file-level information, you need to run the scan twice
and merge the reports yourself.

Describe the Proposal
Add an option (or change the Scancode extension's behavior) to combine the result of a 'general' scan with the result of a scan run with the Scancode extension.
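Until such an option exists, the manual merge described above could be sketched roughly as follows. This is only an illustration: it assumes both JSON reports use an `images -> image -> layers` layout in which each layer carries "packages" and "files" lists, and that both reports list images and layers in the same order. The function names are hypothetical and are not part of Tern.

```python
import copy
import json


def merge_reports(default_report, scancode_report):
    """Return a copy of scancode_report whose empty "packages" lists are
    filled in from the matching layers of default_report.

    ASSUMPTION: both reports look like
    {"images": [{"image": {"layers": [{"packages": [...], "files": [...]}]}}]}
    and list images and layers in the same order.
    """
    merged = copy.deepcopy(scancode_report)
    image_pairs = zip(default_report.get("images", []), merged.get("images", []))
    for default_image, merged_image in image_pairs:
        default_layers = default_image.get("image", {}).get("layers", [])
        merged_layers = merged_image.get("image", {}).get("layers", [])
        for default_layer, merged_layer in zip(default_layers, merged_layers):
            # Keep whatever the scancode run found; only fill the gaps.
            if not merged_layer.get("packages"):
                merged_layer["packages"] = copy.deepcopy(
                    default_layer.get("packages", []))
    return merged


def merge_report_files(default_path, scancode_path, output_path):
    """Merge two report files on disk, e.g. debian.json and debian_scancode.json."""
    with open(default_path) as f:
        default_report = json.load(f)
    with open(scancode_path) as f:
        scancode_report = json.load(f)
    with open(output_path, "w") as f:
        json.dump(merge_reports(default_report, scancode_report), f, indent=2)
```

The merge only fills layers where the scancode report has no packages, so file-level data and any packages scancode did report are left untouched.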

@JustinWonjaePark
Author

JustinWonjaePark commented Jul 13, 2021

Just in case you need it, here is my environment:

Tern at commit 273e3c8
Python 3.6.13
Docker version 19.03.13, build 4484c46d9d

Distributor ID: Ubuntu
Description: Ubuntu 16.04.7 LTS
Release: 16.04
Codename: xenial

@JustinWonjaePark
Author

Scanning fossology/fossology:3.10 and postgres:9.5.25 with the scancode extension provided package information.
This differs from scanning debian:buster, which didn't provide package information at all.

  • Yet the result for fossology/fossology:3.10 contained package information for only some layers, not all.

@rnjudge
Contributor

rnjudge commented Jul 23, 2021

I ran a regular Tern scan on fossology/fossology:3.10.0 and see that not all layers have packages installed. Perhaps this is the reason you're not seeing package information for all layers when running with scancode? Packages are not necessarily installed in every layer of a container image. For the fossology/fossology:3.10.0 image, I see packages in layers 1 and 3.
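One way to check which layers actually carry packages is to inspect the JSON report directly. The sketch below assumes the same `images -> image -> layers -> packages` layout as a JSON report and only looks at the first image; `layers_with_packages` is a hypothetical helper, not a Tern function.

```python
def layers_with_packages(report):
    """Return the 1-based positions of layers whose "packages" list is non-empty.

    ASSUMPTION: the report is a dict shaped like
    {"images": [{"image": {"layers": [{"packages": [...]}, ...]}}]}.
    Only the first image in the report is inspected.
    """
    images = report.get("images", [])
    if not images:
        return []
    layers = images[0].get("image", {}).get("layers", [])
    return [
        position
        for position, layer in enumerate(layers, start=1)
        if layer.get("packages")
    ]
```

Running this on both the regular report and the scancode report shows quickly whether the two runs disagree about which layers have packages.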

@rnjudge
Contributor

rnjudge commented Jul 23, 2021

@JustinWonjaePark I do see the issue you are describing -- if I run tern report -i debian:buster -x scancode -o deb_scancode.out regardless of whether cache is populated or not, only file licenses are reported.
We had this same issue open some time ago. The missing packages were thought to be caused by scancode, and we believed the problem was resolved. It seems we need to take another look at this.

@nishakm when you are back from vacation can you poke at this? I'll see how far I get before vacation and leave any updates here. When Tern is run with scancode for debian:buster, it doesn't run any of the native package metadata collection, which doesn't seem to be right? As a user, I would expect Tern to run the native package metadata collection AND scancode file/package collection when running with the -x scancode option and merge the results into an output report. On a regular Tern run for a debian image, the log would show "Processing Debian copyrights..." to indicate it is collecting package licenses using debian-inspector but when Tern is run with scancode on debian:buster, you only see the scancode command being executed in the log.

2021-07-22 23:14:06,060 - DEBUG - common - Reading files in filesystem...
2021-07-22 23:14:07,980 - DEBUG - rootfs - Running command: sudo /home/rjudge/ternenv/bin/scancode -ilpcu --quiet --timeout 300 -n 4 --json - /home/rjudge/.tern/temp/e63992559353a522e62a73bc137fd023ee97e1226b96947df8e00e73d1a9d627/contents
2021-07-22 23:22:30,759 - DEBUG - executor - Collecting file data...
2021-07-22 23:22:34,371 - DEBUG - generator - Creating a detailed report of components in image...
2021-07-22 23:22:34,537 - DEBUG - rootfs - Running command: sudo rm -rf /home/rjudge/.tern/temp/mergedir
2021-07-22 23:22:34,551 - DEBUG - rootfs - Running command: sudo rm -rf /home/rjudge/.tern/temp/workdir
2021-07-22 23:22:34,563 - DEBUG - rootfs - Running command: sudo rm -rf /home/rjudge/.tern/temp/e63992559353a522e62a73bc137fd023ee97e1226b96947df8e00e73d1a9d627/contents
2021-07-22 23:22:34,696 - DEBUG - prep - Tearing down...
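A quick way to tell from a log whether native package collection ran is to look for the "Processing Debian copyrights..." message alongside the scancode command. The sketch below parses lines in the `timestamp - LEVEL - module - message` format shown in the excerpt above; `summarize_log` and its return keys are hypothetical names chosen for illustration.

```python
import re

# Line format taken from the log excerpt:
# "2021-07-22 23:14:07,980 - DEBUG - rootfs - Running command: ..."
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>\w+) - (?P<module>\w+) - (?P<message>.*)$"
)
NATIVE_MARKER = "Processing Debian copyrights"


def summarize_log(lines):
    """Report whether native Debian collection and scancode appear in a log."""
    native_ran = False
    scancode_ran = False
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the expected format
        message = match.group("message")
        if NATIVE_MARKER in message:
            native_ran = True
        if message.startswith("Running command:") and any(
            token.endswith("scancode") for token in message.split()
        ):
            scancode_ran = True
    return {"native_collection": native_ran, "scancode": scancode_ran}
```

For the excerpt above, this would report that scancode ran while native collection did not, which matches the behavior described.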

Even stranger, when the cache is already populated with package info from a previous Tern run, running Tern with scancode does not pull the cached package metadata into the report. (I think this behavior is related to #1000 as well.)

It's also interesting that Justin is seeing packages with scancode for other containers like postgres:9.5.25 which has a debian stretch base OS. I'm running scancode with postgres right now (likely overnight) to see what I find there and will report back.

@rnjudge
Contributor

rnjudge commented Jul 26, 2021

Confirmed that running (with a populated cache) Tern + Scancode with the postgres:9.5.25 container image produces package information despite having a debian base OS. So the behavior where Tern does not merge cache package data with scancode data seems to be (so far) unique to debian:<> images.

rnjudge added the bug (Something went wrong) label Jul 26, 2021
@nishakm
Contributor

nishakm commented Aug 6, 2021

@rnjudge I am not able to reproduce the missing package-level inventory.

Default inventory:

        Packages found in Layer:                                                                                                                                                              
        +------------------------+------------------------+---------+------------+                                                                                                            
        | Package                | Version                | License | Pkg Format |                                                                                                            
        +------------------------+------------------------+---------+------------+                                                                                                            
        | adduser                | 3.118                  |         | deb        |                                                                                                            
        | apt                    | 1.8.2.2                |         | deb        |                                                                                                            
        | base-files             | 10.3+deb10u9           |         | deb        |                                                                                                            
        | base-passwd            | 3.5.46                 |         | deb        |                                                                                                            
        | bash                   | 5.0-4                  |         | deb        |                                                                                                            
        | bsdutils               | 1:2.33.1-0.1           |         | deb        |                                                                                                            
        | coreutils              | 8.30-3                 |         | deb        |                                                                                                            
        | dash                   | 0.5.10.2-5             |         | deb        |                                                                                                            
        | debconf                | 1.5.71                 |         | deb        |                                                                                                            
        | debian-archive-keyring | 2019.1+deb10u1         |         | deb        |                                                                                                            
        | debianutils            | 4.8.6.1                |         | deb        |                                                                                                            
        | diffutils              | 1:3.7-3                |         | deb        |                                                                                                            
        | dpkg                   | 1.19.7                 |         | deb        |                                                                                                            
        | e2fsprogs              | 1.44.5-1+deb10u3       |         | deb        |                                                                                                            
        | fdisk                  | 2.33.1-0.1             |         | deb        |                                                                                                            
        | findutils              | 4.6.0+git+20190209-2   |         | deb        |                                                                                                            
        | gcc-8-base             | 8.3.0-6                |         | deb        |                                                                                                            
        | gpgv                   | 2.2.12-1+deb10u1       |         | deb        |                                                                                                            
        | grep                   | 3.3-1                  |         | deb        |                                                                                                            
        | gzip                   | 1.9-3                  |         | deb        |                                                                                                            
        | hostname               | 3.21                   |         | deb        |                                                                                                            
        | init-system-helpers    | 1.56+nmu1              |         | deb        |                                                                                                            
        | iproute2               | 4.20.0-2+deb10u1       |         | deb        |                                                                                                            
        | iputils-ping           | 3:20180629-2+deb10u2   |         | deb        |                                                                                                            
        | libacl1                | 2.2.53-4               |         | deb        |                                                                                                            
        | libapt-pkg5.0          | 1.8.2.2                |         | deb        |                                                                                                            
        | libattr1               | 1:2.4.48-4             |         | deb        |                                                                                                            
        | libaudit-common        | 1:2.8.4-3              |         | deb        |                                                                                                            
        | libaudit1              | 1:2.8.4-3              |         | deb        |                                                                                                            
        | libblkid1              | 2.33.1-0.1             |         | deb        |                                                                                                            
        | libbz2-1.0             | 1.0.6-9.2~deb10u1      |         | deb        |                                                                                                            
        | libc-bin               | 2.28-10                |         | deb        |                                                                                                            
        | libc6                  | 2.28-10                |         | deb        |                                                                                                            
        | libcap-ng0             | 0.7.9-2                |         | deb        |                                                                                                            
        | libcap2                | 1:2.25-2               |         | deb        |                                                                                                            
        | libcap2-bin            | 1:2.25-2               |         | deb        |                                                                                                            
        | libcom-err2            | 1.44.5-1+deb10u3       |         | deb        |                                                                                                            
        | libdb5.3               | 5.3.28+dfsg1-0.5       |         | deb        |                                                                                                            
        | libdebconfclient0      | 0.249                  |         | deb        |                                                                                                            
        | libelf1                | 0.176-1.1              |         | deb        |                                          
        | libext2fs2             | 1.44.5-1+deb10u3       |         | deb        |
        | libfdisk1              | 2.33.1-0.1             |         | deb        |
        | libffi6                | 3.2.1-9                |         | deb        |
        | libgcc1                | 1:8.3.0-6              |         | deb        |
        | libgcrypt20            | 1.8.4-5                |         | deb        |
        | libgmp10               | 2:6.1.2+dfsg-4         |         | deb        |
        | libgnutls30            | 3.6.7-4+deb10u6        |         | deb        |
        | libgpg-error0          | 1.35-1                 |         | deb        |
        | libhogweed4            | 3.4.1-1                |         | deb        |
        | libidn2-0              | 2.0.5-1+deb10u1        |         | deb        |
        | liblz4-1               | 1.8.3-1                |         | deb        |
        | liblzma5               | 5.2.4-1                |         | deb        |
        | libmnl0                | 1.0.4-2                |         | deb        |
        | libmount1              | 2.33.1-0.1             |         | deb        |
        | libncursesw6           | 6.1+20181013-2+deb10u2 |         | deb        |
        | libnettle6             | 3.4.1-1                |         | deb        |
        | libp11-kit0            | 0.23.15-2+deb10u1      |         | deb        |
        | libpam-modules         | 1.3.1-5                |         | deb        |
        | libpam-modules-bin     | 1.3.1-5                |         | deb        |
        | libpam-runtime         | 1.3.1-5                |         | deb        |
        | libpam0g               | 1.3.1-5                |         | deb        |
        | libpcre3               | 2:8.39-12              |         | deb        |
        | libseccomp2            | 2.3.3-4                |         | deb        |
        | libselinux1            | 2.8-1+b1               |         | deb        |
        | libsemanage-common     | 2.8-2                  |         | deb        |
        | libsemanage1           | 2.8-2                  |         | deb        |
        | libsepol1              | 2.8-1                  |         | deb        |
        | libsmartcols1          | 2.33.1-0.1             |         | deb        |
        | libss2                 | 1.44.5-1+deb10u3       |         | deb        |
        | libstdc++6             | 8.3.0-6                |         | deb        |
        | libsystemd0            | 241-7~deb10u7          |         | deb        |
        | libtasn1-6             | 4.13-3                 |         | deb        |
        | libtinfo6              | 6.1+20181013-2+deb10u2 |         | deb        |
        | libudev1               | 241-7~deb10u7          |         | deb        |
        | libunistring2          | 0.9.10-1               |         | deb        |
        | libuuid1               | 2.33.1-0.1             |         | deb        |
        | libxtables12           | 1.8.2-4                |         | deb        |
        | libzstd1               | 1.3.8+dfsg-3+deb10u2   |         | deb        |
        | login                  | 1:4.5-1.1              |         | deb        |
        | mawk                   | 1.3.3-17+b3            |         | deb        |
        | mount                  | 2.33.1-0.1             |         | deb        |
        | ncurses-base           | 6.1+20181013-2+deb10u2 |         | deb        |
        | ncurses-bin            | 6.1+20181013-2+deb10u2 |         | deb        |
        | passwd                 | 1:4.5-1.1              |         | deb        |
        | perl-base              | 5.28.1-6+deb10u1       |         | deb        |
        | sed                    | 4.7-1                  |         | deb        |
        | sysvinit-utils         | 2.93-8                 |         | deb        |
        | tar                    | 1.30+dfsg-6            |         | deb        |
        | tzdata                 | 2021a-0+deb10u1        |         | deb        |
        | util-linux             | 2.33.1-0.1             |         | deb        |
        | zlib1g                 | 1:1.2.11.dfsg-1        |         | deb        |
        +------------------------+------------------------+---------+------------+

With scancode inventory:

        Packages found in Layer:                                                                                                                                                              
        +------------------------+------------------------+---------+------------+                                                                                                            
        | Package                | Version                | License | Pkg Format |                                                                                                            
        +------------------------+------------------------+---------+------------+                                                                                                            
        | adduser                | 3.118                  |         | deb        |                                                                                                            
        | apt                    | 1.8.2.2                |         | deb        |                                                                                                            
        | base-files             | 10.3+deb10u9           |         | deb        |                                                                                                            
        | base-passwd            | 3.5.46                 |         | deb        |                                                                                                            
        | bash                   | 5.0-4                  |         | deb        |                                                                                                            
        | bsdutils               | 1:2.33.1-0.1           |         | deb        |                                                                                                            
        | coreutils              | 8.30-3                 |         | deb        |                                                                                                            
        | dash                   | 0.5.10.2-5             |         | deb        |                                                                                                            
        | debconf                | 1.5.71                 |         | deb        |
        | debian-archive-keyring | 2019.1+deb10u1         |         | deb        |
        | debianutils            | 4.8.6.1                |         | deb        |
        | diffutils              | 1:3.7-3                |         | deb        |
        | dpkg                   | 1.19.7                 |         | deb        |
        | e2fsprogs              | 1.44.5-1+deb10u3       |         | deb        |
        | fdisk                  | 2.33.1-0.1             |         | deb        |
        | findutils              | 4.6.0+git+20190209-2   |         | deb        |
        | gcc-8-base             | 8.3.0-6                |         | deb        |
        | gpgv                   | 2.2.12-1+deb10u1       |         | deb        |
        | grep                   | 3.3-1                  |         | deb        |
        | gzip                   | 1.9-3                  |         | deb        |
        | hostname               | 3.21                   |         | deb        |
        | init-system-helpers    | 1.56+nmu1              |         | deb        |
        | iproute2               | 4.20.0-2+deb10u1       |         | deb        |
        | iputils-ping           | 3:20180629-2+deb10u2   |         | deb        |
        | libacl1                | 2.2.53-4               |         | deb        |
        | libapt-pkg5.0          | 1.8.2.2                |         | deb        |
        | libattr1               | 1:2.4.48-4             |         | deb        |
        | libaudit-common        | 1:2.8.4-3              |         | deb        |
        | libaudit1              | 1:2.8.4-3              |         | deb        |
        | libblkid1              | 2.33.1-0.1             |         | deb        |
        | libbz2-1.0             | 1.0.6-9.2~deb10u1      |         | deb        |
        | libc-bin               | 2.28-10                |         | deb        |
        | libc6                  | 2.28-10                |         | deb        |
        | libcap-ng0             | 0.7.9-2                |         | deb        |
        | libcap2                | 1:2.25-2               |         | deb        |
        | libcap2-bin            | 1:2.25-2               |         | deb        |
        | libcom-err2            | 1.44.5-1+deb10u3       |         | deb        |
        | libdb5.3               | 5.3.28+dfsg1-0.5       |         | deb        |
        | libdebconfclient0      | 0.249                  |         | deb        |
        | libelf1                | 0.176-1.1              |         | deb        |
        | libext2fs2             | 1.44.5-1+deb10u3       |         | deb        |
        | libfdisk1              | 2.33.1-0.1             |         | deb        |
        | libffi6                | 3.2.1-9                |         | deb        |
        | libgcc1                | 1:8.3.0-6              |         | deb        |
        | libgnutls30            | 3.6.7-4+deb10u6        |         | deb        |
        | libgpg-error0          | 1.35-1                 |         | deb        |
        | libhogweed4            | 3.4.1-1                |         | deb        |
        | libidn2-0              | 2.0.5-1+deb10u1        |         | deb        |
        | liblz4-1               | 1.8.3-1                |         | deb        |
        | liblzma5               | 5.2.4-1                |         | deb        |
        | libmnl0                | 1.0.4-2                |         | deb        |
        | libmount1              | 2.33.1-0.1             |         | deb        |
        | libncursesw6           | 6.1+20181013-2+deb10u2 |         | deb        |
        | libnettle6             | 3.4.1-1                |         | deb        |
        | libp11-kit0            | 0.23.15-2+deb10u1      |         | deb        |
        | libpam-modules         | 1.3.1-5                |         | deb        |
        | libpam-modules-bin     | 1.3.1-5                |         | deb        |
        | libpam-runtime         | 1.3.1-5                |         | deb        |
        | libpam0g               | 1.3.1-5                |         | deb        |
        | libpcre3               | 2:8.39-12              |         | deb        |
        | libseccomp2            | 2.3.3-4                |         | deb        |
        | libselinux1            | 2.8-1+b1               |         | deb        |
        | libsemanage-common     | 2.8-2                  |         | deb        |
        | libsemanage1           | 2.8-2                  |         | deb        |
        | libsepol1              | 2.8-1                  |         | deb        |
        | libsmartcols1          | 2.33.1-0.1             |         | deb        |
        | libss2                 | 1.44.5-1+deb10u3       |         | deb        |
        | libstdc++6             | 8.3.0-6                |         | deb        |
        | libsystemd0            | 241-7~deb10u7          |         | deb        |
        | libtasn1-6             | 4.13-3                 |         | deb        |
        | libtinfo6              | 6.1+20181013-2+deb10u2 |         | deb        |
        | libudev1               | 241-7~deb10u7          |         | deb        |
        | libunistring2          | 0.9.10-1               |         | deb        |
        | libuuid1               | 2.33.1-0.1             |         | deb        |
        | libxtables12           | 1.8.2-4                |         | deb        |
        | libzstd1               | 1.3.8+dfsg-3+deb10u2   |         | deb        |
        | login                  | 1:4.5-1.1              |         | deb        |
        | mawk                   | 1.3.3-17+b3            |         | deb        |
        | mount                  | 2.33.1-0.1             |         | deb        |
        | ncurses-base           | 6.1+20181013-2+deb10u2 |         | deb        |
        | ncurses-bin            | 6.1+20181013-2+deb10u2 |         | deb        |
        | passwd                 | 1:4.5-1.1              |         | deb        |
        | perl-base              | 5.28.1-6+deb10u1       |         | deb        |
        | sed                    | 4.7-1                  |         | deb        |
        | sysvinit-utils         | 2.93-8                 |         | deb        |
        | tar                    | 1.30+dfsg-6            |         | deb        |
        | tzdata                 | 2021a-0+deb10u1        |         | deb        |
        | util-linux             | 2.33.1-0.1             |         | deb        |
        | zlib1g                 | 1:1.2.11.dfsg-1        |         | deb        |
        +------------------------+------------------------+---------+------------+

Debian doesn't have package level license metadata.

@rnjudge
Contributor

rnjudge commented Aug 11, 2021

Nisha and I discussed this and the issue is reproducible when you:

  1. Clear the cache
  2. Run tern+scancode on a debian:buster image
  3. Run regular tern on debian:buster image
  4. Run tern+scancode again on debian:buster -- you will not see package level info in this report even though package level metadata is present in the cache (from step 3)

What's strange is that this seems to be unique to debian:buster. For step 2, this could be explained if scancode is not reporting package info for this image and therefore Tern doesn't either (although that would be a bug to open with the scancode project). But for step 4, I would expect package data to be present in the report even if scancode didn't previously find package metadata to store in the cache, since Tern did find package metadata in step 3, and that data would be in the cache.
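The four steps above could be scripted along these lines. This is a sketch only: it assumes that `-c` clears the cache before the run (as the issue title suggests), so steps 1 and 2 are folded into one command, and the output file names are made up for illustration. With `dry_run=True` it only prints the commands.

```python
import subprocess

IMAGE = "debian:buster"


def reproduction_commands(image=IMAGE):
    """Build the command lines for the reproduction steps.

    ASSUMPTION: `tern -c ...` clears the cache before running; the output
    file names are hypothetical.
    """
    return [
        # Steps 1 and 2: clear the cache, then run tern + scancode.
        ["tern", "-c", "report", "-x", "scancode", "-i", image,
         "-o", "step2_scancode.txt"],
        # Step 3: regular tern run, which populates the cache with packages.
        ["tern", "report", "-i", image, "-o", "step3_default.txt"],
        # Step 4: tern + scancode again; package info is expected but missing.
        ["tern", "report", "-x", "scancode", "-i", image,
         "-o", "step4_scancode.txt"],
    ]


def run_reproduction(image=IMAGE, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in reproduction_commands(image):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Comparing the step 3 and step 4 reports afterwards shows whether the cached package data made it into the scancode report.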

@nishakm
Contributor

nishakm commented Aug 11, 2021

I am still unable to reproduce this issue :(. I have attached my output files here.

  1. Ran tern -r report -x scancode -i debian:buster -o tern_scancode_nocache.txt
  2. Ran tern report -i debian:buster -o tern_cache.txt
  3. Ran tern report -x scancode -i debian:buster -o tern_scancode_cache.txt

I get package level info in all 3 scenarios. Also, from the log, it looks like the layers are loaded from the cache as expected.
tern_scancode_nocache.txt
tern_cache.txt
tern_scancode_cache.txt

@nishakm
Contributor

nishakm commented Aug 11, 2021

It's probably worth noting my scancode version:

ScanCode version 21.8.4

@nishakm
Contributor

nishakm commented Aug 12, 2021

Bingo! -r's operation is different from -c!

@rnjudge
Contributor

rnjudge commented Aug 12, 2021

I am also running scancode 21.6.7. I will update it for future debugging of this issue.

@nishakm
Contributor

nishakm commented Aug 12, 2021

Looks like scancode errors out with a multiprocessing.context.TimeoutError at func(self, timeout=timeout or 3600). I'm going to try using the default timeout (3600) and see what that does.
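For context, the `timeout or 3600` idiom and the resulting error can be reproduced with Python's standard library alone. The sketch below is not scancode's code; it uses a thread pool and hypothetical helper names to show that a falsy timeout falls back to 3600 seconds and that `multiprocessing.TimeoutError` (the same class as `multiprocessing.context.TimeoutError`) is raised when a result is not ready in time.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool

DEFAULT_TIMEOUT = 3600  # seconds, the fallback seen in the traceback


def effective_timeout(timeout=None):
    """Mirror the `timeout or 3600` idiom: None and 0 both fall back."""
    return timeout or DEFAULT_TIMEOUT


def run_with_timeout(func, args=(), timeout=None):
    """Run func(*args) in a worker thread.

    Returns (True, value) on success, or (False, None) if the result was
    not ready within the effective timeout.
    """
    with ThreadPool(processes=1) as pool:
        pending = pool.apply_async(func, args)
        try:
            return True, pending.get(timeout=effective_timeout(timeout))
        except multiprocessing.TimeoutError:
            return False, None


if __name__ == "__main__":
    # A 0.3 s task with a 0.05 s budget does not finish in time.
    print(run_with_timeout(time.sleep, (0.3,), timeout=0.05))
```

With the `--timeout 300` value seen in the earlier log, the effective timeout in this sketch would be 300 seconds, and only an unset or zero value would fall back to 3600.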

@nishakm
Contributor

nishakm commented Aug 12, 2021

OK, I am having a hard time running scancode on the debian images. It's possible that the package inventory on debian images is just broken for scancode :(

@nishakm
Contributor

nishakm commented Aug 16, 2021

This issue with scancode is still open: nexB/scancode-toolkit#2467

@nishakm
Contributor

nishakm commented Aug 16, 2021

@rnjudge @JustinWonjaePark It looks like Tern's operation of augmenting the results of inventory runs by different methods works as expected, i.e., running Tern using the default method and then running scancode does not remove the data collected by the default method. However, the -r option should remove all the data, and it doesn't do that. Can I change the title of this issue to reflect that, or would you prefer that I file a new issue?

@JustinWonjaePark
Author

@nishakm I am OK with changing the title as needed.

nishakm changed the title from "Running tern with and without Scancode outputs different package information" to "Running Tern with -r and -c gives different results" Aug 17, 2021
nishakm pushed a commit to nishakm/tern that referenced this issue Aug 31, 2021
If the redo flag is set, the executor should not load data from
the cache. Hence we pass the redo flag to the load_from_cache
function used in the scancode executor.

Fixes tern-tools#999

Signed-off-by: Nisha K <nishak@vmware.com>
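The idea in this commit message can be illustrated with a small, self-contained sketch. It is not Tern's actual implementation: `LayerCache`, `analyze_layer`, the dict-based layer, and the signature given to `load_from_cache` here are all hypothetical. It only shows how passing a redo flag down to the cache loader forces a fresh scan.

```python
class LayerCache:
    """Toy in-memory cache mapping a layer id to its package list (hypothetical)."""

    def __init__(self):
        self._packages = {}

    def save(self, layer_id, packages):
        self._packages[layer_id] = list(packages)

    def get(self, layer_id):
        return self._packages.get(layer_id)


def load_from_cache(cache, layer, redo=False):
    """Fill layer["packages"] from the cache and return True if data was loaded.

    When redo is set, the cache is deliberately ignored so the caller rescans.
    """
    if redo:
        return False
    cached = cache.get(layer["id"])
    if cached is None:
        return False
    layer["packages"] = list(cached)
    return True


def analyze_layer(cache, layer, scan, redo=False):
    """Use cached data when allowed; otherwise scan and refresh the cache."""
    if not load_from_cache(cache, layer, redo=redo):
        layer["packages"] = scan(layer)
        cache.save(layer["id"], layer["packages"])
    return layer
```

Without the flag, a layer that is already cached is never rescanned; with it, the stale entry is replaced by the new scan result.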
nishakm pushed a commit to nishakm/tern that referenced this issue Sep 2, 2021
rnjudge pushed a commit that referenced this issue Sep 2, 2021