[BUG]"conda list | jake iq -c" produces different results in comparison to "jake iq" and "nexus-iq-cli" #50

Closed
Nantero1 opened this issue Feb 11, 2021 · 9 comments
Labels
bug Something isn't working

Comments

@Nantero1

Nantero1 commented Feb 11, 2021

jake v0.2.66
Nexus IQ Server 103

Describe the bug
Running conda list | jake iq -c leads to completely different results than running jake iq without piped input.

Example environment environment.yml:

name: testenv
channels:
  - conda-forge
  - defaults
dependencies:
  - python
  - pyjwt>=1.6.4,<2.0
  - pandas
  - pip
  - openssl
  - py
  - bleach
  - pip:
    - numpy

To Reproduce

  1. Create and activate a conda environment including some packages from conda-forge, for example "pandas".
     conda env create -f environment.yml
     conda activate testenv
     pip install jake
  2. Set up a Nexus IQ Server.
  3. Submit test results to the Nexus IQ server via conda list | jake iq -c and via jake iq.
  4. Check the results at the printed Nexus IQ link. Notice that the vulnerability results are completely different.

I expect the result to be identical. It shouldn't matter if I pipe the output of conda list to the tool or let Jake find out the dependencies in the currently activated environment.

Screenshots
conda list | jake iq -c results in:
image

jake iq results in:
image

@Nantero1 Nantero1 added the bug Something isn't working label Feb 11, 2021
@Nantero1
Author

Nantero1 commented Feb 16, 2021

The nexus-iq-cli v1.105.0-01 produces the same result as the conda list | jake iq -c command. For bigger environments, the jake iq command generally seems to find more issues.

@Nantero1
Author

Nantero1 commented Feb 16, 2021

Maybe it is related to #42.
Please also note the additional (.tar.gz) suffix in the second case, the jake iq scan. The nexus-iq-cli report (no screenshot attached) doesn't show this tar.gz suffix next to the package name; its result is identical to the less verbose conda list | jake iq -c result.

@Nantero1 Nantero1 changed the title [BUG]"conda list | jake iq -c" produces different results in comparison to "jake iq" [BUG]"conda list | jake iq -c" produces different results in comparison to "jake iq" and "nexus-iq-cli" Feb 16, 2021
@DarthHater
Member

A difference here is that jake iq is evaluating your python environment, not conda.

$ jake iq --help
Usage: jake iq [OPTIONS]

  EXTRA SPECIAL MOVE

  Allows you to perform scans backed by Sonatype's Nexus IQ Server

  Example usage:

      Python scan: jake iq -a <AppId>

      Conda scan: conda list | jake iq -a <AppId> -c

The Conda scan (which we could not easily do programmatically because of Conda's API) works by piping input in, as you see with conda list. If you do NOT do that, jake queries the Python environment to see what is actually loaded, and at which versions.

Does that make sense?
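
To make the two modes concrete, here is a minimal sketch of the two ways a package list can be collected: parsing piped conda list text, or asking the running Python interpreter what is installed. This is not jake's actual implementation; the function names parse_conda_list and installed_python_distributions are made up for illustration, and only the standard library is used.

```python
from importlib import metadata


def parse_conda_list(text):
    """Parse `conda list` output into (name, version) pairs.

    Lines starting with '#' are headers; data lines are whitespace-separated
    columns: name, version, build, and optionally channel.
    """
    packages = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        columns = line.split()
        if len(columns) >= 2:
            packages.append((columns[0], columns[1]))
    return packages


def installed_python_distributions():
    """List (name, version) for distributions visible to this interpreter."""
    found = [
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    ]
    # Sort by name only, so entries with odd version metadata cannot break sorting.
    return sorted(found, key=lambda pair: pair[0].lower())
```

With piped input, something like parse_conda_list(sys.stdin.read()) would see every conda package, including non-Python ones such as openssl. The second function only sees Python distributions, which is one reason the two lists can differ.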

@Nantero1
Author

Nantero1 commented Feb 17, 2021

Hi, thanks for answering. It does not fully make sense yet, because:

  1. Why is pandas 1.2.2 a security risk in one case (jake iq), but not in the other case (conda list | jake iq -c)? Please see the attached screenshots.
    image
    The same goes for pip and all other packages recognized by jake iq but ignored by conda list | jake iq -c. Please note that the packages are in the list, but are not recognized as a risk. Please compare to the screenshots above.
    image

  2. The reverse is also an issue: openssl is recognized by conda list | jake iq -c, but not by jake iq.

  3. conda activate testenv loads the conda environment, therefore jake iq should evaluate the loaded conda environment.

@DarthHater
Member

So the difference, I suspect, is that when you scan it as conda, it's using Conda as its data source, and when you scan it not as conda (that is, with just jake on its own), it's using PyPI as its data source. I would suspect that the data in IQ for PyPI is more comprehensive than it is for Conda. The other thing that comes to mind is that the source of the security risk is a Sonatype ID, not a public CVE. I think for Conda none of the packages have Sonatype IDs, etc... It likely sounds a bit complicated, given the heavy overlap between the two ecosystems. I'm going to send this issue to a few people internally to take a gander at, for the record!
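
One way to picture the data-source split is that the same name and version become two different coordinates depending on the ecosystem they are reported under. The sketch below uses Package URL (purl) notation purely as a convenient way to write that down. Whether jake sends exactly this form to IQ is an assumption, and to_purl is a hypothetical helper, not part of jake.

```python
def to_purl(ecosystem, name, version):
    """Format a package coordinate as a Package URL (purl) string."""
    if ecosystem not in ("pypi", "conda"):
        raise ValueError(f"unsupported ecosystem: {ecosystem}")
    # The purl spec asks for lowercase PyPI names with '_' replaced by '-'.
    if ecosystem == "pypi":
        name = name.lower().replace("_", "-")
    return f"pkg:{ecosystem}/{name}@{version}"


print(to_purl("pypi", "pandas", "1.2.2"))   # pkg:pypi/pandas@1.2.2
print(to_purl("conda", "pandas", "1.2.2"))  # pkg:conda/pandas@1.2.2
```

The two strings differ only in the ecosystem segment, but a lookup keyed on them would land in two separate data sets, each with its own coverage.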

@DarthHater
Member

So my suspicion has been mostly confirmed, and comes down to data differences. I agree it's confusing though, and I've escalated it to see if anything can be done, at least in better explaining the data differences.

@Nantero1
Author

Nantero1 commented Feb 22, 2021

Thanks for escalating and finding the root cause of the problem :) ! Now it can be addressed.

It would be really nice, if the databases become one, instead of having them separate and incomplete.

To add one more thing: The nexus-iq-cli (the JAR, shipped with the Nexus-IQ-bundle) can use both, the less comprehensive conda database and the more complete pypi db for evaluation of the packages. It recognizes the db to use by the filename and format -- requirements.txt vs conda.txt

UPDATE: Your investigation allowed me to use a workaround. I export both, the pip and conda environment and scan them with the nexus-iq-cli by specifying the temporary directory in which both environments were exported as text files. The result is more complete, but many packages are listed multiple times. For example pandas is listed a dozen times:
image

UPDATE2: An even more complete report is created, if I convert my conda env to match the syntax of a pip env requirements file and scan both files:

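# Export the active conda environment twice: once as-is for the conda
# database, once to be rewritten in pip requirements syntax.
conda activate testenv
mkdir -p temp_env_dir
conda list -e > temp_env_dir/conda.txt
conda list -e > temp_env_dir/requirements.txt
# Drop the trailing build string: name=version=build -> name=version
sed -i -E "s/^(.*\=.*)(\=.*)/\1/" temp_env_dir/requirements.txt
# Use pip's pin operator: name=version -> name==version
sed -i 's/=/==/g' temp_env_dir/requirements.txt
# Scan the directory containing both files
java -jar nexus-iq-cli.jar -i <YOUR_APP_ID> -s <NEXUS_IQ_SERVER_URL> -a <USERNAME>:<PASSWORD> temp_env_dir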

Would be nice if jake could scan both, pip and conda with ONE scan OR conda and pip become one database.
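
For anyone who prefers not to rely on sed, the same conversion can be sketched in Python. This is a rough equivalent of the two sed commands above, assuming each data line of conda list -e has the form name=version=build; conda_export_to_requirements is a hypothetical helper name. Unlike the sed version, it drops comment lines instead of copying them through.

```python
def conda_export_to_requirements(lines):
    """Turn `conda list -e` lines (name=version=build) into name==version."""
    requirements = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and the '#' header comments conda writes.
        if not line or line.startswith("#"):
            continue
        parts = line.split("=")
        if len(parts) < 2:
            continue  # no version field, nothing to pin
        name, version = parts[0], parts[1]
        requirements.append(f"{name}=={version}")
    return requirements
```

Reading temp_env_dir/conda.txt, passing its lines through this function, and writing the result to temp_env_dir/requirements.txt would reproduce the second file of the workaround.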

@DarthHater
Member

pandas is listed so many times because it COULD be any number of those results. Essentially we only know what we know about your environment. The reason they are all brought back is that the matching is coordinate based, not hash based, so we can't tell, for example, whether you are using pandas for Python 3.9, for Linux, etc...
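
A toy illustration of why coordinate matching returns several rows while hash matching would return one. The catalog below is entirely hypothetical: the filenames and hash values are placeholders, not real IQ data, and the function names are made up for this sketch.

```python
from collections import namedtuple

Artifact = namedtuple("Artifact", "name version filename sha1")

# Hypothetical catalog entries; filenames and hashes are placeholders.
CATALOG = [
    Artifact("pandas", "1.2.2", "pandas-1.2.2.tar.gz", "hash-a"),
    Artifact("pandas", "1.2.2", "pandas-1.2.2-<py39-linux>.whl", "hash-b"),
    Artifact("pandas", "1.2.2", "pandas-1.2.2-<py38-windows>.whl", "hash-c"),
    Artifact("pandas", "1.2.1", "pandas-1.2.1.tar.gz", "hash-d"),
]


def match_by_coordinate(name, version):
    """Every artifact sharing the name and version is a possible match."""
    return [a for a in CATALOG if (a.name, a.version) == (name, version)]


def match_by_hash(sha1):
    """A file hash identifies one specific artifact."""
    return [a for a in CATALOG if a.sha1 == sha1]


print(len(match_by_coordinate("pandas", "1.2.2")))  # 3
print(len(match_by_hash("hash-b")))                 # 1
```

A scanner that only knows "pandas 1.2.2" cannot pick between the three candidates, so it reports all of them.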

There are some legal reasons it becomes difficult to merge Conda and PyPI, mainly in that the Conda repository has a TOS that prevents scraping information (I am not a lawyer, but that's my basic understanding). There's realistically only so much that can be done with that in place.

Love the feedback, by the way!

Jake is open source, obviously. If you feel really strongly about anything, I'm totally down for PRs, etc... (if you've got the time; I also know how that goes :))

@madpah
Collaborator

madpah commented Jan 17, 2022

Closing due to inactivity - please re-open if the issue persists.

@madpah madpah closed this as completed Jan 17, 2022