In [1]:
#| hide
import kglab
import pandas as pd
from sbom_analysis.core import *

# numpy

Let's look at [numpy](https://github.com/numpy/numpy), a widely used and mature Python data science package, to analyze how accurate and complete our SBOM generation tools are.

## Known Dependencies

According to numpy's installation page:
> "NumPy doesn’t depend on any other Python packages, however, it does depend on an accelerated linear algebra library - typically Intel MKL or OpenBLAS. Users don’t have to worry about installing those (they’re automatically included in all NumPy install methods)."

To test this, let's pretend we are creating an empty conda environment with just numpy to see what packages would be installed:

In [10]:
!echo 'n' | conda create -n numpy numpy

Collecting package metadata (current_repodata.json): done
Solving environment: done


  current version: 22.11.1
  latest version: 23.5.0

Please update conda by running

    $ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

     conda install conda=23.5.0



## Package Plan ##

  environment location: /afs/crc.nd.edu/user/p/painswor/.conda/envs/numpy

  added / updated specs:
    - numpy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2023.01.10 |       h06a4308_0         120 KB
    intel-openmp-2023.1.0      |   hdb19cb5_46305        17.1 MB
    libffi-3.4.4               |       h6a678d5_0         142 KB
    libuuid-1.41.5             |       h5eee18b_0          27 KB
    mkl-2023.1.0               |   h6d00ec8_46342       171.5 MB
    mkl-service-2.4.0          |  py311h5eee18b_1          54 KB
    mkl_f

Looking at the output above, we see a number of dependencies. Most of these are required for Python (which numpy itself requires), but some are the C/C++ libraries needed for linear algebra. So, how will a generated SBOM reflect these?
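To make the table above easier to reason about, here is a minimal sketch that splits conda's `package | build` rows into name, version, and build string. The `parse_conda_rows` helper and the `ROW` pattern are hypothetical names introduced for illustration, and the sample rows are copied from the output above; the sketch assumes each row has the form `name-version | build`.

```python
import re

# A few rows copied from the "packages will be downloaded" table above.
SAMPLE = """\
    ca-certificates-2023.01.10 |       h06a4308_0         120 KB
    intel-openmp-2023.1.0      |   hdb19cb5_46305        17.1 MB
    mkl-2023.1.0               |   h6d00ec8_46342       171.5 MB
    mkl-service-2.4.0          |  py311h5eee18b_1          54 KB
"""

# Assumption: the version is the last hyphen-separated piece before the "|".
ROW = re.compile(r"^\s*(?P<name>\S+)-(?P<version>[^-\s]+)\s*\|\s*(?P<build>\S+)")


def parse_conda_rows(text):
    """Return (name, version, build) tuples for every row that matches ROW."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match["name"], match["version"], match["build"]))
    return rows


for name, version, build in parse_conda_rows(SAMPLE):
    # Flag the MKL packages, the linear algebra library named in the quote above.
    tag = "linear algebra" if name.startswith("mkl") else "other"
    print(f"{name:20} {version:12} {tag}")
```

Lines that do not look like `name-version | build`, such as the table header, are skipped rather than raising an error.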

## microsoft/sbom-tool 

The sbom-tool queries [PyPI](https://pypi.org/) for all package dependencies. Since numpy is uploaded to PyPI, we can query it directly before running the tool on the repo:

In [11]:
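import requests

# Fetch numpy's metadata from the PyPI JSON API
response = requests.get('https://pypi.org/pypi/numpy/json', timeout=30)
response.raise_for_status()
data = response.json()
# requires_dist lists the declared Python dependencies (None if there are none)
print(data['info']['requires_dist'])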

None


PyPI shows no dependencies for numpy, which makes sense as it has no Python dependencies. **This means that when using the sbom-tool on a package that depends on numpy, no child dependencies will be found for numpy.**
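As a rough illustration of what a `requires_dist` value of `None` means for dependency resolution, the sketch below extracts declared dependency names from a PyPI-style JSON payload. The `declared_dependencies` helper and the `example-pkg` payload are hypothetical and exist only for this example; the numpy-like payload mirrors the `None` result shown above. It runs offline on in-memory dictionaries.

```python
import re


def declared_dependencies(pypi_payload):
    """Return the project names listed under info.requires_dist, if any."""
    # requires_dist is None when a package declares no Python dependencies.
    requires = pypi_payload.get("info", {}).get("requires_dist") or []
    names = []
    for spec in requires:
        # A requirement string starts with the project name, followed by
        # optional extras, version specifiers, or environment markers.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
        if match:
            names.append(match.group(1))
    return names


# Mirrors the result of the query above: no declared dependencies.
numpy_like = {"info": {"name": "numpy", "requires_dist": None}}

# Hypothetical package that depends on numpy, for contrast.
hypothetical = {
    "info": {
        "name": "example-pkg",
        "requires_dist": ["numpy>=1.20", "requests (>=2.0) ; extra == 'http'"],
    }
}

print(declared_dependencies(numpy_like))    # → []
print(declared_dependencies(hypothetical))  # → ['numpy', 'requests']
```

A resolver that relies only on this metadata would list numpy as a dependency of `example-pkg` but would find nothing beneath numpy, so the MKL or OpenBLAS libraries never appear.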

Running the tool against the repo, cloned as is, produced **0 packages**. As established, the sbom-tool uses [microsoft/component-detection](https://github.com/microsoft/component-detection) to detect all packages used in the repository. This tool looks for either `setup.py` or `requirements.txt`.

In the numpy repo there are:

- `setup.py`
- `build_requirements.txt`
- `doc_requirements.txt`
- `linter_requirements.txt`
- `release_requirements.txt`
- `test_requirements.txt`
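To compare this list with the two file names the detector looks for, the sketch below separates names that match `setup.py` or `requirements.txt` exactly from names that merely end in `requirements.txt`. This is not component-detection's actual matching logic, which is not described here; `classify_manifests` and `EXACT_NAMES` are hypothetical names, and the input uses several of the file names listed above.

```python
# The two file names the text says the detector looks for.
EXACT_NAMES = {"setup.py", "requirements.txt"}


def classify_manifests(filenames):
    """Split names into exact matches and '*requirements.txt' near-matches."""
    exact = sorted(n for n in filenames if n in EXACT_NAMES)
    near = sorted(
        n for n in filenames
        if n not in EXACT_NAMES and n.endswith("requirements.txt")
    )
    return exact, near


repo_files = [
    "setup.py",
    "build_requirements.txt",
    "doc_requirements.txt",
    "linter_requirements.txt",
    "release_requirements.txt",
]

exact, near = classify_manifests(repo_files)
print("exact matches:", exact)  # → ['setup.py']
print("near matches:", near)
```

Under the assumption of exact-name matching, only `setup.py` would qualify here, and none of the prefixed requirements files would.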