In [None]:
#| hide
import kglab
import pandas as pd
from sbom_analysis.core import *

# PyTorch

SBOM Source: [pytorch/pytorch](https://github.com/pytorch/pytorch) generated using [microsoft/sbom-tool](https://github.com/microsoft/sbom-tool)

RDF Source: Generated using [pyspdxtools](https://github.com/spdx/tools-python)

## Generated SBOM

The `SBOM Tool` scans the build components path, which is typically the source folder, to locate project files such as `*.csproj`, `requirements.txt` or `*.lock`. By analyzing these files, the tool determines the components that were used to build the project. During this process, the `SBOM Tool` uses [`ComponentDetection`](https://github.com/microsoft/component-detection) to scan for components and their dependencies.

In the PyTorch repository, [`ComponentDetection`](https://github.com/microsoft/component-detection) detected the following project files:

- `./ios/TestApp/Gemfile.lock`
- `./tools/build/bazel/requirements.txt`
- `./functorch/docs/requirements.txt`
- `./caffe2/requirements.txt`
- `./requirements.txt`
- `./docs/requirements.txt`
- `./scripts/release_notes/requirements.txt`
- `./docs/cpp/requirements.txt`
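
As a rough illustration of this kind of file discovery, the sketch below walks a folder and collects files matching the three patterns named above. It is not how `ComponentDetection` is implemented; the function name `find_manifests` and the pattern tuple are assumptions made for this example, and the real detectors recognise many more file types.

```python
from pathlib import Path

# Patterns taken from the examples in the text; this is an illustrative
# subset, not the list of file types ComponentDetection actually supports.
MANIFEST_PATTERNS = ("*.csproj", "requirements.txt", "*.lock")


def find_manifests(root, patterns=MANIFEST_PATTERNS):
    """Return the sorted relative paths under `root` that match any pattern."""
    root = Path(root)
    found = set()
    for pattern in patterns:
        # rglob searches `root` and all of its subdirectories
        for path in root.rglob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Calling `find_manifests(".")` from the root of a checkout would list candidate project files in the same relative form as the list above.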


For each project file, a dependency graph is defined, but unfortunately this graph is not used when the SBOM file is generated.
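
To make the consequence concrete, the sketch below contrasts a nested dependency graph with the flat list of relationships that results when the graph is discarded and every package is attached directly to the root. The package names appear in this SBOM, but the edges between them are hypothetical and chosen only for illustration.

```python
# Hypothetical dependency graph: each key depends on the packages in its list.
# These edges are made up for illustration and are not taken from the
# PyTorch project files.
dependency_graph = {
    "PyTorch": ["matplotlib", "tensorboard"],
    "matplotlib": ["tqdm"],
    "tensorboard": [],
    "tqdm": [],
}


def edges(graph):
    """Return every (package, dependency) pair of the original graph."""
    return sorted((pkg, dep) for pkg, deps in graph.items() for dep in deps)


def flatten(graph, root):
    """Return (root, package) pairs for every package reachable from root."""
    seen = set()
    stack = list(graph[root])
    while stack:
        package = stack.pop()
        if package not in seen:
            seen.add(package)
            stack.extend(graph.get(package, []))
    return sorted((root, package) for package in seen)


full = edges(dependency_graph)               # keeps matplotlib -> tqdm
flat = flatten(dependency_graph, "PyTorch")  # every package hangs off the root
```

In the flattened form, the information that `tqdm` is only needed through `matplotlib` is lost, which is the kind of detail a consumer of the SBOM can no longer recover.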

In total, **225 packages** were found: **135 pip** and **90 Ruby** packages.

No files were detected. The SBOM contains **225 relationships** of the type `DEPENDS_ON`, one for each package.

### SBOM Analysis

Let's analyze how accurate and complete the SBOM for this project is.

First, import the KG form of the SBOM specified in the header above.

In [None]:
kg = kglab.KnowledgeGraph()
kg.load_rdf("../../sboms/rdf/pytorch.rdf.xml", format="xml")

<kglab.kglab.KnowledgeGraph>

### Basic Metadata

Let's start by looking at the overall size of the KG.

In [None]:
show_metadata(kg)

Total Triples: 4076
Distinct Entities: 680
Distinct Properties: 24


In [None]:
show_measures(kg)

edges 4076
nodes 692


Already, this looks like a small graph.
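
The helper functions do not show how these numbers are computed. One plausible reading, sketched below on toy data, is that edges are triples, nodes are the distinct subjects and objects, and properties are the distinct predicates. The triples and identifiers here are illustrative assumptions and are not copied from the PyTorch SBOM.

```python
# Toy triples standing in for the knowledge graph (subject, predicate, object).
# The identifiers are illustrative, not taken from the real SBOM.
triples = [
    ("doc", "rdf:type", "spdx:SpdxDocument"),
    ("pkg1", "rdf:type", "spdx:Package"),
    ("pkg1", "spdx:name", "tqdm"),
    ("pkg2", "rdf:type", "spdx:Package"),
    ("pkg2", "spdx:name", "unf"),
]

# Every triple is one edge of the graph.
edge_count = len(triples)

# Nodes are everything that appears as a subject or an object.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}

# Properties are the distinct predicates; subjects are the described resources.
properties = {p for _, p, _ in triples}
subjects = {s for s, _, _ in triples}
```

Under this reading, the node count is at least as large as the number of distinct subjects, because literal values and type IRIs count as nodes too.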

### Packages

In [None]:
package_schema(kg)

Unnamed: 0,property
0,spdx:copyrightText
1,spdx:downloadLocation
2,spdx:externalRef
3,spdx:filesAnalyzed
4,spdx:licenseConcluded
5,spdx:licenseDeclared
6,spdx:licenseInfoFromFiles
7,spdx:name
8,spdx:packageVerificationCode
9,spdx:relationship


In [None]:
packages = get_package_data(kg)
packages

Unnamed: 0,package,annotations,attributionTexts,checksums,copyrightText,downloadLocation,externalRefs,hasFiles,licenseConcluded,licenseDeclared,licenseInfoFromFiles,name,packageVerificationCode,supplier,versionInfo,relationships
0,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,spdx:noassertion,,,spdx:noassertion,spdx:noassertion,,PyTorch,_:N5158fb3e1d49434ba603b3e1cac432b9,Organization: pytorch,2.0.1,"Na5371014d0744fa889463b66da2a4b13, N3d105ed8f1..."
1,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,https://rubygems.org/,N072d33db8b0640c0a67f046442d71f6e,,spdx:noassertion,spdx:noassertion,,faraday-net_http_persistent,,NOASSERTION,1.2.0,
2,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,https://rubygems.org/,N2b4cf4bd658240ea9faa89424f36c4a8,,spdx:noassertion,spdx:noassertion,,emoji_regex,,NOASSERTION,3.2.3,
3,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,spdx:noassertion,N15a4af78e7084b20a9c5b2dc81cabd4d,,spdx:noassertion,spdx:noassertion,,matplotlib,,NOASSERTION,3.6.0,
4,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,spdx:noassertion,N68929099a2814d318464ac8f437c2e77,,spdx:noassertion,spdx:noassertion,,tensorboard,,NOASSERTION,2.10.0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
221,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,https://rubygems.org/,Nca4704d9fec1409992a888bca8d96405,,spdx:noassertion,spdx:noassertion,,simctl,,NOASSERTION,1.6.8,
222,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,https://rubygems.org/,Nb4a1fb4ffd11456faef76db1892e9ce8,,spdx:noassertion,spdx:noassertion,,nanaimo,,NOASSERTION,0.3.0,
223,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,spdx:noassertion,N9af52b914e874d928dfc4c169352e544,,spdx:noassertion,spdx:noassertion,,tqdm,,NOASSERTION,4.65.0,
224,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,,,,NOASSERTION,https://rubygems.org/,Nfd513ea41d344679b39c5f5611158eb6,,spdx:noassertion,spdx:noassertion,,unf,,NOASSERTION,0.1.4,
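
In the rows shown above, the Ruby gems carry `https://rubygems.org/` as their download location, while the pip packages carry `spdx:noassertion`. The sketch below uses that observation as a heuristic to group packages by ecosystem. The rows are copied from the visible part of the table; the `ecosystem` helper is an assumption for this example and may not hold for every row of the full table.

```python
from collections import Counter

# (name, downloadLocation) pairs copied from the rows displayed above.
rows = [
    ("faraday-net_http_persistent", "https://rubygems.org/"),
    ("emoji_regex", "https://rubygems.org/"),
    ("matplotlib", "spdx:noassertion"),
    ("tensorboard", "spdx:noassertion"),
    ("simctl", "https://rubygems.org/"),
    ("nanaimo", "https://rubygems.org/"),
    ("tqdm", "spdx:noassertion"),
    ("unf", "https://rubygems.org/"),
]


def ecosystem(download_location):
    """Guess the ecosystem from the download location (heuristic only)."""
    if "rubygems.org" in download_location:
        return "ruby"
    # No download location was asserted, so the ecosystem cannot be inferred.
    return "unknown"


counts = Counter(ecosystem(location) for _, location in rows)
```

On the full table, `packages["downloadLocation"].value_counts()` gives the same kind of breakdown directly.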


### Files

In [None]:
file_schema(kg)

In [None]:
get_files_data(kg)

The first thing to point out here is that the SBOM contains no file entries at all, so no relationships involving files are specified either. If files are used as libraries within the project, it could be important to record them and to specify that as a relationship.

### Relationships

Finally, let's look at which relationships are specified in the KG.

In [None]:
rels = get_relationship_data(kg)
rels 

Unnamed: 0,element,elementType,relationshipType,relatedElement,relatedElementType
0,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
1,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
2,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
3,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
4,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
...,...,...,...,...,...
221,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
222,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
223,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
224,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package


In [None]:
rels.describe()

Unnamed: 0,element,elementType,relationshipType,relatedElement,relatedElementType
count,226,226,226,226,226
unique,2,2,2,226,1
top,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
freq,225,225,225,1,226
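
For non-numeric columns, `describe()` reports the number of values (`count`), the number of distinct values (`unique`), the most frequent value (`top`) and how often it occurs (`freq`). The sketch below reproduces those four figures for a toy column. The `describes` value is a placeholder assumption, since the second relationship type in this SBOM is not shown in the output.

```python
from collections import Counter


def describe_column(values):
    """Summarise a column the way describe() does for non-numeric data."""
    present = [v for v in values if v is not None]
    top, freq = Counter(present).most_common(1)[0]
    return {
        "count": len(present),
        "unique": len(set(present)),
        "top": top,
        "freq": freq,
    }


# Toy column: three dependsOn relationships and one placeholder type.
column = ["dependsOn", "dependsOn", "dependsOn", "describes"]
summary = describe_column(column)
```

Read this way, the `relationshipType` column above contains 226 values of two distinct types, and 225 of them are `spdx:relationshipType_dependsOn`.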


The summary shows that nearly all relationships (225 of 226) are of the `spdx:relationshipType_dependsOn` type. Let's filter out any `spdx:relationshipType_contains` relationships to see what remains.

In [None]:
rels[~rels['relationshipType'].str.contains('spdx:relationshipType_contains')]

Unnamed: 0,element,elementType,relationshipType,relatedElement,relatedElementType
0,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
1,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
2,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
3,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
4,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
...,...,...,...,...,...
221,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
222,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
223,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package
224,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package,spdx:relationshipType_dependsOn,<https://spdx.org/spdxdocs/sbom-tool-1.1.2-49b...,spdx:Package


**Relationship graph visualization**

In [None]:
# get the relationship graph to be visualized
graph = visualize_relationship_graph(kg)

# optional: set the physics layout of the network
graph.force_atlas_2based()
graph.set_edge_smooth('dynamic')

# show graph
graph.show("../figs/fig06.relationship_full.html")

../figs/fig06.relationship_full.html


The color of each node in the graph indicates its element type in the SPDX specification:

In [None]:
display_relationship_graph_legend()

Unnamed: 0,SPDX Type,Node Color
0,File,Yellow
1,Package,Blue
2,SPDXDocument,Red


### Other Elements

Here's what the KG contains besides Packages, Files, and Relationships:

In [None]:
query = """
SELECT DISTINCT ?type
WHERE {
    ?element rdf:type ?type
}
"""
kg.query_as_df(query)

Unnamed: 0,type
0,spdx:SpdxDocument
1,spdx:Relationship
2,spdx:Package
3,spdx:ExternalRef
4,spdx:PackageVerificationCode
5,spdx:CreationInfo


None of these types is specific to AI workflows yet.