In [None]:
#| hide
import kglab
import pandas as pd
from sbom_analysis.core import *

pd.set_option("display.precision", 2)
#pd.set_option('display.max_colwidth', None)

# [X](https://X)


On this page, we will analyze the SBOM generated by the [X](https://X) tool for the [PyTorch](https://github.com/pytorch/pytorch) GitHub Repository. The overall analysis for X is available [here](../../tool_analysis/tools_overall_analysis.qmd#fossa).


The SPDX SBOM was generated in JSON format and converted to RDF/XML using [pyspdxtools](https://github.com/spdx/tools-python).
It is a valid SPDX file, which can be checked with the [SPDX online validator](https://tools.spdx.org/app/validate/).

## SBOM size

In [None]:

kg = kglab.KnowledgeGraph()
kg.load_rdf("../../../data/tools_cs1/sboms/rdf/pytorch-X-spdx23.rdf.xml", format="xml")

print("Files:", len(get_files_data(kg)))
print("Packages:", len(get_package_data(kg)))
print("Relationships:", len(get_relationship_data(kg)))

## Is this SBOM NTIA minimum element conformant? False


| Individual elements                        | Status |
|-------------------------------------------|--------|
| All component names provided?              | True   |
| All component versions provided?           | False  |
| All component identifiers provided?        | True   |
| All component suppliers provided?          | False  |
| SBOM author name provided?                 | True   |
| SBOM creation timestamp provided?          | True   |
| Dependency relationships provided?         | True   |

**Components missing a version**: click, filelock, fsspec, jinja2, networkx, numpy, psutil, PyGithub, pytest, pytest-xdist, PyYAML, requests, setuptools, tqdm, typing-extensions

**Components missing a supplier**: git@github.com:pytorch/pytorch.git, astunparse, boto3, breathe, bs4, certifi, charset-normalizer, click, cmake, coremltools, docutils, enum34, exhale, expecttest, expecttest, fastlane, filelock, filelock, flake8, flake8-bugbear, flake8-comprehensions, flake8-executable, flake8-logging-format, flake8-pyi, flatbuffers, fsspec, future, ghstack, hypothesis, hypothesis, hypothesis, idna, ipython, jinja2, jinja2, jinja2, junitparser, libopenblas, librosa, lintrunner, matplotlib, mccabe, mpmath, mypy, myst-nb, myst-parser, networkx, networkx, networkx, ninja, ninja, numba, numba, numba, numba, numba, numpy, numpy, nvidia-ml-py, opt-einsum, protobuf, psutil, psutil, pycodestyle, pyflakes, PyGithub, Pygments, pytest, pytest, pytest-cpp, pytest-flakefinder, pytest-rerunfailures, pytest-shard, pytest-xdist, pytest-xdist, python-etcd, PyYAML, PyYAML, requests, requests, rich, rockset, scikit-image, scikit-image, scipy, scipy, scipy, scipy, setuptools, setuptools, six, Sphinx, sphinxcontrib-katex, sphinx-copybutton, sphinx-panels, sympy, sympy, tb-nightly, tensorboard, tqdm, types-dataclasses, typing-extensions, typing-extensions, unittest-xml-reporting, urllib3, xdoctest

Source: [ntia_checker](https://tools.spdx.org/app/ntia_checker/)
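
The version and supplier checks in the table can be approximated directly on an SPDX JSON document. The following is a minimal sketch, not the ntia_checker implementation: it assumes the SPDX 2.3 JSON field names `packages`, `name`, `versionInfo`, and `supplier`, and treats a field as missing only when it is absent or empty. The helper `missing_fields` and the sample document are hypothetical.

```python
def missing_fields(packages, field):
    """Return the names of packages whose `field` is absent or empty."""
    missing = []
    for pkg in packages:
        if not pkg.get(field):
            missing.append(pkg.get("name", "<unnamed>"))
    return missing


# Hypothetical sample that mimics the package list of an SPDX 2.3 JSON document
sample_doc = {
    "packages": [
        {"name": "astunparse", "versionInfo": "1.6.3"},
        {"name": "click"},
        {
            "name": "example-pkg",
            "versionInfo": "1.0",
            "supplier": "Organization: Example",
        },
    ]
}

no_version = missing_fields(sample_doc["packages"], "versionInfo")
no_supplier = missing_fields(sample_doc["packages"], "supplier")

print("Components missing a version:", ", ".join(no_version))
print("Components missing a supplier:", ", ".join(no_supplier))
```

Because the check works per package entry, a component that appears several times in the SBOM is reported once per entry, which would explain repeated names in lists like the ones above.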

## Quality Score

In [None]:
dir_qs = "../../../data/tools_cs1/sbomqs/"
sbomqs_df, feature_qscores = sbomqs_scores(dir_qs)
display_qscores_with_descriptions(feature_qscores, tool_list=['X'])

## Dependencies

In [None]:
# get the relationship graph to be visualized
graph = visualize_relationship_graph(kg)

# optional: set the physics layout of the network
graph.force_atlas_2based()
graph.set_edge_smooth('dynamic')

# show graph
graph.show("../../figs/cs1-X.relationship_full.html")



**note_1:**

The graph has twice as many edges as expected.
This occurs because the main repository has two SPDX IDs, and since every package is related to the main repository, each relationship produces two edges.

**note_2:**

The SBOM file contains packages with duplicate SPDX IDs. For example, the following two package entries share the same ID:

```bash        
SPDXID	:	SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-fbbde8df69577fa52a6e354b930a2fe4e921ae92
name	:	git@github.com:pytorch/pytorch.git
versionInfo	:	fbbde8df69577fa52a6e354b930a2fe4e921ae92
filesAnalyzed	:	true
downloadLocation	:	NOASSERTION
originator	:	Organization: Custom (provided build)
licenseDeclared	:	NONE
copyrightText	:	NONE
licenseConcluded	:	NOASSERTION

SPDXID	:	SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-fbbde8df69577fa52a6e354b930a2fe4e921ae92
name	:	38450/git@github.com:pytorch/pytorch.git
versionInfo	:	fbbde8df69577fa52a6e354b930a2fe4e921ae92
downloadLocation	:	NOASSERTION
comment	:	Incomplete dependency
supplier	:	Organization: Custom (provided build)
filesAnalyzed	:	false
```   
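
Duplicates like this can be detected by counting how often each SPDX ID occurs in the package list. The snippet below is a minimal sketch under the assumption that the packages are available as dictionaries with an `SPDXID` key, as in the SPDX JSON format. The helper `duplicate_spdx_ids` is hypothetical, and the sample data reuses IDs that appear on this page.

```python
from collections import Counter


def duplicate_spdx_ids(packages):
    """Map every SPDXID used by more than one package to its occurrence count."""
    counts = Counter(pkg["SPDXID"] for pkg in packages)
    return {spdx_id: n for spdx_id, n in counts.items() if n > 1}


REPO_ID = (
    "SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-"
    "fbbde8df69577fa52a6e354b930a2fe4e921ae92"
)

# Sample package list: the first two entries share one SPDXID
packages = [
    {"SPDXID": REPO_ID, "name": "git@github.com:pytorch/pytorch.git"},
    {"SPDXID": REPO_ID, "name": "38450/git@github.com:pytorch/pytorch.git"},
    {"SPDXID": "SPDXRef-pip-astunparse-1.6.3", "name": "astunparse"},
]

duplicates = duplicate_spdx_ids(packages)
for spdx_id, n in duplicates.items():
    print(f"{spdx_id} is used by {n} packages")
```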

**note_3:**

Between the same two nodes, the same dependency is recorded in both directions (`DEPENDS_ON` and its inverse `DEPENDENCY_OF`), for example:

```bash
spdxElementId	:	SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-fbbde8df69577fa52a6e354b930a2fe4e921ae92
relationshipType	:	DEPENDS_ON
relatedSpdxElement	:	SPDXRef-pip-astunparse-1.6.3

spdxElementId	:	SPDXRef-pip-astunparse-1.6.3
relationshipType	:	DEPENDENCY_OF
relatedSpdxElement	:	SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-fbbde8df69577fa52a6e354b930a2fe4e921ae92
```
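
Since `A DEPENDENCY_OF B` states the same fact as `B DEPENDS_ON A`, such pairs can be collapsed into a single edge before plotting. The following is a minimal sketch of that idea, assuming relationships are represented as `(source, type, target)` tuples. The helper `normalize` is hypothetical and is not part of `sbom_analysis`.

```python
def normalize(rel):
    """Rewrite a DEPENDENCY_OF relationship as the equivalent DEPENDS_ON triple."""
    source, rel_type, target = rel
    if rel_type == "DEPENDENCY_OF":
        # A DEPENDENCY_OF B is the inverse of B DEPENDS_ON A
        return (target, "DEPENDS_ON", source)
    return rel


REPO = (
    "SPDXRef-custom-38450-git-github.com-pytorch-pytorch.git-"
    "fbbde8df69577fa52a6e354b930a2fe4e921ae92"
)
ASTUNPARSE = "SPDXRef-pip-astunparse-1.6.3"

# The two relationships from the excerpt above
relationships = [
    (REPO, "DEPENDS_ON", ASTUNPARSE),
    (ASTUNPARSE, "DEPENDENCY_OF", REPO),
]

# A set drops the duplicate that results from normalization
unique_edges = sorted({normalize(rel) for rel in relationships})
print(len(relationships), "relationships ->", len(unique_edges), "unique edge(s)")
```

Normalizing toward `DEPENDS_ON` is an arbitrary choice; either direction works as long as it is applied consistently.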
    

In [None]:
import kglab
from pyvis.network import Network


if True:
    def get_node_title(elmName: str, elmType: str, elmVersion: str, elemPurpose: str) -> str:
        """
        Create a node title. 
        The title will be the node hover text.
        """
        nodeTitle = f"{elmType}: {elmName}"
        if elmVersion:
            nodeTitle += f"\nVersion:{elmVersion}"
        if elemPurpose:
            nodeTitle += "\nPurpose: " + elemPurpose.split("purpose_")[1]
        return nodeTitle

    def get_node_label(elmName: str, elmVersion: str) -> str:
        """
        Create a node label. 
        The label will be the text under the node.
        """
        nodeLabel = elmName
        if elmVersion: nodeLabel += "==" + elmVersion
        return nodeLabel
    
       
    VIS_STYLE = { 
        'SpdxDocument': {
            "color": "#DE3163",
            "size": 20,
        },
        'Package': {
            "color": "#99ccff",
            "size": 20,
        },
        'File': {
            "color": "#FFBF00",
            "size": 15,
        },
    }
        
    SPDX_NS = "http://spdx.org/rdf/terms#"
    QUERY = """
    PREFIX spdx:<http://spdx.org/rdf/terms#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    
    SELECT
        ?element
        ?elementName
        ?elementType
        ?elementVersionInfo
        ?elementPrimaryPackagePurpose
        ?relatedElement
        ?relationshipType
        ?relatedElementName
        ?relatedElementType
        ?relatedElementVersionInfo
        ?relatedElementPrimaryPackagePurpose
        
    WHERE {
        ?element spdx:relationship ?relationship .
        ?element rdf:type ?elementType .
        ?relationship spdx:relatedSpdxElement ?relatedElement .
        ?relationship spdx:relationshipType ?relationshipType .
        ?relatedElement rdf:type ?relatedElementType .
        
        OPTIONAL { ?element spdx:name ?elementName . }
        OPTIONAL { ?element spdx:fileName ?elementName . }
        OPTIONAL { ?element spdx:primaryPackagePurpose ?elementPrimaryPackagePurpose . }
        OPTIONAL { ?relatedElement spdx:name ?relatedElementName . }
        OPTIONAL { ?relatedElement spdx:fileName ?relatedElementName . }
        OPTIONAL { ?element spdx:versionInfo ?elementVersionInfo .}
        OPTIONAL { ?relatedElement spdx:versionInfo ?relatedElementVersionInfo .}
        OPTIONAL { ?relatedElement spdx:primaryPackagePurpose ?relatedElementPrimaryPackagePurpose . }
    }
    """
    
    # run query
    query_result = kg.query(QUERY)

    # create a graph of the relationships using Network
    relationship_graph = Network(notebook=True, directed=True, cdn_resources="remote")
    
    # update the graph of the relationships based on the query_result
    for row in query_result:
        
        # element
        elementName = str(row.elementName)
        elementType = str(row.elementType).split(SPDX_NS)[-1]
        elementVersionInfo = row.elementVersionInfo
        elementPrimaryPackagePurpose= row.elementPrimaryPackagePurpose
        
        # relationship
        relationshipTypeName = row.relationshipType.split("relationshipType_")[1]
        
        # relatedElement        
        relatedElementName = str(row.relatedElementName)
        relatedElementType = str(row.relatedElementType).split(SPDX_NS)[-1]
        relatedElementVersionInfo = row.relatedElementVersionInfo
        relatedElementPrimaryPackagePurpose = row.relatedElementPrimaryPackagePurpose

        ## update graph
        # element Node info
        elementNodeId = row.element 
        elementNodeLabel = get_node_label(elementName, elementVersionInfo)
        elementNodeTitle = get_node_title(elementName, elementType, 
                                          elementVersionInfo, elementPrimaryPackagePurpose)
        elementNodeColor = VIS_STYLE[elementType]['color']
        elementNodeSize = VIS_STYLE[elementType]['size']        
        
        # relatedElement Node info
        relatedElementNodeId = row.relatedElement 
        relatedElementNodeLabel = get_node_label(relatedElementName, relatedElementVersionInfo)
        relatedElementNodeTitle = get_node_title(relatedElementName, relatedElementType, 
                                                 relatedElementVersionInfo, relatedElementPrimaryPackagePurpose)
        relatedElementNodeColor = VIS_STYLE[relatedElementType]['color']
        relatedElementNodeSize = VIS_STYLE[relatedElementType]['size']            
        
        # add nodes (elementName, relatedElementName) to the graph
        relationship_graph.add_node(elementNodeId,
                                    label = elementNodeLabel,
                                    title = elementNodeTitle,
                                    color = elementNodeColor,
                                    size = elementNodeSize
                                   )
        relationship_graph.add_node(relatedElementNodeId,
                                    label = relatedElementNodeLabel,
                                    title = relatedElementNodeTitle,
                                    color = relatedElementNodeColor,
                                    size = relatedElementNodeSize
                                   )
        # add the edge (element -> relatedElement) to the graph
        relationship_graph.add_edge(elementNodeId,
                                    relatedElementNodeId,
                                    title = relationshipTypeName,
                                    label = relationshipTypeName # text over the edge
                                   )      
        
        # debug output: print each element / related-element pair
        print(elementName)
        print(relatedElementName)
        print()
        
        
relationship_graph