# The Software Bill of Materials (SBOM) Use Case

Following on from the data loading in `loader.ipynb`, this notebook demonstrates how to perform **Software Bill of Materials (SBOM) Analysis** using Cypher queries on the Neo4j graph database.

This use case shows how a security team can achieve full **Code-to-Cloud traceability**. We start by identifying vulnerabilities in direct dependencies, then uncover "hidden" risks in **transitive dependencies**. Finally, we map these software components to the running infrastructure to show the real-world exposure of the software supply chain.

### Key Capabilities Demonstrated:

* **Transitive Risk Detection**: Navigating the recursive `DEPENDENCY_OF` relationships to find vulnerabilities buried multiple layers deep in the stack.
* **Widespread Impact Analysis**: Instantly identifying every application across the enterprise that utilizes a specific vulnerable library or version.
* **Contextual Prioritization**: Determining which library patches are most urgent by correlating the SBOM with internet-facing infrastructure and "Crown Jewel" data.

In [1]:
from dotenv import load_dotenv
import os
from neo4j import GraphDatabase

load_dotenv()

# Connection details
URI = os.getenv("NEO4J_URI", "bolt://localhost:7687")
AUTH = (os.getenv("NEO4J_USER", "neo4j"), os.getenv("NEO4J_PASSWORD", "password"))
DB = os.getenv("NEO4J_DB", "nvd")

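# A helper function to run Cypher queries
def run_cypher(query, parameters=None):
    """Run a Cypher query against DB and return the rows as a list of dicts."""
    # A driver is opened per call for simplicity; reuse one driver in long-running code.
    driver = GraphDatabase.driver(URI, auth=AUTH)
    try:
        with driver.session(database=DB) as session:
            result = session.run(query, parameters or {})
            # .data() consumes the result stream into a list of dictionaries
            return result.data()
    finally:
        driver.close()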
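
The helper accepts an optional `parameters` dictionary, which keeps values such as library names out of the query text. As a minimal sketch of the widespread impact question from the capabilities list, the snippet below builds a parameterized query that lists every application reachable from one named library. It assumes `Library` nodes carry a `name` property, as the later queries in this notebook suggest; the names `impact_query` and `impact_params` are hypothetical, and the final call is left commented out because it needs a live database.

```python
# Sketch only: relationship and label names follow the queries used in this notebook.
impact_query = """
MATCH (lib:Library {name: $lib_name})
MATCH (lib)-[:DEPENDENCY_OF*1..3]->(:BuildArtifact)-[:RUNNING_AS]->(app:Application)
RETURN DISTINCT app.name AS Affected_App
ORDER BY Affected_App
"""

# The library name is an illustrative value passed as a query parameter.
impact_params = {"lib_name": "vulnerable-codec"}

# With a live database and the run_cypher helper defined above:
# rows = run_cypher(impact_query, impact_params)
```

Passing the name as `$lib_name` rather than formatting it into the string lets the same query be reused for any library.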

## The Transitive Risk Discovery

This query answers the question: "Which applications are at risk because they depend, directly or transitively (up to three hops), on a library with a known CVE, such as 'vulnerable-codec'?"

In [None]:
import pandas as pd

query = """
MATCH (v:CVE)-[:IDENTIFIED_IN]->(targetLib:Library)
MATCH path = (targetLib)-[:DEPENDENCY_OF*1..3]->(art:BuildArtifact)-[:RUNNING_AS]->(app:Application)
RETURN v.id AS CVE, 
       targetLib.name AS Vulnerable_Lib, 
       app.name AS Affected_App, 
       nodes(path) AS Dependency_Chain
"""

results = run_cypher(query)
# Display the results as a table (rendered as HTML in the notebook)
df = pd.DataFrame(results)
df

|   | Recent_CVE | Exposed_App | Hostname | IP |
|---|---|---|---|---|
| 0 | CVE-2021-44228 | CustomerFacingAPI | api-gateway-01 | 34.201.1.5 |
| 1 | CVE-2021-45046 | CustomerFacingAPI | api-gateway-01 | 34.201.1.5 |
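
To make the bounded pattern `DEPENDENCY_OF*1..3` more concrete, here is a small in-memory sketch of the same idea: a breadth-first walk that follows dependency edges for at most three hops and records how far each node is from the starting library. The toy graph and the function `reachable_within` are hypothetical and only illustrate the traversal; they assume an acyclic dependency graph and do not reflect how Neo4j evaluates the pattern internally.

```python
from collections import deque

# Hypothetical toy graph: key -> nodes it is a DEPENDENCY_OF.
DEPENDENCY_OF = {
    "vulnerable-codec": ["media-utils"],
    "media-utils": ["upload-service-lib"],
    "upload-service-lib": ["artifact:upload-api"],
}

def reachable_within(graph, start, max_hops=3):
    """Return {node: hop_count} for nodes reachable from start in 1..max_hops steps."""
    found = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            found[nxt] = hops + 1
            queue.append((nxt, hops + 1))
    return found

hits = reachable_within(DEPENDENCY_OF, "vulnerable-codec")
```

A hop count of 1 corresponds to a direct dependency; counts of 2 or 3 are the transitive cases that a scan of direct dependencies alone would miss.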


## Reachability Analysis for SBOM

The final query traces the path from a vulnerable library all the way to the running compute instances in the cloud, answering the question: "Show me the full path from a poisoned library to the internet-facing compute instances running affected applications."

In [None]:
query = """
MATCH (v:CVE)-[:IDENTIFIED_IN]->(l:Library)
MATCH path = (l)-[:DEPENDENCY_OF*1..3]->(:BuildArtifact)-[:RUNNING_AS]->(app:Application)-[:HOSTED_ON]->(ins:ComputeInstance)
WHERE ins.public_ip IS NOT NULL
RETURN v.id AS CVE, app.name AS App, ins.name AS Server, ins.public_ip AS IP
"""

results = run_cypher(query)
df = pd.DataFrame(results)
df

|   | Identity | Policy | Target_Resource |
|---|---|---|---|
| 0 | service-account-prod-s3 | DataLakeFullAccess | acme-customer-pii-data |
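
Rows returned by the reachability query can be summarized per server to help decide which host to patch first. The sketch below groups distinct CVE ids by server and public IP and ranks the servers by CVE count. It uses only the standard library and a few illustrative sample rows in the shape `run_cypher` returns for the columns `CVE`, `App`, `Server`, and `IP`; the function name `cves_per_server` is hypothetical.

```python
from collections import defaultdict

# Illustrative sample rows (list of dicts, one per query result row).
rows = [
    {"CVE": "CVE-2021-44228", "App": "CustomerFacingAPI",
     "Server": "api-gateway-01", "IP": "34.201.1.5"},
    {"CVE": "CVE-2021-45046", "App": "CustomerFacingAPI",
     "Server": "api-gateway-01", "IP": "34.201.1.5"},
]

def cves_per_server(rows):
    """Group distinct CVE ids by (Server, IP), most-exposed server first."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[(row["Server"], row["IP"])].add(row["CVE"])
    # Sort by descending CVE count, then by server name for a stable order.
    ranked = sorted(grouped.items(), key=lambda item: (-len(item[1]), item[0]))
    return [
        {"Server": server, "IP": ip, "CVE_Count": len(cves), "CVEs": sorted(cves)}
        for (server, ip), cves in ranked
    ]

summary = cves_per_server(rows)
```

A raw CVE count is only one possible ranking signal; severity scores or data sensitivity could be added as further sort keys if the graph exposes them.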
