# Task 2: Follow-up from Perspectives on Python-oriented SBOM Generation Tools
Task 1 analysed current SBOM generation practices in the Python ecosystem and identified a fundamental limitation: existing tools rely primarily on metadata files or semi-dynamic installation processes, which leads to incomplete dependency discovery and inconsistent results. Through a critical synthesis of recent academic work, that first part motivated a new SBOM generation approach that shifts the focus from metadata-centric analysis to source-code-level dependency discovery.
This part is dedicated to the implementation of that approach. By using Python's Abstract Syntax Tree (AST) to extract imports directly from source code and construct a dependency graph, we complement existing methods and aim for 100% completeness.
Section 1 covers the choice of a data structure suited to graph problems. Section 2 presents the chosen datasets and explains why and how they were merged. Section 3 describes the AST analysis itself, applied first to dataset 1 and then to dataset 2. Section 4 discusses time complexity and limitations.

## Nota bene
1. In the first task, under Comparison and metrics, we suggested that correctness would also be assessed. This was a mistake; the focus of this project remains exclusively on completeness, that is, capturing all dependencies.
2. Because this project revolves around packages, running the AST analyser and obtaining the expected results requires a copy of the ***whole*** [repository](https://github.com/rtafurthgarcia/COM713): the datasets also contain the dependencies!

## 1. Data structures
One could naively model how packages relate to one another as a tree, where each package may have child packages. In fact, the literature often refers to it as a dependency tree. However, this term misrepresents the true nature of software dependencies: packages rarely have a single parent and may even form (problematic) cycles, per Tellnes' own introduction of the problem[1]. Thus, in professional software development, dependency graphs are the norm. For this reason, the core data structure featured in this document is shaped by graph theory. Goodrich et al.[2] provide us with four possible data structures:
1. edge list
2. adjacency list
3. adjacency map
4. adjacency matrix

Whilst Task 1 concluded that an adjacency list had to be used because `NodeVisitor` depended on it, it turned out that `ast.NodeVisitor` does not expect any specific type. We were thus free to choose any of those data structures. One could choose the edge list on a whim, for it is easy to implement and has the best bounds for the three functions we would use to construct the graph and compare it against others (`vertices()` being O(n), `insert_vertex()` and `insert_edge()` being O(1)), as described in [2, p.627]. But doing so would be a mistake: some projects have cycles, and the same import can be discovered several times, so we need to be able to avoid duplicates. With plain lists, checking for a duplicate before every insertion is O(n)[3]. It would therefore have been best to use an adjacency map, for its getter `get_edge()` is O(1). However, our dataset also needs to be exported and merged, and serialisation is a major challenge for behaviour-heavy objects like graphs. This last part went badly in practice, as visible in our ChatGPT transcript[4]. Hence we reverted to the first option, the edge list, but backed by sets instead of lists, for we only implement a small subset of functions where O(1) applies. Sets also allow for easy comparisons, which suits the later analyses. It is important to note that Python's sets make use of hash tables, so insertions and look-ups are O(1) on average[3]. Otherwise, we follow Goodrich et al.'s edge list implementation[2].
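
The trade-off between lists and sets can be illustrated with a minimal sketch. The `Edge` class and the package names `a` and `b` are purely illustrative and are not part of the project: the same import is discovered twice, and the two packages import each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Illustrative stand-in for an import relationship (hashable because frozen)."""
    importer: str
    imported: str


edges_as_list: list[Edge] = []
edges_as_set: set[Edge] = set()

# a -> b and b -> a form a cycle; a -> b is also discovered a second time
discovered = [Edge("a", "b"), Edge("b", "a"), Edge("a", "b")]

for edge in discovered:
    if edge not in edges_as_list:   # linear scan of the list on every insertion
        edges_as_list.append(edge)
    edges_as_set.add(edge)          # constant time on average, duplicates are ignored

print(len(edges_as_list), len(edges_as_set))
```

Both containers end up holding the same two edges, but only the list version pays for a linear scan on each insertion.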

In [23]:
from dataclasses import dataclass, field
from typing import Any

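@dataclass(eq=False, frozen=True, unsafe_hash=True)
class Package:
    name: str

    # O(1) => "simple" deserialisation
    @staticmethod
    def from_dict(d: dict) -> "Package":
        return Package(**d)

    # Two packages are equal when their names match; other types are never equal
    def __eq__(self, value: object) -> bool:
        if not isinstance(value, Package):
            return NotImplemented
        return self.name == value.name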

@dataclass(eq=True, frozen=True)
class ImportStatement():
    who_imports: Package
    who_is_imported: Package

    @staticmethod
    # O(1) => constant time
    def from_dict(d: dict) -> "ImportStatement":
        return ImportStatement(
            who_imports=Package.from_dict(d["imports"]),
            who_is_imported=Package.from_dict(d["imported"]),
        )
    
@dataclass(eq=True, frozen=True)
class DependencyGraph():
    packages: set[Package] = field(default_factory=set)
    import_statements: set[ImportStatement] = field(default_factory=set)

    # O(1) on average => set insertion; `level` is accepted but currently unused
    def insert_package(self, package_name: str, level: int) -> Package:
        new_package = Package(package_name)
        self.packages.add(new_package)

        return new_package

    # O(1) on average => set insertion, duplicates are silently ignored
    def insert_importstatement(self, imports: Package, imported: Package) -> ImportStatement:
        new_importstatement = ImportStatement(imports, imported)
        self.import_statements.add(new_importstatement)

        return new_importstatement

    # O(n) => linear time because we have to look through the whole set
    def imports(self, package: Package) -> list[ImportStatement]:
        return [statement for statement in self.import_statements if statement.who_imports == package]

    @staticmethod
    # O(p + i) => linear in the number of packages p and import statements i
    def from_dict(d: dict) -> "DependencyGraph":
        return DependencyGraph(
            packages=set(Package.from_dict(p) for p in d["packages"]),
            import_statements=set(
                ImportStatement.from_dict(i)
                for i in d["import_statements"]
            ),
        )

@dataclass
class PackageAnalysis:
    source_path: str
    graphs: dict[str, DependencyGraph]
    raw_packages_from_metadata: list[str]
    packages_path: str
    ground_truth: DependencyGraph | None

    @staticmethod
    # Linear in the total number of serialised packages and import statements:
    # every graph (one per tool, plus the optional ground truth)
    # is deserialised exactly once
    def from_dict(d: dict) -> "PackageAnalysis":
        return PackageAnalysis(
            source_path=d["source_path"],
            graphs={
                k: DependencyGraph.from_dict(v)
                for k, v in d["graphs"].items()
            },
            raw_packages_from_metadata=d["raw_packages_from_metadata"],
            packages_path=d["packages_path"],
            ground_truth=(
                DependencyGraph.from_dict(d["ground_truth"])
                if d["ground_truth"] is not None
                else None
            ),
        )

@dataclass
class Dataset:
    package_analyses: dict[str, PackageAnalysis] = field(default_factory=dict)

    @staticmethod
    # Linear in the total size of the input: the objects are nested, but each one is still visited only once
    def from_dict(d: dict) -> "Dataset":
        return Dataset(
            package_analyses={
                k: PackageAnalysis.from_dict(v)
                for k, v in d["package_analyses"].items()
            }
        )

## 2. Datasets and setup
For this follow-up task, we examined and selected two datasets that allow us to compare existing tools with our new approach. Both datasets contain the source code of different packages to analyse. Due to their sheer size (multiple GB), they are not included in this file directly and can be consulted in the [project's GitHub repository](https://github.com/rtafurthgarcia/COM713).
Our datasets (`\ds1` and `\ds2`) share the same structure:
- `\packages` contains the packages to analyse and to generate SBOMs from,
- `\sbom` contains the SBOMs generated by each tool for each package, and serves as the basis for comparison.

### 2.1 Dataset n°1
Dataset n°1 (ds1) is a copy of the dataset from Cofano et al.[6]. Dependencies are read from `requirements.txt` in `\sbom`. This dataset contains no ground truth, so only `\sbom` can be used to compare our new approach with the other tools.
Here, `COM713`, a package specially crafted to avoid detection by regular metadata-only tools, has been added to the `requirements.txt` files. This allows us to test that our AST analyser is immune to parser confusion.

### 2.2 Dataset n°2
Dataset n°2 (ds2) is a copy of Jia et al.'s[7] dataset. `\deptree_gt` contains the ground truth as JSON files for each package, against which the real performance of the other tools in `\sbom` can be compared.

### 2.3 Preprocessing and merging
These two datasets have been parsed and merged externally; the process required a CycloneDX library that could not be attached to this project. The curious reader can look into the `merge.py` file to see how it was done. It required serialising the dataclasses above into JSON files. Merging here means combining all the relevant files of a dataset into a single file; the two datasets themselves are kept distinct for practical reasons. Deserialising them meant implementing `from_dict` functions to convert all nested objects back into their original types.
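
A minimal sketch of that round trip, assuming simplified stand-ins (`Pkg`, `Imp`, `Graph`) for the dataclasses above. `graph_to_dict` is a hypothetical encoder (the real one lives in `merge.py` and is not shown here); the dictionary keys mirror those read by the `from_dict` functions.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pkg:
    name: str


@dataclass(frozen=True)
class Imp:
    who_imports: Pkg
    who_is_imported: Pkg


@dataclass
class Graph:
    packages: set[Pkg] = field(default_factory=set)
    import_statements: set[Imp] = field(default_factory=set)


def graph_to_dict(graph: Graph) -> dict:
    """Hypothetical encoder: sets become sorted lists, which json can handle."""
    statements = sorted(
        graph.import_statements,
        key=lambda s: (s.who_imports.name, s.who_is_imported.name),
    )
    return {
        "packages": [{"name": p.name} for p in sorted(graph.packages, key=lambda p: p.name)],
        "import_statements": [
            {"imports": {"name": s.who_imports.name}, "imported": {"name": s.who_is_imported.name}}
            for s in statements
        ],
    }


def graph_from_dict(d: dict) -> Graph:
    """Rebuilds the nested objects, since json.loads only returns dicts and lists."""
    return Graph(
        packages={Pkg(**p) for p in d["packages"]},
        import_statements={
            Imp(Pkg(**s["imports"]), Pkg(**s["imported"])) for s in d["import_statements"]
        },
    )


main, numpy = Pkg("main"), Pkg("numpy")
original = Graph(packages={main, numpy}, import_statements={Imp(main, numpy)})

restored = graph_from_dict(json.loads(json.dumps(graph_to_dict(original))))
print(restored == original)
```

Without the explicit `graph_from_dict` step, the restored object would only contain plain dictionaries and lists, and set comparisons between graphs would no longer work.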

In [24]:
import json

dataset1: Dataset
dataset2: Dataset

with open("merged_ds1.json", "r", encoding="utf-8") as dataset_file:
    dataset1 = Dataset.from_dict(json.load(dataset_file))

with open("merged_ds2.json", "r", encoding="utf-8") as dataset_file:
    dataset2 = Dataset.from_dict(json.load(dataset_file))

## 3. AST Algorithm
Our AST algorithm follows this logic: 

```
                                                                               ┌───────────────────┐     
                                                                               │                   │     
                                                                ┌────────┐     │                   │     
                                                                │        │     │ Print the SBOM    │     
                                                                │  End  ◄┼─────┼ and export it as a│     
                                                                │        │     │ file             ◄┼┐    
                                                                └────────┘     │                   ││    
                                                                               └───────────────────┘│    
                                                                                                    │    
                                                                                                   No    
                                                                                                    │    
                                                                                                    │    
                                                                                                 xxx│x   
                                                                             xxxxx              xx   xx  
                  ┌────────────────┐        ┌────────────────────┐          xx   xx            xx     xx 
                  │                │        │                    │         xx     xx          xx Any   xx
┌────────┐        │                │        │ Look through the   │        xx Any   xx         x  .py    x
│        │        │  Read source   │        │ AST for import     │        x  import ───No─────►x left? xx
│ Start ─┼────────┼► file (.py),   ┼────────► statements         ┼────────►x left? xx          xx     xx 
│        │        │  build the AST │        │                    │         xx     xx            xx   xx  
└────────┘        │                │        │                    │          xx   xx              xx xx   
                  └──────▲─────────┘        └────────▲───────────┘           xx xx                │xx    
                         │                           │                        x│x                 │      
                         │                           │                         │                  │      
                         │                           │                         │                  │      
                         │                           │                        Yes                Yes     
                         │                  ┌────────┼───────────┐             │                  │      
                         │                  │                    │             │                  │      
                         │                  │  Add to graph      │             │                  │      
                         │                  │                   ◄│─────────────┘                  │      
                         │                  │                    │                                │      
                         │                  │                    │      ┌────────────────────┐    │      
                         │                  └────────────────────┘      │                    │    │      
                         │                                              │  Move to next      │    │      
                         └──────────────────────────────────────────────│  source file,      │────┘      
                                                                        │  mark file as read │           
                                                                        │                    │           
                                                                        └────────────────────┘           
```

In [25]:
import ast
import os 
import sys
from pathlib import Path
from importlib.util import find_spec
from importlib.metadata import packages_distributions

class PackageAnalyser(ast.NodeVisitor):
    """
    Parses the package's source code, builds an abstract syntax tree and, from it, identifies the imports
    that allow us to dive deeper into each imported package. This process is what generates our dependency graph,
    where each vertex (a Package) will be an entry in our SBOM.
    """
    def __init__(self, source_path: str, root: str):
        self.source_path = source_path
        self.root = root
        self.graph = DependencyGraph()
        self.current_file = ""
        self.distribution_packages = packages_distributions()
        self._visited_nodes = {}
        self.current_level = 0

        if os.path.exists(self.source_path):
            if (os.path.isfile(self.source_path)) and source_path.endswith(".py"):
                #self.graph.insert_package(root_package_name)
                self.current_file = self.source_path
            elif (os.path.isdir(self.source_path)):
                try:
                    self.current_file = [file for file in os.listdir(self.source_path) if file.endswith(".py")][0]
                except IndexError:
                    # no .py file directly inside source_path
                    print("Cannot find a .py file for " + source_path)
                    sys.exit(1)

    """
    Each visit_x function is called each time a node of that type is visited in our abstract syntax tree
    _Module -> is called each time a new module is visited
    _Import -> is called each time a new import is visited
    _ImportFrom -> is called each time a new from ... import is visited
    """
    def visit_Module(self, node: ast.Module) -> None:
        self._visited_nodes[self.current_file] = self.current_level
        path = Path(self.current_file)

        # for __init__ and other dunder files, we guess the package name from the parent directory
        if (path.stem.startswith("__")):
            if (path.parent.is_dir() and path.parent.stem in self.distribution_packages):
                self.current_package = self.graph.insert_package(path.parent.stem, self.current_level)

        self.generic_visit(node)
    

    # Recursive: _parse_and_visit reads a file, builds its AST and, while that tree is being visited,
    # calls itself for every newly discovered import. _visited_nodes ensures each file is parsed at
    # most once, so the total work grows with the amount of reachable source code, not factorially.
    def visit_Import(self, node: ast.Import) -> None:
        package_name = node.names[0].name # only the first alias is inspected: `import a, b` ignores b

        if (not self._is_separate_package(package_name)):
            return
        
        # This makes it a recursive process: the imported package is explored first
        package_path = self._find_package_path(package_name)
        self._parse_and_visit(package_path)
        
        # ultimately, we only want to add to our graph the distribution packages
        if (package_name in self.distribution_packages and Package(package_name) not in self.graph.packages):
            # the package has already been parsed above, so no second _parse_and_visit call is needed
            new_package = self.graph.insert_package(package_name, self.current_level)
            self.graph.insert_importstatement(
                self.current_package, 
                new_package)
            self.current_package = new_package
            self.generic_visit(node)

    # Recursive for the same reason as visit_Import; _visited_nodes keeps each file
    # from being parsed more than once
    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        package_name = node.module or "" # we want the x from 'from x import y', not the y

        if (not self._is_separate_package(package_name)):
            return
        
        # ultimately, we only want to add to our graph the distribution packages
        if (package_name in self.distribution_packages and Package(package_name) not in self.graph.packages):
            package_path = self._find_package_path(package_name)
            self._parse_and_visit(package_path) # This makes it a recursive process!
            new_package = self.graph.insert_package(package_name, self.current_level)
            self.graph.insert_importstatement(
                self.current_package, 
                new_package)
            self.current_package = new_package

            self.generic_visit(node)

    # No loop of our own; the cost is dominated by find_spec's search through sys.path
    def _is_separate_package(self, node_name: str) -> bool:
        """
        Distinguish between just regular modules, proper modules from separate installed packages
        and stdlib packages
        """
        if node_name == '':
            return False

        if node_name in sys.stdlib_module_names:
            return False

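        try:
            spec = find_spec(node_name)
            # only packages (not plain modules) have submodule_search_locations
            return spec is not None and spec.submodule_search_locations is not None
        except Exception:
            # find_spec imports parent packages and may therefore fail in many ways
            return False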
    
    # No loop of our own; a single find_spec look-up
    def _find_package_path(self, node_name: str) -> str:
        spec = find_spec(node_name)
        if (spec is None or spec.submodule_search_locations is None or spec.origin is None):
            raise FileNotFoundError("Couldn't find the package's original source")
        else:
            return spec.origin
        
    # The recursive core of our programme; each file is parsed at most once thanks to _visited_nodes
    def _parse_and_visit(self, file_path: str, reset_level: bool = False) -> None:
        """
        Read the source, parse it to further build the AST, then explore its edges to discover other packages
        """
        if not file_path.endswith(".py"):
            return 
        
        if file_path in self._visited_nodes: # a look-up is constant on average since it is a hash map
            return 

        try:
            with open(file_path, "r") as source_file:
                if (reset_level is True):
                    self.current_level = 0
                    # NOTE: current_file still refers to the previously parsed file at this point
                    package_name = Path(self.current_file).stem
                    self.current_package = self.graph.insert_package(package_name, self.current_level)
                else:
                    self.current_level += 1
                code = source_file.read()
                tree = ast.parse(code)
                self.current_file = file_path
                self._visited_nodes[file_path] = self.current_level
                self.visit(tree)
        except Exception:
            # any failure (unreadable file, syntax error, unresolved import) silently ends the analysis of this file
            return
            
    
    # O(f) in the number of files f under source_path: the nested loops visit each file exactly once
    # (not counting the cost of _parse_and_visit itself)
    def analyse(self) -> None:
        """
        Will parse the package's source code and build an AST from it, where every node from the tree will be visited
        """
        if (os.path.isfile(self.source_path)):
            self._parse_and_visit(self.source_path, True)
        elif (os.path.isdir(self.source_path)):
            for root, _, files in os.walk(self.source_path):
                for file in files:
                    file_path = os.path.join(root, file)
                    self._parse_and_visit(file_path, True)
    
    def print_packages(self) -> None:
        print("Project: " + self.root)
                
        for package in self.graph.packages:
            #if (package.level > 0):
            print(package.name)
        print("-------------------------")

The output below displays the core packages of each project.
`com713` is the most interesting one: it was added to dataset n°1 to confuse parsers.
We see that our algorithm is thus capable of identifying distribution packages, even when they are "invisible" to metadata readers.

In [26]:
for package, analysis in dataset1.package_analyses.items():
    sys.path.insert(0, os.path.abspath(analysis.packages_path)) # makes the main package's own packages available and thus importable
    analyser = PackageAnalyser(source_path=analysis.source_path, root=package)
    analyser.analyse()
    analyser.print_packages()
    sys.path.pop(0)
    

Project: pip-hatchling
numpy
main
com713
-------------------------
Project: pip-pdm
numpy
main
com713
-------------------------
Project: pip-setuptools
numpy
main
com713
-------------------------


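For our second dataset, we can see that our algorithm iterates through each distribution package instead of simply relying on `requirements.txt`. Note that the listings mix the projects' own module names (for instance `0001_initial` or `__init__`) with third-party distributions such as `django`, `requests` and `urllib3`, and that several listings are truncated in the output.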

In [27]:
for package, analysis in dataset2.package_analyses.items():
    sys.path.insert(0, os.path.abspath(analysis.packages_path)) # makes the main package's own packages available and thus importable
    analyser = PackageAnalyser(source_path=analysis.source_path, root=package)
    analyser.analyse()
    analyser.print_packages()
    sys.path.pop(0)

Project: apprise
apprise
-------------------------
Project: django-rest-framework
documentation
routers
generators
inspectors
validators
html
representation
drf_create_token
0001_initial
apps
serializer_helpers
django
viewsets
json
openapi
mediatypes
serializers
filters
authentication
status
humanize_datetime
0003_tokenproxy
charset_normalizer
metadata
coreschema
generics
cryptography
uritemplate
request
markdown
inflection
yaml
renderers
versioning
0002_auto_20160226_1747
utils
field_mapping
throttling
exceptions
pkg_resources
generateschema
requests
timezone
formatting
views
pygments
response
mixins
permissions
relations
__init__
settings
parsers
0004_alter_tokenproxy_options
pagination
negotiation
rest_framework
breadcrumbs
admin
model_meta
coreapi
urls
checks
urllib3
decorators
test
models
encoders
urlpatterns
reverse
-------------------------
Project: fastapi
templating
exception_handlers
starlette
param_functions
http
api_key
anyio
base
pydantic
websockets
pydantic_core
wsgi
oaut



Project: InstaPy
telegram_util
chardet
pyvirtualdisplay
event
clarifai_util
time_util
xpath
webdriverdownloader
selenium
charset_normalizer
regex
meaningcloud
relationship_tools
exceptions
commenters_util
pkg_resources
browser
apidisplaypurposes
requests
file_manager
plyer
unfollow_util
story_util
print_log_writer
settings
__init__
follow_util
bs4
login_util
constants
database_engine
like_util
monkey_patcher
emoji
urllib3
pods_util
feed_util
-------------------------




Project: keras
mnist
regression_metrics
torch_data_loader_adapter_test
serialization_lib
saving_api
einsum_dense
keras_tensor
linalg
adamax_test
absl
json_utils_test
adamax
numerical_utils
object_registration_test
conv_lstm
cloning_test
random_initializers
ftrl
remote_monitor_test
lion
stateless_scope_test
callback
average_pooling1d
optimizer
torch_optimizer
constraints
global_average_pooling1d
file_utils_test
dtype_policy_map
random_test
learning_rate_schedule_test
feature_space_test
rmsprop
adamw_test
functional
h5py
normalization_test
saving_lib
up_sampling3d
io_utils
compute_output_spec_test
cropping2d_test
image_test
layer_normalization
integer_lookup
inception_v3
random_crop
lamb_test
accuracy_metrics
swap_ema_weights
embedding
backend_utils_test
json_utils
conv_lstm3d
learning_rate_schedule
variables
string_lookup_test
tf_dataset_adapter
nn
zero_padding3d_test
regularizers
zero_padding2d_test
export_lib_test
generator_data_adapter_test
torch_nadam
random_brightness
random_contra



Project: scancode-toolkit
boolean
distro
validators
nuget
plugin_license_policy
seq
cache
match
windows
pymaven
help
match_hash
_make
readme
cryptography
rpm
plugin_ignore_copyrights
toml
debian_copyright
spec
match_set
misc
six
fingerprints
go_mod
pdf
licenses_reference
match_aho
facet
rpm_installed
todo
lxml
click
javaproperties
analysis
models
golang
finder
pypi_setup_py
swift
nevra
interrupt
plugin_url
chardet
classify
_cmp
filters
tracing
cocoapods
debian
cli_test_utils
msi
licensedcode_test_utils
finder_data
output_yaml
legal
rubygems
utils
output_jsonlines
plugin_copyright
bashlex
packageurl
jar_manifest
exceptions
phpcomposer
match_spdx_lid
classify_plugin
groovy_lexer
requests
plugincode
pygments
score
_compat
index
summarizer
fasteners
detection
plugin_package
ftfy
haxe
parameter_expansion
commoncode
jinja2
frontmatter
reindex
jsonstreams
tallies
godeps
win_pe
setters
chef
_version_info
copyrights_hint
plugin_license
output_spdx
dparse2
recognize
cran
_config
license_db
plugi



Project: ydata-profiling
common
compat
render_categorical
timeseries_index
imbalance_pandas
variable_info
expectation_algorithms
render_generic
typeguard
missing_spark
summary_spark
timeseries_index_spark
multimethod
render_real
duplicate
pairwise
describe_categorical_spark
dropdown
pkg_resources
matplotlib
describe
dataframe_pandas
render_boolean
numpy
missing_pandas
describe_generic_pandas
pandas
alerts
progress_bar
describe_boolean_spark
packaging
render_file
describe_supported_pandas
render_image
timeseries_index_pandas
describe_generic_spark
serialize_report
frequency_table
paths
markupsafe
yaml
utils
duplicates_spark
render_count
sample
imghdr_patch
frequency_table_small
describe_numeric_spark
requests
logger
scipy
render_complex
renderable
summarizer
describe_url_pandas
describe_text_pandas
jinja2
describe_supported_spark
describe_boolean_pandas
table_pandas
frequency_table_utils
PIL
html
table_spark
describe_path_pandas
duplicates_pandas
flavours
missing
describe_timeseries_pan

## 4. Time complexity and limitations
Whilst the operations on the data structure (an edge set) are bounded by linear time O(n), the algorithm as a whole has to go through each source file, parse the source code, build the tree, and recursively do the same for every imported file. This recursion is the dominant cost, but it is not factorial: `_visited_nodes` guarantees that each file is parsed at most once, so the running time grows with the total number of reachable files and import statements, plus the file-system look-ups performed by `find_spec`.
Although Sedgewick and Wayne[5] suggest looping rather than recursing, no functioning loop-based approach has been found. A possible one would have been to start from the installed packages and then look for imports of those packages in our source code, but that implies possibly heavier file-system interactions.
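
For reference, one possible shape of a loop-based traversal is sketched here with an explicit work queue instead of recursion. It is only an illustration under simplifying assumptions and has not been run against the datasets, so it does not replace the recursive analyser: modules are looked up in an in-memory dictionary (`sources`, hypothetical) rather than on the file system, and `direct_imports` and `collect_edges` are hypothetical helper names.

```python
import ast
from collections import deque


def direct_imports(source: str) -> set[str]:
    """Top-level names imported by one source file (absolute imports only)."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def collect_edges(entry: str, sources: dict[str, str]) -> set[tuple[str, str]]:
    """Breadth-first walk over imports; returns (importer, imported) pairs."""
    edges = set()
    visited = set()
    pending = deque([entry])
    while pending:
        module = pending.popleft()
        if module in visited:
            continue  # already parsed: this is what makes cycles harmless
        visited.add(module)
        # unknown modules are treated as empty sources in this sketch
        for imported in direct_imports(sources.get(module, "")):
            edges.add((module, imported))
            pending.append(imported)
    return edges


# Hypothetical project in which a and b import each other
sources = {
    "main": "import a\nimport b\n",
    "a": "from b import something\n",
    "b": "import a\n",
}
edges = collect_edges("main", sources)
print(sorted(edges))
```

Each module is parsed once, so the loop terminates even though `a` and `b` form a cycle.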

Also, due to time constraints, no further analysis or comparison between the tools of dataset 2 and our own implementation has been carried out. For now, we can only observe that the algorithm is capable of identifying transitive packages, that is, packages imported by other packages, but no metric such as recall or precision could be computed yet.
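
Once the comparison is carried out, completeness could be measured with plain set operations, since both our graphs and the ground truth are stored as sets. The sketch uses hypothetical package names and a hypothetical helper `completeness_metrics`; no figure here is a result of this project. Treating an empty set as a perfect score is a convention chosen for the sketch.

```python
def completeness_metrics(found: set[str], ground_truth: set[str]) -> dict[str, float]:
    """Recall measures completeness; precision is reported for context only."""
    true_positives = found & ground_truth
    recall = len(true_positives) / len(ground_truth) if ground_truth else 1.0
    precision = len(true_positives) / len(found) if found else 1.0
    return {"recall": recall, "precision": precision}


# Hypothetical values, for illustration only
found = {"numpy", "requests", "utils"}
ground_truth = {"numpy", "requests", "urllib3", "idna"}

metrics = completeness_metrics(found, ground_truth)
print(metrics)
```

Since this project focuses exclusively on completeness, recall would be the figure of interest: it is the share of ground-truth dependencies that the analyser actually found.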


## AI-Use Compliance
ChatGPT has been used in the present assignment exclusively for three general tasks: 1. searching for specific resources, 2. providing general explanations, 3. helping to diagnose issues or to debug. In no case has it been used to generate any part of this document.

How could I make the distinction between a module and a package?
[a] https://chatgpt.com/share/695d993b-95e4-8008-9fd0-306ad8caed06

is there a way to distinguish between python default packages (like os, sys) and those installed from pip, or another source?
[b] https://chatgpt.com/share/695d9967-fdec-8008-8425-193decc0c476

can you recommend me a book that covers code modelling? as in, for each instruction, how does it influence big O etc.
[c] https://chatgpt.com/share/695d9985-cc94-8008-820f-5521bf31ffa8

Here, I've noticed that only the first objects (dataset1 and dataset2) are of the right type (Dataset), how can I make sure all sub variables are of the right type?: ...
[d] https://chatgpt.com/share/695d99a4-7870-8008-a90e-3ce5295f6956




## References
[1] J. Tellnes, "Dependencies: No Software is an Island", Master's thesis, The University of Bergen, 2013. Available: https://bora.uib.no/bora-xmlui/handle/1956/7540

[2] M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Data Structures and Algorithms in Python, 1st ed. Hoboken, NJ: Wiley, 2013.

[3] "TimeComplexity - Python Wiki". Accessed: 4 January 2026. [Online]. Available: https://wiki.python.org/moin/TimeComplexity

[4] R. E. L. Tafurth Garcia, "ChatGPT - COM713", Transcript. [Online]. Available: https://chatgpt.com/share/695ad5cb-35f8-8008-a3a0-d8b0302b4eb2

[5] R. Sedgewick and K. D. Wayne, Algorithms, 4th ed. Upper Saddle River: Addison-Wesley, 2011.

[6] S. Cofano, G. Benedetti, and M. Dell'Amico, "SBOM Generation Tools in the Python Ecosystem: an In-Detail Analysis", in 2024 IEEE 23rd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Sanya, China: IEEE, Dec. 2024, pp. 427–434. doi: 10.1109/TrustCom63139.2024.00077. Dataset: https://github.com/serenacofano/SBOM-python-ecosystem

[7] C. Jia, N. Li, K. Yang, and M. Zhou, "SIT: An Accurate, Compliant SBOM Generator with Incremental Construction", in 2025 IEEE/ACM 47th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), Ottawa, ON, Canada: IEEE, Apr. 2025, pp. 13–16. doi: 10.1109/ICSE-Companion66252.2025.00013. Dataset: https://zenodo.org/records/13882428

