diff --git a/README.md b/README.md index 8ebde7870..da4498371 100644 --- a/README.md +++ b/README.md @@ -76,15 +76,19 @@ To learn how to define your own checks, see the steps in the [checks documentati * Behnaz Hassanshahi, Trong Nhan Mai, Alistair Michael, Benjamin Selwyn-Smith, Sophie Bates, and Padmanabhan Krishnan: [Macaron: A Logic-based Framework for Software Supply Chain Security Assurance](https://dl.acm.org/doi/abs/10.1145/3605770.3625213), SCORED 2023. Best paper award :trophy: +* Behnaz Hassanshahi, Trong Nhan Mai, Benjamin Selwyn-Smith, and Nicholas Allen: [Unlocking Reproducibility: Automating re-Build Process for Open-Source Software](https://arxiv.org/pdf/2509.08204), ASE Industry Showcase 2025. + * Ridwan Shariffdeen, Behnaz Hassanshahi, Martin Mirchev, Ali El Husseini, Abhik Roychoudhury [Detecting Python Malware in the Software Supply Chain with Program Analysis](https://labs.oracle.com/pls/apex/f?p=94065:10:11591088449483:11569), ICSE-SEIP 2025. * Jens Dietrich, Tim White, Behnaz Hassanshahi, Paddy Krishnan [Levels of Binary Equivalence for the Comparison of Binaries -from Alternative Builds](https://arxiv.org/pdf/2410.08427), pre-print on arXiv. +from Alternative Builds](https://arxiv.org/pdf/2410.08427), ICSME Industry Track 2025. * Jens Dietrich, Tim White, Valerio Terragni, Behnaz Hassanshahi [Towards Cross-Build Differential Testing](https://labs.oracle.com/pls/apex/f?p=94065:10:11591088449483:11549), ICST 2025. * Jens Dietrich, Tim White, Mohammad Mahdi Abdollahpour, Elliott Wen, Behnaz Hassanshahi [BinEq-A Benchmark of Compiled Java Programs to Assess Alternative Builds](https://dl.acm.org/doi/10.1145/3689944.3696162), SCORED 2024. +* Jens Dietrich and Behnaz Hassanshahi [DALEQ--Explainable Equivalence for Java Bytecode](https://arxiv.org/pdf/2508.01530), ASE Industry Showcase 2025. + ## Security Please consult the [security guide](./SECURITY.md) for our responsible security vulnerability disclosure process. 
diff --git a/docs/source/pages/cli_usage/command_analyze.rst b/docs/source/pages/cli_usage/command_analyze.rst index 2cea5b0e1..e4b37c5f4 100644 --- a/docs/source/pages/cli_usage/command_analyze.rst +++ b/docs/source/pages/cli_usage/command_analyze.rst @@ -20,11 +20,10 @@ Usage .. code-block:: shell usage: ./run_macaron.sh analyze - [-h] [-sbom SBOM_PATH] [-purl PURL] [-rp REPO_PATH] [-b BRANCH] - [-d DIGEST] [-pe PROVENANCE_EXPECTATION] - [--deps-depth DEPS_DEPTH] [-g TEMPLATE_PATH] - [--python-venv PYTHON_VENV] - [--local-maven-repo LOCAL_MAVEN_REPO] + [-h] [-sbom SBOM_PATH] [-rp REPO_PATH] [-purl PACKAGE_URL] + [-b BRANCH] [-d DIGEST] [-pe PROVENANCE_EXPECTATION] [-pf PROVENANCE_FILE] + [--deps-depth DEPS_DEPTH] [-g TEMPLATE_PATH] [--python-venv PYTHON_VENV] + [--local-maven-repo LOCAL_MAVEN_REPO] [--force-analyze-source] ------- Options @@ -32,33 +31,34 @@ Options .. option:: -h, --help - Show this help message and exit + Show this help message and exit. .. option:: -sbom SBOM_PATH, --sbom-path SBOM_PATH - The path to the SBOM of the analysis target. + The path to the Software Bill of Materials (SBOM) of the analysis target. + If this option is set, dependency resolution must be enabled by using the + `--deps-depth` option. -.. option:: -purl PACKAGE_URL, --package-url PACKAGE_URL +.. option:: -rp REPO_PATH, --repo-path REPO_PATH - The PURL string used to uniquely identify the target software component for analysis. Note: this PURL string can be - consequently used in the policies passed - to the policy engine for the same target. + The path to the repository, which can be either local or remote. -.. option:: -rp REPO_PATH, --repo-path REPO_PATH +.. option:: -purl PACKAGE_URL, --package-url PACKAGE_URL - The path to the repository, can be local or remote + The Package URL (PURL) string used to uniquely identify the target software component for analysis. + This PURL string can also be used in the policies passed to the policy engine for the same target. .. 
option:: -b BRANCH, --branch BRANCH - The branch of the repository that we want to checkout. If not set, Macaron will use the default branch + The branch of the repository that you want to check out. If not set, Macaron will use the default branch. .. option:: -d DIGEST, --digest DIGEST - The digest of the commit we want to checkout in the branch. If not set, Macaron will use the latest commit + The digest of the commit you want to check out in the branch. If not set, Macaron will use the latest commit. .. option:: -pe PROVENANCE_EXPECTATION, --provenance-expectation PROVENANCE_EXPECTATION - The path to provenance expectation file or directory. + The path to the provenance expectation file or directory. .. option:: -pf PROVENANCE_FILE, --provenance-file PROVENANCE_FILE @@ -66,19 +66,26 @@ Options .. option:: --deps-depth DEPS_DEPTH - The depth of the dependency resolution. 0: disable, 1: direct dependencies, inf: all transitive dependencies. (Default: 0) + The depth of the dependency resolution. Possible values are: + + - `0`: Disable dependency resolution (default). + - `1`: Resolve direct dependencies only. + - `inf`: Resolve all transitive dependencies. + + **Note**: If `--sbom-path` or `--python-venv` is set, this option must be specified. .. option:: -g TEMPLATE_PATH, --template-path TEMPLATE_PATH - The path to the Jinja2 html template (please make sure to use .html or .j2 extensions). + The path to the Jinja2 HTML template file. Please ensure that the file has a `.html` or `.j2` extension. -.. option:: --python-venv PYTHON_VENV +.. option:: --python-venv PYTHON_VENV The path to the Python virtual environment of the target software component. + If this option is set, dependency resolution must be enabled with `--deps-depth`. .. option:: --local-maven-repo LOCAL_MAVEN_REPO - The path to the local .m2 directory. If this option is not used, Macaron will use the default location at $HOME/.m2 + The path to the local `.m2` Maven repository. 
If this option is not used, Macaron will use the default location at `$HOME/.m2`. .. option:: --verify-provenance @@ -86,7 +93,7 @@ Options .. option:: --force-analyze-source - Forces PyPI sourcecode analysis to run regardless of other heuristic results. + Forces PyPI source code analysis to run, regardless of other heuristic results. ----------- Environment diff --git a/docs/source/pages/cli_usage/command_gen_build_spec.rst b/docs/source/pages/cli_usage/command_gen_build_spec.rst new file mode 100644 index 000000000..0e886c220 --- /dev/null +++ b/docs/source/pages/cli_usage/command_gen_build_spec.rst @@ -0,0 +1,43 @@ +.. Copyright (c) 2025 - 2025, Oracle and/or its affiliates. All rights reserved. +.. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/. + +.. _gen-build-spec-command-cli: + +============================ +Generate Build Specification +============================ + +----------- +Description +----------- + +Generate a build specification for a given software component. + +----- +Usage +----- + +.. code-block:: shell + + usage: ./run_macaron.sh gen-build-spec [-h] -purl PACKAGE_URL --database DATABASE [--output-format OUTPUT_FORMAT] + +------- +Options +------- + +.. option:: -h, --help + + Show this help message and exit. + +.. option:: -purl PACKAGE_URL, --package-url PACKAGE_URL + + The PURL (Package URL) string of the software component for which the build specification is to be generated. + +.. option:: --database DATABASE + + Path to the database. + +.. option:: --output-format OUTPUT_FORMAT + + The desired output format for the build specification. The default format is `rc-buildspec`, which is the Reproducible-Central build specification. + Other formats may be available depending on your configuration. 
diff --git a/docs/source/pages/cli_usage/index.rst b/docs/source/pages/cli_usage/index.rst index 1c2ad7855..dc169c3a2 100644 --- a/docs/source/pages/cli_usage/index.rst +++ b/docs/source/pages/cli_usage/index.rst @@ -15,7 +15,8 @@ Usage .. code-block:: shell - usage: ./run_macaron.sh [-h] [-V] [-v] [-o OUTPUT_DIR] [-dp DEFAULTS_PATH] [-lr LOCAL_REPOS_PATH] {analyze,dump-defaults,verify-policy} ... + usage: ./run_macaron.sh [-h] [-V] [-v] [--disable-rich-output] [-o OUTPUT_DIR] [-dp DEFAULTS_PATH] [-lr LOCAL_REPOS_PATH] + {analyze,dump-defaults,verify-policy,find-source,gen-build-spec} ... Macaron's CLI has multiple common flags (e.g ``-h``, ``-V``) and different commands (e.g. ``analyze``), which have their own set of flags. @@ -27,27 +28,31 @@ Common Options .. option:: -h, --help - Show this help message and exit + Show this help message and exit. .. option:: -V, --version - Show Macaron's version number and exit + Show Macaron's version number and exit. .. option:: -v, --verbose - Run Macaron with more debug logs + Run Macaron with more debug logs to provide additional information for debugging. + +.. option:: --disable-rich-output + + Disable Rich UI output. This will turn off any rich formatting (e.g., colored output, tables, etc.) used in the terminal UI. .. option:: -o OUTPUT_DIR, --output-dir OUTPUT_DIR - The output destination path for Macaron + The output destination path for Macaron. This is where Macaron will store the results of the analysis. .. option:: -dp DEFAULTS_PATH, --defaults-path DEFAULTS_PATH - The path to the defaults configuration file. + The path to the defaults configuration file. This file can contain preset values for Macaron's options. .. option:: -lr LOCAL_REPOS_PATH, --local-repos-path LOCAL_REPOS_PATH - The directory where Macaron looks for already cloned repositories. + The directory where Macaron will look for already cloned repositories. This is useful for reusing locally stored repositories without re-cloning them. 
--------------------- Environment Variables diff --git a/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.cli_command_parser.rst b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.cli_command_parser.rst new file mode 100644 index 000000000..5c9f60f7f --- /dev/null +++ b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.cli_command_parser.rst @@ -0,0 +1,50 @@ +macaron.build\_spec\_generator.cli\_command\_parser package +=========================================================== + +.. automodule:: macaron.build_spec_generator.cli_command_parser + :members: + :show-inheritance: + :undoc-members: + +Submodules +---------- + +macaron.build\_spec\_generator.cli\_command\_parser.gradle\_cli\_command module +------------------------------------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.cli_command_parser.gradle_cli_command + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.cli\_command\_parser.gradle\_cli\_parser module +------------------------------------------------------------------------------ + +.. automodule:: macaron.build_spec_generator.cli_command_parser.gradle_cli_parser + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.cli\_command\_parser.maven\_cli\_command module +------------------------------------------------------------------------------ + +.. automodule:: macaron.build_spec_generator.cli_command_parser.maven_cli_command + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.cli\_command\_parser.maven\_cli\_parser module +----------------------------------------------------------------------------- + +.. 
automodule:: macaron.build_spec_generator.cli_command_parser.maven_cli_parser + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.cli\_command\_parser.unparsed\_cli\_command module +--------------------------------------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.cli_command_parser.unparsed_cli_command + :members: + :show-inheritance: + :undoc-members: diff --git a/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.reproducible_central.rst b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.reproducible_central.rst new file mode 100644 index 000000000..6e3477c4a --- /dev/null +++ b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.reproducible_central.rst @@ -0,0 +1,18 @@ +macaron.build\_spec\_generator.reproducible\_central package +============================================================ + +.. automodule:: macaron.build_spec_generator.reproducible_central + :members: + :show-inheritance: + :undoc-members: + +Submodules +---------- + +macaron.build\_spec\_generator.reproducible\_central.reproducible\_central module +--------------------------------------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.reproducible_central.reproducible_central + :members: + :show-inheritance: + :undoc-members: diff --git a/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.rst b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.rst new file mode 100644 index 000000000..f89734501 --- /dev/null +++ b/docs/source/pages/developers_guide/apidoc/macaron.build_spec_generator.rst @@ -0,0 +1,59 @@ +macaron.build\_spec\_generator package +====================================== + +.. automodule:: macaron.build_spec_generator + :members: + :show-inheritance: + :undoc-members: + +Subpackages +----------- + +.. 
toctree:: + :maxdepth: 1 + + macaron.build_spec_generator.cli_command_parser + macaron.build_spec_generator.reproducible_central + +Submodules +---------- + +macaron.build\_spec\_generator.build\_command\_patcher module +------------------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.build_command_patcher + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.build\_spec\_generator module +------------------------------------------------------------ + +.. automodule:: macaron.build_spec_generator.build_spec_generator + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.jdk\_finder module +------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.jdk_finder + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.jdk\_version\_normalizer module +-------------------------------------------------------------- + +.. automodule:: macaron.build_spec_generator.jdk_version_normalizer + :members: + :show-inheritance: + :undoc-members: + +macaron.build\_spec\_generator.macaron\_db\_extractor module +------------------------------------------------------------ + +.. 
automodule:: macaron.build_spec_generator.macaron_db_extractor + :members: + :show-inheritance: + :undoc-members: diff --git a/docs/source/pages/developers_guide/apidoc/macaron.malware_analyzer.pypi_heuristics.metadata.rst b/docs/source/pages/developers_guide/apidoc/macaron.malware_analyzer.pypi_heuristics.metadata.rst index eb7998118..625328e1a 100644 --- a/docs/source/pages/developers_guide/apidoc/macaron.malware_analyzer.pypi_heuristics.metadata.rst +++ b/docs/source/pages/developers_guide/apidoc/macaron.malware_analyzer.pypi_heuristics.metadata.rst @@ -57,6 +57,14 @@ macaron.malware\_analyzer.pypi\_heuristics.metadata.one\_release module :show-inheritance: :undoc-members: +macaron.malware\_analyzer.pypi\_heuristics.metadata.package\_description\_intent module +--------------------------------------------------------------------------------------- + +.. automodule:: macaron.malware_analyzer.pypi_heuristics.metadata.package_description_intent + :members: + :show-inheritance: + :undoc-members: + macaron.malware\_analyzer.pypi\_heuristics.metadata.similar\_projects module ---------------------------------------------------------------------------- @@ -73,6 +81,22 @@ macaron.malware\_analyzer.pypi\_heuristics.metadata.source\_code\_repo module :show-inheritance: :undoc-members: +macaron.malware\_analyzer.pypi\_heuristics.metadata.stub\_name module +--------------------------------------------------------------------- + +.. automodule:: macaron.malware_analyzer.pypi_heuristics.metadata.stub_name + :members: + :show-inheritance: + :undoc-members: + +macaron.malware\_analyzer.pypi\_heuristics.metadata.type\_stub\_file module +--------------------------------------------------------------------------- + +.. 
automodule:: macaron.malware_analyzer.pypi_heuristics.metadata.type_stub_file + :members: + :show-inheritance: + :undoc-members: + macaron.malware\_analyzer.pypi\_heuristics.metadata.typosquatting\_presence module ---------------------------------------------------------------------------------- diff --git a/docs/source/pages/developers_guide/apidoc/macaron.path_utils.rst b/docs/source/pages/developers_guide/apidoc/macaron.path_utils.rst new file mode 100644 index 000000000..33e6c94ae --- /dev/null +++ b/docs/source/pages/developers_guide/apidoc/macaron.path_utils.rst @@ -0,0 +1,18 @@ +macaron.path\_utils package +=========================== + +.. automodule:: macaron.path_utils + :members: + :show-inheritance: + :undoc-members: + +Submodules +---------- + +macaron.path\_utils.purl\_based\_path module +-------------------------------------------- + +.. automodule:: macaron.path_utils.purl_based_path + :members: + :show-inheritance: + :undoc-members: diff --git a/docs/source/pages/developers_guide/apidoc/macaron.rst b/docs/source/pages/developers_guide/apidoc/macaron.rst index 3793ea93b..8d40e83bf 100644 --- a/docs/source/pages/developers_guide/apidoc/macaron.rst +++ b/docs/source/pages/developers_guide/apidoc/macaron.rst @@ -12,6 +12,7 @@ Subpackages .. toctree:: :maxdepth: 1 + macaron.build_spec_generator macaron.code_analyzer macaron.config macaron.database @@ -19,6 +20,7 @@ Subpackages macaron.malware_analyzer macaron.output_reporter macaron.parsers + macaron.path_utils macaron.policy_engine macaron.provenance macaron.repo_finder @@ -29,6 +31,14 @@ Subpackages Submodules ---------- +macaron.console module +---------------------- + +.. 
automodule:: macaron.console + :members: + :show-inheritance: + :undoc-members: + macaron.environment\_variables module ------------------------------------- diff --git a/docs/source/pages/supported_technologies/index.rst b/docs/source/pages/supported_technologies/index.rst index 98bbca535..ff71dd18b 100644 --- a/docs/source/pages/supported_technologies/index.rst +++ b/docs/source/pages/supported_technologies/index.rst @@ -23,6 +23,13 @@ such as GitHub Actions workflows. * Go * Docker +.. _supported_build_gen_tools: + +------------------------------ +Build Specification Generation +------------------------------ + +* Maven and Gradle builds for Java artifacts .. _supported_git_services: diff --git a/docs/source/pages/tutorials/index.rst b/docs/source/pages/tutorials/index.rst index bc6bfdb28..6f6b3bf00 100644 --- a/docs/source/pages/tutorials/index.rst +++ b/docs/source/pages/tutorials/index.rst @@ -19,6 +19,7 @@ For the full list of supported technologies, such as CI services, registries, an commit_finder detect_malicious_package + rebuild_third_party_artifacts detect_vulnerable_github_actions provenance detect_malicious_java_dep diff --git a/docs/source/pages/tutorials/rebuild_third_party_artifacts.rst b/docs/source/pages/tutorials/rebuild_third_party_artifacts.rst new file mode 100644 index 000000000..e176e81aa --- /dev/null +++ b/docs/source/pages/tutorials/rebuild_third_party_artifacts.rst @@ -0,0 +1,170 @@ +.. Copyright (c) 2025 - 2025, Oracle and/or its affiliates. All rights reserved. +.. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/. + +.. 
_tutorial-gen-build-spec: + +********************************************************* +Rebuilding Third-Party Artifacts from Source with Macaron +********************************************************* + +In this tutorial, you'll learn how to use Macaron's new ``gen-build-spec`` command to automatically generate build specification (buildspec) files from analyzed software packages. +These buildspecs help document and automate the build process for packages, enabling reproducibility and ease of integration with infrastructures such as `Reproducible Central `_. For a more detailed description of this feature, refer to our accepted ASE 2025 Industry Showcase paper: `Unlocking Reproducibility: Automating the Re-Build Process for Open-Source Software `_. + +.. list-table:: + :widths: 25 + :header-rows: 1 + + * - Currently Supported packages + * - Maven packages built with Gradle or Maven + +.. contents:: :local: + +********** +Motivation +********** + +Modern software supply chains rely heavily on centralized repositories like Maven Central to provide easy access to libraries and components. However, a major challenge persists: there is often no clear or transparent link between the published binaries and the environments in which they were built. `Our study `_ shows that about 84% of popular Java artifacts on Maven Central aren’t built through transparent CI/CD pipelines, leaving users to blindly trust not only the source code but also opaque build processes that can introduce hidden risks. + +Addressing this lack of transparency is critical for improving supply chain security. Rebuilding software artifacts from source enables thorough code review, verifies that binaries match their sources, and ensures closer control over dependencies. Yet, recreating build environments is complex, especially with large dependency trees and varying configurations. 
Macaron tackles these challenges by automatically extracting build specifications from open CI/CD workflows, enhancing source detection, and making reproducible rebuilds more accessible. In doing so, it improves both security and transparency across the open-source software ecosystem. + +********** +Background +********** + +A build specification is a file that describes all necessary information to rebuild a package from source. This includes metadata such as the build tool, the specific build command to run, the language version, e.g., JDK for Java, and artifact coordinates. Macaron can now generate this file automatically for supported ecosystems, greatly simplifying build from source. + +The generated buildspec will be stored in an ecosystem- and PURL-specific path under the ``output/`` directory (see more under :ref:`Output Files Guide `). + +****************************** +Installation and Prerequisites +****************************** + +Skip this section if you already know how to install Macaron. + +.. toggle:: + + Please follow the instructions :ref:`here `. In summary, you need: + + * Docker + * the ``run_macaron.sh`` script to run the Macaron image. + + .. note:: At the moment, Docker alternatives (e.g. podman) are not supported. + + + You also need to provide Macaron with a GitHub token through the ``GITHUB_TOKEN`` environment variable. + + To obtain a GitHub Token: + + * Go to ``GitHub settings`` → ``Developer Settings`` (at the bottom of the left side pane) → ``Personal Access Tokens`` → ``Fine-grained personal access tokens`` → ``Generate new token``. Give your token a name and an expiry period. + * Under ``"Repository access"``, choosing ``"Public Repositories (read-only)"`` should be good enough in most cases. + + Now you should be good to run Macaron. For more details, see the documentation :ref:`here `. 
+ +************************* +Step 1: Analyze a Package +************************* + +Before generating a buildspec, Macaron must first analyze the target package. For example, to analyze a Maven Java package: + +.. code-block:: shell + + ./run_macaron.sh analyze -purl pkg:maven/org.apache.hugegraph/computer-k8s@1.0.0 + +This command will inspect the source repository, CI/CD configuration, and extract build-related data into the local database at ``output/macaron.db``. + +******************************************* +Step 2: Generate a Build Specification File +******************************************* + +After analysis is complete, you can generate a buildspec for the package using the ``gen-build-spec`` command. For more details, refer to the :ref:`gen-build-spec-command-cli`. + +.. code-block:: shell + + ./run_macaron.sh gen-build-spec -purl pkg:maven/org.apache.hugegraph/computer-k8s@1.0.0 --database output/macaron.db + + +After execution, the buildspec will be created at: + +.. code-block:: text + + output//macaron.buildspec + +where ```` is the directory structure according to the PackageURL (PURL). + +In the example above, the buildspec is located at: + +.. code-block:: text + + output/maven/org_apache_hugegraph/computer-k8s/macaron.buildspec + +***************************************** +Step 3: Review and Use the Buildspec File +***************************************** + +The generated buildspec uses the `Reproducible Central buildspec `_ format, for example: + +.. 
code-block:: ini + + # Generated by Macaron version 0.18.0 + + groupId=org.apache.hugegraph + artifactId=computer-k8s + version=1.0.0 + gitRepo=https://github.com/apache/hugegraph-computer + gitTag=d2b95262091d6572cc12dcda57d89f9cd44ac88b + tool=mvn + jdk=8 + newline=lf + command="mvn -DskipTests=true -Dmaven.test.skip=true -Dmaven.site.skip=true -Drat.skip=true -Dmaven.javadoc.skip=true clean package" + buildinfo=target/computer-k8s-1.0.0.buildinfo + +You can now use this file to automate rebuilding artifacts, for example as part of the Reproducible Central infrastructure. + +************************ +Step 4: Build Validation +************************ + +Validating builds is a crucial post-build step that should be performed independently of the build process. Once a build is complete, it is essential to verify that the resulting artifacts meet the established expectations and accurately reflect the original source. Validation techniques vary, ranging from bitwise equivalence, where the artifacts must match exactly at the binary level, to semantic equivalence, which ensures functional similarity even when the binary outputs differ. Each approach offers distinct advantages depending on the specific context. + +For example, `Daleq `_ is a tool that disassembles Java bytecode into an intermediate representation to infer equivalence between Java classes. Daleq is based on recent `research `_ that proposes practical levels for establishing binary equivalence. To learn more about how Daleq works, see the `paper `_. + +******************************* +How It Works: Behind the Scenes +******************************* + +The ``gen-build-spec`` command works as follows: + +- Extracts metadata and build information from Macaron’s local SQLite database. +- Parses and modifies build commands from CI/CD configurations to ensure compatibility with rebuild systems. 
+- Identifies the JDK version by parsing CI/CD configurations or extracting it from the ``META-INF/MANIFEST.MF`` file in Maven Central artifacts. +- Ensures that only the major JDK version is included, as required by the build specification format. + + +This feature is described in more detail in our accepted ASE 2025 Industry Showcase paper: `Unlocking Reproducibility: Automating the Re-Build Process for Open-Source Software `_. + +*********************************** +Frequently Asked Questions (FAQs) +*********************************** + +*Q: What formats are supported for buildspec output?* +A: Currently, only ``rc-buildspec`` is supported. + +*Q: Do I need to analyze the package every time before generating a buildspec?* +A: No, you only need to analyze the package once unless you want to update the database with newer information. + +*Q: Can Macaron generate buildspecs for other ecosystems besides Maven?* +A: Not yet. Only Maven packages built with Maven or Gradle are currently supported, and ecosystem support is actively expanding. See :ref:`Supported Builds ` for the latest details. + +*********************************** +Future Work and Contributions +*********************************** + +We plan to support more ecosystems, deeper integration with artifact repositories, and more user-configurable buildspec options. Contributions are welcome! + +*********************************** +See Also +*********************************** + +- :ref:`Output Files Guide ` +- :ref:`installation-guide` +- :ref:`Supported Builds `
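
****************************************
Example: Reading a Buildspec in Shell
****************************************

The buildspec shown in Step 3 is a plain ``key=value`` file, so a script can read individual fields from it before starting a rebuild. The following is a minimal sketch under that assumption only: ``buildspec_get`` is a hypothetical helper name that is not part of Macaron or Reproducible Central, and the sample file contains just a subset of the fields from Step 3. It is meant to illustrate the file layout, not to replace the Reproducible Central tooling.

```shell
#!/usr/bin/env bash
# Sketch only: read values from a key=value buildspec such as the one in Step 3.
# buildspec_get is a hypothetical helper name, not part of Macaron.
set -eu

buildspec_get() {
    # $1: key to look up, $2: path to the buildspec file
    awk -F= -v key="$1" '
        $1 == key {
            sub(/^[^=]*=/, "")   # drop the key and first "=", keep later "=" signs
            gsub(/^"|"$/, "")    # strip surrounding double quotes, if present
            print
            exit
        }
    ' "$2"
}

# Sample file with a subset of the fields shown in Step 3.
spec="$(mktemp)"
trap 'rm -f "$spec"' EXIT
cat > "$spec" <<'EOF'
# Generated by Macaron version 0.18.0

tool=mvn
jdk=8
command="mvn -DskipTests=true clean package"
EOF

echo "tool:    $(buildspec_get tool "$spec")"
echo "jdk:     $(buildspec_get jdk "$spec")"
echo "command: $(buildspec_get command "$spec")"
```

To try it against a generated file, pass the path from Step 2 instead of the temporary sample, for example ``buildspec_get jdk output/maven/org_apache_hugegraph/computer-k8s/macaron.buildspec``. Comment lines and unknown keys simply produce no output.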