@dmcilvaney dmcilvaney commented Dec 11, 2023

Merge Checklist

All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)

  • The toolchain has been rebuilt successfully (or no changes were made to it)
  • The toolchain/worker package manifests are up-to-date
  • Any updated packages successfully build (or no packages were changed)
  • Packages depending on static components modified in this PR (Golang, *-static subpackages, etc.) have had their Release tag incremented.
  • Package tests (%check section) have been verified with RUN_CHECK=y for existing SPEC files, or added to new SPEC files
  • All package sources are available
  • cgmanifest files are up-to-date and sorted (./cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
  • LICENSE-MAP files are up-to-date (./SPECS/LICENSES-AND-NOTICES/data/licenses.json, ./SPECS/LICENSES-AND-NOTICES/LICENSES-MAP.md, ./SPECS/LICENSES-AND-NOTICES/LICENSE-EXCEPTIONS.PHOTON)
  • All source files have up-to-date hashes in the *.signatures.json files
  • sudo make go-tidy-all and sudo make go-test-coverage pass
  • Documentation has been updated to match any changes to the build system
  • Ready to merge

Summary

Part of #6892, related to #6788

This is part of a series of PRs for the new toolchain prototype. Some PRs are fundamental bug fixes for 2.0, some are new features better targeted at 3.0, and some are early prototypes/RFCs.

Destination

2.0 - Maybe
3.0 - Yes
RFC - Yes

Specific PR summary:

Copied from #6788:

I'm investigating a new wrapper around the toolchain bootstrap scripts, and I'd like some input on the various modes the toolchain can operate in.

The "build" mode can be one of Auto, Fast, Force, Never. Generally, Auto will try to get a toolchain as fast as possible and ideally should be the default going forward.

The archive is a path to a toolchain.tar.gz, and will take precedence over locally built/downloaded packages.

A new feature will allow automatic updating of the manifest files. This is forward-looking and likely won't be in V1.

There is also a global cache flag. It doesn't affect the overall behavior of the tools, just short-circuits certain build steps using pre-built parts available locally in the dev environment.

Background:
The prototype of the wrapper tool currently lives inside a new toolchainV2.mk, which can fully replace the existing toolchain.mk. It calls a Go tool, which in turn calls the existing toolchain scripts.

Currently my intent is to leave the toolchain scripts unmodified. Future iterations of the tool can move logic out of the shell scripts into the Go tool to simplify the process. The tool will examine the local environment and decide the optimal set of operations needed to generate a toolchain: downloading, building the raw bootstrap, and building the official rpms.

The Go tool as it currently exists is not suitable for calling directly. Because it's a wrapper for the scripts, it still needs to pass dozens of parameters through to them, so it takes a lot of parameters itself. Instead, it provides a make recipe for producing the toolchain .rpms the same way the existing toolchain.mk does.

Main tool (toolchain.go)

toolchain.go is the main command line tool that takes responsibility for managing the toolchain. It takes as input the same set of arguments that are passed to both existing toolchain scripts. The tool behaves as follows:

  • Generate a configuration based on the value of REBUILD_TOOLCHAIN (one of always, fast, auto, never). This translates into a set of configurations:
type buildConfig struct {
	doBuild                 bool
	doDeltaBuild            bool
	doDownload              bool
	doCache                 bool
	doUpdateManifestArchive bool
	doUpdateManifestPMC     bool
	doUpdateManifestLocal   bool
	doArchive               bool
}
  • Currently doUpdateManifest* is unsupported, but in theory might update the manifest files to match some external source of truth.

  • doArchive extracts toolchain packages from a .tar.gz file, passed via TOOLCHAIN_ARCHIVE=.

  • The general flow is as follows (generally, after each step it will check whether the toolchain rpms are ready and bail out early if so).

    1. Remove any extra rpms from the toolchain folder; we only want the exact set from the manifest.
    2. If doArchive is set, skip to end.
    3. If doDownload is set (yes unless never or always is used), the toolchain will attempt to download everything it can from PMC.
    4. Build a bootstrap .tar.gz using the existing script. Ideally this is pulled from cache (see later).
    5. Build the official .tar.gz. Again, ideally from cache (see later).
    6. If doDeltaBuild is set, copy all the matching rpms from ./build/toolchain_rpms into the build environment.
    7. Use the existing toolchain script to build all toolchain RPMs.
    8. Copy any that were built out to ./out/RPMS.
    9. Extract rpms to ./build/toolchain_rpms if they are missing, either from the official .tar.gz or, if provided, the TOOLCHAIN_ARCHIVE. Validate them against the manifest files.
    10. Final validation of the toolchain rpms. If everything is present, success.
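
To make the mode handling and flow above easier to discuss, here is a rough Go sketch, not the code in this PR. The function names configFor and plan are hypothetical. Only two rules come from the description: doDownload is set unless never or always is used, and doArchive follows TOOLCHAIN_ARCHIVE. The doBuild and doDeltaBuild choices per mode are guesses made for illustration, the doUpdateManifest* fields are left out because they are unsupported, and the "check after each step and bail out early" behaviour is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// buildConfig mirrors the struct in the description, minus the
// doUpdateManifest* fields (described as currently unsupported).
type buildConfig struct {
	doBuild      bool
	doDeltaBuild bool
	doDownload   bool
	doCache      bool
	doArchive    bool
}

// configFor maps a REBUILD_TOOLCHAIN value to a buildConfig.
// ASSUMPTION: the doBuild/doDeltaBuild choices below are illustrative guesses.
// Only "download unless never or always" is stated in the description.
func configFor(mode, archivePath string, useCache bool) (buildConfig, error) {
	cfg := buildConfig{doCache: useCache, doArchive: archivePath != ""}
	switch strings.ToLower(mode) {
	case "always":
		cfg.doBuild = true
	case "auto":
		cfg.doBuild, cfg.doDownload = true, true
	case "fast":
		cfg.doBuild, cfg.doDeltaBuild, cfg.doDownload = true, true, true
	case "never":
		// Neither download nor build.
	default:
		return buildConfig{}, fmt.Errorf("unknown REBUILD_TOOLCHAIN value %q", mode)
	}
	return cfg, nil
}

// plan lists, in order, the steps the flow would attempt for a config.
// The early-exit check after each step is intentionally not modelled.
func plan(cfg buildConfig) []string {
	steps := []string{"remove extra rpms"}
	if !cfg.doArchive { // doArchive skips straight to extraction
		if cfg.doDownload {
			steps = append(steps, "download rpms from PMC")
		}
		if cfg.doBuild {
			steps = append(steps, "build bootstrap .tar.gz", "build official .tar.gz")
			if cfg.doDeltaBuild {
				steps = append(steps, "copy existing rpms into build environment")
			}
			steps = append(steps, "build toolchain rpms", "copy built rpms to ./out/RPMS")
		}
	}
	return append(steps, "extract missing rpms", "validate against manifest")
}

func main() {
	cfg, err := configFor("auto", "", true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, step := range plan(cfg) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Keeping the plan as data makes it cheap to review which steps each mode triggers before wiring in the real scripts.
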
archive.go

Handles .tar.gz (or .gz) archives. Can check the contents, or extract the contents. This package still needs a bit of cleanup; there is a lot of overlap.

bootstrap.go

A 1:1 wrapper around the existing raw bootstrap script.

cache.go

A rudimentary cache is included in the PR. It likely isn't truly fit for purpose but shows a possible implementation. It can vastly improve build speed for repeated local builds, especially for the raw bootstrap toolchain. Ideally this should be shared amongst all mariner builds on a system, but in the short term it can still provide benefits even if operating locally to a single checkout.

It has a simple mode of operation: for a given set of input files (order matters), measure those files and generate a hash. Use the hash as a key to store the output file. For example, the official toolchain rpms take the scripts, manifest files, and raw toolchain archive as inputs. If those are all the same, the cache can provide a pre-built version of the toolchain archive.

A real implementation will need to handle errors better, likely track total size and eviction, etc.

downloader.go

Responsible for taking the manifest file and trying to download a copy of each rpm from the provided base URLs.
@SeanDougherty New xml repoquery goes here.

manifest.go

TODO: this is a stub implementation of possible manifest update functionality.

official.go

Wraps the official build script. It moves the rpms needed for incremental builds and the final build rpms to ./out/RPMS. Also responsible for cleaning up the toolchain chroot in the event of a failure.

Change Log
Does this affect the toolchain?

YES

Associated issues
Test Methodology
  • Local builds
  • Full tests TBD

@dmcilvaney dmcilvaney requested a review from a team as a code owner December 11, 2023 21:56
@microsoft-github-policy-service microsoft-github-policy-service bot added the main PR Destined for main label Dec 11, 2023
@dmcilvaney dmcilvaney added Tools go Pull requests that update Go code RFC Request For Comments 3.0-dev PRs Destined for AzureLinux 3.0 labels Dec 11, 2023
@@ -0,0 +1,254 @@
// Copyright (c) Microsoft Corporation.

Rename this so it doesn't collide with the existing command line downloader tool?

}

// Remove any unexpected RPMs from the toolchain directory.
func CleanToolchainRpms(toolchainDir string, toolchainRPMs []string) (filesRemoved []string, err error) {

Why in downloader?
