tc-build: Rewrite #226

nathanchance · 2023-01-30T20:36:33Z

This is currently a work in progress: I believe build-llvm.py is feature complete but I still need to port build-binutils.py over and rewire the basic CI. As build-llvm.py is the main point of this repository, I wanted to push this for initial testing and review as soon as possible.

There are quite a few breaking changes that I made in the pursuit of simplicity and maintainability. See the commit message of the rewrite commit for the full list; I do not want to be constantly editing both the commit message and the message of this pull request as things pop up that need to be mentioned. The rewrite commit hash is currently 2fb5c1b but it may change throughout review.

See that same commit message for some of the reasoning behind this; I think this is much more maintainable as everything is clearly separated and compartmentalized now. I apologize that the rewrite is a single commit instead of a series; I hope it will not be a burden to review. I tried to keep everything easily separated so that no single area is overwhelming to understand.

I welcome any and all bug reports but please keep them here rather than in the issues tab until this is merged.

cc @ConchuOD @dileks @kees, as I know you use the script at times, so I would like to avoid breaking your use cases.

ConchuOD · 2023-01-31T17:56:11Z

I gave my regular old toolchain build a go and that's working again. I haven't tried the PGO one yet, I'll give it a go tomorrow (if I remember!)

ConchuOD · 2023-01-31T18:01:41Z

I'm not overly sold on the deprecation & reuse of arguments, but I don't really view the backwards compat of this script as being particularly important. -b -> -r, -B -> -b might do something weird for some people I suppose. I just can't bring myself to care?

dileks · 2023-01-31T18:02:40Z

I would like to see that in its own Git branch.

Maybe I will give it a try with LLVM 16.0.0-rc2 (stage1-only), though I cannot promise. I am planning a ThinLTO + PGO x86_64-defconfig optimized LLVM toolchain with -rc3 or -rc4.

nathanchance · 2023-01-31T18:15:09Z

I gave my regular old toolchain build a go and that's working again. I haven't tried the PGO one yet, I'll give it a go tomorrow (if I remember!)

Appreciate it!

I'm not overly sold on the deprecation & reuse of arguments, but I don't really view the backwards compat of this script as being particularly important. -b -> -r, -B -> -b might do something weird for some people I suppose. I just can't bring myself to care?

Yeah, I am not the biggest fan of it either but I think the amount of options that the script has accrued over time is a little on the larger side so if I am already rewriting the scripts, it seems reasonable to do a one-time deprecation period.

Perhaps we could add a one time notice to the script that will notify the user of the new options like:

if not (flag_file := Path(tc_build_folder, '.user_notified_of_options_break')).exists():
    tc_build.utils.print_warning('The options to this script have changed! Please review them with "--help"...')
    time.sleep(10)
    flag_file.touch()

I would like to see that in its own Git branch.

I am not sure I understand this comment. Are you saying that even after this is merged, it should live in a different branch than main?

nathanchance · 2023-01-31T22:22:44Z

Alright, I believe this is ready for full testing, as build-binutils.py has been ported over.

As I mentioned above, if there are concerns with backwards compatibility, we could warn the user once then write a git ignored file to disk to avoid warning in the future. At the same time, if the user is pulling a new update, I would expect them to glance at the commit log to see what new changes they are getting. I do not feel strongly about it, I'll do whatever people feel is right.
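The warn-once idea can be sketched as a small standalone function. This is only an illustration, not code from the rewrite: the `notify_once` name and its `printer` parameter are hypothetical, the marker file name is the one from the snippet earlier in the thread, and the ten second pause from that snippet is left out so the sketch runs instantly.

```python
import tempfile
from pathlib import Path


def notify_once(state_folder, message, printer=print):
    """Print message only on the first call for state_folder.

    Returns True when the message was printed, False when the marker
    file from an earlier run already exists.
    """
    # Marker file that records the warning was shown; it would need to be
    # listed in .gitignore so that it never ends up in a commit.
    flag_file = Path(state_folder, '.user_notified_of_options_break')
    if flag_file.exists():
        return False
    printer(message)
    flag_file.touch()
    return True


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        msg = 'The options to this script have changed! Please review them with "--help"...'
        notify_once(tmp, msg)  # first run: prints the warning
        notify_once(tmp, msg)  # later runs: stays quiet
```

Keeping the state in a file inside the repository means the notice is shown once per checkout rather than once per user.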

These scripts have gotten a little too unwieldy, to the point where it is hard to follow the overall flow when making script wide changes. I have learned a lot about Python since I wrote this script so let's see if we can make something a little cleaner :)
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

The previous version of tc-build was hard to follow because data was passed from arguments down through a maze of functions, which made it hard to see where decisions were being made about that data and to make changes through the pipeline. This rewrite aims to avoid that complexity by having classes that encapsulate the logic of managing sources on disk and building these bits of software into clear and concise implementations. Now, the scripts focus on handling the user's input, instantiating classes with the data provided or default data when necessary, then invoking class methods that work on this data internally, which makes it way simpler to see where things are getting modified or changed. Additionally, by using classes, we get the benefit of sharing more code than before, which makes everything much more maintainable in the long run.

Right up front, this is documented a lot less through comments and docstrings than the previous iteration, as I wanted to try and make the logic self explanatory through descriptive class, method, and variable names, but there could be places where there is still confusion. Holler if anything needs to be clarified via a code comment.

Additionally, rather than being a 1:1 rewrite in terms of functionality, I made some internal changes that may have user visible effects:

* Many options have dropped their default value in argparse, as they are options for a reason. This allows the script to easily determine what the user's intent is and make better internal decisions like managing sources. A notable option that no longer has a default value is '--install-folder'. Now, if '--install-folder' is not specified, the toolchain is just left alone in the build folder. I think this makes sense, as it allows us to eliminate '--install-stage1-only'. The user can now just specify '--build-stage1-only' and '--install-folder', resulting in a deterministic outcome.

* Install folder is no longer specified by default, meaning that the toolchain just remains in the build folder once the build is completed. I think this makes more sense versus having a default installation folder but ignoring it for stage one but only if the user did not request installation. Now, if the user did not request explicit installation, they will not get it.

* The '--projects' and '--targets' options now take a list of projects and targets, rather than the semicolon separated string that would be passed along directly to cmake. This allows for easy validation of targets and feels more natural with how option handling worked for the rest of the options.

* Certain options have been eliminated that were deemed useless to port over:

  * '--incremental': I am not sure this has ever worked, as running cmake in a directory seems to be hit or miss at times. Getting rid of this option allows the script to eliminate cleaning up previous versions of internal files. I do not expect eliminating this option to burden many people. If build time is a problem, the user is likely doing a single stage build, in which case they can take advantage of ccache.

  * '--install-stage1-only': As mentioned above, the presence of '--install-folder' is now what determines if the final stage is installed.

* Certain options have been renamed due to dropping certain options or changing the meaning of the flag:

  * '-b' / '--branch' is now '-r' / '--ref'. '-b' is now short for '--build-folder' and '-B' is eliminated.

  * '--clang-vendor' is now '--vendor-string', as LLD supports a vendor string as well.

  * With '-i' / '--incremental' eliminated, '-I' / '--install-folder' now takes over that short option and '-I' is eliminated.

* The name of the build folder of a particular stage is now a little bit more descriptive around that particular stage:

    "stage1" (normal) -> "bootstrap"
    "stage1" ('--build-stage1-only') -> "final"
    "stage2" (normal) -> "final"
    "stage2" ('--pgo') -> "instrumented"
    "stage3" ('--pgo') -> "final"

* Fewer PGO kernel builds. Previously, if the user specified both 'kernel-defconfig' and 'kernel-allmodconfig', certain architectures would see duplicate builds because their 'allmodconfig'/'allyesconfig' is broken, so 'kernel/build.sh' would just build the defconfigs again, which unnecessarily extends the build time. To fix this, we now have a matrix of configuration targets ("defconfig", "allmodconfig", "allyesconfig"), which each have a list of builders. The LLVMKernelBuilder class handles generating this list of builders based on the desired configuration targets for benchmarking and the LLVM targets configured for the toolchain. A consequence of this matrix generation is that the situation of the user specifying both a "full" and "slim" configuration target becomes a little ambiguous; previously, they would have just both been built but now that we have this granularity available to us, it seems wise to take the conservative approach of honoring the "slim" variant and printing a warning.

* binutils are no longer built when doing kernel builds if they are not found in PATH; instead, the kernel build is skipped with a warning. This is partially due to the installation folder change mentioned above since binutils would need to be installed to the kernel's build folder if no installation folder was specified, which would happen every time the user did PGO. This is wasteful in my opinion; the user should just install their distribution's binutils or run build-binutils.py and pass that folder along via PATH:

    $ PATH=...:$PATH ./build-llvm.py ...

  It is not the end of the world if that kernel's backend does not get profiled.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
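Two of the changes in the commit message above lend themselves to a short sketch: the stage folder naming and list-valued '--targets'. The code below is an illustration under stated assumptions, not the rewrite's implementation. The function names and the `KNOWN_TARGETS` set are hypothetical, the choice to let '--build-stage1-only' win over '--pgo' is a guess, and `LLVM_TARGETS_TO_BUILD` is LLVM's own cmake variable for the semicolon separated target list.

```python
import argparse

# Hypothetical subset of LLVM backend names, used only for validation here.
KNOWN_TARGETS = {'AArch64', 'ARM', 'X86'}


def stage_folder_names(stage1_only=False, pgo=False):
    """Return build folder names in build order, per the mapping above."""
    if stage1_only:
        # Assumption: a single stage build ignores the PGO stages.
        return ['final']
    if pgo:
        return ['bootstrap', 'instrumented', 'final']
    return ['bootstrap', 'final']


def parse_targets(argv):
    """Parse '--targets' as a list and reject unknown names early."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--targets', nargs='+')
    args = parser.parse_args(argv)
    unknown = sorted(set(args.targets or []) - KNOWN_TARGETS)
    if unknown:
        parser.error(f"unknown targets: {', '.join(unknown)}")
    return args.targets


def cmake_list_define(key, values):
    """Join a Python list into the semicolon separated form cmake expects."""
    return f"-D{key}={';'.join(values)}"


if __name__ == '__main__':
    targets = parse_targets(['--targets', 'AArch64', 'X86'])
    print(cmake_list_define('LLVM_TARGETS_TO_BUILD', targets))
    print(stage_folder_names(pgo=True))
```

Taking a list from the user and joining it only at the cmake boundary is what makes the validation step possible: each name can be checked individually before any build starts.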

dileks · 2023-02-01T23:58:49Z

So you pushed your pull-226 patchset several times.

I would like to see:

tcbuild.github#Rewrite-v1
tcbuild.github#Rewrite-v2

...as branches beside main.

Easier for me to compare with main v1 v2 etc.
But might be my gusto - preferred style of working with Git.

git pull origin pull/226/head will take the latest one - and the previous versions?

BTW, git describe?
Might be worth giving your rewrite a v1.0 :-)?

nathanchance · 2023-02-02T00:05:22Z

So you pushed your pull-226 patchset several times.

I would like to see:

tcbuild.github#Rewrite-v1 tcbuild.github#Rewrite-v2

...as branches beside main.

Easier for me to compare with main v1 v2 etc. But might be my gusto - preferred style of working with Git.

GitHub's UI has Compare buttons next to my force pushes if you want to see the diffs between my pushes, but the rewrite was not complete until this most recent push. The first force push was for the build-binutils.py rewrite and the second force push was updating ci.sh.

git pull origin pull/226/head will take the latest one - and the previous versions?

The previous versions were either buggy or incomplete, as I mention above. Consider this most recent push as v1.0, I have marked it as ready for review.

Any future changes that are needed for this rewrite will be done as individual commits on top.

LLVM_BUILD_CCACHE is now considered deprecated, so we should use cmake's compiler launcher variables. These variables have been supported since cmake 3.4, which is much older than LLVM's minimum required version, so we can wholesale replace LLVM_BUILD_CCACHE with these variables, which will work for all LLVM releases.
Link: https://discourse.llvm.org/t/llvm-ccache-build-is-deprecated/68431
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
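A minimal sketch of what the replacement could look like, assuming the defines are collected in a dictionary before being turned into cmake arguments. The helper names are hypothetical; `CMAKE_C_COMPILER_LAUNCHER` and `CMAKE_CXX_COMPILER_LAUNCHER` are the cmake compiler launcher variables the message refers to.

```python
import shutil


def ccache_defines(ccache=None):
    """Return cmake defines that route compiler calls through ccache.

    ccache may be an explicit path; otherwise PATH is searched. An empty
    dictionary is returned when ccache is not available.
    """
    ccache = ccache or shutil.which('ccache')
    if not ccache:
        return {}
    return {
        'CMAKE_C_COMPILER_LAUNCHER': ccache,
        'CMAKE_CXX_COMPILER_LAUNCHER': ccache,
    }


def as_cmake_args(defines):
    """Turn a dictionary of defines into '-DKEY=VALUE' arguments."""
    return [f"-D{key}={value}" for key, value in defines.items()]


if __name__ == '__main__':
    print(as_cmake_args(ccache_defines()))
```

Because the launcher variables belong to cmake itself rather than to LLVM's build system, the same defines apply regardless of which LLVM release is being built.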

* origin/main:
  build-llvm.py: Fix setting LD fails due to inverted test-check (ClangBuiltLinux#228)
This is not relevant for the rewrite, so this is just an empty merge to avoid conflicts.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

ConchuOD · 2023-02-23T11:00:22Z

Something I just noticed today, Nathan: llvm and binutils don't use the same args for providing a build dir - is this an opportunity to align the two?
llvm uses -b & binutils uses -B (as -b is for the binutils source)

nathanchance · 2023-02-23T16:51:08Z

is this an opportunity to align the two?

Sure thing, might as well when making breaking changes: a7edf39

…ild-folder'
'-b' is the build folder shorthand in build-llvm.py. Use it for build-binutils.py as well, as people are more likely to change the build folder than the source being used.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

nathanchance · 2023-02-27T18:56:42Z

I have pushed away the ruff changes for the moment, nothing has functionally changed with the pull request.

This is an opinionated set of warnings from the full list that ruff supports.
Link: https://beta.ruff.rs/docs/rules/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

This has been tastefully edited to omit using commas for function calls but for iterables, it makes sense.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

build-llvm.py:404:31: C400 [*] Unnecessary generator (rewrite as a `list` comprehension)
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Resolves ruff warning:
tc_build/llvm.py:93:32: RUF005 [*] Consider `[self.tools.merge_fdata, *list(fdata_files)]` instead of concatenation
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build/llvm.py:232:9: SIM102 Use a single `if` statement instead of nested `if` statements
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

build-binutils.py:112:5: PLW2901 Outer for loop variable `target` overwritten by inner assignment target
build-llvm.py:595:13: PLW2901 Outer for loop variable `pgo_target` overwritten by inner assignment target
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

As recently suggested by ruff:
build-llvm.py:620:17: PLR5501 Consider using `elif` instead of `else` then `if` to remove one indentation level
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

This adds support for the ruff Python linter on top of the rewrite branch, which ensures that we get this right from the get go, rather than fixing on top. This is done as a separate series/merge to keep the history of the rewrite branch stable for testers.
* rewrite-ruff:
  build-llvm.py: Refactor else + if into elif to reduce indentation
  build-*.py Fix new ruff warnings around iteration variables
  tc_build: llvm: Combine nested if statement
  tc_build: llvm: Ignore exception using contextlib.suppress()
  tc_build: llvm: Use unpacking instead of concatenation
  build-llvm.py: Use comprehension instead of constructor plus generator
  build-llvm.py: Use ternary expression for targets assignment
  tc-build: Run 'ruff --fix --select COM .' and adjust formatting for yapf
  Add a ruff configuration file
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
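For readers unfamiliar with the rules named in these commits, the sketch below shows the shape of each rewrite on made-up data. None of it is taken from tc-build; the variable names, file names, and the `describe` function are placeholders chosen for illustration.

```python
import contextlib

# C400: build the list with a comprehension rather than list(<generator>).
name_lengths = [len(name) for name in ('clang', 'lld')]

# RUF005: unpack the iterable into the new list instead of concatenating.
fdata_files = ('a.fdata', 'b.fdata')  # placeholder file names
merge_cmd = ['merge-fdata', *fdata_files]

# contextlib.suppress() replaces a try/except/pass around an expected failure.
settings = {}
with contextlib.suppress(KeyError):
    del settings['missing']


def describe(stage, pgo):
    """Label a stage; the branches show the SIM102 and PLR5501 rewrites."""
    label = 'intermediate'
    # SIM102: one combined condition instead of an `if` nested in an `if`.
    if stage == 'final' and pgo:
        label = 'optimized'
    # PLR5501: `elif` instead of `else:` followed by an indented `if`.
    elif stage == 'final':
        label = 'plain'
    return label


# PLW2901: bind a new name rather than overwriting the loop variable.
cleaned = []
for target in (' arm ', 'x86_64 '):
    stripped = target.strip()
    cleaned.append(stripped)
```

Each rewrite keeps the behavior identical; the rules only flag forms that are longer or easier to misread than their replacement.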

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

-Werror can cause kernel builds to fail, which is a poor experience when just building the kernel for profiling coverage, so it is disabled with 'KCFLAGS=-Wno-error'. Unfortunately, arch/powerpc has a separate configuration option to control building just that subdirectory with -Werror, which overrides the -Wno-error that was provided. Disable -Werror for powerpc using the special 'disable-werror.config' target, which merges the configuration to disable -Werror into the existing configuration, aligning powerpc with the global -Wno-error we apply for every other architecture.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
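As a rough sketch of the approach described in this message, the make invocation could be assembled as below. The function name and the exact argument order are assumptions for illustration and may not match what the kernel builder classes in tc-build actually do; `ARCH=`, `LLVM=1`, and `KCFLAGS=` are standard kernel build variables.

```python
def kernel_make_command(arch, config_target, jobs=1):
    """Build a make command line for a profiling kernel build."""
    cmd = [
        'make',
        f"-j{jobs}",
        f"ARCH={arch}",
        'LLVM=1',
        # Applies to every architecture so warnings never fail the build.
        'KCFLAGS=-Wno-error',
        config_target,
    ]
    if arch == 'powerpc':
        # arch/powerpc has its own -Werror option, so merge the config
        # fragment that turns it off after the base configuration.
        cmd.append('disable-werror.config')
    cmd.append('all')
    return cmd


if __name__ == '__main__':
    print(' '.join(kernel_make_command('powerpc', 'defconfig', jobs=4)))
```

The fragment target has to come after the base configuration target so that it is merged into an existing configuration rather than being overwritten by it.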

This matches our continuous integration and fixes building LLVM with PGO prior to 13.0.0.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

LLVM_VP_COUNTERS_PER_SITE does not exist in LLVM 11.x but the flags controlled by it are still needed when doing PGO with LLVM 11.x to avoid the same spew of warnings. Provide these flags ourselves via CMAKE_{C,CXX}_FLAGS.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

See the issue in the comments for more details.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

This gives builders flexibility with what exactly gets installed.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Changing the default target triple potentially ties a toolchain to a specific operating system. Allow opting out of this by setting DISTRIBUTING=1 in the environment.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
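One way such an opt-out could be wired up is sketched here. The function name is hypothetical, the host triple is passed in rather than detected, and treating any non-empty value of DISTRIBUTING as an opt-out is an assumption; `LLVM_DEFAULT_TARGET_TRIPLE` is LLVM's cmake variable for the default target triple.

```python
import os


def default_triple_defines(host_triple, environ=None):
    """Return the cmake define for the default triple unless opted out."""
    environ = os.environ if environ is None else environ
    # Assumption: any non-empty value of DISTRIBUTING opts out.
    if environ.get('DISTRIBUTING'):
        return {}
    return {'LLVM_DEFAULT_TARGET_TRIPLE': host_triple}


if __name__ == '__main__':
    print(default_triple_defines('x86_64-linux-gnu'))
```

Running the build as `DISTRIBUTING=1 ./build-llvm.py ...` would then leave the default target triple untouched.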

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

nickdesaulniers · 2023-03-17T17:33:17Z

tc_build/builder.py

+import subprocess
+
+
+class Folders:


It's not super common to have class names be plural. Having a class Folder, then having a list of instances of Folder objects with the member or variable identifier of folders would perhaps be more ergonomic.

It's not super common to have class names be plural.

Understandable.

Having a class Folder, then having a list of instances of Folder objects with the member or variable identifier of folders would perhaps be more ergonomic.

The whole point of the class is to keep the folders grouped together and easily identifiable. At that point, I might as well just turn this into a dictionary, right?

https://gist.github.com/nathanchance/06016b8ac2afcdd50e2fea6432136b33

I do not really have a strong opinion though.
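To make the two options in this exchange concrete, here is a small sketch of a grouping class next to the dictionary alternative. The field names are hypothetical and are not taken from tc_build/builder.py or from the linked gist.

```python
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Folders:
    """Group the folders a builder works with under one name."""
    build: Optional[Path] = None
    install: Optional[Path] = None
    source: Optional[Path] = None


# Class form: folders are attributes, so the set of names is fixed in one place.
folders = Folders(build=Path('build/llvm'))

# Dictionary form: the same data, with the names living in the keys.
folders_dict = {
    'build': Path('build/llvm'),
    'install': None,
    'source': None,
}

if __name__ == '__main__':
    print(folders.build, folders_dict['build'])
```

Both forms carry the same information; the class mainly adds a single declaration of which folders exist, which editors and linters can check.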

tc_build/kernel.py

tc_build/llvm.py

tc_build/source.py

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

nathanchance · 2023-03-21T17:07:31Z

I plan to merge this in its current state in a few hours, I think it has had enough time for review and testing. I am happy to address any follow up comments with new pull requests, should there be any. Please file issues for any bugs found after the merge for easy tracking.

MajorP93 · 2023-03-21T18:35:06Z

@nathanchance Will there be differences for users? Will the command line arguments be the same?

nathanchance · 2023-03-21T18:44:28Z

@MajorP93 yes, there are some breaking changes, most of which you can read about in 8bc3594.

You can see the command line arguments for each here to prepare for any changes you may have to make:

tc-build/build-binutils.py

Lines 14 to 67 in 0c1452e

    
    parser.add_argument('-B',
                        '--binutils-folder',
                        help='''
                        By default, the script will download a copy of the binutils source in the src folder within
                        the same folder as this script. If you have your own copy of the binutils source that you
                        would like to build from, pass it to this parameter. It can be either an absolute or
                        relative path.
                        ''',
                        type=str)
    parser.add_argument('-b',
                        '--build-folder',
                        help='''
                        By default, the script will create a "build/binutils" folder in the same folder as this
                        script then build each target in its own folder within that containing folder. If you
                        would like the containing build folder to be somewhere else, pass it to this parameter.
                        It can be either an absolute or relative path.
                        ''',
                        type=str)
    parser.add_argument('-i',
                        '--install-folder',
                        help='''
                        By default, the script will build binutils but stop before installing it. To install
                        them into a prefix, pass it to this parameter. This can be either an absolute or
                        relative path.
                        ''',
                        type=str)
    parser.add_argument('-m',
                        '--march',
                        metavar='ARCH',
                        help='''
                        Add -march=ARCH and -mtune=ARCH to CFLAGS to optimize the toolchain for the target
                        host processor.
                        ''',
                        type=str)
    parser.add_argument('--show-build-commands',
                        help='''
                        By default, the script only shows the output of the commands it is running. When this option
                        is enabled, the invocations of configure and make will be shown to help with reproducing
                        issues outside of the script.
                        ''',
                        action='store_true')
    parser.add_argument('-t',
                        '--targets',
                        help='''
                        The script can build binutils targeting arm-linux-gnueabi, aarch64-linux-gnu,
                        mips-linux-gnu, mipsel-linux-gnu, powerpc-linux-gnu, powerpc64-linux-gnu,
                        powerpc64le-linux-gnu, riscv64-linux-gnu, s390x-linux-gnu, and x86_64-linux-gnu.
                        By default, it builds all supported targets ("all"). If you would like to build
                        specific targets only, pass them to this script. It can be either the full target
                        or just the first part (arm, aarch64, x86_64, etc).
                        ''',
                        nargs='+')
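The '--targets' help text says a target can be given in full or by its first part. A minimal sketch of how that lookup could work follows; the function name and error handling are hypothetical and this is not the code in build-binutils.py, but the target list is the one from the help text.

```python
SUPPORTED_TARGETS = [
    'arm-linux-gnueabi',
    'aarch64-linux-gnu',
    'mips-linux-gnu',
    'mipsel-linux-gnu',
    'powerpc-linux-gnu',
    'powerpc64-linux-gnu',
    'powerpc64le-linux-gnu',
    'riscv64-linux-gnu',
    's390x-linux-gnu',
    'x86_64-linux-gnu',
]


def resolve_targets(requested=None):
    """Expand short names such as 'arm' into full target triples."""
    if not requested or 'all' in requested:
        return list(SUPPORTED_TARGETS)
    # Map the part before the first dash to its full target.
    by_prefix = {full.split('-', 1)[0]: full for full in SUPPORTED_TARGETS}
    resolved = []
    for item in requested:
        if item in SUPPORTED_TARGETS:
            full = item
        elif item in by_prefix:
            full = by_prefix[item]
        else:
            raise ValueError(f"unsupported target: {item}")
        # Keep the order given by the user and drop duplicates.
        if full not in resolved:
            resolved.append(full)
    return resolved


if __name__ == '__main__':
    print(resolve_targets(['arm', 'x86_64']))
```

The first parts are unique across the supported list, so the prefix lookup is unambiguous even for similar names such as powerpc, powerpc64, and powerpc64le.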

tc-build/build-llvm.py

Lines 26 to 399 in 0c1452e

    
           parser.add_argument('--assertions', 
        
                               help=textwrap.dedent('''\ 
        
                               In a release configuration, assertions are not enabled. Assertions can help catch 
        
                               issues when compiling but it will increase compile times by 15-20%%. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('-b', 
        
                               '--build-folder', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script will create a "build/llvm" folder in the same folder as this 
        
                               script and build each requested stage within that containing folder. To change the 
        
                               location of the containing build folder, pass it to this parameter. This can be either 
        
                               an absolute or relative path. 
        
                               '''), 
        
                               type=str) 
        
           parser.add_argument('--bolt', 
        
                               help=textwrap.dedent('''\ 
        
                               Optimize the final clang binary with BOLT (Binary Optimization and Layout Tool), which can 
        
                               often improve compile time performance by 5-7%% on average. 
        
                               This is similar to Profile Guided Optimization (PGO) but it happens against the final 
        
                               binary that is built. The script will: 
        
                               1. Figure out if perf can be used with branch sampling. You can test this ahead of time by 
        
                                  running: 
        
                                  $ perf record --branch-filter any,u --event cycles:u --output /dev/null -- sleep 1 
        
                               2. If perf cannot be used, the clang binary will be instrumented by llvm-bolt, which will 
        
                                  result in a much slower clang binary. 
        
                                  NOTE #1: When this instrumentation is combined with a build of LLVM that has already 
        
                                           been PGO'd (i.e., the '--pgo' flag) without LLVM's internal assertions (i.e., 
        
                                           no '--assertions' flag), there might be a crash when attempting to run the 
        
                                           instrumented clang: 
        
                                           https://github.com/llvm/llvm-project/issues/55004 
        
                                           To avoid this, pass '--assertions' with '--bolt --pgo'. 
        
                                  NOTE #2: BOLT's instrumentation might not be compatible with architectures other than 
        
                                           x86_64 and build-llvm.py's implementation has only been validated on x86_64 
        
                                           machines: 
        
                                           https://github.com/llvm/llvm-project/issues/55005 
        
                                           BOLT itself only appears to support AArch64 and x86_64 as of LLVM commit 
        
                                           a0b8ab1ba3165d468792cf0032fce274c7d624e1. 
        
                               3. A kernel will be built and profiled. This will either be the host architecture's 
        
                                  defconfig or the first target's defconfig if '--targets' is specified without support 
        
                                  for the host architecture. The profiling data will be quite large, so it is imperative 
        
                                  that you have ample disk space and memory when attempting to do this. With instrumentation, 
        
                                  a profile will be generated for each invocation (PID) of clang, so this data could easily 
        
                                  be a couple hundred gigabytes large. 
        
                               4. The clang binary will be optimized with BOLT using the profile generated above. This can 
        
                                  take some time. 
        
                                  NOTE #3: Versions of BOLT without commit 7d7771f34d14 ("[BOLT] Compact legacy profiles") 
        
                                           will use significantly more memory during this stage if instrumentation is used 
        
                                           because the merged profile is not as slim as it could be. Either upgrade to a 
        
                                           version of LLVM that contains that change or pick it yourself, switch to perf if 
        
                                           your machine supports it, upgrade the amount of memory you have (if possible), 
        
                                           or run build-llvm.py without '--bolt'. 
        
                               '''), 
        
                               action='store_true') 
        
           opt_options.add_argument('--build-stage1-only', 
        
                                    help=textwrap.dedent('''\ 
        
                               By default, the script does a multi-stage build: it builds a more lightweight version of 
        
                               LLVM first (stage 1) then uses that build to build the full toolchain (stage 2). This 
        
                               is also known as bootstrapping. 
        
                               This option avoids that, building the first stage as if it were the final stage. Note, 
        
                               this option is more intended for quick testing and verification of issues and not regular 
        
                               use. However, if your system is slow or can't handle 2+ stage builds, you may need this flag. 
        
                                    '''), 
        
                                    action='store_true') 
        
           # yapf: disable 
        
           parser.add_argument('--build-type', 
        
                               metavar='BUILD_TYPE', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script does a Release build; Debug may be useful for tracking down 
        
                               particularly nasty bugs. 
        
                               See https://llvm.org/docs/GettingStarted.html#compiling-the-llvm-suite-source-code for 
        
                               more information. 
        
                               '''), 
        
                               type=str, 
        
                               choices=['Release', 'Debug', 'RelWithDebInfo', 'MinSizeRel']) 
        
           # yapf: enable 
        
           parser.add_argument('--check-targets', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, no testing is run on the toolchain. If you would like to run unit/regression 
        
                               tests, use this parameter to specify a list of check targets to run with ninja. Common 
        
                               ones include check-llvm, check-clang, and check-lld. 
        
                               The values passed to this parameter will be automatically concatenated with 'check-'. 
        
                               Example: '--check-targets clang llvm' will make ninja invokve 'check-clang' and 'check-llvm'. 
        
                               '''), 
        
                               nargs='+') 
        
           parser.add_argument('-D', 
        
                               '--defines', 
        
                               help=textwrap.dedent('''\ 
        
                               Specify additional cmake values. These will be applied to all cmake invocations. 
        
                               Example: -D LLVM_PARALLEL_COMPILE_JOBS=2 LLVM_PARALLEL_LINK_JOBS=2 
        
                               See https://llvm.org/docs/CMake.html for various cmake values. Note that some of 
        
                               the options to this script correspond to cmake values. 
        
                               '''), 
        
                               nargs='+') 
        
           parser.add_argument('-f', 
        
                               '--full-toolchain', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script tunes LLVM for building the Linux kernel by disabling several 
        
                               projects, targets, and configuration options, which speeds up build times but limits 
        
                               how the toolchain could be used. 
        
                               With this option, all projects and targets are enabled and the script tries to avoid 
        
                               unnecessarily turning off configuration options. The '--projects' and '--targets' options 
        
                               to the script can still be used to change the list of projects and targets. This is 
        
                               useful when using the script to do upstream LLVM development or trying to use LLVM as a 
        
                               system-wide toolchain. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('-i', 
        
                               '--install-folder', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script will leave the toolchain in its build folder. To install it 
        
                               outside the build folder for persistent use, pass the installation location that you 
        
                               desire to this parameter. This can be either an absolute or relative path. 
        
                               '''), 
        
                               type=str) 
        
           parser.add_argument('--install-targets', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script will just run the 'install' target to install the toolchain to 
        
                               the desired prefix. To produce a slimmer toolchain, specify the desired targets to 
        
                               install using this option.
        
                               The values passed to this parameter will be automatically prepended with 'install-'. 
        
                               Example: '--install-targets clang lld' will make ninja invoke 'install-clang' and 
        
                                        'install-lld'. 
        
                               '''), 
        
                               nargs='+') 
        
           parser.add_argument('-l', 
        
                               '--llvm-folder', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script will clone the llvm-project into the tc-build repo. If you have 
        
                               another LLVM checkout that you would like to work out of, pass it to this parameter. 
        
                               This can either be an absolute or relative path. Implies '--no-update'. When this 
        
                               option is supplied, '--ref' and '--use-good-revision' do nothing, as the script does
        
                               not manipulate a repository it does not own. 
        
                               '''), 
        
                               type=str) 
        
           parser.add_argument('-L', 
        
                               '--linux-folder', 
        
                               help=textwrap.dedent('''\ 
        
                               If building with PGO, use this kernel source for building profiles instead of downloading 
        
                               a tarball from kernel.org. This should be the full or relative path to a complete kernel 
        
                               source directory, not a tarball or zip file. 
        
                               '''), 
        
                               type=str) 
        
           parser.add_argument('--lto', 
        
                               metavar='LTO_TYPE', 
        
                               help=textwrap.dedent('''\ 
        
                               Build the final compiler with either ThinLTO (thin) or full LTO (full), which can 
        
                               often improve compile time performance by 3-5%% on average. 
        
                               Only use full LTO if you have more than 64 GB of memory. ThinLTO uses way less memory, 
        
                               compiles faster because it is fully multithreaded, and it has almost identical 
        
                               performance (within 1%% usually) to full LTO. A ThinLTO build takes about 5x as long as
        
                               a '--build-stage1-only' build and 3.5x as long as a default build. Full LTO
        
                               is much worse and is not worth considering unless you have a server available to build on.
        
                               This option should not be used with '--build-stage1-only' unless you know that your 
        
                               host compiler and linker support it. See the two links below for more information. 
        
                               https://llvm.org/docs/LinkTimeOptimization.html 
        
                               https://clang.llvm.org/docs/ThinLTO.html 
        
                               '''), 
        
                               type=str, 
        
                               choices=['thin', 'full']) 
        
           parser.add_argument('-n', 
        
                               '--no-update', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script always updates the LLVM repo before building. This prevents 
        
                               that, which can be helpful during something like bisecting or manually managing the 
        
                               repo to pin it to a particular revision. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('--no-ccache', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script adds LLVM_CCACHE_BUILD to the cmake options so that ccache is 
        
                               used for the stage one build. This helps speed up compiles but it is only useful for 
        
                               stage one, which is built using the host compiler, which usually does not change, 
        
                               resulting in more cache hits. Subsequent stages will be always completely clean builds 
        
                               since ccache will have no hits due to using a new compiler and it will unnecessarily 
        
                               fill up the cache with files that will never be called again due to changing compilers 
        
                               on the next build. This option prevents ccache from being used even at stage one, which 
        
                               could be useful for benchmarking clean builds. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('-p', 
        
                               '--projects', 
        
                               help=textwrap.dedent('''\ 
        
                               Currently, the script only enables the clang, compiler-rt, lld, and polly folders in LLVM. 
        
                               If you would like to override this, you can use this parameter and supply a list that is 
        
                               supported by LLVM_ENABLE_PROJECTS. 
        
                               See step #5 here: https://llvm.org/docs/GettingStarted.html#getting-started-quickly-a-summary 
        
                               Example: -p clang lld polly 
        
                               '''), 
        
                               nargs='+') 
        
           opt_options.add_argument('--pgo', 
        
                                    metavar='PGO_BENCHMARK', 
        
                                    help=textwrap.dedent('''\ 
        
                               Build the final compiler with Profile Guided Optimization, which can often improve compile 
        
                               time performance by 15-20%% on average. The script will: 
        
                               1. Build a small bootstrap compiler like usual (stage 1). 
        
                               2. Build an instrumented compiler with that compiler (stage 2). 
        
                               3. Run the specified benchmark(s). 
        
                                  kernel-defconfig, kernel-allmodconfig, kernel-allyesconfig: 
        
                                  Download and extract kernel source from kernel.org (unless '--linux-folder' is 
        
                                  specified) and build some kernels based on the requested config with the instrumented 
        
                                  compiler (based on the '--targets' option). If there is a build error with one of the 
        
                                  kernels, build-llvm.py will fail as well. 
        
                                  kernel-defconfig-slim, kernel-allmodconfig-slim, kernel-allyesconfig-slim: 
        
                                  Same as above but only one kernel will be built. If the host architecture is in the list 
        
                                  of targets, that architecture's requested config will be built; otherwise, the config of 
        
                                  the first architecture in '--targets' will be built. This will result in a less optimized 
        
                                  toolchain than the full variant above but it will result in less time spent profiling, 
        
                                  which means less build time overall. This might be worthwhile if you want to take advantage 
        
                                  of PGO on slower machines. 
        
                                  llvm: 
        
                                  The script will run the LLVM tests if they were requested via '--check-targets' then 
        
                                  build a full LLVM toolchain with the instrumented compiler. 
        
                               4. Build a final compiler with the profile data generated from step 3 (stage 3). 
        
                               Due to the nature of this process, '--build-stage1-only' cannot be used. There will be 
        
                               three distinct LLVM build folders/compilers and several kernel builds done by default so 
        
                               ensure that you have enough space on your disk to hold this (25GB should be enough) and the 
        
                               time/patience to build three toolchains and kernels (will often take 5x the amount of time 
        
                               as '--build-stage1-only' and 4x the amount of time as the default two-stage build that the 
        
                               script does). When combined with '--lto', the compile time impact is about 9-10x that of a
        
                               one or two stage build.
        
                               See https://llvm.org/docs/HowToBuildWithPGO.html for more information. 
        
                                    '''), 
        
                                    nargs='+', 
        
                                    choices=[ 
        
                                        'kernel-defconfig', 
        
                                        'kernel-allmodconfig', 
        
                                        'kernel-allyesconfig', 
        
                                        'kernel-defconfig-slim', 
        
                                        'kernel-allmodconfig-slim', 
        
                                        'kernel-allyesconfig-slim', 
        
                                        'llvm', 
        
                                    ]) 
        
           parser.add_argument('--quiet-cmake', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script shows all output from cmake. When this option is enabled, the 
        
                               invocations of cmake will only show warnings and errors. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('-r', 
        
                               '--ref', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script builds the main branch (tip of tree) of LLVM. If you would 
        
                               like to build an older branch, use this parameter. This may be helpful in tracking 
        
                               down an older bug to properly bisect. This value is just passed along to 'git checkout' 
        
                               so it can be a branch name, tag name, or hash (unless '--shallow-clone' is used, which 
        
                               means a hash cannot be used because GitHub does not allow it). This will have no effect 
        
                               if '--llvm-folder' is provided, as the script does not manipulate a repository that it 
        
                               does not own. 
        
                               '''), 
        
                               default='main', 
        
                               type=str) 
        
           clone_options.add_argument('-s', 
        
                                      '--shallow-clone', 
        
                                      help=textwrap.dedent('''\ 
        
                               Only fetch the required objects and omit history when cloning the LLVM repo. This 
        
                               option is only used for the initial clone, not subsequent fetches. This can break 
        
                               the script's ability to automatically update the repo to newer revisions or branches 
        
                               so be careful using this. This option is really designed for continuous integration 
        
                               runs, where a one off clone is necessary. A better option is usually managing the repo 
        
                               yourself: 
        
                               https://github.com/ClangBuiltLinux/tc-build#build-llvmpy 
        
                               A couple of notes: 
        
                               1. This cannot be used with '--use-good-revision'. 
        
                               2. When no '--ref' is specified, only main is fetched. To work with other branches,
        
                                  a branch other than main needs to be specified when the repo is first cloned. 
        
                                      '''), 
        
                                      action='store_true') 
        
           parser.add_argument('--show-build-commands', 
        
                               help=textwrap.dedent('''\ 
        
                               By default, the script only shows the output of the commands it is running. When this option
        
                               is enabled, the invocations of cmake, ninja, and make will be shown to help with 
        
                               reproducing issues outside of the script. 
        
                               '''), 
        
                               action='store_true') 
        
           parser.add_argument('-t', 
        
                               '--targets', 
        
                               help=textwrap.dedent('''\ 
        
                               LLVM is multitargeted by default. Currently, this script only enables the arm32, aarch64, 
        
                               bpf, mips, powerpc, riscv, s390, and x86 backends because that's what the Linux kernel is 
        
                               currently concerned with. If you would like to override this, you can use this parameter 
        
                               and supply a list of targets supported by LLVM_TARGETS_TO_BUILD: 
        
                               https://llvm.org/docs/CMake.html#llvm-specific-variables 
        
                               Example: -t AArch64 ARM X86 
        
                               '''), 
        
                               nargs='+') 
        
           clone_options.add_argument('--use-good-revision', 
        
                                      help=textwrap.dedent('''\ 
        
                               By default, the script updates LLVM to the latest tip of tree revision, which may at times be 
        
                               broken or not work right. With this option, it will check out a known good revision of LLVM
        
                               that builds and works properly. If you use this option often, please remember to update the 
        
                               script as the known good revision will change. 
        
                               NOTE: This option cannot be used with '--shallow-clone'. 
        
                                      '''), 
        
                                      action='store_const', 
        
                                      const=GOOD_REVISION, 
        
                                      dest='ref') 
        
           parser.add_argument('--vendor-string', 
        
                               help=textwrap.dedent('''\ 
        
                               Add this value to the clang and ld.lld version string (like "Apple clang version..." 
        
                               or "Android clang version..."). Useful when reverting or applying patches on top 
        
                               of upstream clang to differentiate a toolchain built with this script from 
        
                               upstream clang or to distinguish a toolchain built with this script from the 
        
                               system's clang. Defaults to ClangBuiltLinux, can be set to an empty string to 
        
                               override this and have no vendor in the version string. 
        
                               '''), 
        
                               type=str, 
        
                               default='ClangBuiltLinux')
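
The prefixing described for '--check-targets' and '--install-targets', and the way '--use-good-revision' shares its destination with '--ref', can be sketched with plain argparse. This is a minimal illustration rather than the code in build-llvm.py: `GOOD_REVISION` is a placeholder value here and `ninja_targets()` is a hypothetical helper.

```python
import argparse

# Placeholder only; the real script pins an actual LLVM revision.
GOOD_REVISION = 'known-good-placeholder'


def make_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--check-targets', nargs='+')
    parser.add_argument('--install-targets', nargs='+')
    # '--ref' is registered first, so its default of 'main' is what ends up
    # in args.ref when neither option is passed.
    parser.add_argument('-r', '--ref', default='main', type=str)
    # Same destination as '--ref': passing this flag stores GOOD_REVISION there.
    parser.add_argument('--use-good-revision',
                        action='store_const',
                        const=GOOD_REVISION,
                        dest='ref')
    return parser


def ninja_targets(prefix, values):
    """Hypothetical helper: ['clang', 'lld'] -> ['install-clang', 'install-lld']."""
    return [f'{prefix}-{value}' for value in values or []]


args = make_parser().parse_args(
    ['--check-targets', 'clang', 'llvm', '--use-good-revision'])
check = ninja_targets('check', args.check_targets)
install = ninja_targets('install', args.install_targets)
```

Because both options write to `args.ref`, the rest of a script only needs to look at one attribute, whichever way the revision was chosen.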

You can see the changes that I had to make for my own personal wrappers here:

nathanchance/env@a8ec5ce

Fixes the following traceback when using '--use-good-revision':

  Traceback (most recent call last):
    File ".../tc-build/./build-llvm.py", line 460, in <module>
      llvm_source.update(args.ref)
    File ".../tc-build/tc_build/llvm.py", line 541, in update
      if local_ref and local_ref.startswith('refs/heads/'):
  UnboundLocalError: local variable 'local_ref' referenced before assignment

Fixes: 5cf3ec4 ("tc_build: llvm: Ignore exception using contextlib.suppress()")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
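
For readers unfamiliar with this failure mode, here is a small sketch of how `contextlib.suppress()` can leave a name unbound, and one common way to avoid it. The function `current_ref()` and the exception type are made up for illustration; the actual change in tc_build/llvm.py is not shown in this thread and may differ.

```python
import contextlib


def current_ref(fail):
    """Hypothetical stand-in for a git query that can fail."""
    if fail:
        raise RuntimeError('not on a branch')
    return 'refs/heads/main'


def buggy(fail):
    with contextlib.suppress(RuntimeError):
        local_ref = current_ref(fail)
    # If current_ref() raised, the assignment never ran, so local_ref
    # does not exist here and Python raises UnboundLocalError.
    return bool(local_ref and local_ref.startswith('refs/heads/'))


def fixed(fail):
    # Give the name a value before entering the suppress() block.
    local_ref = None
    with contextlib.suppress(RuntimeError):
        local_ref = current_ref(fail)
    return bool(local_ref and local_ref.startswith('refs/heads/'))
```

The suppressed exception is swallowed silently, which is why the problem only shows up later, at the first use of the variable.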

MajorP93 · 2023-03-22T22:21:36Z

@nathanchance Ok I see. Thanks for the explanation.

nathanchance requested review from msfjarvis, nickdesaulniers and stephenhines as code owners January 30, 2023 20:36

nathanchance marked this pull request as draft January 30, 2023 20:37

nathanchance mentioned this pull request Jan 31, 2023

Add support for LoongArch #219

Merged

nathanchance force-pushed the rewrite branch from 9c522fd to 4a6d1fa Compare January 31, 2023 22:19

nathanchance mentioned this pull request Feb 1, 2023

Alpine/Musl segfault when building BOLT in stage 3 #227

Closed

nathanchance force-pushed the rewrite branch from 4a6d1fa to 8bc3594 Compare February 1, 2023 21:32

nathanchance marked this pull request as ready for review February 2, 2023 00:05

nathanchance added 2 commits February 13, 2023 18:17

nathanchance force-pushed the rewrite branch from d17a20e to 888565c Compare February 27, 2023 18:56

nathanchance added 5 commits February 27, 2023 13:54

Add a ruff configuration file

d3c49cf

This is an opinionated set of warnings from the full list that ruff supports.

Link: https://beta.ruff.rs/docs/rules/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc-build: Run 'ruff --fix --select COM .' and adjust formatting for yapf

c596c8e

This has been tastefully edited to omit using commas for function calls but for iterables, it makes sense.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

build-llvm.py: Use ternary expression for targets assignment

69413a5

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

build-llvm.py: Use comprehension instead of constructor plus generator

d6bbf22

build-llvm.py:404:31: C400 [*] Unnecessary generator (rewrite as a `list` comprehension)

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
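
As an illustration of what C400 asks for, the two forms below build the same list. The `targets` data is invented for the example, since the flagged line in build-llvm.py is not quoted in this thread.

```python
# Made-up input; not the data used at build-llvm.py:404.
targets = ['AArch64', 'ARM', 'X86']

# Flagged by C400: a generator expression passed to the list() constructor.
via_constructor = list(target.lower() for target in targets)

# Preferred: a list comprehension, which expresses the same thing directly.
via_comprehension = [target.lower() for target in targets]
```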

tc_build: llvm: Use unpacking instead of concatenation

d1d73c7

Resolves ruff warning:

  tc_build/llvm.py:93:32: RUF005 [*] Consider `[self.tools.merge_fdata, *list(fdata_files)]` instead of concatenation

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
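
A quick sketch of the RUF005 pattern, using invented values in place of `self.tools.merge_fdata` and `fdata_files`:

```python
# Hypothetical stand-ins for the real tool path and profile files.
merge_fdata = 'merge-fdata'
fdata_files = ('perf.1.fdata', 'perf.2.fdata')

# Flagged by RUF005: building the command by concatenating lists.
concatenated = [merge_fdata] + list(fdata_files)

# Suggested form: unpack the iterable inside the list display.
unpacked = [merge_fdata, *fdata_files]
```

Unpacking accepts any iterable, so the explicit `list()` call in ruff's suggested text is only needed if the value should be materialized for some other reason.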

nathanchance added 13 commits February 27, 2023 13:55

tc_build: llvm: Ignore exception using contextlib.suppress()

5cf3ec4

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: llvm: Combine nested if statement

15882f6

tc_build/llvm.py:232:9: SIM102 Use a single `if` statement instead of nested `if` statements

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
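
The SIM102 rewrite is mechanical; a sketch with made-up conditions (the real ones at tc_build/llvm.py:232 are not quoted here):

```python
def nested(has_ref, is_shallow):
    # Flagged by SIM102: the outer 'if' does nothing except guard the inner one.
    if has_ref:
        if not is_shallow:
            return 'checkout'
    return 'skip'


def combined(has_ref, is_shallow):
    # Equivalent single condition, one indentation level shallower.
    if has_ref and not is_shallow:
        return 'checkout'
    return 'skip'
```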

build-llvm.py: Refactor else + if into elif to reduce indentation

3c23920

As recently suggested by ruff:

  build-llvm.py:620:17: PLR5501 Consider using `elif` instead of `else` then `if` to remove one indentation level

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
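
For reference, the shape of the PLR5501 refactor looks like this. The stage names are invented for the example and do not come from build-llvm.py:620.

```python
def before(use_pgo, use_lto):
    if use_pgo:
        return 'instrumented'
    else:
        # Flagged by PLR5501: an 'if' as the only statement inside 'else'.
        if use_lto:
            return 'lto'
        else:
            return 'plain'


def after(use_pgo, use_lto):
    if use_pgo:
        return 'instrumented'
    elif use_lto:
        return 'lto'
    else:
        return 'plain'
```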

tc_build: kernel: Fix function name in S390KernelBuilder

9fbccf3

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: kernel: Restrict integrated assembler with RISC-V

e630799

This matches our continuous integration and fixes building LLVM with PGO prior to 13.0.0.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: kernel: Use ld.bfd for powerpc64le builds with ld.lld 11.x

df566f1

See the issue in the comments for more details.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Add '--install-targets'

3a3c8e9

This gives builders flexibility with what exactly gets installed.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: llvm: Allow opting out of default target triple

dd6ad72

Changing the default target triple potentially ties a toolchain to a specific operating system. Allow opting out of this by setting DISTRIBUTING=1 in the environment.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
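
An environment-variable opt-out of this kind might look roughly like the sketch below. Everything here is an assumption for illustration: the helper name, the choice to treat any non-empty `DISTRIBUTING` value as an opt-out, and the use of the CMake variable `LLVM_DEFAULT_TARGET_TRIPLE` as the define being skipped. The commit itself is the authority on what tc_build/llvm.py actually does.

```python
import os


def default_triple_defines(host_triple, environ=None):
    """Hypothetical helper returning extra cmake defines for the target triple."""
    environ = os.environ if environ is None else environ
    # Assumption: any non-empty value of DISTRIBUTING opts out.
    if environ.get('DISTRIBUTING'):
        return {}
    return {'LLVM_DEFAULT_TARGET_TRIPLE': host_triple}


# Passing a dict instead of os.environ keeps the example deterministic.
pinned = default_triple_defines('x86_64-linux-gnu', environ={})
portable = default_triple_defines('x86_64-linux-gnu', environ={'DISTRIBUTING': '1'})
```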

tc_build: llvm: Fix omission of CMAKE_RANLIB

09d14a6

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

nickdesaulniers approved these changes Mar 17, 2023

nathanchance added 3 commits March 17, 2023 11:25

tc_build: llvm: Simplify host_target_is_enabled()

4fd6541

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: source: Use an identifier for magic number

01b65b9

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

tc_build: kernel: Simplify adding builders in LLVMKernelBuilder

0c1452e

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

nathanchance merged commit 726b4a1 into ClangBuiltLinux:main Mar 21, 2023

nathanchance deleted the rewrite branch March 21, 2023 21:14
