NO-JIRA: hack/build.sh: parallelize the build if MAKEFLAGS is set #10364

Open
jhixson74 wants to merge 1 commit into openshift:main from jhixson74:main-parallelize-build

@jhixson74
Member

@jhixson74 jhixson74 commented Mar 5, 2026

Parallelize build when setting MAKEFLAGS. Keep default behavior otherwise.

To test:
env MAKEFLAGS="-j $(nproc)" ./hack/build.sh

This passes along the output of nproc, or whatever number you specify manually: make gets -j <n> and go build gets -p <n>. Nothing changes if MAKEFLAGS is not set with a -j option.
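For readers who want to see the idea in isolation, here is a minimal sketch of pulling the job count out of a MAKEFLAGS-style string. It is not the code from this PR: `makeflags_jobs` is a hypothetical helper name, and the sed expression is an assumption that only handles the `-j N` and `-jN` spellings (not `--jobs=N`).

```shell
# Hypothetical helper (not from the PR): print the job count found in a
# "-j N" or "-jN" option, or print nothing when no such option is present.
makeflags_jobs() {
    printf '%s\n' "$1" | sed -n 's/.*-j *\([0-9][0-9]*\).*/\1/p'
}

makeflags_jobs "-j 16"    # prints: 16
makeflags_jobs "-k -j4"   # prints: 4
makeflags_jobs "-k"       # prints nothing
```

A script could then capture the result with something like `MAKEJOBS="$(makeflags_jobs "${MAKEFLAGS:-}")"` and only add `-p` for go build when the value is non-empty.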

Also updated the way we create the cluster-api.zip file. Instead of deleting it and zipping all the providers every time hack/build.sh is invoked, we now update cluster-api.zip only for the providers that have changed.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 5, 2026
@openshift-ci-robot
Contributor

@jhixson74: This pull request explicitly references no jira issue.


@openshift-ci
Contributor

openshift-ci bot commented Mar 5, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign barbacbd for approval. For more information see the Code Review Process.


@jhixson74
Member Author

/cc @patrickdillon

@openshift-ci openshift-ci bot requested a review from patrickdillon March 5, 2026 23:14
@jhixson74 jhixson74 force-pushed the main-parallelize-build branch 2 times, most recently from cc295b9 to 0e8c398 Compare March 6, 2026 02:15
hack/build.sh Outdated

# shellcheck disable=SC2086
- go build ${GOFLAGS} -gcflags "${GCFLAGS}" -ldflags "${LDFLAGS}" -tags "${TAGS}" -o "${OUTPUT}" ./cmd/openshift-install
+ go build ${GOBUILDJOBS} ${GOFLAGS} -gcflags "${GCFLAGS}" -ldflags "${LDFLAGS}" -tags "${TAGS}" -o "${OUTPUT}" ./cmd/openshift-install
Member

💡 I notice we already have GOFLAGS. Should we use that instead of defining a new variable? For example, we could run:

GOFLAGS='-p=16 --mod=vendor' ./hack/build.sh

With this, I guess we just need to adjust cluster-api/Makefile so that go build honors GOFLAGS, and adjust hack/build.sh to pass the env var down?

Member Author

This breaks when go generate runs. That's why I did it this way.

Member

ahh whoops. Just curious, may I know how or why it failed (i.e. error message)?

Member

I guess I was asking because I haven't seen go generate fail in my local setup. Maybe I am missing something 😓 Also, the latest change might have broken the build in ci/prow/e2e-aws-ovn and ci/prow/images:

 + make '' -C cluster-api all
make: *** empty string invalid as file name.  Stop.

Member Author

Bleh. Yeah, the shell linter wants variables quoted. But for cases like this, you intentionally don't quote them ;-) I will update.
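For anyone following along, a small sketch of why the quoted form produced `make ''`. This is illustrative only: `count_args` is a hypothetical helper and `JOBS_FLAG` is a placeholder variable, neither taken from the PR.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() {
    echo "$#"
}

JOBS_FLAG=""   # placeholder for an optional flag that happens to be empty

# Quoted: the empty string is still passed as a real argument.
count_args "${JOBS_FLAG}" -C cluster-api all                 # prints: 4

# Unquoted: the empty expansion disappears entirely.
# shellcheck disable=SC2086
count_args ${JOBS_FLAG} -C cluster-api all                   # prints: 3

# Quoted only when non-empty: also disappears when the variable is empty.
count_args ${JOBS_FLAG:+"${JOBS_FLAG}"} -C cluster-api all   # prints: 3
```

The third form is one possible alternative to leaving the variable unquoted; it only suits a variable holding a single argument, since the quoted value is passed as one word.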

Member

@tthvo tthvo Mar 6, 2026

> As for go generate, it honors GOFLAGS. -p is not a valid option to go generate

Ah right, sorry, GOFLAGS is a built-in Go env var that applies to all go commands 🤦 I have one more suggestion, if it makes things easier:

We can define a GOBUILDFLAGS variable specifically for building and pass it directly to go build:

go build ${GOBUILDFLAGS} ${GOFLAGS} -gcflags "${GCFLAGS}" -ldflags "${LDFLAGS}" -tags "${TAGS}" -o "${OUTPUT}" ./cmd/openshift-install

Then we can run build with any supported go flags instead of parsing the number out:

GOBUILDFLAGS="-p $(nproc)" ./hack/build.sh

Member Author

While that is a nice feature to have, it's outside of the scope here. This particular PR passes along the number of jobs to both make, and go build. For this to work, we now need to set GOBUILDFLAGS and MAKEFLAGS in the environment to keep both consistent. For now, I think this works fine as is.

Member Author

I met you halfway. I do like the name GOBUILDFLAGS better. I also modified it here so that it can be set in the environment. If MAKEFLAGS is set with -j, it will automagically set GOBUILDFLAGS.

Member

Ahh right, we need it for both make and go. The changes look good to me, thanks!

Member

@tthvo tthvo Mar 6, 2026

Sorry for the long thread :(

Could you help me understand the reason to also set -p for go build? I thought we just need to allocate >= 1 make job slots via -j <n>, right? On main, we can already do MAKEFLAGS="-j 16" ./hack/build.sh, which does parallelize the capi provider build.

For go, go help build shows 👇, which suggests that, in most cases, it is already building packages in parallel?

-p n
		the number of programs, such as build commands or
		test binaries, that can be run in parallel.
		The default is GOMAXPROCS, normally the number of CPUs available.

@jhixson74 jhixson74 force-pushed the main-parallelize-build branch 2 times, most recently from b266623 to 1ec7cb3 Compare March 6, 2026 04:20
hack/build.sh Outdated
release)
LDFLAGS="${LDFLAGS} -s -w"
TAGS="${TAGS} release"
TAGS="${TAGS} release"
Member

There is a stray invisible character at the end of this line (reported as ␦):

$ cat hack/build.sh | grep -A3 -- 'release)'
release)
	LDFLAGS="${LDFLAGS} -s -w"
	TAGS="${TAGS} release"␦
	;;

Do you see it? Maybe we should remove it...

Member

My VSCode also reported the same thing:

Image

Member Author

Blah. That explains why there is a difference showing. I'm not sure what I did here, but I'll nuke it.

Member Author

Hah, ^Z

Member

Nice, I thought i was seeing things :D

@jhixson74 jhixson74 force-pushed the main-parallelize-build branch 3 times, most recently from 7ad7ce8 to 17958f0 Compare March 6, 2026 05:03
hack/build.sh Outdated
Comment on lines +26 to +28
if [ ${MAKEJOBS} -gt 0 ]; then
GOBUILDFLAGS="${GOBUILDFLAGS} -p ${MAKEJOBS}"
fi
Member

Suggested change
- if [ ${MAKEJOBS} -gt 0 ]; then
- GOBUILDFLAGS="${GOBUILDFLAGS} -p ${MAKEJOBS}"
- fi
+ if [ -n "${MAKEJOBS}" ] && [ "${MAKEJOBS}" -gt 0 ]; then
+ GOBUILDFLAGS="${GOBUILDFLAGS} -p ${MAKEJOBS}"
+ fi

We should check that it is non-empty before doing the integer comparison. Otherwise there is a failure that goes unnoticed:

+ '[' -gt 0 ']'
./hack/build.sh: line 26: [: -gt: unary operator expected
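To make the guard concrete, here is a small self-contained sketch of the suggested check. `with_jobs` is a hypothetical helper written for illustration and does not appear in hack/build.sh; it only shows the non-empty test short-circuiting before the integer comparison.

```shell
# Hypothetical helper: print the given flags, with "-p N" appended only when
# the job count is non-empty and greater than zero.
with_jobs() {
    flags="$1"
    njobs="$2"
    # -n is checked first, so the -gt test never sees an empty operand.
    if [ -n "${njobs}" ] && [ "${njobs}" -gt 0 ]; then
        flags="${flags} -p ${njobs}"
    fi
    printf '%s\n' "${flags}"
}

with_jobs "-v" ""    # prints: -v
with_jobs "-v" "8"   # prints: -v -p 8
```

With the empty job count, the first test fails and the flags are left untouched, with no error from `[`.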

Member Author

Updated the regex to be more bulletproof. Now we won't get a non-digit. We either get null, or the number of jobs passed to -j.

@jhixson74 jhixson74 force-pushed the main-parallelize-build branch 2 times, most recently from 80e9320 to c486ac7 Compare March 6, 2026 16:17
@tthvo
Member

tthvo commented Mar 6, 2026

/test golint

Member

@tthvo tthvo left a comment

Looks good to me!

@jhixson74 May I get help with #10364 (comment) before tagging?

@jhixson74 jhixson74 force-pushed the main-parallelize-build branch from c486ac7 to 170159b Compare March 6, 2026 17:48
@jhixson74
Member Author

> Looks good to me!
>
> @jhixson74 May I get help with #10364 (comment) before tagging?

I made one more optimization to building the capi providers. Please review ;-) go build -p works like make -j. It will create the number of threads specified. As you noted, if -p is not specified, GOMAXPROCS is used, which is the number of cores.

@jhixson74
Member Author

/hold

It appears go changes aren't being noticed.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2026
@jhixson74 jhixson74 force-pushed the main-parallelize-build branch from 170159b to 525f827 Compare March 6, 2026 18:01
@jhixson74
Member Author

/hold cancel

zip exits with code 12 if there is nothing to do, so added check for that.
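As a rough illustration of that kind of check (not the actual change in this PR), a wrapper can map exit status 12 to success while letting other failures through. `allow_nothing_to_do` is a hypothetical name, and the stand-in `sh -c 'exit N'` commands below simulate zip's exit status so the sketch runs without zip installed.

```shell
# Hypothetical wrapper: run a command and treat exit status 12
# (zip's "nothing to do") as success; pass any other status through.
allow_nothing_to_do() {
    rc=0
    "$@" || rc=$?
    if [ "${rc}" -eq 12 ]; then
        return 0
    fi
    return "${rc}"
}

# Stand-ins for a zip invocation, simulating its exit status.
allow_nothing_to_do sh -c 'exit 12' && echo "status 12 treated as success"
allow_nothing_to_do sh -c 'exit 3' || echo "other failures still propagate: $?"
```

In a real script the wrapped command would be the zip update call itself, for example `allow_nothing_to_do zip -u archive.zip some-file`, where the archive and file names are placeholders.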

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2026
@jhixson74
Member Author

/test golint

Member

@tthvo tthvo left a comment

/lgtm

The build is pretty fast now :D I guess e2e passing the build means it's good :D

> It will create the number of threads specified. As you noted, if -p is not specified, GOMAXPROCS is used, which is the number of cores.

Oh thanks, I guess it's for making it consistent between make and go, but not a hard requirement to build in parallel...

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 6, 2026
@patrickdillon
Contributor

Do you see improvements when using this? I'm curious about the results.

go build already uses all cores; the make jobs are running in serial, but each job is a go build so is parallelized across all cores (right?)

When I tested (a long time ago), I didn't see a lot of spare CPU capacity, except during the compression/zipping part.

It's not really clear to me why we should link the number of make jobs to the go -p flag.

cd providers/$*; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $$2 }'|sed 's|"||g'`; fi; \
- go build -gcflags $(GCFLAGS) -ldflags $(LDFLAGS) -o ../../bin/$(TARGET_OS_ARCH)/cluster-api-provider-$* "$$path";
+ go build $(GOBUILDFLAGS) -gcflags $(GCFLAGS) -ldflags $(LDFLAGS) -o ../../bin/$(TARGET_OS_ARCH)/cluster-api-provider-$* "$$path";
Contributor

hm, outside of your changes, but I notice that we're apparently not passing ${GOFLAGS} in the capi builds... I wonder if CI is using those 👀

Contributor

Looking at the build logs, it's dropping -mod=vendor

2026-03-06T18:25:27.160016001Z go build  -gcflags "" -ldflags "-s -w" -o ../../bin/linux_amd64/cluster-api-provider-openstack "$path";

vs

26-03-06T18:27:51.399947790Z + go build -mod=vendor -gcflags '' -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=v1.4.21-pre-246-g170afd6c6ccd7679f4c55d91808988eeedbfe1fb-dirty -X github.com/openshift/installer/pkg/version.Commit=170afd6c6ccd7679f4c55d91808988eeedbfe1fb -X github.com/openshift/installer/

🤔

Contributor

Seems to not be an issue 😅

        -mod mode
                module download mode to use: readonly, vendor, or mod.
                By default, if a vendor directory is present and the go version in go.mod
                is 1.14 or higher, the go command acts as if -mod=vendor were set.

But we should probably fix that in case art changes anything in the future

Member

Aha, #10364 (comment) makes sense. I was wondering the same thing :D

Member Author

> Looking at the build logs, it's dropping -mod=vendor
>
> 2026-03-06T18:25:27.160016001Z go build  -gcflags "" -ldflags "-s -w" -o ../../bin/linux_amd64/cluster-api-provider-openstack "$path";
>
> vs
>
> 26-03-06T18:27:51.399947790Z + go build -mod=vendor -gcflags '' -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=v1.4.21-pre-246-g170afd6c6ccd7679f4c55d91808988eeedbfe1fb-dirty -X github.com/openshift/installer/pkg/version.Commit=170afd6c6ccd7679f4c55d91808988eeedbfe1fb -X github.com/openshift/installer/
>
> 🤔

I haven't done anything with GOFLAGS here. I intentionally left it alone.

@tthvo
Member

tthvo commented Mar 6, 2026

> Do you see improvements when using this? I'm curious about the results.
> go build already uses all cores; the make jobs are running in serial, but each job is a go build so is parallelized across all cores (right?)

For me, when running time MAKEFLAGS="-j $(nproc)" ./hack/build.sh, the capi providers are built concurrently instead of one at a time. This seems a bit faster than -j 1 or the default ./hack/build.sh.

$ time ./hack/build.sh
# providers are built one at a time
real	0m49.057s
user	0m55.159s
sys	0m15.100s

$ time MAKEFLAGS="-j $(nproc)" ./hack/build.sh
# providers are built concurrently
real	0m31.646s
user	1m4.316s
sys	0m15.624s

> It's not really clear to me why we should link the number of make jobs to the go -p flag.

🤔 Right, I had the same question in #10364 (comment). I assumed it was syncing the make and go allowed threads. Thinking about it again, I am now unsure too :(

@jhixson74
Member Author

> Do you see improvements when using this? I'm curious about the results.
>
> go build already uses all cores; the make jobs are running in serial, but each job is a go build so is parallelized across all cores (right?)
>
> When I tested (a long time ago), I didn't see a lot of spare CPU capacity, except during the compression/zipping part.
>
> It's not really clear to me why we should link the number of make jobs to the go -p flag.

We can of course enable different knobs. None of this is on by default though, and is pretty much only useful to those of us working on it. Go uses all cores by default. Make does not. We can optimize this even more if we could get rid of build*.sh files and turn them into makefiles. What if I don't want all cores used? What if go is taking up too many resources on my laptop (hint: it does... all the time, and brings it to a screeching halt). That's why I've kept this little hack around for years. I'm attempting to (once again) bring it in ;-)

Would you like me to decouple the number of make jobs from go build jobs? I'm fine with that. Or should I just close this out?

@jhixson74 jhixson74 requested review from patrickdillon and tthvo March 6, 2026 22:04
@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@jhixson74: all tests passed!


Labels

jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.

4 participants