Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dependency_support: boost: bump boost to v1.82 #164

Merged
merged 2 commits into from
Jun 13, 2023

Conversation

lpawelcz
Copy link
Collaborator

@lpawelcz lpawelcz commented May 24, 2023

This PR is the second one in the PR chain that consists of:

Each of the following PRs depend on previous ones which means that the correct merging order is:
#167 -> #164 -> #165
I will rebase the PRs one at the time as they are merged in order to avoid conflicts.

This PR bumps the build rules for boost effectively bumping boost dependency to v1.82.
Due to rename (BUILD.boost -> boost.BUILD) and small changes in the renamed file also the patches had to be updated.
Additionally, boost_library for boost/python adds to cc_library sources all files under libs/python/src which includes fabscript which is not a C++ source and that is causing build failures. I had to further modify add_python.patch to explicitly remove this file from the build configuration.

Bumping boost was required to make openroad target working as I kept getting build errors:

ERROR: /home/asdf/.cache/bazel/asdf/970c5c2433bb6038ab152477a024c421/external/boost/BUILD.bazel:1750:14: Compiling libs/thread/src/pthread/thread.cpp failed: (Exit 1): gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 312 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from /usr/include/pthread.h:33,
                 from /usr/include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:35,
                 from /usr/include/x86_64-linux-gnu/c++/11/bits/gthr.h:148,
                 from /usr/include/c++/11/ext/atomicity.h:35,
                 from /usr/include/c++/11/bits/basic_string.h:39,
                 from /usr/include/c++/11/string:55,
                 from external/boost/boost/thread/exceptions.hpp:20,
                 from external/boost/boost/thread/pthread/thread_data.hpp:10,
                 from external/boost/boost/thread/thread_only.hpp:17,
                 from external/boost/libs/thread/src/pthread/thread.cpp:11:
external/boost/boost/thread/pthread/thread_data.hpp:60:5: error: missing binary operator before token "("
   60 | #if PTHREAD_STACK_MIN > 0
      |     ^~~~~~~~~~~~~~~~~
Target @org_theopenroadproject//:openroad failed to build

Another required change was bumping spdlog. After bumping boost I encountered spdlog build errors in my local test setup that were caused by MINSIGSTKSZ no longer being a constant (as described in catchorg/Catch2#2421 and gabime/spdlog#2058)

@tcal-x
Copy link
Member

tcal-x commented May 26, 2023

/gcbrun

@QuantamHD
Copy link
Collaborator

QuantamHD commented May 26, 2023

Looks like there's something off with the Boost rules

Step #1: ERROR: Traceback (most recent call last):
Step #1: 	File "/builder/home/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/boost/BUILD.bazel", line 13, column 30, in <toplevel>
Step #1: 		_repo_dir = repository_name().removeprefix("@")
Step #1: Error: 'string' value has no field or method 'removeprefix'

This might be a bazel version issue.

@lpawelcz
Copy link
Collaborator Author

Oh yes, that is indeed bazel version issue. It looks like the rules for building bazel starting from nelhage/rules_boost@5729d34#diff-fa9d836e1eb513176a2b1b9e121bcde61bfed880a680a8690ef84fc65b92f0bdR13 will work only with bazel releases >= v5.1.0 after bazelbuild/starlark#185 was solved.
I wanted to check if we can solve this the other way around - by downgrading boost build rules just bellow the linked commit (so that we don't need the removeprefix() method) and staying at bazel v4.2.1 but I ran into other errors:

ERROR: Analysis of target '//:counter_place_and_route' failed; build aborted: @platforms//os:osx is not a valid select() condition for @org_bzip_bzip2//:bz2lib.
To inspect the select(), run: bazel query --output=build @org_bzip_bzip2//:bz2lib.
For more help, see https://docs.bazel.build/be/functions.html#select.

This error does not occur on bazel v5.1.0.
How hard would it be to bump bazel version here in bazel_rules_hdl? Is it only the matter of changing the container image from

- name: l.gcr.io/google/bazel:3.5.0
to gcr.io/bazel-public/bazel:latest (or better, to pin the version to v5.1.0 or newer)?

@lpawelcz lpawelcz mentioned this pull request Jun 1, 2023
@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

How hard would it be to bump bazel version here in bazel_rules_hdl? Is it only the matter of changing the container image from

I think we would also need to document the new required version here: https://github.com/hdl/bazel_rules_hdl/blob/main/README.md?plain=1#L4

@lpawelcz lpawelcz force-pushed the bump-boost branch 2 times, most recently from 5b8330c to 8b0feb5 Compare June 7, 2023 07:35
@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

different failure here:

Step #1: W: chmod 0700 of directory /var/lib/apt/lists/partial failed - SetupAPTPartialDirectory (1: Operation not permitted)
Step #1: E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied)
Step #1: E: Unable to lock directory /var/lib/apt/lists/

cloudbuild.yaml Outdated
@@ -42,12 +42,6 @@ steps:
apt-get update
apt-get install apt-transport-https ca-certificates -y
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove those as well, as we're not fetching bazel anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we can do that but I'll try that out. Those lines were added on OpenROAD bump 2 years ago: #56

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah interesting I though they were added because of the wget, I think we should be able to assume that the official bazel image has everything needed to resolve http_archive from the WORKSPACE.

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

I don't have the access to build logs from Google Cloud Build but it looks like it fails on early stage of the build. It might be caused by changing the container image. I will look into this.

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

Now the job fails with:

Step #1: bash: line 1: build_invocation_id.txt: Permission denied

cloudbuild.yaml Outdated
wget https://github.com/bazelbuild/bazel/releases/download/4.2.1/bazel_nojdk-4.2.1-linux-x86_64
chmod +x bazel_nojdk-4.2.1-linux-x86_64
mv bazel_nojdk-4.2.1-linux-x86_64 /usr/local/lib/bazel/bin/bazel
# End Bazel upgrade hack.
python3 tools/generate_uuid.py > build_invocation_id.txt
Copy link
Collaborator

@proppy proppy Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this should be written to $TMPDIR or /workspace, i.e we shouldn't assume that the current directory is writable on this image.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually since it is only used in variable substitution below, it's probably safe to just story to a local env var.

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

I messed up with those env vars:

generic::invalid_argument: invalid value for 'build.substitutions': key in the template "INVOCATION_ID" is not a valid built-in substitution

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

cloudbuild.yaml Outdated
python3 tools/generate_uuid.py > build_invocation_id.txt
python3 tools/generate_uuid.py > test_invocation_id.txt
export BUILD_INVOCATION_ID=$(python3 tools/generate_uuid.py)
export TEST_INVOCATION_ID=$(python3 tools/generate_uuid.py)
echo STATUS: Reporting logs with invocation id $(cat invocation_id.txt) to GitHub.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invocation_id.txt probably does not exist. Not sure why it isn't failing on that

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

still:

generic::invalid_argument: invalid value for 'build.substitutions': key in the template "BUILD_INVOCATION_ID" is not a valid built-in substitution

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

still:

generic::invalid_argument: invalid value for 'build.substitutions': key in the template "BUILD_INVOCATION_ID" is not a valid built-in substitution

yes, I will just go with writing files to /tmp

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

No need to use /tmp, I think env variables require to be specified as $$ in cloud build configuration, see:
https://cloud.google.com/build/docs/configuring-builds/substitute-variable-values for more details.

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

It probably still throws:

FATAL: mkdir('/builder/home/.cache/bazel/_bazel_ubuntu'): (error: 13): Permission denied

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

Yes:

FATAL: mkdir('/builder/home/.cache/bazel/_bazel_ubuntu'): (error: 13): Permission denied

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/workspace should be safe to use (as it should be mounted as a volume).

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

/gcbrun

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

The issue is that currently used docker image (l.gcr.io/google/bazel:3.5.0) has default user set to root, while gcr.io/bazel-public/bazel:5.1.0 has a regular user named ubuntu.
It looks like Google Cloud Build default workdir is /builder and that creates permission problems for ubuntu user in newer image.
In 754a29f I set the dir attribute to test which, according to the docs, should set the workdir path to /workspace/test
Error log from testing this change (https://github.com/hdl/bazel_rules_hdl/pull/164/checks?check_run_id=14075987268) would be quite interesting to see. I would expect error:

FATAL: mkdir('/workspace/test/home/.cache/bazel/_bazel_ubuntu'): (error: 13): Permission denied

I managed to find another repository with bazel docker images: https://console.cloud.google.com/artifacts/docker/cloud-builders/us/gcr.io/bazel which is based on https://github.com/GoogleCloudPlatform/cloud-builders/tree/master/bazel.
Those images have default user set to root but they are not version-tagged. I tested the latest tag and it has bazel v5.4.0 .
With f817bbe I was able to finally get some bazel output logs in CI run: https://app.buildbuddy.io/invocation/25d8b96a-0539-11ee-a19f-0242c0a80a02. I actually faced those errors locally so it looks like with this docker image we finally got working CI setup

@proppy
Copy link
Collaborator

proppy commented Jun 7, 2023

gcr.io/bazel-public/bazel:5.1.0
python3: can't open file 'tools/generate_uuid.py': [Errno 2] No such file or directory
python3: can't open file 'tools/generate_uuid.py': [Errno 2] No such file or directory
STATUS: Reporting logs with build invocation id and test invocation id to GitHub.
FATAL: mkdir('/builder/home/.cache/bazel/_bazel_ubuntu'): (error: 13): Permission denied

@lpawelcz
Copy link
Collaborator Author

lpawelcz commented Jun 7, 2023

Ok, thank you @proppy. It looks like dir doesn't change the /builder path. This setting must be responsible for the working directory where the repos are cloned, etc. I searched for any mention of /builder in Google Cloud Build docs and bazel_rules_hdl but I wasn't successful.

For the time being I will set the docker image to the latest one from https://console.cloud.google.com/artifacts/docker/cloud-builders/us/gcr.io/bazel and I will work on solving issues during the actual bazel_rules_hdl test.

@proppy
Copy link
Collaborator

proppy commented Jun 12, 2023

/gcbrun

@proppy
Copy link
Collaborator

proppy commented Jun 12, 2023

@lpawelcz did you also take a look at gcr.io/cloud-marketplace-containers/google/bazel, it was mentioned here: https://github.com/GoogleCloudPlatform/cloud-builders/blob/master/bazel/README.md

@proppy
Copy link
Collaborator

proppy commented Jun 12, 2023

Copy link
Collaborator

@proppy proppy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also document the other changes in the PR description (spdlog and libbacktrace deps)?

@proppy
Copy link
Collaborator

proppy commented Jun 12, 2023

@QuantamHD can you also review (I don't have merging right in this project)

@lpawelcz
Copy link
Collaborator Author

Docker images from cloud-marketplace-containers are outdated. The last image uploaded there is from 3 sept 2020.
Images from chainguard have non-root default user which will again cause problems with permissions.
I'm writing detailed updates for each of the PRs:

And I will post those shortly

@lpawelcz lpawelcz mentioned this pull request Jun 12, 2023
@lpawelcz
Copy link
Collaborator Author

I have updated the description. @proppy, @QuantamHD please take a look at this PR and also at #167 where I've changed the docker image in order to run CI with bazel v5.4.0

@lpawelcz lpawelcz requested a review from QuantamHD June 12, 2023 16:01
lpawelcz added a commit to antmicro/xls that referenced this pull request Jun 13, 2023
This is a temporary change to test the XLS workspace before
PRs in bazel_rules_hdl are merged:

* hdl/bazel_rules_hdl#167
* hdl/bazel_rules_hdl#164
* hdl/bazel_rules_hdl#165

Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
@proppy
Copy link
Collaborator

proppy commented Jun 13, 2023

LGTM, but @QuantamHD needs to hit the merge button :)

@QuantamHD
Copy link
Collaborator

Thanks for the contribution! Looks like there are merge conflicts to resolve

Bump build rules for boost effectively bumping boost dependency
to v1.82. Due to rename (BUILD.boost -> boost.BUILD) and small
changes in the renamed file also the patches had to be updated.
boost_library defined for python in add_python.patch needs to exclude
file 'fabscript' from the build because it is added by boost_library()
definition by default and that causes build errors.

Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
This fixes build errors caused by changes in MINSIGSTKSZ
definition.
related issues:
* gabime/spdlog#1923
* gabime/spdlog#2058

Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
@lpawelcz lpawelcz merged commit 0fd2072 into hdl:main Jun 13, 2023
4 checks passed
@lpawelcz lpawelcz deleted the bump-boost branch June 13, 2023 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants