8313374: --enable-ccache's CCACHE_BASEDIR breaks builds #15080

Closed

Conversation

jankratochvil
Contributor

@jankratochvil jankratochvil commented Jul 29, 2023

https://bugs.openjdk.org/browse/JDK-8313374
--enable-ccache's CCACHE_BASEDIR breaks builds


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8313374: --enable-ccache's CCACHE_BASEDIR breaks builds (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15080/head:pull/15080
$ git checkout pull/15080

Update a local copy of the PR:
$ git checkout pull/15080
$ git pull https://git.openjdk.org/jdk.git pull/15080/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15080

View PR using the GUI difftool:
$ git pr show -t 15080

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15080.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jul 29, 2023

👋 Welcome back jkratochvil! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 29, 2023
@openjdk

openjdk bot commented Jul 29, 2023

@jankratochvil The following label will be automatically applied to this pull request:

  • build

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the build build-dev@openjdk.org label Jul 29, 2023
@mlbridge

mlbridge bot commented Jul 29, 2023

Webrevs

@dholmes-ora
Member

@jankratochvil can you please update the JBS issue to explain how you went from the problem description there to the proposed fix? Thanks.

@TheShermanTanker
Contributor

@jankratochvil The change here will force ccache to write absolute paths for all compilations, which may cause cached object file comparisons to fail sometimes, and I'm not sure we'd want that? Have you tried running make clean before recompilations? (Note that there are 2 different targets for this, make clean for cleaning build artifacts and make dist-clean which removes almost everything in the build directory).

Weird bugs can happen sometimes if clean is not run, as stated in the warning message each time configure is run. I also read the issue in the tracker, and I don't think removing --enable-ccache entirely is correct, as we need that option to check for certain things (like whether ccache can handle precompiled headers) and also to disable aliasing ccache as the compiler during the build for that exact reason. I'm also not too sure how removing CCACHE_BASEDIR helps fix the issue, since all it does is change the command passed to the real compiler. Could you elaborate on that a bit more?

@TheShermanTanker
Contributor

Woah! Slow down a little, I wasn't asking you to remove the change entirely, just to elaborate on what the change does to achieve the fix, since it's a little unclear to me

@jankratochvil
Contributor Author

OK so the real problem is:

  • OpenJDK makefiles operate with absolute paths to the target object files.
  • Turning on CCACHE_BASEDIR switches the dependency files (*.d) to relative paths for everything.
  • make does not treat a relative and an absolute filename of the same target as equal, because make does just a string comparison, not a filesystem inode comparison.
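
The mismatch described in the list above can be reproduced outside make. The following is a minimal sketch in Python, assuming a POSIX filesystem. The helper names and the foo.o file are made up for illustration, and the string comparison only models the behaviour described in the comment, not make's actual implementation.

```python
import os
import tempfile


def same_by_string(a, b):
    """Compare two target names as plain strings (the behaviour attributed to make)."""
    return a == b


def same_by_inode(a, b):
    """Compare two paths by filesystem identity (device and inode numbers)."""
    return os.path.samefile(a, b)


def demo():
    """Create one object file and name it both absolutely and relatively."""
    with tempfile.TemporaryDirectory() as build_dir:
        # Resolve symlinks so the relative path below stays valid.
        build_dir = os.path.realpath(build_dir)
        abs_target = os.path.join(build_dir, "foo.o")  # hypothetical object file
        open(abs_target, "w").close()
        # The same file, expressed relative to the current working directory.
        rel_target = os.path.relpath(abs_target, start=os.getcwd())
        return (same_by_string(abs_target, rel_target),
                same_by_inode(abs_target, rel_target))
```

Both names refer to one file, yet only the inode-based check reports them as equal, which is why a relative target in a *.d file does not match the absolute target used elsewhere in the makefiles.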

While trying to fix it I found that on my system $(FILE_MACRO_CFLAGS) is -fmacro-prefix-map=/home/user/jdk-src-dir/=. I do not see why it was implemented this way by JDK-8226346 (and I cannot view that bug). So I have removed it.

@jankratochvil
Contributor Author

Personally I am scared of CCACHE_BASEDIR, for example how debug info will look afterwards (I know there is some remapping but still). I was already troubleshooting/fixing multiple ccache bugs in the past so I find CCACHE_BASEDIR too bold. But if you want to keep it let's fix at least the known bugs.

@TheShermanTanker
Contributor

I see, I don't think we have to keep CCACHE_BASEDIR if it is not required, but we should test how the build reacts to such a change, since it does change the behaviour of ccache, which is a little nerve-racking. Unfortunately I cannot see the bug either; perhaps David or @erikj79 could explain why both options were implemented this way.

@jankratochvil
Contributor Author

I see, I don't think we have to keep CCACHE_BASEDIR if it is not required,

BTW CCACHE_BASEDIR is a caching improvement: without it, builds in different directories never share the cache. Some existing ccache users may see the CCACHE_BASEDIR removal as a regression, although I doubt anyone is using ccache with OpenJDK, as it has been a PITA without working dependencies.
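
Why a base directory lets two checkouts share cache entries can be shown with a toy model. This is only a sketch under stated assumptions: the key below is a plain hash of the argument list, which is much simpler than what ccache really hashes, and the function names and the /home/user/jdk-a and /home/user/jdk-b checkouts are hypothetical.

```python
import hashlib
import posixpath
from typing import List, Optional


def rewrite_arg(arg: str, cwd: str, base_dir: Optional[str]) -> str:
    """Rewrite an absolute path under base_dir into a path relative to cwd."""
    if base_dir is None or not posixpath.isabs(arg):
        return arg
    base = posixpath.normpath(base_dir)
    if posixpath.commonpath([base, posixpath.normpath(arg)]) != base:
        return arg  # outside the base directory: left untouched
    return posixpath.relpath(arg, start=cwd)


def toy_cache_key(args: List[str], cwd: str, base_dir: Optional[str] = None) -> str:
    """Toy cache key: SHA-256 over the (possibly rewritten) arguments."""
    digest = hashlib.sha256()
    for arg in args:
        digest.update(rewrite_arg(arg, cwd, base_dir).encode("utf-8"))
        digest.update(b"\0")  # separator so argument boundaries matter
    return digest.hexdigest()


def compile_args(checkout: str) -> List[str]:
    """Hypothetical compile command using absolute paths inside a checkout."""
    return ["gcc", "-c", f"{checkout}/src/foo.c", "-o", f"{checkout}/build/foo.o"]
```

In this model the two checkouts produce different keys when absolute paths are hashed as-is, and identical keys once each command is rewritten relative to its own checkout root.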

@dholmes-ora
Member

@jankratochvil See the comments here:

# Prevent the __FILE__ macro from generating absolute paths into the built

and before the change you made in NativeCompilation.gmk to understand why this is done.

@TheShermanTanker
Contributor

TheShermanTanker commented Jul 31, 2023

Ah, these changes were done for better reproducible builds, I had completely missed that. The relative paths mean that the compiled files aren't all using different absolute paths in their macros and so on, making the build more reproducible. Changing these would be rather problematic for a lot of things unfortunately

Edit: Never mind, David beat me to it

@dholmes-ora
Member

Ah, these changes were done for better reproducible builds

Actually this pre-dates that but yes relative paths do support reproducible builds. :)
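
The effect of a setting such as -fmacro-prefix-map=/home/user/jdk-src-dir/= on the path seen by __FILE__ can be modelled as a prefix replacement. The sketch below is an illustration only, not the compiler's implementation; the function name, the second checkout directory, and the source file name are assumptions.

```python
def apply_prefix_map(path: str, old: str, new: str) -> str:
    """Replace a leading `old` prefix with `new`, leaving other paths untouched."""
    if path.startswith(old):
        return new + path[len(old):]
    return path


# Two hypothetical checkouts of the same source file.
first = apply_prefix_map("/home/user/jdk-src-dir/src/foo.cpp",
                         "/home/user/jdk-src-dir/", "")
second = apply_prefix_map("/tmp/other-checkout/src/foo.cpp",
                          "/tmp/other-checkout/", "")
```

Both checkouts end up with the same relative string, so the checkout location no longer leaks into the compiled output, which is the property the reproducible-builds remark above refers to.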

Member

@dholmes-ora dholmes-ora left a comment

That certainly looks much clearer in terms of what is being done and why - thanks. Now just need someone with ccache experience to validate the actual fix.

@jankratochvil
Contributor Author

I doubt there is anyone as it did not work. :-)

@jankratochvil
Contributor Author

@TheShermanTanker made a suggestion:

ifeq ($(call Or, $(if $(CCACHE), true), $(call equals, $(ALLOW_ABSOLUTE_PATHS_IN_OUTPUT), false))-$(FILE_MACRO_CFLAGS), true-)

I got it by mail but I do not see it here. Has it been deleted? I admit I prefer how it is now; this one-liner looks a bit cryptic to me.

@TheShermanTanker
Contributor

Yes, it's been deleted since I deemed it too verbose for this use case

@erikj79
Member

erikj79 commented Aug 8, 2023

I understand why you need fix-deps-file when using ccache, but do you also need MakeCommandRelative?

I'm not surprised that ccache support has bit rotted over the years as we aren't seeing much benefit from it in practice, so it's probably not used much. In the ideal case, it certainly speeds up the build a lot, but that case is very rare, at least in our build scenarios. I wouldn't mind removing it at this point, but if we can fix it with a simple patch like this, then that works too. Note that trying to set up ccache correctly without support in the build is quite tricky to get right. That's why I thought it necessary to handle it explicitly in the makefiles.

@jankratochvil
Contributor Author

jankratochvil commented Aug 8, 2023

I understand why you need fix-deps-file when using ccache, but do you also need MakeCommandRelative?

I did try it without MakeCommandRelative, but it did not work: the relative paths created by ccache itself look something like dir1/dir2/../../dir3/dir4, which again does not match the intended dir3/dir4.
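
The mismatch between the two spellings is easy to check. A minimal sketch in Python, using only the example paths from the comment above: it shows that the strings differ until the ".." components are collapsed. It illustrates the problem, not what MakeCommandRelative does internally.

```python
import posixpath

from_ccache = "dir1/dir2/../../dir3/dir4"  # path shape produced by ccache
intended = "dir3/dir4"                     # path the makefiles expect

# As plain strings the two spellings are different.
matches_as_is = from_ccache == intended

# normpath collapses ".." purely textually (it ignores symlinks),
# after which the two spellings agree.
matches_normalized = posixpath.normpath(from_ccache) == intended
```

Because the comparison is textual, any path written into the dependency file has to use exactly the spelling that the rest of the makefiles use.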

I'm not surprised that ccache support has bit rotted over the years as we aren't seeing much benefit from it in practice,

I do see a big benefit from it. When changing branches by git checkout in a single directory it gets perfectly cached.

Note that trying to set up ccache correctly without support in the build is quite tricky to get right. That's why I thought it necessary to handle it explicitly in the makefiles.

It is not tricky, I am happily using ccache for many projects including LLVM, GCC, GDB and others. The only drawback is that the cache is not shared across different directories (without CCACHE_BASEDIR), but as I have only 2-3 build trees for each of the projects (my mind can handle working on only 2-3 different branches at once) it does not matter much.

@kimbarrett

I doubt there is anyone as it did not work. :-)

Um, I've been using it for years without noticing any problems, and measured a frequent speedup for my normal use.

@openjdk

openjdk bot commented Aug 9, 2023

@jankratochvil This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8313374: --enable-ccache's CCACHE_BASEDIR breaks builds

Reviewed-by: erikj

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 110 new commits pushed to the master branch:

  • 360f65d: 8314022: Problem-list tests failing with jtreg 7.3
  • 0eb0997: 8288936: Wrong lock ordering writing G1HeapRegionTypeChange JFR event
  • 19ae62a: 8311170: Simplify and modernize equals and hashCode in security area
  • e9f751a: 8311247: Some cpp files are compiled with -std:c11 flag
  • 213d3c4: 8313891: JFR: Incorrect exception message for RecordedObject::getInt
  • 0e2c72d: 8313796: AsyncGetCallTrace crash on unreadable interpreter method pointer
  • 52ec4bc: 8303056: Improve support for Unicode characters and digits in JavaDoc search
  • 9cf12bb: 8313922: Remove unused WorkerPolicy::_debug_perturbation
  • 6e3cc13: 8312467: relax the builddir check in make/autoconf/basic.m4
  • 77e5739: 8310118: Resource files should be moved to appropriate directories
  • ... and 100 more: https://git.openjdk.org/jdk/compare/ad34be1f329edc8e7155983835cc70d733c014b8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @TheShermanTanker, @erikj79) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 9, 2023
@jankratochvil
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Aug 9, 2023
@openjdk

openjdk bot commented Aug 9, 2023

@jankratochvil
Your change (at version c1082c4) is now ready to be sponsored by a Committer.

@kimbarrett

I don't understand what problem this PR is fixing. I see assertions from the
PR author that using ccache is broken with OpenJDK, or it breaks dependencies,
or something. Yet I've been using ccache for years for my local development
builds, and have not encountered any problems, only build time improvements.

@jankratochvil
Contributor Author

I don't understand what problem this PR is fixing.

Does the reproducer work for you? https://bugs.openjdk.org/browse/JDK-8313374

@kimbarrett

I don't understand what problem this PR is fixing.

Does the reproducer work for you? https://bugs.openjdk.org/browse/JDK-8313374

Yes, it works for me. Or at least, with the commands and setup I normally use,
it works. One difference is that I do out-of-tree builds, so I retried with
an in-tree build, but that still worked. Another difference is that I'm using
an in-house (Oracle) build wrapper called jib, which does things like ensuring
the compiler and other tools we support are being used. I don't even remember
how to do a build without it.

@jankratochvil
Contributor Author

Maybe your setup does not reproduce this problem. Still, the simplest ccache setup does reproduce it, and the problem is known and described above. So why should it not be fixed?
I certainly know how to work around this bug. As this fix has not yet landed, I use --disable-ccache and hide from the configure script that I am in fact using ccache.

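#! /bin/bash
# Workaround: create per-tool wrappers in ~/ccache so configure does not see ccache.
# openjdk: export PATH="$(echo "$PATH"|sed 's#:/usr/lib64/ccache:#:'$HOME'/ccache:#')";bash configure --disable-precompiled-headers --disable-ccache
set -ex
# Recreate the wrapper directory from scratch.
rm -rf ~/ccache
mkdir ~/ccache
cd ~/ccache
# For each entry in /usr/lib64/ccache, write a wrapper that removes ~/ccache
# from PATH and then runs the original entry with the same arguments.
for i in /usr/lib64/ccache/*;do
  j=$(basename "$i")
  echo -e '#! /bin/bash\nexport PATH="$(echo "$PATH"|sed s#:$HOME/ccache:#:#)"\nexec '"$i"' "$@"' >"$j"
  chmod +x "$j"
done
echo done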

@erikj79
Member

erikj79 commented Aug 14, 2023

I tried this myself today and here are my findings. If the output dir is a subdir of the CCACHE_BASEDIR the issue reproduces. The source/header files in the *.d files can be relative without impacting how make resolves them, but if the object files are relative, then make doesn't understand that they should match the absolute files used in the rest of the makefiles.

Kim is likely not seeing this because in Oracle builds, where we add a custom repository outside of the OpenJDK repository, we typically put the build dir outside of the OpenJDK repo, and so outside of CCACHE_BASEDIR. This is actually a bug with the ccache configuration. We should change the definition of CCACHE_BASEDIR to be $WORKSPACE_ROOT instead of $TOPDIR (otherwise source files in the Oracle repo will not be treated the same as source files in the OpenJDK repo).

So in summary, this fix is needed, and we have a different bug with ccache handling for Oracle builds that hid it from us.
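
Whether a given layout is affected comes down to whether the output directory lies inside the base directory. A small sketch of that check in Python; the function name and the /work/... paths are hypothetical, and the check is purely textual (no symlink resolution).

```python
import posixpath


def is_inside(base_dir: str, path: str) -> bool:
    """Return True if the absolute `path` is base_dir itself or lies below it."""
    base = posixpath.normpath(base_dir)
    return posixpath.commonpath([base, posixpath.normpath(path)]) == base


# Build dir inside the source tree: object paths get rewritten, issue reproduces.
in_tree = is_inside("/work/jdk", "/work/jdk/build/linux-x64")

# Build dir next to the source tree: object paths stay absolute, issue is hidden.
out_of_tree = is_inside("/work/jdk", "/work/build/linux-x64")
```

This matches the findings above: an output directory outside the base directory keeps its absolute object paths, so the bug stays hidden in that setup.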

@mlbridge

mlbridge bot commented Aug 16, 2023

Mailing list message from Kim Barrett on build-dev:

On Aug 14, 2023, at 10:09 AM, Erik Joelsson <erikj at openjdk.org> wrote:

On Wed, 9 Aug 2023 12:36:59 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:

https://bugs.openjdk.org/browse/JDK-8313374
--enable-ccache's CCACHE_BASEDIR breaks builds

Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

Use true/false for REWRITE_PATHS_RELATIVE and initialize it
- suggested by Erik Joelsson

I tried this myself today and here are my findings. If the output dir is a subdir of the `CCACHE_BASEDIR` the issue reproduces. The source/header files in the *.d files can be relative without impacting how make resolves them, but if the object files are relative, then make doesn't understand that they should match the absolute files used in the rest of the makefiles.

Kim is likely not seeing this because in Oracle builds, because we add a custom repository outside of the OpenJDK repository, we typically put the build dir outside of the OpenJDK repo, and so outside of `CCACHE_BASEDIR`. This is actually a bug with the ccache configuration. We should change the definition of `CCACHE_BASEDIR` to be `$WORKSPACE_ROOT` instead of `$TOPDIR` (otherwise source files in the Oracle repo will not be treated the same as source files in the OpenJDK repo).

So in summary, this fix is needed, and we have a different bug with ccache handling for Oracle builds that hid it from us.

Thanks for chasing that down, Erik. On that basis, I've no issues with the proposed change.


@yan-too

yan-too commented Aug 23, 2023

/sponsor

@openjdk

openjdk bot commented Aug 23, 2023

Going to push as commit 571c435.
Since your change was applied there have been 261 commits pushed to the master branch:

  • d1de3d0: 8313901: [TESTBUG] test/hotspot/jtreg/compiler/codecache/CodeCacheFullCountTest.java fails with java.lang.VirtualMachineError
  • a0d0f21: 8314752: Use google test string comparison macros
  • 7e843c2: 8284772: GHA: Use GCC Major Version Dependencies Only
  • ba6cdbe: 8309214: sun/security/pkcs11/KeyStore/CertChainRemoval.java fails after 8301154
  • 9f4a9fe: 8312434: SPECjvm2008/xml.transform with CDS fails with "can't seal package nu.xom"
  • 7c169a4: 8312232: Remove sun.jvm.hotspot.runtime.VM.buildLongFromIntsPD()
  • 2eae13c: 8214248: (fs) Files:mismatch spec clarifications
  • ce1ded1: 8314749: Remove unimplemented _Copy_conjoint_oops_atomic
  • 32bf468: 8314274: G1: Fix -Wconversion warnings around G1CardSetArray::_data
  • eb06572: 8313408: Use SVG for BoxLayout example
  • ... and 251 more: https://git.openjdk.org/jdk/compare/ad34be1f329edc8e7155983835cc70d733c014b8...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 23, 2023
@openjdk openjdk bot closed this Aug 23, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Aug 23, 2023
@openjdk

openjdk bot commented Aug 23, 2023

@yan-too @jankratochvil Pushed as commit 571c435.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
build build-dev@openjdk.org integrated Pull request has been integrated
6 participants