Skip to content

JDK-8298448: UndefinedBehaviorSanitizer#11604

Closed
jcking wants to merge 11 commits intoopenjdk:masterfrom
jcking:ubsan
Closed

JDK-8298448: UndefinedBehaviorSanitizer#11604
jcking wants to merge 11 commits intoopenjdk:masterfrom
jcking:ubsan

Conversation

@jcking
Copy link
Contributor

@jcking jcking commented Dec 9, 2022

Allow building OpenJDK with UBSan. Currently the build fails when optimizing the image due to lots of undefined behavior (it invokes the built JVM). Follow up PRs will either replace the undefined behavior with well defined behavior or suppress errors which are intentional. The goal is to make OpenJDK more well defined and thus more portable across compilers and architectures.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11604/head:pull/11604
$ git checkout pull/11604

Update a local copy of the PR:
$ git checkout pull/11604
$ git pull https://git.openjdk.org/jdk pull/11604/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11604

View PR using the GUI difftool:
$ git pr show -t 11604

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11604.diff

Signed-off-by: Justin King <jcking@google.com>
Signed-off-by: Justin King <jcking@google.com>
@bridgekeeper
Copy link

bridgekeeper bot commented Dec 9, 2022

👋 Welcome back jcking! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@jcking jcking changed the title JDK-8298448: Initial integration of UBSan JDK-8298448: UndefinedBehaviorSanitizer Dec 9, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 9, 2022
@openjdk
Copy link

openjdk bot commented Dec 9, 2022

@jcking The following labels will be automatically applied to this pull request:

  • build
  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Dec 9, 2022
@mlbridge
Copy link

mlbridge bot commented Dec 9, 2022

@robehn
Copy link
Contributor

robehn commented Dec 9, 2022

Two questions, this works fine for me:
UBSAN_CFLAGS="-fsanitize=undefined -fsanitize=float-divide-by-zero -Wno-stringop-truncation -Wno-format-overflow -fno-omit-frame-pointer" ; UBSAN_LDFLAGS="-fsanitize=undefined -fsanitize=float-divide-by-zero" ; sh configure --with-conf-name=udf --with-debug-level=fastdebug --with-native-debug-symbols=internal -with-version-opt=releaserobbin --with-gtest=/home/rehn/source/gtest181/ --disable-precompiled-headers --enable-dtrace --with-extra-cflags="$UBSAN_CFLAGS" --with-extra-cxxflags="$UBSAN_CFLAGS" --with-extra-ldflags="$UBSAN_LDFLAGS" && make images CONF=udf
I.e.build works ?

Secondly, the method __ubsan_default_options is in the java launcher.
libjvm.so is a "standalone product" that can be used by different launchers.
So I'm not sure it should be in launcher if it was needed.

@jcking
Copy link
Contributor Author

jcking commented Dec 9, 2022 via email

@robehn
Copy link
Contributor

robehn commented Dec 9, 2022

What version of GCC are you using?

gcc 11.3 with libubsan 11.2

Also it seem to big overlap with -Wcast-align(=strict) for the warnings/errors I see and I do like that warning.
Do you have an idea if the coverage are pretty much the same for this particular warning?
I.e. is anyone of them better than the other?

@theRealAph
Copy link
Contributor

What version of GCC are you using?

gcc 11.3 with libubsan 11.2

Also it seem to big overlap with -Wcast-align(=strict) for the warnings/errors I see and I do like that warning. Do you have an idea if the coverage are pretty much the same for this particular warning? I.e. is anyone of them better than the other?

You can't really compare them because Undefined Behaviour is a runtime thing rather than a compile-time thing. Something like x = *(char*)0; is legal in a conforming C++ program as long as it's never executed.

I vastly prefer UB Sanitizer because it detects a ton of stuff that static analysis can't. But there's plenty of room for both.

@theRealAph
Copy link
Contributor

theRealAph commented Dec 9, 2022

I'm really pleased to see this.

Im working on #10920, which fixes the last of the null pointer dereferences in x86/AArch64 HotSpot. Once that's done I'm thinking of enabling -fsanitize=null by default in debug builds, as long as it's not too slow.

@openjdk
Copy link

openjdk bot commented Dec 9, 2022

@jcking This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8298448: UndefinedBehaviorSanitizer

Reviewed-by: erikj, ihse

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 390 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@erikj79, @dholmes-ora, @magicus) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 9, 2022
Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it requires much broader discussion as to whether OpenJDK is actively seen to endorse these tools. Why these tools? What if there are other tools, should we support them all?

I'm not saying use of these tools may not be useful, but actually incorporating them into OpenJDK is a decision that needs to be made at a higher-level IMO.

@jcking
Copy link
Contributor Author

jcking commented Dec 12, 2022

I think it requires much broader discussion as to whether OpenJDK is actively seen to endorse these tools. Why these tools? What if there are other tools, should we support them all?

I'm not saying use of these tools may not be useful, but actually incorporating them into OpenJDK is a decision that needs to be made at a higher-level IMO.

The sanitizers are integrated directly with GCC and Clang/LLVM and are used by projects such as the Linux kernel. They are also used by companies such as Facebook and Google, which IIRC maintain some of the largest closed source mono repositories on the planet. As the sanitizers are integrated directly with GCC and Clang/LLVM, they are extremely easy to use (no external dependencies), fast, and have no direct alternatives. An alternative would also need to be integrated with the compilers in order to be at par.

Additionally configuration options for using ASan already exist in OpenJDK, so that ship has kinda sailed.

If we feel strongly about a discussion, we should probably discuss all the sanitizers as a whole. However that discussion can be done in parallel, as ASan is already used. Just adding the options to OpenJDK does not mean it is endorsed.

@jcking
Copy link
Contributor Author

jcking commented Dec 12, 2022

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Dec 12, 2022
@openjdk
Copy link

openjdk bot commented Dec 12, 2022

@jcking
Your change (at version d878ede) is now ready to be sponsored by a Committer.

@dholmes-ora
Copy link
Member

@jcking this is not ready for integration. You have one review from build team. You have no reviews from core-libs for launcher change. You haven't even bothered to address the comments I made on the actual changes.

@jcking
Copy link
Contributor Author

jcking commented Dec 12, 2022

/reviewers 2

@openjdk
Copy link

openjdk bot commented Dec 12, 2022

@jcking
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@jcking
Copy link
Contributor Author

jcking commented Dec 12, 2022

@jcking this is not ready for integration. You have one review from build team. You have no reviews from core-libs for launcher change.

Added reviewers to be 2. I had assumed the bot would take care of getting reviews from both. Apparently it does not, will keep that in mind going forward.

You haven't even bothered to address the comments I made on the actual changes.

I addressed the comments after integrate, my bad.

@openjdk openjdk bot removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated labels Dec 12, 2022
@jcking
Copy link
Contributor Author

jcking commented Dec 13, 2022 via email

@robehn
Copy link
Contributor

robehn commented Dec 13, 2022

I guess the advantage to putting this in the build machinery (as opposed to using --with-extra-cflags=-fsanitize=undefined --with-extra-ldflags=-fsanitize=undefined) is that we can turn some of these onn by default once we've fixed each category of UB. Is that right?

It will take a while, look a bit on align issue, we have so much code which go from pointer to large -> small -> large, e.g.

    address addr = data() + offset;
    return (ImmutableOopMap*) addr;

In this case data() needs to return something with the same alignment as a ptr and offset must be in even in ptr steps.

@erikj79
Copy link
Member

erikj79 commented Dec 13, 2022

Nope. Some targets end up passing C++ flags to the C compiler, causing a failure.

Ah right, we (mis)use CFLAGS (instead of CXXFLAGS) in some SetupNativeCompilation calls when all source files are C++. In that case, your suggested patch makes sense to me.

@jcking
Copy link
Contributor Author

jcking commented Dec 14, 2022

I guess the advantage to putting this in the build machinery (as opposed to using --with-extra-cflags=-fsanitize=undefined --with-extra-ldflags=-fsanitize=undefined) is that we can turn some of these onn by default once we've fixed each category of UB. Is that right?

It will take a while, look a bit on align issue, we have so much code which go from pointer to large -> small -> large, e.g.

    address addr = data() + offset;
    return (ImmutableOopMap*) addr;

In this case data() needs to return something with the same alignment as a ptr and offset must be in even in ptr steps.

Yeah, some of the cases we may just have to suppress if they are not feasible to fix without lots of effort. My intention is to fix all the low hanging fruit, and then suppress the remaining cases. For the suppressed cases, I'll file P4 bugs for each one. Then if somebody feels like fixing them, they can do it.

@magicus
Copy link
Member

magicus commented Dec 16, 2022

Sorry for the late review.

"Magic" solutions like this is always a bit scary. But we have some precedence in terms of the "autoheaders" for assembly files, so I guess this is not worse.

However, I do think the included source files should be treated like the autoheaders, and reside in data rather than in src. The latter is intended for buildtools, even though they are a bit scattered at the moment (there is a long-standing JBS bug about fixing that), and these are not really intended to be compiled into a product by itself, rather as something "injected" into all builds. I'd recommend make/data/ubsan, and perhaps skip the leading __.

@magicus
Copy link
Member

magicus commented Dec 16, 2022

I much also check: is this really needed for all executables we make? I would have guessed that it would suffice with the "launchers", i.e. like java, javac, jar etc. These are all compiled from a single source file, src/java.base/share/native/launcher/main.c, with different defines depending on what Java classes it should actually launch, and with different output names.

Doing things this way will also affect non-launcher executables, like jabswitch, msiwrapper and all executable test binaries.

That might be correct, but I could not be certain of that from trying to read the backlog of discussion in this bug.

Signed-off-by: Justin King <jcking@google.com>
Signed-off-by: Justin King <jcking@google.com>
@jcking
Copy link
Contributor Author

jcking commented Dec 17, 2022

However, I do think the included source files should be treated like the autoheaders, and reside in data rather than in src. The latter is intended for buildtools, even though they are a bit scattered at the moment (there is a long-standing JBS bug about fixing that), and these are not really intended to be compiled into a product by itself, rather as something "injected" into all builds. I'd recommend make/data/ubsan, and perhaps skip the leading __.

Moved to the directory suggested.

@jcking
Copy link
Contributor Author

jcking commented Dec 17, 2022

I much also check: is this really needed for all executables we make? I would have guessed that it would suffice with the "launchers", i.e. like java, javac, jar etc. These are all compiled from a single source file, src/java.base/share/native/launcher/main.c, with different defines depending on what Java classes it should actually launch, and with different output names.

Doing things this way will also affect non-launcher executables, like jabswitch, msiwrapper and all executable test binaries.

That might be correct, but I could not be certain of that from trying to read the backlog of discussion in this bug.

For UBSan, the inclusion in every executable is intentional. For upcoming LSan/ASan stuff, it will be just for launchers. The reason is that UBSan is relaxed by default and won't terminate upon undefined behavior. LSan/ASan are strict by default, and for LSan to work we need to disable some things only when the JVM is being used. So for this commit all executables is correct. For ASan/LSan as you suggest only launchers will have them in that case.

@jcking jcking requested review from magicus and robehn and removed request for robehn December 27, 2022 07:54
@jcking
Copy link
Contributor Author

jcking commented Jan 4, 2023

Friendly poke.

@robehn
Copy link
Contributor

robehn commented Jan 5, 2023

I think @magicus is back next week. Since this is mainly build stuff I'll sit this one out.

@jcking
Copy link
Contributor Author

jcking commented Jan 5, 2023

Ah okay, no problem. Thanks for the update.

@magicus
Copy link
Member

magicus commented Jan 12, 2023

Hi, back from vacation today. Thanks for waiting on my input! This looks good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 12, 2023
@jcking
Copy link
Contributor Author

jcking commented Jan 12, 2023

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jan 12, 2023
@openjdk
Copy link

openjdk bot commented Jan 12, 2023

@jcking
Your change (at version fd4b963) is now ready to be sponsored by a Committer.

@magicus
Copy link
Member

magicus commented Jan 12, 2023

/sponsor

@openjdk
Copy link

openjdk bot commented Jan 12, 2023

Going to push as commit 7a85d95.
Since your change was applied there have been 391 commits pushed to the master branch:

  • 26890d1: 8296478: Rework 8282948 and 8282700 to use the new autoconf UTIL_ARG_WITH
  • cf00d09: 8299330: Minor improvements in MSYS2 Workflow handling
  • 48c8fb3: 8299978: Remove MethodHandleNatives.getMembers
  • 89d3f3d: 8299125: UnifiedOopRef in JFR leakprofiler should treat thread local oops correctly
  • 4b57334: 8278326: Socket close is not thread safe and other cleanup
  • 036c808: 8298482: Implement ParallelGC NUMAStats for Linux
  • 0ee8cac: 8299672: Enhance HeapDump JFR event
  • d716ec5: 8299179: ArrayFill with store on backedge needs to reduce length by 1
  • af8d3fb: 8264684: os::get_summary_cpu_info padded with spaces
  • 78b1686: 8227257: javax/swing/JFileChooser/4847375/bug4847375.java fails with AssertionError
  • ... and 381 more: https://git.openjdk.org/jdk/compare/d5cf18e7fb591185eecb042bfa015609ea7d15e0...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 12, 2023
@openjdk openjdk bot closed this Jan 12, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jan 12, 2023
@openjdk
Copy link

openjdk bot commented Jan 12, 2023

@magicus @jcking Pushed as commit 7a85d95.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

6 participants