-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8292269: Replace FileMapInfo::fail_continue() with Unified Logging #12419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back matsaave! A progress list of the required criteria for merging this PR into |
@matias9927 The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
src/hotspot/share/cds/filemap.cpp
Outdated
va_end(ap); | ||
} | ||
|
||
void FileMapInfo::fail_continue(LogLevelType level, const char *msg, ...) { | ||
void FileMapInfo::fail_continue_nowarn(const char *msg, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to name a method by what it does do rather than what it doesn't.
But I am finding the naming here problematic anyway as we do not always continue - the logic is basically possibly_fail_else_log_and_continue
. Maybe the simplest thing here is to only have the variant that takes the LogLevel
and so every callsite is explicit about whether it logs at warning or info level rather than try to accommodate warn/info into the name somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the fail_continue
is not the ideal name for what this is trying to accomplish: log a message and possibly fail.
That being said, I felt it was appropriate to abstract away the need to manage LogLevel
as only Info and Warning are used with fail_continue
. If someone does want to use any of the other channels, like debug, error, etc, they can still use the fail_continue
variant that excepts a LogLevel
, but they wouldn't have to worry about it otherwise. I think the binary option of warning vs no warning can help to keep it simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I return to "It is better to name a method by what it does do rather than what it doesn't." So fail_continue
and fail_continue_warn
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the methods can be renamed to
FileMapInfo::notify_failure(LogLevelType level, const char *msg, ...);
FileMapInfo::notify_failure(const char *msg, ...); // use warning channel
In most cases we use the warning channel, so that should be the default.
The naming of notify_failure()
follows existing functions such as VM_GC_Operation::notify_gc_begin()
to indicate that something has happened and some action will happen as a result.
In our case, it's reasonable to expect that some logging may happen as a result of the failure, and the VM may even exit.
By the way, the VM exits only if -Xshare:on
is specified in the command-line. As specified in https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html, -Xshare:on
is indented for testing only and should not be used in production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be in line with @dholmes-ora's suggestion, would we use the names notify_failure
and notify_failure_warn
? That seems a bit redundant and ambiguous to me, so I propose notify_failure
for the warning channel and log_failure
for the info channel and other possible levels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, after thinking about this more, I think we should rewrite all the calls to FileMapInfo::fail_continue()
with log_xxx()
calls. This will bring the CDS logging code up-to-date and we can get rid of the convoluted fail_continue()
function.
fail_continue()
was ancient code from the days when Unified Logging wasn't available. It currently does three things:
- (a) If
PrintSharedArchiveAndExit
orRequireSharedSpaces
are enabled, always log the message so it's visible to the user - (b) If
RequireSharedSpaces
is enabled, exit the VM immediately - Otherwise, log the message to the
cds
channel.
As mentioned in JDK-8292269, we want to log some of the cds failure messages to the info
channel, to avoid warning messages when the failure is expected (i.e., when -XX:-UseCompressedClassPointers
is specified). My proposal is change all the fail_continue()
calls to log_info(cds)
or log_warning(cds)
The reason for (a) is: PrintSharedArchiveAndExit
is for diagnosing problems with the CDS archive; RequireSharedSpaces
, which is driven by the -Xshare:on
command-line flag, which should be used only in the regression test suite (see here). In both these cases, we can simply force the cds
logging level to info
inside arguments.cpp (using LogConfiguration::configure_stdout()
).
For (b), we don't have to exit immediately when the error is detected. We can exit the VM in MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
when result != MAP_ARCHIVE_SUCCESS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, apart from the comment.
Are any of the tests affected? Some tests may be sensitive to whether the log is from warning or info.
src/hotspot/share/cds/filemap.cpp
Outdated
@@ -2591,7 +2553,7 @@ GrowableArray<const char*>* FileMapInfo::_non_existent_class_paths = nullptr; | |||
// Open the shared archive file, read and validate the header | |||
// information (version, boot classpath, etc.). If initialization | |||
// fails, shared spaces are disabled and the file is closed. [See | |||
// fail_continue.] | |||
// log_warning(cds).] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The [See ...]
part of the comment should be removed.
@matias9927 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 116 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@iklam, @dholmes-ora, @calvinccheung) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
The PR description and the JBS issue should also be updated to reflect the latest design. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second Ioi's request to update the JBS issue etc to reflect the conversion from fail_continue to use of log_warning.
Otherwise this seems fine. I will assume tests pass with these changes.
Thanks.
@@ -1009,7 +1009,7 @@ void MetaspaceShared::initialize_runtime_shared_and_meta_spaces() { | |||
// The base archive cannot be mapped. We cannot dump the dynamic shared archive. | |||
AutoCreateSharedArchive = false; | |||
DynamicDumpSharedSpaces = false; | |||
FileMapInfo::fail_continue("Unable to map shared spaces"); | |||
log_info(cds)("Unable to map shared spaces"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to exit the VM if RequireSharedSpaces
is true (this replaces the logic in FileMapInfo::fail_continue). Otherwise, "java -Xshare:on" will continue execution even if the archive(s) fails to map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update looks good.
@matias9927 this pull request can not be integrated into git checkout cdsWarning_8292269
git fetch https://git.openjdk.org/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unapproving as this latest set of changes is more than just minor tweaks.
Not at all clear why these latest changes were necessary. The code seems messier now.
Why do the tests now explicitly need -Xshare:auto?
if (UseSharedSpaces && RequireSharedSpaces) { | ||
LogConfiguration::configure_stdout(LogLevel::Info, true, LOG_TAGS(cds)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
|
||
if (static_mapinfo != nullptr) { | ||
log_info(cds)("Core region alignment: " SIZE_FORMAT, static_mapinfo->core_region_alignment()); | ||
dynamic_mapinfo = open_dynamic_archive(); | ||
|
||
dynamic_mapinfo = open_dynamic_archive(has_failed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_failed
seems unused in relation to the open_dynamic_archive
call.
@@ -974,7 +980,7 @@ FileMapInfo* MetaspaceShared::open_static_archive() { | |||
return mapinfo; | |||
} | |||
|
|||
FileMapInfo* MetaspaceShared::open_dynamic_archive() { | |||
FileMapInfo* MetaspaceShared::open_dynamic_archive(bool &has_failed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the null return suffice for failure checking, possibly in combination with a flag check?
That was my suggestion to Matias, but I realized there's a better way to do it. My new suggestion is here: Background: there are two kinds of CDS mapping errors: soft and hard.
Before this PR, both soft and hard errors are logged to stderr when In the PR, we have chosen to log the soft errors to the Some of the CDS tests check for the soft errors. Before this PR, these tests would find the expected messages in stderr. After this PR, we need to explicitly add |
Okay but we now seem to have diverged somewhat from the topic of "Log more CDS failure messages in the warning channel". Maybe this should be returned to draft whilst these details are fleshed out. |
I updated the JBS bug report to match the latest implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks.
A couple of minor queries on the tests.
One nit with the exit strategy - partly pre-existing - it is not clear why vm_exit_during_initialization
is used in some places and fail_stop
in others. It seems to me that fail_stop
may no longer be required given the code doesn't use the tty anyway. But consistency would be nice.
vm_exit_during_initialization("Unable to use shared archive."); | ||
} else if (RequireSharedSpaces) { | ||
FileMapInfo::fail_stop("Unable to map shared spaces"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks odd to use fail_stop
here and vm_exit_during_initialization
just before it. ??
@@ -253,7 +253,6 @@ private static void doTest(String baseArchiveName, String topArchiveName) throws | |||
"-XX:+RecordDynamicDumpInfo", | |||
"--patch-module", | |||
"foo.bar=xxx", | |||
"-Xshare:auto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this?
@@ -64,6 +64,7 @@ static void test() throws Exception { | |||
"-XX:+WhiteBoxAPI", | |||
use_whitebox_jar, | |||
"-Xlog:cds=debug", | |||
"-Xshare:auto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add this?
One test was failing due to an unexpected output. This PR moves the vm exit to a later stage for most cases, but in order to preserve the pre-change behavior, an early exit was added for a specific case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates look good. Just one nit.
Thank you for the reviews and feedback @calvinccheung , @dholmes-ora , and @iklam ! |
@matias9927 |
/sponsor |
Going to push as commit cd4b88d.
Your commit was automatically rebased without conflicts. |
@calvinccheung @matias9927 Pushed as commit cd4b88d. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The current logging method FileMapInfo::fail_continue() method reports the reason when the CDS archive cannot be mapped and has variable behavior depending on the -Xshare mode. Logging calls are divided primarily between the "warning" and "info" channels, where the "info" channel was used to reduce the number of printed logs for expected and normal failures.
Logging will now shift toward Unified Logging, so
fail_continue()
will be replaced withlog_info(cds)
andlog_warning(cds)
. Some genuine failures will be logged to the warning channel for better visibility. Upon actual failures, the VM will exit at MetaspaceShared::initialize_runtime_shared_and_meta_spaces() unless it is otherwise necessary. Relevant tests are updated to accommodate this change. Verified with tier1-4 testsProgress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12419/head:pull/12419
$ git checkout pull/12419
Update a local copy of the PR:
$ git checkout pull/12419
$ git pull https://git.openjdk.org/jdk pull/12419/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12419
View PR using the GUI difftool:
$ git pr show -t 12419
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12419.diff