Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore options file and vm support #16543

Merged
merged 1 commit into from
Feb 1, 2023
Merged

Conversation

tajila
Copy link
Contributor

@tajila tajila commented Jan 11, 2023

Restore options file and vm support

Introduce options file to provide the ability to supply cmdline options on restore. Initially only properties will be supported. Further support for JVM options will be added later.

Signed-off-by: Tobi Ajila atobia@ca.ibm.com

@tajila tajila force-pushed the criu2 branch 8 times, most recently from 8377116 to ded5726 Compare January 12, 2023 18:52
@tajila tajila marked this pull request as ready for review January 12, 2023 20:51
@tajila
Copy link
Contributor Author

tajila commented Jan 12, 2023

@DanHeidinga Please review these changes


J9InternalCheckpointHookAPI.registerPostRestoreHook(RESTORE_OPTIONS_FILE_PRIORITY,
"Restore options via options file:" + optionsFilePath, () -> { //$NON-NLS-1$
try (BufferedReader envFileReader = new BufferedReader(new FileReader(optionsFilePath))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optionsFileReader fits better here.

bool result = true;
J9JavaVM *vm = currentThread->javaVM;
J9JavaVMArgInfoList vmArgumentsList;
UDATA ignored;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable missed initialization though its value is to be ignored.

runtime/criusupport/criusupport.cpp Show resolved Hide resolved
for (i = 0; i < initArgs->nOptions; ++i) {
char * optionString = initArgs->options[i].optionString;

if (strncmp("-D", optionString, 2) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (strncmp("-D", optionString, 2) == 0) {
if (0 == strncmp("-D", optionString, 2)) {

UDATA propNameLen = 0;

propValue = strchr(optionString + 2, '=');
if (propValue == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (propValue == NULL) {
if (NULL == propValue) {

goto done;
}

vm->checkpointState.restoreArgsList = createJvmInitArgs(vm->portLibrary, vm->vmArgsArray->actualVMArgs, &vmArgumentsList, &ignored);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no issue at the default mode -XX:+CRIURestoreNonPortableMode here though multiple checkpoints could be prepared w/ a bit of cleanup before creating a new one.

		if (NULL != vm->checkpointState.restoreArgsList) {
			destroyJvmInitArgs(vm->portLibrary, vm->checkpointState.restoreArgsList);
		}

@@ -572,3 +572,11 @@ J9NLS_JCL_CRIU_LOADING_LIBCRIU_FUNCTIONS_FAILED.explanation=CRIUSupport::checkpo
J9NLS_JCL_CRIU_LOADING_LIBCRIU_FUNCTIONS_FAILED.system_action=The JVM will throw a SystemCheckpointException.
J9NLS_JCL_CRIU_LOADING_LIBCRIU_FUNCTIONS_FAILED.user_response=Check your installation of criu with `criu check`.
# END NON-TRANSLATABLE

J9NLS_JCL_CRIU_LOADING_OPTIONS_FILE_FAILED=The JVM could load the options file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a not missed.

# START NON-TRANSLATABLE
J9NLS_JCL_CRIU_LOADING_OPTIONS_FILE_FAILED.sample_input_1=1
J9NLS_JCL_CRIU_LOADING_OPTIONS_FILE_FAILED.explanation=CRIUSupport::checkpointJVM failed.
J9NLS_JCL_CRIU_LOADING_OPTIONS_FILE_FAILED.system_action=The JVM will throw a SystemCheckpointException.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this JVMRestoreException according to

		if (false == loadRestoreArguements(currentThread, optionsFileChars)) {
			currentExceptionClass = vm->checkpointState.criuJVMRestoreExceptionClass;
			nlsMsgFormat = j9nls_lookup_message(J9NLS_DO_NOT_PRINT_MESSAGE_TAG | J9NLS_DO_NOT_APPEND_NEWLINE,
				J9NLS_JCL_CRIU_LOADING_OPTIONS_FILE_FAILED, NULL);
			goto wakeJavaThreadsWithExclusiveVMAccess;
		}

goto fail;
}
}
if (j2seVersion >= J2SE_V11) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be replaced w/ #if JAVA_SPEC_VERSION >= 11.

@tajila tajila force-pushed the criu2 branch 2 times, most recently from 50eadfd to 781d071 Compare January 18, 2023 14:17
Copy link
Member

@DanHeidinga DanHeidinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this PR as it appears to be setting the properties both natively and in a Java-level restore hook. What am I missing?

The VM collects the initial set of properties and then, during one of the JCL init stages, hands sets them into the class library code and the class libs become the master copy.

Assuming my knowledge on that is still correct, this PR looks like it needs some updates to the design

*/
public CRIUSupport registerRestoreOptionsFile(Path optionsFile) {
String optionsFilePath = optionsFile.toAbsolutePath().toString();
this.optionsFile = new StringBuilder("-Xoptionsfile=").append(optionsFilePath).toString(); //$NON-NLS-1$
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use StringBuilder here? The straightforward "-Xoptionsfile="+optionsFilePath is both shorter and more straightforward given this is code that is likely executed once.

"Restore options via options file:" + optionsFilePath, () -> { //$NON-NLS-1$
try (BufferedReader optionsFileReader = new BufferedReader(new FileReader(optionsFilePath))) {
String entry = null;
while ((entry = optionsFileReader.readLine()) != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a Stream<String> lines() method on BufferedReader that handles the line reading for you:

br.lines().forEach(.......)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any luck with Dan's suggestion here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to do an explicit readLine to support line continuations. Documentation indicates this may cause undefined behaviour when used with streams

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need an explicit readLine? Use an extra variable as an accumulator something like this:

String options = null;
br.lines().forEach(l -> {
  l = l.trim(); 
  if (l.endsWith("\\")) { 
    option = l.substring(0, l.length()-1);
  } else { 
    if (option != null) { 
       l = option.concat(l); 
       option=null;
     } 
     System.out.println(l);
   }
});

@@ -341,6 +343,38 @@ releaseSafeOrExcusiveVMAccess(J9VMThread *currentThread, J9InternalVMFunctions *
}
}

static bool
loadRestoreArguements(J9VMThread *currentThread, const char *optionsFile)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
loadRestoreArguements(J9VMThread *currentThread, const char *optionsFile)
loadRestoreArguments(J9VMThread *currentThread, const char *optionsFile)

UDATA ignored = 0;

if (NULL != vm->checkpointState.restoreArgsList) {
destroyJvmInitArgs(vm->portLibrary, vm->checkpointState.restoreArgsList);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is destroy the right choice here? I assume this is handling the case where we're restoring from a previously restored checkpoint? If that's the case, then we should chain the previously set args so we can see all the previous values when servicing an issue

Comment on lines 543 to 547
/* Create the -D properties. This may override any of the writeable properties above.
Should the command line override read-only props? */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting:

Suggested change
/* Create the -D properties. This may override any of the writeable properties above.
Should the command line override read-only props? */
/* Create the -D properties. This may override any of the writeable properties above.
* Should the command line override read-only props?
*/

@tajila
Copy link
Contributor Author

tajila commented Jan 19, 2023

I'm confused by this PR as it appears to be setting the properties both natively and in a Java-level restore hook. What am I missing?

We need the System properties to be set in both because the native structure needs to be updated of things like jvmti GetSystemProperties (https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#GetSystemProperties). And the java-level for System::getProperties.

The VM collects the initial set of properties and then, during one of the JCL init stages, hands sets them into the class library code and the class libs become the master copy.

The reason it didnt reuse the init helper is that the native list and the java-level list arent always the same. Calling System.setProperty will not update the native list. So if at restore time I update the native list and use System::getPropertyList to rebuild the java list instead of parsing the file then there is a chance that it will overwrite some updates that were made between checkpoint and startup.

Perhaps thats an argument for not updating the native list at all.

@DanHeidinga
Copy link
Member

I'm confused by this PR as it appears to be setting the properties both natively and in a Java-level restore hook. What am I missing?

We need the System properties to be set in both because the native structure needs to be updated of things like jvmti GetSystemProperties (https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#GetSystemProperties). And the java-level for System::getProperties.

Interesting. Reading the JVMTI docs, it appears that JVMTI only allows setting vm-properties during the "On_Load" phase though they can be read after that. It also states that the JVMTI view and the System::getProperties view are likely to diverge during execution and that JNI should be used to get the accurate view.

Given that a restore is happening after the "On_Load" phase, I don't think it makes sense to update the VM view of the properties as they're already spec'd to diverge.

The JVMTI docs also state properties set on the command line are included so an argument could be made to add them to the vm-view but I think, as we've previously discussed, that since restore is a restore, not a start, that adding them may break anyone who expects the list to be static after the "On_Load" phase ends.

That's my long way of saying that I think we shouldn't update the VM view as we're not in the "On_Load" phase, it may break anyone relying on the spec behaviour, and philosophically, we're a restore not start. Oh, and it matches the existing spec'd behaviour of the java and vm views diverging.

The VM collects the initial set of properties and then, during one of the JCL init stages, hands sets them into the class library code and the class libs become the master copy.

The reason it didnt reuse the init helper is that the native list and the java-level list arent always the same. Calling System.setProperty will not update the native list. So if at restore time I update the native list and use System::getPropertyList to rebuild the java list instead of parsing the file then there is a chance that it will overwrite some updates that were made between checkpoint and startup.

Perhaps thats an argument for not updating the native list at all.

Another good argument! +1 to not update the native list.

@tajila tajila marked this pull request as draft January 20, 2023 20:02
@tajila tajila force-pushed the criu2 branch 11 times, most recently from 7c649f9 to 445db14 Compare January 25, 2023 19:36
@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

jenkins test sanity xlinux jdk17

@tajila tajila force-pushed the criu2 branch 2 times, most recently from d04ae5b to 3b72514 Compare January 27, 2023 22:54
@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

jenkins test sanity xlinux jdk17

@@ -638,6 +689,13 @@ Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl(JNIEnv *env,
/* We can only end up here if the CRIU restore was successful */
isAfterCheckpoint = TRUE;

if (false == loadRestoreArguments(currentThread, optionsFileChars)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check NULL != optionsFile to avoid loadRestoreArguments w/ an empty buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addXOptionsFile handles the empty optionsFileChars. I want to call loadRestoreArguments uncondiontally because Ill have a follow on PR that also looks for options in the env variables (IBM_JAVA_OPTIONS, etc.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPENJ9_JAVA_OPTIONS right? =)

@@ -530,6 +530,107 @@ getLibSubDir(J9JavaVM *vm, const char *subDir, char **value)
}

#define COM_SUN_MANAGEMENT "com.sun.management"

UDATA
loadCMDLineSystemProperties(J9JavaVM *vm, JavaVMInitArgs *initArgs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this refactoring still needed? I was kind of expecting it to be removed as we agreed to only support the Java-level ::setProperties call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no Ive removed the change

@tajila tajila force-pushed the criu2 branch 6 times, most recently from d492452 to 6c66535 Compare January 30, 2023 18:48
tajila added a commit to tajila/openj9 that referenced this pull request Jan 30, 2023
With CRIU support, options args can be found in the `vm->argsArray` or
the `vm->checkpointState.restoreArgsList`. The new helpers allow users
to specify which argsArray they want to query.

Related to eclipse-openj9#16543

Signed-off-by: Tobi Ajila <atobia@ca.ibm.com>
@tajila
Copy link
Contributor Author

tajila commented Jan 30, 2023

@DanHeidinga @JasonFengJ9 changes are ready for another look

Copy link
Member

@JasonFengJ9 JasonFengJ9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@DanHeidinga DanHeidinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the minor NPE comment, this looks good to me

propertyValue = propertyValue.substring(0, propertyValue.length() - 1);
propertyValue = propertyValue.trim();
String nextLine = optionsFileReader.readLine();
propertyValue += nextLine.trim();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a malformed file that has a line continuation mark on the last line, this will throw an NPE. Not sure it's a problem but it might be better to throw a JVMRestoreException in that case. (And the code does explicit null checks above, so likely worth doing here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added NPE to the catch block below, this way we also know the cause of the RestoreException

@tajila tajila force-pushed the criu2 branch 2 times, most recently from 336976e to e8a3cb4 Compare January 31, 2023 21:33
Introduce options file to provide the ability to supply cmdline options on restore. Initially only properties will be supported. Further support for JVM options will be added later.

Signed-off-by: Tobi Ajila <atobia@ca.ibm.com>
@tajila
Copy link
Contributor Author

tajila commented Jan 31, 2023

jenkins test sanity xlinux jdk17

@tajila
Copy link
Contributor Author

tajila commented Feb 1, 2023

@DanHeidinga any other concerns

@DanHeidinga DanHeidinga merged commit 219f2e8 into eclipse-openj9:master Feb 1, 2023
if (!entry.isEmpty()) {
if (entry.startsWith("-D")) { //$NON-NLS-1$
String entrySplit[] = entry.split("=", 2); //$NON-NLS-1$
String propertyValue = entrySplit[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw an exception if the line doesn't contain "="; probably not the best user experience.
Perhaps adding to the catch list is reasonable, but it would be helpful to include some of the troublesome input in the exception thrown there.

@keithc-ca keithc-ca mentioned this pull request Feb 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants