New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GEODE-10327: Overhaul GfshRule to kill processes and save artifacts #7758
GEODE-10327: Overhaul GfshRule to kill processes and save artifacts #7758
Conversation
…pache#7731) PROBLEM Tests that use GfshRule leave behind orphaned processes and do not save artifacts for debugging failures. SOLUTION GfshRule needs to cleanup all processes it forks. It also needs to save off all runtime artifacts such as logging, stats, pid files, diskstores to enable debugging of test failures. DETAILS Enhance GfshRule and modify all tests using it for proper debugging and to prevent test pollution. Overhaul of GfshRule: * kill ALL geode processes during cleanup * use FolderRule to ensure all logs and files are properly saved off when a test fails * extract GfshExecutor from JUnit rule code * GfshExecutor allows a test to use any number of Geode versions with just one GfshRule * add Gfsh log level support for easier debugging * add support for new VmConfiguration to allow control over Geode and Java versions * overhaul API of GfshRule and companion classes for better consistency and design New FolderRule: * replaces TemporaryFolder and saves off all content when a test fails * creates root directory under the gradle worker instead of under temp Update HTTP session caching module tests: * use new FolderRule to save all artifacts when a test fails * use nio Paths for filesystem variables Update acceptance and upgrade tests that use GfshRule: * use new improved GfshRule and GfshExecutor * use new FolderRule instead of TemporaryFolder to save all artifacts when a test fails * use --disable-default-server in tests with no clients * fix flakiness of many tests by using random ports instead of default or hardcoded port values * reformat GfshRule API usage in tests to improve readability and consistency * add GfshStopper to provide common place to await process stop (stop locator/server is async so restarting with same ports is very prone to hitting BindExceptions) Update ProcessUtils: * extract NativeProcessUtils and make it public for direct use * rename InternalProcessUtils as ProcessUtilsProvider and move to its own class * rethrow IOExceptions as UncheckedIOExceptions * fix flakiness in NativeProcessUtilsTest by moving findAvailablePid into test method Minor changes: * improve code formatting and readability * convert from old io File to nio Path APIs as much as possible * close output streams to fix filesystem issues on Windows Fixes flaky test tickets: * DeployJarAcceptanceTest GEODE-9615 * possibly other tests that uses GfshRule Changes for resubmit: * log error message if unable to delete folder NOTES The jdk8, jdk17 and windows labels were used to run tests on more environments. This PR contains mostly test and framework changes. The only product code altered is ServerLauncher and several classes in org.apache.geode.internal.process, all of which is in geode-core.
* Restore missing default constructor to GfshRule * Ensure IO streams have proper error handling and don't leak * Polish error handling in GfshRule and FolderRule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opting-in to automated commit message feedback via COMMITWATCHERS. https://cwiki.apache.org/confluence/display/GEODE/Commit+Message+Format suggests that your first commit summary should be 50 characters or less (not including bug# and pr#) to avoid truncation in github and other tools (yours is 62 characters...longer is ok and just check that you've prioritized the most salient words before the 47-character mark).
These suggestions are totally optional. What matters is including enough detail for future contributors to understand what you intended and why.
windows-core-integration-test-openjdk8 failed. It's unrelated to the PR (test doesn't even use GfshRule). It appears to be flaky and needs a ticket filed against it.
|
All tests passed in acceptance-test-openjdk17 and upgrade-test-openjdk8. These jobs hit a concourse failure:
|
…7758) PROBLEM Tests that use GfshRule leave behind orphaned processes and do not save artifacts for debugging failures. SOLUTION GfshRule needs to cleanup all processes it forks. It also needs to save off all runtime artifacts such as logging, stats, pid files, diskstores to enable debugging of test failures. DETAILS Enhance GfshRule and modify all tests using it for proper debugging and to prevent test pollution. Overhaul of GfshRule: * kill ALL geode processes during cleanup * use FolderRule to ensure all logs and files are properly saved off when a test fails * extract GfshExecutor from JUnit rule code * GfshExecutor allows a test to use any number of Geode versions with just one GfshRule * add Gfsh log level support for easier debugging * add support for new VmConfiguration to allow control over Geode and Java versions * overhaul API of GfshRule and companion classes for better consistency and design New FolderRule: * replaces TemporaryFolder and saves off all content when a test fails * creates root directory under the gradle worker instead of under temp Update HTTP session caching module tests: * use new FolderRule to save all artifacts when a test fails * use nio Paths for filesystem variables Update acceptance and upgrade tests that use GfshRule: * use new improved GfshRule and GfshExecutor * use new FolderRule instead of TemporaryFolder to save all artifacts when a test fails * use --disable-default-server in tests with no clients * fix flakiness of many tests by using random ports instead of default or hardcoded port values * reformat GfshRule API usage in tests to improve readability and consistency * add GfshStopper to provide common place to await process stop (stop locator/server is async so restarting with same ports is very prone to hitting BindExceptions) Update ProcessUtils: * extract NativeProcessUtils and make it public for direct use * rename InternalProcessUtils as ProcessUtilsProvider and move to its own class * rethrow IOExceptions as UncheckedIOExceptions * fix flakiness in NativeProcessUtilsTest by moving findAvailablePid into test method Minor changes: * improve code formatting and readability * convert from old io File to nio Path APIs as much as possible * close output streams to fix filesystem issues on Windows Fixes flaky test tickets: * DeployJarAcceptanceTest GEODE-9615 * possibly other tests that uses GfshRule Changes for resubmit: * log error message if unable to delete folder * keep default constructor on GfshRule * ensure IO streams have proper error handling and don't cause failures on windows Changes to build pipelines: * make jdk17 tests gating NOTES The labels jdk8, jdk17, windows, windows-jdk8 and windows-jdk17 were used to run tests on more environments. This PR contains mostly test and framework changes. The only product code altered is ServerLauncher and several classes in org.apache.geode.internal.process, all of which is in geode-core. (cherry picked from commit 495f3b0)
PR needs to be squashed when merged to develop. There are three commits:
If you already reviewed one or both of the first PRs that were reverted, then you can focus on commits 2-3. The most important changes that should be present and reviewed:
GfshRule
should have a default constructor that usesFolderFactory
FolderRule
should no longer use try-finally blocks orMultipleFailureException
Folder.delete()
should log an error message instead of throwingIOException
orUncheckedIOException
ResourceUtils
methods for copying resources should all flow into the private methodResourceUtils.copyResourceToFile(URL, File)
which should close both theInputStream
and theOutputStream
. It does this using a try-with-resources block.GEODE-10327: Overhaul GfshRule to kill processes and save artifacts (#7731)
PROBLEM
Tests that use GfshRule leave behind orphaned processes and do not save
artifacts for debugging failures.
SOLUTION
GfshRule needs to cleanup all processes it forks. It also needs to save
off all runtime artifacts such as logging, stats, pid files, diskstores
to enable debugging of test failures.
DETAILS
Enhance GfshRule and modify all tests using it for proper debugging and
to prevent test pollution.
Overhaul of GfshRule:
when a test fails
just one GfshRule
Java versions
and design
New FolderRule:
Update HTTP session caching module tests:
Update acceptance and upgrade tests that use GfshRule:
when a test fails
or hardcoded port values
consistency
locator/server is async so restarting with same ports is very prone
to hitting BindExceptions)
Update ProcessUtils:
own class
into test method
Minor changes:
Fixes flaky test tickets:
Changes for resubmit:
NOTES
The jdk8, jdk17 and windows labels were used to run tests on more
environments.
This PR contains mostly test and framework changes. The only product
code altered is ServerLauncher and several classes in
org.apache.geode.internal.process, all of which is in geode-core.