GEODE-10155: Avoid threads hanging when function execution times-out #7493
I'm still reviewing this, but here are a few small comments.
.setPRSingleHopEnabled(false);
if (connectTimeout > 0) {
  factory.setSocketConnectTimeout(connectTimeout);
}
setConnectionTimeout is already being done right above this, so this check isn't necessary.
PoolFactory factory = PoolManager.createFactory().addServer(host, port1)
    .addServer(host, port2).addServer(host, port3).setPingInterval(2000)
    .setSubscriptionEnabled(true).setSubscriptionRedundancy(-1).setReadTimeout(2000)
    .setSocketBufferSize(1000).setRetryAttempts(0)
You don't need subscriptions in this test.
try {
  Thread.sleep(waitBetweenEntriesMs);
} catch (InterruptedException e) {
  e.printStackTrace();
I see other methods in this class are calling printStackTrace, but I'm not sure that's the best behavior here.
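One common alternative to printing the stack trace, sketched here under the assumption that the test should stop waiting once it is interrupted, is to restore the thread's interrupt flag and let the caller decide what to do. `InterruptAwareSleep` and `pause` are hypothetical names, not code from this PR.

```java
// Hypothetical helper for illustration only; not part of the PR.
final class InterruptAwareSleep {

  /**
   * Sleeps for the given time. Returns true if the full sleep completed,
   * false if the thread was interrupted (the interrupt flag is restored).
   */
  static boolean pause(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      // Restore the flag so code further up the stack can see the interrupt.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("completed: " + pause(10));
    Thread.currentThread().interrupt(); // simulate an interrupt before sleeping
    System.out.println("completed: " + pause(10));
    // Thread.interrupted() reads and clears the restored flag.
    System.out.println("flag was restored: " + Thread.interrupted());
  }
}
```

Whether this is the right behavior for the test function depends on what the surrounding class expects; it is only one option.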
if (getId().equals(TEST_FUNCTION_SLOW)) {
  return false;
}
This check can be moved down to the code below in the method like:
if (getId().equals(TEST_FUNCTION_NONHA_SERVER) || getId().equals(TEST_FUNCTION_NONHA_REGION)
    || getId().equals(TEST_FUNCTION_NONHA_NOP) || getId().equals(TEST_FUNCTION_NONHA)
    || getId().equals(TEST_FUNCTION_SLOW)) {
  return false;
}
import java.util.concurrent.TimeUnit;

import org.junit.Test;
Use JUnit 5.
@Test
public void whenResponseToClientInLastResultFailsEndResultsIsCalled_OnlyLocal_NotOnlyRemote() {
  // arrange
These comments aren't adding to the readability, please remove them.
@Test
public void whenResponseToClientInLastResultFailsEndResultsIsCalled_NotOnlyLocal_NotOnlyRemote() {
  // arrange
  Mockito.doThrow(new FunctionException()).when(serverToClientFunctionResultSender)
Static import the Mockito and AssertJ builder methods.
Integer port1 = server1.invoke(() -> PRClientServerTestBase
    .createCacheServer(commonAttributes, localMaxMemoryServer1, maxThreads));
Integer port2 = server2.invoke(() -> PRClientServerTestBase
    .createCacheServer(commonAttributes, localMaxMemoryServer2, maxThreads));
Integer port3 = server3.invoke(() -> PRClientServerTestBase
    .createCacheServer(commonAttributes, localMaxMemoryServer3, maxThreads));
client.invoke(() -> PRClientServerTestBase.createNoSingleHopCacheClient(
    NetworkUtils.getServerHostName(server1.getHost()), port1, port2, port3, connectTimeout));
A few minor nitpicks here (comment only):
- Change those Integer vars to int and prefer primitives to wrapper types if possible.
- I recommend using String hostname instance fields instead of inlining those getServerHostName and getHost calls in the RMI lambdas. Some of them misbehave when called in the dunit ChildVMs, so this avoids all issues.
@kirklund Are you ok with the changes I made after your review? Anything left?
try {
  sender.lastResult(new FunctionException());
} catch (FunctionException expected) {
}
You should always add assertions for expected exceptions:
Throwable thrown = catchThrowable(() -> {
  sender.lastResult(new FunctionException());
});
assertThat(thrown)
    .isInstanceOf(FunctionException.class);
// plus any other assertions that are valuable
I did not add the assertion because it was not something I wanted to verify in the test. Nevertheless, thanks to your comment I have seen that the exception is not thrown so I have removed it.
I have also changed the test cases so that instead of sending an exception, an object is sent.
try {
  sender.lastResult(new FunctionException(), true, rc, null);
} catch (FunctionException expected) {
}
Another expected exception that needs assertion(s).
try {
  sender.lastResult(new FunctionException(), true, rc, null);
} catch (FunctionException expected) {
}
Another expected exception that needs assertion(s).
try {
  sender.sendResult(new FunctionException());
} catch (FunctionException expected) {
}
Another expected exception that needs assertion(s).
try {
  sender.sendResult(new FunctionException());
} catch (FunctionException expected) {
}
Another expected exception that needs assertion(s).
try {
  sender.sendResult(new FunctionException());
} catch (FunctionException expected) {
}
Another expected exception that needs assertion(s).
@Test
public void whenResponseToClientInLastResultFailsEndResultsIsCalled_OnlyLocal_NotOnlyRemote() {
  doThrow(new FunctionException()).when(serverToClientFunctionResultSender)
I recommend either using a cause, like new FunctionException(new CauseException("for test")), and then including the cause and cause message in the assertions, or at least using a custom message, like new FunctionException("for test").
Thanks for the comment. It is not applicable anymore due to the changes described in a previous comment.
 * request will never be served because there would be not ServerConnection
 * threads available and the test case will time-out.
 */
@Test
Please rename the test to the current convention, PRClientServerRegionFunctionExecutionNoSingleHopDistributedTest.
factory.setScope(Scope.LOCAL);
factory.setDataPolicy(DataPolicy.EMPTY);
factory.setPoolName(p.getName());
RegionAttributes attrs = factory.create();
Please don't introduce more raw type usage. Preferably clean up all other raw types in existing test.
this(dm, pr, time, msg, function, bucketArray,
    (x, y) -> FunctionStatsManager.getFunctionStats((String) x, (InternalDistributedSystem) y));
}

/**
 * Have to combine next two constructor in one and make a new class which will send Results back.
 *
 */
public PartitionedRegionFunctionResultSender(DistributionManager dm, PartitionedRegion pr,
If this new constructor is used only for testing, it should be package-private. Evaluate the access level of each of these constructors.
this.msg = msg;
this.dm = dm;
this.pr = pr;
this.time = time;
this.function = function;
this.bucketArray = bucketArray;

this.functionStatsFunctionProvider = functionStatsFunctionProvider;
Have all overloaded constructors call a single initializing constructor.
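As a rough illustration of that pattern (not the actual PartitionedRegionFunctionResultSender code), the sketch below has a convenience constructor that only supplies a default and delegates with this(...), while one constructor assigns every field. The class name, the fields, and the BiFunction-based provider are assumptions made up for the example.

```java
import java.util.function.BiFunction;

// Hypothetical stand-in for a class with several overloaded constructors.
final class ResultSenderSketch {
  private final String regionName;
  private final long startTime;
  private final BiFunction<String, Long, String> statsProvider;

  // Convenience constructor: supplies the default provider and delegates.
  ResultSenderSketch(String regionName, long startTime) {
    this(regionName, startTime, (name, time) -> "stats:" + name + "@" + time);
  }

  // The single initializing constructor: the only place fields are assigned.
  // Package-private, so tests in the same package can inject a fake provider.
  ResultSenderSketch(String regionName, long startTime,
      BiFunction<String, Long, String> statsProvider) {
    this.regionName = regionName;
    this.startTime = startTime;
    this.statsProvider = statsProvider;
  }

  String stats() {
    return statsProvider.apply(regionName, startTime);
  }

  public static void main(String[] args) {
    System.out.println(new ResultSenderSketch("orders", 42L).stats());
    System.out.println(new ResultSenderSketch("orders", 42L, (n, t) -> "fake").stats());
  }
}
```

With this shape, adding a field means touching one constructor body, and the test-only overload can stay package-private.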
Please make sure to leave the code better than you found it (fewer warnings), and work to avoid introducing new warnings.
// executions.
await().until(() -> {
  client.invoke(() -> executeGet(PartitionedRegionName, "key"));
  return true;
Should the return be in this block?
Do you want to set a specific amount of time for the await here based on the idea that you have an expectation of time?
> Should the return be in this block?
Yes, it is needed as the method should return a boolean.
> Do you want to set a specific amount of time for the await here based on the idea that you have an expectation of time?
Not really. I am just checking that it did not hang forever.
Could the return have been outside of the await? That is what I was getting at: once the await finishes, the return could be called.
If you remove the return you get a compilation error.
The idea is to test that the call previous to the return does not hang.
Do you have a better way to achieve it?
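One possible way to express "this call must not hang" without returning a dummy boolean is to run the call on another thread and bound the wait explicitly. The sketch below uses only the JDK; `HangCheck` and `callWithin` are hypothetical names, not part of Geode's test framework, and the timeout values are arbitrary.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of the PR.
final class HangCheck {

  /** Runs the call on a separate thread and fails if it does not finish in time. */
  static <T> T callWithin(long timeout, TimeUnit unit, Callable<T> call) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<T> future = executor.submit(call);
      try {
        return future.get(timeout, unit);
      } catch (TimeoutException e) {
        throw new AssertionError("call did not complete within " + timeout + " " + unit, e);
      }
    } finally {
      executor.shutdownNow(); // interrupts the worker if it is still blocked
    }
  }

  public static void main(String[] args) throws Exception {
    // A call that returns promptly passes its value through.
    System.out.println(callWithin(1, TimeUnit.SECONDS, () -> "value"));

    // A call that blocks longer than the timeout is reported as a failure.
    try {
      callWithin(100, TimeUnit.MILLISECONDS, () -> {
        Thread.sleep(5_000);
        return "never";
      });
    } catch (AssertionError expected) {
      System.out.println(expected.getMessage());
    }
  }
}
```

If I recall the Awaitility API correctly, untilAsserted accepts a lambda with no return value, which may be another way to drop the `return true` while keeping the await.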
private Object executeSlowFunctionOnRegionNoFilter(Function function, String regionName,
    int functionTimeoutSecs) {
  FunctionService.registerFunction(function);
  Region region = cache.getRegion(regionName);
Typing?
Wildcard added.
  return region.get(key);
}

private Object executeSlowFunctionOnRegionNoFilter(Function function, String regionName,
Can you provide a type for Function?
Added wildcard.
FunctionService.registerFunction(function);
Region region = cache.getRegion(regionName);

Execution execution = FunctionService.onRegion(region);
Typing? Execution<IN, OUT, AGG>
Oh, function execution types are a mess. Best to just walk away.
  CacheServerTestUtil.enableShufflingOfEndpoints();
}
pool = (PoolImpl) p;
AttributesFactory factory = new AttributesFactory();
AttributesFactory<Object, Object> factory = new AttributesFactory<>(); ?
RegionAttributes<Object, Object> attrs ?
The least common base type?
If you look at what is being put in the region, it varies. I prefer Object to ?, but I will defer to your opinion.
If you aren't calling any of the methods on this instance that have matching typed parameters then you can get away with ?, but as soon as you try to invoke one with a value that does match type ? it will fail to compile. Java! Better type templating is coming...
My general rule is: use ? when you don't care about or don't know the type, use per-method generics when the type matters but is not known, and use Object when the type matters and it doesn't have any other common ancestor or we don't know all the possible types.
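A small sketch of those three cases, using plain java.util collections rather than Geode's Region or AttributesFactory types. `WildcardSketch` and its methods are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the three cases; not Geode code.
final class WildcardSketch {

  // Case 1: the element type does not matter, so a wildcard is enough.
  static int countNonNull(List<?> items) {
    int count = 0;
    for (Object item : items) {
      if (item != null) {
        count++;
      }
    }
    // items.add("x"); // would not compile: only null can be added to a List<?>
    return count;
  }

  // Case 2: the type matters but is chosen by the caller, so use a method type parameter.
  static <T> T firstOrDefault(List<T> items, T defaultValue) {
    return items.isEmpty() ? defaultValue : items.get(0);
  }

  // Case 3: values of unrelated types are stored, so Object is the honest element type.
  static List<Object> mixedValues() {
    List<Object> values = new ArrayList<>();
    values.add("key");
    values.add(1);
    values.add(Boolean.TRUE);
    return values;
  }

  public static void main(String[] args) {
    List<Object> values = mixedValues();
    System.out.println(countNonNull(values));
    System.out.println(firstOrDefault(values, "none"));
    System.out.println(firstOrDefault(new ArrayList<String>(), "none"));
  }
}
```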
@@ -99,10 +101,11 @@ public final void postSetUp() throws Exception {
server2 = host.getVM(1);
server3 = host.getVM(2);
For future reference, these can be cleaned up to be
Host host = Host.getHost(0);
server1 = VM.getVM(0);
server2 = VM.getVM(1);
server3 = VM.getVM(2);
client = VM.getVM(3);
return createCacheServer(commonAttributes, localMaxMemory, -1);
}

- public static Integer createCacheServer(ArrayList commonAttributes, Integer localMaxMemory,
+ public static Integer createCacheServer(ArrayList<?> commonAttributes, Integer localMaxMemory,
commonAttributes is of type Object
@@ -120,7 +123,8 @@ protected boolean shouldRegisterFunctionsOnClient() {
return ExecuteFunctionMethod.ExecuteFunctionByObject == functionExecutionType;
}

- ArrayList createCommonServerAttributes(String regionName, PartitionResolver pr, int red,
+ ArrayList<Object> createCommonServerAttributes(String regionName, PartitionResolver<?, ?> pr,
As seen here, createCommonServerAttributes returns ArrayList<Object>.
@@ -131,26 +135,26 @@ ArrayList createCommonServerAttributes(String regionName, PartitionResolver pr,
return commonAttributes;
}

- public static Integer createCacheServer(ArrayList commonAttributes, Integer localMaxMemory) {
+ public static Integer createCacheServer(ArrayList<?> commonAttributes, Integer localMaxMemory) {
commonAttributes is of type Object
@@ -273,12 +259,12 @@ public static void createCacheClient(String host, Integer port1, Integer port2,
factory.setScope(Scope.LOCAL);
factory.setDataPolicy(DataPolicy.EMPTY);
factory.setPoolName(p.getName());
- RegionAttributes attrs = factory.create();
- Region region = cache.createRegion(PartitionedRegionName, attrs);
+ RegionAttributes<?, ?> attrs = factory.create();
We could use Object, Object instead of ?, ? because we only put Objects (String, Integer, Boolean) in the region.
If you take a look at #7608 you can see the level of cleanup we are hoping for.
This is a great step in the right direction for cleaning up the tests. I would ask that after you apply the conversions you check the results. If you used the Assertions2AssertJ plugin, or similar, it misses some things, especially around collection assertions.
Please use variations of the hasSize() and contains() assertions.
.../internal/cache/execute/PRClientServerRegionFunctionExecutionNoSingleHopDistributedTest.java
}
ResultCollector<?, ?> rc1 = executeOnAll(dataSet, Boolean.TRUE, function, isByName);
List<?> resultList = (List<?>) rc1.getResult();
logger.info("Result size : " + resultList.size());
Let's avoid adding more logging to tests. If it is worth logging it is worth asserting and you don't need both.
I am going to approve now, but would prefer that you do a bit more cleanup before merging.
Looks good. I only saw two problems of real concern, which should probably be Awaitility awaits rather than Wait.pause followed by assertThat...
}
Map<String, Integer> resultMap = region.getAll(testKeysList);
assertThat(resultMap).containsExactlyInAnyOrderEntriesOf(origVals);
Wait.pause(2000);
Why not use Awaitility here? Using a pause is dangerous.
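The usual concern with a fixed pause is that it either waits longer than needed or not long enough on a slow machine. A polling wait, which is the kind of thing Awaitility provides, re-checks a condition until it holds or a timeout expires. The sketch below is a plain-JDK stand-in to show the idea; `PollingSketch` and `waitUntil` are hypothetical names, not Awaitility or Geode API, and the timings are arbitrary.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical plain-JDK stand-in for a polling wait; not Awaitility or Geode API.
final class PollingSketch {

  /** Re-checks the condition until it holds or the timeout expires. */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out without the condition becoming true
      }
      Thread.sleep(pollMillis);
    }
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger attempts = new AtomicInteger();
    // The condition becomes true on the third check, so the wait ends early
    // instead of always sleeping for a fixed two seconds.
    boolean satisfied = waitUntil(() -> attempts.incrementAndGet() >= 3, 2_000, 10);
    System.out.println("satisfied=" + satisfied + " after " + attempts.get() + " checks");
  }
}
```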
}
Map<String, Integer> resultMap = region.getAll(testKeysList);
assertThat(resultMap).containsExactlyInAnyOrderEntriesOf(origVals);
same
FunctionService.registerFunction(function);
Execution dataSet = FunctionService.onRegion(region);
ResultCollector<?, ?> rc1 = execute(dataSet, singleKeySet, Boolean.TRUE, function, isByName);
List<?> l = (List<?>) rc1.getResult();
Not your fault, but single-letter variable names are not great.
List<?> l = (List<?>) rc1.getResult();
assertThat(l).hasSize(3);
for (Object item : l) {
  assertThat(item).isEqualTo(Boolean.TRUE);
I think "true" works instead of "Boolean.TRUE"
List<?> subL = (List<?>) value;
assertThat(subL).hasSizeGreaterThan(0);
for (Object o : subL) {
  assertThat(foundVals.add((Integer) o)).isTrue();
Can you fix the typing of the generics to avoid casting? subL is a List<?>.
Not necessary though...
Not sure I get what you are exactly proposing here.
dataSet.withFilter(testKeysSet).setArguments(Boolean.TRUE).execute(new FunctionAdapter() {
  @Override
  public void execute(FunctionContext context) {
    @SuppressWarnings("unchecked")
is this still necessary?
Yes, it is.
  }
});
List<?> l = (List<?>) rc1.getResult();
logger.info("Result size : " + l.size());
Naming and typing, not your fault, but could you improve it?
There is a lot of technical debt in this file. Good job cleaning a lot up. It all helps.
Can you change the JIRA to say that isHA has to be false for this behavior to exist?
Done. Thanks!
…7493)
* GEODE-10155: Avoid threads hanging when function execution times-out
* GEODE-10155: Updated after review
* GEODE-10155: More changes after review
* GEODE-10155: Changes after more reviews
* GEODE-10155: Some more changes after review
* GEODE-10155: More changes after review
* GEODE-10155: More clean-up after review
For all changes:
- Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- Has your PR been rebased against the latest commit within the target branch (typically develop)?
- Is your initial contribution a single, squashed commit?
- Does gradlew build run cleanly?
- Have you written or updated unit tests to verify your changes?
- If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?