Ensure migration thread dies during shutdown #8565

Closed


mtopolnik
Contributor

@mtopolnik mtopolnik commented Jul 21, 2016

As described in #8560, the migration thread is not joined after interrupting, leading to a possible (and actually observed) thread leak. This PR ensures the thread is dead before proceeding with the shutdown procedure.

This PR also includes some non-related code style cleanup in collection classes.
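
For illustration, the interrupt-then-join pattern described above can be sketched roughly as follows. This is not the actual Hazelcast code: `WorkerThread` and `stopAndJoin` are hypothetical names, and the worker is assumed to exit as soon as it is interrupted.

```java
import java.util.concurrent.TimeUnit;

public class MigrationThreadShutdownSketch {

    /** Hypothetical stand-in for the migration thread: loops until interrupted. */
    static final class WorkerThread extends Thread {
        WorkerThread() {
            super("migration-sketch");
        }

        @Override
        public void run() {
            try {
                while (!isInterrupted()) {
                    // Simulates waiting for the next unit of work.
                    TimeUnit.MILLISECONDS.sleep(10);
                }
            } catch (InterruptedException e) {
                // Interrupted while waiting: fall through and let the thread exit.
            }
        }
    }

    /**
     * Interrupts the worker and then joins it, so the caller only continues
     * shutting down once the worker is dead. Returns true if the worker has
     * terminated by the time this method returns.
     */
    static boolean stopAndJoin(Thread worker) {
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            // The caller itself was interrupted while waiting: restore the flag.
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        WorkerThread worker = new WorkerThread();
        worker.start();
        System.out.println("worker terminated: " + stopAndJoin(worker));
    }
}
```

Without the `join()`, the caller would carry on with shutdown while the worker might still be running, which is the kind of leak described in #8560.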

@mdogan
Contributor

mdogan commented Jul 21, 2016

There's still a problem in the invocation system's retry mechanism, which will make the join() call block indefinitely.
We've reproduced the problem with a simple test.
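
A rough sketch of how this kind of blocking can arise, under assumptions: the `RetryingTask` below is hypothetical and only imitates a retry loop that swallows interrupts; it is not the invocation system's real code. The timed `join(long)` is used here only so the demo terminates; an untimed `join()` on such a thread would never return.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RetryBlocksJoinSketch {

    /** Hypothetical task that keeps retrying and swallows interrupts. */
    static final class RetryingTask implements Runnable {
        // Escape hatch for the demo only, so the thread can be cleaned up.
        final AtomicBoolean giveUp = new AtomicBoolean(false);

        @Override
        public void run() {
            while (!giveUp.get()) {
                try {
                    // Stands in for "wait, then retry the invocation".
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    // The interrupt is swallowed and the loop retries,
                    // so interrupt() alone never ends this thread.
                }
            }
        }
    }

    /** Waits at most timeoutMillis for the thread; returns true if it is dead. */
    static boolean joinWithin(Thread thread, long timeoutMillis) {
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    /** Interrupts the thread, then waits for it with a bounded join. */
    static boolean interruptAndJoin(Thread thread, long timeoutMillis) {
        thread.interrupt();
        return joinWithin(thread, timeoutMillis);
    }

    public static void main(String[] args) {
        RetryingTask task = new RetryingTask();
        Thread thread = new Thread(task, "retrying-sketch");
        thread.setDaemon(true);
        thread.start();

        // The interrupt is ignored by the retry loop, so the bounded join times out.
        System.out.println("died after interrupt: " + interruptAndJoin(thread, 200));

        // Only once the retry loop itself gives up does the thread exit.
        task.giveUp.set(true);
        System.out.println("died after giving up: " + joinWithin(thread, 5000));
    }
}
```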

@mtopolnik
Contributor Author

Since this PR fixes something that is itself an issue, I propose we merge this and open another PR to fix the invocation system.

@mdogan
Contributor

mdogan commented Jul 21, 2016

👍

@mtopolnik
Contributor Author

The current build got stuck during testing, without timing out.

@mdogan
Contributor

mdogan commented Jul 21, 2016

I don't think it will pass. You just hit the infinite blocking issue during retry.

@mdogan
Contributor

mdogan commented Jul 21, 2016

CacheHotRestartTest is in the EE repo.

@mtopolnik
Contributor Author

Yes, clearly not the culprit here. The change may affect any other test, of course, but it would still fail after the regular JUnit timeout.

@mtopolnik
Contributor Author

For reference, this is the list of all tests that were running at the point the build got stuck:

com.hazelcast.internal.partition.impl.InternalPartitionServiceLiteMemberTest
com.hazelcast.internal.partition.impl.MigrationCommitTest
com.hazelcast.map.MapLiteMemberTest
com.hazelcast.map.nearcache.NearCacheLiteMemberTest
com.hazelcast.mapreduce.MapReduceLiteMemberTest
com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl_invokeOnPartitionLiteMemberTest

@mtopolnik
Contributor Author

The repeated run is also stuck. The list of still-running tests is now as follows:

com.hazelcast.durableexecutor.DurableLongRunningTaskTest
com.hazelcast.executor.LongRunningTaskTest
com.hazelcast.internal.partition.impl.InternalPartitionServiceLiteMemberTest
com.hazelcast.internal.partition.impl.MigrationCommitTest
com.hazelcast.map.MapLiteMemberTest
com.hazelcast.map.impl.mapstore.MapLoaderTest
com.hazelcast.map.impl.tx.MapTransactionTest
com.hazelcast.map.nearcache.NearCacheLiteMemberTest
com.hazelcast.mapreduce.MapReduceLiteMemberTest
com.hazelcast.mapreduce.aggregation.MapAggregationLiteMemberTest
com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl_invokeOnPartitionLiteMemberTest

@jerrinot
Contributor

I'm closing this one in favour of #8610

@jerrinot jerrinot closed this Jul 28, 2016
@mtopolnik mtopolnik deleted the failfast-migrationthread branch August 4, 2016 08:33
@mmedenjak mmedenjak added the Source: Internal PR or issue was opened by an employee label Apr 13, 2020
Labels
Source: Internal PR or issue was opened by an employee Team: Core Type: Defect

4 participants