MapStore: Map init blocked when new nodes join during data loading #11407
Comments
@rruxandra this is the expected behavior when a If you configure the initial load mode as |
Thanks for the quick response.
IMap<Object, Object> map = hazelcastInstances.get(0).getMap(mapName);
logger.info("Map size: " + map.size());
We've also seen the same race condition when using LAZY mode. Could you please check again? Thanks, |
I see, will take a closer look into this one. Cheers! |
@rruxandra this is not a solution, and I'm not sure it's useful for your use case, but you may want to try setting property |
Hi @vbekiaris, Ruxandra and I have continued to debug this issue as well, but unfortunately we have not found the root cause yet either. The stack trace we see when the call to getMap() blocks and we stop the thread is the following:
169343 [pool-8-thread-1] WARN com.hazelcast.spi.ProxyService - [172.20.0.1]:5701 [dev] [3.8.4] Error while initializing proxy: IMap{name='map4TestMapStore33'}
com.hazelcast.core.HazelcastException: java.lang.InterruptedException: sleep interrupted
at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:94)
at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:56)
at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:52)
at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:105)
at com.hazelcast.map.impl.proxy.MapProxySupport.waitUntilLoaded(MapProxySupport.java:591)
at com.hazelcast.map.impl.proxy.MapProxyImpl.waitUntilLoaded(MapProxyImpl.java:102)
at com.hazelcast.map.impl.proxy.MapProxySupport.initializeMapStoreLoad(MapProxySupport.java:222)
at com.hazelcast.map.impl.proxy.MapProxySupport.initialize(MapProxySupport.java:214)
at com.hazelcast.map.impl.proxy.MapProxyImpl.initialize(MapProxyImpl.java:102)
at com.hazelcast.spi.impl.proxyservice.impl.ProxyRegistry.doCreateProxy(ProxyRegistry.java:194)
at com.hazelcast.spi.impl.proxyservice.impl.ProxyRegistry.createProxy(ProxyRegistry.java:184)
at com.hazelcast.spi.impl.proxyservice.impl.ProxyRegistry.getOrCreateProxyFuture(ProxyRegistry.java:154)
at com.hazelcast.spi.impl.proxyservice.impl.ProxyRegistry.getOrCreateProxy(ProxyRegistry.java:135)
at com.hazelcast.spi.impl.proxyservice.impl.ProxyServiceImpl.getDistributedObject(ProxyServiceImpl.java:147)
at com.hazelcast.instance.HazelcastInstanceImpl.getDistributedObject(HazelcastInstanceImpl.java:376)
at com.hazelcast.instance.HazelcastInstanceImpl.getMap(HazelcastInstanceImpl.java:182)
at com.hazelcast.instance.HazelcastInstanceProxy.getMap(HazelcastInstanceProxy.java:96)
at com.nm.test.hazelcast.mapstore.TestMapStore33$1.run(TestMapStore33.java:133)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:340)
at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
at com.hazelcast.map.impl.proxy.MapProxySupport.waitAllTrue(MapProxySupport.java:611)
at com.hazelcast.map.impl.proxy.MapProxySupport.waitUntilLoaded(MapProxySupport.java:589)
... 18 more
Maybe this helps. Thanks for looking into this and best, |
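For readers following the trace: a `sleep interrupted` cause under `waitAllTrue` is what a sleep-based polling wait produces when its thread is interrupted. Below is a minimal JDK-only sketch under that assumption. It is not Hazelcast's code: `LoadWaitSketch`, `guardedWait`, `awaitFailure` and the body of `waitUntilLoaded` are hypothetical, chosen only to illustrate the pattern and one way a caller might bound such a wait with a timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of this is Hazelcast code.
final class LoadWaitSketch {

    // Polls until `loaded` reports true, sleeping between checks.
    // An interrupt during the sleep is rethrown wrapped in a RuntimeException,
    // after restoring the thread's interrupt flag.
    static void waitUntilLoaded(BooleanSupplier loaded, long pollMillis) {
        try {
            while (!loaded.getAsBoolean()) {
                TimeUnit.MILLISECONDS.sleep(pollMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Runs the wait on a worker thread and gives up after timeoutMillis.
    // Returns "loaded" on success, otherwise a short description of what happened.
    static String guardedWait(BooleanSupplier loaded, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicReference<String> failure = new AtomicReference<>();
        Future<?> future = pool.submit(() -> {
            try {
                waitUntilLoaded(loaded, 10);
            } catch (RuntimeException e) {
                // Record the wrapped cause so the caller can report it, then rethrow.
                failure.set(String.valueOf(e.getCause()));
                throw e;
            }
        });
        pool.shutdown(); // accept no new tasks; the submitted one keeps running
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "loaded";
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return "timed out: " + awaitFailure(pool, failure);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return "caller interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        }
    }

    // Waits for the worker to finish so that its recorded failure is visible.
    private static String awaitFailure(ExecutorService pool, AtomicReference<String> failure) {
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        System.out.println(guardedWait(() -> true, 1000));  // condition already met
        System.out.println(guardedWait(() -> false, 200));  // condition never met
    }
}
```

Cancelling the future only interrupts the worker thread. Whether an interrupted map initialization leaves the proxy in a usable state is a separate question that this sketch does not address.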
... and one more observation:
198357 [main] INFO com.nm.test.hazelcast.mapstore.TestMapStore33 - Loaded keys: 1296
When that happens, it has always failed for me. So it might be worth taking a closer look at how the second node makes this decision. Cheers, |
test demonstrating lack of progress as map is loaded and new member joins
@rruxandra @lukasblu I think now we have nailed the root cause of this issue and it's related to value loading tracking not handling migrations properly. I have a work-in-progress branch here -- it's not a proper, mergeable fix but it does pass the test. I'll post updates as we shape this into a proper fix. |
Hi @vbekiaris , |
Hi @lukasblu, there are several subtleties involved in the migration and map loading processes, so I would suggest you wait for a proper fix PR before starting tests on your side. |
Hello,
We discovered an issue in Hazelcast 3.8.5 when using MapStores.
It seems that hcInstance.getMap(mapName) gets blocked when a new node joins and data is being loaded.
For us this is a serious problem since it might prevent the system from starting.
Here is a test that reproduces the issue:
Could you please check this?
Is there a workaround to avoid this issue in 3.8.5?
Thanks,
Ruxandra