TcpReplicator thread always at 100% CPU utilization #78

piaoxijun · 2016-04-12T11:08:42Z

I'm using version 2.4 of Chronicle-Map.

I have 3 read nodes syncing from a write node over TCP. The 3 read nodes serve requests by reading entries from the map, and the central (write) node updates the map's data.

The problem is that on the central node the TcpReplicator thread always reaches 100% CPU utilization and never comes back down. The thread seems to be reading data from the reader nodes, but the reader nodes never perform update or remove operations.

Once the TcpReplicator thread is at 100% usage, syncing the map to the reader nodes either fails or becomes very slow.

map size: 100,000,000
entries in map : 40,000,000

leventov · 2016-04-12T13:30:50Z

This might be due to incorrect node configuration. For example, several nodes may be configured with the same replication identifier.

Other general recommendations: switch to Chronicle Engine, which also does replication but has a newer and more stable codebase; switch to Chronicle Map 3.8 for the same reason; or do both.

piaoxijun · 2016-04-13T02:22:12Z

Hi, leventov
I confirm that the identifier for each map is different. Here is my code for the write node:

TcpTransportAndNetworkConfig tcpConfig = TcpTransportAndNetworkConfig
        .of(sourcePortNum)
        .heartBeatInterval(3600L, TimeUnit.SECONDS)
        .autoReconnectedUponDroppedConnection(true)
        .throttlingConfig(ThrottlingConfig.throttle(replicateBps, TimeUnit.SECONDS));

ChronicleMapBuilder<String, Value> mapBuilder = ChronicleMapBuilder
        .of(String.class, Value.class)
        .entries(maxEntrySize)
        .replication(identifier, tcpConfig);
map = mapBuilder.createPersistedTo(new File(persistFile));

The read nodes additionally specify the writer's IP and port; there is no other difference.

leventov · 2016-04-13T10:33:34Z

A heartbeat interval of 1 hour is definitely not the right thing to do. Why did you configure it that way? I don't recommend changing the default value (1 second).

You could also try Chronicle-Engine. An example of configuring replication for Engine is here: https://github.com/OpenHFT/Chronicle-Engine/blob/d4d7945e41119a6cc18d60e8ec01c7571e6078ac/src/test/java/net/openhft/chronicle/engine/Replication3WayIntIntTest.java

piaoxijun · 2016-04-16T10:45:15Z

Hi, leventov
The problem is gone when I remove the throttlingConfig statement. I was using:
replicateBps = 1024 * 1024 * 800
.throttlingConfig(ThrottlingConfig.throttle(replicateBps, TimeUnit.SECONDS));

But I still don't know why.
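
For reference, the arithmetic behind that throttle value can be checked with a small sketch. It assumes the first argument of ThrottlingConfig.throttle is a bit count per time unit, which this thread does not confirm; the class ThrottleMath and its method are hypothetical names used only for illustration.

```java
public class ThrottleMath {
    // The value from the post: 1024 * 1024 * 800.
    static final long REPLICATE_BPS = 1024L * 1024L * 800L;

    // ASSUMPTION: the throttle argument is a number of bits per second.
    // Converts it to mebibytes per second for easier reading.
    static double toMebibytesPerSecond(long bitsPerSecond) {
        return bitsPerSecond / 8.0 / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println(REPLICATE_BPS);                        // 838860800
        // The product still fits in a Java int, so int overflow is not the cause.
        System.out.println(REPLICATE_BPS <= Integer.MAX_VALUE);   // true
        System.out.println(toMebibytesPerSecond(REPLICATE_BPS));  // 100.0
    }
}
```

Under that assumption the limit is about 100 MiB/s, so the value itself looks plausible and does not overflow.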

leventov · 2016-04-16T14:57:25Z

My hypothesis is that some exception is thrown inside the throttling logic but is swallowed by the replication logic. In that case, though, it should leave traces in the logs. It would be helpful if you could run the test that leads to 100% CPU with logging configured at debug level (-Dorg.slf4j.simpleLogger.defaultLogLevel=debug).
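
Alongside the debug logs, a stack trace of the busy thread might help narrow down where it spins. The following is only a sketch using the standard java.lang.management API; it is not part of Chronicle Map, the class name BusyThreadDump is hypothetical, and it would have to run inside the writer node's JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BusyThreadDump {
    // Samples per-thread CPU time over sampleMillis and returns the name and
    // stack trace of the thread that consumed the most CPU in that window.
    public static String dumpBusiest(long sampleMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return "thread CPU time is not available on this JVM";
        }
        long[] ids = mx.getAllThreadIds();
        long[] before = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            before[i] = mx.getThreadCpuTime(ids[i]); // -1 if the thread is not alive
        }
        try {
            Thread.sleep(sampleMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted while sampling";
        }
        long busiestId = -1;
        long busiestDelta = -1;
        for (int i = 0; i < ids.length; i++) {
            long after = mx.getThreadCpuTime(ids[i]);
            if (before[i] < 0 || after < 0) {
                continue; // thread died before or during the sample
            }
            long delta = after - before[i];
            if (delta > busiestDelta) {
                busiestDelta = delta;
                busiestId = ids[i];
            }
        }
        ThreadInfo info = busiestId < 0 ? null : mx.getThreadInfo(busiestId, 20);
        if (info == null) {
            return "no live thread could be sampled";
        }
        StringBuilder sb = new StringBuilder();
        sb.append(info.getThreadName()).append(" used ")
          .append(busiestDelta / 1_000_000).append(" ms CPU in ")
          .append(sampleMillis).append(" ms\n");
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpBusiest(1000));
    }
}
```

If the replicator thread is the one at 100% CPU, its name should appear on the first line, and the frames below it show where it was executing when the sample ended. Running jstack against the process a few times would give similar information without any code changes.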

RobAustin · 2019-05-22T11:48:39Z

Closing, as we no longer provide support for Chronicle Map v2.

leventov added the 2.x label Apr 12, 2016

RobAustin added the wontfix label May 22, 2019

RobAustin closed this as completed May 22, 2019
