Dataloss in map when adding key a second time after new member joined the cluster #117
I could not reproduce the issue. Are you using the default map configuration?
I tried some different settings. Currently it looks like this:
And are you still having the issue?
Yes, and the settings are the same on both machines. I keep them in sync with my private Git repository.
Could you send your whole configuration and those debug logs (mentioning eviction)?
I just tried it with a default config file again - same result. Here is my custom config I am testing with:
Log-File from first server:
Log-File from second server:
Here is something weird: I just tested the whole procedure with two Hazelcast servers on one physical machine, and the problem is gone. If I try it again with two Hazelcast servers on two physical machines, the problem is still there.
I added some code to the listing of the keys. It now shows the TTL and the valid state:

public static void main(String[] args) {
    ClientConfig clientConfig = new ClientConfig();
    clientConfig.getGroupConfig().setName("dev").setPassword("dev-pass");
    clientConfig.addAddress(Globals.serverNames);
    HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    PartitionService partitionService = client.getPartitionService();
    IMap<Long, Long> myMap = client.getMap("mydata");
    Set<Long> keys = myMap.keySet();
    Iterator<Long> iter = keys.iterator();
    int counter = 0;
    while (iter.hasNext()) {
        Long t = iter.next();
        Partition partition = partitionService.getPartition(t);
        Member ownerMember = partition.getOwner();
        MapEntry rec = myMap.getMapEntry(t);
        System.out.println(t + "; " + ownerMember + "; " + rec.getExpirationTime() + "; " + rec.isValid());
        counter++;
    }
    System.out.println("Number of keys: " + counter);
    client.getLifecycleService().shutdown();
}

Looks like something is messing with the TTLs. Output before the second Hazelcast server starts up:
Output after the second Hazelcast server has started up, and before putting the keys into the map for the second time:
Taking a look at the isValid(long now) function in AbstractRecord, this might be part of the problem.
Actually, setting time-to-live-seconds to, for example, 10000 helps.
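For reference, a setting like that would go in the map section of hazelcast.xml, roughly as below. The map name "mydata" is taken from the listing earlier in this thread; treat this as a sketch, not the exact config file used here:

```xml
<hazelcast>
    <map name="mydata">
        <!-- Entries get a finite TTL of 10000 s instead of the default
             (0 = live forever), which works around the premature
             eviction described in this thread. -->
        <time-to-live-seconds>10000</time-to-live-seconds>
    </map>
</hazelcast>
```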
Here is another interesting thing. The system time of the second machine had been a second behind. I set it to 5 seconds before the system time of the first machine, and the problem is gone - almost. Now it works the other way around: if the first machine joins the already running second machine, the problem occurs. The whole process seems to be very sensitive to system time. My guess would be that it has something to do with the writeData and readData functions in the DataRecordEntry class. In the writeData function you subtract the current time, and in the readData function you add the current time.
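As a rough illustration of why that kind of relative-time (de)serialization is clock sensitive, here is a minimal, self-contained sketch. The method names and the Long.MAX_VALUE sentinel for "never expires" are assumptions made for this sketch, not the actual DataRecordEntry code:

```java
// Sketch only (NOT Hazelcast code): serializing an expiration time as
// "expiration - senderNow" and reconstructing it as "delta + receiverNow"
// mixes two machines' clocks into one absolute timestamp.
public class ClockSkewSketch {

    // Hypothetical sentinel for "no expiration" (assumption for this sketch).
    static final long NEVER = Long.MAX_VALUE;

    // Sender side: like a writeData that subtracts the local clock.
    public static long writeDelta(long expirationTime, long senderNow) {
        return expirationTime - senderNow;
    }

    // Receiver side: like a readData that adds the local clock.
    public static long readExpiration(long delta, long receiverNow) {
        return delta + receiverNow; // overflows when delta is near Long.MAX_VALUE
    }

    // Like an isValid(long now) check: entry is valid until its expiration.
    public static boolean isValid(long expirationTime, long now) {
        return expirationTime > now;
    }

    public static void main(String[] args) {
        long senderNow = 1_000_000L;

        // Receiver's clock is 5 s ahead and the entry "never expires":
        // the addition overflows to a negative value, so the migrated
        // entry immediately looks expired.
        long receiverAhead = senderNow + 5_000L;
        long exp1 = readExpiration(writeDelta(NEVER, senderNow), receiverAhead);
        System.out.println("ahead:  expiration=" + exp1 + " valid=" + isValid(exp1, receiverAhead));

        // Receiver's clock is 5 s behind: no overflow, the entry stays valid.
        long receiverBehind = senderNow - 5_000L;
        long exp2 = readExpiration(writeDelta(NEVER, senderNow), receiverBehind);
        System.out.println("behind: expiration=" + exp2 + " valid=" + isValid(exp2, receiverBehind));
    }
}
```

Depending on how "no TTL" is actually encoded internally, the failing direction may differ from this sketch; the point is only that validity of a migrated entry ends up depending on the skew between the two machines' clocks.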
Thanks for your findings, you did an excellent job. I could reproduce the issue.
You're welcome. Sorry I didn't come up with a fix, but I'm too unfamiliar with your code to start messing with it, and I haven't set up a debug environment for Hazelcast (yet). But this is a great framework - I'm thinking about using it in one of my next projects.
I have uploaded two builds for you, 2.0.3 and 2.1-SNAPSHOT, including a fix for this issue, in the downloads section.
I'm going to test it this evening. I currently don't have the environment at hand. |
The fixes seem to work just fine. I noticed two things:
Like I said before, putting the keys in a second time worked just fine, and no entries were lost!
Thanks for testing. IPv6 support is being introduced in v2.1 and Hazelcast …
The distributed map seems to lose data when you try to put an already mapped key into the map after another member has been started. I found this behaviour in versions 2.0.2 and the 2.1 snapshot build. I didn't try any 1.x version. I am using two Win7 64-bit machines with JDK 1.7u3.
Steps to reproduce the problem:
Output:
Output:
Some of the keys have changed ownership - just as expected.
Output:
All the migrated keys are missing. When setting the log level to finest, the second Hazelcast server tells me that it is evicting 4 entries. If you run the code to fill the map again, everything will be fine. I tried it with a lot of different map properties - none seem to solve this problem. Changes to the code (for example setting the TTL with put, setting a lock on the map, using replace instead of put for the second mapping) didn't help.