backup entries are less than owned entries #177

Closed
enesakar opened this issue May 31, 2012 · 8 comments

@enesakar
Member

commented May 31, 2012

Scenario:

  1. Start 39 nodes that will join a single cluster (3s delay between
    each node start)
  2. Wait for 10 nodes to join; once they have, individual nodes are
    allowed to start writing to a Hazelcast IMap
  3. Wait for all nodes to join and complete their individual writes
  4. Wait for Hazelcast to complete all migrations
  5. Calculate owned keys versus backups: there is a mismatch (fewer
    backup entries than owned entries)

The cluster details:

  • 39 nodes
  • partition count of 128
  • 1 backup
  • wait to write data into Hazelcast until 10 nodes have joined, but do
    not wait for all 39
  • each node is reading and writing to Hazelcast's IMap, but they are
    only updating existing keys, not putting new ones
  • no ttl for the map
  • other than these settings, the map configuration is basically
    defaulted
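
The check in step 5 can be sketched in plain Java. This is only an illustration of the expected invariant, assuming that once migrations have settled a cluster with a backup count of 1 should report as many backup entries as owned entries. The class `BackupInvariant`, its methods, and the per-member numbers are made up for this example and are not part of Hazelcast.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hazelcast: checks the owned/backup invariant.
class BackupInvariant {

    // Backups expected after migrations settle: one copy of every owned
    // entry per configured backup (assumption stated in the lead-in).
    static long expectedBackups(long totalOwned, int backupCount) {
        return totalOwned * backupCount;
    }

    // Sums {owned, backup} pairs reported by each member and returns how many
    // backup entries are missing cluster-wide (0 means the counts agree).
    static long missingBackups(List<long[]> perMemberCounts, int backupCount) {
        long owned = 0L;
        long backup = 0L;
        for (long[] counts : perMemberCounts) {
            owned += counts[0];
            backup += counts[1];
        }
        return expectedBackups(owned, backupCount) - backup;
    }

    public static void main(String[] args) {
        // Made-up counts for three members: {owned, backup}.
        List<long[]> counts = Arrays.asList(
                new long[]{100, 90},
                new long[]{120, 110},
                new long[]{80, 95});
        System.out.println("missing backups: " + missingBackups(counts, 1));
    }
}
```

A non-zero result after all migrations have finished corresponds to the mismatch reported here.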

The mailing list thread:
https://groups.google.com/group/hazelcast/browse_thread/thread/de085fb8a5626588#

@enesakar

Member Author

commented May 31, 2012

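// Test class for Issue177
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Member;
import com.hazelcast.core.MultiTask;

import java.util.Collection;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;

public class MainIssue177 {

    public static void main(String[] args) {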

        final Config config = null;
        System.setProperty("hazelcast.map.partition.count", "128");
        Hazelcast.init(config);
        final Random rand = new Random(System.currentTimeMillis());

        try {
            for (int i= 0; i< 10; i++) {
                Runnable runnable = new Runnable() {
                    public void run() {
                        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
                        try {
                            Thread.sleep(rand.nextInt(100));
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                };
                new Thread(runnable).start();
                Thread.sleep(3000);
            }

            for (int i= 0; i< 20; i++) {
                Thread thread = new Thread("ins"+i) {
                    public void run() {
                        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
                        try {
                            System.out.println("thread started: "+ getName());
                            Thread.sleep(rand.nextInt(100));
                            for (int j = 0; j < 10000; j++) {
                                instance.getMap("map").put(getName() + "-" + j, System.currentTimeMillis());
                            }
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                };
                thread.start();
                Thread.sleep(3000);
            }

            for (int i = 0; i < 3000; i++) {
                Set<Member> members = Hazelcast.getCluster().getMembers();
                System.out.println("turn:"+i + " size:" + members.size());
                Thread.sleep(2000);
                CallableIssue177 callable = new CallableIssue177();
                MultiTask<EntryCount> task = new MultiTask<EntryCount>(callable, members);
                ExecutorService executorService = Hazelcast.getExecutorService();
                executorService.execute(task);
                Collection<EntryCount> results = task.get();
                long totOwned = 0L;
                long totBackup = 0L;
                for (EntryCount result : results) {
                    totOwned += result.getOwned();
                    totBackup += result.getBackup();
                }
                System.out.println("owned:"+totOwned + " backup:" + totBackup );
            }
        } catch (ExecutionException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
@enesakar

Member Author

commented May 31, 2012

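import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.HazelcastInstanceAware;
import com.hazelcast.core.IMap;
import com.hazelcast.core.Instance;
import com.hazelcast.monitor.LocalMapStats;

import java.io.Serializable;
import java.util.concurrent.Callable;

public class CallableIssue177 implements Callable<EntryCount>, Serializable, HazelcastInstanceAware {
    // Injected on the executing member via setHazelcastInstance; not part of the serialized state.
    private transient HazelcastInstance hazelcastInstance;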

    @Override
    public EntryCount call() throws Exception {
        long owned = 0L;
        long backedUp = 0L;
        for (Instance instance : hazelcastInstance.getInstances()) {
            if (instance instanceof IMap) {
                IMap map = (IMap) instance;
                LocalMapStats stats = map.getLocalMapStats();
                owned += stats.getOwnedEntryCount();
                backedUp += stats.getBackupEntryCount();
            }
        }
        return new EntryCount(owned, backedUp);
    }

    @Override
    public void setHazelcastInstance(HazelcastInstance hazelcastInstance) {
        this.hazelcastInstance = hazelcastInstance;
    }
}
@enesakar

Member Author

commented May 31, 2012

import java.io.Serializable;

public class EntryCount implements Serializable {
    private long owned;
    private long backup;

    public EntryCount(long owned, long backup) {
        this.owned = owned;
        this.backup = backup;
    }

    public long getOwned() {
        return owned;
    }

    public void setOwned(long owned) {
        this.owned = owned;
    }

    public long getBackup() {
        return backup;
    }

    public void setBackup(long backup) {
        this.backup = backup;
    }
}
@marshalium

Contributor

commented Jul 12, 2012

This issue is pretty serious and is the only thing preventing us from upgrading to 2.1. Any idea when it will be fixed?

@mdogan mdogan closed this in 68d1a3d Jul 19, 2012

mdogan added a commit that referenced this issue Jul 19, 2012

@ghost ghost assigned mdogan Jul 20, 2012

@marshalium

Contributor

commented Aug 31, 2012

Just tested this with 2.3.1. Your unit test passes but I'm still seeing the bug in a real distributed test.

@marshalium

Contributor

commented Aug 31, 2012

We can work on creating a test case for you early next week if you need one.

@mdogan

Member

commented Sep 1, 2012

As of version 2.3, you have to explicitly enable redo support for backup
operations.

See
http://hazelcast.com/docs/2.3/manual/single_html/#hazelcast.backup.redo.enabled

and the 2.3 release notes:

http://hazelcast.com/docs/2.3/manual/single_html/#ReleaseNotes
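
For readers looking for the concrete change, here is a minimal sketch. The property name `hazelcast.backup.redo.enabled` comes from the manual link above. Setting it as a JVM system property before any instance starts is an assumption, modelled on how the test class in this thread sets `hazelcast.map.partition.count`. The class name `EnableBackupRedo` is made up for the example, and the manual remains the authority on where the property has to be set.

```java
// Hypothetical launcher snippet: enables backup redo before Hazelcast starts.
class EnableBackupRedo {

    // Property name taken from the Hazelcast 2.3 manual anchor linked above.
    static final String REDO_PROPERTY = "hazelcast.backup.redo.enabled";

    // Assumption: the property is read from JVM system properties,
    // so it is set before any instance is created.
    static void enable() {
        System.setProperty(REDO_PROPERTY, "true");
    }

    public static void main(String[] args) {
        enable();
        // Hazelcast instances would be created after this point.
        System.out.println(REDO_PROPERTY + "=" + System.getProperty(REDO_PROPERTY));
    }
}
```

The same effect can be had with `-Dhazelcast.backup.redo.enabled=true` on the JVM command line, presumably on every member of the cluster.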


@marshalium

Contributor

commented Sep 4, 2012

That config change worked. Thanks!
