Optimize mem3 synchronization for large partition tables #24

Merged
kocolosk merged 5 commits into 1.3.x from cloudant/13504-mem3-sync-large-tables on Apr 30, 2012

Owner

kocolosk commented Apr 27, 2012

This PR represents a significant rewrite of the logic for internal replication management. The highlights are that the nodes and dbs databases are replicated in a ring topology rather than a fully-connected mesh, and that we walk the partition table on disk instead of in memory on the initial sync.

kocolosk added some commits Feb 24, 2012

Sync "dbs" and "_users" with the next live node
The load induced by fully-connected mesh replication increases
quadratically with node count.  For large clusters with high rates of
database creation this ends up being significant.

This patch switches the topology to a ring.  Each node pushes to the
next live node in the ring.  Will deal with the correct action on
'nodeup' events in a separate commit.
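To make the topology change concrete, here is a minimal sketch of how a node might pick its single push target in a ring. It is written in Python purely for illustration and is not the mem3 code; `next_live_node`, `mesh_push_count`, and `ring_push_count` are hypothetical names, and ordering the ring by sorted node name is an assumption.

```python
def next_live_node(self_node, all_nodes, live_nodes):
    """Return the first live node after self_node in ring order, or None.

    Assumption: the ring order is simply the sorted list of node names.
    """
    ring = sorted(all_nodes)
    if self_node not in ring:
        return None
    i = ring.index(self_node)
    # Walk the ring starting just after self_node, wrapping around,
    # and never considering self_node itself.
    for candidate in ring[i + 1:] + ring[:i]:
        if candidate in live_nodes:
            return candidate
    return None


def mesh_push_count(n):
    """Pushes per sync round when every node pushes to every other node."""
    return n * (n - 1)


def ring_push_count(n):
    """Pushes per sync round when every node pushes to one successor."""
    return n


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    live = {"a", "b", "d"}  # "c" is down
    print(next_live_node("b", nodes, live))  # skips the dead node "c"
    print(mesh_push_count(10), ring_push_count(10))
```

With ten nodes the mesh needs 90 pushes per round against 10 for the ring, which is the quadratic versus linear growth the commit message describes.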

Fold over DBs on disk rather than load into memory
This uses the new mem3_shards:fold API to walk the shards from the
on-disk representation.

BugzID: 13504
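The general idea of folding over an on-disk table instead of loading it can be sketched as follows. This is a Python analogy only: the signature of `mem3_shards:fold` is not shown in this PR, and the JSON-lines file format, `fold_shards`, and `count_by_node` are assumptions made up for the example.

```python
import json


def fold_shards(path, fun, acc):
    """Apply fun(record, acc) to each record on disk, one record at a time.

    Only the current record and the accumulator are held in memory,
    so memory use does not grow with the size of the table.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                acc = fun(json.loads(line), acc)
    return acc


def count_by_node(record, acc):
    """Example fold function: count shard records per node."""
    node = record["node"]
    acc[node] = acc.get(node, 0) + 1
    return acc
```

A caller would run something like `fold_shards("shards.jsonl", count_by_node, {})`, where the file name is hypothetical. The point is that the caller supplies the per-record function and never sees the whole table at once.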

Give up on mem3_rep if DB was deleted
Previously we would retry indefinitely when this
happened.
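The behavior change amounts to treating "database deleted" as a terminal condition rather than a retryable one. A minimal Python sketch of that pattern follows; the exception classes, `replicate_with_retry`, and the bounded `max_attempts` for other failures are all assumptions for illustration and are not taken from the mem3 source.

```python
class DatabaseDeleted(Exception):
    """Hypothetical error: the database no longer exists."""


class TransientError(Exception):
    """Hypothetical error: a failure that may succeed on retry."""


def replicate_with_retry(replicate, max_attempts=5):
    """Run replicate() until it succeeds, the DB is gone, or retries run out.

    Returns a (status, attempts) tuple.
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            replicate()
            return ("ok", attempts)
        except DatabaseDeleted:
            # Retrying cannot help once the database is gone, so stop here.
            return ("gave_up", attempts)
        except TransientError:
            continue
    return ("retries_exhausted", attempts)
```

Without the `DatabaseDeleted` branch, a deleted database would look like any other failure and the job would be retried on every pass.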
Owner

kocolosk commented Apr 27, 2012

It occurs to me that the initial_sync code should use a blocking API to execute replications instead of cast'ing them all at the server. Will address.

lgtm

Owner

kocolosk commented Apr 27, 2012

Rebasing to avoid merge conflict

Owner

kocolosk commented Apr 30, 2012

PR #25 makes the batch size configurable via an Options list. I'll back out the default batch_size change and submit a different patch later.

kocolosk added some commits Apr 27, 2012

Make initial_sync block for each replication
It's low priority, and we don't want to overrun the server.  The
implementation is kinda hacky: it sticks the From into #job.pid while
the job is in the waiting queue.
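The difference between casting every replication at the server and blocking on each one can be sketched with a small worker queue. This is a Python analogy, not the gen_server code from the patch: `SyncServer`, `Job`, and `initial_sync` are hypothetical, and storing a reply handle on the queued job is only loosely analogous to keeping the caller's From in the job record.

```python
import queue
import threading


class Job:
    def __init__(self, name, reply_to=None):
        self.name = name
        # Handle used to wake a blocked caller; None for fire-and-forget jobs.
        self.reply_to = reply_to


class SyncServer:
    def __init__(self):
        self._jobs = queue.Queue()
        self.completed = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:  # shutdown sentinel
                break
            self.completed.append(job.name)  # stand-in for a replication
            if job.reply_to is not None:
                job.reply_to.set()  # reply to the waiting caller

    def cast(self, name):
        """Enqueue a job and return immediately."""
        self._jobs.put(Job(name))

    def call(self, name, timeout=5.0):
        """Enqueue a job and block until it finishes or the timeout expires."""
        done = threading.Event()
        self._jobs.put(Job(name, reply_to=done))
        return done.wait(timeout)

    def stop(self):
        self._jobs.put(None)
        self._worker.join()


def initial_sync(server, shard_names):
    """Submit one replication at a time so the queue never fills up."""
    for name in shard_names:
        if not server.call(name):
            raise TimeoutError("replication of %s timed out" % name)
```

Because `initial_sync` waits for each job before submitting the next, it holds at most one low-priority job in the queue at a time, whereas a loop over `cast` would enqueue the entire partition table up front.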
Owner

kocolosk commented Apr 30, 2012

Merging PR #25 introduces a trivial merge conflict in mem3_sync. I'll resolve it manually after that one goes in.

Owner

kocolosk commented Apr 30, 2012

Got the go-ahead from Paul to merge first, so here we go.

kocolosk added a commit that referenced this pull request Apr 30, 2012

Merge pull request #24 from cloudant/13504-mem3-sync-large-tables
Optimize mem3 synchronization for large partition tables

BugzID: 13504

@kocolosk kocolosk merged commit f8f1579 into 1.3.x Apr 30, 2012
