Add new /{db}/_sync_shards endpoint (admin-only) #1811

Merged
merged 4 commits into from Jan 18, 2019

wohali commented Dec 15, 2018

Overview

This server admin-only endpoint forces an n-way sync of all shards
across all nodes on which they are hosted.

This can be useful for an administrator adding a new node to the
cluster, after updating _dbs so that the new node hosts an existing db
with content, to force the new node to sync all of that db's shards.

Users may want to bump their [mem3] sync_concurrency value to a
larger figure for the duration of the shards sync.
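
A minimal sketch of how that temporary bump might be applied over HTTP, assuming the dev cluster from the test script below (clustered port 15984, node name node1@127.0.0.1) and CouchDB's per-node config API. The value 20 and the `RUN_CURL` guard are illustrative assumptions, not part of this PR; since the config API is per node, each node doing the sync would presumably need the same change.

```shell
#!/bin/sh
# Sketch only: host, port, node name and the value 20 are assumptions
# matching the dev/run cluster used in the manual test script.
AUTH="${AUTH:-root:PASSWORD_GOES_HERE}"

# Build the per-node config URL for a [mem3] key.
# $1 = host:port of the clustered interface
# $2 = Erlang node name
# $3 = config key inside the [mem3] section
mem3_config_url() {
    printf 'http://%s@%s/_node/%s/_config/mem3/%s\n' "$AUTH" "$1" "$2" "$3"
}

url=$(mem3_config_url localhost:15984 node1@127.0.0.1 sync_concurrency)
echo "$url"

# Set RUN_CURL=1 to actually send the request. The config API takes a
# JSON string as the body and responds with the previous value, which
# is worth noting so it can be restored once the sync has finished.
if [ "${RUN_CURL:-0}" = 1 ]; then
    curl -X PUT "$url" -d '"20"'
fi
```

Restoring the old value afterwards is the same PUT with the figure that the first call returned.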

Closes #1807

Testing recommendations

Testing this requires a non-standard setup, so I have not written an
automated test case for this. (We should think about how this could
possibly be done in the Elixir test suite in the future.)

Manual test script:

# set up 2 nodes not joined together and put data in node1
dev/run -n 2 --no-join &
export AUTH="root:PASSWORD_GOES_HERE"
curl -X PUT http://$AUTH@localhost:15984/foo
curl -X PUT http://$AUTH@localhost:15984/foo/abc -d '{"a":"b"}'
curl -X PUT http://$AUTH@localhost:15984/foo/def -d '{"g":"h"}'

# see the shards with data in node1, here it's shards 00 and 20
ls -lR dev/lib/node1/data/shards/*/foo.*.couch

# join the nodes together
curl -X PUT http://$AUTH@localhost:15986/_nodes/node2@127.0.0.1 -d '{}'
curl http://$AUTH@localhost:15984/_membership
curl http://$AUTH@localhost:25984/_membership

# have node2 participate in serving foo for all shards
curl -X PUT http://$AUTH@localhost:15986/_dbs/foo -d '{"_id":"foo","_rev":"1-d0f95fcbe46dd04f66bc61906e202d56","shard_suffix":[46,49,53,52,52,56,51,52,53,49,51],"changelog":[["add","00000000-1fffffff","node1@127.0.0.1"],["add","20000000-3fffffff","node1@127.0.0.1"],["add","40000000-5fffffff","node1@127.0.0.1"],["add","60000000-7fffffff","node1@127.0.0.1"],["add","80000000-9fffffff","node1@127.0.0.1"],["add","a0000000-bfffffff","node1@127.0.0.1"],["add","c0000000-dfffffff","node1@127.0.0.1"],["add","e0000000-ffffffff","node1@127.0.0.1"]],"by_node":{"node1@127.0.0.1":["00000000-1fffffff","20000000-3fffffff","40000000-5fffffff","60000000-7fffffff","80000000-9fffffff","a0000000-bfffffff","c0000000-dfffffff","e0000000-ffffffff"],"node2@127.0.0.1":["00000000-1fffffff","20000000-3fffffff","40000000-5fffffff","60000000-7fffffff","80000000-9fffffff","a0000000-bfffffff","c0000000-dfffffff","e0000000-ffffffff"]},"by_range":{"00000000-1fffffff":["node1@127.0.0.1","node2@127.0.0.1"],"20000000-3fffffff":["node1@127.0.0.1","node2@127.0.0.1"],"40000000-5fffffff":["node1@127.0.0.1","node2@127.0.0.1"],"60000000-7fffffff":["node1@127.0.0.1","node2@127.0.0.1"],"80000000-9fffffff":["node1@127.0.0.1","node2@127.0.0.1"],"a0000000-bfffffff":["node1@127.0.0.1","node2@127.0.0.1"],"c0000000-dfffffff":["node1@127.0.0.1","node2@127.0.0.1"],"e0000000-ffffffff":["node1@127.0.0.1","node2@127.0.0.1"]}}'

# double check shards in node2 are still empty (all should be same file size)
ls -lR dev/lib/node2/data/shards/*/foo.*.couch

# do a shard sync
curl -X POST http://$AUTH@localhost:15984/foo/_sync_shards

# check that shards 00 and 20 on node2 now have data from node1
ls -lR dev/lib/node2/data/shards/*/foo.*.couch

# optionally manually check docs in those shards
# substitute the correct timestamp in the shard name here
curl http://$AUTH@localhost:25986/shards%2F00000000-1fffffff%2Ffoo.1544834513/_all_docs
curl http://$AUTH@localhost:25986/shards%2F20000000-3fffffff%2Ffoo.1544834513/_all_docs
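
The timestamp substitution in the last step can be avoided by deriving the shard database name from the file names on disk. The following is a rough sketch under the same assumptions as the script above (dev/run layout, node2's node-local port 25986, a db name without slashes); `shard_db_name` and the `RUN_CURL` guard are hypothetical helpers, not part of CouchDB.

```shell
#!/bin/sh
# Turn a shard file path such as
#   dev/lib/node2/data/shards/00000000-1fffffff/foo.1544834513.couch
# into the URL-encoded database name used on the node-local port.
shard_db_name() {
    file=${1##*/}      # foo.1544834513.couch
    dir=${1%/*}        # dev/lib/node2/data/shards/00000000-1fffffff
    range=${dir##*/}   # 00000000-1fffffff
    # %%2F emits a literal %2F, the URL-encoded "/"
    printf 'shards%%2F%s%%2F%s\n' "$range" "${file%.couch}"
}

shard_db_name dev/lib/node2/data/shards/00000000-1fffffff/foo.1544834513.couch
# prints shards%2F00000000-1fffffff%2Ffoo.1544834513

# Set RUN_CURL=1 to query _all_docs on every foo shard hosted by node2.
if [ "${RUN_CURL:-0}" = 1 ]; then
    for f in dev/lib/node2/data/shards/*/foo.*.couch; do
        curl "http://$AUTH@localhost:25986/$(shard_db_name "$f")/_all_docs"
    done
fi
```

With `RUN_CURL=1`, only the shards that received documents should list any rows.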

Checklist

  • Code is written and works correctly;
  • Changes are covered by tests;
  • Documentation reflects the changes;

Documentation PR will be issued once this PR (and the bikeshedding over the name of the new endpoint) is approved.

@wohali wohali requested review from chewbranca , janl and davisp Dec 15, 2018

wohali commented Dec 15, 2018

While I was in mem3_httpd I noticed the (currently undocumented) /{db}/_shards and /{db}/_shards/{docid} endpoints.

We should document these.

I dithered briefly over putting this new endpoint at /{db}/_shards/_sync. Happy to oblige if others feel strongly.
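
For reference, a sketch of how the two existing endpoints mentioned above might be queried on the dev cluster from the test script. The port, the db name `foo` and the doc id `abc` come from that script; the `shards_url` helper and the `RUN_CURL` guard are hypothetical, and the response shapes are not described in this PR.

```shell
#!/bin/sh
AUTH="${AUTH:-root:PASSWORD_GOES_HERE}"

# $1 = db name, optional $2 = doc id; prints the clustered-port URL.
shards_url() {
    if [ -n "${2:-}" ]; then
        printf 'http://%s@localhost:15984/%s/_shards/%s\n' "$AUTH" "$1" "$2"
    else
        printf 'http://%s@localhost:15984/%s/_shards\n' "$AUTH" "$1"
    fi
}

shards_url foo        # URL for the shard map of db "foo"
shards_url foo abc    # URL for the shard that doc "abc" maps to

# Set RUN_CURL=1 to actually issue the GET requests.
if [ "${RUN_CURL:-0}" = 1 ]; then
    curl "$(shards_url foo)"
    curl "$(shards_url foo abc)"
fi
```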

davisp commented Jan 18, 2019

+1

I don't think I'd nest it since it's not quite the same.

wohali commented Jan 18, 2019

TY @davisp - will leave it as is.

Documentation PR to land after this one.

rnewson commented Jan 18, 2019

rebase before merging though

@wohali wohali force-pushed the add-shard-sync-api branch from c25fd2e to 9d4cb03 Jan 18, 2019

@wohali wohali force-pushed the add-shard-sync-api branch from 9d4cb03 to 85fbe71 Jan 18, 2019

wohali added some commits Jan 18, 2019

@wohali wohali merged commit 6cb0506 into master Jan 18, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@wohali wohali deleted the add-shard-sync-api branch Jan 18, 2019

@wohali wohali removed request for chewbranca , janl and davisp Jan 18, 2019

janl added a commit that referenced this pull request Feb 7, 2019

Add new /{db}/_sync_shards endpoint (admin-only) (#1811)

janl added a commit that referenced this pull request Feb 17, 2019

Add new /{db}/_sync_shards endpoint (admin-only) (#1811)