_replicate vs _replication API endpoints #346

ufobat · 2018-10-31T14:21:13Z

I am currently writing a little script that helps me set up a replication between two CouchDB instances. To do that, I want to read the current replication jobs and check whether my desired job is already there. If not, I set up a replication. If there are jobs that are no longer required (they don't match my criteria), I want to remove them.

While doing so I was falling in some traps:

In the Replication Section of the documentation there are _replication and _replicate. It took me a while to realize that those two are actually not the same names.

First I sent my "replication requests" to _replicate instead of _replication. It took several minutes before something showed up in _scheduler/jobs. I was confused that there was no way to show my configuration immediately. What is the expected behaviour when you configure a replication via _replicate, and what is the expected behaviour when you write it to _replication?

I would like the documentation to include some information:

that there are actually those two, with an explanation of which is used for which use case. (I don't know this myself; could you give me the details so I can write a PR, please?)
that explains the expected behaviour, along the lines of: when you do X, you can see Y after Z happened.
(For example: when you write to _replicate, it takes at least x seconds and then you can find the results in _scheduler/jobs. I am not sure if that is correct.)

If you could provide me some information I will gladly write a PR. :-)


ufobat · 2018-10-31T14:36:48Z

FYI

<vatamane> ufobat: try checking  _scheduler/docs/ instead
<vatamane> _replicate endpoint is to create replications that are not backed by a document from the _replicator database
<vatamane> also note there is  the http /_replicate endpoint and a _replicator database I think you might be confusing the two
<ufobat> vatamane, what if i only use the _replicator db?
<ufobat> i can POST and GET from there?
<ufobat> i just care about the configuration, not about the replication results/states
<vatamane> you can create replications by creating documents in the _replicator database
<ufobat> vatamane, i actually did :-( 
<vatamane> these replications will persist even after a server is restarted (unlike replications from the _replicate endpoint)
<vatamane> to check the status of your replication try http://docs.couchdb.org/en/stable/api/server/common.html#scheduler-docs
<ufobat> because of my confusion i wrote https://github.com/apache/couchdb-documentation/issues/346
<vatamane> thanks ufobat, I'll update the ticket with more info
<ufobat> is the scheduler-doc information about which documents in which db were replicated?
<ufobat> my db is currently empty
<vatamane> _scheduler/docs allows you to track the state of replications which are backed by a document in a _replicator database
<ufobat> ah and _scheduler/jobs is for the _replicate api?
<vatamane> _scheduler/jobs tracks replication jobs that started both from the _replicate endpoint and by creating documents in the _replicator db
<ufobat> and this applies also if i have a "custom_replicator" database, because the docs say i could use one (which would fit perfectly for my script approach, cause i wouldn't see other replications that don't belong to me)
<vatamane> you could use one, yeah, any database that ends with the /_replicator suffix will work as a replicator db
<vatamane> when you query _scheduler/docs you'd have to pass that database name in the path explicitly like say _scheduler/docs/mycustom/_replicator
<ufobat> i think that's missing in the documentation as well

nickva · 2018-10-31T14:43:23Z

Creating Replications

There are two ways to create replications:

POST-ing to the /_replicate HTTP endpoint. Let's call these transient.
Creating a document in the _replicator database. Any database whose name ends in /_replicator will work as a replicator database. Let's call these persistent.

The two are slightly different. For the transient ones there is no document backing the replication job. If the server crashes, the job just disappears. When jobs finish there is no way to query their state. This method of replicating was the first one implemented.

Persistent ones are backed by a document in a _replicator db. They persist across server restarts, and it is possible to inspect their state after they have finished.

There are two because transient ones were implemented first and were kept for backwards compatibility. They are also still useful in some cases when creating replication jobs programmatically.
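The two creation paths described above can be sketched in Python as follows. This is only a minimal sketch, not an official client: the server address `http://localhost:5984`, the document ID, and the helper names `transient_request` and `persistent_request` are assumptions made for illustration, and authentication is left out. The functions only build the HTTP requests; actually sending them (for example with `urllib.request.urlopen`) needs a running CouchDB.

```python
import json
import urllib.parse
import urllib.request

COUCH = "http://localhost:5984"  # assumed server address


def transient_request(source, target):
    """Build a POST to /_replicate (transient: no backing document)."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        f"{COUCH}/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def persistent_request(doc_id, source, target, replicator_db="_replicator"):
    """Build a PUT that creates a replication document (persistent)."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    # A custom replicator db such as "mycustom/_replicator" contains a slash,
    # which has to be percent-encoded when it is used as a database name.
    db = urllib.parse.quote(replicator_db, safe="")
    doc = urllib.parse.quote(doc_id, safe="")
    return urllib.request.Request(
        f"{COUCH}/{db}/{doc}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

A script that only cares about its own replications could pass a custom `replicator_db` so that it never sees replication documents belonging to someone else.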

Monitoring Replications

Transient replications can be monitored via the /_active_tasks endpoint or the /_scheduler/jobs endpoint. It might take a few minutes between the time a replication is created and the time it appears as a replication job in these endpoints.

http://docs.couchdb.org/en/latest/api/server/common.html#scheduler-jobs

Persistent replications can be monitored via the /_scheduler/docs endpoint as well as /_scheduler/jobs and /_active_tasks. /_scheduler/docs is preferred because it shows the state of the replication document before it becomes a replication job. Some documents could be invalid and never become a replication job. Others might be delayed because they are fetching, say, the filter code from a slow source database.

http://docs.couchdb.org/en/latest/api/server/common.html#get--_scheduler-docs-replicator_db
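As a rough illustration of monitoring persistent replications, the sketch below builds the /_scheduler/docs URL for a given replicator database and summarises the state of each replication document. The helper names and the sample payload are hypothetical and hand-written, not real server output; the `docs`, `doc_id` and `state` field names are assumed from the response format described in the linked documentation.

```python
def scheduler_docs_url(base, replicator_db="_replicator"):
    """Return the /_scheduler/docs URL for one replicator database."""
    # A custom replicator db is passed explicitly in the path,
    # e.g. _scheduler/docs/mycustom/_replicator
    return f"{base.rstrip('/')}/_scheduler/docs/{replicator_db}"


def states_by_doc(scheduler_docs_response):
    """Map each replication document ID to its reported state."""
    docs = scheduler_docs_response.get("docs", [])
    return {d["doc_id"]: d["state"] for d in docs}


# Hand-written sample, not real server output.
sample = {
    "docs": [
        {"database": "mycustom/_replicator", "doc_id": "my_rep", "state": "running"},
        {"database": "mycustom/_replicator", "doc_id": "bad_rep", "state": "failed"},
    ]
}

print(scheduler_docs_url("http://localhost:5984", "mycustom/_replicator"))
print(states_by_doc(sample))
```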

Replication States

Replication documents become replication jobs and then replication jobs do all the replication work. There are a number of states a replication goes through so this chart might be helpful:

http://docs.couchdb.org/en/latest/replication/replicator.html#replication-states

ufobat · 2018-10-31T16:12:59Z

What about
{"id":"_design/_replicator","key":"_design/_replicator","value":{"rev":"1-85a961d0d9b235b7b4f07baed1a38fda"}} which is a default document in the _replicator db?

ufobat · 2018-10-31T16:17:36Z

<vatamane> that's expected, it's automatically created and is used to validate replication documents created in that database
<rnewson> it enforces correctness of the docs, it's an internal detail really
<ufobat> i am just asking because when i do a get on _replicator/_all_docs i need to know that this is there in order to skip over it
<vatamane> you can skip over it
<ufobat> because "remove anything thats not my desired replication configuration" would be wrong
<ufobat> ty :)
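The cleanup idea discussed above (skip the design document, then remove whatever is not in the desired configuration) might look roughly like this. It is only a sketch: the function names and the sample replication document IDs are made up for illustration, and the actual deletion requests are left out. The design document row is the one quoted earlier in this thread.

```python
def replication_doc_ids(all_docs_response):
    """Return document IDs from an _all_docs response, skipping design docs."""
    return [
        row["id"]
        for row in all_docs_response["rows"]
        if not row["id"].startswith("_design/")
    ]


def obsolete_ids(existing_ids, desired_ids):
    """Return IDs that exist in the replicator db but are no longer wanted."""
    return sorted(set(existing_ids) - set(desired_ids))


# Hand-written sample; "rep_a" and "rep_old" are hypothetical document IDs.
all_docs = {
    "rows": [
        {
            "id": "_design/_replicator",
            "key": "_design/_replicator",
            "value": {"rev": "1-85a961d0d9b235b7b4f07baed1a38fda"},
        },
        {"id": "rep_a", "key": "rep_a", "value": {"rev": "1-aaa"}},
        {"id": "rep_old", "key": "rep_old", "value": {"rev": "1-bbb"}},
    ]
}

existing = replication_doc_ids(all_docs)
print(obsolete_ids(existing, desired_ids=["rep_a"]))
```

Filtering on the `_design/` prefix keeps the automatically created validation document from ever being treated as an unwanted replication.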

wohali · 2018-10-31T16:32:36Z

@ufobat do you have everything you need to write a PR?

ufobat · 2018-10-31T17:57:57Z

@wohali I think I do, thank you :-)

ufobat · 2018-11-12T19:42:34Z

I am not sure if and where I should mention the _design/_replicator document. Any idea?

ufobat added a commit to ufobat/couchdb-documentation that referenced this issue Nov 12, 2018

More information about replication. Fixes apache#346

c1739fd

ufobat added a commit to ufobat/couchdb-documentation that referenced this issue Nov 12, 2018

More information about replication. Fixes apache#346

c6723c0

ufobat mentioned this issue Nov 12, 2018

More information about replication. Fixes #346 #349

Merged

3 tasks

wohali closed this as completed in 6a29484 Nov 21, 2018
