
Remove Shadow Replicas #22024

Closed
bleskes opened this Issue Dec 7, 2016 · 8 comments

@bleskes
Member

bleskes commented Dec 7, 2016

The shadow replicas feature (added in #8976) has not seen the amount of usage we originally hoped for. On the flip side, it is very tricky to maintain, effectively adding a second replication code path. Example issues include #22021, #17695, #16358, #13583, #16357 - see here for more. It also requires new features (like sequence numbers) to be implemented for both normal replicas and shadow replicas, increasing the amount of future work.

Given the lack of uptake, we are considering removing them (and will mark them as deprecated in a separate PR).

This issue is opened as a placeholder for discussion and to give people the option to object and state their use case.
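For context, shadow replicas are enabled per index via settings that point the replicas at a shared filesystem path. A minimal sketch of such a setup using the 5.x Java client follows; the index name, data path, and helper class are made up for illustration, and the node must separately be configured to allow custom data paths on that shared mount.

```java
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

public class ShadowReplicaExample {
    // Sketch only: creates an index whose replicas read segments directly from a
    // shared filesystem path instead of being replicated document-by-document.
    // "my_index" and the data path are placeholders.
    static void createShadowReplicaIndex(Client client) {
        client.admin().indices().prepareCreate("my_index")
            .setSettings(Settings.builder()
                .put("index.number_of_shards", 1)
                .put("index.number_of_replicas", 1)
                .put("index.shadow_replicas", true)                 // the feature under discussion
                .put("index.data_path", "/mnt/shared-fs/my_index")) // shared filesystem location
            .get();
    }
}
```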

bleskes changed the title from "Removal Shadow Replicas" to "Remove Shadow Replicas" on Dec 7, 2016

bleskes added a commit to bleskes/elasticsearch that referenced this issue Dec 7, 2016

@makeyang

Contributor

makeyang commented Dec 8, 2016

Segment-based replication is really helpful for certain scenarios, for example to provide something like read-write isolation, which is common in databases.
But this feature shouldn't be bound to a shared filesystem.

@bleskes

Member Author

bleskes commented Dec 8, 2016

@makeyang yes. Segment-based replication represents different trade-offs and is interesting in some respects. As you say, it is not bound to a shared file system. We have been talking (for a long time) about the pros and cons of it and how to potentially implement it without relying on a shared file system to do the heavy lifting of transferring files between nodes (note that we currently rely on the translog being copied as well, something segment-based replication doesn't imply). If and when we implement it, it should work on normal file systems. Since that's the hardest part to address, I don't feel there is value in keeping shadow replicas to expedite segment-based replication (if this is what you meant).

@makeyang

Contributor

makeyang commented Dec 9, 2016

@bleskes got it, thanks. If translog-based or segment-based asynchronous replication is implemented, there is really no reason to keep shadow replicas.

@chenryn


chenryn commented Dec 13, 2016

Does this mean that asynchronous replication will be released once shadow replicas are completely removed?

@jasontedor

Member

jasontedor commented Dec 13, 2016

Does this mean that asynchronous replication will be released once shadow replicas are completely removed?

It does not mean that.

bleskes added a commit that referenced this issue Jan 16, 2017

bleskes added a commit that referenced this issue Jan 18, 2017

Add a deprecation notice to shadow replicas (#22647)
Relates to #22024

On top of documentation, the PR adds deprecation loggers and deals with the resulting warning headers.

The yaml test is set to exclude versions up to 6.0. This is needed to make sure bwc tests pass until this is backported to 5.2.0. Once that's done, I will change the yaml test version limits.
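A rough illustration of the kind of deprecation logging the PR describes; the class and method names here are hypothetical, and the sketch assumes the 5.x DeprecationLogger, which both logs the message and surfaces it as a Warning header on the REST response.

```java
import org.elasticsearch.common.logging.DeprecationLogger;
import org.elasticsearch.common.logging.Loggers;

public class ShadowReplicaDeprecations {
    private static final DeprecationLogger DEPRECATION_LOGGER =
        new DeprecationLogger(Loggers.getLogger(ShadowReplicaDeprecations.class));

    // Hypothetical hook, invoked when an index is created or updated with
    // index.shadow_replicas=true: emits the deprecation warning.
    static void warnShadowReplicasDeprecated() {
        DEPRECATION_LOGGER.deprecated(
            "[index.shadow_replicas] is deprecated and will be removed in a future version");
    }
}
```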

bleskes added a commit that referenced this issue Jan 18, 2017


bleskes added a commit that referenced this issue Jan 19, 2017

@apatrida

Contributor

apatrida commented Feb 9, 2017

What about read-heavy scenarios where you want to scale uniformly per node without the delay of replication (segment or per transaction), or without the cost of applying operations (per transaction), and maybe over a WAN (segment-based would work for that, since it is likely async)?

@apatrida

Contributor

apatrida commented Feb 9, 2017

Another question, related to the future of replacing this, because I'm not sure where the thinking for future replication models is captured (another GitHub issue?).

WAN replication => the clusters are not one cluster, but two distinct clusters that may have different shard balances. The poor man's version of this today is near-cluster => S3 backup/snapshot => restore snapshot from S3 => remote-cluster

@bleskes

Member Author

bleskes commented Feb 9, 2017

What about read-heavy scenarios where you want to scale uniformly per node without the delay of replication (segment or per transaction), or without the cost of applying operations (per transaction)

I'm not sure what you mean. A read-heavy environment means writing is not a big deal, and most of the work of the shards is to serve reads. Shadow replicas, which still need to push the segments (with the few new documents) and do things like merging, won't help much there.

maybe over a WAN
The poor man's version of this today is near-cluster => S3 backup/snapshot => restore snapshot from S3 => remote-cluster

Indeed, snapshot and restore is the current solution. Shadow replicas are not a long-term solution as the current implementation assumes a shared file system. We are working on things like a changes API which will allow you to stream (with a delay) operations to another cluster.
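For reference, a sketch of the snapshot-and-restore workflow mentioned above, using the 5.x Java client against two separate clusters. The repository name, snapshot name, index name, and bucket are placeholders, and both clusters are assumed to have the repository-s3 plugin installed.

```java
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

public class PoorMansWanReplication {
    // Sketch: snapshot an index from the "near" cluster into a shared S3 bucket...
    static void snapshotOnNearCluster(Client nearCluster) {
        nearCluster.admin().cluster().preparePutRepository("wan_repo")
            .setType("s3")
            .setSettings(Settings.builder().put("bucket", "my-replication-bucket"))
            .get();
        nearCluster.admin().cluster().prepareCreateSnapshot("wan_repo", "snap_1")
            .setIndices("my_index")
            .setWaitForCompletion(true)
            .get();
    }

    // ...then restore it on the "remote" cluster, which registers the same repository.
    static void restoreOnRemoteCluster(Client remoteCluster) {
        remoteCluster.admin().cluster().prepareRestoreSnapshot("wan_repo", "snap_1")
            .setIndices("my_index")
            .setWaitForCompletion(true)
            .get();
    }
}
```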

@dakrone dakrone self-assigned this Mar 21, 2017

dakrone added a commit that referenced this issue Apr 11, 2017
