-
Notifications
You must be signed in to change notification settings - Fork 1.5k
disaster recover #7465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
disaster recover #7465
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yangxuanjia Thanks for raising the PR 🎉 I looked at the PR. The changes look good to me.
I see a few dgraph/yxj_start_*.sh scripts. I believe they were accidentally added. Can you please remove them?
We'll also need @manishrjain's review one this PR.
done |
|
@yangxuanjia The review of this PR is pending because we're working on the 21.03 release currently. This PR is a useful change and we'll get this into dgraph after the next release. Thanks for working on it. |
|
hey @yangxuanjia, I see that there are some conflicts on this PR. Is it okay if I resolve the conflicts and push some changes to this PR? |
|
@jarifibrahim, |
…uanjia/dgraph into yangxuanjia-yxjdgraph_disaster_recover
…raph_disaster_recover
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @yangxuanjia for your contribution. We'd merge it to a branch first where I can refactor this change a bit before merging to master.
Btw, happy to chat sometime about your experiences with running Dgraph.
Reviewed 10 of 10 files at r3.
Reviewable status: 10 of 18 files reviewed, 5 unresolved discussions (waiting on @jarifibrahim, @vvbalaji-dgraph, and @yangxuanjia)
dgraph/cmd/zero/raft.go, line 31 at r3 (raw file):
"github.com/dgraph-io/dgraph/worker" "go.uber.org/zap"
Remove this. We don't use it. No point introducing it in this PR.
dgraph/cmd/zero/raft.go, line 564 at r3 (raw file):
} func (n *node) checkForCIDInEntries() (string, bool, error) {
Why are we returning string here?
dgraph/cmd/zero/zero.go, line 673 at r3 (raw file):
} func (s *Server) RemoveAlphaNode(ctx context.Context, member *pb.Member) (*pb.Status, error) {
We already have remove node. Why do this?
protos/pb.proto, line 564 at r3 (raw file):
rpc MoveTablet(MoveTabletRequest) returns (Status) {} rpc ApplyLicense(ApplyLicenseRequest) returns (Status) {} rpc RemoveAlphaNode (Member) returns (Status) {}
don't need this.
worker/disaster_recover_raft.go, line 2 at r3 (raw file):
/* * Copyright 2017-2018 Dgraph Labs, Inc. and Contributors
2021
Disaster recover for zero and alpha.
If raft crash more than half of voters, and docker can't not available, it can be usefully for online environment.
Just restart the only alive node of zero or alpha with --force_new_cluster=true, the cluster will good.
But this method is lossy recovery, you must use this when you have not other method, it's the last resort when you are in disaster.
This change is