Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdmain: delete member dir falling back to proxy #3949

Closed
wants to merge 1 commit into from

Conversation

gyuho
Copy link
Contributor

@gyuho gyuho commented Dec 3, 2015

This fixes #3827 where member falls to
back to proxy successfully at first, but fails in subsequent tries.
It fails when there are 'member' and 'proxy' directory in the same place,
one of which did not get cleaned up from failure and causes this conflict
error message: 'invalid datadir. Both member and proxy directories exist.'

This fixes etcd-io#3827 where member falls to
back to proxy successfully at first, but fails in subsequent tries.
It fails when there are 'member' and 'proxy' directory in the same place,
one of which did not get cleaned up from failure and causes this conflict
error message: 'invalid datadir. Both member and proxy directories exist.'
@gyuho
Copy link
Contributor Author

gyuho commented Dec 8, 2015

Kindly ping @xiang90 @jonboulle

Please review and let me know if you have any other suggestion to resolve #3827. Thanks!

@xiang90
Copy link
Contributor

xiang90 commented Dec 8, 2015

@gyuho Can we stop etcd from creating member dir if it does not start as an etcd member?

@gyuho
Copy link
Contributor Author

gyuho commented Dec 8, 2015

@xiang90 Ok, I will try that approach! Thanks

@gyuho
Copy link
Contributor Author

gyuho commented Dec 8, 2015

@xiang90 I looked at the code again, and I think the issue matter only when a member falls back to proxy.

Code1. https://github.com/coreos/etcd/blob/master/etcdmain/etcd.go#L198
Code2. https://github.com/coreos/etcd/blob/master/etcdserver/server.go#L184-L205
Code3. https://github.com/coreos/etcd/blob/master/etcdserver/storage.go#L107-L125

Code1's startEtcd calls Code2's etcdserver.NewServer, which then calls upgradeDataDir in Code3. But startEtcd does not get called if it starts as a proxy (https://github.com/coreos/etcd/blob/master/etcdmain/etcd.go#L113-L137), which means we already do not create member dir if it does not start as an etcd member (it does when it starts as a member but falls back to proxy).

Please let me know if you have any feedback.

@gyuho
Copy link
Contributor Author

gyuho commented Dec 8, 2015

I think we either allow member to have both proxy and member directories for this case, or delete the member directory when a member falls back to proxy. BTW, why we do not allow both proxy and member in the first place?

@xiang90
Copy link
Contributor

xiang90 commented Dec 8, 2015

@gyuho My question is when the will the user have both dir? That should not happen.

@gyuho
Copy link
Contributor Author

gyuho commented Dec 9, 2015

@xiang90 Please correct me if I am wrong.

https://github.com/coreos/etcd/blob/c0353697da1216fd0555928efb53f4134a2b5ab4/etcdmain/etcd.go#L124-L136

shows if startEtcd returns discovery error, it tries to start as a proxy.
I think startEtcd could try to create both member and proxy directory in this case:

  1. startEtcd succeeds creating its member directory`
  2. This member crashes and tries to rererun startEtcd
  3. But this time, discovery fails in the function getPeerURLsMapAndToken returning discovery type error
  4. The logic at https://github.com/coreos/etcd/blob/c0353697da1216fd0555928efb53f4134a2b5ab4/etcdmain/etcd.go#L127-L132 detects this and tries to fall back to proxy
  5. But before the crash, it was an etcd member with member directory, so now startProxy fails.

@xiang90
Copy link
Contributor

xiang90 commented Dec 9, 2015

I think startEtcd could try to create both member and proxy directory in this case:

etcd should probably ONLY create member dir after it finishes bootstrapping, not before.

@xiang90
Copy link
Contributor

xiang90 commented Dec 9, 2015

In the discovery case, etcd should create member dir after it finishes discovery and finds all its peers.

@gyuho
Copy link
Contributor Author

gyuho commented Dec 9, 2015

Thanks. I will investigate more.

Sincerely,
Gyu-Ho Lee
On Dec 8, 2015 17:35, "Xiang Li" notifications@github.com wrote:

In the discovery case, etcd should create member dir after it finishes
discovery and finds all its peers.


Reply to this email directly or view it on GitHub
#3949 (comment).

@gyuho
Copy link
Contributor Author

gyuho commented Dec 29, 2015

Closing this in favor of #4087.

@gyuho gyuho closed this Dec 29, 2015
@gyuho gyuho deleted the delete_fallback branch January 31, 2016 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

etcd daemon: creates member directory if cluster config is not valid
2 participants