Add etcd scaleup playbook #3043
|
Can one of the admins verify this patch? |
|
|
bot, test pull request |
|
aos-ci-test |
|
ae88c63 - State: success - All Test Contexts: aos-ci-jenkins/OS_unit_tests - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-2-unit-tests-683/ae88c63dda7c433cd2b42ad7d9527b483579dc4e.txt |
|
ae88c63 - State: success - All Test Contexts: "aos-ci-jenkins/OS_3.3_NOT_containerized, aos-ci-jenkins/OS_3.3_NOT_containerized_e2e_tests" - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-3-test-matrix-CONTAINERIZED=_NOT_containerized,OSE_VER=3.3,PYTHON=System-CPython-2.7,TOPOLOGY=openshift-cluster,TargetBranch=master,nodes=openshift-ansible-slave-688/ae88c63dda7c433cd2b42ad7d9527b483579dc4e.txt |
|
ae88c63 - State: success - All Test Contexts: "aos-ci-jenkins/OS_3.4_NOT_containerized, aos-ci-jenkins/OS_3.4_NOT_containerized_e2e_tests" - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-3-test-matrix-CONTAINERIZED=_NOT_containerized,OSE_VER=3.4,PYTHON=System-CPython-2.7,TOPOLOGY=openshift-cluster,TargetBranch=master,nodes=openshift-ansible-slave-688/ae88c63dda7c433cd2b42ad7d9527b483579dc4e.txt |
|
ae88c63 - State: success - All Test Contexts: "aos-ci-jenkins/OS_3.4_containerized, aos-ci-jenkins/OS_3.4_containerized_e2e_tests" - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-3-test-matrix-CONTAINERIZED=_containerized,OSE_VER=3.4,PYTHON=System-CPython-2.7,TOPOLOGY=openshift-cluster-containerized,TargetBranch=master,nodes=openshift-ansible-slave-688/ae88c63dda7c433cd2b42ad7d9527b483579dc4e.txt |
|
ae88c63 - State: success - All Test Contexts: "aos-ci-jenkins/OS_3.3_containerized, aos-ci-jenkins/OS_3.3_containerized_e2e_tests" - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-3-test-matrix-CONTAINERIZED=_containerized,OSE_VER=3.3,PYTHON=System-CPython-2.7,TOPOLOGY=openshift-cluster-containerized,TargetBranch=master,nodes=openshift-ansible-slave-688/ae88c63dda7c433cd2b42ad7d9527b483579dc4e.txt |
Force-pushed from 28cdc53 to 0604c6c
|
This is not complete yet. I ran into some problems with it and need to validate it first. |
Force-pushed from b1af6fe to d08e21e
|
I updated the PR; the playbooks are working now. |
|
I am working on fixing the yamllint stuff (Travis errors). |
|
Travis errors fixed. |
|
@sdodson PTAL |
|
Thanks, assigned someone to review who is more familiar with the tricky bits of cert work around etcd. |
|
@jkhelil Started testing this afternoon and will be reviewing soon! Sorry for the delay on this one |
|
@abutcher |
|
@jkhelil I've tested the playbook starting with multiple etcd instances and it works great. I also think the changes here look good. I didn't have success starting with a single etcd node and moving to multiple due to the way we configure advertised client urls. I haven't dug into this too deeply but it appears that with a single external etcd instance, the advertised client urls only contain As opposed to: |
|
We probably need to do something to convert the single etcd instance from 'localhost' to hostnames and then re-initialize the cluster with that config before scaling up. |
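A minimal sketch of the 'localhost' to hostname conversion described above, in Python rather than Ansible. It assumes the single instance advertises a loopback client URL; the function name, the loopback set, and the example hostname are hypothetical and are not taken from the playbooks. Port 2379 is etcd's standard client port.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts treated as "local" for this illustration (assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def advertise_with_hostname(urls, hostname):
    """Replace loopback hosts in a comma-separated URL list with `hostname`.

    URLs that already use a real hostname are returned unchanged.
    """
    rewritten = []
    for url in urls.split(","):
        parts = urlsplit(url.strip())
        if parts.hostname in LOCAL_HOSTS:
            # Keep the original port, if any, and swap only the host.
            netloc = hostname if parts.port is None else f"{hostname}:{parts.port}"
            parts = parts._replace(netloc=netloc)
        rewritten.append(urlunsplit(parts))
    return ",".join(rewritten)


if __name__ == "__main__":
    # Hypothetical hostname, for illustration only.
    print(advertise_with_hostname("https://localhost:2379", "etcd1.example.com"))
```

Rewriting the advertised URLs alone is not enough for an already-initialized member; as noted above, the cluster would still need to be re-initialized with the new config before scaling up.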
|
I was able to start with a single instance by making these changes to the etcd config in order to set a single node up for future clustering.
diff --git a/roles/etcd/templates/etcd.conf.j2 b/roles/etcd/templates/etcd.conf.j2
index 64c14a0..1095e76 100644
--- a/roles/etcd/templates/etcd.conf.j2
+++ b/roles/etcd/templates/etcd.conf.j2
@@ -8,12 +8,8 @@
{% endfor -%}
{% endmacro -%}
-{% if etcd_peers | default([]) | length > 1 %}
ETCD_NAME={{ etcd_hostname }}
ETCD_LISTEN_PEER_URLS={{ etcd_listen_peer_urls }}
-{% else %}
-ETCD_NAME=default
-{% endif %}
ETCD_DATA_DIR={{ etcd_data_dir }}
#ETCD_SNAPSHOT_COUNTER=10000
ETCD_HEARTBEAT_INTERVAL=500
@@ -23,7 +19,6 @@ ETCD_LISTEN_CLIENT_URLS={{ etcd_listen_client_urls }}
#ETCD_MAX_WALS=5
#ETCD_CORS=
-{% if etcd_peers | default([]) | length > 1 %}
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS={{ etcd_initial_advertise_peer_urls }}
{% if initial_etcd_cluster is defined and initial_etcd_cluster %}
@@ -37,7 +32,6 @@ ETCD_INITIAL_CLUSTER_TOKEN={{ etcd_initial_cluster_token }}
#ETCD_DISCOVERY_SRV=
#ETCD_DISCOVERY_FALLBACK=proxy
#ETCD_DISCOVERY_PROXY=
-{% endif %}
ETCD_ADVERTISE_CLIENT_URLS={{ etcd_advertise_client_urls }}
#[proxy] |
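The effect of the diff above can be sketched in Python, assuming the only behavioral change is the removal of the two `etcd_peers | length > 1` conditionals. The function below is a hypothetical stand-in for the Jinja2 template, covering only the lines the diff touches; the hostname and URLs in the usage example are made up.

```python
def render_etcd_conf(etcd_hostname, etcd_peers, etcd_listen_peer_urls,
                     etcd_initial_advertise_peer_urls, patched):
    """Return the name and [cluster] lines as each template version would emit them.

    patched=False mimics the original template (cluster settings only when
    there is more than one peer); patched=True mimics the diff, which always
    writes them.
    """
    clustered = patched or len(etcd_peers) > 1
    lines = []
    if clustered:
        lines.append(f"ETCD_NAME={etcd_hostname}")
        lines.append(f"ETCD_LISTEN_PEER_URLS={etcd_listen_peer_urls}")
    else:
        # Original single-node behavior: a generic member name, no peer URLs.
        lines.append("ETCD_NAME=default")
    if clustered:
        lines.append("#[cluster]")
        lines.append(
            f"ETCD_INITIAL_ADVERTISE_PEER_URLS={etcd_initial_advertise_peer_urls}"
        )
    return lines


if __name__ == "__main__":
    args = ("etcd1.example.com", ["etcd1.example.com"],
            "https://10.0.0.1:2380", "https://10.0.0.1:2380")
    print(render_etcd_conf(*args, patched=False))
    print(render_etcd_conf(*args, patched=True))
```

With a single peer, the unpatched version yields only `ETCD_NAME=default`, while the patched version names the member by hostname and writes the peer settings, which is what makes the node usable as the seed of a larger cluster later.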
|
RFE for adding/removing etcd members here: #1772 |
|
Use cases tested:
Tried that with both SELinux enabled and disabled. No difference. |
|
1-node member related issue: etcd-io/etcd#7820 |
|
@abutcher Hi Andrew, were you able to do this |
Yes, tested this but not successfully. |
|
I am working on it; I will keep you informed. |
|
I managed to get it working with a cluster created with the old configuration, and I was able to scale up etcd onto a new node using the RPM-based config, not the dockerized one. which seems unrelated to the etcd scaleup; it is reported here |
|
@abutcher can you give me explicit details about what exactly is not working? I was able to add an etcd node using these playbooks on a stack created with the 3.2 playbooks, except for the error I mentioned, which seems to be a bug in the latest openshift-ansible. |
|
|
@jkhelil The only case that isn't working for me is to create a cluster with a single instance in the [etcd] group with the master branch and then attempt to scale it up with this branch. However, if the cluster was initially created with this branch then scaling up from a single instance works fine. I also encountered some issues with overridden hostnames which I was able to work around with this commit abutcher@54d3b45. |
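For context on why the initial single-node config matters, here is a rough sketch of the settings a new member needs when it joins an existing etcd cluster. `ETCD_INITIAL_CLUSTER` and `ETCD_INITIAL_CLUSTER_STATE=existing` are standard etcd settings; the helper names and the hostnames are hypothetical, and this is not how the playbooks in this PR are implemented.

```python
def initial_cluster(members):
    """Build the ETCD_INITIAL_CLUSTER value from (name, peer_url) pairs."""
    return ",".join(f"{name}={url}" for name, url in members)


def new_member_env(existing, new_name, new_peer_url):
    """Settings for a member joining an already-running cluster.

    `existing` lists the current members as (name, peer_url) pairs. The new
    member must list them plus itself, and must start in the "existing"
    state rather than bootstrapping a fresh cluster.
    """
    members = list(existing) + [(new_name, new_peer_url)]
    return {
        "ETCD_NAME": new_name,
        "ETCD_INITIAL_ADVERTISE_PEER_URLS": new_peer_url,
        "ETCD_INITIAL_CLUSTER": initial_cluster(members),
        "ETCD_INITIAL_CLUSTER_STATE": "existing",
    }


if __name__ == "__main__":
    env = new_member_env(
        [("etcd1.example.com", "https://etcd1.example.com:2380")],
        "etcd2.example.com",
        "https://etcd2.example.com:2380",
    )
    for key, value in env.items():
        print(f"{key}={value}")
```

The existing member's name and peer URL have to match what it registered with at creation time, which is one plausible reason a node first created as `default` without peer settings is harder to scale up from.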
|
aos-ci-test |
|
[merge] |
|
bot, retest this please |
|
aos-ci-test |
|
[merge] |
|
@sdodson Do you think we would need a card to track the feature? I'm afraid it would be missed by QE if it's intended to be in OCP, as well as the document. |
|
@ganhuang there's one already, moved it to complete https://trello.com/c/EESwIsuW/171-5-support-for-scaling-up-etcd |
An etcd scaleup playbook is needed for these reasons: