Several pages still use removed services #11482
The preview will be available shortly at:
|
install_config/configuring_sdn.adoc
Outdated
These should be on two separate lines:
master-restart api
master-restart controllers
install_config/configuring_sdn.adoc
Outdated
These should be on two separate lines:
master-restart api
master-restart controllers
install_config/configuring_sdn.adoc
Outdated
It's no longer "openvswitch service". We can simply say "restart OpenShift SDN."
|
@nekop Can you please take another look? I have made further changes, such as the API health check and replacing the steps for stopping and disabling services.
17ed8ed to 523cf4e
nekop left a comment
This review is only half done, but the important points so far are:
- If we move files out of /etc/origin/node/pods/, we need to restore them
- atomic-openshift-node.service exists in 3.10
I'll add more comments later.
admin_guide/ipsec.adoc
Outdated
- The static pod files are for master services, not for the node we want to reboot
- If we move files, we need to restore them afterwards
- We don't need to restart node services or run del-br br0 if we reboot the node at step 3
To fix, simply replace the steps with a node reboot.
admin_guide/iptables.adoc
Outdated
The atomic-openshift-node.service is still used in 3.10, no need to replace it.
admin_guide/manage_nodes.adoc
Outdated
The atomic-openshift-node.service is still used in 3.10, no need to replace it.
admin_guide/manage_nodes.adoc
Outdated
admin_guide/manage_nodes.adoc
Outdated
admin_guide/seccomp.adoc
Outdated
admin_guide/seccomp.adoc
Outdated
admin_guide/sysctls.adoc
Outdated
We need to restore the files before rebooting, and to restore them we first need to stop atomic-openshift-node.service (otherwise the pods start again).
# mkdir -p /etc/origin/node/pods-stopped
# mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
# systemctl stop atomic-openshift-node
# mv /etc/origin/node/pods-stopped/* /etc/origin/node/pods/
# reboot
We probably need more steps to see if etcd is properly stopped. I'll send a query to openshift-sme.
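The sequence above could be wrapped roughly as follows. This is only a sketch under stated assumptions: the function names stash_static_pods and restore_static_pods are hypothetical, and the directories default to the paths quoted in the comment but can be overridden for a dry run. The systemctl and reboot steps are left as comments because they must run on the real host.

```shell
#!/usr/bin/env bash
# Sketch only. POD_DIR and STOPPED_DIR default to the paths from the comment above;
# the function names are hypothetical, not part of OpenShift.
POD_DIR="${POD_DIR:-/etc/origin/node/pods}"
STOPPED_DIR="${STOPPED_DIR:-/etc/origin/node/pods-stopped}"

# Move every static pod manifest aside so the node stops those pods.
stash_static_pods() {
    local f
    mkdir -p "$STOPPED_DIR" || return 1
    for f in "$POD_DIR"/*; do
        [ -e "$f" ] || continue   # nothing to move when the directory is empty
        mv "$f" "$STOPPED_DIR"/ || return 1
    done
}

# Put the manifests back. Run this only after atomic-openshift-node is stopped,
# otherwise the node service starts the pods again.
restore_static_pods() {
    local f
    for f in "$STOPPED_DIR"/*; do
        [ -e "$f" ] || continue
        mv "$f" "$POD_DIR"/ || return 1
    done
}

# Intended order on the host (not executed by this sketch):
#   stash_static_pods
#   systemctl stop atomic-openshift-node
#   restore_static_pods
#   reboot
```

Keeping the two moves in functions makes it harder to forget the restore step, which is the main risk raised in this thread.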
It seems we just need to stop the etcd service here, so I think we should move only etcd.yaml instead of all the files in the /etc/origin/node/pods/ folder, unless you also want to stop master-api and controllers.
So it might be:
mv /etc/origin/node/pods/etcd.yaml /etc/origin/node/pods-stopped/
@xingxingxia @geliu2016 could you help confirm?
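For illustration, moving a single manifest could look like the sketch below. The helper name stash_one_static_pod is hypothetical, and the default directories are the ones mentioned in this thread; whether moving only etcd.yaml is sufficient is exactly the open question above, so this does not settle it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move one static pod manifest (for example etcd.yaml) aside,
# leaving the other manifests, and therefore master-api and controllers, in place.
# Usage: stash_one_static_pod NAME [POD_DIR] [STOPPED_DIR]
stash_one_static_pod() {
    local name="$1"
    local pod_dir="${2:-/etc/origin/node/pods}"
    local stopped_dir="${3:-/etc/origin/node/pods-stopped}"

    # Refuse to continue if the manifest is not where we expect it.
    if [ ! -f "$pod_dir/$name" ]; then
        echo "no such manifest: $pod_dir/$name" >&2
        return 1
    fi
    mkdir -p "$stopped_dir" && mv "$pod_dir/$name" "$stopped_dir"/
}
```

On a master this would be called as `stash_one_static_pod etcd.yaml`.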
There are three types of pods in the 'kube-system' project: api, controllers, and etcd.
If the focus here is the API service, you can say to view the master-api pods in the *kube-system* project, and use the command with a label to get only the master-api pods, as below:
oc get pod -n kube-system -l openshift.io/component=api
The master-controllers and master-etcd pods in the output can be removed if you use the command above with the label.

This might not be true. master-api.example.com looks like a pod name, and you cannot curl it. You should get the hostname or IP of the node where the pod landed and then curl that host. For example:
$ oc get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
master-api-qe-anli310master-etcd-zone1-1 1/1 Running 0 7h 10.240.0.16 qe-anli310master-etcd-zone1-1
$ curl -k https://qe-anli310master-etcd-zone1-1/healthz
ok
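The lookup and the health check can be combined, roughly as in the sketch below. It assumes `oc` is logged in with access to kube-system and that /healthz answers on the node's default HTTPS port, as in the example above. The function names api_pod_nodes and check_api_health are hypothetical; the label comes from the earlier comment in this thread.

```shell
#!/usr/bin/env bash
# Print the name of each node hosting a master API pod, one per line.
api_pod_nodes() {
    oc get pod -n kube-system -l openshift.io/component=api \
        -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}'
}

# Query /healthz on every such node; succeed only if each one answers "ok".
check_api_health() {
    local nodes node reply
    nodes="$(api_pod_nodes)" || return 1
    [ -n "$nodes" ] || return 1          # no API pod found
    for node in $nodes; do
        # -k skips certificate verification, as in the example above.
        reply="$(curl -sk "https://$node/healthz")" || return 1
        [ "$reply" = "ok" ] || return 1
    done
}
```

Curling the node rather than the pod name avoids the problem described above, since the pod name is not a resolvable host.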
@lihongan Sorry, I was busy with other tasks and could not reply promptly during the day.
The PR is a bit long and crosses different files (and different subteams: Networking, Master, Pod, maybe). A suggestion: if possible, it is better to have separate PRs for the different subteams to review. I'm from Master QE and did only a quick review of part of the PR today. Thank you!
admin_guide/ipsec.adoc
Outdated
Syntax issue: viewed at https://github.com/mburke5678/openshift-docs/blob/BZ-1614095/admin_guide/ipsec.adoc, it has an unnecessary trailing +
admin_guide/disabling_features.adoc
Outdated
The context here is node features and the node service atomic-openshift-node, which is unchanged in 3.10. It is not about master-restart controllers, which concerns the controllers service on the master.
So the change above is not needed.
@xingxingxia I am told that in 3.10+, changes to nodes no longer require a restart of the node. Changes made to the node config map are applied automatically by the sync pods.
Release notes: https://docs.openshift.com/container-platform/3.10/release_notes/ocp_3_10_release_notes.html#ocp-310-new-node-configuration-process
admin_guide/manage_nodes.adoc
Outdated
systemctl stop docker atomic-openshift-node stops both docker and atomic-openshift-node. Why run systemctl stop docker again?
Either one line:
systemctl stop docker atomic-openshift-node
Or two lines:
systemctl stop atomic-openshift-node
systemctl stop docker
Please check other similar places in the PR
admin_guide/manage_nodes.adoc
Outdated
"Node service" means the systemd service atomic-openshift-node. To restart the node service, just run systemctl restart atomic-openshift-node instead of rebooting the whole host. So please remove lines 799~804.
Please check other similar places in the PR.
admin_guide/manage_nodes.adoc
Outdated
@lihongan I'm not sure either whether the above steps for etcd are correct. We are waiting for the etcd QE owner, @geliu2016 (he is on leave and may be back next week). The etcd changes in the following files need his review too.
BTW, will this PR's changes take effect for all versions (https://docs.openshift.com/container-platform/3.9, 3.10, 3.7, and so on)? If so, versions <=3.9 need care, because replacing the systemd services etcd.service, atomic-openshift-master-api, and atomic-openshift-master-controllers with static pods in /etc/origin/node/pods only begins in 3.10.
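To make the version split concrete, here is a small sketch that prints which restart command applies. The function name master_restart_command is hypothetical; the 3.10+ branch uses the master-restart commands quoted earlier in this conversation, and the older branch assumes the systemd unit names listed in the comment above. It only prints the command and accepts major.minor versions such as 3.9 or 3.10.

```shell
#!/usr/bin/env bash
# Print the restart command for a master component ("api" or "controllers")
# for a given OpenShift 3.x version written as major.minor (3.9, 3.10, ...).
master_restart_command() {
    local version="$1" component="$2" minor
    minor="${version#3.}"     # "3.10" -> "10"; anything not starting with "3." is left unchanged

    case "$component" in
        api|controllers) ;;
        *) echo "unknown component: $component" >&2; return 1 ;;
    esac
    case "$minor" in
        ''|*[!0-9]*) echo "unsupported version: $version" >&2; return 1 ;;
    esac

    if [ "$minor" -ge 10 ]; then
        # 3.10 and later: masters run as static pods.
        echo "master-restart $component"
    else
        # 3.9 and earlier: masters run as systemd services (assumed unit names).
        echo "systemctl restart atomic-openshift-master-$component"
    fi
}
```

For example, `master_restart_command 3.9 api` prints the systemctl form, while `master_restart_command 3.10 api` prints `master-restart api`.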
@xingxingxia Thank you. This PR is for 3.10 and later.
Hello @mburke5678, regarding 'reboot': I tried both rebooting the etcd host and running 'systemctl start atomic-openshift-node', and both work (the etcd pod status is Running). So my question is whether rebooting the etcd host is mandatory, or whether running 'systemctl start atomic-openshift-node' is also OK.
In my experience, in some special VM (virtual machine) environments, rebooting the host can leave it unable to start normally, or even terminated, so I have some concerns about the reboot operation.
@geliu2016 I removed the reboot command so that the user needs only to restart the node. Do you know whom we can ask to be certain?
Thank you for pointing this out.
@mburke5678, to be certain about the etcd issue, you could ask jlegasse@redhat.com
@geliu2016 Joe Legasse: "As far as etcd is concerned, no reboot is necessary of the host machine..."
The node name is myserver.com so please remove /healthz from it.
|
@nekop @xingxingxia @lihongan @geliu2016 Are we OK to send this PR to documentation peer review? |
|
Hello, Michael,
I reviewed and added a comment several days ago, but the comment is still in pending status. Could you please take a look at it? Thanks.
admin_guide/topics/proc_removing-failed-etcd-member.adoc
<https://github.com/openshift/openshift-docs/pull/11482/files#diff-13578813adedc157a9999343aa416b14>
line 37
On Tue, Sep 11, 2018 at 2:15 AM, Michael Burke wrote:
> @nekop @xingxingxia @lihongan @geliu2016 Are we OK to send this PR to documentation peer review?
|
|
@geliu2016 Did I address your comment? Was it #11482 (comment)? |
geliu2016 left a comment
@mburke5678, I noticed that 'reboot' was removed at line 26. Is this 'reboot' (line 37) necessary? Thanks.
admin_guide/topics/proc_removing-failed-etcd-member.adoc https://github.com/openshift/openshift-docs/pull/11482/files#diff-13578813adedc157a9999343aa416b14 line 37
@geliu2016 I apologize. I thought I had removed the reboot. I have removed the command now.
Commit: 45165857d06f12c2d6d185a6078804ee9656aa50
|
@mfojtik Can you help with an etcd question? In the "Creating a single-node etcd cluster" procedure, does the user need a reboot after stopping etcd in Step 1 of the following? |
|
@geliu2016 I finally found someone to verify that the |
|
I also opened this https://bugzilla.redhat.com/show_bug.cgi?id=1626735 sorry for any dupes! |
|
@csheremeta PTAL |
|
@geliu2016 @lihongan PTAL One change from a customer issue and we can merge. |
|
@mburke5678, LGTM, thx |
|
@openshift/team-documentation Find and replace changes. PTAL to see if there is anything I did wrong. |
91ec982 to 9e98b35
3fbbe6d to fa85d2c
|
/cherrypick enterprise-3.10 |
|
@mburke5678: #11482 failed to apply on top of branch "enterprise-3.10".
https://bugzilla.redhat.com/show_bug.cgi?id=1614095
Also: https://bugzilla.redhat.com/show_bug.cgi?id=1615199