
pkg/daemon: ensure /home/core/.ssh is there, not just /home/core #448

Merged: 1 commit merged into openshift:master on Feb 19, 2019

Conversation

@runcom (Member) commented on Feb 17, 2019

Signed-off-by: Antonio Murdaca <runcom@linux.com>

- What I did

The code was only checking for /home/core, but we later try to write to /home/core/.ssh (that folder is almost certainly there already, but this needs to be fixed anyway).

I've also centralized file writes in #401 to reduce the chances of doing this again.
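
As a minimal sketch of the bug and the fix: only coreUserSSHPath and the MkdirAll call come from the diff further down; os.MkdirAll stands in here for the daemon's fileSystemClient wrapper, and the 0700 mode is illustrative rather than taken from the repo.

```go
package main

import (
	"os"
	"path/filepath"
)

const coreUserSSHPath = "/home/core/.ssh"

func main() {
	// Before: filepath.Dir strips the last path element, so this
	// ensured /home/core existed rather than /home/core/.ssh.
	if err := os.MkdirAll(filepath.Dir(coreUserSSHPath), 0700); err != nil {
		panic(err)
	}

	// After (presumably): target the .ssh directory itself.
	if err := os.MkdirAll(coreUserSSHPath, 0700); err != nil {
		panic(err)
	}
}
```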

- How to verify it

- Description for the changelog

@openshift-ci-robot added the approved label on Feb 17, 2019
@openshift-ci-robot added the size/XS label on Feb 17, 2019
@runcom changed the title from "pkg/daemon: ensure .ssh is there, not /home/core" to "pkg/daemon: ensure /home/core/.ssh is there, not just /home/core" on Feb 17, 2019
Commit message: "Since we later write inside /home/core/.ssh."

Signed-off-by: Antonio Murdaca <runcom@linux.com>
@@ -608,9 +608,8 @@ func (dn *Daemon) updateSSHKeys(newUsers []ignv2_2types.PasswdUser) error {
// Keys should only be written to "/home/core/.ssh"
// Once Users are supported fully this should be writing to PasswdUser.HomeDir
glog.Infof("Writing SSHKeys at %q", coreUserSSHPath)

if err := dn.fileSystemClient.MkdirAll(filepath.Dir(coreUserSSHPath), os.FileMode(0600)); err != nil {
@runcom (Member, Author) commented on Feb 17, 2019

for clarity:

  • filepath.Dir("/home/core/.ssh") == "/home/core"
  • filepath.Dir("/home/core/.ssh/") == "/home/core/.ssh"

just a trailing slash... but we don't need filepath.Dir here anyway
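
A runnable illustration of those two cases, using only the standard library (nothing here is repo-specific):

```go
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// Dir drops the final path element, then cleans the result.
	fmt.Println(filepath.Dir("/home/core/.ssh")) // /home/core

	// With a trailing slash the final element is empty, so the
	// ".ssh" component survives after the slash is cleaned away.
	fmt.Println(filepath.Dir("/home/core/.ssh/")) // /home/core/.ssh
}
```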

@kikisdeliveryservice (Contributor) commented on Feb 18, 2019

Gah, I lost that in the refactor from the .join to the const. Good catch!

@runcom (Member, Author) commented on Feb 17, 2019

unit flake from #417

/retest

@runcom (Member, Author) commented on Feb 17, 2019

HAProxy e2e-aws flake

/retest

@runcom (Member, Author) commented on Feb 17, 2019

failure looks like the bug fixed in #442

/retest

@kikisdeliveryservice (Contributor) commented on Feb 18, 2019

/lgtm

@openshift-ci-robot added the lgtm label on Feb 18, 2019
@runcom (Member, Author) commented on Feb 18, 2019

Cluster operator network error

/retest

@kikisdeliveryservice (Contributor) commented on Feb 18, 2019

"Cluster operator network has not yet reported success" -> then timed out.
/test e2e-aws

@runcom (Member, Author) commented on Feb 18, 2019

time="2019-02-18T21:33:32Z" level=info msg="Waiting up to 30m0s for the cluster to initialize..."
time="2019-02-18T21:33:32Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:33:40Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:33:55Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:34:10Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:35:40Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:36:25Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:36:40Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:39:40Z" level=debug msg="Still waiting for the cluster to initialize: Cluster operator monitoring is reporting a failure: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: waiting for Prometheus object changes failed: waiting for Prometheus: retrieving Prometheus object failed: Get https://172.30.0.1:443/apis/monitoring.coreos.com/v1/namespaces/openshift-monitoring/prometheuses/k8s: dial tcp 172.30.0.1:443: connect: connection refused"
time="2019-02-18T21:43:40Z" level=debug msg="Still waiting for the cluster to initialize: Cluster operator network has not yet reported success"
time="2019-02-18T21:44:25Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:47:40Z" level=debug msg="Still waiting for the cluster to initialize: Cluster operator network has not yet reported success"
time="2019-02-18T21:49:25Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:52:55Z" level=debug msg="Still waiting for the cluster to initialize: Cluster operator network has not yet reported success"
time="2019-02-18T21:56:40Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T21:59:55Z" level=debug msg="Still waiting for the cluster to initialize: Cluster operator network has not yet reported success"
time="2019-02-18T22:03:25Z" level=debug msg="Still waiting for the cluster to initialize..."
time="2019-02-18T22:03:32Z" level=fatal msg="failed to initialize the cluster: timed out waiting for the condition"

/retest

@ashcrow (Member) commented on Feb 18, 2019

/lgtm

@openshift-ci-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, kikisdeliveryservice, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ashcrow,kikisdeliveryservice,runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@runcom (Member, Author) commented on Feb 18, 2019

level=fatal msg="failed to initialize the cluster: Cluster operator network has not yet reported success"

FYI: that error is already reported correctly and is probably a container runtime bug.

/retest

@openshift-merge-robot merged commit 86a5a55 into openshift:master on Feb 19, 2019
@runcom deleted the fix-filepathdir branch on February 19, 2019 at 07:59