USHIFT-1590: skip failing invariants in MicroShift #28193
@pacevedom: This pull request references USHIFT-1590 which is a valid jira issue. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
a821213
to
ecdb4fb
Compare
/retest-required
if !isMicroShift {
	if err := sampler.TearDownInClusterMonitors(w.adminRESTConfig); err != nil {
		return err
	}
}
- if !isMicroShift {
- 	if err := sampler.TearDownInClusterMonitors(w.adminRESTConfig); err != nil {
- 		return err
- 	}
- }
+ if isMicroShift {
+ 	return nil
+ }
+ if err := sampler.TearDownInClusterMonitors(w.adminRESTConfig); err != nil {
+ 	return err
+ }
Done
} else {
	infra, err := configClient.ConfigV1().Infrastructures().Get(context.Background(), "cluster", metav1.GetOptions{})
	isMicroShift, err := exutil.IsMicroShiftCluster(kubeClient)
Use early returns to eliminate nesting. This is hard to follow.
Perhaps this should become a getPlatformType(restConfig) (platformType, error) function to make it clearer. I think this is trying to:
- fail if a kubeclient cannot be created
- return empty with no error if the cluster lacks an infrastructure CRD
- return an error if the cluster has an infrastructure CRD but no CR instance
- return the platform type if the cluster has a CR instance
Done
@@ -30,6 +31,15 @@ func (w *kubeletLogCollector) CollectData(ctx context.Context, storageDir string
	if err != nil {
		return nil, nil, err
	}
	// MicroShift does not have a proper journal for the node logs api.
It's under a different service: the journal belongs to microshift, which holds the logging output of the whole control plane, not just kubelet. All the supportability docs are prepared for it (as well as an sos plugin).
This is the output:
$ oc get --raw "/api/v1/nodes/microshift-dev/proxy/logs"
<pre>
<a href="README">README</a>
<a href="anaconda/">anaconda/</a>
<a href="audit/">audit/</a>
<a href="btmp">btmp</a>
<a href="btmp-20230810">btmp-20230810</a>
<a href="chrony/">chrony/</a>
<a href="containers/">containers/</a>
<a href="crio/">crio/</a>
<a href="cron">cron</a>
<a href="cron-20230704">cron-20230704</a>
<a href="cron-20230810">cron-20230810</a>
<a href="cron-20230817">cron-20230817</a>
<a href="cron-20230824">cron-20230824</a>
<a href="dnf.librepo.log">dnf.librepo.log</a>
<a href="dnf.log">dnf.log</a>
<a href="dnf.log.1">dnf.log.1</a>
<a href="dnf.rpm.log">dnf.rpm.log</a>
<a href="firewalld">firewalld</a>
<a href="hawkey.log">hawkey.log</a>
<a href="hawkey.log-20230704">hawkey.log-20230704</a>
<a href="hawkey.log-20230810">hawkey.log-20230810</a>
<a href="hawkey.log-20230817">hawkey.log-20230817</a>
<a href="hawkey.log-20230824">hawkey.log-20230824</a>
<a href="insights-client/">insights-client/</a>
<a href="kdump.log">kdump.log</a>
<a href="kube-apiserver/">kube-apiserver/</a>
<a href="lastlog">lastlog</a>
<a href="libvirt/">libvirt/</a>
<a href="maillog">maillog</a>
<a href="maillog-20230704">maillog-20230704</a>
<a href="maillog-20230810">maillog-20230810</a>
<a href="maillog-20230817">maillog-20230817</a>
<a href="maillog-20230824">maillog-20230824</a>
<a href="messages">messages</a>
<a href="messages-20230704">messages-20230704</a>
<a href="messages-20230810">messages-20230810</a>
<a href="messages-20230817">messages-20230817</a>
<a href="messages-20230824">messages-20230824</a>
<a href="openvswitch/">openvswitch/</a>
<a href="ovn/">ovn/</a>
<a href="ovn-kubernetes/">ovn-kubernetes/</a>
<a href="pods/">pods/</a>
<a href="private/">private/</a>
<a href="qemu-ga/">qemu-ga/</a>
<a href="rhsm/">rhsm/</a>
<a href="secure">secure</a>
<a href="secure-20230704">secure-20230704</a>
<a href="secure-20230810">secure-20230810</a>
<a href="secure-20230817">secure-20230817</a>
<a href="secure-20230824">secure-20230824</a>
<a href="spooler">spooler</a>
<a href="spooler-20230704">spooler-20230704</a>
<a href="spooler-20230810">spooler-20230810</a>
<a href="spooler-20230817">spooler-20230817</a>
<a href="spooler-20230824">spooler-20230824</a>
<a href="sssd/">sssd/</a>
<a href="swtpm/">swtpm/</a>
<a href="tallylog">tallylog</a>
<a href="wtmp">wtmp</a>
</pre>
The test looks for the journal and then filters by the systemd unit for kubelet. In MicroShift, the kubelet is part of the microshift systemd unit.
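As a rough illustration of the difference described here, a collector could choose the unit to filter on from the cluster type. This is only a sketch, not code from this PR: journalUnit is a hypothetical helper, and the unit names "kubelet" and "microshift" are assumptions taken from this discussion.

```go
package main

import "fmt"

// journalUnit returns the systemd unit whose journal entries carry kubelet
// output. Hypothetical helper; the unit names are assumptions based on the
// discussion above.
func journalUnit(isMicroShift bool) string {
	if isMicroShift {
		// In MicroShift the kubelet runs inside the microshift service.
		return "microshift"
	}
	return "kubelet"
}

func main() {
	fmt.Println(journalUnit(false))
	fmt.Println(journalUnit(true))
}
```

Note that on MicroShift the selected unit would also contain the rest of the control plane output, so any kubelet-specific parsing would need to tolerate unrelated lines.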
Sounds like you need to fix this code then, not simply skip it.
8ba7f64
to
7e65118
Compare
7e65118
to
13857ff
Compare
/retest-required
@pacevedom: The following tests failed, say
/lgtm
/label acknowledge-critical-fixes-only
	return platform, fmt.Errorf("error checking MicroShift cluster: %v", err)
}
if isMicroShift {
	return platform, nil
This will be empty, not nil, correct? Are you sure that's what you want?
Yes, the platform is only needed when deploying on Azure, and that's not the case for MicroShift.
This has made obvious, but will not address shortcomings for addressing CI failures. I'm willing to allow it, but microshift failure resolution will suffer for lack of this. /approve holding for @pacevedom to acknowledge the shortcoming. You may release on acknowledgement without a fix if you wish.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, neisw, pacevedom The full list of commands accepted by this bot can be found here. The pull request process is described here
As we already discussed this over Slack, I'm posting a summary here. /hold cancel because of the reasons above.
/unhold
c3d0116
into
openshift:master