Confusing behaviour after monitoring is enabled #921
Comments
It is quite possible that some of these are issues in OpenShift itself rather than in CRC and the environment it sets up for OpenShift. However, there may be something in CRC that should be done differently to remediate the problem and leave the OpenShift cluster fully healthy, except perhaps for some alerts about there being only one node in the cluster.
This is part of a known issue:
is disabled for CRC as they do not provide functionality we need and would otherwise consume resources. Also see: https://access.redhat.com/documentation/en-us/red_hat_codeready_containers/1.0/html-single/release_notes_and_known_issues/index#metrics_are_disabled_by_default Take note that https://access.redhat.com/documentation/en-us/red_hat_codeready_containers/1.3/html-single/getting_started_guide/index#common-tasks_gsg does mention:
which affects the results from metrics. Note: we dropped the statement about the cluster state reporting "Healthy" when this is not the case. I believe we did so on the premise that a message would be added as part of the startup. Perhaps this got neglected or miscommunicated. I will bring this up again.
Thanks @gbraad for the explanation. Is there an easy way to enable the machine API and cluster version operators if I don't worry about the resource consumption that much, similar to the ability of enabling monitoring per https://access.redhat.com/documentation/en-us/red_hat_codeready_containers/1.0/html-single/release_notes_and_known_issues/index#metrics_are_disabled_by_default ? I'd like to have that CRC-based testing cluster as close to the real OCP installation as possible, as a basis for future investigation.
@adelton The machine config operator doesn't work with a single-node cluster (openshift/machine-config-operator#579). Enabling the cluster version operator is fairly simple, just using
But @praveenkumar, wouldn't enabling the CVO have unwanted side effects? It wants to pull the cluster into a certain state ...
@gbraad No, it shouldn't, because @adelton is already enabling the monitoring operator, which is what we disable. Enabling the CVO will also revert the cluster state that we change here: https://github.com/code-ready/snc/blob/master/snc.sh#L221-L255
Thank you. Should I expect it to appear at https://console-openshift-console.apps-crc.testing/settings/cluster/clusteroperators? The
alert is gone but I see https://console-openshift-console.apps-crc.testing/k8s/cluster/config.openshift.io~v1~ClusterVersion/version reporting Failing because of
And in OCM, in the Cluster operators listing, I see "version" Failing, with a link pointing to https://console-openshift-console.apps-crc.testing/k8s/cluster/config.openshift.io~v1~ClusterOperator/version which says 404 Not Found. Note the ClusterOperator/version, as opposed to the ClusterVersion/version URL which has some content in it.
@adelton This is expected: the CVO expects 3 instances of etcd to form a quorum, and for CRC this is not possible :(
Digging into it a bit more, it seems like |
For the record, running this (enable monitoring, and
I've searched around but did not find definitive cause (and workaround) for the |
@adelton You can't perform a cluster upgrade using CRC, since it is a single-node cluster and the machine config operator is not going to be supported for this. In a real cluster, the upgrade happens one node at a time across the worker/master nodes, rebooting each of them. If you really want to test upgrades, CRC might not be a good choice :(
Understood. Thank you.
@praveenkumar Then I suggest that https://access.redhat.com/documentation/en-us/red_hat_codeready_containers/1.3/html-single/getting_started_guide/index (and possibly other places) should explicitly mention that CRC does not support |
General information
crc setup
before starting it (Yes/No)? Yes
CRC version
crc version: 1.3.0+918756b OpenShift version: 4.2.10 (embedded in binary)
CRC status
CRC config
Empty output.
Host Operating System
Steps to reproduce
crc setup
crc start -p /tmp/pull.secret -m 16000
to give the VM enough memory to allow monitoring to work; also see "[BUG] Documentation should state that monitoring will not work unless memory is increased from default" #810
oc login -u kubeadmin
oc scale --replicas=1 statefulset --all -n openshift-monitoring; oc scale --replicas=1 deployment --all -n openshift-monitoring
based on https://code-ready.github.io/crc/#starting-monitoring-alerting-telemetry_gsg
Expected
https://console-openshift-console.apps-crc.testing/dashboards says "Cluster is healthy" and there are no alerts.
https://console-openshift-console.apps-crc.testing/settings/cluster/ lists Last Completed Version and the Update History as completed.
https://console-openshift-console.apps-crc.testing/settings/cluster/clusteroperators shows all cluster operators "green" with no error messages.
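The "all cluster operators green" expectation can also be checked from the command line rather than the console. The following is a minimal sketch, not part of the original report: `unhealthy_operators` is a hypothetical helper name, and it assumes the default table layout printed by `oc get clusteroperators` (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE), which may differ between OpenShift versions.

```shell
#!/bin/sh
# Sketch: list cluster operators that are not fully healthy.
# Assumption: the input is the default table from `oc get clusteroperators`
# with columns NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE.
# Fields are counted from the end of each line so that an empty VERSION
# column does not shift them.

# unhealthy_operators (hypothetical helper) reads the table on stdin and
# prints the name of every operator that is not
# Available=True, Progressing=False, Degraded=False.
unhealthy_operators() {
    awk 'NR > 1 && NF >= 5 {
        available   = $(NF - 3)
        progressing = $(NF - 2)
        degraded    = $(NF - 1)
        if (available != "True" || progressing != "False" || degraded != "False")
            print $1
    }'
}
```

After `oc login -u kubeadmin`, it could be used as `oc get clusteroperators | unhealthy_operators`; empty output would correspond to every operator showing as available in the console list.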
Actual
https://console-openshift-console.apps-crc.testing/dashboards says "Cluster is healthy" but it lists a bunch of alerts:
https://console-openshift-console.apps-crc.testing/settings/cluster/ shows
https://console-openshift-console.apps-crc.testing/settings/cluster/clusteroperators at the top shows message
However, all operators have Status listed as Available with green check mark, monitoring has Message "Successfully rolled out the stack.", machine-api has Message "-".
Overall, several things are unclear: what the status of the machine-api operator is, and why an alert reports it as down when the operator list shows it running and available; why the cluster version operator has the "has disappeared" alert and how to fix it; and why the message speaks about cluster operator monitoring not being rolled out when the monitoring operator says otherwise.
Logs
You can start crc with
crc start --log-level debug
to collect logs.
Please consider posting this on http://gist.github.com/ and post the link in the issue.
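To make the debug output easy to paste into a gist, it can be captured to a file while it is still shown on the terminal. This is a small sketch under assumptions: `capture_start_log`, `CRC_BIN`, and `LOG_FILE` are hypothetical names introduced here, and only the `crc start --log-level debug` invocation comes from the text above.

```shell
#!/bin/sh
# Sketch: run `crc start --log-level debug` and keep a copy of the output.
# CRC_BIN and LOG_FILE are hypothetical, overridable settings.
CRC_BIN="${CRC_BIN:-crc}"
LOG_FILE="${LOG_FILE:-crc-start-debug.log}"

# capture_start_log (hypothetical helper) passes any extra arguments through
# to `crc start`, merges stderr into stdout, and writes everything both to
# the terminal and to $LOG_FILE.
capture_start_log() {
    "$CRC_BIN" start --log-level debug "$@" 2>&1 | tee "$LOG_FILE"
}
```

For the reproduction in this issue, the call would be `capture_start_log -p /tmp/pull.secret -m 16000`, after which the contents of `crc-start-debug.log` can be posted.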