[enhancement] check dns pod state as part of crc status operation #3852

Open
adrianriobo opened this issue Sep 29, 2023 · 3 comments

@adrianriobo (Contributor)

This seems to be the root cause for #3851

When we initialize a MicroShift cluster we first ensure that the cluster state is running, then we deploy and expose a service to test connectivity. When we do this manually it works as expected, but when we run it through automation it fails with:

time="2023-09-28T13:44:03Z" level=error msg="Post \"http://gateway/hosts/add\": dial tcp: lookup gateway on 10.43.0.10:53: read udp 10.42.0.6:35808->10.43.0.10:53: read: connection refused"

The code for crc status on MicroShift checks oc get node but does not take into account the state of the pods running within the cluster.

In this scenario we expected DNS to be working, since crc status reports the cluster as running, but that is not the case, so it may be worthwhile to check the state of the DNS pods to ensure the cluster is actually in a running state.
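A minimal sketch of what such a check could look like with client-go; the openshift-dns namespace and the daemonset label selector below are assumptions based on default MicroShift manifests, not something verified against the crc codebase:

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// podsReady reports whether at least one pod matches the selector in the
// given namespace and every matching pod is Running with Ready=true.
func podsReady(ctx context.Context, client kubernetes.Interface, namespace, selector string) (bool, error) {
	pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		return false, err
	}
	if len(pods.Items) == 0 {
		return false, nil // nothing scheduled yet
	}
	for _, pod := range pods.Items {
		if pod.Status.Phase != corev1.PodRunning {
			return false, nil
		}
		ready := false
		for _, cond := range pod.Status.Conditions {
			if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue {
				ready = true
			}
		}
		if !ready {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// Assumption: MicroShift runs its DNS daemonset in openshift-dns with this label.
	ok, err := podsReady(context.Background(), client, "openshift-dns",
		"dns.operator.openshift.io/daemonset-dns=default")
	if err != nil {
		panic(err)
	}
	fmt.Println("dns ready:", ok)
}
```

From a shell, the equivalent quick check is simply listing the pods in openshift-dns and verifying they are Running and Ready before declaring the cluster healthy.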

@adrianriobo (Contributor, Author)

After adding a DNS check to the e2e tests, the route is now always added as expected, but we still end up having issues with the deployed test service, which ends in CreateContainerError due to:

Warning  FailedCreatePodSandBox  34m                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_httpd-example-6bf9c787d7-d4mnd_testproj_f468e5fe-b4e9-45b6-b6f6-856426565406_0(6e1a579fdb65e06338b7e1e68f09866369562a4888142d363fb377c770b9bdf2): error adding pod testproj_httpd-example-6bf9c787d7-d4mnd to CNI network "ovn-kubernetes": plugin type="ovn-k8s-cni-overlay" name="ovn-kubernetes" failed (add): failed to send CNI request: Post "http://dummy/": dial unix /var/run/ovn-kubernetes/cni//ovn-cni-server.sock: connect: no such file or directory

So we will probably need to check the state of the ovn pods as well; a sketch extending the previous check follows below.
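Following the same approach, the readiness gate could cover the ovn-kubernetes pods too. This sketch reuses podsReady from above; the openshift-ovn-kubernetes namespace and the app=ovnkube-node selector are assumptions taken from typical ovn-kubernetes manifests:

```go
// networkReady combines the DNS and OVN readiness checks (reusing podsReady
// from the sketch above). Namespaces and selectors are assumptions based on
// typical MicroShift/ovn-kubernetes manifests, not verified against crc.
func networkReady(ctx context.Context, client kubernetes.Interface) (bool, error) {
	dnsOK, err := podsReady(ctx, client, "openshift-dns",
		"dns.operator.openshift.io/daemonset-dns=default")
	if err != nil || !dnsOK {
		return false, err
	}
	return podsReady(ctx, client, "openshift-ovn-kubernetes", "app=ovnkube-node")
}
```

Gating crc status on both checks would have surfaced the missing ovn-cni-server.sock condition above instead of reporting the cluster as running.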

@gbraad (Contributor)

gbraad commented Apr 10, 2024

@adrianriobo is this still relevant?

@adrianriobo (Contributor, Author)

adrianriobo commented Apr 10, 2024

It is still relevant from the automation point of view. In the e2e tests we try to emulate the user experience, and with the delays between operations this is unlikely to happen (ending up with an unhealthy state that goes unnoticed because we do not check).

But if we explore the CI use case, which implies automation, this check should be in place.

Also, as we do for OCP, I guess it would be good to have the check from #4009 at the start to ensure that what we are delivering is functional.
