
[BUG] Fails to resolve nameserver when running own DNS server through Docker. #4144

Closed
phantomjinx opened this issue May 3, 2024 · 21 comments
Labels: kind/bug (Something isn't working), status/need triage

Comments

@phantomjinx

General information

  • OS: Linux (Fedora 38)
  • Hypervisor: KVM
  • Did you run crc setup before starting it (Yes/No)? Yes
  • Running CRC on: Baremetal-Server

CRC version

CRC version: 2.35.0+3956e8
OpenShift version: 4.15.10
Podman version: 4.4.4

CRC status

# Put `crc status --log-level debug` output here
DEBU CRC version: 2.35.0+3956e8                   
DEBU OpenShift version: 4.15.10                   
DEBU Podman version: 4.4.4                        
DEBU Running 'crc status'                         
crc does not seem to be setup correctly, have you run 'crc setup'?

CRC config

# Put `crc config view` output here
- consent-telemetry                     : no
- disk-size                             : 100
- kubeadmin-password                    : <my-password>
- memory                                : 32768
- nameserver                            : 192.168.200.1
- skip-check-crc-dnsmasq-file           : true
- skip-check-network-manager-config     : true

Host Operating System

# Put the output of `cat /etc/os-release` in case of Linux
NAME="Fedora Linux"
VERSION="38 (KDE Plasma)"
ID=fedora
VERSION_ID=38
VERSION_CODENAME=""
PLATFORM_ID="platform:f38"
PRETTY_NAME="Fedora Linux 38 (KDE Plasma)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:38"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f38/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=38
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=38
SUPPORT_END=2024-05-14
VARIANT="KDE Plasma"
VARIANT_ID=kde

Steps to reproduce

  1. crc setup -> reports crc is setup correctly
  2. crc start --log-level info --nameserver 192.168.200.1 -p crc-pull-secret.json

Expected

crc to validate pull-secret and continue starting ...

Actual

....
INFO Configuring shared directories               
INFO Check internal and public DNS query...       
INFO Check DNS query from host...                 
INFO Verifying validity of the kubelet certificates... 
INFO Starting kubelet service                     
INFO Waiting for kube-apiserver availability... [takes around 2min] 
INFO Waiting until the user's pull secret is written to the instance disk... 
Failed to update pull secret on the disk: Temporary error: pull secret not updated to disk (x202)

Logs

Before gathering the logs, as suggested, I have tried the following:

$ crc delete -f
$ crc cleanup
$ crc setup
$ crc start --log-level debug

Please consider posting the output of crc start --log-level debug on http://gist.github.com/ and post the link in the issue.

https://gist.github.com/phantomjinx/2e87e94860df04d4a0275bed88e52d19

phantomjinx added the kind/bug and status/need triage labels on May 3, 2024
@adrianriobo
Contributor

adrianriobo commented May 3, 2024

Hey, this one is already reported; can you check #4110? The fix is also coming in #4143.

adrianriobo added the resolution/duplicate (This issue or pull request already exists) label on May 3, 2024
@phantomjinx
Author

phantomjinx commented May 3, 2024

The workaround does not apply to me, as I routinely disable systemd-resolved and run my own DNS server through Docker.

So I am not convinced it can necessarily be closed, since my setup is different and the workaround is not really a resolution.

@phantomjinx
Author

phantomjinx commented May 3, 2024

Needless to say, I was running 2.27 and everything was working without a problem.

@phantomjinx
Author

If required, I can post the configuration of my /etc/sysconfig/iptables and docker ps.

praveenkumar changed the title from "[BUG]" to "[BUG] Fails to resolve nameserver when running own DNS server through Docker." on May 3, 2024
@cfergeau
Contributor

cfergeau commented May 3, 2024

@praveenkumar should be able to provide more details next week.

My understanding is that there are two bugs.
crc changes the guest's resolv.conf to something like:

search crc.testing
nameserver 192.168.130.11
nameserver 192.168.130.1

This allows the processes running inside the cluster to resolve (for example) api.crc.testing.
This resolv.conf modification in the guest broke when we upgraded OpenShift to 4.15.

This means that with 4.15 bundles, the guest relies on the host DNS resolution to resolve *.crc.testing and *.apps-crc.testing to the guest external IP, which is 192.168.130.11. Bug #4110 is one situation where the host DNS resolution of *.crc.testing does not work as expected, and this causes cluster startup failures.

Given your custom DNS configuration, I would guess your host knows nothing about *.crc.testing? If you can configure it to return 192.168.130.11 for *.crc.testing and *.apps-crc.testing, this might help you get further.
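
For example, with BIND this could look roughly like the sketch below; the zone layout, file path and SOA values are only illustrative and would need adapting to the existing named configuration, and an equivalent crc.testing zone with api and api-int A records pointing at 192.168.130.11 would be needed as well:

// named.conf fragment (illustrative)
zone "apps-crc.testing" IN {
    type master;
    file "/var/named/db.apps-crc.testing";
};

; /var/named/db.apps-crc.testing (illustrative SOA/NS values)
$TTL 300
@   IN SOA ns.apps-crc.testing. admin.apps-crc.testing. ( 2024050301 3600 600 86400 300 )
    IN NS  ns.apps-crc.testing.
ns  IN A   192.168.200.1
*   IN A   192.168.130.11   ; every *.apps-crc.testing resolves to the guest IP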

@praveenkumar
Member

I can see that you skipped the crc-dnsmasq file and also the network-manager configuration. You are also passing a nameserver, which I hope is the IP of the container in which you are running dnsmasq. Before the 4.15 bundle everything worked because we were using openshift-sdn, which was not making changes to the network or restarting NetworkManager in the VM. That is why the VM's resolv.conf stayed as in your case and worked fine up to the 4.14 bundle:

search crc.testing
nameserver 192.168.200.1
nameserver 192.168.130.1

But if you log in to the VM (in the current case), you will see only the following:

nameserver 192.168.130.1
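
To check this yourself from the host, something like the following should work (assuming the libvirt preset and the default key location under ~/.crc/machines/crc; older setups may have an id_rsa key instead):

# 192.168.130.11 is the guest IP on the crc libvirt network
ssh -i ~/.crc/machines/crc/id_ecdsa core@192.168.130.11 cat /etc/resolv.conf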

Try what @cfergeau suggested in #4144 (comment) and see if that works.

@phantomjinx
Author

So 192.168.200.1 is my DNS container, piedharrier (running bind):

            "Networks": {
                "skynet": {
                    "IPAMConfig": {
                        "IPv4Address": "192.168.200.1",
                        "IPv6Address": "2001:8b0:1103:6ed4:cccc:1111::9"
                    },
                    "Links": null,
                    "Aliases": [
                        "f816a3454037",
                        "piedharrier"
                    ],

piedharrier is already capable of resolving the cluster:

[root@piedharrier /]# ping api.crc.testing
PING api.crc.testing (192.168.130.11) 56(84) bytes of data.
64 bytes from 192.168.130.11 (192.168.130.11): icmp_seq=1 ttl=63 time=0.366 ms
64 bytes from 192.168.130.11 (192.168.130.11): icmp_seq=2 ttl=63 time=0.340 ms
^C
--- api.crc.testing ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.340/0.353/0.366/0.013 ms

@praveenkumar
Member

What about other domains like api-int.crc.testing or foo.apps-crc.testing, etc.?
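
For example, querying the container directly from the host (assuming 192.168.200.1 is reachable from there):

# both would be expected to return 192.168.130.11
dig +short @192.168.200.1 api-int.crc.testing
dig +short @192.168.200.1 foo.apps-crc.testing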

@phantomjinx
Author

phantomjinx commented May 3, 2024

[Screenshot: Screenshot_20240503_165414]

So I add aliases whenever I install a new app, and they all alias api.crc.testing.
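
In zone-file terms that pattern is roughly the following (the record names here are just examples; the trailing dot on the target matters, since api.crc.testing sits outside the apps-crc.testing zone):

; in the apps-crc.testing zone (illustrative record names)
console-openshift-console  IN CNAME api.crc.testing.
oauth-openshift            IN CNAME api.crc.testing.
my-new-app                 IN CNAME api.crc.testing.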

@praveenkumar
Member

@phantomjinx Thanks for all the info, I am working on resolving this issue.

@praveenkumar
Member

@phantomjinx can you download the artifact from https://github.com/crc-org/crc/actions/runs/8985004693 and extract it (it contains the crc binary and RPM), then use that crc binary and see if that works for you?

@phantomjinx
Author

Unfortunately, crc could not start ...

...
INFO Using bundle path /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle 
INFO Checking if running as non-root              
INFO Checking if running inside WSL2              
INFO Checking if crc-admin-helper executable is cached 
WARN Preflight checks failed during `crc start`, please try to run `crc setup` first in case you haven't done so yet 
unexpected version of the crc-admin-helper executable: crc-admin-helper-linux version mismatch: 0.5.2 expected but 0.0.12 found in the cache
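
For reference, the cached helper version can be confirmed directly (assuming the default ~/.crc/bin cache location):

# reports 0.0.12 here, while this crc build expects 0.5.2
~/.crc/bin/crc-admin-helper-linux --version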

@praveenkumar
Member

@phantomjinx you need to run crc setup first, before crc start.

@phantomjinx
Author

@praveenkumar I ran crc setup first (this time separately) and it completed with no problem. Then I ran crc start and got the same error:

[phantomjinx@microraptor:/home/openshift/bin] 20s $ crc setup
INFO Using bundle path /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle 
INFO .... <snip>
INFO Checking if libvirt 'crc' network is active  
INFO Checking if CRC bundle is extracted in '$HOME/.crc' 
INFO Checking if /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle exists 
INFO Getting bundle for the CRC executable        
INFO Downloading bundle: /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle... 
4.81 GiB / 4.81 GiB [--------------------------------------------------------------------------------------------------] 100.00% 15.34 MiB/s
INFO Uncompressing /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle 
crc.qcow2:  20.41 GiB / 20.41 GiB [------------------------------------------------------------------------------------------------] 100.00%
oc:  149.79 MiB / 149.79 MiB [-----------------------------------------------------------------------------------------------------] 100.00%
Your system is correctly setup for using CRC. Use 'crc start' to start the instance

[phantomjinx@microraptor:/home/openshift/bin] 17m50s $ start-crc
Starting with current configuration ... continue? y
Changes to configuration property 'memory' are only applied when the CRC instance is started.
If you already have a running CRC instance, then for this configuration change to take effect, stop the CRC instance with 'crc stop' and restart it with 'crc start'.
Changes to configuration property 'disk-size' are only applied when the CRC instance is started.
If you already have a running CRC instance, then for this configuration change to take effect, stop the CRC instance with 'crc stop' and restart it with 'crc start'.
Successfully configured nameserver to 192.168.200.1
Successfully configured skip-check-crc-dnsmasq-file to true
Successfully configured skip-check-network-manager-config to true
Successfully configured kubeadmin-password to ########
INFO Using bundle path /home/phantomjinx/.crc/cache/crc_libvirt_4.15.10_amd64.crcbundle 
INFO Checking if running as non-root              
INFO Checking if running inside WSL2              
INFO Checking if crc-admin-helper executable is cached 
WARN Preflight checks failed during `crc start`, please try to run `crc setup` first in case you haven't done so yet 
unexpected version of the crc-admin-helper executable: crc-admin-helper-linux version mismatch: 0.5.2 expected but 0.0.12 found in the cache

Just to avoid any confusion, I execute a script, start-crc, which first runs the crc config ... options.
Here it is:

#!/bin/bash

export CRCHOME="/home/openshift/bin"

help() {
  echo "$0 [-c] [-h]"
  echo "    -c: clear out and reset as new"
  echo "    -h: show this help"
  exit 1
}

while getopts ":ch" opt ; do
  case "${opt}" in
    c) CLEAR=1 ;;
    h) help ;;
    \?) help ;;
  esac
done

if [ "${CLEAR}" == "1" ]; then
  echo "Clearing out and resetting ..."
  crc delete
  rm -rf ${HOME}/.crc/*
fi

read -p "Starting with current configuration ... continue? " -n 1 -r
echo    # (optional) move to a new line
if [[ ${REPLY} =~ ^[Yy]$ ]]; then
  START_CRC=1
fi

if [ "${START_CRC}" != 1 ]; then
  echo "Not starting crc... Exiting"
  exit 0
fi

. ${CRCHOME}/crc-configure

if [ "${CLEAR}" == "1" ]; then
  echo "Running crc setup ..."
  crc setup
fi

crc start \
  --log-level info \
  --nameserver 192.168.200.1 \
  -p crc-pull-secret.json

if [ $? == 0 ]; then
  echo "Restarting iptables ..."
  sudo service iptables restart
fi

@praveenkumar
Member

@phantomjinx Did you use the same path (export CRCHOME="/home/openshift/bin") for crc setup? That should have fixed this issue. Make sure you check with which crc for both commands to confirm the same binary is used.
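
For example, in the same shell that runs the script:

which crc      # the crc binary that will actually be invoked
type -a crc    # every crc on the PATH, in resolution order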

@phantomjinx
Author

@praveenkumar

Yes. I have been using CRCHOME and explicitly setting it.

So I have successfully started up a cluster with the downloaded artifact, but in order to get around the version checks I had to compile the following separately and drop them into the .crc/bin directory:

  • crc-admin-helper-linux (the downloaded binary reports its version as 0.0.12 - compiling it returned the correct version, 0.5.2)
  • crc-driver-libvirt (the downloaded binary reports its version as 0.13.5 - compiling it returned the correct version, 0.13.7)

So the upshot is that whatever was changed in the downloaded snapshot did fix the DNS issues and allowed the pull secret to be correctly installed.

@cfergeau
Contributor

For what it's worth, I've just checked that if I download https://developers.redhat.com/content-gateway/file/pub/openshift-v4/clients/crc/2.35.0/crc-linux-amd64.tar.xz , after running crc setup, I have the right version:

$ ./crc version
CRC version: 2.35.0+3956e8
OpenShift version: 4.15.10
Podman version: 4.4.4

$ ./crc setup
[...]
INFO Checking if crc-admin-helper executable is cached 
INFO Caching crc-admin-helper executable          
[...]

$ ~/.crc/bin/crc-admin-helper-linux --version
admin-helper version 0.5.2

@cfergeau
Contributor

cfergeau commented May 14, 2024

(Checking the GH Actions artifact now...) It seems to be embedding an older version indeed; we'd need to fix that :)

@cfergeau
Contributor

https://github.com/crc-org/crc/blob/main/images/rpmbuild/Containerfile.in <- these versions need to be updated.

cfergeau added a commit to cfergeau/crc that referenced this issue May 14, 2024
`make test-rpmbuild` uses a container image with preinstalled
admin-helper/machine-driver-libvirt RPMs to generate a `crc` binary/rpm
embedding these binaries.
However, the versions used no longer match what crc expects, which
is causing issues.

This was reported in crc-org#4144

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
praveenkumar pushed a commit that referenced this issue May 15, 2024
`make test-rpmbuild` uses a container image with preinstalled
admin-helper/machine-driver-libvirt RPMs to generate a `crc` binary/rpm
embedding these binaries.
However, the versions used no longer match what crc expects, which
is causing issues.

This was reported in #4144

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
@praveenkumar
Member

@phantomjinx We just released 2.36.0, which has the fix. Can you try that version and close this issue if it works for you?

@phantomjinx
Author

2.36.0 is good. Thanks for sorting this.
