[BUG] OKD 4.12 does not start because of missing kubeadmin-password file #3635

Closed
VGerris opened this issue May 5, 2023 · 25 comments
Labels
kind/bug Something isn't working preset/openshift-okd Unsupported configuration priority/minor size/M tags/good first issue Good for newcomers tags/help wanted Extra attention is needed

Comments

@VGerris
Contributor

VGerris commented May 5, 2023

General information

  • OS: Linux
  • Hypervisor: KVM
  • Did you run crc setup before starting it (Yes)?
  • Running CRC on: Laptop

CRC version

crc version
CRC version: 2.18.0+4ea3a1
OpenShift version: 4.12.13
Podman version: 4.4.1

CRC status

crc status --log-level debug
DEBU CRC version: 2.18.0+4ea3a1                   
DEBU OpenShift version: 4.12.13                   
DEBU Podman version: 4.4.1                        
DEBU Running 'crc status'                         
CRC VM:          Running
OpenShift:       Stopped (v4.12.0-0.okd-2023-02-18-033438)
RAM Usage:       5.292GB of 25.22GB
Disk Usage:      16.23GB of 32.68GB (Inside the CRC VM)
Cache Usage:     38.11GB
Cache Directory: /home/user/.crc/cache

CRC config

crc status --log-level debug
DEBU CRC version: 2.18.0+4ea3a1                   
DEBU OpenShift version: 4.12.13                   
DEBU Podman version: 4.4.1                        
DEBU Running 'crc status'                         
CRC VM:          Running
OpenShift:       Stopped (v4.12.0-0.okd-2023-02-18-033438)
RAM Usage:       5.292GB of 25.22GB
Disk Usage:      16.23GB of 32.68GB (Inside the CRC VM)
Cache Usage:     38.11GB
Cache Directory: /home/user/.crc/cache

Host Operating System

cat /etc/os-release
NAME="Fedora Linux"
VERSION="38 (Workstation Edition)"
ID=fedora
VERSION_ID=38
VERSION_CODENAME=""
PLATFORM_ID="platform:f38"
PRETTY_NAME="Fedora Linux 38 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:38"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f38/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=38
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=38
SUPPORT_END=2024-05-14
VARIANT="Workstation Edition"
VARIANT_ID=workstation

Steps to reproduce

  1. download latest crc binary
  2. crc config set preset okd
  3. crc setup ( which downloads okd 4.12)
  4. crc start

Expected

Successful setup and start.

Actual

Fails with:

Failed to update kubeadmin user password: Cannot generate the kubeadmin user password: open /home/user/.crc/machines/crc/kubeadmin-password: no such file or directory

Indeed, the directory and the file do not exist, which seems to be the cause of the issue.
If I create them with an empty file, the start continues and then complains that the machine qemu-1-crc already exists, which seems to be the crc machine. When I remove that machine, the same problem occurs.

The issue does not seem to happen with OCP 4.12, only OKD.

Logs


INFO Checking if CRC bundle is extracted in '$HOME/.crc' 
INFO Checking if /home/user/.crc/cache/crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64.crcbundle exists 
INFO Getting bundle for the CRC executable        
INFO Downloading bundle: /home/user/.crc/cache/crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64.crcbundle... 
Getting image source signatures
Copying blob 0d76ba17a55c done  
Copying config 9d8cdd8dcc done  
Writing manifest to image destination
Storing signatures
INFO Extracting the image bundle layer...         
crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64.crcbundle:  3.86 GiB / 3.86 GiB [-----------------------------------------------------------------] 100.00%
INFO Verifying the bundle signature...            
INFO Uncompressing /home/user/.crc/cache/crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64.crcbundle 
crc.qcow2:  14.26 GiB / 14.26 GiB [--------------------------------------------------------------------------------------------------------------------] 100.00%
oc:  124.65 MiB / 124.65 MiB [-------------------------------------------------------------------------------------------------------------------------] 100.00%
Your system is correctly setup for using CRC. Use 'crc start' to start the instance
[user@fedora Downloads]$ crc start

Before gathering the logs, try the following to see if it fixes your issue:

$ crc delete -f
$ crc cleanup
$ crc setup
$ crc start --log-level debug

Please consider posting the output of crc start --log-level debug on http://gist.github.com/ and post the link in the issue.

Snippet of debug log:

DEBU error: Temporary error: pull secret not updated to disk - sleeping 2s
DEBU retry loop: attempt 41
DEBU Running SSH command:
DEBU SSH command succeeded
DEBU error: Temporary error: pull secret not updated to disk - sleeping 2s
DEBU retry loop: attempt 42
DEBU Running SSH command:
DEBU SSH command succeeded

then

Failed to update kubeadmin user password: Cannot generate the kubeadmin user password: open /home/user/.crc/machines/crc/kubeadmin-password: no such file or directory


@VGerris VGerris added kind/bug Something isn't working status/need triage labels May 5, 2023
@VGerris
Contributor Author

VGerris commented May 5, 2023

Another setup and start:


INFO Checking if /home/user/.crc/cache/crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64.crcbundle exists 
Your system is correctly setup for using CRC. Use 'crc start' to start the instance
[user@fedora Downloads]$ crc start
INFO Loading bundle: crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64... 
INFO A CRC VM for OKD 4.12.0-0.okd-2023-02-18-033438 is already running 
Started the OpenShift cluster.

The server is accessible via web console at:
  

Log in as administrator:
  Username: kubeadmin
  Password: 

Log in as user:
  Username: developer
  Password: developer

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc login -u developer 


NOTE:
This cluster was built from OKD - The Community Distribution of Kubernetes that powers Red Hat OpenShift.
If you find an issue, please report it at https://github.com/openshift/okd

So no valid URL is shown and OpenShift does not start:


crc status --log-level debug
DEBU CRC version: 2.18.0+4ea3a1                   
DEBU OpenShift version: 4.12.13                   
DEBU Podman version: 4.4.1                        
DEBU Running 'crc status'                         
CRC VM:          Running
OpenShift:       Stopped (v4.12.0-0.okd-2023-02-18-033438)
RAM Usage:       5.368GB of 25.22GB
Disk Usage:      16.25GB of 32.68GB (Inside the CRC VM)
Cache Usage:     38.11GB
Cache Directory: /home/user/.crc/cache

@VGerris
Contributor Author

VGerris commented May 5, 2023

Workaround: set an arbitrary password in the kubeadmin-password file before start. For some reason it will then fail to update kubeconfig:

INFO Adding crc-admin and crc-developer contexts to kubeconfig... 
ERRO Cannot update kubeconfig: open : no such file or directory 
Started the OpenShift cluster.

The server is accessible via web console at:
  

Log in as administrator:
  Username: kubeadmin
  Password: 

Log in as user:
  Username: developer
  Password: developer

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc login -u developer 


NOTE:
This cluster was built from OKD - The Community Distribution of Kubernetes that powers Red Hat OpenShift.
If you find an issue, please report it at https://github.com/openshift/okd

and does not print the usual login links.

This should work with the content of the kubeadmin-password file as password:

oc login -u kubeadmin --server=https://api.crc.testing:6443
The server uses a certificate signed by an unknown authority.
You can bypass the certificate check, but any data you send to the server could be intercepted by others.
Use insecure connections? (y/n): y

WARNING: Using insecure TLS client config. Setting this option is not supported!

Authentication required for https://api.crc.testing:6443 (openshift)
Console URL: https://api.crc.testing:6443/console
Username: kubeadmin
Password: 
Login successful.

You have access to 66 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "default".

The issue seems to be that the kubeadmin-password file is not created and, on top of that, that the kubeconfig file cannot be found, although there is one in multiple subdirectories:

locate kubeconfig
/home/user/.crc/cache/crc_okd_libvirt_4.12.0-0.okd-2023-02-18-033438_amd64/kubeconfig
/home/user/.crc/machines/crc/kubeconfig

This all works as expected on another machine with 4.11 okd and an older crc version.
I am happy to test if someone has a proposed fix, thank you.

FYI, I did try a full clean and reinstall multiple times with no success, but somehow OCP 4.12.13 seemed to work without issues.

@VGerris VGerris changed the title [BUG] [BUG] OKD 4.12 does not start because of missing kubeadmin-password file and/or kubeconfig write issue May 5, 2023
@praveenkumar
Member

Failed to update kubeadmin user password: Cannot generate the kubeadmin user password: open /home/user/.crc/machines/crc/kubeadmin-password: no such file or directory

@VGerris This file is created during crc start. Maybe you have restricted permissions on your home directory and crc is not able to create that file.

@thanosz

thanosz commented May 8, 2023

I am also experiencing the same problem

cfergeau added a commit to cfergeau/crc that referenced this issue May 11, 2023
It's only done for the openshift preset at the moment, but okd also
expects a kubeadmin file to be created.
This should fix part of crc-org#3635
@cfergeau
Contributor

cfergeau commented May 11, 2023

This branch should help, some codepaths are only used for openshift bundles when they should be used for both openshift and okd.

https://github.com/cfergeau/crc/commits/okd

@bergner

bergner commented May 12, 2023

I tried both https://github.com/cfergeau/crc "okd" branch and https://github.com/crc-org/crc HEAD of the master branch. There are some issues with both, but the fixes in the "okd" branch gave me some good hints on what to fix in the master branch. Bug #2857 also causes crc start with the okd preset to fail with the errors described in this ticket. The fixes proposed by @cfergeau introduce an IsOKD() method in CrcBundleInfo and then replace various occurrences of bundleInfo.IsOpenShift() with bundleInfo.IsOpenShift() || bundleInfo.IsOKD(). The problem caused by the linked issue is exactly the same: an if statement involving IsOpenShift().

Given that the "okd" preset isn't receiving all that much love from maintainers, would it perhaps make more sense to have bundleInfo.IsOpenShift() check for both the "openshift" and "okd" presets? Then people who contribute to CRC development don't have to remember to always write bundleInfo.IsOpenShift() || bundleInfo.IsOKD().

func (bundle *CrcBundleInfo) IsOpenShift() bool {
    preset := bundle.GetBundleType()
    return preset == crcPreset.OpenShift || preset == crcPreset.OKD
}

With that fix in current HEAD the "okd" preset seems to work ok.
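To make the effect of that change concrete, here is a minimal, self-contained sketch. The Preset and BundleInfo types below are simplified stand-ins invented for illustration, not the real crc types, and the "microshift" preset value is an assumption used only to show a preset that should not match.

```go
package main

import "fmt"

// Preset is a hypothetical stand-in for crc's preset type.
type Preset string

const (
	OpenShift  Preset = "openshift"
	OKD        Preset = "okd"
	MicroShift Preset = "microshift" // assumed value, for contrast only
)

// BundleInfo is a simplified stand-in for CrcBundleInfo.
type BundleInfo struct {
	Type Preset
}

// IsOpenShift reports true for both the "openshift" and "okd" presets,
// mirroring the proposal in the comment above.
func (b *BundleInfo) IsOpenShift() bool {
	return b.Type == OpenShift || b.Type == OKD
}

func main() {
	for _, p := range []Preset{OpenShift, OKD, MicroShift} {
		b := &BundleInfo{Type: p}
		fmt.Printf("%s: IsOpenShift=%v\n", p, b.IsOpenShift())
	}
}
```

With this shape, every existing caller of IsOpenShift() picks up the OKD case without being touched.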

@gbraad
Contributor

gbraad commented May 12, 2023

I just wanted to add, but already see people noticed: this concerns OKD

We do not provide support or code changes for OKD. This is based on community support. So, if you can, please provide a PR to fix this.

The reason for these codepaths is that this was originally maintained as a fork. We have since backported those changes due to a lack of interest in, and work on, the fork. We wanted to prevent them from affecting our critical flow of delivering the OCP-based release.

@gbraad gbraad added tags/help wanted Extra attention is needed tags/good first issue Good for newcomers size/M priority/minor preset/openshift-okd Unsupported configuration and removed status/need triage labels May 12, 2023
bergner pushed a commit to bergner/crc that referenced this issue May 12, 2023
There are numerous places in the CRC code base that calls IsOpenShift()
to run logic that is not intended for Podman setups or other more
limited setups. Using the "okd" preset gives you something that is
very close to the "openshift" preset.

CrcBundleInfo's IsOpenShift() now treats the "okd" preset as if it was
OpenShift. This means we can avoid littering the code with checks for
both OpenShift and OKD in a bunch of places, which also reduces the risk
of forgetting to check for OKD.

Fixes crc-org#3635
@gbraad
Contributor

gbraad commented May 12, 2023

I think in this case it might have been a forgotten case due to the addition of Microshift. Anyway, it runs the tests now. Not sure if this will be included in next week's release.

@cfergeau
Contributor

I tried both https://github.com/cfergeau/crc "okd" branch and https://github.com/crc-org/crc HEAD of the master branch.

Thanks for testing/improving my branch! :)

@VGerris
Contributor Author

VGerris commented May 24, 2023

I can confirm okd 4.12 works as expected by building : https://github.com/crc-org/crc .(UPDATE - does not start)
I had to do a crc cleanup because I got :

INFO Loading bundle: crc_okd_libvirt_4.12.0-0.okd-2023-04-16-041331_amd64... 
Error loading machine:

after cleanup I ran setup and started, but :

[ubuntu@studiox360g5-lan crc]$ ./out/linux-amd64/crc start
INFO Loading bundle: crc_okd_libvirt_4.12.0-0.okd-2023-04-16-041331_amd64... 
INFO A CRC VM for OKD 4.12.0-0.okd-2023-04-16-041331 is already running 
Cannot create cluster configuration: Error reading kubeadmin password from bundle open /home/ubuntu/.crc/machines/crc/kubeadmin-password: no such file or directory

right before that at start :

INFO Waiting for kube-apiserver availability... [takes around 2min] 
Error waiting for apiserver: Temporary error: ssh command error:
command : timeout 5s oc get nodes --context admin --cluster crc --kubeconfig /opt/kubeconfig
err     : Process exited with status 1

so same issue but now with an ssh time out before.

@praveenkumar there are no permission issues.

Again, the workaround I described fixes it so it seems that kubeadmin file is still not being created?
I got the timeout again with ssh so I needed to start twice, then I get the usual output:


./out/linux-amd64/crc start
INFO Loading bundle: crc_okd_libvirt_4.12.0-0.okd-2023-04-16-041331_amd64... 
INFO A CRC VM for OKD 4.12.0-0.okd-2023-04-16-041331 is already running 
Started the OpenShift cluster.

The server is accessible via web console at:
  https://console-openshift-console.apps-crc.testing

Log in as administrator:
  Username: kubeadmin
  Password: hallo

Log in as user:
  Username: developer
  Password: developer

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc login -u developer https://api.crc.testing:6443


NOTE:
This cluster was built from OKD - The Community Distribution of Kubernetes that powers Red Hat OpenShift.
If you find an issue, please report it at https://github.com/openshift/okd

thanks

@VGerris
Contributor Author

VGerris commented May 24, 2023

scratch that, it is down. I can try the other branch but would appreciate it if this can be looked into further, especially if nothing specific is needed for OKD. Thank you.

@VGerris
Contributor Author

VGerris commented May 25, 2023

Apologies for writing here again, but using the 2.19 version of crc works if one does something like:
echo hello > /home/<youruser>/.crc/machines/crc/kubeadmin-password
after crc setup.

So if there is a simple way in the code to make that happen, I think it is OK.
Current HEAD doesn't work for me, so I hope it does not get worse.
I have no time to make a PR at the moment, but happy to help testing if needed.
thanks

@cfergeau
Contributor

func getClusterConfig(bundleInfo *bundle.CrcBundleInfo) (*types.ClusterConfig, error) {
    if !bundleInfo.IsOpenShift() {
        return &types.ClusterConfig{
            ClusterType: bundleInfo.GetBundleType(),
            ProxyConfig: &network.ProxyConfig{},
        }, nil
    }
    kubeadminPassword, err := cluster.GetKubeadminPassword()
    if err != nil {
        return nil, fmt.Errorf("Error reading kubeadmin password from bundle %v", err)
    }

still needs to be fixed.

@cfergeau cfergeau reopened this May 25, 2023
@cfergeau cfergeau changed the title [BUG] OKD 4.12 does not start because of missing kubeadmin-password file and/or kubeconfig write issue [BUG] OKD 4.12 does not start because of missing kubeadmin-password file May 25, 2023
@VGerris
Contributor Author

VGerris commented May 25, 2023

I made a PR :
#3679

https://github.com/crc-org/crc/pull/3679/files

I am not sure if it can be further improved given :
main...bergner:crc:bug-3635-GetBundleType-okd
where a similar change is made.

N.B. Note the error I got before, not sure where that comes from.

@DaveWK

DaveWK commented May 27, 2023

Just ran into this and assumed it was an issue with the ssl certs being too old.

Would it be possible for crc to print the actual error instead of what I'd consider a "red herring" about ssl certs expiry? Only found this issue because of enabling --debug logging, but Cannot create cluster configuration: Error reading kubeadmin password from bundle open C:\<User dir>\.crc\machines\crc\kubeadmin-password: The system cannot find the file specified. seems like something that should at least be at WARN

@bergner

bergner commented May 28, 2023

There are several occurrences of IsOpenShift() checks that were problematic on OKD, which is why I opted to have IsOpenShift() check for both the "openshift" and "okd" presets in my PR rather than sprinkle || bundle.IsOKD() in lots of places. It reduces the risk of people forgetting to check for the OKD case. As comments in my PR allude to, though, a semantically better way of dealing with this is to have capability-based checks rather than preset-based checks, e.g. NeedsKubeadminPassword and NeedsSomethingElse rather than IsOpenShift and IsOKD.
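A capability-based check might be sketched roughly like this. Everything here is hypothetical: the Preset type, the capabilities table and its values are assumptions for illustration (the "microshift" entry in particular), and NeedsKubeadminPassword is only the name floated above, not an existing crc function.

```go
package main

import "fmt"

// Preset is a hypothetical stand-in for crc's preset type.
type Preset string

const (
	OpenShift  Preset = "openshift"
	OKD        Preset = "okd"
	MicroShift Preset = "microshift" // assumed value
)

// capabilities describes what a preset needs, independent of its name.
type capabilities struct {
	NeedsKubeadminPassword bool
}

// presetCapabilities is the single place where presets map to behaviour.
// The values are assumptions for this sketch.
var presetCapabilities = map[Preset]capabilities{
	OpenShift:  {NeedsKubeadminPassword: true},
	OKD:        {NeedsKubeadminPassword: true},
	MicroShift: {NeedsKubeadminPassword: false},
}

// NeedsKubeadminPassword answers the question callers actually care about.
// Unknown presets fall back to the zero value (false).
func NeedsKubeadminPassword(p Preset) bool {
	return presetCapabilities[p].NeedsKubeadminPassword
}

func main() {
	for _, p := range []Preset{OpenShift, OKD, MicroShift} {
		fmt.Printf("%s needs kubeadmin password: %v\n", p, NeedsKubeadminPassword(p))
	}
}
```

Adding a new preset then means adding one table entry rather than auditing every preset comparison in the code base.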

@VGerris
Contributor Author

VGerris commented Jun 2, 2023

I just installed the recently released 2.20. This specific issue is resolved, but OKD does not start with many errors like :

Temporary error: ssh command error:
command : timeout 5s oc get nodes --context admin --cluster crc --kubeconfig /opt/kubeconfig
err     : Process exited with status 1

I wrote earlier what I tried to fix this and where I got stuck. I am happy to help out if someone can bring me up to speed with how the bundles work and are created so I can look further into this. pm me at github_username_at_gmail_. Thanks.

@cfergeau
Contributor

cfergeau commented Jun 2, 2023

Are you using nested virtualization?

@VGerris
Contributor Author

VGerris commented Jun 2, 2023

Hi, no, I just run it as is on Fedora 38. By the VM I mean the crc machine created by the installer.

@DaveWK

DaveWK commented Jun 2, 2023

I was able to get crc w/ okd to work on fedora 38 (and windows) after building the version w/ your fix -- thanks!

@VGerris timeouts could be related to it being too slow during the bootstrapping; what are you using for CPU/Memory config on crc? The defaults are a bit too low IMO. I set it for 8 cpu and 16 gb ram, but something closer to 4 cpu and 8 gb ram is probably a more reasonable minimum than the current default. After you change the defaults you will need to run crc delete and crc setup again to rebuild the vm.

If that doesn't work (and back to my original comment) try doing a fresh VM setup/init using the --debug flag since as I stated in my comment, the actual error/exception that is causing an issue is not being logged properly at an ERROR/WARN log level.

@VGerris
Contributor Author

VGerris commented Jun 2, 2023

hi,

I looked into it again, it is not a resource issue it is the bug I encountered and mentioned in the linked bug:
I looked further into the issue of the server starting and found etcd also to be down. That turned out to be similar to : #1786 . In my case I changed 192.168.126.11 to 192.168.130.11 in /etc/kubernetes/manifests/etcd-pod.yaml . After a restart of the kubelet service, etcd started and kube-apiserver too checked by crictl ps.

#3679 (comment)

login : ssh -i /home//.crc/machines/crc/id_ecdsa -p 22 core@192.168.130.11
Changing the ip in the file and restarting kubelet service makes it work.
So perhaps that IP bug needs to be revisited?
I also noticed I cannot login as kubeadmin as printed by kubeadmin , developer works.
After a while, oauth and kubeapi server and marketplace operator restart.
This happens with 6 vcpu, also on default 4 ( I have 16000 mem and never had issues with 4 cpus).

I am unable to get it to start and work again ( now on 4.13 OKD 2.20 of crc)

oc logs etcd-crc-cd8kd-master-0 -n openshift-etcd --kubeconfig=/opt/kubeconfig

[update]
problem now is : {"level":"warn","ts":"2023-06-02T14:58:55.088Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.217.0.39:36760","server-name":"","error":"remote error: tls: bad certificate"}
{"level":"warn","ts":"2023-06-02T14:58:55.232Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.217.0.39:36766","server-name":"","error":"remote error: tls: bad certificate"}
{"level":"warn","ts":"2023-06-02T14:58:55.247Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"192.168.126.11:38754","server-name":"","error":"remote error: tls: bad certificate"}
{"level":"warn","ts":"2023-06-02T14:58:55.832Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.217.0.2:57670","server-name":"","error":"remote error: tls: bad certificate"}
[/update]
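To see at a glance which peers etcd is rejecting, the JSON log lines can be tallied per remote host. The sketch below is only an aid for reading such logs: rejectedByHost is a hypothetical helper, and the field names are taken from the log lines pasted above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net"
	"strings"
)

// etcdLogLine holds the fields of interest from the log lines above.
type etcdLogLine struct {
	Msg        string `json:"msg"`
	RemoteAddr string `json:"remote-addr"`
	Error      string `json:"error"`
}

// rejectedByHost (hypothetical helper) counts "rejected connection"
// entries per remote host, ignoring the port and any non-JSON lines.
func rejectedByHost(logs string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		line := sc.Text()
		start := strings.Index(line, "{") // tolerate text before the JSON
		if start < 0 {
			continue
		}
		var entry etcdLogLine
		if err := json.Unmarshal([]byte(line[start:]), &entry); err != nil {
			continue
		}
		if entry.Msg != "rejected connection" {
			continue
		}
		host, _, err := net.SplitHostPort(entry.RemoteAddr)
		if err != nil {
			host = entry.RemoteAddr
		}
		counts[host]++
	}
	return counts
}

func main() {
	logs := `{"level":"warn","msg":"rejected connection","remote-addr":"10.217.0.39:36760","error":"remote error: tls: bad certificate"}
{"level":"warn","msg":"rejected connection","remote-addr":"192.168.126.11:38754","error":"remote error: tls: bad certificate"}`
	for host, n := range rejectedByHost(logs) {
		fmt.Printf("%s rejected %d time(s)\n", host, n)
	}
}
```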
[update2] Seems that IP change is not working either for etcd. my working 4.11 has the IP set to 192.168.126.11 and works, so perhaps this is another issue. The network interfaces in VM on both machines are the same so at this moment I have no idea why it doesn't 'just work'.
I retried with OCP and it works straight away after a clean and setup/start which seems to rule out resources.
[/update2]

[update3]
Since networking seems to be an issue, I checked the ip a output.
In the working 4.11 OKD :

7: eth10: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 4e:e5:0e:00:2e:5a brd ff:ff:ff:ff:ff:ff
    inet 192.168.126.11/24 brd 192.168.126.255 scope global noprefixroute eth10
       valid_lft forever preferred_lft forever
    inet6 fe80::1c7c:8cb1:4d30:c14e/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

it comes straight under enp2s0 with 192.168.130.11 address.
On the non-working 4.13, the two lines for ipv6 are missing and it's a bit lower in the hierarchy.
Otherwise it looks similar.
[/update3]

[update4]
The cause of crc not starting is the API server not starting because etcd is not starting (the apiserver cannot connect to 2379). I saw these pipemail errors in some of the commands in the etcd manifest yaml, thought they had something to do with it, and tried to copy the config of 4.11 without success.
I have no idea why this seems so fragile; it looked like it worked partially once and then not anymore.
As a last option I removed the .crc dir and restarted: same problem.
The cause seems to have to do with bug 1888 and how NODE_IP is set, but I haven't yet found why this happens precisely.
Would be great if someone does.
[/update4]

@VGerris
Contributor Author

VGerris commented Jun 3, 2023

Ok, this is another ugly workaround that made my instance start.

  • ssh -i /home/youruser/.crc/machines/crc/id_ecdsa -p 22 core@192.168.130.11 to login to the vm
  • sudo su to become root
  • vi /etc/systemd/system/kubelet.service ( or use your favorite editor that is installed :) -> change KUBE_NODE_IP to 192.168.126.11
  • systemctl restart kubelet -> restarts kubelet server

The above got my cluster working, but only the developer account seemed to exist. Authentication is configured with htpasswd. To have oc run as admin and fix the kubeadmin user, you can do the following (while logged in to the VM):

  • export KUBECONFIG=/opt/kubeconfig -> oc uses this for auth
  • oc create user kubeadmin -> creates the user
  • https://docs.okd.io/4.13/authentication/identity_providers/configuring-htpasswd-identity-provider.html , in short
  • oc get secret htpass-secret -ojsonpath={.data.htpasswd} -n openshift-config | base64 --decode > users.htpasswd -> gets secret with users
  • update the password for kubeadmin (I haven't tried whether the printed one works; I copied the developer entry, so the password is 'developer')
  • oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd --dry-run=client -o yaml -n openshift-config | oc replace -f -
  • oc adm policy add-cluster-role-to-user cluster-admin kubeadmin -> make kubeadmin cluster admin
  • should work to login now with kubeadmin with password 'developer'
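The "copy the developer entry" step in the list above can also be sketched in Go, assuming the usual htpasswd layout of one user:hash pair per line. copyHash is a hypothetical helper and the hash in main is a made-up placeholder, not a real credential; the base64 round trip stands in for the decode step of the oc get secret command.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// copyHash (hypothetical helper) gives user "to" the same password hash as
// user "from" in htpasswd-style content, appending "to" if it is missing.
func copyHash(htpasswd, from, to string) (string, error) {
	lines := strings.Split(strings.TrimRight(htpasswd, "\n"), "\n")
	hash := ""
	for _, l := range lines {
		if user, h, ok := strings.Cut(l, ":"); ok && user == from {
			hash = h
		}
	}
	if hash == "" {
		return "", fmt.Errorf("user %q not found", from)
	}
	replaced := false
	for i, l := range lines {
		if user, _, ok := strings.Cut(l, ":"); ok && user == to {
			lines[i] = to + ":" + hash
			replaced = true
		}
	}
	if !replaced {
		lines = append(lines, to+":"+hash)
	}
	return strings.Join(lines, "\n") + "\n", nil
}

func main() {
	// Placeholder secret data; a real value comes from the cluster secret.
	encoded := base64.StdEncoding.EncodeToString([]byte("developer:PLACEHOLDER_HASH\n"))

	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		panic(err)
	}
	updated, err := copyHash(string(raw), "developer", "kubeadmin")
	if err != nil {
		panic(err)
	}
	fmt.Print(updated)
}
```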

At restart I get : Failed to update pull secret on the disk: Temporary error: pull secret not updated to disk (x207)
but I can use the cluster now.

I cannot find how the mentioned var is set so not sure how this should be fixed.
This has some good pointers:
#1888
Perhaps someone with more knowledge can look into that?
At this moment I cannot say if this is a consequence of the PR, but it seems unlikely to me.

The root of the issue is that the kubelet service uses the wrong IP to look at and start etcd , which is set by KUBE_NODE_IP in the systemd file.

I will leave this for a bit now and hope someone can clarify this further and/or fix it structurally, thank you.
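For reference, the manual edit of the kubelet unit could be expressed as a small text rewrite. This is a sketch under a loud assumption: I have not confirmed how KUBE_NODE_IP is written in the unit file, so the Environment= line used in the example is illustrative only, and setNodeIP is a hypothetical helper that simply rewrites whatever IPv4 value follows KUBE_NODE_IP=.

```go
package main

import (
	"fmt"
	"regexp"
)

// nodeIPRe matches "KUBE_NODE_IP=" followed by a dotted IPv4-style value.
var nodeIPRe = regexp.MustCompile(`(KUBE_NODE_IP=)[0-9.]+`)

// setNodeIP (hypothetical helper) replaces the value assigned to
// KUBE_NODE_IP in the given unit file text and leaves the rest untouched.
func setNodeIP(unit, ip string) string {
	return nodeIPRe.ReplaceAllString(unit, "${1}"+ip)
}

func main() {
	// Assumed shape of the relevant line; the real unit file may differ.
	unit := "[Service]\nEnvironment=\"KUBE_NODE_IP=192.168.130.11\"\n"
	fmt.Print(setNodeIP(unit, "192.168.126.11"))
}
```

The kubelet still has to be restarted afterwards, as in the steps above.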

@praveenkumar
Member

It looks like this is an issue with our bundle generation script, https://github.com/crc-org/snc/blob/master/createdisk.sh#L116-L122: we set the right internal IP for OCP but not for OKD. We will put up a PR, and the next generated bundle should not have this issue.

@praveenkumar
Member

crc-org/snc#733 is how it should be handled as part of bundle generation.

@VGerris
Contributor Author

VGerris commented Aug 21, 2023

I tested with the latest crc and it works like a charm.
Thank you!
OKD version : 4.13.0-0.okd-2023-06-04-080300
https://developers.redhat.com/content-gateway/rest/mirror/pub/openshift-v4/clients/crc/2.24.1
Closing this, thanks again!

@VGerris VGerris closed this as completed Aug 21, 2023