RKE2 fails to start using NATS with Kine #6186

Open
sdemura opened this issue Jun 12, 2024 · 3 comments

Comments

@sdemura

sdemura commented Jun 12, 2024

Environmental Info:
RKE2 Version:

>rke2 --version
rke2 version v1.28.9+rke2r1 (07bf87f9118c1386fa73f660142cc28b5bef1886)
go version go1.21.9 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

> uname -a
Linux jammy-01 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

single node

Describe the bug:

After seeing https://nats.io/blog/exploring-nats-as-a-backend-for-k3s/ I was hopeful this would work with the new Kine support, but it appears it doesn't.

Steps To Reproduce:

Run NATS externally with JetStream enabled:

> ./nats-server -js
[4892] 2024/06/12 17:55:32.196742 [INF] Starting nats-server
[4892] 2024/06/12 17:55:32.196803 [INF]   Version:  2.10.14
[4892] 2024/06/12 17:55:32.196804 [INF]   Git:      [31af767]
[4892] 2024/06/12 17:55:32.196810 [INF]   Name:     NCOCM4TEGBQHNZQGEIUJLBXICIOEVNGN4KP5IWJMJCAFVPJ4V3DV4VZY
[4892] 2024/06/12 17:55:32.196813 [INF]   Node:     VAUHWSkw
[4892] 2024/06/12 17:55:32.196816 [INF]   ID:       NCOCM4TEGBQHNZQGEIUJLBXICIOEVNGN4KP5IWJMJCAFVPJ4V3DV4VZY
[4892] 2024/06/12 17:55:32.197008 [INF] Starting JetStream
[4892] 2024/06/12 17:55:32.197093 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[4892] 2024/06/12 17:55:32.197097 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[4892] 2024/06/12 17:55:32.197098 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[4892] 2024/06/12 17:55:32.197099 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[4892] 2024/06/12 17:55:32.197100 [INF]
[4892] 2024/06/12 17:55:32.197101 [INF]          https://docs.nats.io/jetstream
[4892] 2024/06/12 17:55:32.197102 [INF]
[4892] 2024/06/12 17:55:32.197103 [INF] ---------------- JETSTREAM ----------------
[4892] 2024/06/12 17:55:32.197107 [INF]   Max Memory:      8.71 GB
[4892] 2024/06/12 17:55:32.197109 [INF]   Max Storage:     54.67 GB
[4892] 2024/06/12 17:55:32.197111 [INF]   Store Directory: "/tmp/nats/jetstream"
[4892] 2024/06/12 17:55:32.197112 [INF] -------------------------------------------
[4892] 2024/06/12 17:55:32.197324 [INF]   Starting restore for stream '$G > KV_kine'
[4892] 2024/06/12 17:55:32.197440 [INF]   Restored 1 messages for stream '$G > KV_kine' in 0s
[4892] 2024/06/12 17:55:32.197519 [INF] Listening for client connections on 0.0.0.0:4222
[4892] 2024/06/12 17:55:32.197598 [INF] Server is ready

Configure rke2 to use the external NATS server and explicitly set noEmbed:

> grep datastore-endpoint /etc/rancher/rke2/config.yaml
datastore-endpoint: nats://?noEmbed
> sudo rke2 server --debug
WARN[0000] not running in CIS mode
INFO[0000] Applying Pod Security Admission Configuration
INFO[0000] Starting rke2 v1.28.9+rke2r1 (07bf87f9118c1386fa73f660142cc28b5bef1886)
INFO[0000] Starting temporary kine to reconcile with datastore
DEBU[0000] using config &nats.Config{clientURL:"nats://localhost:4222", clientOptions:[]nats.Option(nil), revHistory:0xa, bucket:"kine", replicas:1, slowThreshold:500000000, noEmbed:false, dontListen:false, serverConfig:"", stdoutLogging:false, host:"localhost", port:4222, dataDir:""}
INFO[0000] connecting to nats://localhost:4222
INFO[0000] using bucket: kine
INFO[0000] bucket initialized: kine
INFO[0000] Kine available at unix://kine.sock
ERRO[0001] btree watcher error: context canceled
INFO[0001] generated self-signed CA certificate CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15.144914536 +0000 UTC notAfter=2034-06-10 18:14:15.144914536 +0000 UTC
INFO[0001] certificate CN=system:admin,O=system:masters signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
INFO[0001] certificate CN=system:rke2-supervisor,O=system:masters signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
INFO[0001] certificate CN=system:kube-controller-manager signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
INFO[0001] certificate CN=system:kube-scheduler signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
INFO[0001] certificate CN=system:apiserver,O=system:masters signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
INFO[0001] certificate CN=system:kube-proxy signed by CN=rke2-client-ca@1718216055: notBefore=2024-06-12 18:14:15 +0000 UTC notAfter=2025-06-12 18:14:15 +0000 UTC
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x1fe6e6e]

goroutine 217 [running]:
github.com/k3s-io/kine/pkg/drivers/nats.(*KeyValue).btreeWatcher(0xc000cfec00, {0x46a4638, 0xc000cf6f00})
        /go/pkg/mod/github.com/k3s-io/kine@v0.11.7/pkg/drivers/nats/kv.go:324 +0xee
github.com/k3s-io/kine/pkg/drivers/nats.NewKeyValue.func1()
        /go/pkg/mod/github.com/k3s-io/kine@v0.11.7/pkg/drivers/nats/kv.go:521 +0x65
created by github.com/k3s-io/kine/pkg/drivers/nats.NewKeyValue in goroutine 1
        /go/pkg/mod/github.com/k3s-io/kine@v0.11.7/pkg/drivers/nats/kv.go:517 +0x146

It appears that rke2 is ignoring the NATS parameter, as I'd expect noEmbed to be true rather than false here:

DEBU[0000] using config &nats.Config{clientURL:"nats://", clientOptions:[]nats.Option(nil), revHistory:0xa, bucket:"kine", replicas:1, slowThreshold:500000000, noEmbed:false, dontListen:false, serverConfig:"", 

Also wondering if it has to do with RKE2's use of unixs:// instead of unix:// for the Kine socket.
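
For reference, here is a minimal sketch of how Go's standard `net/url` package sees a bare query flag such as `?noEmbed`. It is not taken from the Kine or RKE2 source, and `flagFromEndpoint` is a hypothetical helper written only for this illustration. The point is that the key is present but carries an empty value, so whether it ends up as `true` depends on how the consumer reads it and on whether the query string reaches the driver at all.

```go
package main

import (
	"fmt"
	"net/url"
)

// flagFromEndpoint is a hypothetical helper (not part of Kine or RKE2).
// It reports whether a query key is present in an endpoint URL and what
// value, if any, it carries.
func flagFromEndpoint(endpoint, name string) (present bool, value string, err error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return false, "", err
	}
	q := u.Query()
	return q.Has(name), q.Get(name), nil
}

func main() {
	endpoints := []string{
		"nats://?noEmbed",                    // bare flag, as in the config above
		"nats://localhost:4222?noEmbed=true", // explicit value
		"nats://localhost:4222",              // flag absent
	}
	for _, ep := range endpoints {
		present, value, err := flagFromEndpoint(ep, "noEmbed")
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%-38s present=%t value=%q\n", ep, present, value)
	}
}
```

A consumer that checks only for the key's presence would treat the bare flag as set, while one that parses the value as a boolean would not, so trying `?noEmbed=true` may be a quick way to rule that out.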

@brandond
Contributor

ERRO[0001] btree watcher error: context canceled

Not sure what that's about...

@bruth are you available to take a look at this?

@bruth

bruth commented Jun 14, 2024

Indeed, will check it out.

@brandond
Contributor

brandond commented Jun 14, 2024

I will say that rke2 and k3s do something goofy when TLS is enabled, which is inherited from etcd: the server starts up once using a plaintext listener to extract the encrypted bootstrap data (which includes the CA certs), shuts down, and then starts up again using the configured certs that it extracted the first time. That message is probably related to that, but I don't know whether it's also related to the failure to properly configure the NATS client. I'm not sure we've actually tested anything except SQLite in rke2 yet.
