JetStream gets into broken state when upgraded from v2.8.4 to v2.9.0 #15476

Closed
3 tasks
mfaizanse opened this issue Sep 13, 2022 · 3 comments · Fixed by #15565
Labels: area/eventing, kind/bug, release blocker

mfaizanse (Member) commented Sep 13, 2022

When JetStream is upgraded from v2.8.4 to v2.9.0, it gets into a broken state.

╰─ nats server report jsz --user admin --password <pass>
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                JetStream Summary                                                 │
├──────────────────┬───────────────┬─────────┬───────────┬──────────┬────────┬────────┬────────┬─────────┬─────────┤
│ Server           │ Cluster       │ Streams │ Consumers │ Messages │ Bytes  │ Memory │ File   │ API Req │ API Err │
├──────────────────┼───────────────┼─────────┼───────────┼──────────┼────────┼────────┼────────┼─────────┼─────────┤
│ eventing-nats-2* │ eventing-nats │ 1       │ 1         │ 26       │ 11 KiB │ 0 B    │ 11 KiB │ 45      │ 21      │
│ eventing-nats-0  │ eventing-nats │ 1       │ 1         │ 26       │ 11 KiB │ 0 B    │ 11 KiB │ 0       │ 0       │
│ eventing-nats-1  │ eventing-nats │ 1       │ 1         │ 26       │ 11 KiB │ 0 B    │ 11 KiB │ 0       │ 0       │
├──────────────────┼───────────────┼─────────┼───────────┼──────────┼────────┼────────┼────────┼─────────┼─────────┤
│                  │               │ 3       │ 3         │ 78       │ 34 KiB │ 0 B    │ 34 KiB │ 45      │ 21      │
╰──────────────────┴───────────────┴─────────┴───────────┴──────────┴────────┴────────┴────────┴─────────┴─────────╯

╭────────────────────────────────────────────────────────────╮
│                RAFT Meta Group Information                 │
├─────────────────┬────────┬─────────┬────────┬────────┬─────┤
│ Name            │ Leader │ Current │ Online │ Active │ Lag │
├─────────────────┼────────┼─────────┼────────┼────────┼─────┤
│ eventing-nats-0 │        │ true    │ true   │ 0.78s  │ 0   │
│ eventing-nats-1 │        │ true    │ true   │ 0.78s  │ 0   │
│ eventing-nats-2 │ yes    │ true    │ true   │ 0.00s  │ 0   │
╰─────────────────┴────────┴─────────┴────────┴────────┴─────╯

However, the NATS CLI lists no streams:

╰─ nats stream ls
No Streams defined
╰─ nats stream info
nats: error: could not pick a Stream to operate on: no Streams are defined
╰─ nats consumer info
nats: error: could not select Stream: no Streams are defined

Logs from eventing-nats pods:

╰─ k logs -n kyma-system eventing-nats-2 -c nats | grep sap
[30] 2022/09/13 12:43:42.263459 [INF]   Starting restore for stream '$G > sap'
[30] 2022/09/13 12:43:42.267241 [INF]   Restored 26 messages for stream '$G > sap'
[30] 2022/09/13 12:43:42.267794 [INF]   Recovering 1 consumers for stream - '$G > sap'
[30] 2022/09/13 12:43:42.285545 [ERR] 100.64.0.17:6222 - rid:11 - Unknown account "$SYS" for remote subject "$JSC.SI.$G.sap"
[30] 2022/09/13 12:43:42.287077 [ERR] 100.64.2.14:6222 - rid:12 - Unknown account "$SYS" for remote subject "$JSC.SI.$G.sap"
[30] 2022/09/13 12:43:42.287518 [ERR] 100.64.2.14:6222 - rid:12 - Unknown account "$SYS" for remote subject "$JSC.CI.$G.sap.c74c20756af53b592f87edebff67bdf8"
╰─ k logs -n kyma-system eventing-nats-1 -c nats | grep sap
[31] 2022/09/13 12:44:15.578793 [INF]   Starting restore for stream '$G > sap'
[31] 2022/09/13 12:44:15.582617 [INF]   Restored 26 messages for stream '$G > sap'
[31] 2022/09/13 12:44:15.583279 [INF]   Recovering 1 consumers for stream - '$G > sap'
[31] 2022/09/13 12:44:15.594882 [ERR] 100.64.2.14:6222 - rid:12 - Unknown account "$SYS" for remote subject "$JSC.CI.$G.sap.c74c20756af53b592f87edebff67bdf8"
[31] 2022/09/13 12:44:15.595228 [ERR] 100.64.2.14:6222 - rid:12 - Unknown account "$SYS" for remote subject "$JSC.SI.$G.sap"
╰─ k logs -n kyma-system eventing-nats-0 -c nats | grep sap
[31] 2022/09/13 12:44:51.463276 [INF]   Starting restore for stream '$G > sap'
[31] 2022/09/13 12:44:51.466081 [INF]   Restored 26 messages for stream '$G > sap'
[31] 2022/09/13 12:44:51.466568 [INF]   Recovering 1 consumers for stream - '$G > sap'
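
To make the pattern in these logs easier to spot across pods, the "Unknown account" errors can be grouped by account and remote subject. The following is a minimal sketch, not part of the original report: the function name `unknown_account_errors` and the regular expression are assumptions derived only from the log line format shown above, and the sample lines are copied from the eventing-nats-2 output.

```python
import re
from collections import defaultdict

# Matches the "Unknown account" error lines in the format shown in the logs above.
# The pattern is an assumption based on those lines, not an official NATS log grammar.
UNKNOWN_ACCOUNT = re.compile(
    r'\[ERR\] (?P<peer>\S+) - rid:(?P<rid>\d+) - '
    r'Unknown account "(?P<account>[^"]+)" for remote subject "(?P<subject>[^"]+)"'
)


def unknown_account_errors(log_text):
    """Group remote subjects by the account name reported as unknown."""
    grouped = defaultdict(set)
    for line in log_text.splitlines():
        match = UNKNOWN_ACCOUNT.search(line)
        if match:
            grouped[match.group("account")].add(match.group("subject"))
    # Sort the subjects so the result is deterministic.
    return {account: sorted(subjects) for account, subjects in grouped.items()}


# Sample lines copied from the eventing-nats-2 logs in this issue.
SAMPLE = """\
[30] 2022/09/13 12:43:42.267794 [INF]   Recovering 1 consumers for stream - '$G > sap'
[30] 2022/09/13 12:43:42.285545 [ERR] 100.64.0.17:6222 - rid:11 - Unknown account "$SYS" for remote subject "$JSC.SI.$G.sap"
[30] 2022/09/13 12:43:42.287518 [ERR] 100.64.2.14:6222 - rid:12 - Unknown account "$SYS" for remote subject "$JSC.CI.$G.sap.c74c20756af53b592f87edebff67bdf8"
"""

if __name__ == "__main__":
    for account, subjects in unknown_account_errors(SAMPLE).items():
        print(account, subjects)
```

Feeding the full output of each pod into the function would show whether every failing subject refers to the same account, which is the case for the lines captured here.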

Hint

It seems that the PR that added a user account to NATS has a side effect on the upgrade. According to NATS support, changing the user account during the upgrade could cause this issue. An initial check suggests that the secret eventing-nats-secret is updated during the Kyma upgrade.
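
One way to confirm whether the secret content actually changes is to fingerprint its `data` field before and after the upgrade. The sketch below is an illustration under assumptions: it expects two Secret manifests exported as JSON (for example with `kubectl get secret eventing-nats-secret -n kyma-system -o json`, where the namespace is assumed from the pod commands above), and the helper names and the key `example-key` are hypothetical.

```python
import hashlib
import json


def secret_fingerprint(secret_manifest):
    """Return a SHA-256 digest over the `data` field of a Secret manifest.

    Only `data` is hashed, so metadata changes (such as resourceVersion)
    do not count as a change of the secret content.
    """
    data = secret_manifest.get("data", {})
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def secret_changed(before, after):
    """Report whether the secret content differs between two manifests."""
    return secret_fingerprint(before) != secret_fingerprint(after)


if __name__ == "__main__":
    # Hypothetical manifests with placeholder base64 values; the real key
    # names in eventing-nats-secret are not given in this issue.
    before = {"metadata": {"resourceVersion": "100"}, "data": {"example-key": "b2xk"}}
    after = {"metadata": {"resourceVersion": "200"}, "data": {"example-key": "bmV3"}}
    print("secret content changed:", secret_changed(before, after))
```

If the fingerprints differ across the upgrade, that would support the suspicion that the credentials are regenerated rather than preserved.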

Steps to reproduce

  • Provision a Kyma cluster v2.6.2 using the Production profile.
  • Create a function and a subscription, and send some events.
  • Delete the function and send some more events.
  • Verify that the events are stored in the JetStream stream (nats stream info sap).
  • Upgrade Kyma to a new version with NATS v2.9.0 (or upgrade to this PR).
  • Check the JetStream health.

Tasks

  • Debug the issue and check whether it is related to the changes made by the PR.
  • Fix the issue.
  • Merge the PR to bump the NATS image.

mfaizanse (Member, Author) commented Sep 19, 2022

Update: The same happened when upgrading from Kyma 2.6.2 to Kyma 2.7.0-rc1. The NATS version in both Kyma versions is 2.8.4.

mfaizanse (Member, Author) commented Sep 20, 2022

zhoujing2022 modified the milestones: 2.7, 2.7.1 Sep 20, 2022

zhoujing2022 (Collaborator) commented:

I created a new milestone because we need rc2. Please link everything to this milestone later. Thanks.
