We are aware of the issue in general and already have a fix on top of main. It will be included in the 2.10.6 release, which should probably land next week.
Observed behavior
7 days ago I upgraded NATS server from 2.8.4 to 2.10.5. We have a three-node cluster and I performed a rolling upgrade. There seemed to be no immediate consequences to this upgrade, and things ran well, until suddenly two days ago I started seeing odd errors cropping up in the logs from apps that use NATS, and I discovered that two of my three nodes had crashed. The system logs on those boxen both showed the same odd error: "fatal error: concurrent map read and map write", followed by a massive stack dump. Googling revealed some generic golang-related articles, but nothing NATS-specific. I had two more crashes today, same error. I'm attaching the logged stack dump.
There doesn't seem to be a specific cause that I can relate to this. Looking through the message histories we keep reveals no common message or event that correlates with the crashes. They occurred at different times of the day and, of course, on different hosts. Message volume is not particularly intense when the crashes occur. I'm at a loss.
nats-kablooey.log
Expected behavior
Not crash?
Server and client version
Server: 2.10.5
Clients: NPM versions 2.7.1, 2.17.0, 2.9.0, 2.12.1, 2.8.0, 2.10.2 (Various scripts have installed whatever version was current at the time the script was authored)
Host environment
VMs running Ubuntu LTS. One is on 20.04.4, the others on 18.04.1.
Steps to reproduce
No response