Description
Search before asking
- I searched in the issues and found nothing similar.
Version
Pulsar 2.10.3.
Docker Desktop v4.21.1
Pulsar is the only container running. Docker has 10 CPUs, 24GB of RAM, and 4GB of swap. The host machine is not under memory or CPU pressure (running a terminal and Docker, no other programs, activity monitor reports idle).
Docker engine: 24.0.2
Config:
{
  "builder": {
    "gc": {
      "defaultKeepStorage": "180GB",
      "enabled": true
    }
  },
  "experimental": true,
  "features": {
    "buildkit": true
  }
}
macOS 12.6.7
ARM/M1 processor.
Minimal reproduce step
- Run this Docker command:
docker run \
--rm \
--name chariot_local_pulsar \
-it \
-p 6650:6650 \
-p 8080:8080 \
--cap-add=SYS_PTRACE \
--platform linux/x86_64 \
apachepulsar/pulsar:2.10.3 \
bin/pulsar standalone -nss -nfw
What did you expect to see?
Within a few minutes, a broker that I can use for admin API and messaging.
What did you see instead?
This bug is intermittent. Usually, things work fine.
But sometimes (about 1 or 2 out of every 5 attempts), the broker never starts up: it writes the attached logs and then stops emitting any log output (no activity for 30 minutes). I cannot connect to the management API or the messaging port, and cannot perform any operations. Connection attempts to either port produce no new log output.
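To tell a hung start from a slow one without watching the logs, a bounded readiness probe can help. The following is a minimal sketch, not something used in this report: `wait_until` is a hypothetical helper name, and the attempt count and delay are arbitrary.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# usage: wait_until <attempts> <delay_seconds> <command> [args...]
# (wait_until is a hypothetical helper, not part of Pulsar or Docker.)
wait_until() {
  wu_attempts="$1"
  wu_delay="$2"
  shift 2
  wu_i=1
  while [ "$wu_i" -le "$wu_attempts" ]; do
    # Run the probe quietly; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "ready after ${wu_i} attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$wu_i" -lt "$wu_attempts" ]; then
      sleep "$wu_delay"
    fi
    wu_i=$((wu_i + 1))
  done
  echo "not ready after ${wu_attempts} attempt(s)"
  return 1
}
```

With the port mapping from the docker run command above, something like `wait_until 60 5 curl -sf http://localhost:8080/admin/v2/clusters` would give the broker roughly five minutes before reporting a failure. This assumes curl is installed on the host and that the admin REST endpoint answers once the standalone broker is up.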
Anything else?
I have observed similar problems on Pulsar 3.0, with different logs.
Restarting the computer or resetting Docker (docker system prune --all --force) doesn't seem to change the likelihood of this occurring. The very first broker start seems about as likely to fail as any subsequent restart, whether or not the Docker daemon was restarted or the image was re-pulled.
If I exec into the container and attempt to capture a heap dump, I get the following error:
I have no name!@2f8835389778:/pulsar$ jmap -dump:live,format=b,file=dump.hprof 1
Exception in thread "main" com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/1/root/tmp/.java_pid1: target process 1 doesn't respond within 10500ms or HotSpot VM not loaded
at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:100)
at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:128)
at jdk.jcmd/sun.tools.jmap.JMap.dump(JMap.java:208)
at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:114)
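When jmap cannot attach, a thread dump is sometimes still obtainable by sending SIGQUIT to the JVM, which HotSpot normally answers by printing all thread stacks to its standard output. The sketch below is untested against this hang and may fail for the same reason the attach does. `request_thread_dump` is a hypothetical helper, and the `DOCKER` and `DUMP_WAIT` variables are assumptions added so the command and the wait time can be overridden.

```shell
# Ask the JVM in a container for a thread dump by sending SIGQUIT,
# then print the tail of the container output, where HotSpot writes it.
# usage: request_thread_dump <container_name>
# DOCKER (default: docker) and DUMP_WAIT (default: 2 seconds) are
# hypothetical overrides, not standard environment variables.
request_thread_dump() {
  rtd_container="$1"
  "${DOCKER:-docker}" kill --signal=QUIT "$rtd_container" &&
    sleep "${DUMP_WAIT:-2}" &&
    "${DOCKER:-docker}" logs --tail 200 "$rtd_container"
}
```

For the container started above, the call would be `request_thread_dump chariot_local_pulsar`. Since that container runs with `-it`, the dump, if any, should also appear in the attached terminal.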
top doesn't report resource exhaustion in the container:
top - 16:33:00 up 1:02, 0 users, load average: 1.00, 1.01, 1.00
Tasks: 3 total, 0 running, 3 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.1 us, 0.0 sy, 0.0 ni, 89.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 24010.6 total, 19671.0 free, 1454.2 used, 2885.4 buff/cache
MiB Swap: 4096.0 total, 4096.0 free, 0.0 used. 21960.3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 10000 20 0 9907.3m 883056 35864 S 101.0 3.6 19:22.82 java
423 10000 20 0 149248 13184 4992 S 0.0 0.1 0:00.07 bash
457 10000 0 0 150872 11024 4876 0 0.0 0.0 0:00.00 top
ps output:
UID PID PPID C STIME TTY TIME CMD
10000 1 0 99 16:14 pts/0 00:21:11 /usr/bin/qemu-x86_64 /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Dlog4j.shutdo
10000 423 0 0 16:24 pts/1 00:00:00 /usr/bin/qemu-x86_64 /bin/bash /bin/bash
10000 529 423 0 15:30 ? 00:00:00 ps -ef
Are you willing to submit a PR?
- I'm willing to submit a PR!