traffic_server deadlocked after config reload #1298
Closed
Description
A configuration change (an update to parent.config) was pushed to a set of servers. Shortly after the config reload (via traffic_ctl), traffic_cop on one host started failing heartbeats and the ATS process stopped serving traffic. Attaching gdb, I see a number of threads attempting a hostdb lookup. The filesystem has a host.db.syncing file that is dated soon after the reload.
At the time of the reload, there were approximately 800 active server connections. The same configuration was applied to 23 other hosts at the same time, all of which reloaded without issue.
syslog:
Jan 3 16:29:37 s_sys@host traffic_manager[7156]: {0x7f6ec8ffe700} NOTE: User has changed config file parent.config
Jan 3 16:29:45 s_sys@host traffic_server[7169]: {0x2aaab470c700} NOTE: loading SSL certificate configuration from /opt/user/etc/trafficserver/ssl_multicert.config
Jan 3 16:34:50 s_sys@host traffic_cop[7154]: (test) read timeout [180000 ]
Jan 3 16:34:50 s_sys@host traffic_cop[7154]: server heartbeat failed [1]
Jan 3 16:38:00 s_sys@host traffic_cop[7154]: (test) read timeout [180000 ]
Jan 3 16:38:00 s_sys@host traffic_cop[7154]: server heartbeat failed [2]
/var/cache/trafficserver:
[user@host trafficserver]$ ls -altr
total 28
drwxr-xr-x. 10 root root 4096 Oct 17 09:00 ..
-rw-r--r-- 1 user user 12029 Jan 3 16:31 host.db
drwxr-xr-x 2 user user 4096 Jan 3 16:31 .
-rw-r--r-- 1 user user 4109 Jan 3 16:32 host.db.syncing
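For anyone checking other hosts for the same symptom, here is a rough sketch of how the leftover host.db.syncing file could be detected. It assumes that a syncing file which lingers well after a reload indicates an interrupted sync; that is my reading of the listing above, not confirmed hostdb behavior. The function name `syncing_file_age` and any age threshold you compare against are hypothetical.

```python
import os
import time
from typing import Optional


def syncing_file_age(cache_dir: str,
                     name: str = "host.db.syncing",
                     now: Optional[float] = None) -> Optional[float]:
    """Return the age in seconds of the syncing file, or None if it is absent.

    Assumption: a syncing file that lingers long after a reload hints at a
    sync that never completed. This is a heuristic, not documented behavior.
    """
    path = os.path.join(cache_dir, name)
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # No leftover file: nothing to report.
        return None
    current = time.time() if now is None else now
    return current - mtime
```

Calling `syncing_file_age("/var/cache/trafficserver")` on a host and comparing the result with the time since the last reload would show whether the file has been sitting there since then.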