ProxySQL is not able to perform a proper shutdown if scheduler scripts are still running #1723

Closed
jaimesicam opened this Issue Oct 3, 2018 · 5 comments

@jaimesicam

commented Oct 3, 2018

If you try to shut down ProxySQL while it has a long-running or hung scheduler script, it doesn't kill those running processes. You will need to kill them manually to be able to start ProxySQL again.

To simulate the issue:

  1. Set up ProxySQL with PXC (Percona XtraDB Cluster)
  2. Add "sleep 180" right before "exit 0" in proxysql_galera_checker
  3. Check ps aux:
proxysql 19711  0.0  0.0   4360   352 ?        S    06:26   0:00 sleep 180
proxysql 19712  0.1  0.4  12208  2184 ?        S    06:26   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 20009  0.0  0.0   4360   348 ?        S    06:26   0:00 sleep 180
proxysql 20010  0.1  0.4  12208  2180 ?        S    06:26   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 20308  0.0  0.0   4360   348 ?        S    06:26   0:00 sleep 180
proxysql 20309  0.0  0.4  12208  2184 ?        S    06:26   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 20606  0.0  0.0   4360   352 ?        S    06:26   0:00 sleep 180
proxysql 20607  0.0  0.3  12072  1944 ?        S    06:26   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
root     20611  0.0  0.1  12520   968 pts/0    R+   06:26   0:00 grep --color=auto proxysql
proxysql 20612  0.0  0.1  12072   696 ?        R    06:26   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
root     31449  0.0  0.1   7804   612 pts/1    S+   06:22   0:00 tail -f /var/lib/proxysql/proxysql.log
proxysql 31462  0.0  0.2  56576  1432 ?        S    06:22   0:00 /usr/bin/proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
proxysql 31463  0.3  2.0 116276 10372 ?        Sl   06:22   0:00 /usr/bin/proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
  4. Shut down ProxySQL
mysql> proxysql shutdown
    -> ;
ERROR 2013 (HY000): Lost connection to MySQL server during query
  5. Check ps aux
proxysql 24186  0.0  0.4  12208  2184 ?        S    06:27   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 24483  0.0  0.0   4360   348 ?        S    06:27   0:00 sleep 180
proxysql 24484  0.0  0.4  12208  2184 ?        S    06:27   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 24781  0.0  0.0   4360   348 ?        S    06:27   0:00 sleep 180
proxysql 24782  0.0  0.4  12208  2180 ?        S    06:27   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 25082  0.0  0.0   4360   352 ?        S    06:27   0:00 sleep 180
proxysql 25086  0.0  0.4  12208  2180 ?        S    06:27   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 25386  0.0  0.0   4360   348 ?        S    06:27   0:00 sleep 180
proxysql 25387  0.0  0.4  12208  2184 ?        S    06:27   0:00 /bin/bash /bin/proxysql_galera_checker --config-file=/etc/proxysql-admin.cnf --writer-is-reader=ondemand --write-hg=10 --read-hg=11 --writer-count=1 --mode=singlewrite  --log=/var/lib/proxysql/my_centos_cluster_proxysql_galera_check.log
proxysql 25684  0.0  0.0   4360   352 ?        S    06:27   0:00 sleep 180
root     25686  0.0  0.1  12520   968 pts/0    R+   06:28   0:00 grep --color=auto proxysql
root     31449  0.0  0.1   7804   612 pts/1    S+   06:22   0:00 tail -f /var/lib/proxysql/proxysql.log
  6. Try to start ProxySQL again
[root@proxysql ~]# /etc/init.d/proxysql start
Starting ProxySQL: 2018-10-03 06:28:46 [INFO] Using config file /etc/proxysql.cnf
DONE!

However, proxysql.log has this error:
2018-10-03 06:28:46 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use

It's also a bit strange that the process listening on port 6032 is bash:

[root@proxysql ~]# netstat -tapn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      555/rpcbind         
tcp       35      0 0.0.0.0:6032            0.0.0.0:*               LISTEN      18520/bash          
tcp        0      0 0.0.0.0:6033            0.0.0.0:*               LISTEN      25694/proxysql   

I'm currently using the Percona RPM: proxysql-1.4.10-1.1.el7.x86_64
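
One way to confirm which process holds the admin port before a restart is to parse the `netstat -tapn` output shown above. The following is a minimal sketch, not part of ProxySQL or proxysql-admin: `listener_of` and `check_admin_port` are hypothetical helper names, and the parsing assumes the Linux net-tools column layout seen in this report (state in column 6, PID/program in column 7).

```shell
# Hypothetical helper (not shipped with ProxySQL): read `netstat -tapn` output
# on stdin and print the "PID/Program" entry of the process listening on the
# TCP port given as $1. Prints nothing if no listener is found.
listener_of() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }
    '
}

# Example use: warn if something other than proxysql still holds the admin
# port. 6032 is the admin port from this report.
check_admin_port() {
    owner=$(netstat -tapn 2>/dev/null | listener_of 6032)
    case "$owner" in
        ''|*/proxysql) return 0 ;;   # port free, or held by proxysql itself
        *) echo "port 6032 is held by $owner" >&2; return 1 ;;
    esac
}
```

Running `check_admin_port` as root before `/etc/init.d/proxysql start` would flag the leftover bash listener instead of letting the start report DONE! while the bind fails in the log.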

@renecannao

Contributor

commented Oct 3, 2018

Hi.

> If you try to shut down ProxySQL while it has a long-running or hung scheduler script, it doesn't kill those running processes. You will need to kill them manually to be able to start ProxySQL again.

This is expected. I don't think there is a valid reason to kill scheduler scripts (note also that a scheduler script could be a wrapper around some other process, so those processes wouldn't be killed anyway).

> However, proxysql.log has this error:

This is something we can try to optimize.
The error you get is because the scheduler script is still using that port, and Admin is unable to bind on it.
I think it shouldn't be too difficult to fix.
Thank you for the report.
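
The port conflict is consistent with ordinary descriptor inheritance: a process started via fork/exec keeps every descriptor its parent had open unless it is closed or marked close-on-exec, and a listening socket stays bound while any holder is alive. The sketch below only illustrates that general mechanism in plain shell, using descriptor 9 on /dev/null as a stand-in for the Admin listener. The function names are made up for the example, and it makes no claim about how ProxySQL launches scheduler scripts or how the fix was implemented.

```shell
# Open descriptor 9 in this shell; it stands in for the Admin listening socket.
exec 9>/dev/null

# Returns 0 if a freshly spawned child can write to descriptor 9,
# i.e. the child inherited it from this shell.
child_inherits_fd9() {
    sh -c 'echo probe >&9' 2>/dev/null
}

# Same child, but descriptor 9 is closed for it at spawn time (9>&-),
# so the write fails with "Bad file descriptor" and the status is non-zero.
child_without_fd9() {
    sh -c 'echo probe >&9' 9>&- 2>/dev/null
}
```

Here `child_inherits_fd9` succeeds and `child_without_fd9` fails, which is the difference between a child that keeps the port busy after its parent exits and one that does not.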

@renecannao

Contributor

commented Oct 8, 2018

Asking the question here too: what kernel version are you running?

@elchinoo

commented Oct 30, 2018

Hello @renecannao,

> This is expected. I don't think there is a valid reason to kill scheduler scripts (note also that a scheduler script could be a wrapper around some other process, so those processes wouldn't be killed anyway).

IMHO the parent should always kill its children and not leave orphaned processes running. The scheduler scripts are forked from the proxysql process, as we can see here:

root     21037  0:00 proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
root     21038  0:00  \_ proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
root     28629  0:00      \_ /bin/bash /usr/bin/proxysql_galera_checker --config-file=/e
root     28662  0:00      |   \_ /bin/bash -u /usr/bin/proxysql_node_monitor --config-fi
root     28692  0:00      |       \_ /bin/bash -u /usr/bin/proxysql_node_monitor --confi
root     28700  0:00      |           \_ /bin/bash -u /usr/bin/proxysql_node_monitor --c
root     29168  0:00      |               \_ /bin/bash -u /usr/bin/proxysql_node_monitor
root     28830  0:00      \_ /bin/bash /usr/bin/proxysql_galera_checker --config-file=/e
root     28863  0:00      |   \_ /bin/bash -u /usr/bin/proxysql_node_monitor --config-fi
root     28893  0:00      |       \_ /bin/bash -u /usr/bin/proxysql_node_monitor --confi
root     28901  0:00      |           \_ /bin/bash -u /usr/bin/proxysql_node_monitor --c
root     29369  0:00      |               \_ /bin/bash -u /usr/bin/proxysql_node_monitor
root     29092  0:00      \_ /bin/bash /usr/bin/proxysql_galera_checker --config-file=/e
root     29125  0:00      |   \_ /bin/bash -u /usr/bin/proxysql_node_monitor --config-fi
root     29155  0:00      |       \_ /bin/bash -u /usr/bin/proxysql_node_monitor --confi
root     29163  0:00      |           \_ /bin/bash -u /usr/bin/proxysql_node_monitor --c
root     29164  0:00      |               \_ /bin/bash -u /usr/bin/proxysql_node_monitor
root     29293  0:00      \_ /bin/bash /usr/bin/proxysql_galera_checker --config-file=/e
root     29326  0:00          \_ /bin/bash -u /usr/bin/proxysql_node_monitor --config-fi
root     29356  0:00              \_ /bin/bash -u /usr/bin/proxysql_node_monitor --confi
root     29364  0:00                  \_ /bin/bash -u /usr/bin/proxysql_node_monitor --c
root     29365  0:00                      \_ /bin/bash -u /usr/bin/proxysql_node_monitor

It should be an easy change in the /etc/init.d script. Instead of looping over the output of "pgrep -x proxysql" or the PID file and killing each process, it would be better to get the process group ID and kill the whole group. That makes sure all processes forked from proxysql are killed, including the scheduler scripts. See the example below:

GPID=$(pgrep -x proxysql | head -n 1 | xargs ps -h -o pgid -q) && GPID="${GPID// /}" && kill -- -$GPID
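
A slightly more defensive variant of the one-liner above, as a sketch only: it assumes a procps-style `ps` that accepts `-o pgid= -p PID`, the helper names `pgid_of` and `kill_group_of` are hypothetical, and it is not taken from the actual init script. It only reaches processes that are still in the same process group; a script that starts its own session or group would be missed.

```shell
# Print the process group ID of the given PID with whitespace stripped.
# Prints nothing if the PID does not exist.
pgid_of() {
    ps -o pgid= -p "$1" | tr -d '[:space:]'
}

# Send SIGTERM (the default signal) to every process in the group that
# contains the given PID. Refuses to act unless the group ID is a number
# greater than 1, so a failed lookup can never expand to "kill -- -"
# or "kill -- -1".
kill_group_of() {
    pgid=$(pgid_of "$1")
    case "$pgid" in
        ''|*[!0-9]*|0|1)
            echo "no usable process group for PID $1" >&2
            return 1
            ;;
    esac
    kill -- "-$pgid"
}
```

In an init script this could be called as `kill_group_of "$(pgrep -x proxysql | head -n 1)"`, which mirrors the lookup used in the one-liner.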

@altmannmarcelo

commented Nov 7, 2018

Hi @renecannao

> Asking the question here too: what kernel version are you running?

[root@marcelo-altmann-239740-proxysql-1 ~]# grep 'already' /var/lib/proxysql/proxysql.log
2018-11-07 19:54:39 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use
[root@marcelo-altmann-239740-proxysql-1 ~]# uname -a
Linux marcelo-altmann-239740-proxysql-1 4.17.4-1.el7.elrepo.x86_64 #1 SMP Tue Jul 3 09:40:42 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

@renecannao

Contributor

commented Nov 22, 2018

Closed by 8df4d52

@renecannao closed this Nov 22, 2018
