-
Notifications
You must be signed in to change notification settings - Fork 978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #3310: Improve error reporting for 'MySQL_Monitor' and 'MySQL_Session' backend connections via counters #3311
Conversation
…Session' backend connections via counters
lib/MySQL_Monitor.cpp
Outdated
@@ -872,10 +874,12 @@ void * monitor_connect_thread(void *arg) { | |||
(strncmp(mmsd->mysql_error_msg,"ProxySQL Error: Access denied for user",strlen("ProxySQL Error: Access denied for user"))==0) | |||
) { | |||
proxy_error("Server %s:%d is returning \"Access denied\" for monitoring user\n", mmsd->hostname, mmsd->port); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_ACCESS_DENIED_ERROR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we hardcoding ER_ACCESS_DENIED_ERROR
instead of using mysql_errno(mmsd->mysql)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for logging error is being performed on mmsd->mysql_error_msg
, I assumed that if mysql_errno(mmsd->mysql)
was guaranteed to be ER_ACCESS_DENIED_ERROR this check would much more simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because those are the only messages we want to write in the error log.
I don't see a valid reason to hardcode an error code, even if we know already what is.
lib/MySQL_Monitor.cpp
Outdated
} | ||
else if (strncmp(mmsd->mysql_error_msg,"Your password has expired.",strlen("Your password has expired."))==0) | ||
{ | ||
proxy_error("Server %s:%d is returning \"Your password has expired.\" for monitoring user\n", mmsd->hostname, mmsd->port); | ||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_MUST_CHANGE_PASSWORD_LOGIN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we hardcoding ER_MUST_CHANGE_PASSWORD_LOGIN
instead of using mysql_errno(mmsd->mysql)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason as in comment #3311 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reply
@@ -1138,6 +1145,7 @@ void * monitor_read_only_thread(void *arg) { | |||
free(mmsd->mysql_error_msg); | |||
mmsd->mysql_error_msg = new_error; | |||
proxy_error("Timeout on read_only check for %s:%d after %lldms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: %s.\n", mmsd->hostname, mmsd->port, (now-mmsd->t1)/1000, new_error); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_PROXYSQL_READ_ONLY_CHECK_CONN_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a connection is not created (for example, because of a timeout) the counter is already increased in create_new_connection()
.
This line can go away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counter in create_new_connection
is incrementing p_mysql_error_type::mysql
counter, this is incrementing p_mysql_error_type::proxysql
counter, the first one reflects the error in the backend connection, this one the internal ProxySQL error generated because it was a READ_ONLY_CHECK_CONN_TIMEOUT
. This second one offers information missing in the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using ER_PROXYSQL_READ_ONLY_CHECK_TIMEOUT
?
Too many variables seems an over-engineering, especially because it means creating a double number of time series. A timeout happened, this is all we need to know.
Same comment applies to all the others.
@@ -1192,6 +1201,7 @@ void * monitor_read_only_thread(void *arg) { | |||
if (now > mmsd->t1 + mysql_thread___monitor_read_only_timeout * 1000) { | |||
mmsd->mysql_error_msg=strdup("timeout check"); | |||
proxy_error("Timeout on read_only check for %s:%d after %lldms. If the server is overload, increase mysql-monitor_read_only_timeout.\n", mmsd->hostname, mmsd->port, (now-mmsd->t1)/1000); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_PROXYSQL_READ_ONLY_CHECK_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a connection is not created (for example, because of a timeout) the counter is already increased in create_new_connection()
.
This line can go away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reasoning as for #3311 (comment)
lib/MySQL_Monitor.cpp
Outdated
@@ -1297,20 +1307,23 @@ VALGRIND_ENABLE_ERROR_REPORTING; | |||
if (resultset->rows_count) { | |||
// disable host | |||
proxy_error("Server %s:%d missed %d read_only checks. Assuming read_only=1\n", mmsd->hostname, mmsd->port, max_failures); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_PROXYSQL_ASSUME_SRV_READ_ONLY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This itself it is not an error.
We can have a counter for when ProxySQL assumes that read_only=1
, but this is not an error.
The error is when the check fails, not when as action is performed.
This line must go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't tracked as a MySQL error, but as an ProxySQL generated error p_mysql_error_type::proxysql
, the line is outdated, new code is ER_PROXYSQL_READ_ONLY_CHECKS_MISSED
@@ -2851,7 +2848,6 @@ bool MySQL_Session::handler_again___status_CHANGING_AUTOCOMMIT(int *_rc) { | |||
myds->DSS = STATE_MARIADB_GENERIC; | |||
NEXT_IMMEDIATE_NEW(st); | |||
} else { | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::mysql, myconn->parent->myhgc->hid, myconn->parent->address, myconn->parent->port, mysql_errno(myconn->mysql)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of replacing this with 3 calls, we can use this:
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::mysql, myconn->parent->myhgc->hid, myconn->parent->address, myconn->parent->port, ( mysql_errno(myconn->mysql) ? mysql_errno(myconn->mysql) : ER_PROXYSQL_OFFLINE_SRV ) );
} else { | ||
proxy_error( | ||
"Detected a broken connection during SET AUTOCOMMIT on %s , %d : %d, %s\n", | ||
myconn->parent->address, | ||
myconn->parent->port, | ||
myerr, | ||
ER_PROXYSQL_OFFLINE_SRV, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can stay myerr
.
For the same replace didn't happen in othe functions, like handler_again___status_SETTING_INIT_CONNECT
include/MySQL_Monitor.hpp
Outdated
* Enum holding the different connection errors that could be reached for | ||
* the backend connections oppened by 'MySQL_Monitor' module. | ||
*/ | ||
enum PROXYSQL_MONITOR_ERR { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why we have PROXYSQL_MONITOR_ERR
and not just extend PROXYSQL_MYSQL_ERR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it could be handy to have a clear separation between general errors and monitoring errors, easier to remember error code ranges that particular values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes the maintenance more difficult.
Especially in cases in which they overlap.
For example, a ping can timeout no matter if in MySQL Session or on Monitor.
On the same topic, I think I don't see a counter for ping timeout in Monitor
include/MySQL_Monitor.hpp
Outdated
* the backend connections oppened by 'MySQL_Monitor' module. | ||
*/ | ||
enum PROXYSQL_MONITOR_ERR { | ||
ER_PROXYSQL_ASSUME_SRV_READ_ONLY = 10000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my other comments, I think 7 enums should go away, and 5 should stay
include/MySQL_Protocol.h
Outdated
* Enum holding the different MySQL connection errors that are used to report | ||
* invalid states in the backend connections. | ||
*/ | ||
enum PROXYSQL_MYSQL_ERR { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While here, we could also have added error codes 9001 to 9003
include/MySQL_Monitor.hpp
Outdated
* Enum holding the different connection errors that could be reached for | ||
* the backend connections oppened by 'MySQL_Monitor' module. | ||
*/ | ||
enum PROXYSQL_MONITOR_ERR { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes the maintenance more difficult.
Especially in cases in which they overlap.
For example, a ping can timeout no matter if in MySQL Session or on Monitor.
On the same topic, I think I don't see a counter for ping timeout in Monitor
lib/MySQL_Monitor.cpp
Outdated
@@ -872,10 +874,12 @@ void * monitor_connect_thread(void *arg) { | |||
(strncmp(mmsd->mysql_error_msg,"ProxySQL Error: Access denied for user",strlen("ProxySQL Error: Access denied for user"))==0) | |||
) { | |||
proxy_error("Server %s:%d is returning \"Access denied\" for monitoring user\n", mmsd->hostname, mmsd->port); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_ACCESS_DENIED_ERROR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because those are the only messages we want to write in the error log.
I don't see a valid reason to hardcode an error code, even if we know already what is.
lib/MySQL_Monitor.cpp
Outdated
} | ||
else if (strncmp(mmsd->mysql_error_msg,"Your password has expired.",strlen("Your password has expired."))==0) | ||
{ | ||
proxy_error("Server %s:%d is returning \"Your password has expired.\" for monitoring user\n", mmsd->hostname, mmsd->port); | ||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_MUST_CHANGE_PASSWORD_LOGIN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reply
@@ -1138,6 +1145,7 @@ void * monitor_read_only_thread(void *arg) { | |||
free(mmsd->mysql_error_msg); | |||
mmsd->mysql_error_msg = new_error; | |||
proxy_error("Timeout on read_only check for %s:%d after %lldms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: %s.\n", mmsd->hostname, mmsd->port, (now-mmsd->t1)/1000, new_error); | |||
MyHGM->p_update_mysql_error_counter(p_mysql_error_type::proxysql, mmsd->hostgroup_id, mmsd->hostname, mmsd->port, ER_PROXYSQL_READ_ONLY_CHECK_CONN_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using ER_PROXYSQL_READ_ONLY_CHECK_TIMEOUT
?
Too many variables seems an over-engineering, especially because it means creating a double number of time series. A timeout happened, this is all we need to know.
Same comment applies to all the others.
…_MYSQL_ERR' holding all ProxySQL errors for backend connections
…tor and replaced zero error codes with ER_PROXYSQL_OFFLINE_SRV
…ect_thread' and no longer required include
…ssion_status_flags-t' to avoid replication lag issues in the CI
Closes #3310.