New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDEV-14412 Add functionality to set keepalive tcp options #400
Conversation
Are these tests failing because of my change? Does not seem so :( |
Hi @leoleovich! Thank you for the contribution! Sorry for getting back to you a bit delayed. One of the test failures is caused by your patch, but that is not a problem. I can easily fix it. There are a few formalities that need to happen before we can accept your patch however. Similar to other open source projects, the MariaDB Foundation needs to have shared ownership of all code that is included in the MariaDB distribution. The easiest way to achieve this is by submitting your code under the BSD-new license. The other alternative is to sign the code contribution agreement which can be found here: https://mariadb.com/kb/en/mariadb/mca/ Please indicate in a comment below that you are contributing your new code of the whole pull request, including one or several files that are either new files or modified ones, under the BSD-new license or that you have filled out the contribution agreement and sent it. Regards, |
Hello, @cvicentiu! |
sql/mysqld.cc
Outdated
@@ -533,6 +533,7 @@ ulong slave_retried_transactions; | |||
ulonglong slave_skipped_errors; | |||
ulong feature_files_opened_with_delayed_keys= 0, feature_check_constraint= 0; | |||
ulonglong denied_connections; | |||
ulong keepalive_time=0, keepalive_intvl=0, keepalive_probes=0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be a duplicate of sql_class.h definitions and not used. Same with mysqld.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grooverdan if I remove these variables I get
sql/sys_vars.cc:5088:28: error: use of undeclared identifier 'keepalive_intvl'; did you mean 'Sys_keepalive_intvl'?
AUTO_SET GLOBAL_VAR(keepalive_intvl),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And to be honest Sys_keepalive_intvl
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I misunderstood something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, I was just surprised by message.
sql/sql_connect.cc
Outdated
@@ -1000,6 +1000,28 @@ static int check_connection(THD *thd) | |||
bzero((char*) &net->vio->remote, sizeof(net->vio->remote)); | |||
} | |||
vio_keepalive(net->vio, TRUE); | |||
|
|||
if (global_system_variables.keepalive_probes != 9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this condition here? Why use absolute 9 rather than the define below?
Does a 0 value turn off keep alives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid applying mariadb defaults all the time. So have to use some values to compare. And 9 is linux default.
According linux source code 0 will cause the error:
https://github.com/torvalds/linux/blob/b29794ec95c6856b316c2295904208bf11ffddd9/net/ipv4/tcp.c#L2612
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can live without keepalive probes at all. It is the least portable option (keepalive timeout is the most portable, can be used in Windows and OSX, and interval can be used at least on Windows). BTW, you're aware that in this form the patch is Linux-only and won't even compile elsewhere, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked manual one more time and seems like only TCP_KEEPIDLE/TCP_KEEPALIVE
are common for win/mac/linux.
But I think having TCP_KEEPCNT
(probes) and TCP_KEEPINTVL
(interval) is very important, because linux defaults are too high (9 and 75).
Also I pushed code with more #ifdef
to make it work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a way to set keepalive interval and timeout on Windows, per socket , is here https://msdn.microsoft.com/en-us/library/windows/desktop/dd877220(v=vs.85).aspx .
On OSX, TCP_KEEPALIVE sets the timeout https://stackoverflow.com/questions/15860127/how-to-configure-tcp-keepalive-under-mac-os-x . The probes are nowhere but on Linux, as I said. Why are probes/KEEPCNT is very important (just trying to understand)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks good to me. A minor suggestion would be to also move the defaults outside of sql_const.h into some of the vio headers (violite.h?) , and have VIO in the constant names ( VIO_DEFAULT_KEEAPALIVE_INTERVAL or something like that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaintroub of course you will run functional testing on it, right? Because, for instance, I do not have windows machine near:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
somebody (me or someone else) should run functional testing, yes. although with socket options like keepalive it is a slightly challenging testing, on any platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaintroub thanks, please feel free to contact me if you need anything.
sql/sys_vars.cc
Outdated
@@ -5073,6 +5073,33 @@ static Sys_var_ulong Sys_host_cache_size( | |||
NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(NULL), | |||
ON_UPDATE(fix_host_cache_size)); | |||
|
|||
static Sys_var_ulong Sys_keepalive_time( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include the units in the description (seconds?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keepalive_time
and keepalive_intvl
got _seconds
at the end.
sql/sys_vars.cc
Outdated
CMD_LINE(REQUIRED_ARG), VALID_RANGE(0, 65536), | ||
DEFAULT(KEEPALIVE_TIME), | ||
BLOCK_SIZE(1), | ||
NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(NULL)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered an ON_UPDATE function to change existing connections? (On all vars)
4978379
to
13d4d29
Compare
@vaintroub What is the status for this review? Did you take a look at this in its final state or should I? |
@cvicentiu : I have not, but I'm on it now |
Based on pull request by Oleg Obleukhov #400
Based on pull request by Oleg Obleukhov #400
Based on pull request by Oleg Obleukhov #400
I had to rework it quite a bit (made compile :)), and added tests to it |
Based on pull request by Oleg Obleukhov #400
Based on pull request by Oleg Obleukhov #400
Based on pull request by Oleg Obleukhov #400
pushed into 10.3 |
I tried to implement support of flexible configuration for
keepalive
.If we take linux or macos, by default values are too high and can not provide reliable work for production application (it is somewhere around 2,5 hours before keepalive actually starts working)
But lowering values for complete OS is not always an option, so it should be configurable on the application side. And in fact many applications can do it. For instance
postgres
.So I suggest we add configuration variables:
keepalive_time
keepalive_intvl
keepalive_probes
and use their values as socket options.
For meaning of params please check manual http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/