CFE-2678: Global connection cache #3515
Thank you for submitting a PR! Maybe @olehermanse can review this?
@amousset In our setup we also have a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request.
9f53da7 to 7bd2018
@olehermanse I added the error detection mechanism suggested in the Jira issue.
@cf-bottom jenkins with exotics, please.
Thanks, @amousset , this looks promising.
@basvandervlies Did you have a chance to test this yet?
Alright, I triggered a build: (with exotics)
Looks good to me otherwise. Thanks for working on this! Opening new SSL connections over and over on the hub really is expensive.
Yes, just did. For me it's 6 seconds faster on the whole run: 25 --> 19 seconds.
libcfnet/conn_cache.c
Outdated
@@ -128,6 +128,22 @@ AgentConnection *ConnCache_FindIdleMarkBusy(const char *server,
{
assert(svp->status == CONNCACHE_STATUS_IDLE);

// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);
No space after sizeof.
libcfnet/conn_cache.c
Outdated
// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
Curly brace should be on separate line.
libcfnet/conn_cache.c
Outdated
socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but could not get socket status, skipping.",
Don't break the string literal like this; it makes the message harder to search for. Move "FindIdle:" down and into the same string literal as the rest of the message. (It's okay to go over 80 characters.)
Also, since this log message contains the name of the C function, it looks like it's aimed at a C developer / someone looking at the C code. I think LOG_LEVEL_DEBUG is more appropriate.
libcfnet/conn_cache.c
Outdated
}
if (error) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but connection is broken, skipping.",
Same as above: keep the string literal in one piece and use the debug log level.
libcfnet/conn_cache.c
Outdated
server);
continue;
}
if (error) {
if (error != 0)
{
Curly brace on separate line, and we prefer explicit comparisons, except for bools with a clear name.
Currently the connection cache is reset after each bundle pass. This limits its effectiveness, as not all policies group file copies in the same bundle pass. This was apparently done to limit the risk of reusing broken connections. This commit keeps a single connection cache for the whole agent run, and adds an error detection mechanism to avoid reusing broken cached connections.
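For readers who want to see the idea in isolation, here is a minimal standalone sketch of the error detection described above: query `SO_ERROR` with `getsockopt()` before handing out an idle cached connection, and skip entries whose socket reports a problem. The names `SocketLooksHealthy`, `CacheEntry`, `CacheStatus` and `FindIdleMarkBusy` are hypothetical and only loosely modelled on the diff; this is not the actual libcfnet code, and it leaves out server matching, locking and logging.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of libcfnet: returns true if the socket
 * has no pending error according to SO_ERROR. */
static bool SocketLooksHealthy(int sd)
{
    int error = 0;
    socklen_t len = sizeof(error);

    if (getsockopt(sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0)
    {
        /* Could not even query the socket (e.g. bad descriptor). */
        return false;
    }
    return (error == 0);
}

/* Hypothetical, simplified stand-ins for the cache entry types. */
typedef enum
{
    CACHE_IDLE,
    CACHE_BUSY
} CacheStatus;

typedef struct
{
    int sd;
    CacheStatus status;
} CacheEntry;

/* Returns the index of the first idle entry whose socket looks healthy,
 * after marking it busy, or -1 if there is none. */
static int FindIdleMarkBusy(CacheEntry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].status != CACHE_IDLE)
        {
            continue;
        }
        if (!SocketLooksHealthy(entries[i].sd))
        {
            /* Broken or unqueryable socket: skip it instead of reusing it. */
            continue;
        }
        entries[i].status = CACHE_BUSY;
        return (int) i;
    }
    return -1;
}
```

Note that `SO_ERROR` only reports a pending error on the socket; a connection that the peer closed cleanly may still pass this check, so the caller presumably still needs to handle a failure on first use.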
7bd2018 to 12b51e3
Updated!
@cf-bottom jenkins with exotics, please
Sure, I triggered a build: (with exotics)
Seems to be an unrelated failure in deployment tests caused by network issues in the clouds.
Looks good to me, thanks!
Currently the connection cache is destroyed after each bundle pass. In our usage of the agent (where we can have quite a lot of file copies all in different bundles), this leads to slower execution time and an increased load on the server (with up to tens of connections created and destroyed during a single run).
This patch naively activates the connection cache at the agent run level. This appears to have been held back (https://tracker.mender.io/browse/CFE-2678) because of https://tracker.mender.io/browse/CFE-2511, and waiting for a way to avoid reusing broken connections.
I could work on this, as we really want to have a working connection cache. Does the solution proposed in the CFE-2511 description suit you, and would it allow reusing the connection cache for a whole agent run?