Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE-2678: Global connection cache #3515

Merged
merged 1 commit into from
Mar 8, 2019

Conversation

amousset
Copy link
Contributor

Currently the connection cache is destroyed after each bundle pass. In our usage of the agent (where we can have quite a lot of file copies all in different bundles), this leads to slower execution time and an increased load on the server (with up to tens of connections created and destroyed during a single run).

This patch naively activates the connection cache at the agent run level. This appears to have been held back (https://tracker.mender.io/browse/CFE-2678) because of https://tracker.mender.io/browse/CFE-2511, and waiting for a way to avoid reusing broken connections.

I could work on this as we really want to have a working connection cache. Does the solution proposed in CFE-2511 description suits you, and would allow reusing the connection cache for a whole agent run?

@cf-bottom
Copy link

Thank you for submitting a PR! Maybe @olehermanse can review this?

@basvandervlies
Copy link
Contributor

@amousset In our setup we have also a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request

@amousset amousset marked this pull request as ready for review March 4, 2019 10:44
@amousset
Copy link
Contributor Author

amousset commented Mar 4, 2019

@olehermanse I added the error detection mechanism suggested in the Jira issue.

@olehermanse
Copy link
Member

@cf-bottom jenkins with exotics, please.

@olehermanse
Copy link
Member

Thanks, @amousset , this looks promising.

@olehermanse
Copy link
Member

@amousset In our setup we have also a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request

@basvandervlies Did you have a chance to test this yet?

@cf-bottom
Copy link

Alright, I triggered a build:

Build Status

(with exotics)

https://ci.cfengine.com/job/pr-pipeline/2099/

Copy link
Contributor

@vpodzime vpodzime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me otherwise. Thanks for working on this! Opening new SSL connections over and over on the hub really is expensive.

libcfnet/conn_cache.c Show resolved Hide resolved
@basvandervlies
Copy link
Contributor

@amousset In our setup we have also a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request

@basvandervlies Did you have a chance to test this yet?

Yes just did. Its for me 6 seconds faster om the whole run. 25 --> 19 seconds.

@@ -128,6 +128,22 @@ AgentConnection *ConnCache_FindIdleMarkBusy(const char *server,
{
assert(svp->status == CONNCACHE_STATUS_IDLE);

// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no space after sizeof

// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curly brace should be on separate line.

socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but could not get socket status, skipping.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't break the string literal like this, it makes it harder to search for. Move "FindIdle:" down and into the same string literal as the rest of the message. (It's okay to go over 80 characters).

Also, since this log message has the name of the C function, it looks like it's aimed at a C developer / someone looking at the C code. I think LOG_LEVEL_DEBUG is more appropriate.

}
if (error) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but connection is broken, skipping.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, string literal in one and debug log level.

server);
continue;
}
if (error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (error != 0)
{

Curly brace on separate line, and we prefer explicit comparisons, except for bools with a clear name.

Currently the connection cache is reset after each bundle pass.
This limits its effectivity, as all policies do not group file copies
in the same bundle pass. This was apprently done to limit the risk
of reusing broken connections.

This commit keeps a unique connection cache for the whole agent run,
and adds an error detection mechanism to avoid reusing broken cached
connections.
@amousset
Copy link
Contributor Author

amousset commented Mar 7, 2019

Updated!

@vpodzime
Copy link
Contributor

vpodzime commented Mar 7, 2019

@cf-bottom jenkins with exotics, please

@cf-bottom
Copy link

Sure, I triggered a build:

Build Status

(with exotics)

https://ci.cfengine.com/job/pr-pipeline/2134/

@vpodzime
Copy link
Contributor

vpodzime commented Mar 7, 2019

Build Status

(with exotics)

https://ci.cfengine.com/job/pr-pipeline/2134/

Seems to be an unrelated failure in deployment tests caused by network issues in the clouds.

Copy link
Contributor

@vpodzime vpodzime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks!

@vpodzime vpodzime merged commit 0380930 into cfengine:master Mar 8, 2019
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants