CFE-2678: Global connection cache #3515

Merged
merged 1 commit into from Mar 8, 2019

@amousset
Contributor

amousset commented Feb 20, 2019

Currently the connection cache is destroyed after each bundle pass. In our usage of the agent (where we can have quite a lot of file copies all in different bundles), this leads to slower execution time and an increased load on the server (with up to tens of connections created and destroyed during a single run).

This patch naively activates the connection cache at the agent run level. This appears to have been held back (https://tracker.mender.io/browse/CFE-2678) because of https://tracker.mender.io/browse/CFE-2511, pending a way to avoid reusing broken connections.

I could work on this, as we really want to have a working connection cache. Does the solution proposed in the CFE-2511 description suit you, and would it allow reusing the connection cache for a whole agent run?
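To illustrate why the cache lifetime matters, here is a toy model (not CFEngine code) that counts how many connections get opened when the cache is reset after every bundle versus kept for the whole run. All names (`ToyCache`, `ToyRun`, `hub.example.com`) are hypothetical, and the model assumes each bundle performs one copy from the same server.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_CACHE_MAX 16

/* Hypothetical toy cache: remembers which servers have an open connection. */
typedef struct
{
    const char *servers[TOY_CACHE_MAX];
    int count;  /* number of currently cached connections */
    int opened; /* total number of connections opened so far */
} ToyCache;

/* Forget all cached connections but keep the running total. */
void ToyCache_Reset(ToyCache *cache)
{
    cache->count = 0;
}

/* Reuse a cached connection if one exists, otherwise "open" a new one. */
void ToyCache_Connect(ToyCache *cache, const char *server)
{
    for (int i = 0; i < cache->count; i++)
    {
        if (strcmp(cache->servers[i], server) == 0)
        {
            return; /* cache hit, nothing new is opened */
        }
    }
    assert(cache->count < TOY_CACHE_MAX);
    cache->servers[cache->count++] = server;
    cache->opened++;
}

/* Simulate a run of `bundles` bundles, each copying from the same server.
 * Returns how many connections were opened in total. */
int ToyRun(int bundles, bool reset_per_bundle)
{
    ToyCache cache = { .count = 0, .opened = 0 };
    for (int b = 0; b < bundles; b++)
    {
        ToyCache_Connect(&cache, "hub.example.com");
        if (reset_per_bundle)
        {
            ToyCache_Reset(&cache);
        }
    }
    return cache.opened;
}
```

In this model, `ToyRun(10, true)` opens ten connections while `ToyRun(10, false)` opens one, which is the effect the patch aims for at the scale of a real agent run.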

@cf-bottom

cf-bottom commented Feb 20, 2019

Thank you for submitting a PR! Maybe @olehermanse can review this?

@basvandervlies

Contributor

basvandervlies commented Feb 20, 2019

@amousset In our setup we also have a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request.

@amousset amousset force-pushed the amousset:enable-conn-cache branch from 9f53da7 to 7bd2018 Mar 4, 2019
@amousset amousset marked this pull request as ready for review Mar 4, 2019
@amousset

Contributor Author

amousset commented Mar 4, 2019

@olehermanse I added the error detection mechanism suggested in the Jira issue.

@olehermanse olehermanse requested review from olehermanse and vpodzime Mar 4, 2019
@olehermanse

Member

olehermanse commented Mar 4, 2019

@cf-bottom jenkins with exotics, please.

@olehermanse

Member

olehermanse commented Mar 4, 2019

Thanks, @amousset , this looks promising.

@olehermanse

Member

olehermanse commented Mar 4, 2019

> @amousset In our setup we also have a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request.

@basvandervlies Did you have a chance to test this yet?

@cf-bottom

cf-bottom commented Mar 4, 2019

Alright, I triggered a build:

Build Status

(with exotics)

https://ci.cfengine.com/job/pr-pipeline/2099/

Contributor

vpodzime left a comment

Looks good to me otherwise. Thanks for working on this! Opening new SSL connections over and over on the hub really is expensive.

libcfnet/conn_cache.c
@basvandervlies

Contributor

basvandervlies commented Mar 5, 2019

> > @amousset In our setup we also have a lot of copies in different bundles. I will test it in our framework. Thanks for the pull request.
>
> @basvandervlies Did you have a chance to test this yet?

Yes, just did. For me it is 6 seconds faster over the whole run: 25 --> 19 seconds.

@@ -128,6 +128,22 @@ AgentConnection *ConnCache_FindIdleMarkBusy(const char *server,
{
assert(svp->status == CONNCACHE_STATUS_IDLE);

// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);

@olehermanse

olehermanse Mar 6, 2019

Member

no space after sizeof

// Check connection state before returning it
int error = 0;
socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {

@olehermanse

olehermanse Mar 6, 2019

Member

Curly brace should be on separate line.

socklen_t len = sizeof (error);
if (getsockopt(svp->conn->conn_info->sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but could not get socket status, skipping.",

@olehermanse

olehermanse Mar 6, 2019

Member

Don't break the string literal like this; it makes it harder to search for. Move "FindIdle:" down and into the same string literal as the rest of the message. (It's okay to go over 80 characters.)

Also, since this log message has the name of the C function, it looks like it's aimed at a C developer / someone looking at the C code. I think LOG_LEVEL_DEBUG is more appropriate.

}
if (error) {
Log(LOG_LEVEL_VERBOSE, "FindIdle:"
" found connection to '%s' but connection is broken, skipping.",

@olehermanse

olehermanse Mar 6, 2019

Member

Same as above: keep the string literal in one piece and use the debug log level.

server);
continue;
}
if (error) {

@olehermanse

olehermanse Mar 6, 2019

Member
if (error != 0)
{

Curly brace on separate line, and we prefer explicit comparisons, except for bools with a clear name.

Currently the connection cache is reset after each bundle pass.
This limits its effectiveness, as not all policies group file copies
in the same bundle pass. This was apparently done to limit the risk
of reusing broken connections.

This commit keeps a single connection cache for the whole agent run,
and adds an error detection mechanism to avoid reusing broken cached
connections.
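
The error detection described above can be sketched in isolation as follows. This is a minimal illustration, not the merged code: `SocketLooksHealthy` is a hypothetical helper that does not exist in libcfnet, and it only wraps the `getsockopt(..., SO_ERROR, ...)` check shown in the diff, written in the style requested during review (explicit comparison, braces on their own line, no space after `sizeof`).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of libcfnet): returns true if the socket
 * descriptor can be queried and has no pending error recorded on it. */
bool SocketLooksHealthy(int sd)
{
    int error = 0;
    socklen_t len = sizeof(error);

    if (getsockopt(sd, SOL_SOCKET, SO_ERROR, &error, &len) < 0)
    {
        /* The socket status could not be read at all: treat as unusable. */
        return false;
    }
    /* SO_ERROR is an int option, so the kernel should report that size. */
    assert(len == sizeof(error));

    if (error != 0)
    {
        /* A pending error was recorded on the socket: treat as broken. */
        return false;
    }
    return true;
}
```

In a cache lookup loop, an idle entry failing this check would be skipped with `continue`, as in the diff above. Note that `SO_ERROR` only reports a pending error on the socket; a peer that closed the connection cleanly may not be detected this way, so this is a best-effort check.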
@amousset amousset force-pushed the amousset:enable-conn-cache branch from 7bd2018 to 12b51e3 Mar 7, 2019
@amousset

Contributor Author

amousset commented Mar 7, 2019

Updated!

@vpodzime

Contributor

vpodzime commented Mar 7, 2019

@cf-bottom jenkins with exotics, please

@cf-bottom

cf-bottom commented Mar 7, 2019

Sure, I triggered a build:

Build Status

(with exotics)

https://ci.cfengine.com/job/pr-pipeline/2134/

@vpodzime

Contributor

vpodzime commented Mar 7, 2019

> Build Status
>
> (with exotics)
>
> https://ci.cfengine.com/job/pr-pipeline/2134/

Seems to be an unrelated failure in deployment tests caused by network issues in the clouds.

Contributor

vpodzime left a comment

Looks good to me, thanks!

@vpodzime vpodzime requested a review from olehermanse Mar 8, 2019
@vpodzime vpodzime merged commit 0380930 into cfengine:master Mar 8, 2019
22 of 25 checks passed
codecov/changes 6 files have unexpected coverage changes not visible in diff.
codecov/patch 60% of diff hit (target 60.99%)
codecov/project 60.96% (-0.04%) compared to 79be22b
LGTM analysis: C/C++ No new or fixed alerts
LGTM analysis: Python No code changes detected
ci/testing_pr/PACKAGES_HUB_x86_64_linux_debian_6 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_debian_7 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_redhat_6 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_redhat_7 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_ubuntu_12 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_ubuntu_14 Build and tests finished: success
ci/testing_pr/PACKAGES_HUB_x86_64_linux_ubuntu_16 Build and tests finished: success
ci/testing_pr/PACKAGES_i386_linux_debian_4 Build and tests finished: success
ci/testing_pr/PACKAGES_i386_linux_redhat_4 Build and tests finished: success
ci/testing_pr/PACKAGES_i386_mingw Build and tests finished: success
ci/testing_pr/PACKAGES_ia64_hpux_11.23 Build and tests finished: success
ci/testing_pr/PACKAGES_ppc64_aix_53 Build and tests finished: success
ci/testing_pr/PACKAGES_sparc64_solaris_11 Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_linux_debian_4 Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_linux_debian_7 Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_linux_redhat_4 Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_linux_redhat_6 Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_mingw Build and tests finished: success
ci/testing_pr/PACKAGES_x86_64_solaris_10 Build and tests finished: success
continuous-integration/travis-ci/pr The Travis CI build passed