Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collectd: write_http: support concurrent upload #1639

Closed
wants to merge 1 commit into from

Conversation

skinowski
Copy link
Contributor

Switched write_http to use thread specific data to store curl
easy handle / content to be able to utilitze collectd write
threads. Following issues are fixed:

  1. Previously, single mutex was locked which reduced
    concurrency to single thread despite multiple writer threads.
  2. This same mutex was kept locked during network I/O. Especially
    with FQDN DNS resolution performed while signals disabled in libcurl,
    this can lock entire collectd writer threads. This still is a problem
    since it's libcurl/libc related, but now DNS/network hiccups will only
    impact one thread. Avoiding DNS and using IP addresses along with a
    configured network "Timeout", unbounded stalls can be avoided.
  3. Global config versus I/O context specific data is now separated.
  4. curl_slist_append() overwrote previous pointer which would leak if
    any curl_slist_append() failed since it returns NULL in case of errors.
  5. Improved error handling for curl_slist_append and curl_easy_setopt
    failures, previous code did not check error conditions.

Switched write_http to use thread specific data to store curl
easy handle / content to be able to utilitze collectd write
threads. Following issues are fixed:

1) Previously, single mutex was locked which reduced
concurrency to single thread despite multiple writer threads.
2) This same mutex was kept locked during network I/O. Especially
with FQDN DNS resolution performed while signals disabled in libcurl,
this can lock entire collectd writer threads. This still is a problem
since it's libcurl/libc related, but now DNS/network hiccups will only
impact one thread. Avoiding DNS and using IP addresses along with a
configured network "Timeout", unbounded stalls can be avoided.
3) Global config versus I/O context specific data is now separated.
4) curl_slist_append() overwrote previous pointer which would leak if
any curl_slist_append() failed since it returns NULL in case of errors.
5) Improved error handling for curl_slist_append and curl_easy_setopt
failures, previous code did not check error conditions.
@skinowski
Copy link
Contributor Author

This is a large rewrite and I'm not expecting this to be merged without some discussion/feedback. I'd appreciate your feedback/questions.

@skinowski
Copy link
Contributor Author

OpenSSL thread-safety lock callbacks are missing in this change.

{
wh_ctx_t *ctx = data;
if (ctx == NULL)
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

annoyingly as it is, this file is indented 8 spaces. Can you do the same here?

@BrandonArp
Copy link
Contributor

@skinowski There is a little more work to do in the PR since adding the headers feature to write_http. I've created a patch that essentially is a rebase on top of master (and then applying the new functionality). Feel free to take a look at it and grab what you need: https://github.com/BrandonArp/collectd/tree/write_http_rewrite_rebase

@skinowski
Copy link
Contributor Author

Looking at the thread-safety issue, I think this change needs to be more involved than just write_http itself. At first glance, configure scripts need to detect which SSL libs are in use by various modules and perhaps expose public functions to initialize various SSL libraries only once so that all modules (nginx, curl_xml, curl_json, curl, bind, ascent, apache, etc.) could safely initialize SSL libraries in future. This would open the door for all such modules to become truly multi-threaded.

Besides GnuTLS, OpenSSL and NSS, libcurl could link to other SSL libs (PolarSSL, yassl, axTLS.) In order not to break things, I don't think we can merge this PR before all these possibilities are covered and thread-safe:

https://curl.haxx.se/libcurl/c/threadsafe.html

But the issue may be trickier than it seems, as for instance, openLDAP could also link to openSSL and it will initialize thread-safety callbacks upon ldap_initialize(). If both write_http and openLDAP is enabled, then they will both potentially try to set openSSL thread-safety callbacks depending on the platform and underlying libraries. Like openLDAP, some libraries do not expose their underlying SSL implementation and do not offer the flexibility that libcurl offers with regards to initialization.

What is your opinion on this? Should we abandon this change and avoid this complexity and any related potential issues?

@skinowski
Copy link
Contributor Author

It's probably much safer to re-implement this module using libcurl's multi interface and keep things single threaded. Using multi interface, that a single I/O thread can perform multiple simultaneous transfers.

@dago
Copy link
Contributor

dago commented Feb 25, 2020

I will close this PR for now taking your last comment into account:

Re-implement this module using libcurl's multi interface and keep things single threaded. Using multi interface, that a single I/O thread can perform multiple simultaneous transfers.

@dago dago closed this Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants