Added eloquent error message for MODCLUSTER-448. #140
Conversation
Triggering build using a merge of 4335bf7 on branch master:
Build 137 outcome was SUCCESS using a merge of 4335bf7 on branch master:
This is the offending line causing the core: apr_pool_destroy(worker->cp->pool); According to the behaviour described in BZ 56314, could it be that we simply can't destroy the pool at this point for this worker, because the other worker is allocated from it and is still accepting requests? IMHO it would be best either…
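For illustration, a minimal sketch of the failure mode under discussion, assuming the surrounding control flow; only the apr_pool_destroy() call itself is quoted from the source:

/* Sketch: destroying the worker's connection pool while another worker
 * allocated from it may still be serving requests. Any connection still
 * checked out keeps pointing into the destroyed pool, which is the
 * use-after-free that produces the core. */
if (worker->cp->pool) {
    apr_pool_destroy(worker->cp->pool); /* the offending line */
    worker->cp->pool = NULL;            /* assumed cleanup, not quoted from the PR */
}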
(For my record)
The aforementioned snippet, inserted just before the offending apr_pool_destroy(worker->cp->pool);, returns:
Bewildering. Note:
@@ -55,4 +55,32 @@ int (* proxy_node_isup)(request_rec *r, int id, int load);
int (* proxy_host_isup)(request_rec *r, char *scheme, char *host, char *port);
};
typedef struct balancer_method balancer_method;

/* From apr_reslist.c */
It is consistent between httpd 2.4.6 and httpd 2.2.26. Is there a better way to avoid dereferencing a pointer to an incomplete type?
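For context, the workaround in this patch copies the private definition out of apr-util. A rough sketch of what gets duplicated, where the field layout is an assumption based on apr-util 1.x sources and only ntotal and nidle are actually used by the check:

/* apr_reslist.h only forward-declares apr_reslist_t; the full struct
 * lives in apr_reslist.c, so member access requires copying it: */
struct apr_reslist_t {
    apr_pool_t *pool; /* the pool used in constructor and destructor calls */
    int ntotal;       /* total number of resources managed by this list */
    int nidle;        /* number of available (idle) resources */
    /* ... remaining fields (min, smax, hmax, ttl, timeout, locks) omitted;
     * a copy like this must match the target apr-util version exactly,
     * which is the fragility discussed below. */
};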
Triggering build using a merge of 7907693 on branch master:
if (!worker->opaque)
    worker->opaque = (proxy_cluster_helper *) apr_pcalloc(conf->pool, sizeof(proxy_cluster_helper));
if (!worker->opaque)
    return APR_EGENERAL;
if (worker->cp->pool) {
    /* destroy and create a new one */
    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, server,
                 "Reslist stats: worker->cp->res->ntotal:%d, worker->cp->res->nidle:%d",
                 worker->cp->res->ntotal, worker->cp->res->nidle);
    /* MODCLUSTER-448 */
    if (worker->cp->res->ntotal != worker->cp->res->nidle) {
If the number of idle resources differs from the total, we can't simply destroy the pool for this worker at this time; it will be caught during the next iteration. This whole code branch is a mere corner case: it only takes place when one tries to connect two workers with the same host and port at the same time. If you just remove and add one worker, you will never get ntotal != nidle.
@jfclere WDYT?
Build 138 outcome was SUCCESS using a merge of 7907693 on branch master:
@@ -101,6 +104,33 @@ static int enable_options = -1; /* Use OPTIONS * for CPING/CPONG */
#define TIMESESSIONID 300 /* after 5 minutes the sessionid have probably timeout */
#define TIMEDOMAIN 300 /* after 5 minutes the sessionid have probably timeout */

/* From apr_reslist.c */
A declaration is not enough :( We need the full definition for our check on ntotal and nidle...
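A minimal illustration of the compiler's complaint (hypothetical snippet, not from the patch):

/* The public header provides only an opaque forward declaration: */
typedef struct apr_reslist_t apr_reslist_t;

/* Any member access through such a pointer fails to compile:
 *     res->ntotal;
 * => error: dereferencing pointer to incomplete type 'struct apr_reslist_t'
 * hence the struct definition itself has to be pulled in from apr_reslist.c. */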
Triggering build using a merge of 5398900 on branch master:
Build 139 outcome was SUCCESS using a merge of 5398900 on branch master:
Try with apr_reslist_acquired_count()
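That would sidestep the copied struct entirely. A minimal sketch, assuming apr-util >= 1.3 where apr_reslist_acquired_count() is available; the helper name is hypothetical:

#include <apr_reslist.h>

/* apr_reslist_acquired_count() reports how many resources are currently
 * checked out, which is exactly the ntotal - nidle difference the patch
 * computes via the copied struct definition. */
static int pool_destroy_is_safe(apr_reslist_t *res)
{
    return apr_reslist_acquired_count(res) == 0;
}

With such a helper the MODCLUSTER-448 guard reduces to a public-API call and the re-declared struct could be dropped.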
Note that in the aforementioned snippet I ran the preprocessor on our code, first with the httpd-2.2.26 apxs and then with the httpd-2.4.6 apxs, looking for differences in the resulting output. I wonder whether the httpd-2.2.26 build isn't missing something. It would be best if…
I'm gonna revisit this beauty and try to implement it without re-defining apr_reslist_t.
@jfclere Gonna do it differently and port to 1.3.x, 1.2.x and 2.0 (master).
Description
If a worker registers with the balancer, doesn't exist at the time, and yet shares its url with another worker returned by ap_proxy_get_worker, we log the following error message:

In the aforementioned case, there were two workers with different jvmRoutes on two virtual machines, each reporting its hostname and port as 127.0.0.1:8009.

Tested