VBE.*.conn (concurrent connections to backend) not working as expected #2011

Closed
carlosabalde opened this Issue Jul 8, 2016 · 4 comments

Projects

None yet

4 participants

@carlosabalde

Expected Behavior

VBE.*.conn (concurrent connections to backend) should behave as a gauge. That's the expected behaviour according with varnishstat:

$ varnishstat -j
...
"VBE.5756f01a-bb7c-441c-9285-3a383cb41b96.some_be.conn": {
"description": "Concurrent connections to backend",
"type": "VBE", "ident": "5756f01a-bb7c-441c-9285-3a383cb41b96.some_be", "flag": "g", "format": "i",
"value": 69509
},
"VBE.5756f01a-bb7c-441c-9285-3a383cb41b96.some_be.req": {
"description": "Backend requests sent",
"type": "VBE", "ident": "5756f01a-bb7c-441c-9285-3a383cb41b96.some_be", "flag": "c", "format": "i",
"value": 69509
},
...

Current Behavior

VBE.*.conn behaves as a counter, always containing the same value as VBE.*.req.

Possible Solution

I've been checking the implementation and it seems the value of VBE.*.conn is never decremented. Its value is only incremented in vbe_dir_getfd() (bin/varnishd/cache/cache_backend.c):

...
Lck_Lock(&bp->mtx);
bp->n_conn++;
bp->vsc->conn++;
bp->vsc->req++;
Lck_Unlock(&bp->mtx);
...

In fact that's the only place where VBE.*.req is incremented too. That explains why their values are always identical in the varnishstat output. I guess that bp->vsc->conn should be decremented somewhere else in order to behave as a gauge.

Steps to Reproduce (for bugs)

Nothing special. Simply execute some requests triggering backend fetches and check the varnishstat output.

Context

At the moment this results in some items containing weird values in Zabbix.

Your Environment

  • Version used: varnishd (varnish-plus-4.1.2r2 revision b25a645)
  • Operating System and version: Ubuntu Trusty
@Dridi
Member
Dridi commented Jul 8, 2016

I haven't verified yet, but I'm sure this bug was introduced in 4.1 with dynamic backends, so it's probably still the case in current master. So for now I added both 4.1 and master labels on this issue.

@Dridi
Member
Dridi commented Jul 8, 2016

This should be enough:

diff --git i/bin/varnishd/cache/cache_backend.c w/bin/varnishd/cache/cache_backend.c
index 14e5c04..55b2a13 100644
--- i/bin/varnishd/cache/cache_backend.c
+++ w/bin/varnishd/cache/cache_backend.c
@@ -170,6 +170,7 @@ vbe_dir_finish(const struct director *d, struct worker *wrk,
        }
        assert(bp->n_conn > 0);
        bp->n_conn--;
+       bp->vsc->conn--;
 #define ACCT(foo)      bp->vsc->foo += bo->acct.foo;
 #include "tbl/acct_fields_bereq.h"
 #undef ACCT
@Dridi Dridi added a commit that closed this issue Jul 8, 2016
@Dridi Dridi Decrement the VBE.*.conn gauge
Fixes #2011
797333e
@Dridi Dridi closed this in 797333e Jul 8, 2016
@Dridi Dridi added a commit that referenced this issue Jul 8, 2016
@Dridi Dridi Decrement the VBE.*.conn gauge
Fixes #2011
36ce687
@Dridi Dridi added a commit that referenced this issue Jul 8, 2016
@Dridi Dridi Add #2011 to the changelog 1d73aa7
@Dridi Dridi self-assigned this Jul 8, 2016
@Dridi
Member
Dridi commented Jul 8, 2016

In the 4.0 days, backends were shared by VCLs (same name, same endpoint) and each backend owned its TCP connection pool.

Since 4.1, backends are owned by VCLs (no more name collision) and they share one connection pool per endpoint (same IPv4, same IPv6, same service/port). As a result the connection management from the backend's point of view is rather simple for HTTP/1, where request is almost synonymous to session. Once a request is done, the connection goes back to the pool.

So the number of connections is incremented and decremented in exactly one place as far as backends are concerned. The smart bits happen in the connection pool.

I pushed the fix in both master and 4.1 branches.

@hermunn
Contributor
hermunn commented Jul 8, 2016

Backport review: Dridi has completed the backport, removing label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment