Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Condition((vfc->vfp_nxt) != 0) not true in vbf_stp_fetchbody() #2645

Closed
nigoroll opened this issue Apr 11, 2018 · 5 comments
Closed

Condition((vfc->vfp_nxt) != 0) not true in vbf_stp_fetchbody() #2645

nigoroll opened this issue Apr 11, 2018 · 5 comments

Comments

@nigoroll
Copy link
Member

Panic at: Wed, 11 Apr 2018 03:37:23 GMT
Assert error in vbf_stp_fetchbody(), cache/cache_fetch.c line 438:
  Condition((vfc->vfp_nxt) != 0) not true.
version = varnish-trunk revision
5a239cbc3a6def944712f3ad063076cb14724acf, vrt api = 7.0
ident =
Linux,3.13.0-137-generic,x86_64,-junix,-smalloc,-sdefault,-hcritbit,epoll
now = 6699399.655359 (mono), 1523417788.809868 (real)
Backtrace:
  0x438f5b: /usr/sbin/varnishd() [0x438f5b]
  0x494d32: /usr/sbin/varnishd(VAS_Fail+0x42) [0x494d32]
  0x429889: /usr/sbin/varnishd() [0x429889]
  0x45525e: /usr/sbin/varnishd() [0x45525e]
  0x4556a0: /usr/sbin/varnishd() [0x4556a0]
  0x7f4c3a06f184: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)
[0x7f4c3a06f184]
  0x7f4c39d9c03d: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)
[0x7f4c39d9c03d]
thread = (cache-worker)
thr.req = (nil) {
},
thr.busyobj = 0x7f490cb4b020 {
  ws = 0x7f490cb4b058 {
    id = \"bo\",
    {s, f, r, e} = {0x7f490cb545b0, +59968, (nil), +59976},
  },
  retries = 3, failed = 1, flags = {do_stream, do_pass, uncacheable,
is_gunzip},
  http_conn = 0x7f490cb5f7e8 {
    fd = 131 (@0x7f46241c8be4),
    doclose = REQ_CLOSE,
    ws = 0x7f490cb4b058 {
      [Already dumped, see above]
    },
    {rxbuf_b, rxbuf_e} = {0x7f490cb5f848, 0x7f490cb5f90a},
    {pipeline_b, pipeline_e} = {0x7f490cb5f90a, 0x7f490cb62fb2},
    content_length = -1,
    body_status = chunked,
    first_byte_timeout = 60.000000,
    between_bytes_timeout = 60.000000,
  },
  director_req = 0x7f46dfc22ac0 {
    vcl_name = *redacted*_shard,
    type = (null) {
    },
  },
  director_resp = 0x7f46e54fce90 {
    vcl_name = *redacted*,
    type = backend {
      display_name = *redacted*,
      ipv4 = *redacted*,
      port = 80,
      hosthdr = *redacted*,
      health = healthy,
      admin_health = probe, changed = 1523278050.012533,
      n_conn = 1,
    },
  },
  http[bereq] = 0x7f490cb4b840 {
    ws = 0x7f490cb4b058 {
      [Already dumped, see above]
    },
    hdrs {
      \"GET\", *redacted*
    },
  },
  http[beresp] = 0x7f490cb4bed8 {
    ws = 0x7f490cb4b058 {
      [Already dumped, see above]
    },
    hdrs {
      \"HTTP/1.1\",
      \"500\",
      \"Internal Server Error\",
      \"Date: Wed, 11 Apr 2018 03:36:28 GMT\",
      \"Server: Apache/2.4.18 (Ubuntu)\",
      \"Content-Type: text/html;charset=utf-8\",
      \"Content-Language: en\",
      \"Connection: close\",
      \"Transfer-Encoding: chunked\",
    },
  },
  objcore[fetch] = 0x7f44be0a2540 {
    refcnt = 1,
    flags = {pass, private},
    exp_flags = {},
    boc = 0x7f478cd7cc80 {
      refcnt = 1,
      state = stream,
      vary = (nil),
      stevedore_priv = (nil),
    },
    exp = {1523417788.809763, 0.000000, 0.000000, 0.000000},
    objhead = 0x7f4c394d7080,
    stevedore = 0x7f4c39428480 (malloc Transient) {
      Simple = 0x7f4758137040,
      Obj = 0x7f4501983838 {priv=0x7f4501983830, ptr=0x7f4758137040,
len=288, space=288},
      LEN = 0x0...0,
      VXID = 0x2f3c35ca,
      FLAGS = 0x00,
      GZIPBITS = 0x0...0,
      LASTMODIFIED = 0x41d6b360af000000,
      VARY = {len=0, ptr=(nil)},
      HEADERS = {len=168, ptr=0x7f47581370b8},
    },
  },
  vcl = {
    name = *redacted*,
    busy = 82,
    discard = 0,
    state = auto,
    temp = warm,
    conf = {
      srcname = { *redacted*    },
    },
  },
  vmods = {
    std = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    directors = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf,
0.0},
    bodyaccess = {Varnish trunk
5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    re = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
    re2 = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    dispatch = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf,
0.0},
    dcs = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    var = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
    vtc = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    header = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
  },
},

@nigoroll
Copy link
Member Author

nigoroll commented Apr 11, 2018

possible clues:

  • ws almost full, but not yet overflowed:
    id = \"bo\",
    {s, f, r, e} = {0x7f490cb545b0, +59968, (nil), +59976},
  • We had a FetchError (VFP_Error called)
    failed = 1
  • possibly all of the response piplelined
    {pipeline_b, pipeline_e} = {0x7f490cb5f90a, 0x7f490cb62fb2},

nigoroll added a commit that referenced this issue Apr 11, 2018
Ref #2645 but cannot be the cause because workspace is just not
overflowed
@nigoroll
Copy link
Member Author

If the workspace was overflowed, I would believe to understand this issue, but as it's not, I don't

@nigoroll
Copy link
Member Author

Has happened again. The panic output is almost identical, but this time the workspace was exactly full yet still not overflown (previously, we had 8 bytes left).

Panic at: Sat, 14 Apr 2018 00:27:59 GMT
Assert error in vbf_stp_fetchbody(), cache/cache_fetch.c line 438:
  Condition((vfc->vfp_nxt) != 0) not true.
version = varnish-trunk revision 5a239cbc3a6def944712f3ad063076cb14724acf, vrt api = 7.0
ident = Linux,3.13.0-137-generic,x86_64,-junix,-smalloc,-sdefault,-hcritbit,epoll
now = 3322133.863102 (mono), 1523665619.406965 (real)
Backtrace:
  0x438f5b: /usr/sbin/varnishd() [0x438f5b]
  0x494d32: /usr/sbin/varnishd(VAS_Fail+0x42) [0x494d32]
  0x429889: /usr/sbin/varnishd() [0x429889]
  0x45525e: /usr/sbin/varnishd() [0x45525e]
  0x4556a0: /usr/sbin/varnishd() [0x4556a0]
  0x7f6a513d0184: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f6a513d0184]
  0x7f6a510fd03d: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6a510fd03d]
thread = (cache-worker)
thr.req = (nil) {
},
thr.busyobj = 0x7f6668084020 {
  ws = 0x7f6668084058 {
    id = \"bo\",
    {s, f, r, e} = {0x7f666808d5b0, +59976, (nil), +59976},
  },
  retries = 3, failed = 1, flags = {do_stream, do_pass, uncacheable, is_gunzip},
  http_conn = 0x7f666809b428 {
    fd = 90 (@0x7f61c590c0b4),
    doclose = REQ_CLOSE,
    ws = 0x7f6668084058 {
      [Already dumped, see above]
    },
    {rxbuf_b, rxbuf_e} = {0x7f666809b488, 0x7f666809b54a},
    {pipeline_b, pipeline_e} = {0x7f666809b54a, 0x7f666809bfd8},
    content_length = -1,
    body_status = chunked,
    first_byte_timeout = 60.000000,
    between_bytes_timeout = 60.000000,
  },
  director_req = 0x7f66dfd9bd40 {
    vcl_name = *redacted*_shard,
    type = (null) {
    },
  },
  director_resp = 0x7f62a1090710 {
    vcl_name = *redacted*,
    type = backend {
      display_name = *redacted*,
      ipv4 = *redacted*,
      port = 80,
      hosthdr = *redacted*,
      health = healthy,
      admin_health = probe, changed = 1523458612.295022,
      n_conn = 1,
    },
  },
  http[bereq] = 0x7f6668084840 {
    ws = 0x7f6668084058 {
      [Already dumped, see above]
    },
    hdrs {
      \"GET\",
      *redacted*,
    },
  },
  http[beresp] = 0x7f6668084ed8 {
    ws = 0x7f6668084058 {
      [Already dumped, see above]
    },
    hdrs {
      \"HTTP/1.1\",
      \"500\",
      \"Internal Server Error\",
      \"Date: Sat, 14 Apr 2018 00:26:59 GMT\",
      \"Server: Apache/2.4.18 (Ubuntu)\",
      \"Content-Type: text/html;charset=utf-8\",
      \"Content-Language: en\",
      \"Connection: close\",
      \"Transfer-Encoding: chunked\",
    },
  },
  objcore[fetch] = 0x7f650c560d40 {
    refcnt = 1,
    flags = {pass, private},
    exp_flags = {},
    boc = 0x7f64b4e5fc80 {
      refcnt = 1,
      state = stream,
      vary = (nil),
      stevedore_priv = (nil),
    },
    exp = {1523665619.406912, 0.000000, 0.000000, 0.000000},
    objhead = 0x7f6a504d7080,
    stevedore = 0x7f6a50428480 (malloc Transient) {
      Simple = 0x7f62ef4e7080,
      Obj = 0x7f62a942a928 {priv=0x7f62a942a920, ptr=0x7f62ef4e7080, len=288, space=288},
      LEN = 0x0...0,
      VXID = 0x22f88c5c,
      FLAGS = 0x00,
      GZIPBITS = 0x0...0,
      LASTMODIFIED = 0x41d6b452b4c00000,
      VARY = {len=0, ptr=(nil)},
      HEADERS = {len=168, ptr=0x7f62ef4e70f8},
    },
  },
  vcl = {
    name = *redacted*,
    busy = 106,
    discard = 0,
    state = auto,
    temp = warm,
    conf = {
      srcname = {
        *redacted*
      },
    },
  },
  vmods = {
    std = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    directors = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    bodyaccess = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    re = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
    re2 = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    dispatch = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    dcs = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    var = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
    vtc = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 0.0},
    header = {Varnish trunk 5a239cbc3a6def944712f3ad063076cb14724acf, 7.0},
  },
},

@nigoroll
Copy link
Member Author

I'm lost. Before 71c29e8 (on acd9cab) I can easily reproduce this with an out-of-workspace condition, but I absolutely fail to see how we can get to the un-overflowed bo workspace.
Will try to get a current build into production and see if it happens again.

@nigoroll
Copy link
Member Author

for completeness, here's the same panic reproduced on acd9cab with vtc - but with the overflowed ws.

***  v1    1.5 debug|Error: Child (16501) died signal=6 (core dumped)
***  v1    1.5 debug|Error: Child (16501) Panic at: Thu, 19 Apr 2018 07:29:32 GMT
***  v1    1.5 debug|Assert error in vbf_stp_fetchbody(), cache/cache_fetch.c line 442:
***  v1    1.5 debug|  Condition((vfc->vfp_nxt) != 0) not true.
***  v1    1.5 debug|version = varnish-trunk revision acd9cab19f21a70504f13415687b5696f8428eaa, vrt api = 7.0
***  v1    1.5 debug|ident = Linux,4.9.0-6-amd64,x86_64,-jnone,-sdefault,-sdefault,-hcritbit,epoll
***  v1    1.5 debug|now = 4907.133083 (mono), 1524122971.140699 (real)
***  v1    1.5 debug|Backtrace:
***  v1    1.5 debug|  0x55b5c9bb5c47: varnishd(+0x49c47) [0x55b5c9bb5c47]
***  v1    1.5 debug|  0x55b5c9c18830: varnishd(VAS_Fail+0x40) [0x55b5c9c18830]
***  v1    1.5 debug|  0x55b5c9ba4dc8: varnishd(+0x38dc8) [0x55b5c9ba4dc8]
***  v1    1.5 debug|  0x55b5c9bd443c: varnishd(+0x6843c) [0x55b5c9bd443c]
***  v1    1.5 debug|  0x55b5c9bd4900: varnishd(+0x68900) [0x55b5c9bd4900]
***  v1    1.5 debug|  0x7f04c3524494: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f04c3524494]
***  v1    1.5 debug|  0x7f04c3266acf: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f04c3266acf]
***  v1    1.5 debug|thread = (cache-worker)
***  v1    1.5 debug|thr.req = (nil) {
***  v1    1.5 debug|},
***  v1    1.5 debug|thr.busyobj = 0x7f04b8056020 {
***  v1    1.5 debug|  ws = 0x7f04b8056058 {
***  v1    1.5 debug|    OVERFLOWED id = \"Bo\",
***  v1    1.5 debug|    {s, f, r, e} = {0x7f04b8057f50, +57512, (nil), +57512},
***  v1    1.5 debug|  },
***  v1    1.5 debug|  retries = 0, failed = 1, flags = {do_stream, do_pass, uncacheable, is_gunzip},
***  v1    1.5 debug|  http_conn = 0x7f04b8065f48 {
***  v1    1.5 debug|    fd = 28 (@0x7f04b602d084),
***  v1    1.5 debug|    doclose = NULL,
***  v1    1.5 debug|    ws = 0x7f04b8056058 {
***  v1    1.5 debug|      [Already dumped, see above]
***  v1    1.5 debug|    },
***  v1    1.5 debug|    {rxbuf_b, rxbuf_e} = {0x7f04b8065fa8, 0x7f04b8065fd7},
***  v1    1.5 debug|    {pipeline_b, pipeline_e} = {0x7f04b8065fd7, 0x7f04b8065ff3},
***  v1    1.5 debug|    content_length = -1,
***  v1    1.5 debug|    body_status = chunked,
***  v1    1.5 debug|    first_byte_timeout = 60.000000,
***  v1    1.5 debug|    between_bytes_timeout = 60.000000,
***  v1    1.5 debug|  },
***  v1    1.5 debug|  director_req = 0x7f04c29af210 {
***  v1    1.5 debug|    vcl_name = s1,
***  v1    1.5 debug|    type = backend {
***  v1    1.5 debug|      display_name = vcl1.s1,
***  v1    1.5 debug|      ipv4 = 127.0.0.1,
***  v1    1.5 debug|      port = 36315,
***  v1    1.5 debug|      hosthdr = 127.0.0.1,
***  v1    1.5 debug|      health = healthy,
***  v1    1.5 debug|      admin_health = probe, changed = 1524122970.944443,
***  v1    1.5 debug|      n_conn = 1,
***  v1    1.5 debug|    },
***  v1    1.5 debug|  },
***  v1    1.5 debug|  director_resp = director_req,
***  v1    1.5 debug|  http[bereq] = 0x7f04b8056620 {
***  v1    1.5 debug|    ws = 0x7f04b8056058 {
***  v1    1.5 debug|      [Already dumped, see above]
***  v1    1.5 debug|    },
***  v1    1.5 debug|    hdrs {
***  v1    1.5 debug|      \"GET\",
***  v1    1.5 debug|      \"/\",
***  v1    1.5 debug|      \"HTTP/1.1\",
***  v1    1.5 debug|      \"Host: 127.0.0.1\",
***  v1    1.5 debug|      \"X-Forwarded-For: 127.0.0.1\",
***  v1    1.5 debug|      \"X-Varnish: 1089\",
***  v1    1.5 debug|    },
***  v1    1.5 debug|  },
***  v1    1.5 debug|  http[beresp] = 0x7f04b8056a98 {
***  v1    1.5 debug|    ws = 0x7f04b8056058 {
***  v1    1.5 debug|      [Already dumped, see above]
***  v1    1.5 debug|    },
***  v1    1.5 debug|    hdrs {
***  v1    1.5 debug|      \"HTTP/1.1\",
***  v1    1.5 debug|      \"200\",
***  v1    1.5 debug|      \"OK\",
***  v1    1.5 debug|      \"Transfer-encoding: chunked\",
***  v1    1.5 debug|    },
***  v1    1.5 debug|  },
***  v1    1.5 debug|  objcore[fetch] = 0x7f04b640f0c0 {
***  v1    1.5 debug|    refcnt = 2,
***  v1    1.5 debug|    flags = {pass, private},
***  v1    1.5 debug|    exp_flags = {},
***  v1    1.5 debug|    boc = 0x7f04b6412080 {
***  v1    1.5 debug|      refcnt = 3,
***  v1    1.5 debug|      state = stream,
***  v1    1.5 debug|      vary = (nil),
***  v1    1.5 debug|      stevedore_priv = (nil),
***  v1    1.5 debug|    },
***  v1    1.5 debug|    exp = {1524122971.140661, -1.000000, 0.000000, 0.000000},
***  v1    1.5 debug|    objhead = 0x7f04c28ce080,
***  v1    1.5 debug|    stevedore = 0x7f04c28ca180 (malloc Transient) {
***  v1    1.5 debug|      Simple = 0x7f04b602f020,
***  v1    1.5 debug|      Obj = 0x7f04b6023928 {priv=0x7f04b6023920, ptr=0x7f04b602f020, len=144, space=144},
***  v1    1.5 debug|      LEN = 0x0...0,
***  v1    1.5 debug|      VXID = 0x00000441,
***  v1    1.5 debug|      FLAGS = 0x00,
***  v1    1.5 debug|      GZIPBITS = 0x0...0,
***  v1    1.5 debug|      LASTMODIFIED = 0x41d6b61156c00000,
***  v1    1.5 debug|      VARY = {len=0, ptr=(nil)},
***  v1    1.5 debug|      HEADERS = {len=24, ptr=0x7f04b602f098},
***  v1    1.5 debug|    },
***  v1    1.5 debug|  },
***  v1    1.5 debug|  vcl = {
***  v1    1.5 debug|    name = \"vcl1\",
***  v1    1.5 debug|    busy = 3,
***  v1    1.5 debug|    discard = 0,
***  v1    1.5 debug|    state = auto,
***  v1    1.5 debug|    temp = warm,
***  v1    1.5 debug|    conf = {
***  v1    1.5 debug|      srcname = {
***  v1    1.5 debug|        \"<vcl.inline>\",
***  v1    1.5 debug|        \"Builtin\",
***  v1    1.5 debug|      },
***  v1    1.5 debug|    },
***  v1    1.5 debug|  },
***  v1    1.5 debug|  vmods = {
***  v1    1.5 debug|    vtc = {Varnish trunk acd9cab19f21a70504f13415687b5696f8428eaa, 0.0},
***  v1    1.5 debug|    std = {Varnish trunk acd9cab19f21a70504f13415687b5696f8428eaa, 0.0},
***  v1    1.5 debug|  },
***  v1    1.5 debug|},
***  v1    1.5 debug|
***  v1    1.5 debug|
***  v1    1.5 debug|Debug: Child cleanup complete
***  v1    1.5 CLI RX  200
---- v1    1.5 Unexpected panic
*    top   1.5 failure during reset

nigoroll added a commit that referenced this issue Apr 19, 2018
issue a membar to ensure that the overflow marker is written
before a failing WS allocation returns and is thus visible from
a different context.

pan_ic() already has an implicit membar via the mutex lock.

Ref #2645
nigoroll added a commit that referenced this issue Apr 20, 2018
As WS_Reset() clears the overflow marker, the correct order for
resetting and marking an overflow is WS_Reset(); WS_MarkOverflow();

Fixes the last seemingly obscure bit of #2645
nigoroll added a commit that referenced this issue Apr 20, 2018
No need to remain paranoid when the actual root case is dead
simple (see previous commit)

This reverts commit f5ce551.

Ref: #2645
dridi pushed a commit to dridi/varnish-cache that referenced this issue Jun 20, 2018
Ref varnishcache#2645 but cannot be the cause because workspace is just not
overflowed
dridi pushed a commit to dridi/varnish-cache that referenced this issue Jun 20, 2018
dridi pushed a commit to dridi/varnish-cache that referenced this issue Jun 20, 2018
As WS_Reset() clears the overflow marker, the correct order for
resetting and marking an overflow is WS_Reset(); WS_MarkOverflow();

Fixes the last seemingly obscure bit of varnishcache#2645
dmatetelki pushed a commit to dmatetelki/varnish-cache that referenced this issue Mar 14, 2019
Ref varnishcache#2645 but cannot be the cause because workspace is just not
overflowed
dmatetelki pushed a commit to dmatetelki/varnish-cache that referenced this issue Mar 14, 2019
dmatetelki pushed a commit to dmatetelki/varnish-cache that referenced this issue Mar 14, 2019
issue a membar to ensure that the overflow marker is written
before a failing WS allocation returns and is thus visible from
a different context.

pan_ic() already has an implicit membar via the mutex lock.

Ref varnishcache#2645
dmatetelki pushed a commit to dmatetelki/varnish-cache that referenced this issue Mar 14, 2019
As WS_Reset() clears the overflow marker, the correct order for
resetting and marking an overflow is WS_Reset(); WS_MarkOverflow();

Fixes the last seemingly obscure bit of varnishcache#2645
dmatetelki pushed a commit to dmatetelki/varnish-cache that referenced this issue Mar 14, 2019
No need to remain paranoid when the actual root case is dead
simple (see previous commit)

This reverts commit f5ce551.

Ref: varnishcache#2645
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants