Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

director refcounting #2725

Closed
wants to merge 3 commits into from
Closed

Conversation

nigoroll
Copy link
Member

based upon #2723, description from the main commit's message:

This implementation is centered around the following insights:

  • Full refcounting is expensive and tidious

  • For directors, there exist mainly two types of references

    • long term, from a vcl or a director using another director as a backend

    • short term for the duration of a VCL task, such as an assignment to (be)req.backend

Based on these insights, we implement implicit and explicit references:

  • Otherwise unreferenced VCL_BACKENDs are put on a cool list (as before for backends), but we now use many cool lists

  • Each VCL task holds an implicit reference to all VCL_BACKENDs by referencing the current vdi_cool list, thus for any reference with a scope smaller or equal to a TASK, no explicit reference needs to and should be taken

  • VRT_VDI_Ref/Unref must be called whenever a reference to a VCL_BACKEND with longer scope is kept, for example when adding/removing a VCL_BACKEND to/from a director.

Details on coollists:

  • each task references the current coollist

  • all VCL_BACKENDs with refcnt == 0 are put on the coollist ref'd by the current task. VCL_BACKENDs with refcnt > 0 are not on any coollist

  • when a VCL_BACKEND is put on a coollist and the coollist has reached the minimum age, we start a new coollist which is then referenced by all new tasks. This way, the old coollist with the VCL_BACKEND(s) to be killed will eventually get unref'd

  • We kill backends on unref'd coollists in the order of their creation

VCL Backends get created with one initial reference.

Dynamic backends get created on the coollist, so their initial scope is just the current task.

@nigoroll
Copy link
Member Author

nigoroll commented Jul 2, 2018

now that #2723 is dead, this will need to look slightly diffrent

@nigoroll
Copy link
Member Author

nigoroll commented Jul 2, 2018

I've tried to wrap my head around this and failed. At this point I think that getting back to #2723 would be the cleanest option. Quoting myself on IRC:

no I was wrong again. So here's the trouble: for the VCL-defined director, all is well, we just declare it dead, reset the methods and wait for the other directors to unref, before we free the struct. BUT for the dynamic backends which get cleaned up during the lifetime of a VCL, that doesn't work, we would preserve the exact problem we want to solve, we must not kill the director until the last reference goes away. But if that last reference is held by a "VCL director" which gets finalized during VGC_Discard, we're back to where we left of

@nigoroll
Copy link
Member Author

nigoroll commented Jul 3, 2018

note to self: this should do it:

  • require vmods to never hold director references in the vmod itself, only in director instances
  • in VCC-generated VGC_discard, flush directors after finalizing all of them
    • probably need to loop until all references gone
  • then finalize vmods as before

@nigoroll
Copy link
Member Author

I've updated the patch as planned:

  • removed the vmod refcounting
  • added cool list flushing before vmod finalization

This implementation is centered around the following insights:

* Full refcounting is expensive and tidious

* For directors, there exist mainly two types of references

  - long term, from a vcl or a director using another director as a
    backend

  - short term for the duration of a VCL task, such as an assignment
    to (be)req.backend

Based on these insights, we implement implicit and explicit
references:

- Otherwise unreferenced VCL_BACKENDs are put on a cool list (as
  before for backends), but we now use many cool lists

- Each VCL task holds an implicit reference to all VCL_BACKENDs by
  referencing the current vdi_cool list, thus for any reference with a
  scope smaller or equal to a TASK, no explicit reference needs to
  and should be taken

- VRT_VDI_Ref/Unref must be called whenever a reference to a
  VCL_BACKEND with longer scope is kept, for example when
  adding/removing a VCL_BACKEND to/from a director.

Details on coollists:

- each task references the current coollist

- all VCL_BACKENDs with refcnt == 0 are put on the coollist ref'd by
  the current task. VCL_BACKENDs with refcnt > 0 are not on any
  coollist

- when a VCL_BACKEND is put on a coollist and the coollist has reached
  the minimum age, we start a new coollist which is then referenced by
  all new tasks. This way, the old coollist with the VCL_BACKEND(s) to
  be killed will eventually get unref'd

- We kill backends on unref'd coollists in the order of their creation

VCL Backends get created with one initial reference.

Dynamic backends get created on the coollist, so their initial scope
is just the current task.
v00003.vtc exposes some situation which I have not yet fully understood,
but it seems that VBP_Control should only act if anything needs to be
done.
@nigoroll
Copy link
Member Author

panic seen with this patch:

Panic at: Tue, 28 Aug 2018 20:39:15 GMT
Assert error in VRT_UnrefDirector(), cache/cache_vcl_vrt.c line 282:
  Condition((vcl)->magic == 0x214188f2) not true.
version = varnish-trunk revision 17f7f28f93ac8fc772282918c0c19b08c2f11c4f, vrt api = 7.0
ident = Linux,3.10.0-327.10.1.el7.x86_64,x86_64,-jnone,-smalloc,-sdefault,-hcritbit,epoll
now = 76665540.644865 (mono), 1535488733.976607 (real)
Backtrace:
  0x443636: /opt/varnish/sbin/varnishd() [0x443636]
  0x443b27: /opt/varnish/sbin/varnishd() [0x443b27]
  0x4c1c51: /opt/varnish/sbin/varnishd(VAS_Fail+0x4d) [0x4c1c51]
  0x4586a4: /opt/varnish/sbin/varnishd(VRT_UnrefDirector+0x11e) [0x4586a4]
  0x7f4521df35f5: ./vmod_cache/_vmod_directors.HLCXTVFDFUDNIOBOIUIPWOVCRQHUDPWK(+0xa5f5) [0x7f4521df35f5]
  0x7f4521df453e: ./vmod_cache/_vmod_directors.HLCXTVFDFUDNIOBOIUIPWOVCRQHUDPWK(shardcfg_delete+0x37) [0x7f4521df453e]
  0x7f4521df56c8: ./vmod_cache/_vmod_directors.HLCXTVFDFUDNIOBOIUIPWOVCRQHUDPWK(sharddir_delete+0xb8) [0x7f4521df56c8]
  0x7f4521defe27: ./vmod_cache/_vmod_directors.HLCXTVFDFUDNIOBOIUIPWOVCRQHUDPWK(+0x6e27) [0x7f4521defe27]
  0x422836: /opt/varnish/sbin/varnishd() [0x422836]
  0x422f3f: /opt/varnish/sbin/varnishd(VDI_Cool_Unref+0x2db) [0x422f3f]

@nigoroll
Copy link
Member Author

20ab2ab very likely helps this patch, will need to get back to it later (not before september release)

@nigoroll
Copy link
Member Author

nigoroll commented Sep 3, 2018

closing this PR as it is not ready any more after recent changes and needs one issue checked

@nigoroll
Copy link
Member Author

FTR: We had discussed this idea during the last VDD and @mbgrydeland 's concern was that another per-task mutex operation was unacceptable for performance reasons. I now realized that we could probably combine this with VCL_Ref() / VCL_Rel(), which, besides avoiding another lock operation, could actually simplify the patch also.

Might get back to this...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants