Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Three fixes for deadlocks #24672

Merged
merged 3 commits into from
Apr 11, 2023

Commits on Apr 11, 2023

  1. endpoint, proxy: Fix deadlock in writeHeaderfile

    This fixes a deadlock with the IPCache and the endpoint mutex where
    writeHeaderfile calls into the proxy to fetch the endpoint's DNSRules
    while holding on to the endpoint mutex. That pattern causes a deadlock,
    because the proxy code itself then calls into IPCache while the endpoint
    mutex is held. This violates the assumed lock ordering, as we require
    that the IPCache mutex is always acquired _before_ any endpoint mutex.
    
    We fix this deadlock by hoisting call to update the endpoint's DNSRules
    out of writeHeaderfile and into the syncEndpointHeaderFile function,
    which is triggered whenever there is a change in DNSRules.
    
    Fixes: cilium#23836
    Ref: cilium#20915
    
    Co-authored-by: David Bimmler <david.bimmler@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
    gandro and bimmlerd committed Apr 11, 2023
    Configuration menu
    Copy the full SHA
    d5092d2 View commit details
    Browse the repository at this point in the history
  2. dnsproxy: fix potential deadlock with IPCache

    GetRules held the DNSProxy lock while accessing IPCache. This
    established a partial lock order in which DNSProxy > IPCache. (Read
    DNSProxy > IPCache as "DNSProxy lock held while attempting to acquire
    IPCache lock." We use '>' as a convenient notation as it intuitively
    implies transitivity.)
    
    Since we also have IPCache > Endpoint (cf InjectLabels), this
    established DNSProxy > Endpoint by transitivity. That, however, is
    incompatible with instances where we hold the Endpoint lock while
    acquiring the DNSProxy lock (as that establishes Endpoint > DNSProxy).
    Having both EP > DNSProxy and DNSProxy > EP is not allowed and leads to
    deadlocks. To break the cycle, hoist the IPCache lookups out of the
    DNSProxy critical section.
    
    Without DNSProxy > IPCache, Endpoint > DNSProxy is fine, as there's no
    more DNSPRoxy > IPCache > Endpoint and wait cycle.
    
    Of course, any other instance where X > IPCache and Endpoint > X can
    still result in a deadlock. Somewhat insidiously, these wait chains can
    be arbitrarily long, and hence arbitrarily difficult to find.
    
    Suggested-by: Sebastian Wicki <sebastian@isovalent.com>
    Suggested-by: Chris Tarazi <chris@isovalent.com>
    Suggested-by: Joe Stringer <joe@cilium.io>
    Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
    bimmlerd authored and gandro committed Apr 11, 2023
    Configuration menu
    Copy the full SHA
    83ce298 View commit details
    Browse the repository at this point in the history
  3. ipcache: fix potential deadlock in GetNamedPorts

    We have a partial lock ordering of IPCache > Endpoint as established in
    InjectLabels, where the IPCache lock is held and all endpoint locks are
    acquired one by one. On the other hand, GetNamedPorts was called in a
    context with the Endpoint lock held (cf *Endpoint.GetNamedPortLocked),
    and attempts to acquire the IPCache lock. This would establish EP >
    IPCache, which potentially deadlocks with the above.
    
    To fix it, remove the requirement for a write lock on IPCache for
    GetNamedPorts. Instead, we always pre-compute the namedPorts map in
    Upsert and Delete and simply return it in GetNamedPorts. To ensure
    atomicity, we use atomics to load and store read-only pointer to the
    namedPorts map and the needNamedPorts flag.
    
    Co-authored-by: Sebastian Wicki <sebastian@isovalent.com>
    Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
    bimmlerd and gandro committed Apr 11, 2023
    Configuration menu
    Copy the full SHA
    5b155ce View commit details
    Browse the repository at this point in the history