Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid iptables lock error for endpoint port mapping #44330

Closed
wants to merge 3 commits into from
Closed

Avoid iptables lock error for endpoint port mapping #44330

wants to merge 3 commits into from

Conversation

rcj4747
Copy link

@rcj4747 rcj4747 commented Oct 19, 2022

Fixes: #44331 Run failure due to iptables xtables lock contention

On Ubuntu Focal the iptables implementation returns immediately with an
error if the xtables lock is held. This is a change in behavior that
has exposed a latent issue in Docker's implementation.

After moving to Focal hosts in our CI environment, contention on the
iptables xtable lock is leading to the following error message (these
failures are easy to trigger and frequent):

  Error response from daemon: driver failed programming
  external connectivity on endpoint foo
  (4ee313e3bf3f375e70c3ee5d00b9000523cb80e5fb29b9ccff565913c74ab6ec):
  (iptables failed: iptables -t nat -A POSTROUTING -p tcp -s 172.16.128.7
  -d 172.16.128.7 --dport 8081 -j MASQUERADE: Another app is currently
  holding the xtables lock. Perhaps you want to use the -w option?

The source of that error message is endpoint.sbJoin in
libnetwork/endpoint.go. It calls:

  endpoint.sbJoin
   driver.ProgramExternalConnectivity (libnetwork/drivers/bridge/bridge.go)
    bridgeNetwork.allocatePorts
     bridgeNetwork.allocatePortsInternal
      bridgeNetwork.allocatePort
       PortMapper.MapRange
        PortMapper.AppendForwardingTableEntry
         PortMapper.Forward
          ChainInfo.Forward
           iptable.ProgramRule
            iptable.Exist
             iptable.exists
              iptable.existsRaw

iptable.raw calls iptables with --wait if supported, but
iptable.existsRaw does not. The invocation of iptables in existsRaw
has not changed since 2017, but the OS behavior has. In Ubuntu 20.04
(Focal) the call to iptables -S will fail immediately if the lock is
held by another process, this is not the case in earlier (Bionic) or
later (Jammy) releases.

This patch adds the --wait option to this invocation of iptable (or
if that is not supported the bestEffortLock is used) when listing
existing rules.

Signed-off-by: Robert C Jennings rcj4747@gmail.com

- What I did
Addressed the iptables lock contention

- How I did it
Added the same handling for iptable.existsRaw as is found in iptable.raw; specifically yhis patch adds the --wait option to this invocation of iptable (or if that is not supported the bestEffortLock is used) when listing existing rules.

- How to verify it
On Ubuntu 20.04 (Bionic) you can explicitly take the lock with flock /run/xtables.lock sleep 60& and run a container with port mappings. (edit: running flock will block the other iptables calls which wait for the lock, you need contention and timing to recreate)

- Description for the changelog
Wait for iptables lock when listing existing rules with bridge networks

@rcj4747 rcj4747 marked this pull request as ready for review October 19, 2022 16:46
On Ubuntu Focal the iptables implementation returns immediately with an
error if the xtables lock is held.  This is a change in behavior that
has exposed a latent issue in Docker's implementation.

After moving to Focal hosts in our CI environment, contention on the
iptables xtable lock is leading to the following error message (these
failures are easy to trigger and frequent):

```
  Error response from daemon: driver failed programming
  external connectivity on endpoint foo
  (4ee313e3bf3f375e70c3ee5d00b9000523cb80e5fb29b9ccff565913c74ab6ec):
  (iptables failed: iptables -t nat -A POSTROUTING -p tcp -s 172.16.128.7
  -d 172.16.128.7 --dport 8081 -j MASQUERADE: Another app is currently
  holding the xtables lock. Perhaps you want to use the -w option?
```

The source of that error message is endpoint.sbJoin in
libnetwork/endpoint.go.  It calls:

```
  endpoint.sbJoin
   driver.ProgramExternalConnectivity (libnetwork/drivers/bridge/bridge.go)
    bridgeNetwork.allocatePorts
     bridgeNetwork.allocatePortsInternal
      bridgeNetwork.allocatePort
       PortMapper.MapRange
        PortMapper.AppendForwardingTableEntry
         PortMapper.Forward
          ChainInfo.Forward
           iptable.ProgramRule
            iptable.Exist
             iptable.exists
              iptable.existsRaw
```

`iptable.raw` calls iptables with `--wait` if supported, but
`iptable.existsRaw` does not.  The invocation of iptables in `existsRaw`
has not changed since 2017, but the OS behavior has.  In Ubuntu 20.04
(Focal) the call to `iptables -S` will fail immediately if the lock is
held by another process, this is not the case in earlier (Bionic) or
later (Jammy) releases.

This patch adds the `--wait` option to this invocation of iptable (or
if that is not supported the `bestEffortLock` is used) when listing
existing rules.

Signed-off-by: Robert C Jennings <rcj4747@gmail.com>
Fixes: #44331
@rcj4747
Copy link
Author

rcj4747 commented Oct 20, 2022

Local testing shows that this fix for #44331 is incomplete. With the added --wait for iptable.existsRaw (which I do believe is needed) we still see failures.

Looking again at the error message I see that this is coming from iptable.raw which already had the logic to add the --wait arg if supportsXlock was true. Our environment is new enough that this bool should be true and we've validated by running the same test used to set the variable (iptables --wait -L -n). I think the problem is how this is initialized in iptable.detectIptables:

func detectIptables() {
	path, err := exec.LookPath("iptables")
	if err != nil {
		return
	}
	iptablesPath = path
	path, err = exec.LookPath("ip6tables")
	if err != nil {
		return
	}
	ip6tablesPath = path
	supportsXlock = exec.Command(iptablesPath, "--wait", "-L", "-n").Run() == nil
	mj, mn, mc, err := GetVersion()
	if err != nil {
		logrus.Warnf("Failed to read iptables version: %v", err)
		return
	}
	supportsCOpt = supportsCOption(mj, mn, mc)

The code to initialize ip6tablesPath was added and if ip6tables doesn't exist the rest of the checks for iptables command items, including setting supportsXlock are not run. We can fix this by inverting the err check and only setting ip6tablesPath if err is nil and then continuing. This doesn't address the fact that ip6tablesPath being unset will still cause issues if ipv6 rules are added, that should be addressed as a separate issue (to be clear that's not a regression but just a latent issue found by inspection while looking at this issue).

I will test this on Monday in our production environment to confirm my suspicion.

@thaJeztah
Copy link
Member

Thanks for your PR! I had a quick glance, and changes in the PR looked good; I only wanted to double-check if the bestEffortLock could safely be used in this location (I know libnetwork has some fairly "complicated" code flows, so wanted to double check if there's no chance on a deadlock if the same lock is used in other places); from a first check on that, it looked fine though!

For the iptables init; ISTR there was another PR touching that part. Let me see if I can find that one (and why it wasn't merged yet, or perhaps it was, and it made another change).

@thaJeztah
Copy link
Member

Ah! I think this was the one; #43060

It's been a while since I looked at that one, and if there were still remaining changes to be addressed (reviews are always welcome; more eyes never hurt!)

@thaJeztah
Copy link
Member

@rcj4747 looks like your last two commits are missing a DCO sign-off (which makes CI fail) could you amend those commits? (let me know if you need help with "rebase" / "amend" instructions 👍 )

The check for the optional ip6tables executable short-circuits other
initialization using the iptables command.  This patch allows the other
initialization to occur even after ip6tables is not found.

Signed-off-by: Robert C Jennings <rcj4747@gmail.com>
ip6tables is treated as an option executable, unlike the iptables
command, but there are no guards against calling exec.Command with an
empty string for the executable name if an ipv6 rule is provided.  This
patch adds checks where ip6tablesPath is referenced to allow for more
graceful and helpful failure.

Signed-off-by: Robert C Jennings <rcj4747@gmail.com>
@rcj4747
Copy link
Author

rcj4747 commented Oct 20, 2022

@thaJeztah I fixed the commits up with sign-offs but I had pushed them initially before I saw your comment about #43060. That PR would also seemingly address the issue handled in these last 2 commits. How would you like to proceed?

@rcj4747
Copy link
Author

rcj4747 commented Oct 20, 2022

Thanks for your PR! I had a quick glance, and changes in the PR looked good; I only wanted to double-check if the bestEffortLock could safely be used in this location (I know libnetwork has some fairly "complicated" code flows, so wanted to double check if there's no chance on a deadlock if the same lock is used in other places); from a first check on that, it looked fine though!

I did my best to search for this as well and I did not find any chances for deadlock. It's taken just before the exec.Command call to iptables and I didn't find usage further up the call chain.

@thaJeztah
Copy link
Member

No worries! Let's keep them in for now (we can always drop them if we decide to go for the other PR). Let me try to get more eyes on both to see what direction we'll go.

Might not be Today, but I'll try to get back to this (in case I don't; don't hesitate to give me a @ ping ❤️ 😂 )

@rcj4747
Copy link
Author

rcj4747 commented Oct 25, 2022

@thaJeztah, I like the direction #43060 is taking and I'd like to drop the 2 patches I added that overlap with that PR.

We're seeing iptables lock contention when running docker-compose up and it's causing spurious CI run failures. This PR was initially concerned only with the lack of --wait on the iptables.existsRaw call to eliminate contention. I'd suggest we drop the 2 additional fixes I put in for ip6tables in this PR because they're well handled by #43060. Without those, this PR should be small enough in scope for cherry-picking. As for the issue fixed by #43060 we're working around that by simply adding the ip6tables executable to our image; it's a much smaller change in the short-term and we can drop it when the other PR is in a release or maybe we'll add IPv6 support to our code. I'll wait for feedback before making any changes. Thanks!

@thaJeztah
Copy link
Member

(close/reopen to re-trigger (and re-load) github actions)

@thaJeztah thaJeztah closed this Nov 3, 2022
@thaJeztah thaJeztah reopened this Nov 3, 2022
@akerouanton
Copy link
Member

Hey @rcj4747, as you discovered the real issue lies in Docker early exiting detectIptables() when ip6tables isn't available. Once this got fixed, --wait & -C flags are detected properly, even if ip6tables isn't available.

As mentioned in my PR (#43060), all distributions supported by Docker are now shipping iptables with -C flag. This flag is used in exists to check if a given rule exists, and existsRaw is used as a fallback (doing substring matching, which is buggy) when -C flag isn't available. In my PR I'm getting rid of existsRaw because it's now irrelavant.

Same applies to your PR: since you're fixing ip6tables detection in your PR, Docker doesn't enter existsRaw anymore 😉

@rcj4747 rcj4747 closed this Jan 11, 2023
@thaJeztah thaJeztah removed this from the v-next milestone Jan 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run failure due to iptables xtables lock contention
3 participants