Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec_cmd(): pass stderr along to sos_get_command_output() #2306

Closed
wants to merge 1 commit into from

Conversation

mfoliveira
Copy link
Contributor

The sosreport of a system with this issue:

	# ip netns
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns

Shows this difference between the plugins ebpf and networking (callers
of ip netns):

	# ./bin/sos report -o ebpf,networking --batch

	# tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
	.../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
	.../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
	.../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
	...
	.../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
	...

Only the networking plugin called ip netns exec <ns> <cmd> on Error:
lines. Hmm.

...

The networking plugin calls collect_cmd_output(), which has parameter
stderr=True` as default:

    def collect_cmd_output(self, cmd, suggest_filename=None,
                           ...
                           stderr=True, ...

The ebpf plugin calls exec_cmd(), BUT it has parameter stderr=True
as default AS WELL:

    def exec_cmd(self, cmd, timeout=cmd_timeout, stderr=True, ...

So, why the difference?

...

Well, it turns out exec_cmd() does NOT pass the stderr parameter
along to sos_get_command_output(): [but collect_cmd_output() does.]

      return sos_get_command_output(cmd, timeout=timeout, chroot=root,
                                    chdir=runat, binary=binary, env=env,
                                    foreground=foreground)

And sos_get_command_output() has stderr=False as default:

      def sos_get_command_output(command, timeout=300, stderr=False, ...

Thus, despite exec_cmd() has stderr=True as default it is ignored,
as its callee sos_get_command_output() has stderr=False as default.

This explains the output difference between the ebpf/networking plugins.

...

Looking back for details, the origin of the change is in PR#1807 [1],
with commit e8bb94c (refactor command functions/introduce exec_cmd())
and commit e51d3e6 (update caller plugins).

The PR has no mentions/discussion around stderr in comments/review,
so it's apparently an oversight during the refactor, I guess.

Because, previously the refactored callers of sos_get_command_output()
(get_command_output(), call_ext_prog(), [_]get_cmd_output_now())
all had stderr=True as parameter/default, passing it along to callee.

(The only exception is in _collect_cmd_output() which does the first
call in the chroot with stderr, and falls back to non-chroot without
stderr, but it was already that way. Thus not changing this case.)

...

So, it seems right and safe and opportune to fix this inconsistency
between default stderr in exec_cmd() / sos_get_command_output()
by just passing it along between them:

Right: because it restores the previous behavior (before refactor)
assumed/accepted by callers that did not specify stderr=False.

Safe: because it restores the actual behavior to previous behavior,
and if plugins handled stderr lines previously, they should also
handle it now, for the most part (i.e., not any new stderr lines,
but they are currently ignoring those, and should get them covered.)

Opportune: because if any plugin breaks now because of this, it is
a new/post-refactor change that decided to use exec_cmd() with the
default that is inconsistent with actual behavior, and it seems okay
to take bugs for that, so to identify and fix such cases.

Hopefully those points are reasonable with the project's philosophy.

...

Test-case:

Setup:

	$ sudo python3

	from sos import SoS
	from sos.report.plugins import Plugin

	sos = SoS(['report'])
	plugin = Plugin(sos._component.get_commons())

Before:

	>>> print(plugin.exec_cmd('ip netns')['output'])
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=True)['output'])
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=False)['output'])
	test-ns

After:

	>>> print(plugin.exec_cmd('ip netns')['output'])
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=True)['output'])
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=False)['output'])
	test-ns

...

Test-suite:

	# ./tests/simple.sh
	<...>
	Summary

	 failures false          time 2                  -l
	 failures false          time 3                  --list-presets
	 failures false          time 3                  --list-profiles
	 failures false          time 17                 --batch --build
		 --no-env-vars
	 failures false          time 3          Size 14.23KiB   --batch
		 --no-report -o hardware
	 failures false          time 20         Size 2.99MiB    --batch
		 --label TEST -a -c never
	 failures false          time 20         Size 2.81MiB    --batch
		 --debug --log-size 0 -c always
	 failures false          time 20         Size 2.84MiB    --batch
		 -z xz --log-size 1
	 failures false          time 19         Size 4.01MiB    --batch
		 -z gzip
	 failures false          time 32         Size 2.76MiB    --batch
		 -t 1 -n hardware
	 failures false          time 19                 --batch --quiet
		 -e opencl -k kernel.with-timer
	 failures false          time 22         Size 4.31MiB    --batch
		 --case-id 10101 --all-logs --since=20201110
	 failures false          time 21         Size 3.06MiB    --batch
		 --verbose --no-postproc
	 failures false          time 39         Size 2.82MiB    --batch
		 --mask
	Everything worked!

[1] #1807

Signed-off-by: Mauricio Faria de Oliveira mfo@canonical.com


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname email@example.com?
  • If this commit closes an existing issue, is the line Closes: #ISSUENUMBER included in an independent line?
  • If this commit resolves an existing pull request, is the line Resolves: #PRNUMBER included in an independent line?

mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 11, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(the ebpf plugin does not; due to differences addressed in PR sosreport#2306.)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

And if we get rid of `stderr` there, we can drop the existing check for
`Object unknown` in `ip`, as `iproute2` has always had it in `stderr`.
(This applies to the ebpf plugin as well; it's copied from networking.)

```
@ iproute2.git:

    commit aba5acdfdb347d2c21fc67d613d83d4430ca3937
    Author: Stephen Hemminger <stephen@networkplumber.org>
    Date:   Thu Apr 15 20:56:59 2004 +0000

        (Logical change 1.3)
    ...
    diff --git a/ip/ip.c b/ip/ip.c
    ...
    +               fprintf(stderr, "Object \"%s\" is unknown, try
                    \"ip help\".\n", argv[1]);
    +               exit(-1);
    ...
```

Note that both plugins _currently_ do not need `exec_cmd(stderr=False)`
(see output above) as described in PR sosreport#2306, but it will once/if that's
applied.  So, add that proactively; and it doesn't hurt/has any effects
without that PR anyway.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 12, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR sosreport#2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 12, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR sosreport#2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
The sosreport of a system with this issue:
```
	# ip netns
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns
```

Shows this difference between the plugins ebpf and networking (callers
of `ip netns`):
```
	# ./bin/sos report -o ebpf,networking --batch

	# tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
	.../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
	.../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
	.../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
	...
	.../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
	...
```

Only the networking plugin called `ip netns exec <ns> <cmd>` on `Error:`
lines. Hmm.

...

The networking plugin calls `collect_cmd_output()`, which has parameter
stderr=True` as default:

```
    def collect_cmd_output(self, cmd, suggest_filename=None,
                           ...
                           stderr=True, ...
```

The ebpf plugin calls `exec_cmd()`, BUT it has parameter `stderr=True`
as default AS WELL:

```
    def exec_cmd(self, cmd, timeout=cmd_timeout, stderr=True, ...
```

So, why the difference?

...

Well, it turns out `exec_cmd()` does NOT pass the `stderr` parameter
along to `sos_get_command_output()`: [but `collect_cmd_output()` does.]

```
      return sos_get_command_output(cmd, timeout=timeout, chroot=root,
                                    chdir=runat, binary=binary, env=env,
                                    foreground=foreground)
```

And `sos_get_command_output()` has `stderr=False` as default:

```
      def sos_get_command_output(command, timeout=300, stderr=False, ...
```

Thus, despite `exec_cmd()` has `stderr=True` as default it is _ignored_,
as its callee `sos_get_command_output()` has `stderr=False` as default.

This explains the output difference between the ebpf/networking plugins.

...

Looking back for details, the origin of the change is in PR#1807 [1],
with commit e8bb94c (refactor command functions/introduce `exec_cmd()`)
and commit e51d3e6 (update caller plugins).

The PR has no mentions/discussion around `stderr` in comments/review,
so it's apparently an oversight during the refactor, I guess.

Because, previously the refactored callers of `sos_get_command_output()`
(`get_command_output()`, `call_ext_prog()`, `[_]get_cmd_output_now()`)
all had `stderr=True` as parameter/default, passing it along to callee.

(The only exception is in `_collect_cmd_output()` which does the first
call in the chroot with `stderr`, and falls back to non-chroot without
`stderr`, but it was already that way.  Thus not changing this case.)

...

So, it seems right and safe and opportune to fix this inconsistency
between default `stderr` in `exec_cmd()` / `sos_get_command_output()`
by just passing it along between them:

Right: because it restores the previous behavior (before refactor)
assumed/accepted by callers that did not specify `stderr=False`.

Safe: because it restores the actual behavior to previous behavior,
and if plugins handled `stderr` lines previously, they should also
handle it now, for the most part (i.e., not any new `stderr` lines,
but they are currently ignoring those, and should get them covered.)

Opportune: because if any plugin breaks _now_ because of this, it is
a new/post-refactor change that decided to use `exec_cmd()` with the
default that is inconsistent with actual behavior, and it seems okay
to take bugs for that, so to identify and fix such cases.

Hopefully those points are reasonable with the project's philosophy.

...

Test-case:

Setup:
```
	$ sudo python3

	from sos import SoS
	from sos.report.plugins import Plugin

	sos = SoS(['report'])
	plugin = Plugin(sos._component.get_commons())
```

Before:
```
	>>> print(plugin.exec_cmd('ip netns')['output'])
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=True)['output'])
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=False)['output'])
	test-ns
```

After:
```
	>>> print(plugin.exec_cmd('ip netns')['output'])
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=True)['output'])
	Error: Peer netns reference is invalid.
	Error: Peer netns reference is invalid.
	test-ns

	>>> print(plugin.exec_cmd('ip netns', stderr=False)['output'])
	test-ns
```

...

Test-suite:
```
	# ./tests/simple.sh
	<...>
	Summary

	 failures false          time 2                  -l
	 failures false          time 3                  --list-presets
	 failures false          time 3                  --list-profiles
	 failures false          time 17                 --batch --build
		 --no-env-vars
	 failures false          time 3          Size 14.23KiB   --batch
		 --no-report -o hardware
	 failures false          time 20         Size 2.99MiB    --batch
		 --label TEST -a -c never
	 failures false          time 20         Size 2.81MiB    --batch
		 --debug --log-size 0 -c always
	 failures false          time 20         Size 2.84MiB    --batch
		 -z xz --log-size 1
	 failures false          time 19         Size 4.01MiB    --batch
		 -z gzip
	 failures false          time 32         Size 2.76MiB    --batch
		 -t 1 -n hardware
	 failures false          time 19                 --batch --quiet
		 -e opencl -k kernel.with-timer
	 failures false          time 22         Size 4.31MiB    --batch
		 --case-id 10101 --all-logs --since=20201110
	 failures false          time 21         Size 3.06MiB    --batch
		 --verbose --no-postproc
	 failures false          time 39         Size 2.82MiB    --batch
		 --mask
	Everything worked!
```

[1] sosreport#1807

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
@TurboTurtle
Copy link
Member

Nice catch on this!

Copy link
Contributor

@pmoravec pmoravec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great investigation and a nice catch! Thanks.

mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 18, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR sosreport#2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 19, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR sosreport#2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
mfoliveira added a commit to mfoliveira/sos that referenced this pull request Nov 19, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR sosreport#2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
TurboTurtle pushed a commit that referenced this pull request Nov 23, 2020
The sosreport of a system with this issue:
```
    # ip netns
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Shows that the networking plugin runs commands for the `Error:` lines:
(note the ebpf plugin does not due to differences addressed in PR #2306)
```
    # ./bin/sos report -o ebpf,networking --batch

    # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
    .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
    .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
    ...
    .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
    ...
```

This happens because the networking plugin calls `collect_cmd_output()`
with default `stderr=True` and does not handle such line type.

For the purposes of getting the list of network namespaces, it is OK to
ignore `stderr` since it does not provide that information. However, we
do want it in the archive, so it is fully documented for analysis/debug.

So change the call from `collect_cmd_output()` to `add_cmd_output()` to
include both `stdout` and `stderr` in the archive, and call `exec_cmd()`
that ignores `stderr` to get the list of network namespaces.

Note that the plugin _currently_ does not need `exec_cmd(stderr=False)`
to ignore `stderr`, as described in PR#2306, but will once/if that is
applied. However, with the next patch, `stderr=False` won't be needed.

Before:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_Error_ip6tables-save.1
  ...
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  ...
```

After:
```
  # tar tf /tmp/sosreport-*.tar.xz | grep ip_netns_exec
  .../sos_commands/ebpf/ip_netns_exec_test-ns_bpftool_net_list
  .../sos_commands/networking/ip_netns_exec_test-ns_ip6tables-save
  .../sos_commands/networking/ip_netns_exec_test-ns_ip_-s_-s_neigh_show
  ...
```

And the `ip netns` contents with `stderr` still remain in the archive:
```
    # tar xf /tmp/sosreport-*.tar.xz --to-stdout \
      --wildcards '*/sos_commands/networking/ip_netns'
    Error: Peer netns reference is invalid.
    Error: Peer netns reference is invalid.
    test-ns
```

Test suite:

```
    # ./tests/simple.sh
    ...
    Everything worked!
```

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants