Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update 310_network_devices.sh #2907

Merged
merged 4 commits into from Mar 27, 2023
Merged

Update 310_network_devices.sh #2907

merged 4 commits into from Mar 27, 2023

Conversation

hpannenb
Copy link
Contributor

@hpannenb hpannenb commented Jan 8, 2023

Relax-and-Recover (ReaR) Pull Request Template

Please fill in the following items before submitting a new pull request:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL):
    Rescue/recovery networking does not work with IPv6 only NICs. #2902

  • How was this pull request tested?
    Tested on a CentOS7 VM with two NICs; 2nd NIC IPv6 only;
    tested on lab and production RHEL7 VMs with hybrid IPv4/6 and IPv6 only NICs.

  • Brief description of the changes in this pull request:
    With this PR all available intefaces (IPv4 & IPv6) will be chosen for the rescue environment.
    Previously just IPv4 interfaces with IP addresses and routing only were selected.

Select all available NICs (IPv4 & IPv6) for the rescue environment.
@jsmeix jsmeix added this to the ReaR v2.8 milestone Jan 9, 2023
@jsmeix jsmeix added enhancement Adaptions and new features external tool The issue depends on other software e.g. third-party backup tools. and removed external tool The issue depends on other software e.g. third-party backup tools. labels Jan 9, 2023
Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me from plain looking at the code changes.
Actually the new code looks better (much simpler).
But I did not test it.

@jsmeix jsmeix requested a review from a team January 9, 2023 12:47
@jsmeix
Copy link
Member

jsmeix commented Jan 9, 2023

@hpannenb
thank you for your pull request!

Could you please add a reference to your issue
#2902
by enhancing your existing comment like

# Use output of 'ls /sys/class/net/' to select all available interfaces
# in particular to make networking also work with IPv6 only NICs
# see https://github.com/rear/rear/issues/2902

so others can later easily understand the reason behind,
cf. https://github.com/rear/rear/wiki/Coding-Style

Updated the comments on request.
@jsmeix
Copy link
Member

jsmeix commented Jan 9, 2023

@hpannenb
thanks!

@jsmeix jsmeix self-assigned this Jan 10, 2023
@jsmeix
Copy link
Member

jsmeix commented Jan 10, 2023

@rear/contributors
could someone please also review it if time permits?

@pcahyna
Copy link
Member

pcahyna commented Jan 11, 2023

I am checking how it behaves on a machine with many NICs.

@pcahyna
Copy link
Member

pcahyna commented Jan 11, 2023

On a machine with 6 NICs, 2 of which are configured:

ip r | awk '$2 == "dev" && $8 == "src" { print $3 }' | sort -u
enp2s0f0
enp2s0f2

ls /sys/class/net/
enp14s0  enp15s0  enp2s0f0  enp2s0f1  enp2s0f2  enp2s0f3  lo

@pcahyna
Copy link
Member

pcahyna commented Jan 11, 2023

I have not yet found a case where the new code produces a different output from the old code, but so far I have not invested much in it. I still feel that it would be better to keep the old code. I believe there is probably a good reason to keep only interfaces with routing information and I am afraid of breaking it. Unfortunately, currently I have very little time for investigating it.
I don't entirely agree with the reasoning in #2902 (comment) "there are new user options available for NICs (i.e. EXCLUDE_NETWORK_INTERFACES=()) giving the user enough power to excluded unwanted NICs instead of implicitly filtering them elsewhere." It would be unfortunate if we forced users for whom the previous autodetection has worked well to use an explicit configuration (which then would need to be maintained - interfaces can change their names, be different on different machines even if the machies are otherwise similar, etc.)
@hpannenb could you please change you approach to use the output of ip -6 r, as suggested by @jsmeix #2902 (comment) ? (I know that it is not straightforward, because the IPv6 routes lack the src keyword used in the filter for the IPv4 routes.)

@pcahyna
Copy link
Member

pcahyna commented Jan 12, 2023

I have finally found one meaningful difference between the result of the old code and of the proposed code.

diff -U10 -r /var/tmp/rear.cjIYxe4t6h4ksM2/rootfs/etc/scripts/system-setup.d/60-network-devices.sh /var/tmp/rear.PgacJne54eJTRRI/rootfs/etc/scripts/system-setup.d/60-network-devices.sh
--- /var/tmp/rear.cjIYxe4t6h4ksM2/rootfs/etc/scripts/system-setup.d/60-network-devices.sh       2023-01-12 18:43:30.128396285 -0500
+++ /var/tmp/rear.PgacJne54eJTRRI/rootfs/etc/scripts/system-setup.d/60-network-devices.sh       2023-01-12 04:00:48.923395070 -0500
@@ -18,10 +18,12 @@
 # The following is autogenerated code to setup network interfaces
 # in the recovery system which have all these on the original system:
 # - they are UP
 # - they have an IP address
 # - they are somehow linked to a physical device
 # For details see the rescue/GNU/Linux/310_network_devices.sh script.
 ip link set dev eno1 up
 ip link set dev eno1 mtu 1500
 ip addr add 10.16.216.49/23 dev eno1
 ip addr add 2620:52:0:10d8:1658:d0ff:fed3:31ab/64 dev eno1
+ip link set dev eno2 up
+ip link set dev eno2 mtu 1500

This is on a machine with two interfaces, one of them (eno2) is connected, but not configured.
I am not sure if the difference matters or not - probably not.
In any case, the comment

# - they have an IP address

is no longer correct with the proposed code.

@hpannenb
Copy link
Contributor Author

@pcahyna Well done and valid point.

So the task in the code is to gather all interfaces that are configured with an IP (either IPv4 and/or IPv6).

@hpannenb
Copy link
Contributor Author

In the earlier times of the code all available network interfaces were selected. My PR is reverting to this approach. See e.g.

network_interfaces=$( ls /sys/class/net )

With the commit 15567ed this changed.

Maybe @rmetrich can explain the intention behind introducing his approach.

@rmetrich
Copy link
Contributor

In the earlier times of the code all available network interfaces were selected. My PR is reverting to this approach. See e.g.

network_interfaces=$( ls /sys/class/net )

With the commit 15567ed this changed.

Maybe @rmetrich can explain the intention behind introducing his approach.

Hello,
the idea was to select only interfaces backed on physical device and which would be useful when starting the recovery (i.e. to fetch the backup).
Everything not backed by a physical interface doesn't need to be restored while in the rescue env.

@jsmeix
Copy link
Member

jsmeix commented Jan 16, 2023

@rmetrich
thank you for your explanation!

What @pcahyna reports in his above
#2907 (comment)
is

... two interfaces, one of them (eno2) is connected, but not configured.

I think this means the connected but not configured interface
is "somehow linked to a physical device"
but it has no IP when "rear mkrescue" is run.

I think any interface that is "somehow linked to a physical device"
could be later useful to fetch the backup during "rear recover"
(the admin would then have to assign an IP manually)
regardless whether or not an interface was actually usable
when making the backup during "rear mkbackup".

In particular for third party backup tools normally
"rear mkbackup" is useless (because normally ReaR does
not implement making a backup with third party backup tools)
so only "rear mkrescue" is used for third party backup tools
(also for BACKUP=REQUESTRESTORE)
and then it is possible that an interface is not usable
during "rear mkrescue" time but that interface could be used
when the backup is made with a third party backup tool.
Of course this example is rather theoretical (I don't assume
admins disable special interfaces that are only used for
special things like making a backup with some backup tool)
but this is only an offhanded example for my actual point
which is:

I wonder if ReaR should automatically exclude something
when there is no hard reason to exclude it?

I think something should be automatically excluded only when
including it would cause problems in the ReaR recovery system.
E.g. when things fail in the recovery system if it was inculded
or possible security/privacy issues if secrets were included.

I am not a sufficient networking expert to make a decision
if it could cause problems in the ReaR recovery system
when interfaces are included that are linked to a physical device
but have no IP.

jsmeix added a commit that referenced this pull request Jan 16, 2023
Added comment that also appears in network_devices_setup_script
that shows the idea behind which interfaces are selected, see
#2907 (comment)
@jsmeix
Copy link
Member

jsmeix commented Jan 16, 2023

Via
2cf67ae
I added a comment that also appears in network_devices_setup_script
that shows the idea behind which interfaces are currently selected
according to
#2907 (comment)

This is only to show how the current code (without this change here)
is meant to work.
Of course that comment (and likely some other comments)
change when we decide here to include more interfaces.

@hpannenb
Copy link
Contributor Author

Everything not backed by a physical interface doesn't need to be restored while in the rescue env.

@rmetrich Thanks for the insights. Indeed this is how this mechanism works (well documented by You in the code). The interesting question to me is what YOur intentions was to change the network interface selection mechanism from previous approach to "select IPv4 with routing NICs only". BTW, so far this seems to have worked flawlessly.

@rmetrich
Copy link
Contributor

Everything not backed by a physical interface doesn't need to be restored while in the rescue env.

@rmetrich Thanks for the insights. Indeed this is how this mechanism works (well documented by You in the code). The interesting question to me is what YOur intentions was to change the network interface selection mechanism from previous approach to "select IPv4 with routing NICs only". BTW, so far this seems to have worked flawlessly.

Honestly I cannot remember. Probably I never got a pure IPv6 system and never saw any Red Hat customer have such, so in the end restoring the network for IPv4 only was sufficient.

@hpannenb
Copy link
Contributor Author

I am not a sufficient networking expert to make a decision
if it could cause problems in the ReaR recovery system
when interfaces are included that are linked to a physical device
but have no IP.

@jsmeix Likewise. But it seems this has never been (reported as) an issue until now with either the old or the new approach of collecting the network interfaces.

@hpannenb
Copy link
Contributor Author

Honestly I cannot remember.

Sure. I just raised the questions because it could have been a special decision by You to do so.

Probably I never got a pure IPv6 system and never saw any Red Hat customer have such, so in the end restoring the network for IPv4 only was sufficient.

The system I am dealing with is a hybrid: E.g. the management interface is IPv4 only whereas the connection to the backup service is via IPv6 only.

@hpannenb
Copy link
Contributor Author

I wonder if ReaR should automatically exclude something when there is no hard reason to exclude it?

I think something should be automatically excluded only when including it would cause problems in the ReaR recovery system. E.g. when things fail in the recovery system if it was inculded or possible security/privacy issues if secrets were included.

@jsmeix Under point 4 in my #2902 (comment) I mentioned it is possible to exclude NICs with EXCLUDE_NETWORK_INTERFACES =() already in case they should be excluded from the rescue environment. So user has the power to do something in case it is harming on the NIC side.

BTW, from purely looking at the code this approach should have made the test passed of @pcahyna putting the eno2 onto this list. Obviously the comment section of the algorithm needs a slight update then.

Removed the comments about IP addresses for
- network_devices_setup_script
- the explanation of the algorithm used
- and the part about "Collect list of all network interfaces to deal with."

Added comment about EXCLUDE_NETWORK_INTERFACES directive.

ref. to rear#2902
@hpannenb
Copy link
Contributor Author

I adjusted the comments in my PR to align it with my code change.

Since this change is working as such (as already shown with earlier ReaR versions) my opinion is my PR is a lightweight change to support IPv4 and IPv6 NICs in the rescue environment.

@jsmeix @rmetrich @rear/contributors : Please leave Your comments.

Copy link
Member

@schlomo schlomo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change, code is much easier to read and makes more sense to check for interfaces instead of for routing entries.

I'd have expected to find something like ip link list or such here, and if we target not too ancient systems we can use JSON output to process it, like this:

$ ip -json link list | jq -r '.[] | .ifname'
lo
eno1
docker0
veth9bdabf8
veth7f188b3
vboxnet0
$ ls -l /sys/class/net/
total 0
drwxr-xr-x  2 root root 0 Feb 16 18:57 ./
drwxr-xr-x 73 root root 0 Feb 16 18:56 ../
lrwxrwxrwx  1 root root 0 Feb 16 19:03 docker0 -> ../../devices/virtual/net/docker0/
lrwxrwxrwx  1 root root 0 Feb 16 19:03 eno1 -> ../../devices/pci0000:00/0000:00:19.0/net/eno1/
lrwxrwxrwx  1 root root 0 Feb 16 19:03 lo -> ../../devices/virtual/net/lo/
lrwxrwxrwx  1 root root 0 Feb 16 19:03 vboxnet0 -> ../../devices/virtual/net/vboxnet0/
lrwxrwxrwx  1 root root 0 Feb 16 19:03 veth7f188b3 -> ../../devices/virtual/net/veth7f188b3/
lrwxrwxrwx  1 root root 0 Feb 16 19:03 veth9bdabf8 -> ../../devices/virtual/net/veth9bdabf8/

But as I said, nice and simple change, big thanks!

What is keeping us from merging this?

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve it "bona fide" because it was
tested by @hpannenb and it works for him
and the new code looks better (much simpler).
But I did not (and cannot properly) test it
(I use only DHCP with a single network interface
which works for me and matches my networking skills).

@jsmeix jsmeix requested a review from pcahyna February 17, 2023 14:09
@hpannenb
Copy link
Contributor Author

It is a great piece of code @rmetrich provided since it covers almost any situation in different NIC setups already. I just changed a single line of code to include IPv6 only NICs again.

@schlomo @jsmeix Appreciate both Your approvals on this; waiting for @pcahyna about his view/thoughts on my change.

antonvoznia pushed a commit to antonvoznia/rear that referenced this pull request Feb 21, 2023
Added comment that also appears in network_devices_setup_script
that shows the idea behind which interfaces are selected, see
rear#2907 (comment)
@schlomo
Copy link
Member

schlomo commented Mar 1, 2023

@pcahyna any comments from you or can we merge this?

@jsmeix jsmeix requested a review from a team March 23, 2023 10:39
@jsmeix
Copy link
Member

jsmeix commented Mar 23, 2023

@rear/contributors
I would like to be bold here
and merge it next Monday (27. March) afternoon
unless there are objections.

@pcahyna
I think because we are in ReaR 2.8 development phase
it is OK to "just merge" something like this one because
it does not look as if instantly hell broke loose
because of this changes (at least it "just works"
for @hpannenb so it cannot be wrong in any case)
and whether or not it is wrong in some cases
we learn best how it actually goes wrong
when various interested users who use our
current GitHub master code try it out
in their individual networking environments.

@jsmeix jsmeix merged commit 9c66f1a into rear:master Mar 27, 2023
@jsmeix jsmeix added bug The code does not do what it is meant to do fixed / solved / done labels Mar 27, 2023
@hpannenb hpannenb deleted the getallnics branch April 24, 2023 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The code does not do what it is meant to do enhancement Adaptions and new features fixed / solved / done
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants