Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of broken symlinks (fix #2129, fix #2130) #2131

Merged
merged 3 commits into from May 3, 2019

Conversation

OliverO2
Copy link
Contributor

Pull Request Details:

Note: On the test system, there is a relative symbolic link /etc/ssh/sshd_config -> sshd_config.company, which points to a customized configuration.

Output without this PR's changes
$ sudo usr/sbin/rear -v mkrescue
Relax-and-Recover 2.4 / Git
[...]
Copying all files in /lib*/firmware/
Broken symlink './etc/ssh/sshd_config' in recovery system because 'readlink' cannot determine its link target
Broken symlink './usr/share/misc/magic' in recovery system because 'readlink' cannot determine its link target
Failed to copy symlink target '/usr/src/linux-headers-4.18.0-18-generic'
Testing that the recovery system in /tmp/rear.2F8DRaogql1zNej/rootfs contains a usable system
[...]
Output with this PR's changes
$ sudo usr/sbin/rear -v mkrescue
Relax-and-Recover 2.4 / Git
Running rear mkrescue (PID 4478)
[...]
Copying all files in /lib*/firmware/
Symlink '/usr/share/misc/magic' -> '/usr/share/file/magic' refers to a non-existing directory on the rescue system.
It will not be copied by default. You can include '/usr/share/file/magic' via the 'COPY_AS_IS' configuration variable.
Symlink '/lib/modules/4.18.0-18-generic/build' -> '/usr/src/linux-headers-4.18.0-18-generic' refers to a non-existing directory on the rescue system.
It will not be copied by default. You can include '/usr/src/linux-headers-4.18.0-18-generic' via the 'COPY_AS_IS' configuration variable.
Testing that the recovery system in /tmp/rear.5TcC1r3dx1c336B/rootfs contains a usable system
[...]

@jsmeix
Copy link
Member

jsmeix commented Apr 30, 2019

@OliverO2
many thanks for your improvement!

On first glance by plain looking at the code it looks good to me
but I like to test it a bit on Thursday before I approve it...

@jsmeix jsmeix self-assigned this Apr 30, 2019
@jsmeix jsmeix added bug The code does not do what it is meant to do enhancement Adaptions and new features labels Apr 30, 2019
@jsmeix jsmeix requested review from jsmeix and gdha April 30, 2019 14:16
@jsmeix
Copy link
Member

jsmeix commented Apr 30, 2019

@gdha @rmetrich
could you also have a look here and review it?

Do you think we can add it to ReaR 2.5?

Because of the reasoning in
#2129 (comment)
I think this could be good to have in ReaR 2.5.

@jsmeix jsmeix requested a review from rmetrich April 30, 2019 14:17
@gdha gdha added this to the ReaR v2.5 milestone May 1, 2019
Copy link
Member

@gdha gdha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OliverO2 I'm impressed - good catches!

@gdha
Copy link
Member

gdha commented May 1, 2019

@jsmeix I think it would be good to add it to the 2.5 release.

if test "$link_target" ; then
if [[ "$link_target" != /* ]]; then
# convert a relative link target into an absolute one
link_target="$broken_symlink/$link_target"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot reproduce the "$link_target" != /* case.

I have

 /mystuffdir/myotherstuffsymlink -> ../myotherstuff/myotherstuff

and

COPY_AS_IS+=( /mystuffdir )

and at the beginning of build/default/985_fix_broken_links.sh
I exit "rear mkrescue" with

Error "985_fix_broken_links.sh"

In that up to that point created recovery system I run the new code manually

# ROOTFS_DIR=/tmp/rear.KFgvwIS5Z7g8AHd/rootfs/

# broken_symlinks=$( chroot $ROOTFS_DIR find / -xdev -path '/proc' -prune -o -path '/sys' -prune -o -path '/dev' -prune -o -xtype l -print )

# pushd $ROOTFS_DIR

# for broken_symlink in $broken_symlinks ; do echo $broken_symlink ; readlink $v -e $broken_symlink ; echo ============= ; done
/mystuffdir/myotherstuffsymlink
/myotherstuff/myotherstuff
=============
/etc/localtime
/usr/share/zoneinfo/Europe/Berlin
=============
/etc/mtab
/proc/31110/mounts
=============
/etc/termcap
/usr/share/misc/termcap
=============

# popd

All of the link targets start with a leading /.
So from my current point of view the "$link_target" != /* case never happens.

But I can reproduce how the old code failed when I run the old code manually
in that up to that point created recovery system:

# broken_symlinks=$( chroot $ROOTFS_DIR find . -xdev -path './proc' -prune -o -path './sys' -prune -o -path './dev' -prune -o -xtype l -print )

# pushd $ROOTFS_DIR

# for broken_symlink in $broken_symlinks ; do echo $broken_symlink ; readlink -v -e $broken_symlink ; echo ============= ; done
./mystuffdir/myotherstuffsymlink
readlink: ./mystuffdir/myotherstuffsymlink: No such file or directory
=============
./etc/localtime
/usr/share/zoneinfo/Europe/Berlin
=============
./etc/mtab
/proc/31848/mounts
=============
./etc/termcap
/usr/share/misc/termcap
=============

# popd

Copy link
Contributor Author

@OliverO2 OliverO2 May 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking that thoroughly!

I have already been wondering why the message in my above example was

Symlink '/usr/share/misc/magic' -> '/usr/share/file/magic' refers to a non-existing directory on the rescue system.

and not

Symlink '/usr/share/misc/magic' -> '/usr/share/misc/../file/magic' refers to a non-existing directory on the rescue system.

The reason seems to be that readlink /usr/share/misc/magic outputs ../file/magic, while readlink -e /usr/share/misc/magic outputs /usr/share/file/magic. So the result of the -e invocation always seems to be an absolute path. In this case, [[ "$link_target" != /* ]] would never match.

The problem is that the manual page fails to define the term "canonical". So it could mean

  • "return the shortest unique path, which might be a relative one", but it could also mean
  • "return the shortest absolute path".

Lacking a clear definition, we really don't know what readlink -e is supposed to do. For example, here is an opinion where relative paths are deemed "canonical": theory - What's a "canonical path"? - Stack Overflow

I think we have two options here: We could remove that piece of code converting link targets into absolute paths and rely on readlink -e consistently behaving in the way we have observed it. Or, we could keep that piece of code as a (possibly unused) safety net. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I remove the code

        if [[ "$link_target" != /* ]]; then
            # convert a relative link target into an absolute one
            link_target="$broken_symlink/$link_target"
        fi

I get same results - at least in my case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the code but add an explanatory comment why it is there,
cf. https://github.com/rear/rear/wiki/Coding-Style

Copy link
Member

@jsmeix jsmeix May 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That safety net code still does not work.

I have that relative symlink and its target file

/mystuffdir/mystuffsubdir/myotherstuffsubsymlink -> ../../myotherstuff/myotherstuff
/myotherstuff/myotherstuff

and in etc/rear/local.conf

COPY_AS_IS+=( /mystuffdir )

When I simulate usage if the safety net code via

        link_target=$( readlink $v -e $broken_symlink )
        test "$broken_symlink" = "/mystuffdir/mystuffsubdir/myotherstuffsubsymlink" && link_target=$( readlink $v $broken_symlink )
        if [[ "$link_target" != /* ]]; then

I get in the "rear mkrescue" output

Failed to make parent directories for symlink target '/mystuffdir/mystuffsubdir/myotherstuffsubsymlink/../../myotherstuff/myotherstuff'
Failed to copy target of symlink '/mystuffdir/mystuffsubdir/myotherstuffsubsymlink' -> '/mystuffdir/mystuffsubdir/myotherstuffsubsymlink/../../myotherstuff/myotherstuff'

I can fix the safety net code via

        if [[ "$link_target" != /* ]]; then
            # convert a relative link target into an absolute one
            broken_symlink_dir=$( dirname $broken_symlink )
            link_target="$broken_symlink_dir/$link_target"
        fi

But menawhile I wonder if we really need that safety net code.

Your Ubuntu "man readlink" manpage
http://manpages.ubuntu.com/manpages/bionic/en/man1/readlink.1.html
any my openSUSE Leap 15.0 "man readlink" manpage show at the bottom

Full documentation at: http://www.gnu.org/software/coreutils/readlink

and that URL leads to
https://www.gnu.org/software/coreutils/manual/html_node/readlink-invocation.html#readlink-invocation
which reads

12.6 readlink: Print value of a symlink or canonical file name

readlink may work in one of two supported modes:

‘Readlink mode’
readlink outputs the value of the given symbolic links.
If readlink is invoked with an argument other than the
name of a symbolic link, it produces no output and
exits with a nonzero exit code.

‘Canonicalize mode’
readlink outputs the absolute name of the given files
which contain no ., .. components nor any repeated
separators (/) or symbolic links.
Note the realpath command is the preferred command
to use for canonicalization.

Regarding the realpath command:
We cannot use that in current ReaR because it is not
available on SLES11 (at least not by default there)
and current ReaR still supports SLES11.

According to the GNU full documentation readlink -e

outputs the absolute name

and I never experienced anything different
so that I think we do not really need that
at least overcomplicated looking safety net code.

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safety net code must either be fixed according to
https://github.com/rear/rear/pull/2131/files#r280711680
or the safety net code is dropped because we don't need it.

@OliverO2
Copy link
Contributor Author

OliverO2 commented May 3, 2019

@jsmeix Thanks again for trying!

Given the current situation, it seems to be best to leave out code which would solve one problem and might introduce others.

In my view, the GNU coreutils documentation style really sucks: We have two sets of non-synchronized documentation, one of them (the html site) missing version information, the other one (manual page) being vague by failing to define terms. Seems to be the same with dirname: Could you find any definitive statement saying that dirname does a specific sort of canonicalization (as you have observed)?

Copy link
Member

@jsmeix jsmeix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now all looks good to me.

@jsmeix
Copy link
Member

jsmeix commented May 3, 2019

@OliverO2
many thanks for your patient laborious work.
It helps so much to make the ReaR code better.

Regarding my dirname usage in
#2131 (comment)

if [[ "$link_target" != /* ]]; then
    # convert a relative link target into an absolute one
    broken_symlink_dir=$( dirname $broken_symlink )
    link_target="$broken_symlink_dir/$link_target"
fi

I used dirname not for any kind of canonicalization.
I used it to split away the last part (i.e. the symlink name)
from $broken_symlink because $broken_symlink/$link_target is not valid like

/path/to/symlink_dir/relative_symlink_name/../../rel_path/to/target_file_or_dir

but $broken_symlink_dir/$link_target is valid like

/path/to/symlink_dir/../../rel_path/to/target_file_or_dir

Canonicalization of the latter would result

/path/rel_path/to/target_file_or_dir

By the way:
Do you still like symlinks?
(cf. #2129 (comment) ;-)

Again thank you for your valuable contributions to ReaR
and enjoy the weekend!

@OliverO2
Copy link
Contributor Author

OliverO2 commented May 3, 2019

@jsmeix Ah, now I see. Thanks so much for your work. Although the commits were mine, your contribution was certainly the more laborious one!

And yes, I'm still fine with symbolic links. What I have really missed in this case is

  • proper tooling (debugger, code coverage analysis, automated testing) and
  • proper documentation of functions used.

Have an excellent weekend, too! Well deserved! 🥇

@jsmeix
Copy link
Member

jsmeix commented May 3, 2019

Via
b4e70e0
I added comments with examples how broken symlinks and their targets look like
so it is hopefully easier to imagine what goes on here when someone else
may have to look at that code at any later time.

@OliverO2 OliverO2 deleted the fix-broken-symlink-handling branch November 15, 2019 23:43
@jsmeix jsmeix mentioned this pull request Jun 2, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The code does not do what it is meant to do enhancement Adaptions and new features fixed / solved / done
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants