Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

irsync -r fails on symbolic links that create file system loops #3988

Closed
2 tasks done
andrew-irods opened this issue Jun 20, 2018 · 8 comments
Closed
2 tasks done

irsync -r fails on symbolic links that create file system loops #3988

andrew-irods opened this issue Jun 20, 2018 · 8 comments
Assignees
Labels
Milestone

Comments

@andrew-irods
Copy link

andrew-irods commented Jun 20, 2018

  • master
  • 4-2-stable

Bug Report

irsync -r fails on symbolic links that create file system loops. In some cases the collection path names created are so wide as a result of a loop, that they are difficult to remove.

(see output from D. Hu below)

This issue includes dealing with similar issues with iput -r.

iRODS Version, OS and Version

D. Hu - irods 4.2.2
Reproduced at irods on 4-2-stable/Ubuntu 16.04

Description and process to reproduce.

From D. Hu's email from 6/18/2019:

To reproduce the problem is easy,


[donghu@data13 ~]$ pwd
/home/donghu
[donghu@data13 ~]$ mkdir linktofilder
[donghu@data13 ~]$ cd linktofilder/
[donghu@data13 linktofilder]$ ls
[donghu@data13 linktofilder]$ ln -s /home/donghu/linktofilder linktofilder
[donghu@data13 linktofilder]$ ls -l
total 2
lrwxrwxrwx 1 donghu centos 25 Jun 18 10:44 linktofilder -> /home/donghu/linktofilder

With the symlink in place, try a irsync:

irsync -r -v  /home/donghu/linktofilder i:/resarchivezone/home/donghu/f10/
C- /resarchivezone/home/donghu/f10:
C- /resarchivezone/home/donghu/f10/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
...
...
C- /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder:
remote addresses: 172.20.7.31 ERROR: rcObjStat of /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/li failed status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: mkColl error for /resarchivezone/home/donghu/f10/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/linktofilder/li status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: put /home/donghu/linktofilder failed. status = -308000 status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: put /home/donghu/linktofilder failed. status = -308000 status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: put /home/donghu/linktofilder failed. status = -308000 status = -308000 USER_PACKSTRUCT_INPUT_ERR
...
...
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: put /home/donghu/linktofilder failed. status = -308000 status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncDirToCollUtil: put /home/donghu/linktofilder failed. status = -308000 status = -308000 USER_PACKSTRUCT_INPUT_ERR
remote addresses: 172.20.7.31 ERROR: rsyncUtil: rsync error for /resarchivezone/home/donghu/f10 status = -308000 USER_PACKSTRUCT_INPUT_ERR

Related Issues: #3995, #3997, #4006, #4008, #4009, #4013, #4016

@andrew-irods andrew-irods self-assigned this Jun 20, 2018
@andrew-irods
Copy link
Author

andrew-irods commented Jun 21, 2018

Current status:

The current fix can detect the following errors:

  1. Symbolic link to itself (i.e. "ln -s xxx xxx").

  2. Dangling symbolic link (i.e. "ln -s existingfile dangle; rm existingfile").

In both of these cases, the utilitiy will terminate with an error message and a USER_INPUT_PATH_ERR error (this is a change in behavior). Previously the irsync and/or iput utility would work through these errors, and simply report a message if the error were actually detected (which they mostly weren't).

What is currently not detected, are more sophisticated loops, for example - "irsync -r" of a physical directory which has nested directories, under which a symbolic link is pointing to an existing parent physical directory.

The linux "find" utility WILL ALWAYS detect loops, and can be used as a work around, for example - before using "irsync -r" on a folder, let's say "some_folder", using the following command before using irsync, will detect symbolic link loops of this nature, allowing the user to abort the process and fix the file system:

$ find -L some_folder          # and add your favorite find options

Not sure yet if it's irods' job to detect such loops - to be discussed.

Still to finish: lots of test cases for iput and irsync that cover all of the above.

@andrew-irods
Copy link
Author

andrew-irods commented Jun 21, 2018

Example of using find to find loops (taken from the directory where I'm testing).

akelly@akellydt1:~/files$ find -L . > /dev/null
find: File system loop detected; ‘./case2/linktofolder/save1/save3’ is part of the same file system loop as ‘./case2/linktofolder/save1’.
find: File system loop detected; ‘./case2/linktofolder/save1/save’ is part of the same file system loop as ‘./case2/linktofolder/save1’.
find: File system loop detected; ‘./case2/save/save3’ is part of the same file system loop as ‘./case2/save’.
find: File system loop detected; ‘./case2/save/save’ is part of the same file system loop as ‘./case2/save’.
find: File system loop detected; ‘./case1/ddate/prevsym’ is part of the same file system loop as ‘./case1’.
find: ‘./case1/prevsym’: Too many levels of symbolic links
find: ‘./case3/self_symlink’: Too many levels of symbolic links
akelly@akellydt1:~/files$ echo $?
1

@trel trel added this to the 4.2.4 milestone Jun 28, 2018
@andrew-irods
Copy link
Author

andrew-irods commented Jun 29, 2018

Status Update

Much of the discussion above is no longer up to date. Please see here for the latest.

The development work for this bug fix is mostly done, and features:

For both "irsync -r" and "iput -r", a pre-scan (or preflight scan) is run on the icommand physical parameter directories that participate in the icommand operation (copying directories and files into the irods vault). The prescan detects the following issues with the directory/file heirarchy:

  1. Dangling symlinks (symlinks that point to physical files or directories that are no longer there.

  2. Self-looping symlinks (symlinks that point to themselves).

  3. File system loops (symlinks that point to a directory which exists, but which is already participating in the icommand operation).

The scan operates on all the command line directories at one time -- in other words, if a symlink which exists under one of the parameter directories points to an existing directory which is within one of the other command line directories, it will be found as a file system loop.

For most errors, the pre-scan does not terminate when it finds an error. Instead, all the directories are scanned if possible, and all the errors found are reported to stderr, and the icommand (irsync or iput) terminates right after the scan, without making any changes to the catalog or transfering files to the irods vault.

The only exception is the target collection which is created immediately, but will be empty after the icommand exits as a result of errors found during the scan.

The other exception to the behavior described above is the case where one of the directories being scanned does not provide enough authority for the icommand to access it. When this is detected, the scan is aborted with the following error message:

remote addresses: 127.0.0.1 ERROR: filesystem::recursive_directory_iterator directory error: Permission denied

and the icommand exits.

Error messages:

Either/both utilities will display the following error messages for issues found during the scan:

  1. For a file system loop:
remote addresses: 127.0.0.1 ERROR: USER_INPUT_PATH_ERR: File system loop detected
Path: dir_19M/dir9/sym_loop
Canonical path: /media/akelly/home/admin/files/dir_9.4M/dir0
  1. For a symlink which is pointing to itself:
remote addresses: 127.0.0.1 ERROR: USER_INPUT_PATH_ERR: Too many levels of symbolic links
Path = dir_9.4M/some_symlink
  1. For a dangling symlink:
$ cp /dev/null xxx
$ ln -s `pwd`/xxx `pwd`/dir_9.4M/dangling_symlink
$ rm xxx
$ irsync -r dir_9.4M i:tmp
remote addresses: 127.0.0.1 ERROR: USER_INPUT_PATH_ERR: No such file or directory
Path = dir_9.4M/dangling_symlink

Performance Cost

The pre-scan is extremely inexpensive - much less than 1% of the total time it takes the icommand to run in most cases. Please see the next comment below for some performance numbers.


A note about issue #3997: As a result of this issue, under certain circumstances, iput and irsync will terminate with an error when there is more than one physical directory participating in an operation into a collection. For example:

irsync -r dir1 dir2 i:tmp

If the tmp collection does not yet exist in the catalog, the following error will appear:

remote addresses: 127.0.0.1 ERROR: resolveRodsTarget: target /tempZone/home/rods/tmp does not exist status = -310000 USER_FILE_DOES_NOT_EXIST
remote addresses: 127.0.0.1 ERROR: rsyncUtil: resolveRodsTarget status = -310000 USER_FILE_DOES_NOT_EXIST

As a workaround, the collection can be created before the icoomand is issued:

imkdir tmp

This will avoid these errors. This will disappear once issue #3997 is fixed.

@andrew-irods
Copy link
Author

andrew-irods commented Jun 29, 2018

Performance Impact of the Pre-Scan:

For each of the directories listed below, a count of files, directories, disk space taken, and the scan duration is reported, followed by the output of the time utilitiy while running the iput command. This was run on a Ubuntu 16.04 system with an Intel i7-3770 CPU @ 3.40GHz × 8, and with 24Gb of RAM:

dir_9.4M:

Files: 2,300
Directories: 84
Disk space: 9.4M 
Directory scan duration: 0.04854 seconds

+ time iput -r dir_9.4M
real    1m36.153s
user    0m0.218s
sys     0m0.217s

Note: The pre-scan took 0.04854 seconds out of 1m36.153s for the complete iput -r operation

dir_19M:

Files: 4,600
Directories: 168
Disk space: 19M 
Directory scan duration: 0.09436 seconds

+ time iput -r dir_19M
real    1m40.813s
user    0m0.312s
sys     0m0.396s

Note: The pre-scan took 0.09436 seconds out of 1m40.813s for the complete iput -r operation

dir_45M:

Files: 10,950
Directories: 392
Disk space: 45M 
Directory scan duration: 0.2414 seconds

+ time iput -r dir_45M
real    2m28.738s
user    0m0.708s
sys     0m1.179s

Note: The pre-scan took 0.2414 seconds out of 2m28.738s for the complete iput -r operation

dir_69M:

Files: 17,000
Directories: 603
Disk space: 69M 
Directory scan duration: 0.3872 seconds

+ time iput -r dir_69M
real    8m57.829s
user    0m1.306s
sys     0m1.796s

Note: The pre-scan took 0.3872 seconds out of 8m57.829s for the complete iput -r operation

@andrew-irods
Copy link
Author

Status update: currently writing test cases for our Continuous Integration testing environment. The bug fixes and enhancements associated with this issue are complete.

andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 13, 2018
[irods#4009] iput/irsync: Fixed bad logical path created for symbolic links
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 13, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 13, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 13, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 14, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 15, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 15, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 15, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 15, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 16, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 16, 2018
@andrew-irods
Copy link
Author

andrew-irods commented Jul 16, 2018

CI build 1156 for 4-2-stable pull-request - passed
CI build 1157 for master pull-request - passed

andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 16, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 16, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 17, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 17, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 18, 2018
andrew-irods pushed a commit to andrew-irods/irods that referenced this issue Jul 18, 2018
trel pushed a commit that referenced this issue Jul 18, 2018
[#4009] iput/irsync: Fixed bad logical path created for symbolic links
@andrew-irods
Copy link
Author

CI Build 1165 (4-2-stable) and 1168 (master) passed.

trel pushed a commit that referenced this issue Jul 18, 2018
trel pushed a commit that referenced this issue Jul 18, 2018
@trel
Copy link
Member

trel commented Jul 18, 2018

merged - please update checkmarks and close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants