sftp: add symlink support - fixes #5011#8040
sftp: add symlink support - fixes #5011#8040NikitaCOEUR wants to merge 4 commits intorclone:masterfrom
Conversation
|
What do you think of changing this to work the way the local backend does with the If this was compatible with the local backend then we'd be able to copy symlinks from local <-> sftp as well as sftp <-> sftp which would make this a lot more generally useful |
|
Thank you for your feedback! I'd be happy to try using the |
|
I didn't mean use --links, I meant make --sftp-links work in the same way, making files with the contents of the link with suffix .rclonelink |
|
Ok I'll work on it. |
|
Hello, I wanted to provide an update on my progress with the SFTP backend and symlink implementation for this PR. I've encountered a few challenges and would appreciate your feedback on the following points: Comparison of Symlink Sizes [Use target size instead of Symlink Size]Instead of comparing the size of the symlink itself at the source and destination, I prefer comparing the size of the file pointed to by the symlink. This is because some SFTP servers correct symlinks with relative paths to absolute paths (e.g., Handling Symlinks Pointing to Directories [Ensure target is a directory on source and destination]The same issue arises with symlinks pointing to directories. It is challenging to verify if the target directories are the same, as comparing sizes or hashing directories is impractical. Timestamp Management for Symlinks [Ignore Timestamp for symlink on sftp backend]I've encountered some surprises regarding timestamp management for symlinks. Most SFTP servers don't seem to support modifying the access and modification times of symlinks directly. I’ve opened an issue on the lutimes topic in the SFTP library to gather more input: Issue #594. I wanted to keep you informed of my progress. I’m working on this in my spare time, which may explain some delays in communication. Thank you for your feedback. |
I’ve responded to your request at the SFTP package itself, but I want to add here for quick reference: your suspicion here is correct. There is sadly no way within the protocol to operate on symlinks directly, except for All the workarounds I could find “escape-hatch” out of SFTP into the greater SSH protocol to execute an arbitrary command. |
|
Thanks for working on this @NikitaCOEUR
That might cause a problem with continual syncing if we are syncing symlinks to the sftp server.
If we want to emulate the handling of the I guess we could if necessary mark the size of the symlinks as unknown (
I think this should just be the symlink size and the code shouldn't care if it is pointing to a file or directory.
Hmm, we should certainly avoid setting the timestamp of the destination when we meant to set the timestamp of the symlink. If we just ignore the SetModTime for the symlinks then potentially you'll have rclone trying to modify the time on each sync. Returning
No worries, work at whatever pace you need. I work in an entirely interrupt driven way so if you write something on the PR I'll look at it, be that in 1 day or week or month! |
|
Hi,
This causes the recreation of each symlink every time. When we create a symlink, the ModTime is set to the same as the Created Time. When rclone is launched for the first time and the symlink was created by a preceding rclone run, the symlink on the destination and source appear similar, but the ModTime differs. Raising this error triggers the systematic recreation of the symlink. I believe we need to find an alternative because systematically recreating the symlink doesn't make sense to me. I think we should completely ignore the ModTime check for symlink objects. But I’m not sure how to go about it yet. |
|
Any progress on it? |
|
Please just implement it by recreating the symlinks first. People choosing to run with this flag would foremost want the symlinks, regardless of whatever they might have pointed at in the source filesystem, and regardless of potential inefficiencies. I think recreating all symlinks is not a huge cost, and would greatly increase the value of rclone as a whole to me. |
Add new flag (--sftp-links) for backend sftp to recreate symlink from source to destination.
773620c to
55b2d83
Compare
|
Hi everyone, I took some time to revisit this and tried to implement the approach that ncw suggested for the local backend. I can confirm that it works, but I’ve run into some important limitations along the way. That said, the scenario that I think will interest most people—synchronizing a local backend with an OpenSSH SFTP server, or synchronizing between two OpenSSH SFTP backends while keeping the same directory structure—should work. This is probably the use case that matters the most in practice. It’s important to note that the SFTP protocol itself does not allow this feature to be implemented perfectly. There are inherent constraints (like path normalization, chroot environments, and lack of lutimes for symlinks) that can cause edge cases to fail. I also want to apologize for the time I spent on this. The problems I encountered initially were quite demotivating, and I had even worked around them by adding a script to replicate symlinks between the servers I wanted to sync. However, the recent comments made it clear that this feature isn’t only desired by me, so I hope this implementation can serve the purpose and be useful for others as well. Thanks for taking the time to review this! |
|
I checked out your repo and build the |
|
What exactly was the command you executed? |
|
Started with: |
|
I think that’s the problem you encountered, because source symlinks are not transmitted as You should verify the symlinks on the destination side, as I noticed you're transferring them from /var/www to /data/www. Some SFTP servers may rewrite or normalize symlinks depending on their implementation. I ran into a few unexpected behaviors during my tests. |
|
Hmmm. Local and remote are both ext4 filesystems, and the remote SFTP-server is |
|
I think you can just use |
|
Okay, it seems that copying to an existing remote copy doesn't overwrite existing files... When writing to a new remote directory, it works! But this does have me concerned, that old/obsolete content does not get overwritten. What if I in the future replace a file with a symlink, will the remote keep having the file instead of the symlink? |
|
@NikitaCOEUR wrote:
Thanks for persevering with this, and I'm sorry for the demotivation. I will give this a review now - thank you :-) |
ncw
left a comment
There was a problem hiding this comment.
This is looking very good - thank you :-)
I put one minor comment inline and there are some lint errors for you to fix up too.
And some questions:
Do the integration tests still pass? If you run test_all -backends sftp? You'll need docker for that to work. (See the docs https://github.com/rclone/rclone/blob/master/CONTRIBUTING.md#backend-testing )
I guess rclone serve sftp doesn't support symlinks either so I think the integration test will fail there.
Great work :-)
backend/sftp/sftp.go
Outdated
| if len(opt.SSH) != 0 && ((opt.User != currentUser && opt.User != "") || opt.Host != "" || (opt.Port != "22" && opt.Port != "")) { | ||
| fs.Logf(name, "--sftp-ssh is in use - ignoring user/host/port from config - set in the parameters to --sftp-ssh (remove them from the config to silence this warning)") | ||
| } | ||
| // Override --sftp-links with --links if set |
There was a problem hiding this comment.
This needs to be called out in the docs like the relevant section in the local backend
Note that
--sftp-linksjust enables this feature for the sftp backend.--linksand-lenable the feature for all supported backends (includinglocal) and the VFS.
|
Indeed, after quickly re-testing to reproduce the issue reported by pepa65, I managed to trigger the same error using I’m currently working on a detector that evaluates whether the SFTP backend supports symlinks, so that an error can be raised earlier if it doesn’t. The goal is to warn users when the backend cannot handle symlinks and that I also noticed that when trying to replace a regular file with a symlink, the symlink is created, but it gets removed during the cleanup phase (because of the
Finally, I saw that symlinks are always flagged for recreation because their ModTime cannot be compared (due to Should I add a new commit to this PR, or would it be better to handle this in a separate one? Note: I’m currently trying to run I’m not sure if this is expected behavior or if there’s an actual issue. |
|
A detector sounds like a good idea. You can check for that error in the integration tests and skip them.
Effectively there are two files with the same name which confuses the sync.
We need to fix that. Just calculating the hash should do it though shouldn't it?
Yes add to this commit. Let's get everything we need for the feature in this PR.
Don't mind - however you like to work. It will all get squashed into one commit when it gets merged.
Sounds like it is working! It takes a while then opens an HTML page with the results. The can't connect might just be startup things or maybe you have a local firewall?
|
Add SymlinkValidator and SetModTimeCapability interfaces to allow backends
to validate their symlink capabilities before sync operations begin. This
provides early detection of configuration errors when symlink support is
enabled but not actually available on the remote server/filesystem.
- Add SymlinkValidator interface to fs/types.go for backends to validate
symlink support during sync/copy/move operations
- Add SetModTimeCapability interface to declare when SetModTime is supported
for specific object types (e.g., SFTP cannot set modtime on symlinks)
- Implement ValidateSymlinks in local and sftp backends with appropriate
platform-specific testing (Windows requires privileges, Unix always supports)
- Add early validation in sync operations to fail fast with clear errors
- Add CanSetModTime implementation for SFTP to handle translated symlinks
Enhance the equality checking and hash comparison logic to better handle
translated symlinks and backends that don't support SetModTime for certain
object types.
- Add hash comparison fallback for translated symlinks even when filesystems
don't declare common hash support, allowing proper verification of
.rclonelink file contents
- Skip modification time verification when SetModTime is not supported for
specific objects (e.g., SFTP symlinks), preventing false inequality matches
|
Hi @ncw, I've implemented all the changes: Changes Made
Integration Tests I'm having difficulties getting consistent results with
I’m working on Windows through a WSL environment, and I’ve never encountered issues like the ones I’m seeing here. The tests seem to start a container or a service, but those are stopped or restarted between tests — though that doesn’t always seem to happen. When they aren’t running, the tests consistently return a “connection refused” error. |
|
My job with symlinks went fine. |
|
Thanks for the changes - will check those over in a bit
The integration tests work fine on Linux - you can see them running here https://integration.rclone.org I suspect this is a problem with WSL, perhaps with docker images on WSL. I don't have time to fix the tests on WSL though. Maybe @roucc can test this for you? |
What is the purpose of this change?
The primary goal of this change is to enhance rclone's SFTP backend by adding support for preserving symlink structures during file transfers. This is particularly useful for users who need to maintain the integrity of symbolic links when syncing files between systems.
Add a new flag,
--sftp-links, to the SFTP backend of rclone. The--sftp-linksflag allows users to copy symbolic links (symlinks) as symlinks instead of following them.The added functionality preserves the symlink structure on the destination, provided both the source and destination remotes support symlinks
Important
Currently supported only between two SFTP remotes
The validation process ensures that symlinks are copied correctly by checking the size and hash of the target file, except when the target is a directory.
Was the change discussed in an issue or in the forum before?
This issue was opened in 2021 and still requested : #5011
Checklist
Tip
Need Help to add unit tests, i never do that before ...