Syncing symbolic links as reference #250

Open
EorlBruder opened this issue Apr 17, 2018 · 64 comments · May be fixed by nextcloud/server#41321 or #6205
Labels
discussion enhancement enhancement of a already implemented feature/code epic feature: 🔄 sync engine feature-request new feature

Comments

@EorlBruder

Currently when I add a symlink inside my Nextcloud folder it doesn't sync because "Symbolic links are not supported in syncing".
I am trying to use Nextcloud as a home for my data, so that it stays synced between my PCs. I'd also like to reach my git repositories (mostly code) through the directory structure in Nextcloud, but I don't want to sync the repositories themselves via Nextcloud. It would therefore be really nice to be able to sync only the link to a repository, which sits at a fixed location on every system, e.g. ~/Nextcloud/current_projects/fizzbuzz would point to ~/repositories/fizzbuzz.
If I happen to be on a system where this git repository isn't present, I can just clone it there.
This behaviour could be optional (like the syncing of hidden files).

There's also an issue about this for the owncloud-client here: owncloud/client#1440

@camilasan
Member

I see the use case, but we also would have to deal with windows links on mac, mac links on Linux... you get the drill. So between confusion and technical difficulty, and adding that most users don't even use symlinks all that much, this mostly seems a low-priority thing for me. Of course, anyone is free to work on it but he/she will have to do a full (cross-platform) solution, and that won't be easy.

@EorlBruder
Author

Would the proposed implementation in point three of this post https://help.nextcloud.com/t/symbolic-link-support/220/18?u=callegar be a viable way to solve the problem?

@ferdiga

ferdiga commented Apr 17, 2018

I agree with the arguments mentioned in the link above.
A symbolic link is a very useful way (especially on Linux) to extend local storage capacity using other partitions and/or disks. At least on Linux, the Nextcloud client shouldn't exclude links, as these are transparent to all Linux programs.

@nefelin

nefelin commented Sep 5, 2018

I understand the difficulty involved, but I would love to see a solution to this. Or at least an option to sync links across compatible platforms...

I use symlinks to sort a very large image collection that I use for drawing practice into smaller collections, which lets an image set exist in multiple collections. It is very useful and intuitive, but would be even better if it synced.

@m1cm1c

m1cm1c commented Sep 20, 2018

I see the use case, but we also would have to deal with windows links on mac, mac links on Linux... you get the drill.

Are you saying that if Nextcloud were to support symlinks, it would have to be able to convert them to *.lnk files on Windows? Converting between symlinks and *.lnk files is impossible in either direction, simply because there is no generic way of converting between paths on Linux and paths on Windows. Why not simply create symlinks on the other system only if it's also a Unix-like system? You can't unify Windows with the rest. They chose to be fundamentally incompatible with Linux / Mac / FreeBSD / OpenBSD / Solaris / whatever.

On the windows client, you can choose between:

  • doing nothing
  • dumping a file there that states where the link leads on a Unix-like system

I suppose you currently just dump Windows' *.lnk files as what they are on Unix-like systems, even if they are of no use there (well, they hardly are of any use on Windows too because they can't be used in the middle of paths, but that's a different topic).

@fwsGonzo

fwsGonzo commented Oct 7, 2018

No one is asking for conversion between symlinks and "windows links". There is no one-to-one comparison anyway. The dream solution here is to just not care that you found a symlink on Linux and treat it normally. If it's a folder, it's just a folder, extending storage. I can't understand why the Nextcloud client is hindering this to begin with, as you would have to actively disallow it.

EDIT: This is actually making it hard to use this cloud for me at all. I absolutely needed this feature to work exactly like Dropbox (and others).

@piranhaphish

I agree that this is needed. I would go as far as to say that ignoring symlinks is a bug. Like was mentioned above, a symlinked file/folder can simply be treated as a normal file/folder; you have to intentionally go out of your way to avoid doing so (by using lstat() instead of stat(), for instance).

I don't think anybody is asking for the symlink itself to be represented on the cloud-side, moreso just simply recognized by the client side.
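
To illustrate the lstat() versus stat() point, here is a minimal sketch in Python, whose `os.lstat()` inspects the link itself while `os.stat()` follows it. The helper name `classify` is made up for this example and nothing here is taken from the Nextcloud client.

```python
import os
import stat


def classify(path):
    """Return (kind of the entry itself, kind after following links)."""

    def kind(mode):
        if stat.S_ISLNK(mode):
            return "symlink"
        if stat.S_ISDIR(mode):
            return "directory"
        return "file"

    own = kind(os.lstat(path).st_mode)          # lstat: looks at the link itself
    try:
        followed = kind(os.stat(path).st_mode)  # stat: follows the link
    except FileNotFoundError:
        followed = "broken"                     # dangling link, nothing to follow
    return own, followed
```

A scanner that only ever calls `stat()` never sees the first value, so to it a symlinked folder is indistinguishable from a normal folder; skipping symlinks requires asking `lstat()` explicitly.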

@EorlBruder
Author

I actually was asking for symlinks to be synced as "symlinks" and not to sync the content. The main reason for me is that I want to work out of my nextcloud-directory by default (I have an organization system in place there) but I do not want to sync git-repositories via nextcloud.

@AncalagonTheBlack

Such a feature would be awesome. It would have come in handy so many times, and today again.

@alexeymuranov

I've noticed some seemingly random behaviour of my nextcloud-client 2.5.0 on Ubuntu (installed from nextcloud-devs PPA) with respect to symlinked folders inside a synchronised folder.

I first noticed that files in some local folder A, which was not synced itself but was symlinked from inside a synced folder S_local, seemed to have disappeared by themselves. I looked into it and found that in the local synced folder S_local I had a symlink A to the "external" folder A, but in the remote synced folder S_remote there was an actual non-empty subfolder A (with the same name) in that place. When trying to sync, the Nextcloud client was simply wiping the contents of my local folder A (which was just symlinked from inside S_local), and it was not downloading the contents of the remote actual folder A anywhere.

So far I haven't managed to reproduce this behaviour with a fresh folder (but I can reproduce it in the A folder). However, I observed some other strange behaviour. For example, I currently have a symlink to a folder locally, but a real folder with the same name remotely, and their contents are only partially synchronising.

Is this related to any known bug? I am not ready to report it myself, because I cannot yet reproduce the file-eating behaviour from scratch.

@alexeymuranov

After some more investigation, I decided to report the problems I observed, because a synchronisation client removing files for no reason is a serious problem IMO: #899.

@fkbreitl

fkbreitl commented Jan 25, 2019

Like many others I also need to sync my folders via symlinks and request this important feature.
From what I read above there are no good reasons not to support it and other cloud clients like Dropbox and tools like rsync support it, too.
It is also not necessary to support all platforms at once. You can start with one platform and later others.
One just needs to start and it is a rather simple thing to do.
So what are the plans?

@sphakka

sphakka commented Feb 1, 2019

Some ideas might come from this recent update to rclone

@basos9

basos9 commented Feb 3, 2019

EDIT 2023-10-22: This node is actually for Symlink dereferencing #3335
removed comment
see referred issue

@jcklpe

jcklpe commented Feb 15, 2019

I'm actually having this problem with the client right now. It has for a long time complained to me about symlinks but now it's refusing to sync them at all, which is causing problems for me. Any suggestions on how to fix this in the short term please let me know!

@rickdoesdev

rickdoesdev commented Apr 22, 2019

On the point that is commonly being raised as a counter re cross-platform: it's worth noting that Windows/NTFS -does- support symlinks (and hardlinks, can we please preserve hardlinks while we're at it), which are read and followed the same as on Linux. It's had this support for years and years, but hardly anyone uses them. Lately MS has been desperately trying to claw back a developer user base that was migrating away, so they've made symlinks more obvious and seem to have removed the privilege escalation previously required to use them, so symlinks and hardlinks are first-class citizens on Windows. .lnk files should not factor into this feature request at all.

It's been many years since I've been on a mac though, so I'm not sure what proprietary nonsense apple is doing, but osx is bsd, and last I checked it also still used sym and hardlinks under the covers (ln / ln -s command at least, I assume the fundamental node created on the HDD is the same)

(Edit: for people wanting a nice explorer ui for handling links on windows; http://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html is an amazing bit of software and a day one installation for me on any fresh windows setup)

@jcklpe

jcklpe commented Apr 23, 2019

I can confirm the moves by MS to include symlinks and hardlinks more and the link shell extension linked by rcuddy is cray invaluable. Highly recommend!

@scholer

scholer commented May 7, 2019

On the point that is commonly being raised as a counter re crossplatform; it's worth noting that Windows/NTFS -does- support symlinks (and hardlinks, can we please preserve hardlinks while we're at it) which are read and followed same as linux.

I really would like to emphasize this point by rcuddy. Symbolic links are super valuable as-is, which is why Microsoft has supported them for a long time. While symbolic links weren't originally supported in Windows XP, they have been supported since Vista. NTFS has full support for both symlinks and hardlinks (and junctions!). Microsoft initially made the mistake of requiring admin privileges to create symlinks, mostly out of fear of how Windows XP apps would react to symlinks. But Microsoft has now realized that symlinks have fantastic value when organizing the filesystem (just see how ubiquitous symlinks are on Linux!). Microsoft has steadily been making it easier and easier for normal users to create symbolic links. (Last I checked, to allow normal users to create symlinks, you still either had to enable "developer mode", or use "Local Security Policies" to allow non-admin users to create them. But I imagine Microsoft will eventually remove this restriction, and until then it wouldn't be too difficult to just ask NextCloud users to configure this if they want to synchronize symlinks.)

My recommendation is as follows:

Preserve symbolic links, as-is. Relative symbolic links are practically the same on all systems. That is, if I have a relative symlink pointing to ../very/deep/sub/directory/, that symlink can be preserved across all systems. Absolute symlinks are perhaps a bit trickier because of the NTFS drive-letter prefix (although Windows also supports "absolute" paths without a drive letter, e.g. cd \Users to go to C:\Users if you are working on the C-drive, and D:\Users if you are on the D-drive). My recommendation would be to ignore symbolic links with a drive-letter specification, with an option to strip the drive-prefix (e.g. C:\Users is converted to /Users). Alternatively, absolute symlinks could be ignored entirely.

If people want to "sync folders outside the NextCloud folder", NTFS junction points are the obvious solution: They are basically directory hardlinks*, but can be used to link to directories on other drives/partitions.

(*Junctions are not actually hardlinks, since making actual directory hardlinks opens a can of worms. Junction points use NTFS "reparse points", which basically tell the software to stop and reparse the path in a specific way, depending on the type of reparse point.)

Of course, it would also be possible to just have client-side settings to configure the symlink sync behavior, e.g.

  • Relative symlink:   ☒ Sync, as-is   ☐ Sync, dereferencing link   ☐ Ignore.
  • Absolute symlink: ☐ Sync, as-is   ☐ Sync, dereferencing link   ☒ Ignore.
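
As a rough sketch of how such settings could be applied, the Python below classifies a link target string and looks up an action for it. The names `classify_target`, `action_for` and `POLICY` are hypothetical and only mirror the checkboxes above; they are not actual Nextcloud options.

```python
import ntpath
import posixpath


def classify_target(target):
    """Classify a link target string as 'relative', 'absolute' or 'drive-absolute'."""
    drive, _rest = ntpath.splitdrive(target)
    if drive:                        # e.g. "C:" or a UNC prefix
        return "drive-absolute"
    if posixpath.isabs(target) or target.startswith("\\"):
        return "absolute"            # "/Users" or the drive-less "\Users"
    return "relative"


# Hypothetical per-client settings mirroring the checkboxes above.
POLICY = {
    "relative": "sync-as-is",
    "absolute": "ignore",
    "drive-absolute": "ignore",
}


def action_for(target, policy=POLICY):
    """Return the configured action for a link with the given target."""
    return policy[classify_target(target)]
```

One caveat of this sketch: a POSIX file name whose second character is a colon (such as `a:b`) would be treated as drive-absolute, so a real implementation would need to know which platform created the link.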

PS: Windows' .lnk shortcut files have always been an inferior substitute for symbolic links: They must be referencing an absolute path, and they cannot be traversed/followed by e.g. the terminal/command prompt or search utilities.

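Refs:

* https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file

* https://www.tuxera.com/community/ntfs-3g-advanced/junction-points-and-symbolic-links/

* https://www.2brightsparks.com/resources/articles/NTFS-Hard-Links-Junctions-and-Symbolic-Links.pdf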

@Adrien-Luxey

Adrien-Luxey commented May 13, 2019

Hi there, just want to provide my use-case where I don't want NextCloud to follow the symlinks and synchronize their content:

I have this big "Code" directory (with 100MB+ CSVs and the like) that has been causing constant 100% CPU from NextCloud for a while. You might know that it is not trivial to exclude a folder from synchronization locally (to ask NextCloud to stop trying to upload a folder to the server).
Following suggestions in the aforementioned Reddit thread, I just moved my "Code" folder outside of the synced directory, and created a symlink to its previous location.
This solved my problem. So I'm glad symlinks are not followed by NextCloud, as I don't see any other easy solution to prevent uploading subfolders of the synced directory.

@rico666

rico666 commented Oct 22, 2023

I really don't get it. For over 5 years, this babble revolves around how/if to do SYMlinks or not.
What do I care about Windows and MacOS?

Nextcloud runs on Linux. Period. Even if you use a "All-In-One" (https://nextcloud.com/blog/your-guide-to-the-nextcloud-all-in-one-on-windows-10-11/) install on Windows it's still Docker & WSL.

Therefore, the Nextcloud server has native support for symlinks. I have my files on Linux, I do a "ln -s" on my filesystem and see syncing action of the Nextcloud client, still the symlink does not appear even in the web frontend.

It's 2023. low hanging fruits. Minimum viable product. Show the symlink at least in the web frontend, allow clicking on it, show the destination directory - if accessible by permissions. Is that more than 1 hour of work?

@basos9

basos9 commented Oct 22, 2023

On the point that is commonly being raised as a counter re crossplatform; it's worth noting that Windows/NTFS -does- support symlinks (and hardlinks, can we please preserve hardlinks while we're at it) which are read and followed same as linux.

...

My recommendation is as follows:

Preserve symbolic links, as-is. Relative symbolic links are practically the same on all systems. That is, if I have a relative symlink pointing to ../very/deep/sub/directory/, that symlink can be preserved across all systems. Absolute symlinks are perhaps a bit trickier because of the NTFS drive-letter prefix (although Windows also supports "absolute" paths without a drive letter, e.g. cd \Users to go to C:\Users if you are working on the C-drive, and D:\Users if you are on the D-drive). My recommendation would be to ignore symbolic links with a drive-letter specification, with an option to strip the drive-prefix (e.g. C:\Users is converted to /Users). Alternatively, absolute symlinks could be ignored entirely.
...

PS: Windows' .lnk shortcut files have always been an inferior substitute for symbolic links: They must be referencing an absolute path, and they cannot be traversed/followed by e.g. the terminal/command prompt or search utilities.

Refs:

* https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file

* https://www.tuxera.com/community/ntfs-3g-advanced/junction-points-and-symbolic-links/

* https://www.2brightsparks.com/resources/articles/NTFS-Hard-Links-Junctions-and-Symbolic-Links.pdf

Hello, I am quoting the relevant reply from scholer, which is a plan for this. I realize that my comment above is about the complementary feature (syncing dereferenced symlink content, which is another issue, #3335).

So, focusing on the "sync symlink as a reference" case, these aspects have been mentioned already:

  • Internal (to the sync root) links vs external ones
  • Absolute links vs relative ones
  • Non-Unix OS support
  • Web support

So the MVP is kind of complex already, since it involves changes in:

  • the desktop client for *X OSes
  • the server code for web (e.g. for relative links that point outside of the sync dir, web should not resolve the link)
  • the desktop client for Windows (at least to ignore a symlink file until Windows links are implemented)

But I am with you! At least I could help with the testing.

@basos9

basos9 commented Oct 22, 2023

Hi, I think to implement this feature for all platforms at once with cross-platform synchronization is too big for one feature. Splitting it into more manageable pieces might help to make some progress.

I'd like to work on it and provide an implementation to handle Linux symlinks. I can also have a look at Windows (although I don't have any experience with the APIs there), but I don't have a MacOS installation.

I would implement Linux symlink syncing without dereferencing (no server-side handling, sync as file; "git style") first, since this should in my opinion be the most useful and easiest one. This feature would include an option in the Linux client (default off) to enable this behavior and otherwise keep the current behavior.

In the next steps, this can be also implemented for Windows and MacOS with another feature being then the cross-platform synchronization. Dereferencing of symlinks for each platform could then be additional three issues together with the configuration option to switch between the different behaviors.

Do you agree and would you accept such a contribution?

In addition, I am relinking the earlier discussion as a warm-up.
This post is useful for clarification: https://help.nextcloud.com/t/symbolic-link-support/220/22
and so is one comment of mine, https://help.nextcloud.com/t/symbolic-link-support/220/32, which has some thoughts.

As I remember, syncing links as-is (no client dereferencing) raises some things to consider. Regarding my first point above, the more relevant distinction seems to be between internal (to the sync root) link targets and external ones, rather than between absolute and relative ones.

And it seems that the simplest (minimum viable) thing would be to consider only internal links to be semantically sane (remember that for many client machines, even Linux ones, absolute links may not have a meaning, or not the same meaning).

Also, this option (considering only internal links) incidentally simplifies the first implementation for Windows, since we will not have to deal with drive letters.
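
A minimal sketch of the internal-versus-external test, under the assumption that both the link's location and its target are handled as "/"-separated strings relative to the sync root. The function name `is_internal_link` is hypothetical.

```python
import posixpath


def is_internal_link(link_path, target):
    """True if `target`, read from the link at `link_path`, stays inside the sync root.

    `link_path` is relative to the sync root. The check is purely lexical:
    nothing is resolved on disk, so it also works for targets that do not
    exist on this machine.
    """
    if posixpath.isabs(target):
        return False                      # absolute targets count as external
    link_dir = posixpath.dirname(link_path)
    combined = posixpath.normpath(posixpath.join(link_dir, target))
    # After normalisation, an escaping path is ".." or starts with "../".
    return combined != ".." and not combined.startswith("../")
```

For example, a link at `docs/foo.txt` pointing to `../bar.txt` is internal, while the same target on a link in the sync root itself is external. Because the check is lexical, it does not notice a path that passes through another symlink leading outside the root; that would need a separate check.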

@taminob

taminob commented Oct 23, 2023

@basos9 thanks for picking out the comments with possible challenges and solutions.

I think absolute links have to be handled with care, too, since the client's sync directory and the server's data directory won't match, so a conversion will be required.

So the MVP is kind of complex already, since it involves changes in:

* the desktop client for *X OSes

* the server code for web (e.g. for relative links that point outside of the sync dir, web should not resolve the link)

* the desktop client for Windows (at least to ignore a symlink file until Windows links are implemented)

I will experiment a bit (didn't work with any nextcloud code so far), but could very well be that ignoring links that point outside of the sync dir would be easier for the MVP to remove your second bullet point.
But I'll also have to check how the current web interface will react if symlinks start appearing in the data directory.

Since there seems to be no opposition, I'll start working on this issue and will open a PR with a first proposal soon.

@basos9

basos9 commented Oct 23, 2023

Good news.
Let's focus on this functionality (syncing links as-is). I closed the issue about client dereferencing to help in this direction (see the comments there for the rationale).

Regarding the server, we will see, but I think it should be modified to support a new file type for symbolic links, unless this works out of the box. I am not very optimistic about that, based on the comments below.

Quoting some more relevant comments from here (4-5 years ago) to keep them as implementation reference https://help.nextcloud.com/t/symbolic-link-support/220/25

As a final clarification, Nextcloud currently implements (but disables by default for security reasons and due to the hash mismatches) the scenario you are talking about (aka client dereferencing) , but does not implement the one (never dereference) that I am talking about.

https://help.nextcloud.com/t/symbolic-link-support/220/15

symlinks have to point to other files inside your Nextcloud data dir (which to me means relative links only, and no use of ‘…’ to escape the data dir) - so the server would necessarily need to validate this when a new link is uploaded/modified on the server

https://help.nextcloud.com/t/symbolic-link-support/220/18

Recognize the existence of a symlink on the client host and store it in the server as a file. For instance, if I have a symbolic link, such as foo.txt being a symbolic link to ../bar.txt, I would be happy of seeing on the server a foo.txt file with content symlink -> ../bar.txt. Then, the other way round, when getting something that is on the server, but not on the client, recognize if the item to be synced is a file whose content matches the symlink magic, and (i) if so; and (ii) if on a platform supporting symlinks, change the file on the client into a symbolic link.
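
The round trip described in that post could look roughly like this in Python. The `MAGIC` marker is borrowed from the quoted example and is not a real Nextcloud format; `encode_link` and `restore_entry` are hypothetical names.

```python
import os

MAGIC = b"symlink -> "   # marker taken from the quoted example, not a real format


def encode_link(path):
    """Return the placeholder bytes to upload for the symlink at `path`."""
    return MAGIC + os.fsencode(os.readlink(path))


def restore_entry(path, content, can_symlink=True):
    """Recreate a downloaded item: a symlink if it carries the magic, else a file."""
    if can_symlink and content.startswith(MAGIC):
        os.symlink(os.fsdecode(content[len(MAGIC):]), path)
    else:
        with open(path, "wb") as fh:   # platforms without symlinks keep the text file
            fh.write(content)
```

A weakness of this approach is that an ordinary file whose content happens to start with the marker would be turned into a link on download, which is one argument for a dedicated file type on the server instead.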

https://help.nextcloud.com/t/symbolic-link-support/220/32

Regarding type A (noderef), I can see use cases, but I think it should be limited to symbolic links targeting paths inside the sync root. To distribute links that link outside the root is not so practical, since:

a. The paths environment might be different from client to client (e.g. client a has folder /opt/data/, client b does not). If you need a client to modify a path outside the sync root, with type B (since you know what you are doing) you can add a symlink only in the environments you want (e.g. only on client a and not on b).
b. There is no cross-platform meaning for e.g. "C:\data" vs "/home/data".

Also I consider this to be more difficult to implement, since you need to modify client and server to be aware of a new file type. In addition, client should consider all platforms specifically. Also you need to sanitize that links will be pointing inside the sync root.

@RokeJulianLockhart

Apologies for this, but could someone summarise whether the current consensus in this issue is to synchronize symbolic (and hard?) links, junctions, and .url files without modifying them? If not, which shall be synchronized and which not? I'm a layman – I can't confidently parse some of the technical jargon utilized thus far. Thank you.

@taminob

taminob commented Oct 28, 2023

Apologies for this, but could someone summarise whether the current consensus in this issue is to synchronize symbolic (and hard?) links, junctions, and .url files without modifying them? If not, which shall be synchronized and which not? I'm a layman – I can't confidently parse some of the technical jargon utilized thus far. Thank you.

I'm currently working on the synchronization of symbolic links on Unix systems (developing on Linux, might need help to actually test it on Mac).
These links will be synchronized as links to the server.

Hard links will be basically impossible to synchronize (unless I'm missing some possibility, but they'd probably be a mess). They are probably already getting synchronized via dereferencing?
I don't know too much about junctions. Synchronization should be possible in general, but I won't address them in the first implementation, where I'll try to keep the diff as small as possible.
.url files are probably already getting synchronized, aren't they (I think it's just a normal file with a special file ending)? But I won't address any Windows-specific links for now, since the Linux case is also the most basic one, and all the different conversions to a Unix symlink (for storage in the Nextcloud server) should be done in the respective client as a separate feature.

In my opinion, each of these other features deserves its own issue to discuss the details; they'll probably be big enough to have a full separate PR each.
Smaller PRs might help tracking down bugs and on the other side allow continuous progress step by step.

If someone else already wants to start working on something, feel free to contact me.

@f1d094

f1d094 commented Nov 6, 2023

Upon review of the entire discussion chain, I would like to note that under any working circumstance the correct way to synchronize symlinks would be to not dereference them and copy them as symlinks. Under no circumstances should links be processed or followed by synchronization software. They are a basic filetype, where they point to should be irrelevant from the context of file synchronization. Does Nextcloud open .pdf documents and follow any links therein? No. Why would filetype symlink be different? If the link is a broken link, who cares? This is the user's problem and definitely not the problem of the file-sync software. If the target does not exist on "n" number of clients, the server, or even the source...this does not matter. It is a basic file. Please just sync it.

As far as the web/server-side, they should appear as defined by the web-server configuration. Under apache, for example, if "Options -FollowSymLinks" is set then symlinks do not appear/are not followed. Potential security exposures due to lax security of the administrator is a matter of training and/or proper configuration of the webserver. Again, the symlink is just a file and what it contains should not be of any concern of file-synchronization software.

The use case of wanting to easily "link-in" large amounts of content is not valid. De-duplication can be accomplished by the reverse. The content can live under the Nextcloud root and be linked to from the non-nextcloud environment without breaking a fundamental aspect of the filesystem relied on by 99.99967% of *nix users. Symlinks should never be dereferenced.

As far as systems that do not support filetype symlink, then they are not supported and should not be synced to those systems. The end. No conversion, no handling. If a windows user wants similar functionality, they can make a .lnk file of their own which will in turn come back and pollute the filesystem of the linux users but will have no impact on their functionality. Again, no processing of the actual files should be done by file synchronization software.

Files. They should be synchronized unmolested. A symlink is a filetype. Leave them alone and sync them as expected please.

Dereferencing or not synchronizing symlinks breaks the standard operation of *nix filesystems and should be treated as a bug, not a feature.

Handling of this filetype should be the simplest thing in the world. Don't over-complicate things.
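
For what "sync it as a file, never follow it" means at the filesystem level, here is a small Python sketch of copying one entry without dereferencing. The function name `copy_entry` is made up for illustration and this is not how the Nextcloud client transfers data.

```python
import os
import shutil


def copy_entry(src, dst):
    """Copy one file or link without ever following a symlink."""
    if os.path.islink(src):
        # Recreate the link with the same target text; the target is never
        # inspected, so a broken link is copied like any other.
        os.symlink(os.readlink(src), dst)
    else:
        shutil.copy2(src, dst)   # regular file: copy bytes and metadata
```

The copy succeeds whether or not the target exists on the machine doing the copying.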

@f1d094

f1d094 commented Nov 6, 2023

As an additional thought: Since Nextcloud's web folder structure isn't based directly off of the OS file structure, why can't symlinks that would be synchronized to the server simply be ignored by the web frontend and still synchronized to all clients?

This eliminates any "need to sanitize" anything.

@taminob

taminob commented Nov 7, 2023

As an additional thought: Since Nextcloud's web folder structure isn't based directly off of the OS file structure, why can't symlinks that would be synchronized to the server simply be ignored by the web frontend and still synchronized to all clients?

This eliminates any "need to sanitize" anything.

A good idea for the first MVP; I might fall back to this behavior if it's too much trouble to catch the invalid symlinks.
For the final implementation, I think that having them also show up in the Web UI will be more user-friendly, since they are otherwise hidden files on Windows and you have no way of deleting them (probably not too bad since they only take up a couple of bytes, but still).

I have already done some work on server and desktop and was able to upload a symlink using the single file upload (via PUT request); I am currently working on the bulk upload.
I will create the Draft PRs (server and desktop) later today.

@basos9

basos9 commented Nov 7, 2023

Good news, I am preparing a development environment to test your changes @taminob.

Regarding the comments of @f1d094 I think that all the concerns have been stated, but I will repeat some thoughts.

Yes this issue #250 is for the client to sync symbolic links as a reference. (And I agree that the client dereferencing should not be implemented (see comments in the relevant issue)).

Regarding the sanitization of link's target. There might be usability issues (and we should carefully consider security related ones, remember that nextcloud supports sharing, federation, external storages and is in general a complex system). Yes for the MVP we could skip this. But we should vote for a secure and user friendly final implementation after all. The issues:

  • links in folders that do not exist in different clients
  • absolute links that do not make sense in different oses or do not exist in different clients
  • We don't break all possible Nextcloud deployments
  • Any security issues (either on the client or on the server side)

As I see it, if we put an opt-in (default off) switch in the desktop client (per client instance) to sync links (even without sanitizing the link's target), this could be acceptable as a power-user feature which will not affect the majority of users. It also would not affect existing configurations when the client is upgraded to the version supporting symlinks.

Regarding the server, and more specifically the web UI, a secure representation is to present the link as a special file of type link, and not to let the server's OS dereference the link and present the result in the web UI (say I upload a link afile -> /etc/passwd; when I open the file in the web UI I should not see that file's content). Regarding the link's representation on the server, it could be a real link, or maybe a virtual file (only metadata) in the DB. (I am curious what the test with the PUT request that @taminob made did.)

As a general note, symbolic links on *X systems (and junctions (aka mklink) on Windows NTFS filesystems, but not .lnk files) are indeed special files from the OS side. This means they are not regular files (like a PDF with HTML links inside, or Windows .lnk files), and that there are special system calls and C standard library functions to deal with them. In fact, the default for a system call such as open("/path/to/symlink") on a Linux system is to open the target file (dereference the link), which means that we need another call (e.g. readlink) to read the link's target.
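
The open-versus-readlink difference can be seen with a few lines of Python, whose `open()` and `os.readlink()` wrap the corresponding system calls. This is a throwaway demonstration in a temporary directory; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target.txt")
    link = os.path.join(d, "link")
    with open(target, "w") as fh:
        fh.write("hello")
    os.symlink("target.txt", link)

    with open(link) as fh:            # open() follows the link
        via_open = fh.read()          # so this is the content of target.txt
    via_readlink = os.readlink(link)  # the link's own stored target text
```

Here `via_open` holds the target file's content while `via_readlink` holds the string stored in the link, which is the piece of data a "sync as reference" implementation would have to transfer.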

@taminob

taminob commented Nov 7, 2023

Good news, I am preparing a development environment to test your changes @taminob.

Thanks, although currently there isn't too much to test yet. :)
But will definitely be helpful if there is a first "working" version.

Regarding the link's representation in the server, it could be a real link, or maybe a virtual file (only metadata) in the db.

So far, I only considered "normal" files and actual symlinks - using virtual files with additional meta data could be a pretty good idea to solve a couple of issues. Thanks for the idea. :)

In my opinion, the MVP should already be secure, although usability doesn't have to be perfect.
Thus, either preventing symlinks leading to the outside of the user's data home or ignoring them in the Web UI could be a valid approach for the MVP.
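
A rough sketch of what the first option (rejecting links that lead outside the user's data home) could look like. This is Python for illustration only, not code from either PR. The function name link_stays_inside and the example paths are hypothetical, and treating every absolute target as "outside" is an assumption of this sketch. The check is purely lexical: it never touches the filesystem, so it does not dereference anything, but it also cannot account for other symlinks along the path.

```python
import os


def link_stays_inside(root, link_path, target):
    """Lexically check that a symlink target stays below `root`.

    root      -- the user's data home (absolute path)
    link_path -- location of the link, relative to root
    target    -- raw link target as returned by readlink()
    """
    if os.path.isabs(target):
        return False  # assumption: absolute targets are always rejected
    root = os.path.normpath(root)
    # Resolve ".." components relative to the directory containing the link.
    resolved = os.path.normpath(
        os.path.join(root, os.path.dirname(link_path), target)
    )
    return os.path.commonpath([root, resolved]) == root


if __name__ == "__main__":
    home = "/data/alice/files"
    print(link_stays_inside(home, "docs/a", "../b.txt"))
    print(link_stays_inside(home, "docs/a", "../../../etc/passwd"))
```

The second option (ignoring symlinks in the Web UI) needs no such check, since nothing on the server would ever interpret the target.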

@f1d094

f1d094 commented Nov 7, 2023

Good news, I am preparing a development environment to test your changes, @taminob.

Hooray!

Yes this issue #250 is for the client to sync symbolic links as a reference. (And I agree that the client dereferencing should not be implemented (see comments in the relevant issue)).

Hooray!

Regarding the sanitization of link's target. There might be usability issues (and we should carefully consider security related ones, remember that nextcloud supports sharing, federation, external storages and is in general a complex system). Yes for the MVP we could skip this. But we should vote for a secure and user friendly final implementation after all. The issues:

* links in folders that do not exist in different clients
* absolute links that do not make sense in different oses or do not exist in different clients

It seems to me that the simplest way to handle these would be to sync all to server and sync none to clients that do not support them. So, Windows .lnk files only go to/from windows client, Unix symlinks to/from posix clients (OSX, Linux, etc) and so on. This avoids complexity. It goes under the category of "not your problem". It isn't the responsibility of the Nextcloud team to solve longstanding multi-platform integration issues that have plagued the world since Windows first connected to a unix system. ;)

* We don't break all possible nextcloud deployments

Agreed 100%. Stability should always be job #1. That said, the current configuration does interfere with the standard operation of *nix environments by dereferencing symlinks...I am very surprised that more users aren't up in arms about this.

* Any security issues (either in client or in server side)

I would argue that client-side, this isn't really the responsibility of the Nextcloud team to address beyond the function of the Nextcloud application. If I want to shoot my own foot off, please let me...but as someone who does systems penetration testing professionally, I honestly can't think of any malicious use case on the client side as long as the nextcloud client doesn't dereference/follow the symlinks.

Server-side, I would recommend a minor change which I've made: Instead of recommending "+FollowSymLinks" for apache2.conf, change to "+SymLinksIfOwnerMatch". Correctly deployed environments should have a discrete user assigned for the webserver process and this small change should prevent any links being able to reach outside the webroot. This doesn't prevent malicious users from pointing to nextcloud's operational files and other edge cases however, which may be a consideration.

As I see it, if we add an opt-in (default off) switch to the desktop client (per client instance) to sync links (even without sanitizing the link's target), this could be acceptable as a power-user feature that will not affect the majority of users. It would also not affect existing configurations when the client is upgraded to a version supporting symlinks.

I would be so happy I would do cartwheels. A great way to roll out the MVP.

Regarding the server, and more specifically the web UI, a secure representation is to present the link as a special file of type link (and not let the server's OS dereference the link and present the result to the web UI). Say I upload a link afile -> /etc/passwd: when I open the file in the web UI, I should not see the file's content. Regarding the link's representation on the server, it could be a real link, or maybe a virtual file (only metadata) in the db. (I am curious what the test with the PUT request that @taminob made showed.)

See my note above. SymLinksIfOwnerMatch in the web server config should handle this nicely. The links can then safely point to whatever users want. Maybe I want a link to /etc/passwd as part of some dev project? That can be legit...I would be more wary of /etc/shadow of course...but again: should not be Nextcloud's problem to solve.

As a general note, symbolic links on *nix systems (and junctions, created with mklink, on Windows NTFS filesystems, as opposed to .lnk files) are indeed special files from the OS side. They are not regular files (like a PDF with HTML links inside, or Windows .lnk files), so there are special system calls and C standard library functions to deal with them. In fact, the default for the system call open("/path/to/symlink") on a Linux system is to open the target file (dereference the link), which means that we need another call (e.g. readlink) to read the link's target.

Indeed, the system call is different. When opening the file with open(), you can pass the O_NOFOLLOW flag, which tells open() not to follow a symlink in the final path component. Note that on Linux a plain open() with O_NOFOLLOW then fails with ELOOP instead of returning a descriptor for the link, so you still need a separate way to tell whether the path is a symlink. To do this, call lstat() on the path and check the st_mode field of the struct stat for S_IFLNK (e.g. with the S_ISLNK() macro).
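
A small sketch of that check, written in Python for brevity (os.lstat, os.open and stat.S_ISLNK wrap the corresponding C calls and macro). The function names are made up for this example. It assumes Linux, where open() with O_NOFOLLOW on a symlink fails with ELOOP; other systems may report a different errno.

```python
import errno
import os
import stat
import tempfile


def is_symlink(path):
    """lstat() inspects the link itself instead of its target."""
    return stat.S_ISLNK(os.lstat(path).st_mode)


def open_without_following(path):
    """Return a file descriptor, or None if the last component is a symlink."""
    try:
        return os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as exc:
        if exc.errno == errno.ELOOP:  # Linux behaviour for O_NOFOLLOW on a link
            return None
        raise


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "real.txt")
        link = os.path.join(tmp, "link")
        with open(real, "w") as fh:
            fh.write("payload")
        os.symlink(real, link)
        os.close(open_without_following(real))  # regular file opens normally
        return is_symlink(real), is_symlink(link), open_without_following(link)


if __name__ == "__main__":
    print(demo())
```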

Many thanks to @taminob and @basos9 for working to finally resolve this languishing issue!

@f1d094

f1d094 commented Nov 7, 2023

Good news, I am preparing a development environment to test your changes, @taminob.

Thanks, although currently there isn't too much to test yet. :) But will definitely be helpful if there is a first "working" version.

Regarding the link's representation on the server, it could be a real link, or maybe a virtual file (only metadata) in the db.

So far, I only considered "normal" files and actual symlinks - using virtual files with additional meta data could be a pretty good idea to solve a couple of issues. Thanks for the idea. :)

In my opinion, the MVP should already be secure, although usability doesn't have to be perfect. Thus, either preventing symlinks leading to the outside of the user's data home or ignoring them in the Web UI could be a valid approach for the MVP.

Please don't try and parse symlinks and make any logical determinations from them...a symlink should be allowed to point to anything and be sync'd intact. Protecting resources should be left to the webserver config and the nextcloud system application function. On any given client, there are a million ways to wind up with symlinks pointing to any number of things that 'USER' should not or does not have access to. It is not the responsibility of any software to police that...that is left to ACLs, permissions, etc.

Nextcloud, as an application, should have built-in functionality to prevent access to any of its internal components and should not be relying on any individual sub-module, plugin, app, etc. to be policing this...as I mentioned in the other post, I would recommend that the default Nextcloud install use "+SymLinksIfOwnerMatch" instead of "+FollowSymLinks", for example...this puts enforcement where it belongs.

@taminob

taminob commented Nov 7, 2023

See my note above. SymLinksIfOwnerMatch in the web server config should handle this nicely. The links can then safely point to whatever users want. Maybe I want a link to /etc/passwd as part of some dev project? That can be legit...I would be more wary of /etc/shadow of course...but again: should not be Nextcloud's problem to solve.

While I agree that it will be very helpful to be able to have symlinks to outside of the data directory, I don't think it'll be that easy. For the MVP, we can hide behind the 'localstorage.allowsymlinks' => false option, which already states that it is a security risk to enable it. However, for a production multi-user system that will be an issue, because you can peek into other users' home directories and e.g. into config/config.php, which contains database passwords etc. (all owned by the same OS user).
I don't really think dereferencing makes sense on the server, but I have to find every place where this might happen or it introduces a security bug. That's the reason why the proposed solution via virtual files sounds kind of neat.

Indeed, the system call is different. When opening the file with open(), you can pass the O_NOFOLLOW flag, which tells open() not to follow a symlink in the final path component. Note that on Linux a plain open() with O_NOFOLLOW then fails with ELOOP instead of returning a descriptor for the link, so you still need a separate way to tell whether the path is a symlink. To do this, call lstat() on the path and check the st_mode field of the struct stat for S_IFLNK (e.g. with the S_ISLNK() macro).

I don't really have any trouble with the desktop client - more than 80% of the time spent on this issue so far has been on the server side, because these are basically the first lines of PHP I've ever been in touch with. :)

Please don't try and parse symlinks and make any logical determinations from them...a symlink should be allowed to point to anything and be sync'd intact. Protecting resources should be left to the webserver config and the nextcloud system application function. On any given client, there are a million ways to wind up with symlinks pointing to any number of things that 'USER' should not or does not have access to. It is not the responsibility of any software to police that...that is left to ACLs, permissions, etc.

Nextcloud, as an application, should have built-in functionality to prevent access to any of its internal components and should not be relying on any individual sub-module, plugin, app, etc. to be policing this...as I mentioned in the other post, I would recommend that the default Nextcloud install use "+SymLinksIfOwnerMatch" instead of "+FollowSymLinks", for example...this puts enforcement where it belongs.

No worries, I'm not trying to limit anything, as long as it doesn't hurt security (although for the initial implementation we might be able to hide behind the option mentioned above). If we just never dereference them on the server, there will also be no issue.
I already synced many symlinks with newlines, quotes and other weird stuff in them in my testing. :)

@basos9

basos9 commented Nov 9, 2023

Hello,

I would argue that client-side, this isn't really the responsibility of the Nextcloud team to address beyond the function of the Nextcloud application. If I want to shoot my own foot off, please let me...but as someone who does systems penetration testing professionally, I honestly can't think of any malicious use case on the client side as long as the nextcloud client doesn't dereference/follow the symlinks.

This seems logical, again with opt-in (default off) behavior, since the security issues might come from bad user interaction or bad software interaction. (I think somebody had said something relevant.) But here is an imaginary one.
I have a Nextcloud-synced folder that holds music files, and a web application for a media server service that serves that same directory. I also accept user shares, and somebody shares with me a link to /etc/passwd (or shadow, or whatever). The shared link then appears in my sync root. If the media server software has a bug that lets anybody visit www.domain.com/player/getfile/passwd, it will serve that file. And so on.

So, again thinking out loud: even if in the end we do not restrict symlink targets to the sync root, it might make sense to disable sharing of symlinks, or to introduce an option to enable it. Anyway.

That's the reason why the proposed solution via virtual files sounds kind of neat.

Yes, there is actually no need for symbolic links to be stored as such on the server side, and doing so could indeed introduce bugs. Remember that each file served has a db record for its metadata. The requirement is to sync symlinks between clients. On the server we do not want this per se.

If we just never dereference them on the server, there will also be no issue.
I already synced many symlinks with newlines, quotes and other weird stuff in them in my testing. :)

Yes, If !

But no! localstorage.allowsymlinks should not be involved here. As it seems, this is an advanced admin setting for the server. It enables a server admin to make parts of the data dir point to other places. BUT the server code should not be able to create (real) symlinks; only the server admin should (e.g. via the command line, ln -s). This leads to storing client symlinks as virtual files (or special files with appropriate metadata).

I managed to test the above scenario with the master branch with the docker development env with modifications

'localstorage.allowsymlinks' => true

Then I created two links inside admin's shared folder

$ ls -l workspace/server-data/admin/files/
total 8620
-rw-r--r-- 1 user user 8822513 Νοε   9 13:02  Nextcloud_Server_Administration_Manual.pdf
lrwxrwxrwx 1 user user      11 Νοε   9 13:08  pas -> /etc/passwd
lrwxrwxrwx 1 user user      33 Νοε   9 13:08  testlink -> /nonexistent

As you can see, pas links to /etc/passwd ON the server and testlink to a nonexistent file.

Then I ran

occ files:scan --all

and then, in the web UI, the file pas appeared,

and on the local file system of the running client:

cat  workspace/desktop-test/data/pas 
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin

Not good.

@taminob

taminob commented Nov 9, 2023

But no! localstorage.allowsymlinks should not be involved here. As it seems, this is an advanced admin setting for the server. It enables a server admin to make parts of the data dir point to other places. BUT the server code should not be able to create (real) symlinks; only the server admin should (e.g. via the command line, ln -s). This leads to storing client symlinks as virtual files (or special files with appropriate metadata).

Fair point, could make sense to add another setting if necessary and not re-use that one.

During development, I was also able to get the content of system files via symlinks in Nextcloud. However, that issue would already be resolved as soon as the server no longer actually follows symlinks (although e.g. 3rd party apps could still dereference them and thus expose the file).
Currently, I am continuing with actual symlinks on the server, getting all the necessary steps to work and finding the correct files that need to be modified (upload, download, server query).

Once that (at least kind of) works, I'll look into improving it by using metadata-only files ("virtual files" means something different in Nextcloud context I think: https://nextcloud.com/blog/nextcloud-desktop-client-3-2-with-status-feature-and-virtual-files-available-now/).

@basos9

basos9 commented Nov 13, 2023

Hello, I am commenting on the specs for the final solution; please don't stop the good work you are in the middle of. I've checked your commits and it seems that you've taken it seriously!

Fair point, could make sense to add another setting if necessary and not re-use that one.

I think this is not a good approach. Since there is functionality that relies upon the server's data folder having symlinks (via localstorage.allowsymlinks), we should not allow server code to create symlinks.

And in general, there is not a functional requirement to actually store a symlink inside the server's data dir.

Once that (at least kind of) works, I'll look into improving it by using metadata-only files ("virtual files" means something different in Nextcloud context I think: https://nextcloud.com/blog/nextcloud-desktop-client-3-2-with-status-feature-and-virtual-files-available-now/).

Yes, let's not use this term then. What I mean is to create meta files that would be interpreted as symlinks. The best would be for these files not to exist at all in the server's data dir, but rather to be stored somehow in the database. Or, if this breaks the logic where every db entry (I think in oc_filecache) should have a real file (e.g. for occ files:scan to work), we could allow empty files to be created in the server's sync root for every client symlink.

@taminob

taminob commented Nov 23, 2023

Just wanted to give you all a short update because I made some progress after experimenting with different server representations.
In the end, I settled on a regular file in the filesystem containing the symlink target. Additionally, to determine if a file is a symlink, I added a new table to the database (oc_symlinks).
Using the existing oc_filecache to store the type (e.g. as the mimetype) has a couple of disadvantages, for example that occ files:scan --all will overwrite any manually set mimetype. Also, I would have a pretty bad feeling about storing essential data in something called "cache".
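
For readers who want a picture of that representation, here is a toy model of it in Python. It is not the code from the server PR (which is PHP). The table name symlinks, its single path column, the in-memory SQLite database and the function names are all placeholders, since the real oc_symlinks schema is not shown in this thread.

```python
import os
import sqlite3
import tempfile


def store_symlink(db, data_dir, rel_path, target):
    """Store a client symlink as a regular file plus a marker row."""
    with open(os.path.join(data_dir, rel_path), "wb") as fh:
        fh.write(os.fsencode(target))  # file content is the raw target path
    db.execute("INSERT OR REPLACE INTO symlinks(path) VALUES (?)", (rel_path,))


def read_entry(db, data_dir, rel_path):
    """Return ("symlink", target) or ("file", content); never dereferences."""
    with open(os.path.join(data_dir, rel_path), "rb") as fh:
        content = os.fsdecode(fh.read())
    row = db.execute(
        "SELECT 1 FROM symlinks WHERE path = ?", (rel_path,)
    ).fetchone()
    return ("symlink" if row else "file", content)


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE symlinks (path TEXT PRIMARY KEY)")
        store_symlink(db, data_dir, "pas", "/etc/passwd")
        # A regular file with the same bytes is still reported as a file.
        with open(os.path.join(data_dir, "notes.txt"), "wb") as fh:
            fh.write(b"/etc/passwd")
        return read_entry(db, data_dir, "pas"), read_entry(db, data_dir, "notes.txt")


if __name__ == "__main__":
    print(demo())
```

The point of the separate table in this model is that the file's bytes alone cannot say whether it is a link; only the marker row does, and the target path is only ever handled as data.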

Using this, I successfully synchronized basic symlinks to and from the server (PROPFIND, PUT, POST, GET and DELETE).
However, there are still lots of bugs left which I'll have to fix one by one, such as the symlinks getting synchronized every time or not getting deleted from the database.
Writing tests, cleaning up the code and PRs and maybe indicating symlinks in the Web UI are also still on my TODO list.

@taminob

taminob commented Dec 7, 2023

Next update - I resolved the issues mentioned before (and lots of others as well). Using the modified server and client, uploading/downloading/deleting/renaming of symlinks seems to work.

I mentioned the currently known limitations in the PR for the server (they all only affect the server side), namely:

  • symlinks restored from the trash bin will be re-created as regular files with the target as their content
  • symlinks are not indicated in the Web UI
  • symlinks are copied as regular files with the symlink target path as their content (if copied via Web UI)

Additionally, I have tested nothing on Windows so far (it might not even compile). Most of the implementation would actually also work for .lnk files (so "shortcuts"), since Qt handles shortcuts as Windows' symlinks. Whether that or native NTFS symlinks should be used might require further discussion, since it will be hard to change once it is released with either option.

@basos9 if you're still interested in trying out the changes, I'd appreciate any feedback - I am sure that I missed a lot of bugs since my testing did only cover the most basic cases so far.

@f1d094

f1d094 commented Dec 7, 2023

@taminob: First - thank you for your hard work!

Second: "symlink target as their content" sounds a lot like dereferencing the symlinks. I'm gathering that you mean a file that is special on the server side, has the client-side link name as its filename, and whose server-side contents are just the text of the path the symlink points to, or something similar? You may want to clarify in your comments; "symlink target" has the specific connotation of being the actual file that the symlink points to.

Also a quick thought: When saving the symlinks, be sure to use relative vs full-path symlinks as they were created on the source system.

Forgive me if this was clarified previously. I've not had the bandwidth to follow the developments and conversation over the past few weeks.

@taminob

taminob commented Dec 7, 2023

Second: "symlink target as their content" sounds a lot like dereferencing the symlinks.

@f1d094 thank you for pointing out the ambiguity. "symlink target path" is maybe more accurate, the symlinks are not dereferenced - so e.g. broken symlinks can be synchronized as well.
Also, the "raw" symlink is uploaded, so if it is relative, it remains relative and if it's absolute, it remains absolute. The value returned by "readlink" is used.
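
A quick illustration of what "the value returned by readlink" means here, sketched in Python rather than the client's C++. The function names are made up for this example, and it assumes a POSIX system. Both a relative target and a broken absolute one come back exactly as they were written, because nothing is resolved.

```python
import os
import tempfile


def raw_targets(directory):
    """Map each symlink name in `directory` to its unresolved target."""
    result = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):          # true for broken links as well
            result[name] = os.readlink(path)
    return result


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        os.symlink("../sibling/file", os.path.join(tmp, "relative"))
        os.symlink("/nonexistent", os.path.join(tmp, "broken"))
        return raw_targets(tmp)


if __name__ == "__main__":
    print(demo())
```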

@AkechiShiro

AkechiShiro commented Apr 17, 2024

What's currently blocking this feature from rolling out? There are two PRs right now: #6205 (client) and nextcloud/server#41321 (server side).

@AkechiShiro

Given this has been discussed since 2018, it is a bit sad that such a feature is still missing.

@AkechiShiro

I feel like the Nextcloud client lacks a lot of polished features; it does not feel like there is a company behind it.

Testing of features such as file syncing seems to not even be implemented or properly handled, as files do not sync with the latest client from upstream.
