Notes on my experience and thoughts regarding choosing a sync service for maintaining dotfiles, scripts and documents across Linux installations

bakkeby/choosing-a-sync-service-under-linux

Choosing a sync service for dotfiles, scripts and more under Linux

by Stein Bakkeby, July 2019

I happen to be one of those guys who have more than one computer, let alone more than one operating system. Having certain configuration, scripts, documents and files synchronised across machines has become an integral part of my workflow over the years.

Take web browsing for example: with both Firefox and Google Chrome it is possible to keep bookmarks and extensions synchronised such that the browsing experience is the same regardless of which machine you are using. I use Linux almost exclusively and my preference is to have a similar consistency across Linux installations; the same .bashrc, aliases, scripts, tools, configuration, notes, etc. on every box. I want to be able to work on a project on my laptop and then switch over to resume working on my stationary desktop. If I add a new script or make a configuration change then I want that to be available on other installations as well.

I first started this back when Dropbox was new; I discovered the simple idea of having a ~/Dropbox/Desktop folder, then replacing ~/Desktop with a symlink to the one in the Dropbox directory. This allowed me to have the same desktop across installations; messy, but consistent. Needless to say things escalated after that.
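As a rough illustration of that trick, here is a minimal Python sketch. The helper name `move_into_sync_dir` is made up for this example, and it assumes the sync client simply mirrors whatever lives under the sync root; it is not something Dropbox itself provides.

```python
import shutil
from pathlib import Path


def move_into_sync_dir(folder: Path, sync_root: Path) -> Path:
    """Move `folder` under `sync_root` and leave a symlink in its place."""
    folder = Path(folder)
    target = Path(sync_root) / folder.name
    if folder.is_symlink():
        # Already replaced by a link on this machine; nothing to do.
        return target
    if folder.exists():
        if target.exists():
            # Both a local and a synced copy exist; do not guess how to merge.
            raise FileExistsError(f"{target} already exists; merge by hand")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(folder), str(target))
    else:
        target.mkdir(parents=True, exist_ok=True)
    # The link lives outside the synced directory, so the sync client
    # never has to deal with the symlink itself.
    folder.symlink_to(target, target_is_directory=True)
    return target
```

On a real machine this would be called along the lines of `move_into_sync_dir(Path.home() / "Desktop", Path.home() / "Dropbox")`. Note that the symlink points into the sync directory rather than living inside it, which is why this setup works even with services that handle symlinks badly.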

After I had used Dropbox like this for many years, a three-device limit was introduced on free accounts in March 2019, which meant that it was no longer going to be viable in the long run for me. Of course I could have just subscribed to Dropbox Plus for an annual fee of €120, but I don't really need 2 TB of cloud space and it seemed like a steep price to pay just to enable more devices.

So I started what would prove to become a long and arduous quest to find a suitable Dropbox replacement. It turned out to be a lot harder than I ever thought it would be.

In hindsight I concluded that it might be worth documenting my experience and thoughts on the various sync services I looked at and my reason for ultimately rejecting them, hence this writeup. This is not meant to be a thorough review of each of the various services, but more of a what worked and what did not work for me. I hope it might be useful to someone.

Definition of sync

There are three principles of sync solutions:

↔️ A two-way (bi-directional) sync; files are kept in sync and identical across multiple machines

➡️ A one-way (upload) sync; this is essentially a backup solution

⬅️ A one-way (download) sync; the machine is a passive consumer of sync updates

For all intents and purposes I need a two-way sync to keep files in sync across multiple installations, ideally identical down to file permissions and timestamps.
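To make the difference between the three principles concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the `Mode` names, the `plan_sync` function, and the idea of describing each side as a mapping from path to modification time. It also treats a missing file as "not copied yet" and ignores deletions, which a real sync service cannot do.

```python
from enum import Enum


class Mode(Enum):
    TWO_WAY = "two-way"
    UPLOAD = "upload"
    DOWNLOAD = "download"


def plan_sync(local: dict, remote: dict, mode: Mode) -> list:
    """Return (action, path) pairs; `local` and `remote` map path -> mtime."""
    actions = []
    for path in sorted(set(local) | set(remote)):
        l, r = local.get(path), remote.get(path)
        if l == r:
            continue  # both sides already agree
        # The local copy wins if the remote has no copy or an older one.
        local_newer = r is None or (l is not None and l > r)
        if local_newer and mode in (Mode.TWO_WAY, Mode.UPLOAD):
            actions.append(("upload", path))
        elif not local_newer and mode in (Mode.TWO_WAY, Mode.DOWNLOAD):
            actions.append(("download", path))
    return actions
```

With `Mode.UPLOAD` the remote side never influences the local machine, which is why it behaves like a backup; only `Mode.TWO_WAY` produces actions in both directions.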

Requirements

Features and special requirements I have when considering a sync service. Mind you these were not all defined up-front; most of them only became apparent when experiencing flaws and shortcomings of various sync services.

  • Linux support
  • Two-way sync
  • Continuous sync (files are kept in sync automatically and continuously, rather than manually or at set intervals)
  • Support for synchronisation of symbolic links (symlinks) in a non-dereferencing manner (i.e. don't follow symlinks)
  • Offline access to sync files
  • Retention of file permissions (files should be identical across systems, including file permissions)
  • Selective sync (depending on the machine I may not need everything synced)
  • Performance (a sync operation should preferably not take ages to complete)
  • Initial setup time (when I set up a new VM or a new installation, how long does it take to get it up-to-speed with the sync service?)
  • Conflict resolution (how does the system handle conflicts? I prefer to keep most recently changed files)
  • Security (I don't consider anything on the internet to be truly secure, but the service should have at least some security measures in place)
  • Reliability (Consistent speed, predictable availability and the assurance that files are persisted correctly)
  • No device-count limit
  • Possibility to exclude certain file types (e.g. temporary files)
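Two of these requirements (non-dereferenced symlinks and retained permissions) are easy to check for yourself after a test sync. The sketch below is one possible way to do it in Python; the function names `snapshot` and `differences` are invented for this example, and timestamps are left out to keep it short.

```python
import os
import stat
from pathlib import Path


def snapshot(root) -> dict:
    """Map each relative path under `root` to (kind, permission bits, link target)."""
    root = Path(root)
    result = {}
    # followlinks=False: symlinked directories are listed but not descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            info = path.lstat()  # lstat describes the link itself, not its target
            if stat.S_ISLNK(info.st_mode):
                entry = ("symlink", None, os.readlink(path))
            elif stat.S_ISDIR(info.st_mode):
                entry = ("dir", stat.S_IMODE(info.st_mode), None)
            else:
                entry = ("file", stat.S_IMODE(info.st_mode), None)
            result[path.relative_to(root).as_posix()] = entry
    return result


def differences(a: dict, b: dict) -> list:
    """Paths whose kind, permissions or link target differ between two snapshots."""
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

Taking a snapshot of the sync directory on two machines and comparing them would flag a script that lost its execute bit, or a symlink that arrived as a real directory on the other side.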

Nice to have:

  • Privacy (most cloud services require read access to files in order to provide additional services, as such others may have access to your files)
  • Encryption (some services offer file encryption for security reasons, not a must for me and I can always roll my own encryption if need be)
  • Versioning, i.e. the option to be able to view and revert to earlier versions of files
  • Free and Open Source Software (FOSS)
  • Possibility to sync more than one directory
  • LAN sync

There is often a myriad of additional services offered by cloud solutions. Features such as file sharing, collaboration, etc. are not something that I have taken into account as they are irrelevant to what I want to achieve here.

Sync services

Below is the list of sync services looked at and considered. Many of these I didn't even get around to trying out due to obvious issues conflicting with my requirements, for example no Linux support at the time of review. The order here is more or less the order in which I reviewed options.

Microsoft OneDrive (formerly SkyDrive)

❌ No Linux support

❌ No Linux support

ℹ️ There is an unofficial Linux client available called ODrive. While this might give remote access to files stored in Google Drive it does not provide the functionality to sync files locally. This means online access only.

❌ No Linux support

❌ No Linux support

✔️ Offline access

➕ Claimed security

➕ Claimed privacy through encryption

✔️ Linux support

✔️ Offline access

✔️ Selective sync

⚠️ Limited to five devices on the free version

✔️ Fairly reasonably priced

➕ Claimed security

❌ No symlink support (just ignores them)

Zoho Docs comes across as fairly basic with very little configuration options.

⚠️ Max file size of 250M

❌ No Linux support

❌ No Linux support

❌ No Linux support

❌ No Linux support

❌ No Linux support

❌ Turned out to be just a cloud storage with a web front end

❌ No sync options

ℹ️ They used to have a (windows presumably) desktop client which was discontinued back in July 2016

❌ Although it has some file storage options, it turned out to be largely focused on project management for businesses

❌ No single user option

✔️ Linux support

✔️ On premise solution (i.e. host your own)

✔️ Free and Open Source Software (FOSS)

❌ git based

SparkleShare is a git based solution where synced files are preserved in a git repository (for versioning and consistency purposes) and changes are pushed to other sync endpoints. Admittedly this works rather well for dotfiles and scripts, but the main reason why this does not work for me is that being git based it does not support syncing other git based repositories. I have certain projects that I want to keep in sync.
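The underlying limitation is that git does not track the contents of a repository nested inside another repository as ordinary files. A quick way to see how much of a directory would be affected is to look for nested `.git` entries, as in this Python sketch (the function name `nested_git_repos` is made up here, and this is not part of SparkleShare):

```python
import os
from pathlib import Path


def nested_git_repos(sync_root) -> list:
    """Return directories below `sync_root` that are themselves git repositories."""
    sync_root = Path(sync_root)
    found = []
    for dirpath, dirnames, filenames in os.walk(sync_root):
        here = Path(dirpath)
        # A .git entry can be a directory or, for worktrees and submodules, a file.
        if here != sync_root and (".git" in dirnames or ".git" in filenames):
            found.append(here.relative_to(sync_root).as_posix())
            dirnames[:] = []  # do not descend into the nested repository
        elif ".git" in dirnames:
            dirnames.remove(".git")  # skip the sync root's own metadata
    return sorted(found)
```

Any path this returns is a project that a git based sync solution would not carry across as regular files.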

❌ The service was shut down years ago :)

➕ This is just to say that I miss you

➕ Potentially it would have been a good candidate

❌ No symlink support

ℹ️ The underlying software was released as open source

❌ No Linux support

✔️ Linux support

❌ Dereferences symlinks

✔️ Linux support

✔️ Support for symlinks in a non-dereferencing manner

✔️ Two-way sync

➖ Some manual installation / setup / configuration required

✔️ Possible to perform actions via the command line

✔️ Free and Open Source Software (FOSS)

✔️ LAN sync

ℹ️ No cloud solution (peer-to-peer only)

ℹ️ You have to set up both sides to connect two devices

Syncthing actually looks rather interesting. It is a peer-to-peer only two-way sync solution meaning no cloud is involved. Instead files are synced when computers are up. I don't always have more than one machine on so if I would take something like this then I would likely end up installing this on a NAS as well to have that "cloud" functionality.

I did not actually get around to trying Syncthing out as I had some initial installation issues and I had concerns that this might potentially be time-consuming to set up for new installations (e.g. let's say I set up a quick virtual machine). The fact that I need both or multiple computers running for sync to work was also a slight turnoff, but can be addressed by installing it on a NAS. Syncthing is clearly a power user kind of option and my reason for not going with this is the assumed setup complexity when adding new (and maybe short-lived) devices / sync endpoints.

✔️ Linux support

✔️ Free and Open Source Software (FOSS)

➖ The project has more or less been abandoned by the original developers

❌ No sync options

❌ It is a cloud service focused on file sharing

❌ Public signups were disabled so I was unable to get an account

✔️ Linux support

➖ Only 2GB of free cloud storage for private use

➕ Option to run a TeamDrive Personal Server for free, but limited to 10GB of data

This is clearly something that is more suited for business use to share files between teams. As a private user it does not make much sense buying into this, and the limited storage options for a free private account made it not worth it for me to look further into this solution.

✔️ Linux support

✔️ Generous amount of storage space for a free account

✔️ Possibility to sync more than one directory

✔️ Selective sync

✔️ Possibility to exclude certain file types

➖ There is a transfer quota (per day presumably), not really an issue for my purposes

❌ Dereferences symlinks

❌ De-duplication process deletes files, which can be a disaster when combined with dereferencing symlinks

✔️ Linux support

✔️ Two-way sync

✔️ Selective sync

✔️ Continuous sync (or optionally at specific intervals)

✔️ Possibility to sync more than one directory

✔️ Possible to perform actions via the command line (as long as the application is not already running in the background)

✔️ No limit on the number of devices (though they recommend staying below ten)

✔️ Excellent security (and they accurately state that logging in via mobile or your browser poses a security risk)

✔️ Privacy through zero knowledge encryption

✔️ Versioning

✔️ LAN sync

✔️ Possibility to exclude certain file types

➖ Initial setup time

❌ No symlink support (just ignores them)

❌ Performance

SpiderOak ticks a lot of boxes for me in terms of what their services offer and I have used SpiderOak for years, but for backup purposes only. I figured since I already had a subscription I would try using their sync service as well. Their sync directory is called "SpiderOak Hive" and the name is hardcoded as such (why did it have to contain a space?), but on the other hand you can disable that and set up multiple other directories to sync instead (e.g. ~/bin, ~/Desktop, etc.). One complication with this is that setting up a new sync is kind of backwards; first you need to add the directory that you want to sync as a backup on each and every box that you are going to sync to. Only then can you create a new sync and select what backup folder you want synced to what backup folder on other devices. An additional annoyance is that although you can do a lot from the command line, you can't as far as I could tell create or update these sync settings that way, you have to do it via the GUI front end.

What turned out to be a major blocker for me was performance. Granted it will inevitably take more time to do things properly (secure, encrypted, zero knowledge) and I don't need my syncs to be blazingly fast. The problem was that I had used SpiderOak as a backup service for years and had ended up using something like 5TB of storage space. This means that if I wanted to set up SpiderOak on a new installation I had to go through a two-hour syndication process. In terms of sync it could take 5-10 minutes for a new or changed file to be updated on other sync endpoints. The syndication process also resulted in it being very time consuming to delete / purge data in a desperate attempt to try to free up disk space and improve performance.

In the end my impression is that SpiderOak was designed with backup in mind and that the sync service was more of an afterthought. The performance issues that I encountered are exceptional and should not be a deterrent for trying this service out. I have a lot of respect for what they have accomplished in terms of security and privacy. My only recommendation in hindsight is that it would be better to use SpiderOak either as a dedicated backup or as a dedicated sync service. In my opinion they could very well have made two different applications / services out of it rather than having two in one.

While there is no symlink support at the very least the service just ignores them, which is something that I could live with.

✔️ Linux support

✔️ Selective sync

✔️ Offers three modes: two way sync (bidirectional), upload only (backup) and download only

➖ Free account limited to 3GB of storage space and a maximum file size of 150M

✔️ Possibility to sync more than one directory

➖ Very few configuration options

❌ No symlink support (just ignores them)

❌ Sync option requires a business plan

⚠️ No free offering

⚠️ No individual offering (team / business only)

✔️ Linux support

❌ No symlink support (just ignores them)

➖ No free offering, but there is a 14 day free trial period

Rolling my own

I must admit that with so many lacklustre cloud storage services out there I started considering the option of rolling my own, possibly using git. I figured that I would have to use a git server, then have clients automatically pulling, committing and pushing at regular intervals. In the end I gave up on these ideas as I wouldn't be able to use something like this for syncing directories containing git repositories, so I might as well just have used SparkleShare.

❌ No Linux support

Resilio (formerly BitTorrent Sync)

✔️ Linux support

✔️ Selective sync

✔️ Support for symlinks in a non-dereferencing manner

ℹ️ Torrent based peer-to-peer sync solution

Resilio sounds interesting and has many similarities to Syncthing. Being a peer-to-peer solution means that your computers need to be on for sync to work, unless you also install it on a NAS. While the setup of Resilio is supposed to be less complex than that of Syncthing, my impression was that the latter had clearer and more detailed documentation on how it works. Overall I was not entirely convinced that a peer-to-peer sync solution was the right choice for me.

❌ No sync options

Just including this as it offers some cloud storage and is sometimes referred to as a "Dropbox alternative".

✔️ Linux support

⚠️ On Linux there is no front end gui, only command line options (this can be a huge plus or a huge minus depending on your preferences)

➖ Only syncs one folder

✔️ Support for symlinks in a non-dereferencing manner

Yandex Disk might be OK, but it lacked a bit on the feature side for my needs.

❌ No Linux support

❌ Turned out to be just a cloud storage with a web front end

❌ No sync options

❌ Only 500M of storage for a free account

❌ Expensive for a subscription

✔️ Linux support

✔️ Terminal client as well as front end client

✔️ Selective Sync

✔️ Version control

✔️ Free community edition

❌ Dereferences symlinks

❌ Turned out to be just a cloud storage with a web front end

❌ No sync options

❌ Just a service to send (large) files to others

❌ Turned out to be just a cloud storage with a web front end

❌ No sync options

❌ No (native) Linux support

➖ Free version is just a cloud storage with a web front end

✔️ Zero knowledge encryption

❌ Subscription required for sync options

❌ Limited to three desktop devices unless you subscribe to the Ultimate package

❌ No Linux support

✔️ Linux support

✔️ Selective Sync

❌ Dereferences symlinks

✔️ Linux support

❌ Dereferences symlinks

⚠️ Free account has a 25M file size limit

❌ No Linux support

✔️ Linux support

ℹ️ insync is not actually a service, but a standalone client that can talk to multiple cloud services, most notably Google Drive

❌ Dereferences symlinks (if symlinked directory has not already been synced, i.e. exists in sync / backup)

✔️ Linux support

ℹ️ Like insync, ExpanDrive is not actually a service, but a standalone client that can talk to multiple cloud services such as Strongspace, Dropbox, Google Drive, box, OneDrive, Amazon Drive, Nextcloud as well as several other options

ℹ️ No free option, 7 day trial

ℹ️ No files are stored locally; cloud storage files are accessible through a virtual drive as long as ExpanDrive is running and is connected

ℹ️ File permissions appear to be fixed at 755 (rwxr-xr-x) for all files

ℹ️ Practically no additional settings

❌ No offline access

❌ No sync option

❌ Business focus only

❌ No sync option as far as I can tell

✔️ Linux support

✔️ Two-way sync

✔️ Selective sync

✔️ Continuous sync

✔️ Possibility to sync more than one directory

✔️ No hard limit on the number of devices (though they recommend not linking more than five)

✔️ Versioning

✔️ LAN sync

✔️ Possibility to exclude certain file types

✔️ Initial setup time (just install the client and you are done)

➖ Frequent notifications made the service come across as nagware

➕ a crypto service add-on is also offered, which is limited to files being stored in a "Crypto Folder" subdirectory

❌ No symlink support (just ignores them)

❌ File permissions are not preserved (i.e. execute permissions are lost)

Now pCloud is kind of different in that by design it is an online-only cloud service. Essentially it means that anything you move into the pCloud folder is for all practical purposes deleted from your system. Depending on your needs this can be a very good thing or it may not be. For example this could allow you to keep all of the photos you take in pCloud, saving space on your mobile device. This is something that turned out to be an appeal for me as well. For example I don't need to have my documents synced to each and every installation, but it can be useful to have such straightforward access to them if need be.

Ultimately this means that to be able to see and open your files you will need online access. Should you need offline access to your files then you would have to set up additional syncs between the cloud service and individual local folders. This actually works rather well, but a big issue for me is that due to the nature of how the cloud service works the file permissions are not being retained. The end effect of this is that the executable permissions for files are not preserved, which is a bit of a blocker for me as I rely on quite a few shell scripts. Granted I could manage to work around this, but in the end I concluded that it was not worth the long term effort to do so.
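For completeness, one workaround of the kind alluded to above could look like the Python sketch below. It is not something I ended up using, and it rests on an assumption: that every script worth making executable starts with a `#!` line. The name `restore_exec_bits` is made up for this example.

```python
import os
import stat
from pathlib import Path


def restore_exec_bits(root) -> list:
    """Add execute permission to regular files under `root` that start with '#!'."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.lstat()
            if not stat.S_ISREG(info.st_mode):
                continue  # leave symlinks and special files alone
            with open(path, "rb") as handle:
                if handle.read(2) != b"#!":
                    continue
            mode = stat.S_IMODE(info.st_mode)
            # Mirror each read bit as an execute bit (rw-r--r-- becomes rwxr-xr-x).
            wanted = mode | ((mode & 0o444) >> 2)
            if wanted != mode:
                os.chmod(path, wanted)
                changed.append(str(path.relative_to(root)))
    return sorted(changed)
```

This would have to run on every machine after every sync, and it cannot restore anything other than execute bits, which is a fair picture of the long term effort involved.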

✔️ On premise solution (i.e. host your own) or optionally sign up with a provider

✔️ Linux support

✔️ Two-way sync

✔️ Selective sync

✔️ Continuous sync

✔️ Possibility to sync more than one directory

✔️ No hard limit on the number of devices

✔️ Versioning

➖ No local peer-to-peer LAN sync, however as this is an on premise solution it does not matter much

✔️ Possibility to exclude certain file types

✔️ Preserves file permissions

➖ Conflict resolution

❌ No symlink support (just ignores them)

Nextcloud is pretty awesome. Setting up a self-hosted cloud service can present some challenges, especially for novice users. I happen to own a Synology DS918+ Network Attached Storage (NAS), which to call versatile would be an understatement. To set up Nextcloud on this NAS all I really needed to do was to download two docker images (one for Nextcloud, and one for the MariaDB database) and figure out how to link and make these run with the right settings. On the computers using this host I only needed to install the Nextcloud client.

Such a cloud service would then only work within your local network. Should you need this "on the go" as well then getting it exposed on the web can potentially be challenging.

Nextcloud ticks a lot of boxes for me and has a fair amount of features and I may very well end up using this for something else in the future. What was kind of annoying for me is that it does not support syncing symlinks. What was more annoying is that every symlink that exists comes up as a warning, which can make it difficult to differentiate them from actual warnings. It was also not quite clear how Nextcloud handles conflicts. This may be wrong, but it seemed that if a client has started syncing and is interrupted (let's say by a restart), then the next time it starts syncing any files that differ from the server are moved away and the files from the server are restored.

That Nextcloud managed to come up with conflicted files when I was only running one host and one client was one of the main reasons why I did not stop looking for alternatives.

ownCloud

ℹ️ Given that Nextcloud is built on ownCloud I did not try this one out, I'd expect similar pros and cons though

✔️ On premise solution (i.e. host your own)

✔️ Linux support

➖ No versioning

➖ Deprecated in favour of Pydio Cells; security fixes only, no new features, official end of life scheduled for December 2019

Pydio appears to be more focused on file sharing and if going for something like this then I think Pydio Cells would be a better option.

✔️ On premise solution (i.e. host your own)

✔️ Linux support

✔️ Versioning

➖ PydioSync not supported yet as of time of writing

Pydio Cells is a complete re-implementation / re-design of Pydio into a micro-services architecture and has a stronger focus on collaboration and sharing.

While Pydio seems like a very capable file management platform it is also more of a business solution and is very far from the simple file syncing solution that I am looking for.

✔️ On premise solution (i.e. host your own) through Synology NAS

✔️ Can also work as an online service thanks to Synology QuickConnect

➖ You need to own a Synology NAS in order to be able to install and use this

✔️ Linux support

✔️ Two-way sync, one-way upload and one-way download options

✔️ Options for routine backup tasks

✔️ Selective sync

✔️ Continuous sync

✔️ Possibility to sync more than one directory

✔️ No hard limit on the number of devices

✔️ Versioning

✔️ Conflict resolution (option to keep the last modified version or to keep the version on the server)

➖ No local peer-to-peer LAN sync, however as this is an on premise solution it does not matter much

✔️ Possibility to exclude certain file types

✔️ Preserves file permissions

✔️ Supports synchronisation of symlinks in a non-dereferencing manner

✔️ Easy and intuitive to set up on the NAS side

✔️ Easy and intuitive to set up on client / desktop side

So I happened to receive one of these Synology newsletters informing me that Synology Drive 2.0 had been released. I had never seen it before, nor heard of anyone ever talking about or recommending it. I figured that I might as well just try it out; after all how is one more going to hurt?

I must say that Synology Drive 2.0 came as a huge surprise and ultimately ticked all of the boxes for me. It does everything I want it to do, and nothing that I do not want. It is an excellent example of doing one thing, and doing it really well.

As for other features you have options to share files via Synology QuickConnect links, you can have a Team Folder for collaboration efforts and you can both star and label files and directories if need be.

I think that it is safe to say that Synology Drive concludes my search for a viable Dropbox replacement and that it hits the ball out of the park. It is hands down the best file sync solution that I have come across on this quest.


As an epilogue let's take these experiences into account and revisit the service that I was trying to move away from.

✔️ Linux support

✔️ Two-way sync

✔️ Selective sync (can choose not to download / sync certain subdirectories)

✔️ Continuous sync

➖ Not possible to sync more than one directory

➖ Limited to three devices unless you pay for a subscription

✔️ Versioning

✔️ Conflict resolution (the conflicted file contains the name of the origin host, which is nice)

✔️ LAN sync

➖ Not possible to exclude certain file types

✔️ Preserves file permissions

❌ Dereferences symlinks

Pretty good actually, but not all good. Again we see a sync service that dereferences symlinks.

So what's the deal with that anyway? I believe lay people like it because they only use one computer and they only need to add symlinks to other directories that they want to back up.

This is a clear case of a one-way (upload) sync which translates to a backup solution.

If you have a two-way (bi-directional) sync between computers then things get more complicated. As an example my first encounter with this issue was that within my ~/Dropbox directory I had ended up creating a symlink to another directory, let's say referring to ~/Dropbox/Documents.

When moving from my laptop to my desktop I happened to discover that what was a symlink on the other machine was now an actual folder duplicating every file in the other directory. That's not what I intended. Wanting to clean up this mess and without further ado I deleted the duplicated folder (it held the exact same data after all). What happens next is that Dropbox syncs this action resulting in the symlink being deleted on the other machine, the twist being that it again follows the symlink and deletes the main ~/Dropbox/Documents folder as well. This of course results in another sync action and all documents end up being deleted everywhere. Experimenting with this flaw I found that in scenarios like this Dropbox can also end up deleting files outside of the ~/Dropbox directory. So not quite so safe and isolated as I had been led to believe.
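The first half of that story, where a symlink turns into a full duplicate on the other machine, can be reproduced locally without any sync service. The Python sketch below uses `shutil.copytree` as a stand-in for a sync client; the directory names and the `replicate` and `demo` helpers are made up for illustration, and it does not model how Dropbox actually propagates deletions.

```python
import os
import shutil
import tempfile
from pathlib import Path


def replicate(src, dst, follow_links: bool) -> None:
    """Copy `src` to `dst`, either dereferencing symlinks or keeping them as links."""
    shutil.copytree(src, dst, symlinks=not follow_links)


def demo() -> dict:
    """Replicate a tiny sync directory both ways and report whether the link survived."""
    with tempfile.TemporaryDirectory() as tmp:
        laptop = Path(tmp, "laptop")
        (laptop / "Documents").mkdir(parents=True)
        (laptop / "Documents" / "cv.txt").write_text("important")
        # A relative symlink inside the sync directory pointing at Documents.
        os.symlink("Documents", laptop / "docs")

        replicate(laptop, Path(tmp, "dereferenced"), follow_links=True)
        replicate(laptop, Path(tmp, "preserved"), follow_links=False)
        return {
            "dereferenced": Path(tmp, "dereferenced", "docs").is_symlink(),
            "preserved": Path(tmp, "preserved", "docs").is_symlink(),
        }
```

In the dereferenced copy `docs` is no longer a symlink but a second real directory with its own `cv.txt`, so the receiving machine has no way of knowing that the two were ever the same thing. Everything that went wrong afterwards follows from that loss of information.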

In conclusion following symlinks is something that only makes sense for one-way (upload/backup) solutions and not for two-way (bi-directional) sync solutions. I believe that so many cloud services get this wrong because they try to do both at the same time. There is undoubtedly also pressure to offer features that other more popular services do and I have seen many discussions arguing that following symlinks is justified because "Dropbox does this".

My advice is to be clear about which of the three principles of sync solutions you need (sync, upload or download) and choose your sync service accordingly.


Edit: Turns out that Dropbox no longer dereferences symlinks since mid-2019 [ref]
