v0.14.0

@xenoscopic xenoscopic released this 23 Apr 16:44
· 302 commits to master since this release
4f8d812

Overview

Mutagen v0.14 is focused on observability and performance. Logging has been vastly expanded and made more readable, and it can now provide more granular information on individual filesystem operations. On the optimization front, a number of changes improve bulk file staging on remote endpoints, and a few minor performance tweaks have been made to scanning. Finally, the filesystem watching code has been slightly refactored and an initial implementation of fanotify-based watching is available.

Logging

While most observation in Mutagen is done using the list and monitor commands, the Mutagen daemon itself also outputs logs. These can be seen by running the daemon in the foreground using mutagen daemon run. This is the exact same command that's spun off as a background process by mutagen daemon start, so you'll need to run mutagen daemon stop first if the daemon is already running. The granularity of these logs can be controlled by setting the MUTAGEN_LOG_LEVEL environment variable to one of disabled, error, warn, info, debug, or trace. The default log level is info.
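
As a rough illustration of the workflow above, the following Python sketch builds the command line and environment for a foreground daemon run at a chosen log level. The helper names (daemon_invocation, run_daemon_in_foreground) are hypothetical and not part of Mutagen; only the mutagen daemon run and mutagen daemon stop commands, the MUTAGEN_LOG_LEVEL variable, and the level names come from the notes above.

```python
import os
import subprocess

# Level names as listed in the release notes.
LOG_LEVELS = ("disabled", "error", "warn", "info", "debug", "trace")


def daemon_invocation(level="info", base_env=None):
    """Return (argv, env) for running the Mutagen daemon in the foreground.

    Hypothetical helper: it only prepares the invocation, it does not run it.
    """
    if level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["MUTAGEN_LOG_LEVEL"] = level
    return ["mutagen", "daemon", "run"], env


def run_daemon_in_foreground(level="debug"):
    """Stop any background daemon, then run one in the foreground."""
    # The stop command may fail if no daemon is running; that is fine here.
    subprocess.run(["mutagen", "daemon", "stop"], check=False)
    argv, env = daemon_invocation(level)
    return subprocess.run(argv, env=env, check=True)
```

Calling run_daemon_in_foreground("trace") would block until the daemon exits, with its log output appearing on the terminal.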

While this logging has always existed, logs at the debug and trace level haven't always provided sufficient information to debug issues. This release vastly increases the information output by these logs, which now includes information on filesystem watcher observations, change reconciliation, and file staging. Moreover, the local log level is now propagated to agent processes (whose logs are merged into the daemon log).

Initial fanotify support

Filesystem watching is an area where Linux unfortunately lags behind macOS and Windows (at least in practical terms). In particular, it has never had a usable, general-purpose, natively recursive filesystem watching API accessible from user space (i.e. an API that can watch an entire directory hierarchy with only a single watch). This is despite having three user-space watching APIs: inotify, dnotify, and fanotify (plus an underlying kernel framework that supports them, called fsnotify).

Most tools use inotify to perform filesystem watching on Linux. inotify is a non-recursive Linux filesystem watching API that requires a single "watch descriptor" (similar to a file descriptor) for every file and directory being monitored. While many libraries attempt to emulate recursive watching by combining inotify with logic that automatically creates and destroys watches, this approach is subject to race conditions and the relatively low default watch descriptor limit.
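
To get a feel for the watch descriptor limit, the sketch below estimates how many watches an emulated recursive watcher would need for a directory tree and reads the per-user limit from /proc. It counts directories only, because a watch on a directory also reports events for its immediate entries, so the result is a lower bound. The function names are hypothetical, and this is not how Mutagen itself manages watches.

```python
import os

MAX_WATCHES_PATH = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(root):
    """Count root plus every directory beneath it (symlinks not followed)."""
    total = 0
    for _dirpath, _dirnames, _filenames in os.walk(root, followlinks=False):
        total += 1  # os.walk yields exactly one tuple per visited directory
    return total


def read_watch_limit(path=MAX_WATCHES_PATH):
    """Return the per-user inotify watch limit, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def fits_within_limit(root):
    """Return True/False, or None when the limit cannot be read."""
    limit = read_watch_limit()
    if limit is None:
        return None
    return count_directories(root) <= limit
```

On a non-Linux system read_watch_limit returns None, so fits_within_limit reports None rather than guessing.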

Mutagen has always taken a hybrid approach on Linux systems: use readdir/fstatat-based polling for accuracy, combined with a limited number of inotify watches on the most recently updated contents in order to reduce latency. This has meant that every synchronization session that includes a Linux endpoint (i.e. most of them) is subject to O(n) filesystem scanning on that endpoint, either synchronously or, in the accelerated case, asynchronously. This manifests as periodic CPU spikes, especially for very large synchronization roots.
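
The O(n) cost of polling can be seen in a minimal sketch: every poll walks the whole tree and stats every entry, regardless of how little has changed. The code below is an illustrative stand-in written in Python, not Mutagen's scanner; the snapshot and diff functions are hypothetical names.

```python
import os


def snapshot(root):
    """Map each relative path under root to (size, mtime_ns, is_dir).

    Every entry is visited and stat'ed, so the cost grows with tree size.
    """
    entries = {}
    stack = [root]
    while stack:
        directory = stack.pop()
        with os.scandir(directory) as it:
            for entry in it:
                info = entry.stat(follow_symlinks=False)
                is_dir = entry.is_dir(follow_symlinks=False)
                rel = os.path.relpath(entry.path, root)
                entries[rel] = (info.st_size, info.st_mtime_ns, is_dir)
                if is_dir:
                    stack.append(entry.path)
    return entries


def diff(old, new):
    """Return sorted lists of (created, deleted, modified) relative paths."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return created, deleted, modified
```

A poller built on this would call snapshot on a timer and compare the result with the previous one, which is where the periodic CPU spikes on large roots come from. Note that a parent directory's own metadata usually changes when its entries change, so it may also show up as modified.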

The fanotify API does provide recursive watching facilities, and it has existed for a long time, but it didn't previously provide the granularity of events necessary for Mutagen to avoid full filesystem re-scans. However, fanotify recently received an overhaul in Linux 5.1, making it just workable enough for Mutagen to use. Unfortunately, it still has some caveats, namely the fact that the calling process needs CAP_SYS_ADMIN and CAP_DAC_READ_SEARCH capabilities (i.e. it basically needs to be run as root). However, this is fairly workable in the context of containers, and thus Mutagen has shipped an alternative sidecar container image that (when run on a 5.1+ kernel with the relevant capabilities enabled) is able to use fanotify. Since this requires some non-trivial setup, the support is currently only turned on automatically when using Mutagen Compose. However, it is very possible to use it in other containerized setups, so please reach out on the Mutagen Community Slack Workspace if you're interested in trying it.
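
The prerequisites named above (a 5.1+ kernel plus the CAP_SYS_ADMIN and CAP_DAC_READ_SEARCH capabilities) can be checked from inside a container with a short script. The following Python sketch is an assumption-laden illustration, not something Mutagen ships: the function names are hypothetical, the capability bit indices are taken from linux/capability.h, and the effective capability mask is read from the CapEff field of /proc/self/status.

```python
import platform
import re

# Bit indices as defined in <linux/capability.h>.
CAP_DAC_READ_SEARCH = 2
CAP_SYS_ADMIN = 21


def kernel_at_least(release, minimum=(5, 1)):
    """Compare the major.minor prefix of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def has_capabilities(cap_eff_hex,
                     required=(CAP_SYS_ADMIN, CAP_DAC_READ_SEARCH)):
    """Check that every required bit is set in a hex capability mask."""
    mask = int(cap_eff_hex, 16)
    return all(mask & (1 << bit) for bit in required)


def read_effective_caps(status_path="/proc/self/status"):
    """Return the CapEff hex string for this process, or None."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("CapEff:"):
                    return line.split()[1]
    except OSError:
        pass
    return None


def fanotify_prerequisites_met():
    """Best-effort check of kernel version and effective capabilities."""
    caps = read_effective_caps()
    if caps is None:
        return False
    return kernel_at_least(platform.release()) and has_capabilities(caps)
```

This only checks the stated prerequisites; it does not attempt to create a fanotify watch, so a True result is a necessary condition rather than a guarantee.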

Enabling fanotify support allows for two things: significantly faster filesystem re-scans (meaning lower-latency change propagation) and avoidance of poll-based scanning (meaning no CPU spikes). This can make an enormous difference in the context of multi-GB codebases or a large number of sessions.

Support for fanotify outside of containers is also on the horizon, likely enabled via some sort of sudo-based mechanism or system-level service, but feedback from the container use case will go a long way in validating its viability.

It's also worth mentioning that the fanotify watching implementation is licensed under the Server Side Public License. Certain new features (in particular those that are only relevant to SaaS embedders of Mutagen) will likely start falling under the SSPL umbrella, though there are no plans to change the license for any of Mutagen's MIT-licensed code, and most new code will continue to fall under the MIT license. If you are interested in licensing these features for SaaS-based usage under an alternative license, please reach out to hello@mutagen.io.

Changes

A full accounting of changes since v0.13.1 can be found here. Notable changes include:

  • Logging has been vastly expanded and is now more readable
  • Log levels are now propagated to agent processes
  • Agent installation failures yield better feedback
  • Mutagen commands now include a -v/--version flag to print a human-readable version (thanks @rfay)
    • The version command still exists - its functionality will be expanded in Mutagen v0.15
  • Filesystem watching interfaces have been refactored
    • Thanks to @djs55 for fixes to the Windows watching code
  • Accelerated scanning is no longer temporarily disabled by Transition
  • Initial support for fanotify-based filesystem watching has been added in the sidecar 🚀
  • Synchronization file transfers have been further optimized:
    • Path filtering now has a shorthand for cases where all files are required
    • Compression flush boundaries are better optimized
  • Filesystem scanning has been further optimized:
    • Accelerated scanning uses a more optimal cache propagation strategy
    • Directory traversal has removed some unnecessary validation code
  • Cache saves are now event-driven
  • Ancestor saves are now contingent on changes occurring
  • Sidecar staging is now in-volume for all volume paths (not just volume roots)
  • Updated to Go 1.18
  • Other minor bug fixes