braid is a NixOS CLI tool for managing an encrypted, redundant NAS. It wraps two standard Linux tools into a simple interface:
- LUKS -- full disk encryption (passphrase-based, keys never stored on disk)
- btrfs RAID1 -- checksumming filesystem with automatic self-healing from redundant copies
And it leans heavily on systemd, built into NixOS: the unlock/mount lifecycle, scrub timers, and UPS/fan/suspend services all run as systemd units.
# Find your disks
lsblk -d -o NAME,SIZE,MODEL,ID-LINK
# Add disks to the pool
sudo braid add toshiba=/dev/disk/by-id/ata-Toshiba_MN07_XXXX \
ironwolf=/dev/disk/by-id/ata-Ironwolf_ST12_YYYY
# Unlock after boot
sudo braid unlock
# Check pool health
sudo braid status
# Remove a disk
sudo braid remove ironwolf
# Replace a failed disk
sudo braid replace --old ironwolf --new seagate=/dev/disk/by-id/ata-Seagate_NEW_ZZZZ
# Lock the pool
sudo braid lock
See the command reference for full usage of each command.
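The add and replace commands above take name=/dev/disk/by-id/... pairs. As a minimal sketch for listing candidate whole-disk paths, assuming the usual udev layout where partition links end in -partN, something like the following could help. The function name whole_disk_ids is hypothetical and not part of braid.

```shell
# Hypothetical helper (not part of braid): read by-id link names on stdin and
# print whole-disk paths, dropping partition links such as ...-part1.
whole_disk_ids() {
  grep -v -e '-part[0-9][0-9]*$' | sed 's|^|/dev/disk/by-id/|'
}

# Usage on a real machine:
#   ls /dev/disk/by-id | whole_disk_ids
```

The same disk often appears under several by-id names (for example an ata- and a wwn- link), so pick one and check it against the lsblk output before passing it to braid.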
- Full disk encryption -- passphrase or USB keyfile to unlock
- Redundancy -- data stored on two disks; tolerates a single disk failure
- Dynamic pool -- add or remove drives with a command, no nixos-rebuild
- Self-healing -- btrfs checksums every block and silently repairs corruption from the redundant copy
- Offline-write safety -- the pool mountpoint is sealed immutable while the pool is unmounted, so a process writing it before the pool mounts fails loudly with EPERM instead of silently landing data on the root disk (which the pool would then hide on mount)
- CLI-owned membership -- braid add/remove/replace manage the pool; state lives in UUID-keyed /var/lib/braid/pool.json
- UPS safety -- with UPS support enabled, NUT drives orderly poweroff on low battery, mutating commands refuse to start unless UPS utility power is verified, and braid ups status / the TUI show live UPS state
- TUI dashboard -- braid tui shows pool health, disk status, balance progress, SMART data, and (when enabled) chassis fan telemetry plus UPS state
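The offline-write seal makes stray writes fail with EPERM. A backup or sync script can also check up front that the pool is mounted. The sketch below is independent of braid's own mechanism: require_pool_mounted is a hypothetical name, and it only relies on the standard mountpoint utility and the default /mnt/storage mount point.

```shell
# Hypothetical guard (not part of braid): succeed only when something is
# mounted at the given path, so a script can bail out before writing.
require_pool_mounted() {
  mount_point="${1:-/mnt/storage}"
  if ! mountpoint -q "$mount_point"; then
    echo "pool not mounted at $mount_point; refusing to write" >&2
    return 1
  fi
}

# Usage in a script:
#   require_pool_mounted /mnt/storage || exit 1
```

Checking first gives a clearer error message than a failed write; the seal remains the safety net for processes that do not check.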
- RAID1 capacity cost -- half your raw capacity goes to redundancy. Four 12 TB drives = 24 TB usable.
- HDD-first -- defaults are tuned for spinning drives (e.g. no TRIM). SSDs may work but are not supported.
- Unstable -- this is pre-v1.0 and I change things when I decide on a better way. Commands, flags, config, and even on-disk state like pool.json format can change.
- Unproven -- I run braid on a daily-use 4x12TB NAS, and there are 180+ NixOS VM tests and 2200+ Rust tests, but there are almost certainly weird/bad edge cases. That said, every mutating command takes a --dry-run flag, so you can preview exactly what it'll do before it touches your disks.
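The RAID1 capacity cost above can be worked out for any set of disks. This is a rough sketch of the general rule for btrfs RAID1 (two copies of every block), not something braid itself computes: usable space is the smaller of half the raw total and the total minus the largest disk. It ignores metadata overhead and the TB/TiB difference, and raid1_usable_tb is a hypothetical name.

```shell
# Approximate usable capacity of a two-copy RAID1 pool.
# Arguments: disk sizes as whole numbers of TB.
raid1_usable_tb() {
  total=0
  largest=0
  for size in "$@"; do
    total=$((total + size))
    if [ "$size" -gt "$largest" ]; then
      largest=$size
    fi
  done
  half=$((total / 2))        # every block is stored twice
  rest=$((total - largest))  # the largest disk needs mirror space elsewhere
  if [ "$half" -lt "$rest" ]; then
    echo "$half"
  else
    echo "$rest"
  fi
}

raid1_usable_tb 12 12 12 12   # prints 24
raid1_usable_tb 12 4          # prints 4
```

With equal-size disks the result is always half the raw capacity; the second case shows why a single oversized disk cannot be fully used.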
NixOS/Linux only (x86_64). The CLI wraps Linux storage tooling (LUKS, btrfs, systemd) and does not run on macOS.
Try it without installing anything:
nix run 'github:danneu/braid?ref=release' -- --help
Add braid to your flake inputs and import the module:
# flake.nix
{
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-26.05";
braid.url = "github:danneu/braid?ref=release";
};
outputs = { nixpkgs, braid, ... }: {
nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
system = "x86_64-linux";
modules = [
braid.nixosModules.default
./configuration.nix
];
};
};
}
# configuration.nix
braid = {
enable = true;
mountPoint = "/mnt/storage"; # default
};
The ?ref=release suffix tracks braid's release channel; run nix flake update braid to upgrade.
Add braid's public binary cache so the NAS pulls the prebuilt CLI instead of
recompiling. This relies on the braid input above not setting
inputs.nixpkgs.follows, which keeps braid on its pinned nixpkgs:
# configuration.nix
nix.settings = {
extra-substituters = [ "https://braid.cachix.org" ];
extra-trusted-public-keys = [ "braid.cachix.org-1:I/p7fx1z5n0+O80KzMuT7aXRdkVyHr/buZKaBu7HvJs=" ];
};
Add --dry-run to any pool-lifecycle command to print the exact plan -- every
LUKS, btrfs, and mount step it would run -- without touching your disks. Each
step is tagged [destructive], [safe], or [long] (a long-running step like a
btrfs balance), so you can see at a glance what each step does, and the indented
$ line is the literal command:
sudo braid add ironwolf=/dev/disk/by-id/ata-Ironwolf_ST12_YYYY \
toshiba=/dev/disk/by-id/ata-Toshiba_MN07_XXXX --dry-run
[destructive] LUKS format /dev/disk/by-id/ata-Ironwolf_ST12_YYYY
  $ cryptsetup luksFormat --type luks2 --batch-mode '--key-file=-' --uuid 7f9d2e4a-1c3b-4f5a-8e6d-2a1b3c4d5e6f --label braid-ironwolf /dev/disk/by-id/ata-Ironwolf_ST12_YYYY
[safe] LUKS header backup -> /var/lib/braid/luks-headers/braid-ironwolf.luksheader
  $ cryptsetup luksHeaderBackup --header-backup-file /var/lib/braid/luks-headers/braid-ironwolf.luksheader /dev/disk/by-id/ata-Ironwolf_ST12_YYYY
[safe] LUKS open -> braid-ironwolf
  $ cryptsetup open --type luks '--key-file=-' --perf-no_read_workqueue --perf-no_write_workqueue /dev/disk/by-id/ata-Ironwolf_ST12_YYYY braid-ironwolf
[destructive] LUKS format /dev/disk/by-id/ata-Toshiba_MN07_XXXX
  $ cryptsetup luksFormat --type luks2 --batch-mode '--key-file=-' --uuid 3a8c1d9f-5e2b-4a7c-9f1e-6b4d2c8a0e3f --label braid-toshiba /dev/disk/by-id/ata-Toshiba_MN07_XXXX
[safe] LUKS header backup -> /var/lib/braid/luks-headers/braid-toshiba.luksheader
  $ cryptsetup luksHeaderBackup --header-backup-file /var/lib/braid/luks-headers/braid-toshiba.luksheader /dev/disk/by-id/ata-Toshiba_MN07_XXXX
[safe] LUKS open -> braid-toshiba
  $ cryptsetup open --type luks '--key-file=-' --perf-no_read_workqueue --perf-no_write_workqueue /dev/disk/by-id/ata-Toshiba_MN07_XXXX braid-toshiba
[safe] mkfs.btrfs RAID1 /dev/mapper/braid-ironwolf /dev/mapper/braid-toshiba
  $ mkfs.btrfs -d raid1 -m raid1 -O block-group-tree /dev/mapper/braid-ironwolf /dev/mapper/braid-toshiba
[safe] mount -> /mnt/storage
  $ mount -o 'noatime,skip_balance,subvolid=5' /dev/mapper/braid-ironwolf /mnt/storage
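Because each step in the plan carries a tag, the destructive steps are easy to pull out for review. This is a small sketch, assuming the plan is printed to stdout in the format shown above; destructive_steps is a hypothetical helper, not a braid command.

```shell
# Hypothetical helper (not part of braid): read a --dry-run plan on stdin and
# print only the step lines tagged [destructive], without their "$" lines.
# Exits non-zero when the plan contains no destructive steps.
destructive_steps() {
  grep '^[[:space:]]*\[destructive\]'
}

# Usage on a real machine:
#   sudo braid add ironwolf=/dev/disk/by-id/ata-Ironwolf_ST12_YYYY --dry-run | destructive_steps
```

For the plan above this would print the two LUKS format steps and nothing else.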
Without --dry-run, the data-shape commands (add, remove, remove-missing,
replace) show what they are about to do and wait for you to type yes --
anything else aborts:
sudo braid remove ironwolf
Remove from pool:
ironwolf Seagate IronWolf | 12.00 TiB | serial ZL2A1B2C
devid 2 | data will migrate to remaining disks
Pool: 3 disks -> 2 disks
Type 'yes' to continue:
Pass --yes to skip the prompt (for scripts and automation):
sudo braid remove ironwolf --yes
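For automation, one cautious pattern is to print the dry-run plan into the job's output before running the real command with --yes. The wrapper below is only a sketch: remove_disk_unattended and the BRAID variable are hypothetical, and it assumes the script already runs as root.

```shell
# Hypothetical wrapper (not part of braid). BRAID can be overridden, for
# example to point at a stub while testing the script.
BRAID="${BRAID:-braid}"

remove_disk_unattended() {
  disk="$1"
  # Print the plan first; stop if the preview itself fails, so nothing is mutated.
  "$BRAID" remove "$disk" --dry-run || return 1
  # Then run the real removal without the interactive prompt.
  "$BRAID" remove "$disk" --yes
}

# Usage:
#   remove_disk_unattended ironwolf
```

Keeping the plan in the log makes it possible to see afterwards exactly which steps an unattended run was about to take.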
If a mutation is interrupted, braid leaves /var/lib/braid/pending-op.json in place and normal commands refuse until recovery completes. Run sudo braid recover (add --allow-degraded when a member is missing). Recovery repairs pool.json from committed live btrfs membership and, when btrfs balance state is idle, finishes only the owed post-mutation maintenance, such as resize or soft RAID1 balance. If owed RAID1 replay finds a paused, running, or unknown balance state, recover fails closed and preserves pending-op.json for manual inspection.
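A boot-time or cron check can look for the marker file described above and prompt for recovery. This is a minimal sketch that only tests for the file's existence; braid_needs_recovery is a hypothetical name, and it assumes the caller can read /var/lib/braid.

```shell
# Hypothetical helper (not part of braid): succeed when an interrupted
# operation has left a pending-op.json marker in the state directory.
braid_needs_recovery() {
  state_dir="${1:-/var/lib/braid}"
  [ -e "$state_dir/pending-op.json" ]
}

if braid_needs_recovery; then
  echo "interrupted operation found; run: sudo braid recover"
fi
```

The check deliberately does not parse the file; braid recover remains the only thing that interprets or clears it.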
pool.json is keyed by each member's LUKS UUID. Disk names are still the names you type in commands and see in output; by-id paths are the hardware addresses braid uses to find disks.
| Command | Description |
|---|---|
| add | Add disks to the pool (or create a new pool) |
| remove | Remove a live disk from the pool |
| remove-missing | Forget a dead/missing device entry |
| replace | Replace a live or dead disk |
| unlock | Open LUKS devices and mount the pool |
| lock | Unmount the pool and close LUKS devices |
| seal-mountpoint | Seal the offline mountpoint immutable (boot-managed; manual lever) |
| idle | Check if the pool is idle (for auto-suspend) |
| status | Pool health, disk status, allocation, scrub info |
| doctor | Diagnostic checks for config, pool health, and runtime safety |
| monitor | Health check for alerting (used by systemd timer) |
| ack | Acknowledge and silence an active alert |
| enroll | Enroll a USB keyfile for auto-unlock |
| discover | Scan for braid LUKS devices and rebuild pool.json |
| recover | Recover from an interrupted operation |
| tui | Interactive dashboard with raw-output Browse tab |
| ups status | Live UPS state (NUT); --json for scripts |

| Guide | Description |
|---|---|
| Install NixOS | Install NixOS itself before setting up braid |
| Getting started | First-time setup: find disks, create pool, unlock |
| Day-to-day NAS usage | Subvolumes, file permissions, Samba shares |
| Auto-unlock | USB keyfile setup for unattended reboots |
| Monitoring and alerts | Disk health alerts, beeper, alert commands |
| Power management | Auto-suspend, Wake-on-LAN, RTC wakeups |
| Fan control | HDD-driven chassis fan control via hddfancontrol |
| UPS | NUT-backed orderly poweroff, preflight safety, live status |
| NixOS configuration | Module options, scrub scheduling, pinned toolchain |
| Sharing and permissions | Storage group, mount permissions, Samba |
| Mounting subvolumes | Expose a btrfs subvolume at a custom path |
| Troubleshooting | ENOSPC balance, paused balance, missing devices |
| Recovery scenarios | Interrupted operations, lost pool.json, degraded mount |
See docs/dev/overview.md for the dev workflow, test commands, and dependency upgrade process.
braid is written almost entirely by AI agents. After 20 years of software work, this project is my attempt at finding a state-of-the-art approach to AI-heavy engineering.
It is not vibe-coded. Every change runs through a deliberate plan-first pipeline: I generally have Claude Code (--effort max) draft a plan, then I run a revision loop with other agents -- I answer their clarifying questions, choose among branching decisions, and double-check the direction -- until the plan is ratcheted into a final form. Then an agent implements the plan.
The plan file is the main unit of work in braid. That is where all of my attention is spent. Implementation is derived from the plan.
Contributors, if any, would submit plan files rather than code. We would then revision-cycle the plan until it's ready for agent implementation.