- Run `cp example.env .env` and fill in the details of an Editor Tigris access key for an empty bucket.
- Run `docker-compose up --build` to start the services (if you receive duplicated output lines, run `docker container prune` as a workaround; see the docker issue).
The borg-lock container mounts the Tigris bucket and creates a borg repository in it (if not already created). It then breaks any stale locks and creates a new lock. The lock is held for 20 seconds and then released. After that, the container shuts down.
In parallel, the borg-wait container first waits for the borg-lock container to initialize the repository and acquire the lock. It then waits for that lock to be released (for up to 60 minutes). While it is waiting, it prints an "alive" message with an incrementing number every 10 seconds and additionally runs `ls /mnt/tigris`.
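The polling behavior described above can be sketched roughly as follows. This is a hypothetical standalone version, not the actual script: the real container watches the borg lock under /mnt/tigris, which is replaced here by a temp directory so the sketch runs anywhere, and the intervals are shortened.

```shell
# Hypothetical sketch of the borg-wait polling loop (not the real script).
# A temp directory stands in for /mnt/tigris; intervals are shortened.
repo=$(mktemp -d)
touch "$repo/lock.exclusive"                 # simulate a held borg lock
( sleep 2; rm -f "$repo/lock.exclusive" ) &  # lock released in the background
i=0
while [ -e "$repo/lock.exclusive" ]; do
  i=$((i + 1))
  echo "alive $i"
  ls "$repo"        # the directory listing the report later calls out
  sleep 1
done
echo "lock released after $i polls"
```

The key point for the report below is that every iteration both prints an "alive" counter and lists the mounted directory, so a missing listing pinpoints where `ls` stopped returning.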
Thus, the expected output is:
borg-lock-1 | {"level":"info","module":"","version":"1.2.1","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:126","message":"Starting TigrisFS version"}
borg-lock-1 | {"level":"info","module":"","bucketName":"tv-mount-test","mountPoint":"/mnt/tigris","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:127","message":"Mounting"}
borg-lock-1 | {"level":"info","module":"","uid":0,"defaultGid":0,"time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:128","message":"Default uid=gid"}
borg-lock-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:120","message":"env AWS_ACCESS_KEY_ID = tid_dYGIftMTQNXmWLNFlOYkODJcfa_MciOkCdydgBBEonBApUGDAU"}
borg-lock-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:129","message":"default endpoint = https://t3.storage.dev"}
borg-wait-1 | {"level":"info","module":"","version":"1.2.1","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:126","message":"Starting TigrisFS version"}
borg-wait-1 | {"level":"info","module":"","bucketName":"tv-mount-test","mountPoint":"/mnt/tigris","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:127","message":"Mounting"}
borg-wait-1 | {"level":"info","module":"","uid":0,"defaultGid":0,"time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:128","message":"Default uid=gid"}
borg-wait-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:120","message":"env AWS_ACCESS_KEY_ID = tid_dYGIftMTQNXmWLNFlOYkODJcfa_MciOkCdydgBBEonBApUGDAU"}
borg-wait-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:129","message":"default endpoint = https://t3.storage.dev"}
borg-wait-1 | {"level":"info","module":"","time":1758099578,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:161","message":"File system has been successfully mounted."}
borg-wait-1 | Waiting for borg repo to exist and being locked
borg-wait-1 | Repo locked, now waiting to acquire borg lock
borg-lock-1 | {"level":"info","module":"","time":1758099578,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:161","message":"File system has been successfully mounted."}
borg-lock-1 | Init borg repo
borg-lock-1 | A repository already exists at /mnt/tigris/repo.
borg-lock-1 | Acquire lock
borg-lock-1 | Lock acquired
borg-wait-1 | alive 1
borg-wait-1 | repo
borg-wait-1 | alive 2
borg-wait-1 | repo
borg-lock-1 | Released lock
borg-lock-1 | tigrisfs still running, waiting
borg-lock-1 | tigrisfs still running, waiting
borg-lock-1 exited with code 0
borg-wait-1 | Successfully acquired borg lock
borg-wait-1 | tigrisfs still running, waiting
borg-wait-1 | alive 3
borg-wait-1 | repo
borg-wait-1 exited with code 0
There are two surprising deviations from the expected behavior:
- Even though I am using `--stat-cache-ttl 0s`, the borg-wait container does not see the released lock until about a minute later. It feels as if the directory is not actually re-read immediately but served from a cache. But let's not get distracted, because the second issue is more severe:
- While waiting for the lock to vanish, the borg-wait container's `ls /mnt/tigris` locks up at some point and never returns. This means the tigrisfs mount is no longer responsive.
The observed output is:
borg-lock-1 | {"level":"info","module":"","version":"1.2.1","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:126","message":"Starting TigrisFS version"}
borg-lock-1 | {"level":"info","module":"","bucketName":"tv-mount-test","mountPoint":"/mnt/tigris","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:127","message":"Mounting"}
borg-lock-1 | {"level":"info","module":"","uid":0,"defaultGid":0,"time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:128","message":"Default uid=gid"}
borg-lock-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:120","message":"env AWS_ACCESS_KEY_ID = tid_dYGIftMTQNXmWLNFlOYkODJcfa_MciOkCdydgBBEonBApUGDAU"}
borg-lock-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:129","message":"default endpoint = https://t3.storage.dev"}
borg-wait-1 | {"level":"info","module":"","version":"1.2.1","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:126","message":"Starting TigrisFS version"}
borg-wait-1 | {"level":"info","module":"","bucketName":"tv-mount-test","mountPoint":"/mnt/tigris","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:127","message":"Mounting"}
borg-wait-1 | {"level":"info","module":"","uid":0,"defaultGid":0,"time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:128","message":"Default uid=gid"}
borg-wait-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:120","message":"env AWS_ACCESS_KEY_ID = tid_dYGIftMTQNXmWLNFlOYkODJcfa_MciOkCdydgBBEonBApUGDAU"}
borg-wait-1 | {"level":"info","module":"","time":1758099577,"caller":"/home/runner/work/tigrisfs/tigrisfs/core/cfg/conf_s3.go:129","message":"default endpoint = https://t3.storage.dev"}
borg-wait-1 | {"level":"info","module":"","time":1758099578,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:161","message":"File system has been successfully mounted."}
borg-wait-1 | Waiting for borg repo to exist and being locked
borg-wait-1 | Repo locked, now waiting to acquire borg lock
borg-lock-1 | {"level":"info","module":"","time":1758099578,"caller":"/home/runner/work/tigrisfs/tigrisfs/main.go:161","message":"File system has been successfully mounted."}
borg-lock-1 | Init borg repo
borg-lock-1 | A repository already exists at /mnt/tigris/repo.
borg-lock-1 | Acquire lock
borg-lock-1 | Lock acquired
borg-wait-1 | alive 1
borg-wait-1 | repo
borg-wait-1 | alive 2
borg-wait-1 | repo
borg-lock-1 | Released lock
borg-lock-1 | tigrisfs still running, waiting
borg-lock-1 | tigrisfs still running, waiting
borg-lock-1 exited with code 0
borg-wait-1 | alive 3
borg-wait-1 | repo
borg-wait-1 | alive 4
borg-wait-1 | repo
borg-wait-1 | alive 5
borg-wait-1 | repo
borg-wait-1 | alive 6
borg-wait-1 | repo
borg-wait-1 | alive 7
Note the missing `repo` listing after `alive 7`: it indicates that the `ls /mnt/tigris` command has locked up.
At this point, you can also run

$ docker compose exec borg-wait ash
/ # ls /mnt/tigris

and it will hang forever.
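As an aside (not part of the original report), a bounded listing makes this kind of hang detectable from a script rather than a blocked interactive shell. `MOUNT` below is a placeholder, with `/tmp` standing in for `/mnt/tigris` so the sketch runs anywhere:

```shell
# Hypothetical probe: bound ls with timeout so a hung FUSE filesystem
# cannot block the caller forever. MOUNT is a placeholder; /tmp stands
# in for /mnt/tigris so the sketch runs on any machine.
MOUNT="${MOUNT:-/tmp}"
if timeout 5 ls "$MOUNT" > /dev/null 2>&1; then
  echo "mount responsive"
else
  echo "mount hung or errored (ls did not succeed within 5s)"
fi
```

On a healthy mount this prints "mount responsive"; against the hang described above, `timeout` would kill the blocked `ls` after 5 seconds and take the other branch.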
Looking at the stack traces:
$ docker compose exec --privileged borg-wait ash
/ # cat /proc/$(pgrep ls)/stack
[<0>] request_wait_answer+0x10a/0x1e0 [fuse]
[<0>] __fuse_simple_request+0xd5/0x2b0 [fuse]
[<0>] fuse_readdir_uncached+0x12e/0x7f0 [fuse]
[<0>] iterate_dir+0xa7/0x270
[<0>] __x64_sys_getdents64+0x84/0x130
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/ # cat /proc/$(pgrep borg)/stack
[<0>] request_wait_answer+0x10a/0x1e0 [fuse]
[<0>] __fuse_simple_request+0xd5/0x2b0 [fuse]
[<0>] fuse_readdir_uncached+0x12e/0x7f0 [fuse]
[<0>] iterate_dir+0xa7/0x270
[<0>] __x64_sys_getdents64+0x84/0x130
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

This shows that both `ls` and `borg` are stuck in `fuse_readdir_uncached`, waiting for a response from the FUSE filesystem.
The stack traces of the tigrisfs threads are:
~ # for task in /proc/$(pgrep tigrisfs)/task/*; do echo $task; cat $task/stack; echo; done
/proc/22/task/22
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/23
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/24
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/25
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/26
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/27
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/28
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/29
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/30
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/32
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/33
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/34
[<0>] do_epoll_wait+0x4ba/0x4f0
[<0>] do_compat_epoll_pwait.part.0+0xe/0x70
[<0>] __x64_sys_epoll_pwait+0x7e/0x130
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/35
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/36
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/37
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/38
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/39
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/40
[<0>] fuse_dev_do_read+0xa51/0xab0 [fuse]
[<0>] fuse_dev_read+0x5c/0x90 [fuse]
[<0>] vfs_read+0x271/0x350
[<0>] ksys_read+0x6a/0xe0
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
/proc/22/task/54
[<0>] futex_wait_queue+0x69/0xc0
[<0>] __futex_wait+0x145/0x1b0
[<0>] futex_wait+0x6b/0x110
[<0>] do_futex+0xc5/0x190
[<0>] __x64_sys_futex+0x112/0x200
[<0>] do_syscall_64+0x7b/0x820
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

This looks like a deadlock or a stale lock inside tigrisfs.