Conversation
…on collision (#340) WAL filenames used second precision (arc-YYYYMMDD_HHMMSS.wal), so two rotations within the same second produced identical filenames. The second rotation reopened the existing file via O_APPEND and wrote a new header, corrupting the WAL structure. Changed the timestamp format to include nanoseconds (arc-YYYYMMDD_HHMMSS.000000000.wal). The new format is lexicographically sortable and matches all existing *.wal glob patterns used by recovery, purge, and reader code.
There was a problem hiding this comment.
Code Review
This pull request addresses a WAL filename rotation collision bug by increasing the timestamp precision from seconds to nanoseconds in internal/wal/wal.go, which is reflected in the updated release notes. While this change significantly reduces the chance of collisions, a review comment points out that using os.O_APPEND still leaves a potential race condition. It is recommended to create a follow-up task to replace os.O_APPEND with os.O_EXCL for a fully robust solution against file corruption.
|
|
||
| // Generate new filename | ||
| timestamp := time.Now().UTC().Format("20060102_150405") | ||
| timestamp := time.Now().UTC().Format("20060102_150405.000000000") |
There was a problem hiding this comment.
While nanosecond precision significantly reduces the likelihood of filename collisions, it doesn't offer a complete guarantee against race conditions, especially with multiple writers or on systems with low-resolution clocks.
The core issue that leads to corruption is the use of os.O_APPEND on line 283. A more robust approach is to use os.O_EXCL to ensure atomic, exclusive file creation. This would cause the operation to fail safely if a file with the same name already exists, preventing any chance of corruption.
Since the relevant line is not in this diff, I recommend creating a follow-up task to replace O_APPEND with O_EXCL to make the WAL rotation fully safe against collisions.
Summary
arc-YYYYMMDD_HHMMSS.wal), allowing two rotations within the same second to produce identical filenames — the second rotation reopened the file viaO_APPENDand wrote a new header, corrupting the WAL structurearc-YYYYMMDD_HHMMSS.000000000.wal) — still lexicographically sortable, matches all existing*.walglob patternsTest plan
go build ./internal/wal/...passesgo test ./internal/wal/...passesCloses #340