diff: show timestamp changes with full nanosecond precision, fixes #9147 #9840

Open
ThomasWaldmann wants to merge 2 commits into borgbackup:master from ThomasWaldmann:diff-time-precision


@ThomasWaldmann ThomasWaldmann commented Jul 2, 2026

Member

Summary

Fixes #9147. borg diff could show confusing output like [ctime: Wed, 2025-11-05 17:45:53 +0000 -> Wed, 2025-11-05 17:45:53 +0000], where both timestamps render identically but actually differ at the sub-second level (e.g. POSIX-valid ctime updates of surviving hardlinks on the BSDs).

Two commits:

Commit 1 — always render borg diff time changes with one fixed, higher-precision format instead of conditionally switching formats (as PR #9561 proposed): in diff output a timestamp field only appears when it changed, so a precision that can express the difference is what we want every time — and a data-independent format is friendlier to humans, tests, and parsers alike. Also re-enables test_hard_link_deletion_and_replacement on FreeBSD, NetBSD and OpenBSD, accepting a ctime-only change of the surviving hardlink there (with distinguishable timestamps required).

Commit 2 — use the full stored precision: borg stores timestamps as int nanoseconds, so a sub-microsecond difference is a real difference and must not be silently dropped (nor shown as two identical strings). ItemDiff._time_diffs() compares the raw ns values; OutputTimestamp optionally carries them, so all three diff representations show a fixed 9-digit fraction:

  • text: [ctime: Mon, 2025-05-05 19:45:53.000123111 +0200 -> Mon, 2025-05-05 19:45:53.000123999 +0200] (new format_time_ns() helper — strftime/isoformat can only do µs)
  • {isoctime}/{isomtime} and JSON lines: 2025-05-05T19:45:53.000123111+02:00 (ns-aware OutputTimestamp.isoformat()/to_json(); other OutputTimestamp users are unchanged)

Net effect: any reported time change always renders as two distinguishable timestamps, and no real change is ever omitted.
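To illustrate the fixed 9-digit text format, here is a minimal sketch of a helper in the spirit of format_time_ns(). It is not the code from this PR: the name format_time_ns_sketch, its signature, and the strftime layout are assumptions chosen to reproduce the example output shown above.

```python
from datetime import datetime, timedelta, timezone

NS_PER_S = 1_000_000_000


def format_time_ns_sketch(ts_ns: int, tz: timezone = timezone.utc) -> str:
    """Render an int-nanoseconds Unix timestamp with a fixed 9-digit fraction.

    Hypothetical helper for illustration only, not borg's format_time_ns().
    """
    # Split off the sub-second part as an integer, so no float rounding occurs.
    seconds, frac_ns = divmod(ts_ns, NS_PER_S)
    # datetime only needs to represent whole seconds here.
    dt = datetime.fromtimestamp(seconds, tz)
    return f"{dt:%a, %Y-%m-%d %H:%M:%S}.{frac_ns:09d} {dt:%z}"


cest = timezone(timedelta(hours=2))
print(format_time_ns_sketch(1746467153000123111, cest))
print(format_time_ns_sketch(1746467153000123999, cest))
```

Because the fraction is formatted from the integer remainder, two timestamps that differ by even 1 ns always produce different strings.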

Based on the analysis and tests of PR #9561 by @hiepau1231 — credited as co-author. Supersedes #9561.

Test plan

  • new unit tests: test_item_diff_time_ns_resolution (ns-resolution change detection incl. the +1ns case), test_diff_formatter_time_precision (9-digit fraction, distinguishable timestamps), test_format_time_ns, test_output_timestamp_ns_isoformat (+ µs behavior unchanged without ns)
  • full diff_cmd_test.py, item_test.py, parseformat_test.py, time_test.py pass locally (macOS)
  • ruff + black clean
  • test_hard_link_deletion_and_replacement on the FreeBSD/NetBSD/OpenBSD CI runners — the BSD-specific behavior can only be validated there

🤖 Generated with Claude Code

…orgbackup#9147

Time changes shown by borg diff are often sub-second (e.g. POSIX-valid
ctime updates of surviving hardlinks). With the previous second-precision
format, both timestamps could render identically, giving confusing output
like [ctime: Wed, ... 17:45:53 +0000 -> Wed, ... 17:45:53 +0000].

- DiffFormatter.format_time: always render with microsecond precision.
- ItemDiff._time_diffs: compare timestamps with microsecond granularity:
  sub-microsecond differences can neither be represented by datetime nor
  displayed, so reporting them would produce the same confusing output.
- re-enable test_hard_link_deletion_and_replacement on FreeBSD, NetBSD
  and OpenBSD: accept a ctime-only change of the surviving hardlink there.

Based on the analysis and tests of PR borgbackup#9561 by hiepau1231 - thanks!

Co-authored-by: hiepau1231 <hiepau1231@gmail.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

codecov Bot commented Jul 2, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.18%. Comparing base (831e47d) to head (1a158dc).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #9840   +/-   ##
=======================================
  Coverage   85.17%   85.18%           
=======================================
  Files          93       93           
  Lines       15372    15382   +10     
  Branches     2318     2320    +2     
=======================================
+ Hits        13093    13103   +10     
  Misses       1583     1583           
  Partials      696      696           


A sub-microsecond timestamp difference is a real difference - borg stores
timestamps as int nanoseconds, so quantizing the comparison to microseconds
(as the previous commit did) made borg diff silently omit genuine metadata
changes.
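The difference between the two comparison strategies can be shown with plain integers. This is a sketch under stated assumptions: changed_us and changed_ns are hypothetical names that only model the quantized and the full-resolution comparison, they are not functions from borg.

```python
from datetime import datetime, timedelta

# Two stored timestamps (int nanoseconds) that differ by exactly 1 ns.
old_ns = 1746467153000123111
new_ns = old_ns + 1


def changed_us(a_ns: int, b_ns: int) -> bool:
    """Microsecond-quantized comparison (models the commit 1 behavior)."""
    return a_ns // 1000 != b_ns // 1000


def changed_ns(a_ns: int, b_ns: int) -> bool:
    """Full nanosecond comparison (models the commit 2 behavior)."""
    return a_ns != b_ns


print(changed_us(old_ns, new_ns))  # False: the 1 ns change is dropped
print(changed_ns(old_ns, new_ns))  # True: the change is reported

# Python's datetime cannot carry the extra digits: its resolution is 1 µs.
print(datetime.resolution == timedelta(microseconds=1))  # True
```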

Instead, compare at full nanosecond resolution again and make the displayed
precision match the stored precision:

- OutputTimestamp optionally carries the raw nanoseconds value; its
  isoformat()/to_json() then emit a 9-digit fraction, so the ISO format
  keys and the JSON-lines output can also represent sub-microsecond
  differences.
- new helpers.time.format_time_ns(): like format_time() (default format),
  but with a 9-digit seconds fraction taken from the raw nanoseconds.
- ItemDiff._time_diffs() compares the raw ns ints and passes them to
  OutputTimestamp; DiffFormatter.format_time() renders via format_time_ns().

With the fixed 9-digit fraction, any reported time change always renders
as two distinguishable timestamps, in text as well as in JSON output.
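The "optional ns" idea can be sketched as a small wrapper class. The class below is hypothetical (TimestampSketch is not borg's OutputTimestamp), and the microsecond fallback via timespec="microseconds" is an assumption about the pre-existing behavior; only the 9-digit ISO output matches the example given in the PR description.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class TimestampSketch:
    """Hypothetical stand-in for a datetime wrapper that may also carry raw ns."""

    def __init__(self, dt: datetime, ns: Optional[int] = None):
        self.dt = dt
        self.ns = ns  # raw int-nanoseconds Unix timestamp, if known

    def isoformat(self) -> str:
        if self.ns is None:
            # Without ns: plain microsecond precision (assumed legacy behavior).
            return self.dt.isoformat(timespec="microseconds")
        # With ns: build "YYYY-MM-DDTHH:MM:SS<offset>" and splice the
        # 9-digit fraction in after the 19-character date/time part.
        base = self.dt.isoformat(timespec="seconds")
        return f"{base[:19]}.{self.ns % 1_000_000_000:09d}{base[19:]}"

    def to_json(self) -> str:
        return self.isoformat()


cest = timezone(timedelta(hours=2))
dt = datetime(2025, 5, 5, 19, 45, 53, 123, tzinfo=cest)
print(TimestampSketch(dt).isoformat())
print(TimestampSketch(dt, ns=1746467153000123111).isoformat())
```

The index-based splice relies on the date/time part being exactly 19 characters long, which holds for four-digit years.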

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ThomasWaldmann ThomasWaldmann changed the title from "diff: always use microsecond precision for timestamp changes, fixes #9147" to "diff: show timestamp changes with full nanosecond precision, fixes #9147" on Jul 2, 2026
@ThomasWaldmann

Member Author

Pros and cons, by Claude Fable 5 — merging only the first commit (µs display, µs-quantized comparison) vs. both commits (ns comparison, fixed 9-digit display):

Only commit 1 (982494d: µs display, µs-quantized comparison)

Pros:

  • Less noisy output: 6 fractional digits instead of 9, and sub-µs-only ctime differences (arguably closer to filesystem measurement noise than to meaningful changes) never clutter the diff.
  • Simpler code: no ns plumbing — OutputTimestamp stays a plain datetime wrapper, no format_time_ns() helper, no manual ISO-string surgery for the 9-digit fraction.
  • One natural precision: everything aligns with what Python's datetime can represent, so display, comparison, and internal representation can't drift apart.
  • JSON/ISO output keeps the established 6-digit fraction that any ISO 8601 consumer handles; 9-digit fractions are valid ISO but some stricter or older parsers only expect up to µs.

Cons:

  • borg diff lies by omission: borg faithfully stores ns, and two stored timestamps that genuinely differ compare as "unchanged". For the auditing/forensics use cases where people actually look at ctime diffs, silently dropping a real change is the worst failure mode.
  • The tool's reporting precision no longer matches its storage precision — an inconsistency that needs documenting and explaining forever ("why does borg store ns but diff ignores part of it?").
  • Debugging value lost: a ns-level restore/transfer timestamp bug would be invisible in borg diff.

Commits 1+2 (1a158dc: ns comparison, fixed 9-digit display)

Pros:

  • Fully faithful: every stored difference is reported, and every reported difference renders as two visibly distinguishable timestamps — in text, {isoctime}/{isomtime}, and JSON. The "changed but looks equal" bug class is structurally impossible.
  • One invariant: display precision = comparison precision = storage precision. Nothing to explain away.
  • Still a fixed, data-independent format (9 digits always), so output stays aligned and parseable.

Cons:

  • Wider output: every time diff carries 9 fractional digits, mostly trailing zeros for the common ≥1s case.
  • More code to maintain: the optional ns on OutputTimestamp (dual µs/ns behavior), the index-based fraction splice in isoformat(), the extra helper.
  • The 9-digit ISO fraction in JSON-lines is a (compatible-in-spirit, but observable) output format change for machine consumers.

Recommendation: merge both. The only real costs of commit 2 are cosmetic width and a little code, while the cost of stopping at commit 1 is a correctness gap in a tool whose entire job is reporting differences. History-wise the two commits are each self-consistent with passing tests (bisect-safe), and the pair documents why ns won over µs.

@ThomasWaldmann

Member Author

@hiepau1231 @PhrozenByte what do you think?

@PhrozenByte

Contributor

per #9561 (comment)

That variable formatting is a mixed blessing:

  • for human readers, it might be the best way
  • for automated parsing, it is the worst if the format is not constant

I think this is a good opportunity to make a more fundamental decision: should we consider the regular text output (not just of borg diff) a stable API, or is it strictly intended for humans?

Parsing human-readable text output is inherently fragile. It's usually only done when there is no better alternative. Borg has significantly expanded its JSON support recently. I don't have the full picture, but is anything important still missing? Since Borg 2 is a breaking release anyway, we could explicitly document that text output is not a stable API and that downstream projects should switch to JSON instead.

I'd handle it like this: JSON should always expose the full available precision (i.e. timestamps with ns precision). Text output should use adaptive formatting: show 1s precision by default, and only include sub-second precision when the difference is <1s.
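A rough sketch of what such adaptive text formatting could look like follows. All names here are hypothetical and nothing in it is taken from borg; it only applies the rule stated above (fraction shown only when the two timestamps are less than 1 s apart).

```python
from datetime import datetime, timedelta, timezone

NS_PER_S = 1_000_000_000


def _render(ts_ns: int, tz: timezone, with_fraction: bool) -> str:
    # Integer split keeps the nanoseconds exact.
    seconds, frac_ns = divmod(ts_ns, NS_PER_S)
    dt = datetime.fromtimestamp(seconds, tz)
    fraction = f".{frac_ns:09d}" if with_fraction else ""
    return f"{dt:%a, %Y-%m-%d %H:%M:%S}{fraction} {dt:%z}"


def format_change_adaptive(old_ns: int, new_ns: int, tz: timezone = timezone.utc) -> str:
    """Show 1 s precision unless the two timestamps are less than 1 s apart."""
    with_fraction = abs(new_ns - old_ns) < NS_PER_S
    old_text = _render(old_ns, tz, with_fraction)
    new_text = _render(new_ns, tz, with_fraction)
    return f"{old_text} -> {new_text}"


cest = timezone(timedelta(hours=2))
base = 1746467153000123111
print(format_change_adaptive(base, base + 888, cest))           # sub-second change
print(format_change_adaptive(base, base + 5 * NS_PER_S, cest))  # 5 s change
```

The trade-off is the one discussed in this thread: the output is easier on human readers, but its width depends on the data, so it is less suitable for parsing.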

I'd also prefer JSON to print a float (i.e. a Unix timestamp with ns precision) instead of a datetime string. Not just for borg diff, but everywhere.
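One detail worth checking for the float idea: a JSON number parsed into an IEEE 754 double has roughly 15 to 17 significant decimal digits, while a present-day Unix timestamp with nanoseconds needs 19. The sketch below uses plain Python floats (which are doubles) to show the effect; it makes no claim about how borg would serialize such a value.

```python
import math

# Two int-nanosecond timestamps that differ by exactly 1 ns.
old_ns = 1746467153000123111
new_ns = old_ns + 1

# Convert to float seconds, as a "Unix timestamp as float" would be stored.
old_float = old_ns / 1e9
new_float = new_ns / 1e9

print(old_ns == new_ns)        # False: the integers differ
print(old_float == new_float)  # True: the floats are identical

# Spacing between adjacent doubles near this value, in seconds
# (math.ulp requires Python 3.9+).
print(math.ulp(old_float))
```

An integer nanosecond count, or a string with a 9-digit fraction, would keep the full precision.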

I'm aware this makes the implementation more complex. However, as a human, I don't care about the exact number of µs or ns. If the difference is <1s, all I care about is that there is a difference. Showing the exact value avoids confusion, but beyond that it's unnecessary detail for humans IMHO. Non-humans can get full precision by using JSON output instead.

If you don't think that's reasonable, I'd probably merge both commits and therefore always print ns precision. It's the worst for humans, but at least avoids having the same issue with sub-µs differences.

Development

Successfully merging this pull request may close these issues.

freebsd test failure in test_hard_link_deletion_and_replacement

2 participants