Skip to content

Fix sparse literal byte accounting#2146

Merged
oferchen merged 1 commit into
masterfrom
implement-rsync-3.4.1-parity-in-rust
Nov 14, 2025
Merged

Fix sparse literal byte accounting#2146
oferchen merged 1 commit into
masterfrom
implement-rsync-3.4.1-parity-in-rust

Conversation

@oferchen
Copy link
Copy Markdown
Owner

Summary

  • make sparse writer return the number of literal bytes actually written instead of the logical chunk length
  • extend sparse writer unit tests to validate literal-byte accounting and zero-only chunk handling

Testing

  • cargo test -p engine sparse_writer

Codex Task

@oferchen oferchen merged commit 67724c6 into master Nov 14, 2025
@oferchen oferchen deleted the implement-rsync-3.4.1-parity-in-rust branch November 14, 2025 11:41
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 61 to +83
state.accumulate(leading);

if leading == chunk.len() {
return Ok(chunk.len());
return Ok(0);
}

let trailing = trailing_zero_run(&chunk[leading..]);
let data_start = leading;
let data_end = chunk.len() - trailing;

if data_end > data_start {
let literal_written = if data_end > data_start {
state.flush(writer, destination)?;
writer
.write_all(&chunk[data_start..data_end])
.map_err(|error| LocalCopyError::io("copy file", destination.to_path_buf(), error))?;
}
data_end - data_start
} else {
0
};

state.replace(trailing);

Ok(chunk.len())
Ok(literal_written)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Flag sparse copies as having written data

Changing write_sparse_chunk (crates/engine/src/local_copy/executor/file/sparse.rs) to return only the number of literal bytes written means it now returns 0 whenever a chunk is entirely zeros. However, file::copy::transfer still computes wrote_data = outcome.literal_bytes() > 0 || append_offset > 0 (around line 386) and uses that flag to mark checksum changes and, when --times is disabled, to decide whether to touch the destination mtime. As a result, overwriting an existing file in sparse mode with data that is all zeros will now leave wrote_data == false, so the change set shows no checksum change and the modification time is left untouched even though the file contents were replaced. This silently skips itemized notifications and mtime updates for a real data change. wrote_data needs to be recomputed to account for logical bytes (e.g. compare the logical length or track “any chunk processed”) instead of relying solely on literal_bytes().

Useful? React with 👍 / 👎.

oferchen added a commit that referenced this pull request May 14, 2026
Upstream rsync (options.c:2382) rejects the combination
`--append --whole-file` with the message
`--append cannot be used with --whole-file` and exits with
RERR_SYNTAX (1). oc-rsync was silently accepting both flags
together.

Add the conflict check to both validation layers that gate
config construction:
- `ClientConfigBuilder::validate` (core) - the primary path
  used by the CLI; surfaces a `ConfigConflict` whose `Display`
  matches upstream wording.
- `LocalCopyOptionsBuilder::validate` (engine) - keeps the
  engine-level builder in sync so programmatic embedders also
  reject the combination.

Only an explicit `--whole-file` (`Some(true)`) conflicts; the
default (`None`) and `--no-whole-file` (`Some(false)`) remain
accepted so existing call sites that set the tri-state from
auto-detection are unaffected.

Tests cover the engine builder, the core config builder
(including the exact upstream error string), and an end-to-end
CLI integration test that drives `cli::run` and asserts the
RERR_SYNTAX exit code plus the diagnostic mentions both flags.

Refs: #2145, #2146, #2147
oferchen added a commit that referenced this pull request May 18, 2026
Upstream rsync (options.c:2382) rejects the combination
`--append --whole-file` with the message
`--append cannot be used with --whole-file` and exits with
RERR_SYNTAX (1). oc-rsync was silently accepting both flags
together.

Add the conflict check to both validation layers that gate
config construction:
- `ClientConfigBuilder::validate` (core) - the primary path
  used by the CLI; surfaces a `ConfigConflict` whose `Display`
  matches upstream wording.
- `LocalCopyOptionsBuilder::validate` (engine) - keeps the
  engine-level builder in sync so programmatic embedders also
  reject the combination.

Only an explicit `--whole-file` (`Some(true)`) conflicts; the
default (`None`) and `--no-whole-file` (`Some(false)`) remain
accepted so existing call sites that set the tri-state from
auto-detection are unaffected.

Tests cover the engine builder, the core config builder
(including the exact upstream error string), and an end-to-end
CLI integration test that drives `cli::run` and asserts the
RERR_SYNTAX exit code plus the diagnostic mentions both flags.

Refs: #2145, #2146, #2147
oferchen added a commit that referenced this pull request May 18, 2026
Upstream rsync (options.c:2382) rejects the combination
`--append --whole-file` with the message
`--append cannot be used with --whole-file` and exits with
RERR_SYNTAX (1). oc-rsync was silently accepting both flags
together.

Add the conflict check to both validation layers that gate
config construction:
- `ClientConfigBuilder::validate` (core) - the primary path
  used by the CLI; surfaces a `ConfigConflict` whose `Display`
  matches upstream wording.
- `LocalCopyOptionsBuilder::validate` (engine) - keeps the
  engine-level builder in sync so programmatic embedders also
  reject the combination.

Only an explicit `--whole-file` (`Some(true)`) conflicts; the
default (`None`) and `--no-whole-file` (`Some(false)`) remain
accepted so existing call sites that set the tri-state from
auto-detection are unaffected.

Tests cover the engine builder, the core config builder
(including the exact upstream error string), and an end-to-end
CLI integration test that drives `cli::run` and asserts the
RERR_SYNTAX exit code plus the diagnostic mentions both flags.

Refs: #2145, #2146, #2147
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant