Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

working_copy: Only serialise clock if only clock changed #3928

Closed
wants to merge 1 commit into from

Conversation

mlcui-corp
Copy link
Collaborator

When the working copy is identical to the last snapshot - which is quite often - only the watchman_clock changes. When this happens, the entirety of the working copy proto is reserialised, including the O(n) file_states field, which is slow.

As non-repeated proto fields always take the last field in the serialised message (1), we can skip the serialisation of the whole working copy by only serialising the fields that have changed and appending it to the last serialised proto.

This speeds up most jj commands by ~50ms on a large repository:

$ hyperfine --sort command --warmup 3 --runs 20 -L bin jj-before,jj-after \
  "target/release/{bin} -R ~/chromiumjj/src show -s @"
Benchmark 1: target/release/jj-before -R ~/chromiumjj/src show -s @
  Time (mean ± σ):     401.5 ms ±   4.5 ms    [User: 227.2 ms, System: 172.9 ms]
  Range (min … max):   394.3 ms … 414.2 ms    20 runs

Benchmark 2: target/release/jj-after -R ~/chromiumjj/src show -s @
  Time (mean ± σ):     347.3 ms ±   4.6 ms    [User: 179.3 ms, System: 166.4 ms]
  Range (min … max):   340.7 ms … 358.0 ms    20 runs

Relative speed comparison
        1.16 ±  0.02  target/release/jj-before -R ~/chromiumjj/src show -s @
        1.00          target/release/jj-after -R ~/chromiumjj/src show -s @

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

When the working copy is identical to the last snapshot - which is quite
often - only the `watchman_clock` changes. When this happens, the
_entirety_ of the working copy proto is reserialised, including the O(n)
`file_states` field, which is slow.

As non-repeated proto fields always take the last field in the
serialised message ([1]), we can skip the serialisation of the whole
working copy by only serialising the fields that have changed and
appending it to the last serialised proto.

This speeds up most `jj` commands by ~50ms on a large repository:

```
$ hyperfine --sort command --warmup 3 --runs 20 -L bin jj-before,jj-after \
  "target/release/{bin} -R ~/chromiumjj/src show -s @"
Benchmark 1: target/release/jj-before -R ~/chromiumjj/src show -s @
  Time (mean ± σ):     401.5 ms ±   4.5 ms    [User: 227.2 ms, System: 172.9 ms]
  Range (min … max):   394.3 ms … 414.2 ms    20 runs

Benchmark 2: target/release/jj-after -R ~/chromiumjj/src show -s @
  Time (mean ± σ):     347.3 ms ±   4.6 ms    [User: 179.3 ms, System: 166.4 ms]
  Range (min … max):   340.7 ms … 358.0 ms    20 runs

Relative speed comparison
        1.16 ±  0.02  target/release/jj-before -R ~/chromiumjj/src show -s @
        1.00          target/release/jj-after -R ~/chromiumjj/src show -s @
```

[1]: https://protobuf.dev/programming-guides/encoding/#last-one-wins
// https://protobuf.dev/programming-guides/encoding/#last-one-wins
// Reuse the existing serialisation of the proto in the current state to avoid
// serialising non-changed fields.
let bytes_written = std::fs::copy(&target_path, temp_file.path()).map_err(|err| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to use std::io::copy() to read/write contents over open fds. It might be a bit slower, but can avoid possible problems (like mandatory file locking on Windows.)

@@ -671,6 +684,56 @@ impl TreeState {
Ok(())
}

fn save_only_clock(&mut self) -> Result<(), TreeStateError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow add test exercising this code path?

// https://protobuf.dev/programming-guides/encoding/#last-one-wins
// Reuse the existing serialisation of the proto in the current state to avoid
// serialising non-changed fields.
let bytes_written = std::fs::copy(&target_path, temp_file.path()).map_err(|err| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just rubber ducking)

  • this never copies inconsistent tree state because the in-memory version is loaded after lock was acquired.
  • even if it could, it's safe to rewind the watchman clock.

BTW, it might be simpler to extract watchman_clock to separate proto file if the second bullet point holds.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if it could, it's safe to rewind the watchman clock.

Yes, that is correct. From @arxanas:

It's not ever incorrect to ignore the clock update, but then you have to redo all the same incremental processing work on the next query. (If there is no additional work, then the inefficiency would be minimal.)

We should still save the watchman clock due to this "incremental processing work" - for example, consider a large build directory that is jj-ignored but not .watchmanconfig ignored.


it might be simpler to extract watchman_clock to separate proto file if the second bullet point holds.

This matches @arxanas' idea of splitting the states:

Another alternative is to split the O(repo) working copy state from the O(1) state and invalidate/update them separately. Then you could update the clock without having to rewrite the whole file state map.

I think this is quite reasonable, but I was a bit concerned about the atomicity of having two working copy protos. However, this might not be a problem?

I guess it's possible that you'd want to write all that information atomically somehow? I don't see when it would matter, though, at least for the file states map, which IIRC is just an optimization and not incorrect to fail at updating?

What do you think about splitting the working state in general? I think that going forward, this might be easier to maintain than special-casing the clock in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think atomic update in general is required here (because we'll take a lock and reload on mutation, and actual working-copy files can't be updated atomically anyway), but I'm not pretty sure. @martinvonz, any concerns?

From @arxanas' comment, I think it's safe to split watchman_clock to a separate file. The combination of (new_tree_state, old_watchman_clock) won't cause consistency problem.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - as long as we can't end up writing (old_tree_state, new_watchman_clock), there shouldn't be any problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was slightly worried about the following situation:

  • Time 0: foo.txt has contents 1. (old_watchman_clock)
  • Time 1: foo.txt has contents 2. (new_tree_state)
  • Time 2: foo.txt has contents 1.

At time 2, querying watchman with the time 0 clock successfully shows foo.txt being updated, even though its contents technically have not been updated.

I'll abandon this PR and will work on splitting out the proto. Just a few questions before I get started:

  • Splitting - what part of TreeState should we "split out"? I think there's two reasonable options:
    • Move only watchman_clock into a new message, as discussed. I think this is most reasonable as an incremental update to TreeState.
    • Move only file_states into a new message. This notably allows us to update tree_ids and sparse_patterns without writing file_states, but I think this is probably not needed.
  • Naming - what should we name the new message / the on-disk file? Thankfully, protobuf makes it easy to rename messages, so we don't need to think about this too much.
    • If moving watchman_clock: Should we keep it tied with the idea of a file system watcher, or keep it more general? I think TreeStateSecondary is reasonable - it emphasises the fact that a "primary" update can be done without updating the "secondary".
    • If moving file_states: I think it's reasonable for it to be called FileStates.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At time 2, querying watchman with the time 0 clock successfully shows foo.txt being updated, even though its contents technically have not been updated.

That should just be a performance problem, not a correctness problem. It would mean that we read the file contents an extra time before we determine that it hasn't changed.

  • Move only file_states into a new message. This notably allows us to update tree_ids and sparse_patterns without writing file_states, but I think this is probably not needed.

More importantly, those needs to be written atomically.

Naming - what should we name the new message / the on-disk file?

I would go with something more specific since we don't yet know what else might go into that file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was dealing with the same problem at work recently, specifically concerned with the case that the user creates a new file at time 0, we process it at time 1 and then abort, and then the user deletes the file at time 2, so that the working copy state compared to the last saved clock looks identical to us, but we created extra state somewhere that now needs to be cleaned up. But Watchman will still report the file change so that we can notice and delete the extra state.

let proto = crate::protos::working_copy::TreeState {
watchman_clock: self.watchman_clock.clone(),
..Default::default()
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I assume this never serializes fields other than watchman_clock because Default values can be omitted in protobuf, and the serialization should be skipped by prost.

Can you add a short comment about that?

@@ -1768,7 +1840,7 @@ impl LockedWorkingCopy for LockedLocalWorkingCopy {
message: "Failed to read the working copy state".to_string(),
err: err.into(),
})?;
self.tree_state_dirty |= tree_state.snapshot(options)?;
self.tree_state_dirty = self.tree_state_dirty.max(tree_state.snapshot(options)?);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you add inline comment to TreeStateChanged type saying the order matters.

Alternatively, maybe we can use bitflags.

@mlcui-corp
Copy link
Collaborator Author

Closing this PR as discussed in #3928 (comment) - we will move watchman_clock out into a separate proto.

@mlcui-corp mlcui-corp closed this Jun 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants