Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make rev() detect all parents of the given commits #1326

Closed
wants to merge 1 commit into from

Conversation

RalfJung
Copy link
Contributor

@RalfJung RalfJung commented Apr 27, 2024

The current implementation of rev does not work well when we have a history like

 o
/ \
| o $commit (the one given in rev)
o |
|\|
| o $parent

Without this PR, $parent will be processed both with and without the filter: when josh traverses the left branch, it eventually hits $parent without previously having seen $commit and therefore now processes the ancestors of this commit without applying the filter. At least I think that's what happens in #1325.

This implements what I sketched here: I need rev($commit=FILTER) to apply the given filter to all parents of commit no matter how they are referenced.
So this changes rev semantics to instead detect when we reach any ancestor of the given $commit, and then we start applying the filter. This has successfully extracted the rust-analyzer history in a way that I can then also do pushes.

This is a draft because

  • it needs a test
  • the docs need updating
  • I don't know if I can just change the semantics of the existing filter (vs making this a new filter, or adding new syntax like :rev(55d9a533b309119c8acd13061581b43ae8840823+:prefix=src/tools/rust-analyzer) to indicate "plus all parents" -- note the +)
  • I don't know if the cache version number should be bumped up

(I also won't have more time to work on this for a bit, so if the approach seems acceptable than I wouldn't mind someone else taking over. :)

@RalfJung
Copy link
Contributor Author

I have also checked that Miri still works with this version of josh -- i.e., even though the filter used by Miri technically changed its meaning (since it now checks for "ancestors"), the extracted history is the same as before. The ancestors semantics is what we always wanted in Miri, and it is what I originally implemented in #961, which means it is also how Miri was originally josh-ified. Then when we switched to the official rev filter we switched to non-ancestors semantics, and with this PR we'd be switching back.

return Ok(Some(start));
} else {
return Ok(None);
for (&filter_tip, startfilter) in filters.iter() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iterates the filters in unspecified order, as HashMap does not have stable iteration order. This means if the parent-sets of the commits given in a rev overlap, behavior would be non-deterministic...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filters is already a BTreeMap for exactly that reason.
It's probably slower, but determinism is more important.
Looking at the usage we could change it to an ordered Vec though 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, maybe it should error if two of the parent-sets overlap?

@RalfJung RalfJung force-pushed the rev-ancestors branch 2 times, most recently from f26332b to 76f4488 Compare April 27, 2024 11:29
@RalfJung
Copy link
Contributor Author

RalfJung commented Apr 27, 2024

The subtree_prefix test still works, but the start test (which likely should be renamed to rev) fails. That makes sense, as that test involves a history like

  *   1d69b7d2651f744be3416f2ad526aeccefb99310
  |\  
  | * 86871b8775ad3baca86484337d1072aa1d386f7e
  | * 975d4c4975912729482cc864d321c5196a969271  <--- this commit is set in `rev`
  * | e707f76bb6a1390f28b2162da5b5eb6933009070
  |/  
  * 0b4cf6c9efbbda1eada39fa9c1d21d2525b027bb

The last commit (0b4cf6c9efbbda1eada39fa9c1d21d2525b027bb) is now an ancestor of the rev commit and so has the filter applied, which it did not before.

The test in fact exactly shows the duplication of history -- after the filter is applied, we have (before this PR):

  $ git log --graph --decorate --pretty=%H:%T refs/heads/filtered
  *   8b4097f3318cdf47e46266fc7fef5331bf189b6c:5f47d9fdffdc726bb8ebcfea67531d2574243c5d
  |\  
  | * ee931ac07e4a953d1d2e0f65968946f5c09b0f4c:5d0da4f47308da86193b53b3374f5630c5a0fa3e
  | * cc0382917c6488d69dca4d6a147d55251b06ac08:8408d8fc882cba8e945b16bc69e3b475d65ecbeb
  | * 9f0db868b59a422c114df33bc6a8b2950f80490b:a087bfbdb1a5bad499b40ccd1363d30db1313f54
  * e707f76bb6a1390f28b2162da5b5eb6933009070:5d8a699f74b48c9c595f4615dd3755244e11d176
  * 0b4cf6c9efbbda1eada39fa9c1d21d2525b027bb:3d77ff51363c9825cc2a221fc0ba5a883a1a2c72

Note there are two roots now, 0b4cf6c9efbbda1eada39fa9c1d21d2525b027bb got duplicated and now exists as 0b4cf6c9efbbda1eada39fa9c1d21d2525b027bb and 9f0db868b59a422c114df33bc6a8b2950f80490b.

So is that truly the intended behavior of rev? For my case I definitely need the variant that does not duplicate history. I am not creative enough to imagine a case where one would want the duplicated history. But obviously it is possible to make rev support both, even on a per-commit basis, if that is what you want to happen.

@christian-schilling
Copy link
Member

The reasoning of implementing :rev they way it is was the following:
The most basic principle that makes memoization in Josh work is that the filtered commit only depends on the original commit and the filter itself, without any further inputs. My thinking was, that a commit can be made a descendant of another commit (the one mentioned in :rev) without altering the commit itself. Thus the more recent commit is an additional input that impacts the filtering result and thus brakes Josh's most basic invariant.

However thinking about this again now it becomes clearer that this is not really the case:
Because the :rev filter contains the sha of the descendant, the relationship to the ancestor commit is deterministic and fixed.
So the only thing we need to make shure is that the commit mentioned in :rev actually exists at filtering time, and that if it does not the filtering will fail (same as an invalid syntax filter).

So thanks again for finding, reporting and analysing this brain twister bug :)

I'll gladly take this one over and deal with test etc.

@christian-schilling
Copy link
Member

Merged as #1329

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants