Stop filtering formulae with unchanged versions from the update report #13243
`brew update` fast-forwards each tap from one commit to another. `update-report` lists all the formulae that were modified in that commit range. The clause this commit removes was filtering out modified formulae _whose `pkg_version` had not changed._

The outcome of this filtering:

- If you rewind homebrew/core **2,000 commits** (or 2 weeks), **819** formulae were modified. This clause would have taken **29.0 seconds** to filter out **153** (18.7%) of them.
- If you rewind homebrew/core **20,000 commits** (or 5 months), 3532 formulae were modified. This clause would have taken **119.4 seconds** to filter out **1121** (31.7%) of them.

Data
=========

| Homebrew Branch | Command | Commits Rewound | Time Elapsed | Updated Formulae |
| --- | --- | --- | --- | --- |
| `master` | `git -C "$(brew --repo homebrew/core)" reset --hard aa1b3d5; time sh -c "brew update &> output-2k.txt"` | 2,000 | 35.6s | 666 |
| `master` | `git -C "$(brew --repo homebrew/core)" reset --hard 11e6919; time sh -c "brew update &> output-20k.txt"` | 20,000 | 129.8s | 2411 |
| _this one_ | `git -C "$(brew --repo homebrew/core)" reset --hard aa1b3d5; time sh -c "brew update &> output-2k.txt"` | 2,000 | 6.6s | 819 |
| _this one_ | `git -C "$(brew --repo homebrew/core)" reset --hard 11e6919; time sh -c "brew update &> output-20k.txt"` | 20,000 | 10.4s | 3532 |

Root Cause
=========

This operation was so expensive because it shelled out to `git cat-file` for each formula that had been touched in the commit window.

Fixes Homebrew#13224
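The headline figures (29.0 seconds, 153 formulae, 18.7%) follow from the Data table by subtraction and division. A small sketch that re-derives them, assuming the only difference between the two branches is the removed clause; `summarize` is a hypothetical helper name, not part of Homebrew:

```shell
# Re-derive the summary figures from the Data table.
# Arguments: label, master seconds, branch seconds,
#            formulae reported on master, formulae reported on this branch.
summarize() {
  awk -v label="$1" -v mt="$2" -v bt="$3" -v mf="$4" -v bf="$5" 'BEGIN {
    filtered = bf - mf   # formulae the clause was hiding
    printf "%s: filter cost %.1fs, filtered %d of %d (%.1f%%)\n",
           label, mt - bt, filtered, bf, 100 * filtered / bf
  }'
}

summarize 2k 35.6 6.6 666 819
# 2k: filter cost 29.0s, filtered 153 of 819 (18.7%)
summarize 20k 129.8 10.4 2411 3532
# 20k: filter cost 119.4s, filtered 1121 of 3532 (31.7%)
```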
Thanks for the PR! This is good with me but I'd like to see at least one other @Homebrew/brew maintainer agree that this is the right move.
Hey! I think this PR is a better approach (it's simpler, less brittle, and increases consistency between |
Not a fan of showing a formula as updated when the |
As a maintainer who frequently makes changes to formulae/casks without modifying the version, I find the existing filtering behavior valuable. I agree with Carlo that displaying a formula as updated when the version hasn't changed isn't likely to be beneficial to most users. For comparison, the same filtering doesn't apply to casks (i.e., any change leads to a cask showing as updated). I update some installed casks using Ignoring performance for a moment, I think filtering is the better default for most users (i.e., don't surface non-actionable changes). If there's an alternative to this PR that will improve performance while maintaining the filtering behavior (e.g., #13244), I'm all for it. However, if we go down a path where we remove this filtering behavior in the name of performance, it sounds like there are some of us who may prefer to continue using it (e.g., opting in using an environment variable?) despite any performance hit.
Given feedback above: I think we shouldn't merge this as-is. Some thoughts:
Oh, +1! I think that's a better default too! It does not really help the performance here because it's applied as a filter in Could we rearrange that? Or does anything else consume the report object ( |
Yeh, that seems like a good idea. Makes sense to filter the expensive operation before rather than after, right? For your personal use-case: would that address things sufficiently, assuming the speedup is large?
@carlocab @samford Sorry to disagree when I've asked for your feedback but something just occurred to me: do you still think this behaviour is useful and/or necessary given that it only applies to manual Much like |
@RandomDSdevel @neersighted can you explain your 👎🏻, please?
Despite the fact that the default settings/upstream's presets do not require the user to manually run With this set of PRs, you can either trade moderate code complexity for significantly more idiomatic usage of the git plumbing, or trade more verbose/less focused output for a significant reduction in code complexity and expensive operations. Either way, @boblail has two very solid approaches to a 'free' performance improvement depending on what complexity/output compromises the project wants to make -- I don't think it's worth rejecting them because the suggested way of using Homebrew doesn't regularly traverse this codepath. |
I personally consider the existing behaviour of That said, with my maintainer hat on: if it really is the case that most users don't do |
I probably shouldn't have said "most users", as I wouldn't be surprised if With my previous comments, I was thinking about any users like myself who manually run That said, my opinion isn't very strong and I can probably adapt my workflow to any changes here. If, in the end, the performance gains would benefit most users (and/or increasing consistency of However, if we can find performance gains that don't require removing the filtering behavior, I would prefer that, of course. As mentioned above, I would prefer for this filtering behavior to be extended to casks rather than removed but I'm biased. |
@neersighted You misunderstand me here: I'm suggesting that the slow code, instead of being optimised, can be avoided being called at all.
@carlocab @samford I guess my question is: why do you not care about this output being missing when |
It's situational for me. There are times when I care about In practical terms, removing the filtering behavior would mean A) I can't assume that an updated formula is a version change and B) there would simply be more output to look through (PRs that bulk modify formulae/casks are a problem for unfiltered output). From my perspective, the change would save my computer some time but would increase the amount of time I have to spend thinking, which isn't a great tradeoff in my case. To be clear, I'm not against the idea of modifying the default behavior if it benefits most users. I just think it would be helpful to have some way to continue to opt-in to the filtering behavior for those who want it (e.g., a |
I do have Despite that, I do still find myself running |
Yeh, this is what I'm leaning towards. I also think we should combine this change with listing only updates to installed formulae by default. This was changed before and reverted after backlash, but I think we/I should do so again, with an opt-out this time.
That sounds reasonable to me, so long as affected users have ways to control related behavior (i.e., one environment variable for only displaying version updates and another for displaying updates for all formulae (not just installed)). It may trip up some users (as it goes against their existing mental model) but it should only be a temporary inconvenience if there's a way for them to get back to their desired configuration.
Sounds good to me.
I think #13299 may remove the need for this but interested in thoughts.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Passing on this given recent |
Alternatives
- `brew update` by batching `git cat-file` operations #13244
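To illustrate the difference between the per-formula calls named under Root Cause and the batching alternative, here is a rough sketch. It is not the code from either PR: `object_specs`, `cat_each`, and `cat_batched` are hypothetical helper names, and the only git features assumed are `git cat-file -p <rev>:<path>` and `git cat-file --batch`, which reads object names from standard input.

```shell
# Hypothetical helper: emit one "<rev>:<path>" object name per formula file.
object_specs() {
  rev="$1"
  shift
  for file in "$@"; do
    printf '%s:%s\n' "$rev" "$file"
  done
}

# One git process per formula: the expensive pattern.
cat_each() {
  repo="$1"
  rev="$2"
  shift 2
  for file in "$@"; do
    git -C "$repo" cat-file -p "$rev:$file"
  done
}

# One git process for all formulae: object names are fed on stdin.
cat_batched() {
  repo="$1"
  rev="$2"
  shift 2
  object_specs "$rev" "$@" | git -C "$repo" cat-file --batch
}
```

A call might look like `cat_batched "$(brew --repo homebrew/core)" aa1b3d5 Formula/a.rb Formula/b.rb`, where the formula paths are placeholders. Note that `--batch` prefixes each object with a `<oid> <type> <size>` header line, so the caller has to split the stream itself, which is part of the added complexity discussed in the thread.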