cask: always return short cask tokens from core cask tap #16867

apainintheneck · 2024-03-10T01:09:14Z

Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same change?
Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your changes? Here's an example.
Have you successfully run brew style with your changes locally?
Have you successfully run brew typecheck with your changes locally?
Have you successfully run brew tests with your changes locally?

For some reason Tap#formula_names returns the short name while Tap#cask_tokens returns the full name for some reason. I'd overlooked that when generating the internal JSON. ~~This just gets the short name for casks from the casks themselves.~~

brew(main):004:0> ENV["HOMEBREW_NO_INSTALL_FROM_API"] = "1"
=> "1"
brew(main):005:0> CoreCaskTap.instance.cask_tokens.sample(5)
=> ["homebrew/cask/ssh-config-editor", "homebrew/cask/deskpad", "homebrew/cask/zoom-for-it-admins", "homebrew/cask/ukelele", "homebrew/cask/baiduinput"]
brew(main):006:0> CoreTap.instance.formula_names.sample(5)
=> ["lilv", "libccd", "chmlib", "mpfrcx", "cgit"]
brew(main):007:0> :deskpad.c.token
=> "deskpad"

Edit: Based on code review we've decided to change the behavior of CoreCaskTap.instance.cask_tokens to always return the short tokens. This matches the behavior when the core cask tap is loaded from the API. The other bug with brew untap is being handled in a separate PR: #16875

apainintheneck · 2024-03-10T01:10:57Z

This is a follow-up to #16798 that I discovered while trying to write the loader code.

apainintheneck · 2024-03-10T01:14:06Z

It's also interesting how the behavior is different when using the API. Tap#cask_tokens returns short names in that case.

brew(main):008:0> ENV.delete("HOMEBREW_NO_INSTALL_FROM_API")
=> "1"
brew(main):009:0> CoreCaskTap.instance.cask_tokens.sample(5)
=> ["mediainfo", "movist-pro", "fluor", "opendnsupdater", "wavebox"]

ZhongRuoyu · 2024-03-10T01:34:58Z

Would it be reasonable to tweak CoreCaskTap#cask_tokens instead? Since its result is different with and without API.

ZhongRuoyu · 2024-03-10T01:46:03Z

I also noticed that CoreCaskTap.instance.cask_tokens is used in cmd/untap.rb for checking installed casks before untapping, but its doesn't seem to do the intended work as I could still untap the cask tap:

$ export HOMEBREW_NO_INSTALL_FROM_API=1
$ brew developer off
$ brew untap homebrew/cask
Untapping homebrew/cask...
Untapped 4397 casks (6,488 files, 423.9MB).

But with the following change (the same override exists in CoreTap):

diff --git a/Library/Homebrew/tap.rb b/Library/Homebrew/tap.rb
index 8e779c6ec..68058e7f5 100644
--- a/Library/Homebrew/tap.rb
+++ b/Library/Homebrew/tap.rb
@@ -1337,6 +1337,12 @@ class CoreCaskTap < AbstractCoreTap
     Homebrew::API::Cask.all_casks.keys
   end
 
+  # @private
+  sig { params(file: Pathname).returns(String) }
+  def formula_file_to_name(file)
+    file.basename(".rb").to_s
+  end
+
   # @private
   sig { override.returns(T::Hash[String, Pathname]) }
   def cask_files_by_name

(After I have homebrew/cask back again:)

$ brew untap homebrew/cask
Error: Refusing to untap homebrew/cask because it contains the following installed formulae or casks:
[list of casks installed]

apainintheneck · 2024-03-10T02:34:20Z

Yeah, that's probably a better approach especially since the behavior in brew untap is clearly a bug. I'll check the other places it's used like in the tap auditor and descriptions code to make sure nothing else needs to be changed.

This seems to be a bug with how we handle name shortening for the core cask tap. The core tap always returns short formula names and returning long names from the core cask tap when not using the API leads to unexpected behavior. Specifically this can trick the `brew untap` command into thinking that there aren't any installed casks in the core cask tap and that it can be removed even when that is not the case. One risk here is that the full names were used when caching descriptions so descriptions could be out of date for people in the short term though hopefully that's not the end of the world.

It was possible for the cask tokens to be included twice in the cask_tokens array. This was cancelled out by the fact that we |= the arrays together but still it was unnecessary work that is best avoided and makes the code harder to reason about. This is simpler.

apainintheneck · 2024-03-10T03:47:44Z

Now that I look at it a bit more I think that the bug in brew untap is actually unrelated to the cask token being the full name. This wouldn't have worked for other cask taps either.

diff --git a/Library/Homebrew/cmd/untap.rb b/Library/Homebrew/cmd/untap.rb
index 90696f7f55..d087c1cc17 100644
--- a/Library/Homebrew/cmd/untap.rb
+++ b/Library/Homebrew/cmd/untap.rb
@@ -52,7 +52,7 @@ module Homebrew
           next unless installed_cask_tokens.include?(cask_token)
 
           cask = begin
-            Cask::CaskLoader.load("#{tap.name}/#{cask_token}")
+            Cask::CaskLoader.load(cask_token)
           rescue
             # Don't blow up because of a single unavailable cask.
             next

We also should probably warn here at the very least because we've seemingly been swallowing this error for months now.

apainintheneck · 2024-03-10T05:13:15Z

brew/Library/Homebrew/cmd/untap.rb

Line 37 in cd1f040

Formulary.factory("#{tap.name}/#{formula_name}")

It looks like it's broken for formula too for everything except the core tap and it seems like it was changed in December 2023.

cmd/untap: fix untapping syntax failure. #16330

Bo98 · 2024-03-10T05:22:37Z

Formula.installed_formula_names doesn't return full names for technical reasons, so that patch looks OK but you will need to change the installed check to consider formula_name.split("/").last

apainintheneck · 2024-03-10T05:48:13Z

@Bo98 You beat me to it. I think that brew untapped is broken beyond what I originally thought.

brew/Library/Homebrew/cmd/untap.rb

Lines 49 to 52 in cd1f040

    
           installed_tap_casks = tap.cask_tokens.filter_map do |cask_token| 
        
             # initialise lazily in case there's no casks in this tap 
        
             installed_cask_tokens ||= Set.new(Cask::Caskroom.tokens) 
        
             next unless installed_cask_tokens.include?(cask_token)

brew/Library/Homebrew/cmd/untap.rb

Lines 31 to 34 in cd1f040

    
           installed_tap_formulae = tap.formula_names.filter_map do |formula_name| 
        
             # initialise lazily in case there's no formulae in this tap 
        
             installed_formula_names ||= Set.new(Formula.installed_formula_names) 
        
             next unless installed_formula_names.include?(formula_name)

Both of the open rescue blocks for loading casks and formulae in the loop are unreachable because we're comparing full cask or formula names to Formula.installed_formula_names and Cask::Caskroom.tokens which are just arrays of the short cask and formula names. IMO it basically needs to be entirely rewritten.

Is it fine if that gets handled in a follow-up PR since it doesn't seem to be related to what I'm trying to do here?

Bo98 · 2024-03-10T06:07:34Z

IMO it basically needs to be entirely rewritten.

Not really given we already check the tap part here:

brew/Library/Homebrew/cmd/untap.rb

Lines 43 to 45 in cd1f040

    
           # Can't use Formula#any_version_installed? because it doesn't consider 
        
           # taps correctly. 
        
           formula if formula.installed_kegs.any? { |keg| keg.tab.tap == tap }

So the logic seems right, just not handling formula_name correctly.

We don't for casks purely because that information is simply not available at all. It is impossible to check which tap a cask was installed from, unless it's a JSON file but that's core-only anyway. So we just assume token shadowing doesn't happen at the moment.

Is it fine if that gets handled in a follow-up PR since it doesn't seem to be related to what I'm trying to do here?

Sure. This PR already improves things rather than regresses.

apainintheneck · 2024-03-10T06:41:23Z

That should still be unreachable for all non-core taps since it will never get past the next unless installed_formula_names.include?(formula_name) check in the loop. installed_formula_names is an array of short names and formula_name will always be a long name. Note that the core tap is never evaluated in this loop to begin with.

I'm also not really sure why we check every formula and cask in each named tap to see if it's installed. It seems like it'd be simpler to check every installed formula and cask to see if it's included in one of the named taps. But I guess that's an implementation detail.

Bo98 · 2024-03-10T07:09:08Z

That should still be unreachable for all non-core taps since it will never get past the next unless installed_formula_names.include?(formula_name) check in the loop. installed_formula_namesis an array of short names and formula_name will always be a long name.

Not if you do

next unless installed_formula_names.include?(formula_name.split("/").last)

In the end, installed_formula_names.include is kinda useless as it does the "real" install check later, but does mean not having to load formulae unnecessarily which is ideal.

apainintheneck · 2024-03-10T07:37:54Z

I'm fine with keeping the current logic intact and making the small changes mentioned here to get it working again. I do want to refactor it into a method or module so that I can add some regression tests though. I'll take a stab at that tomorrow.

reitermarkus · 2024-03-11T16:02:58Z

I'm fine with keeping the current logic intact

To me it makes more sense to eventually make all these methods only return short names/tokens instead of doing .split("/").last almost everywhere we use them.

That said, I think it's fine to fix the logic as-is for now.

apainintheneck · 2024-03-12T04:21:27Z

To me it makes more sense to eventually make all these methods only return short names/tokens instead of doing .split("/").last almost everywhere we use them.

That said, I think it's fine to fix the logic as-is for now.

@reitermarkus That sounds like an interesting refactoring opportunity. Do you want to jump on that or should I create an issue for it?

Bo98 · 2024-03-12T12:20:31Z

I agree with the stance that short tokens make sense, but the problem is that is definitely a backwards incompatible change to a public API and we've already disrupted third party code enough. It's also not a very obvious break as well if they pass it immediately to say Formula#[].

I'm less concerned with cask-only API changes as there's significantly less third-party taps around that, so in the scenario where we have to break one then we usually side on casks.

apainintheneck · 2024-03-13T03:56:31Z

Since the untap command bugs will be handled in a separate PR now, is this fine as is or are their more changes that need to be made?

Bo98 · 2024-03-13T04:02:35Z

Library/Homebrew/search.rb

+      cask_tokens = Tap.each_with_object([]) do |tap, array|
+        # We can exclude the core cask tap because `CoreCaskTap#cask_tokens` returns short names by default.
+        if tap.official? && !tap.core_cask_tap?
+          tap.cask_tokens.each { |token| array << token.sub(%r{^homebrew/cask.*/}, "") }


I know it was there before, but what's the point of this sub?

I'm guessing that we want to only search by cask token for internal taps so that brew search version doesn't return all cask-versions casks.

Feels like this is kinda begging for something like Formula#name vs. Formula#full_name

MikeMcQuaid

Thanks @apainintheneck!

apainintheneck force-pushed the stop-using-full-name-as-key-in-cask-json-v3 branch from 2f1e1cd to 2fdb268 Compare March 10, 2024 02:46

apainintheneck marked this pull request as draft March 10, 2024 02:49

apainintheneck force-pushed the stop-using-full-name-as-key-in-cask-json-v3 branch from 2fdb268 to 6e0e78c Compare March 10, 2024 03:03

apainintheneck mentioned this pull request Mar 11, 2024

Fix untap cmd bugs #16875

Merged

7 tasks

apainintheneck changed the title ~~internal cask json v3: stop using full name as keys in hash~~ cask: cask tokens should always be short form core cask tap Mar 12, 2024

apainintheneck changed the title ~~cask: cask tokens should always be short form core cask tap~~ cask: always return short cask tokens from core cask tap Mar 12, 2024

apainintheneck marked this pull request as ready for review March 12, 2024 04:20

Bo98 reviewed Mar 13, 2024

View reviewed changes

Bo98 approved these changes Mar 13, 2024

View reviewed changes

apainintheneck merged commit 58b84c3 into master Mar 13, 2024
26 checks passed

apainintheneck deleted the stop-using-full-name-as-key-in-cask-json-v3 branch March 13, 2024 05:51

MikeMcQuaid reviewed Mar 13, 2024

View reviewed changes

github-actions bot added the outdated PR was locked due to age label Apr 13, 2024

github-actions bot locked as resolved and limited conversation to collaborators Apr 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cask: always return short cask tokens from core cask tap #16867

cask: always return short cask tokens from core cask tap #16867

apainintheneck commented Mar 10, 2024 •

edited

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

ZhongRuoyu commented Mar 10, 2024

ZhongRuoyu commented Mar 10, 2024 •

edited

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024 •

edited

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

reitermarkus commented Mar 11, 2024

apainintheneck commented Mar 12, 2024

Bo98 commented Mar 12, 2024 •

edited

apainintheneck commented Mar 13, 2024

Bo98 Mar 13, 2024

apainintheneck Mar 13, 2024

MikeMcQuaid Mar 13, 2024

MikeMcQuaid left a comment

cask: always return short cask tokens from core cask tap #16867

cask: always return short cask tokens from core cask tap #16867

Conversation

apainintheneck commented Mar 10, 2024 • edited

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

ZhongRuoyu commented Mar 10, 2024

ZhongRuoyu commented Mar 10, 2024 • edited

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024 • edited

apainintheneck commented Mar 10, 2024

Bo98 commented Mar 10, 2024

apainintheneck commented Mar 10, 2024

reitermarkus commented Mar 11, 2024

apainintheneck commented Mar 12, 2024

Bo98 commented Mar 12, 2024 • edited

apainintheneck commented Mar 13, 2024

Bo98 Mar 13, 2024

Choose a reason for hiding this comment

apainintheneck Mar 13, 2024

Choose a reason for hiding this comment

MikeMcQuaid Mar 13, 2024

Choose a reason for hiding this comment

MikeMcQuaid left a comment

Choose a reason for hiding this comment

apainintheneck commented Mar 10, 2024 •

edited

ZhongRuoyu commented Mar 10, 2024 •

edited

Bo98 commented Mar 10, 2024 •

edited

Bo98 commented Mar 12, 2024 •

edited