New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Livecheck: Add Crate strategy #16620
Livecheck: Add Crate strategy #16620
Conversation
a7e5600
to
8f5d453
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just two very minor comments.
URL_MATCH_REGEX = %r{ | ||
^https?://static\.crates\.io/crates | ||
/(?<package>[^/]+) # The name of the package | ||
/.+\.crate # The crate filename | ||
}ix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this prevent using the strategy for alternative/private registries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but I'm happy to modify it if third-party registries can be checked in the same way and it's possible to create a regex that will work for everything (crates.io and third-parties). I'll look into how the listed third-party registries work in a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the alternative registries RFC, Registry Web API documentation, and the code/documentation of some open source registries, it appears that the API for third-party registries isn't as complete if they're using something other than the crates.io codebase. Namely, the /api/v1/crates/{crate}/versions
endpoint isn't in the Web API documentation and the limited number of open source registries that I looked at only provided a max_version
value from /api/v1/crates/{crate}
without including the versions
array (like crates.io).
max_version
may or may not give us the highest version but I'm not sure if that factors in yanked status, how it would work if a crate publishes unstable versions like 2.0.0-beta1
alongside stable versions like 1.2.3
, etc. Checking all the versions (and feeding them to a strategy
block) seems like a more robust approach, so I don't think we should naively check max_version
just to support third-party registries (not without confirming the behavior in the aforementioned scenarios, at least).
However, Markus's comment about a generic Crate
name being more suitable if we support third-party registries makes sense to me, so I'm leaning in that direction again. Unless we come up with a clear plan of how we can support third-party registries, I think we should just put it off until we have an example we in homebrew/core to work with. This would also give the alternative registry implementations more time to mature and their API response format may be more in line with crates.io at that time.
8f5d453
to
8edbb51
Compare
next unless (match = version["num"]&.match(regex)) | ||
|
||
match[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next unless (match = version["num"]&.match(regex)) | |
match[1] | |
version["num"]&.slice(regex, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the #slice
approach is tidier when we're only working with one capture group, the next unless match...
pattern is what we use in strategy
blocks throughout core/cask.
The general idea is that in the #match
approach, we only execute the regex once regardless of the number of capture groups. Unless Ruby does some magic behind the scenes, the #slice
approach would seemingly execute the regex each time we fetch a capture group.
We're only using one capture group in the default strategy regex, so the #slice
approach would work but I favor the existing #match
approach because it's consistent with what we do in strategy
blocks.
return values unless (match = url.match(URL_MATCH_REGEX)) | ||
|
||
values[:url] = "https://crates.io/api/v1/crates/#{match[:package]}/versions" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return values unless (match = url.match(URL_MATCH_REGEX)) | |
values[:url] = "https://crates.io/api/v1/crates/#{match[:package]}/versions" | |
return values unless (package = url[URL_MATCH_REGEX, :package]) | |
values[:url] = "https://crates.io/api/v1/crates/#{package}/versions" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with the previous comment, this is a neat pattern but we use the #match
approach in other strategies' #generate_input_values
methods and I think there's value in consistency.
The #match
approach executes the regex once regardless of the number of capture groups, so there's no additional overhead if we ever update this method to work with more than one capture group (and it's easy to do so). If we use the url[URL_MATCH_REGEX, :package]
approach, we would be executing the regex each time we fetch a capture group.
The square-bracket pattern may also end up being more verbose when working with more than one capture group, since we would have to check for a good value from each capture group instead of doing one early return
if the regex doesn't match. As above, the suggestion is tidier in this specific context but I would prefer to maintain consistency across strategies (and strategy
blocks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executes the regex once regardless of the number of capture groups
I think we should just use the simpler approach depending on the use case. It is easy enough to switch to a single match
once we actually need multiple capture groups.
We discussed the idea of adding a livecheck strategy to check crate versions years ago but decided to put it off because it would have only applied to one formula at the time (and it wasn't clear that a crate was necessary in that case). We now have a few formulae that use a crate in the `stable` URL (`cargo-llvm-cov`, `pngquant`, `oakc`) and another formula with a crate resource (`deno`), so there's some value to the idea now. I established a standard approach for checking crate versions in a somewhat recent `pngquant` `livecheck` block update and this commit reworks it into a strategy, so we won't have to duplicate that `livecheck` block in these cases. With this strategy, we usually won't even need a `livecheck` block at all. Under normal circumstances, a regex and/or strategy block shouldn't be necessary but the strategy supports them when needed. The response from the crates.io API is a JSON object, so this uses `Json#versions_from_content` internally and a `strategy` block will receive the parsed `json` object and a regex (the strategy default or the regex from the `livecheck` block).
Co-authored-by: Douglas Eichelberger <dduugg@gmail.com> Co-authored-by: Markus Reiter <me@reitermark.us>
8edbb51
to
55ec4c4
Compare
Thanks again @samford! |
brew style
with your changes locally?brew typecheck
with your changes locally?brew tests
with your changes locally?We discussed the idea of adding a livecheck strategy to check crate versions years ago but decided to put it off because it would have only applied to one formula at the time (and it wasn't clear that a crate was necessary in that case). We now have a few formulae that use a crate in the
stable
URL (cargo-llvm-cov
,pngquant
,oakc
) and another formula with a crate resource (deno
), so there's some value to the idea now.I established a standard approach for checking crate versions in a somewhat recent
pngquant
livecheck
block update (Homebrew/homebrew-core#156647) and this PR reworks it into a strategy, so we won't have to duplicate thatlivecheck
block in these cases. With this strategy, we usually won't even need alivecheck
block at all.Under normal circumstances, a regex and/or strategy block shouldn't be necessary but the strategy supports them when needed. The response from the crates.io API is a JSON object, so this uses
Json#versions_from_content
internally and astrategy
block will receive the parsedjson
object and a regex (the strategy default or the regex from thelivecheck
block).Past that, I think the only open question I had was whether we prefer the
Crate
name (referring to the type of project, what we're checking) orCrates
(referring to the crates.io name). I'm on the fence but leaning more towardsCrates
, so I wouldn't mind some other perspectives.