resource_auditor: audit PyPI resources that exist as formulae#21731
resource_auditor: audit PyPI resources that exist as formulae#21731
Conversation
There was a problem hiding this comment.
Pull request overview
Adds/renames a PyPI resource audit in ResourceAuditor and integrates it into FormulaAuditor to (a) validate resource names against PyPI filenames and (b) flag certain PyPI resources that should be replaced by Homebrew dependencies, with tap-level exceptions.
Changes:
- Rename
audit_resource_name_matches_pypi_package_name_in_urltoaudit_pypi_resourcesand extend it to flag dependency-replacement candidates. - Add
pypi_resources_allowlisttap audit exception handling to skip the PyPI resource audit for specific resources. - Extend
formula_auditor_specwith new examples covering dependency-replacement reporting, allowlist behavior, and skipping top-level formula PyPI URLs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Library/Homebrew/test/formula_auditor_spec.rb | Adds spec coverage for dependency-replacement auditing and allowlist behavior. |
| Library/Homebrew/resource_auditor.rb | Implements audit_pypi_resources, adds a dependency candidate set, and emits a new audit message. |
| Library/Homebrew/formula_auditor.rb | Wires tap exceptions into resource auditing by conditionally extending the except list. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
27650b5 to
c445820
Compare
|
|
Some widely-used PyPI packages are available in Homebrew as dependencies for other Python-based formulae. We encourage their use either because they take a lot of time to build (f.e. `pydantic` or `scipy`) or we don't want to do hundreds of revision bumps when new security updates come out (f.e. `cryptography` or `certifi`). The problem I see with new contributors is that they don't know it. A lot of the time, they read the cookbook, create a Python-based formula, and it passes audit and tests. They did nothing wrong, but a maintainer still have to point out, "Hey, numpy takes a lot of time to build, and it exists as a formula, let's use it instead". I'd rather add an audit for such cases and make exceptions for formulae where it cannot be used I'd also take a look at [Python for Formula Authors](https://docs.brew.sh/Python-for-Formula-Authors) but it should be revised in another PR Signed-off-by: botantony <antonsm21@gmail.com>
c445820 to
67ccd0d
Compare
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Thanks! Looks good so far, a few suggestions.
| allowed_pypi_packages = formula.tap&.audit_exception(:pypi_resources_allowlist, formula.name) | ||
| allowed_pypi_packages = case allowed_pypi_packages | ||
| when String | ||
| allowed_pypi_packages.split(/\s+/i).to_set | ||
| else | ||
| Set.new | ||
| end | ||
|
|
There was a problem hiding this comment.
the case here is a bit weird; an if/else here would make more sense and can use .blank? or .presence to avoid calling on an empty string. Also, I'd just take the nil/empty array/full array and use Set.new for all to avoid branching.
| except = if allowed_pypi_packages.include?(resource.name) | ||
| @except.to_a + ["pypi_resources"] | ||
| else | ||
| @except | ||
| end |
There was a problem hiding this comment.
| except = if allowed_pypi_packages.include?(resource.name) | |
| @except.to_a + ["pypi_resources"] | |
| else | |
| @except | |
| end | |
| except = @except | |
| except = [*Array(except), "pypi_resources"] if allowed_pypi_packages.include?(resource.name) |
or similar. can simplify more if @except is already an array.
| class ResourceAuditor | ||
| include Utils::Curl | ||
|
|
||
| DEPENDENCY_PACKAGES = Set.new(%w[certifi cffi cryptography numpy pillow pydantic rpds-py scipy torch]).freeze |
There was a problem hiding this comment.
We should avoid a hardcoded list here. A new JSON file in homebrew/core would be better.
| return if DEPENDENCY_PACKAGES.exclude?(pypi_package_name.to_s.downcase) | ||
|
|
||
| problem "`resource` name should be '#{pypi_package_name}' to match the PyPI package name" | ||
| problem "PyPI package should be replaced with Homebrew dependency and excluded using `pypi_package` method" |
There was a problem hiding this comment.
Would be good to spell out the specific depends_on line
brew lgtm(style, typechecking and tests) with your changes locally?Some widely-used PyPI packages are available in Homebrew as dependencies for other Python-based formulae. We encourage their use either because they take a lot of time to build (f.e.
pydanticorscipy) or we don't want to do hundreds of revision bumps when new security updates come out (f.e.cryptographyorcertifi). The problem I see with new contributors is that they don't know it. A lot of the time, they read the cookbook, create a Python-based formula, and it passes audit and tests. They did nothing wrong, but a maintainer still have to point out, "Hey, numpy takes a lot of time to build, and it exists as a formula, let's use it instead". I'd rather add an audit for such cases and make exceptions for formulae where it cannot be usedI'd also take a look at Python for Formula Authors but it should be revised in another PR