Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glib and friends: migrate to python@3.10 #87277

Closed
wants to merge 131 commits into from

Conversation

carlocab
Copy link
Member

Let's hope I don't need to include more in this PR...

@carlocab carlocab added CI-long-timeout Use longer GitHub Actions CI timeout. CI-linux-self-hosted Build on Linux self-hosted runner CI-no-fail-fast Continue CI tests despite failing GitHub Actions matrix builds. labels Oct 15, 2021
@carlocab carlocab mentioned this pull request Oct 15, 2021
6 tasks
@carlocab
Copy link
Member Author

Can you explain if/why/how they are only build or runtime?

I'm not sure I understand this question. But, here's a rough classification scheme:

  1. Does the build explicitly refer to them as Python bindings or otherwise look optional? (Here's an example.) If so, this can be build/test-only, as long as there is no linkage with Python. (There probably isn't; we actually have an audit that enforces this.)
  2. What do the basic examples in the documentation and the test block look like? Are they C/C++/not-Python code? Will these work if Python is not installed? If yes, consider making these build/test-only.

What happens if they aren't single formula PRs?

Do you mean what happens if they are single-formula PRs?

If they aren't single-formula PRs, they look like PRs like this one. I'm not sure this can get to a state where it's reasonable to merge this (at least, not while Python 3.9 is maintained):

  1. I can't drop any of these formulae from this PR. (See below.)
  2. If I keep all of them, it's very unlikely that every test (install, audit, livecheck, etc.) will pass. This means dependents won't get tested, so I have no idea how many of them are broken. (I estimate over a hundred broken formulae on Linux from when I merged a PR that didn't run the dependent tests, given there are about 70 linkage failures each from here and here -- there may be some overlap.)

If they are single-formula PRs that keep Python as a runtime dependency, the tap-syntax job will immediately fail with an error like

watchman:
  * watchman contains conflicting version recursive dependencies:
      python@3.9, python@3.10
    View these with `brew deps --tree watchman`.

(This is what I was referring to in point 1 above. "any of these formulae" could be a slight exaggeration -- I may have added some here by mistake. But nearly all of these formulae are connected via the dependency graph.)

We can add caveats recommending a brew install python@3.x to use the Python bindings if we're concerned about UX.

Feels pretty weird to do this given we're a package manager.

I think it's also weird to always install something even when a user isn't going to use it (even if just indirectly), so I think we have to accept some amount of weirdness one way or another. (Unless you don't think this is weird. Maybe I can convince you that it is?)

We're seriously degrading user experience, in order to work around limitations of our process.

I don't think it's clear that this degrades user experience, and if it does, it's not a serious degradation. Here are ways it actually improves it:

  • we avoid installing software users don't use/want
  • we minimise cascades of formulae getting upgraded when brew traverses a formula's runtime dependency graph

The second point above is an especially commonly cited pain point among users.

I would suggest that we'll always have some dependencies that have huge trees, and if it's not python it will be openssl/icu4c/readline/whatever. It feels like it would be better to actually have a way to deal with these directly.

This is true, but there's an important difference: formulae typically need and link with openssl/icu4c/readline. To the extent that they are more like optional Python bindings rather than linked libraries like openssl, then I would say we should treat them the same way.

I agree that it would be nice to have a way to deal with these more directly too, though.

(And yes, it was a bit of a pain, but it worked until recently, didn't it?)

I think "until recently" here is important. Python's dependency tree grows much faster than our CI capacity. When Python3.9 was added to Homebrew/core, 435 formulae formulae depended on Python 3.8 (74 of them build- or test-only). When Python3.10 was added, 587 formulae depended on Python3.9 (123 of them build- or test-only). (For reference, when python@3.8 was added, 338 formulae depended on python, 52 of them build- or test-only.)

@MikeMcQuaid
Copy link
Member

  • Does the build explicitly refer to them as Python bindings or otherwise look optional? (Here's an example.) If so, this can be build/test-only, as long as there is no linkage with Python. (There probably isn't; we actually have an audit that enforces this.)

Can these cases build with and/or be used at runtime with the system Python 3? That's not to say they should be but I think if they can be I'm a lot more 👍🏻 on making these :build.

2. If I keep all of them, it's very unlikely that every test (install, audit, livecheck, etc.) will pass. This means dependents won't get tested, so I have no idea how many of them are broken. (I estimate over a hundred broken formulae on Linux from when I merged a PR that didn't run the dependent tests, given there are about 70 linkage failures each from here and here -- there may be some overlap.)

This is a wider problem that needs fixed. I've moaned on about this for years but we really need to get out of the habit of accepting flaky audits/tests and just delete them instead (or, if we have time [which we don't seem to] replace them).

I think it's also weird to always install something even when a user isn't going to use it (even if just indirectly), so I think we have to accept some amount of weirdness one way or another. (Unless you don't think this is weird. Maybe I can convince you that it is?)

I don't think this is weird if it's needed for any runtime functionality of what we build.

  • we minimise cascades of formulae getting upgraded when brew traverses a formula's runtime dependency graph

The second point above is an especially commonly cited pain point among users.

This all comes down to my first question, to me: can you actually use these Python bindings without installing our Python?

I would suggest that we'll always have some dependencies that have huge trees, and if it's not python it will be openssl/icu4c/readline/whatever. It feels like it would be better to actually have a way to deal with these directly.

This is true, but there's an important difference: formulae typically need and link with openssl/icu4c/readline. To the extent that they are more like optional Python bindings rather than linked libraries like openssl, then I would say we should treat them the same way.

I second this. We need to find better solutions for the bigger problems here rather than just bypassing them in one-off ways like this (that may well come back to bite us).

I think "until recently" here is important. Python's dependency tree grows much faster than our CI capacity. When Python3.9 was added to Homebrew/core, 435 formulae formulae depended on Python 3.8 (74 of them build- or test-only). When Python3.10 was added, 587 formulae depended on Python3.9 (123 of them build- or test-only). (For reference, when python@3.8 was added, 338 formulae depended on python, 52 of them build- or test-only.)

Some of this seems like dependency bloat to me. We keep adding more and more dependencies to more and more formulae for consistency etc. even when we could be using a system SSL/Python/etc. Again, though, I agree that this is a problem that we need to solve.

@carlocab
Copy link
Member Author

Can these cases build with and/or be used at runtime with the system Python 3? That's not to say they should be but I think if they can be I'm a lot more 👍🏻 on making these :build.

This all comes down to my first question, to me: can you actually use these Python bindings without installing our Python?

They can be built with system Python3 -- if the system provides one. Mojave and earlier didn't (which might be why all these formulae suddenly picked up dependencies on our Python3).

Barring weird exceptions, these cases can also be used with any Python3 (system/Python.org/anaconda/etc) of the same minor version as the one that was used for the build. (This is actually what the Python audit I referenced above is meant to check for.) You could probably even get away with using them with a minor version that's one-off from the one that was used for the build, but that's asking for trouble.

This is a wider problem that needs fixed. I've moaned on about this for years but we really need to get out of the habit of accepting flaky audits/tests and just delete them instead (or, if we have time [which we don't seem to] replace them).

I know about your opinion regarding flaky tests. I share your disdain for them, but I'm reluctant to just delete them as they can provide useful information. Maybe a label that allows us to skip non-essential tests when needed might be something we'd find useful? By "essential" I have in mind build, test, linkage, and offline audits, but I'm open to discussion about what this means exactly. (Failing this, I am open to entertaining just deleting these tests.)

I don't think this is weird if it's needed for any runtime functionality of what we build.

I don't think it's correct to view all runtime functionality as equal. We already try to design brew so that it's most useful for most users. I suggest that we maintain that mindset when it comes to runtime functionality and only actually install things that most users need at runtime.

This is true, but there's an important difference: formulae typically need and link with openssl/icu4c/readline. To the extent that they are more like optional Python bindings rather than linked libraries like openssl, then I would say we should treat them the same way.

I second this. We need to find better solutions for the bigger problems here rather than just bypassing them in one-off ways like this (that may well come back to bite us).

I'm a little surprised you agree with me here, since that's not the impression I got from your earlier responses. To make sure we're on the same page, what I meant was that if a formula has additional features that require extra dependencies, we should try to make those dependencies build- and test-only where possible, if we have to build them at all. I'd make exceptions for additional features that actually are of interest to most users of a formula: we should build those, and require the necessary dependencies at runtime.

Some of this seems like dependency bloat to me. We keep adding more and more dependencies to more and more formulae for consistency etc. even when we could be using a system SSL/Python/etc. Again, though, I agree that this is a problem that we need to solve.

Agreed. Dependency bloat is exactly the problem I'm trying to help solve here. I'm not saying that this should be the only thing we do, but I do think that this is one thing that can help and is a good idea.

@MikeMcQuaid
Copy link
Member

They can be built with system Python3 -- if the system provides one. Mojave and earlier didn't (which might be why all these formulae suddenly picked up dependencies on our Python3).

Barring weird exceptions, these cases can also be used with any Python3 (system/Python.org/anaconda/etc) of the same minor version as the one that was used for the build. (This is actually what the Python audit I referenced above is meant to check for.) You could probably even get away with using them with a minor version that's one-off from the one that was used for the build, but that's asking for trouble.

Gotcha. In that case I'm 👍🏻 on having it as a :build dependency on Catalina and later.

I think in general we should be more willing to use system versions of things, particularly on newer OSs.

I know about your opinion regarding flaky tests. I share your disdain for them, but I'm reluctant to just delete them as they can provide useful information. Maybe a label that allows us to skip non-essential tests when needed might be something we'd find useful? By "essential" I have in mind build, test, linkage, and offline audits, but I'm open to discussion about what this means exactly. (Failing this, I am open to entertaining just deleting these tests.)

Perhaps we should just skip these tests that can be flaky when doing dependent testing. If we're actually modifying a formula, though, it's a good time to actually fix the flaky issues. I agree improving the test is the best case for this but removing it is better than having a test that makes a multi-hour build turn red 25% of the time.

I don't think it's correct to view all runtime functionality as equal. We already try to design brew so that it's most useful for most users. I suggest that we maintain that mindset when it comes to runtime functionality and only actually install things that most users need at runtime.

I agree but rather than shipping (on Mojave and below) broken functionality: we should just not ship it at all. Better to skip Python bindings entirely than ship them in a way that they don't work without new, manual, user intervention.

I'm a little surprised you agree with me here, since that's not the impression I got from your earlier responses. To make sure we're on the same page, what I meant was that if a formula has additional features that require extra dependencies, we should try to make those dependencies build- and test-only where possible, if we have to build them at all. I'd make exceptions for additional features that actually are of interest to most users of a formula: we should build those, and require the necessary dependencies at runtime.

I agree with making things build or test only when they actually are i.e. there's no situation in which they are required at runtime. If they are useful/behaviour enhancing at runtime: cool, build or test it is. In the case of bindings (on Mojave and below): things are actually broken at runtime without them.

Agreed. Dependency bloat is exactly the problem I'm trying to help solve here. I'm not saying that this should be the only thing we do, but I do think that this is one thing that can help and is a good idea.

In summary: on Mojave and below I disagree, on Catalina and above I agree. Given we're no longer bottling for Mojave and below: I'm fine with whatever in this case. I do think it's worth talking about the general case, though.

@SMillerDev
Copy link
Member

I think in general we should be more willing to use system versions of things, particularly on newer OSs.

I think this will get us in trouble with the Linux side of the equation though. There we don't make any assumptions about things being installed. And however many dependencies we drop on macOS in favor of the system, they'll stay and block on Linux.

@MikeMcQuaid
Copy link
Member

I think this will get us in trouble with the Linux side of the equation though. There we don't make any assumptions about things being installed.

Given appropriate uses_from_macos usage (and scoped to macOS versions where appropriate): we should be fine here.

And however many dependencies we drop on macOS in favor of the system, they'll stay and block on Linux.

This, however, is likely to be a problem and, as we've added ARM and Linux builds, flaky tests/audits get increasingly likely to happen.

@Bo98
Copy link
Member

Bo98 commented Oct 22, 2021

I generally agree that if a software offers Python bindings and its not the primary use of it then a built-time dep makes sense and will not hamper UX (one doesn't attempt to use Python bindings without having Python installed anyway). We have a few formulae which already do this.

I disagree however that it'll help much here in terms of the audit - the audit would (and should) still flag those kinds of version mismatches as we have no way of knowing whether those bindings are used or not in dependents.

Can these cases build with and/or be used at runtime with the system Python 3?

System Python is stuck on 3.8 so you'll be restricted to that or our python@3.8. We currently build everything with 3.9 so will have to downgrade everything to do that. Pip dependency trees can differ between versions of Python.

But note however there isn't a "system Python 3". There is one in the devtools and we do not require devtools to be installed when installing bottles.

@carlocab
Copy link
Member Author

I disagree however that it'll help much here in terms of the audit - the audit would (and should) still flag those kinds of version mismatches as we have no way of knowing whether those bindings are used or not in dependents.

Yes, I was thinking of formulae that have multiple versions of GHC in their dep tree but only actually use one of them. I just remembered that we actually have an allowlist for this sort of thing.

System Python is stuck on 3.8 so you'll be restricted to that or our python@3.8. We currently build everything with 3.9 so will have to downgrade everything to do that. Pip dependency trees can differ between versions of Python.

But note however there isn't a "system Python 3". There is one in the devtools and we do not require devtools to be installed when installing bottles.

We do require a CLT installation to install Python3, so anyone who might've installed Python already has the devtools installed.

@MikeMcQuaid
Copy link
Member

But note however there isn't a "system Python 3". There is one in the devtools and we do not require devtools to be installed when installing bottles.

Gotcha, point taken then.

We do require a CLT installation to install Python3, so anyone who might've installed Python already has the devtools installed.

I guess I'm ambivalent either way given the conversation above. I just think if we ever get to the point of going "if you want to use Python bindings then run brew install python" we should consider why we're even building them.

@fxcoudert
Copy link
Member

We've seen dependency-creep a lot in the past couples of years. Our general policy has been to build things with (in most cases) as many features as possible, and we carry a lot of heavy dependencies because of that. This is true of Python bindings, but it's also true of Qt: we have many more things that depend on Qt now. Even a simple install of poppler nowadays will bring in Qt as a dependency (550 MB on disk).

I'm not necessarily opposed to it, but we probably want to be consistent. In context of other things, Python is actually a very lightweight dependency. Moreover, it is a quite common dependency, so many users probably have it anyway (like openssl). Mixing Python versions is a real pain, and Apple keeps saying “do not rely on system python”. Also, system Python is missing some really basic modules/packages.

Therefore, I argue that the best course of action is probably to continue including python as dependency (probably even runtime dependency), and address the real problem: making testing smarter so we can do updates of things that have a lot of dependents.

@MikeMcQuaid
Copy link
Member

Even a simple install of poppler nowadays will bring in Qt as a dependency (550 MB on disk).

I'm not necessarily opposed to it, but we probably want to be consistent.

An aside but: I am opposed to it. Dependency bloat makes the package manager worse, in my opinion.

address the real problem: making testing smarter so we can do updates of things that have a lot of dependents.

Agreed.

Copy link
Member

@cho-m cho-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to current discussion, but adding some review comments for some CI failures.

@@ -3,10 +3,10 @@ class Gnuradio < Formula

desc "SDK for signal processing blocks to implement software radios"
homepage "https://gnuradio.org/"
url "https://github.com/gnuradio/gnuradio/archive/refs/tags/v3.9.3.0.tar.gz"
url "https://github.com/gnuradio/gnuradio/archive/refs/tags/v3.10.3.0.tar.gz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to failure for:

brew fetch --retry gnuradio --build-bottle --force

Looks like accidental find-replace:

Suggested change
url "https://github.com/gnuradio/gnuradio/archive/refs/tags/v3.10.3.0.tar.gz"
url "https://github.com/gnuradio/gnuradio/archive/refs/tags/v3.9.3.0.tar.gz"

Also, could optionally drop refs/tags part of URL.

Comment on lines -29 to +26
bin.env_script_all_files(libexec/"bin", PYTHONPATH: ENV["PYTHONPATH"])
# This is already keg-only, so we can install everything into the prefix.
system "python3", *Language::Python.setup_install_args(prefix)
Copy link
Member

@cho-m cho-m Oct 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is causing trouble with dependents as cython library is no longer automatically in PYTHONPATH. This would only work out-of-box if formula was linked (as it would then add cython libraries into default HOMEBREW_PREFIX/lib/python3.x).

Some options are to still re-add env variable for PYTHONPATH, manually set PYTHONPATH in each dependent, or remove keg-only and link it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'm not opposed to making this not-keg-only. I think this being keg-only just comes from our past policy of preferring users to pip install everything they can install with pip. We've relaxed that to allowing Python modules that expose a CLI, and that includes cython.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a minor risk of de-sync'ing formulae if a user runs pip install -U cython or pip install cython==<version>.

Though, that would be a user issue based on Homebrew's stance for python formulae.


Anyway, I'm also fine with just linking it for an easier-to-manage formula.

@cclauss cclauss mentioned this pull request Nov 24, 2021
@github-actions
Copy link
Contributor

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale No recent activity label Nov 29, 2021
carlocab added a commit to carlocab/homebrew-core that referenced this pull request Nov 30, 2021
We're migrating a number of protobuf dependents on Python 3.10 in Homebrew#87277.

We could migrate protobuf fully to Python 3.10 there too, but I'd like
to avoid pulling in protobuf's dependency tree into the CI run there.
BrewTestBot pushed a commit that referenced this pull request Dec 1, 2021
We're migrating a number of protobuf dependents on Python 3.10 in #87277.

We could migrate protobuf fully to Python 3.10 there too, but I'd like
to avoid pulling in protobuf's dependency tree into the CI run there.

Closes #87287.

Signed-off-by: BrewTestBot <1589480+BrewTestBot@users.noreply.github.com>
@github-actions github-actions bot closed this Dec 6, 2021
@carlocab carlocab mentioned this pull request Dec 7, 2021
6 tasks
@chenrui333 chenrui333 mentioned this pull request Jan 7, 2022
carlocab added a commit to carlocab/homebrew-core that referenced this pull request Feb 5, 2022
This is needed for Homebrew#94505. The conflict comes from `freedink` depending
on `python@3.10` and `python@3.9` through `cxxtest` and `glib`
respectively.

`cxxtest` is installed into a virtualenv in `libexec`, so it's unlikely
that `freedink` actually makes use of anything there since it ships no
Python code itself.

An alternative to this is to migrate `glib` to `python@3.10`, which will
take quite a while (see Homebrew#87277), or migrate `cxxtest` back to
`python@3.9`, which is unnecessary.

I've tested `freedink` locally and it seems to work fine despite the
mixed recursive dependency.

See also Homebrew#65831.
BrewTestBot pushed a commit that referenced this pull request Feb 5, 2022
This is needed for #94505. The conflict comes from `freedink` depending
on `python@3.10` and `python@3.9` through `cxxtest` and `glib`
respectively.

`cxxtest` is installed into a virtualenv in `libexec`, so it's unlikely
that `freedink` actually makes use of anything there since it ships no
Python code itself.

An alternative to this is to migrate `glib` to `python@3.10`, which will
take quite a while (see #87277), or migrate `cxxtest` back to
`python@3.9`, which is unnecessary.

I've tested `freedink` locally and it seems to work fine despite the
mixed recursive dependency.

See also #65831.

Closes #94505.

Signed-off-by: BrewTestBot <1589480+BrewTestBot@users.noreply.github.com>
@cho-m cho-m mentioned this pull request Feb 26, 2022
@carlocab carlocab deleted the py310-glib branch September 17, 2022 15:50
@github-actions github-actions bot added the outdated PR was locked due to age label Oct 18, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CI-linux-self-hosted Build on Linux self-hosted runner CI-long-timeout Use longer GitHub Actions CI timeout. CI-no-fail-fast Continue CI tests despite failing GitHub Actions matrix builds. outdated PR was locked due to age python Python use is a significant feature of the PR or issue python-3.10-migration stale No recent activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants