Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turn pyqt into an optional dependency #157

Closed
wants to merge 1 commit into from
Closed

Turn pyqt into an optional dependency #157

wants to merge 1 commit into from

Conversation

marcelm
Copy link
Contributor

@marcelm marcelm commented May 23, 2018

pyqt pulls in a lot of dependencies which increase the size of each conda environment with matplotlib in it by hundreds of megabytes. See issue #2. This patch removes the hard dependency on pyqt and makes the TkAgg backend (which is also interactive) the default.

The TkAgg backend only requires tk, which is available anyway in every environment that has Python in it, so there is no overhead. Overall, the size of a fresh environment with only matplotlib installed into it decreases from roughly 800 MB to 400 MB.

One downside is that the TkAgg backend looks slightly uglier than the PyQt one on Linux.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@marcelm
Copy link
Contributor Author

marcelm commented May 23, 2018

This is what happens when a user tries to use the qt5agg backend with this version of the package:

>>> import matplotlib
>>> matplotlib.use('qt5agg')
>>> import matplotlib.pyplot as plt
Traceback (most recent call last):
...
ImportError: Matplotlib qt-based backends require an external PyQt4, PyQt5,
PySide or PySide2 package to be installed, but it was not found.

I think this is quite clear. I’ve verified that installing any of pyside, pyside2, pyqt will fix the error and allow the backend to be used.

@rth rth mentioned this pull request May 23, 2018
@rth
Copy link
Member

rth commented May 23, 2018

Thanks for working on this @marcelm !

I am excited about this feature as it would significantly help the usage of conda installed matplotlib with other Python Qt backends. However, I think the consensus in #2 (comment) was to make a matplotlib-core package independent of the Qt backend and keep a matplotlib package around with the current behavior.

Having matplotlib package not install some Qt backend will break expectations (and setups) for users IMO.

I'm wondering if this PR could be used to create the matplotlib-core feedstock (cf #2 (comment))

@marcelm
Copy link
Contributor Author

marcelm commented May 23, 2018

Note that the Qt backend is not installed in macOS, so any user relying on matplotlib coming with a PyQt backend would be disappointed on that platform already.

@rth
Copy link
Member

rth commented May 23, 2018

I think the consensus in #2 (comment) was to make a matplotlib-core package independent of the Qt backend and keep a matplotlib package around with the current behavior.

Hmm, well maybe not after all #2 (comment)

@jakirkham
Copy link
Member

Kind of off topic, but I'm surprised we don't just use the macosx backend on macOS. Guessing we were trying to keep these uniform across OSes.

@jakirkham
Copy link
Member

Thoughts @conda-forge/matplotlib?

@marcelm
Copy link
Contributor Author

marcelm commented May 29, 2018

I’d be happy to answer further questions. Please see also my summary of the consequences of this PR in issue #2.

@jakirkham
Copy link
Member

Friendly nudge 😉

@jakirkham
Copy link
Member

Ping?

@dopplershift
Copy link
Member

I’m ok with it, since it’s an optional dependency for our PyPI packaging. My only concern would be differences vs. the packages in the defaults channel.

@jakirkham
Copy link
Member

@mingwandroid, care to comment?

@mingwandroid
Copy link

The initial assertion in justification of this PR is a simplification:

pyqt pulls in a lot of dependencies which increase the size of each conda environment with matplotlib in it by hundreds of megabytes

Conda uses hardlinks, if you have Qt installed at all you don't pay any cost in other envs.

And the downsides are too serious for my liking.

@rth
Copy link
Member

rth commented Jun 11, 2018

Thanks for the feedback, @mingwandroid !

IMO a larger concert than size is that currently there is no well defined way to use matplotlib installed from this feedstock with a different backend (e.g. pyside/pyside2 etc). As far as I understand that was part of the initial motivation for the parent issue #2 and it was also raised in #155

Do you you think there are more appropriate ways to address #2 than PR ?

@marcelm
Copy link
Contributor Author

marcelm commented Jun 11, 2018

conda uses hardlinks, if you have Qt installed at all you don't pay any cost in other envs.

Sure, I didn’t want to give the impression the savings are per environment. However, if matplotlib is the only reason Qt is installed at all, it’s relevant. For me, this is the case both on my personal machine and in CI, and a couple of people commenting on issue #2 didn’t like the overhead either.

And the downsides are too serious for my liking.

Which ones? Perhaps they can be rectified. So far, the most visible downside I could see was that Tk would become the default backend, which looks different than the Qt one.

@jakirkham
Copy link
Member

Indeed, the size overhead is a serious issue we run into and it only comes in because of matplotlib. In particular, this overhead is not cheap when using containers where space is always precious and Qt/PyQt are quite expensive (not to mention completely unused). So eliminating the dependency would be a big improvement.

Should add this also came up at a recent sprint. Apparently some users currently solve this issue by just using Matplotlib from PyPI where Qt/PyQt are not requirements.

Right now we don't actually install Qt/PyQt on macOS. Though it works fine to install PyQt and start plotting things with the IPython command line console using that backend. So would be curious to know what downsides you are seeing.

@mingwandroid
Copy link

So long as the package called matplotlib depends on qt I'm fine with this, so matplotlib-core could be created without such a dependency, then matplotlib would depend upon qt and matplotlib-core.

cc'ing @jjhelmus, @msarahan, @kalefranz and @nehaljwani for any additional thoughts.

@tacaswell
Copy link
Contributor

I am 👍 on a matplotlib-core which has no GUI dependencies.

Does conda package any other GUI frameworks? If so having matplotlib-wx, matplotlib-qt, etc would be nice.

I am not sure how to solve the issue with the overlapping packages wanting to control the default matplotribrc to set the default backend. You could do it by providing a env variable in the activate scripts, but then you can get weird dependencies on exactly how you get into the enviroment.

@mingwandroid
Copy link

Could a config.d scheme be implemented for that?

@tacaswell
Copy link
Contributor

One reason to stick with Qt over Tk is that Qt support hi-dpi displays and Tk does not. My guess is that flipping the default GUI installed by conda will annoy a lot of users.

Historically the OSX backend was a very different code-path the the other GUI backends which is probably why the original decision to use Qt everywhere was made. OSX also supports hi-dpi so switching the default on mac would not be terrible, but may still surprise some users.

@mingwandroid How would such a scheme work?

@mingwandroid
Copy link

@mingwandroid How would such a scheme work?

Instead of having multiple packages compete to write contents to a single file somehow, you'd amalgamate / merge the contents of any files found in a certain directory. Then you could have both matplotlib-qt.conf and matplotlib-tk.conf

Of course this requires changes to mpl. I suggested the same solution to the JupyterLab team for the same reasons, but I'm not sure if they've had time (or are indeed interested) to implement this.

@mingwandroid
Copy link

mingwandroid commented Jun 12, 2018

Does conda package any other GUI frameworks?

Yes, wxPython (4, pheonix). Not sure if it's on all platforms, @nehaljwani?

@marcelm
Copy link
Contributor Author

marcelm commented Jun 13, 2018

As an alternative to implementing the matplotlibrc.d/ scheme, there is an open PR at matplotlib/matplotlib#9795 which turns the backend setting in matplotlibrc into a list of backends. Backends would be tried one after the other, picking the first one that works.

@marcelotrevisani
Copy link
Member

It seems that matplotlib just depends of Qt5Agg or Qt4Agg in Qt.

@mingwandroid
Copy link

It seems that matplotlib just depends of Qt5Agg or Qt4Agg in Qt.

which in turn depend upon most of Qt?

@nehaljwani
Copy link
Member

Yes, wxPython (4, pheonix). Not sure if it's on all platforms, @nehaljwani?

Sorry, I am late to the party. Yes, wxPython is packaged on all platforms.

@hmaarrfk
Copy link
Contributor

Is there any traction on this? I'm super excited to be able to depend on a matplotlib (or matplotlib-core) package that doesn't depend on PyQt5.

My guess is that flipping the default GUI installed by conda will annoy a lot of users.

Yes, but it would at least make people aware that they are using a GPL'ed library in their development.

@hmaarrfk
Copy link
Contributor

Right. So people pulling in GPL code into their dependencies is has quite important consequences for those that aren't paying attention, even for interpreted languages.

Here is a link to the Riverbank Computing mailing list question that might be relevant

PySide2 is LGPL and that is its main advantage, at least in terms of license compatibility.

@ocefpaf
Copy link
Member

ocefpaf commented Aug 18, 2018

So long as the package called matplotlib depends on qt I'm fine

+1

Is there any traction on this?

There is. I am traveling at the moment but I'll take a better look at this next week.

Another motivation is that, in a jupyter notebook world, the matplotlib-core would be enough; reducing remote/cloud instances sizes significantly.

@mingwandroid
Copy link

Good point about pyqt. Sorry for forgetting about that. We need pyside2 eh?

@hmaarrfk
Copy link
Contributor

@mingwandroid no need to be sorry. It is all very confusing.

I'm just going to quote the GPL FAQ for future reference:

A consequence is that if you choose to use GPL'd Perl modules or Java classes in your program, you must release the program in a GPL-compatible way, regardless of the license used in the Perl or Java interpreter that the combined Perl or Java program will run on.

Perl here being similar to Python in the fact that it is interpreted as Java is more compiled.

The challenge is that making the default PyQt may imply that matplotlib is meant to run on PyQt, and hence, as a whole, this package would become GPL.

I know it sucks, but if you want to ensure that packages that depend on matplotlib don't become GPL, you should tell them to explicitly set their backend themselves. If you want to leave the argument as:

Well for those that don't need GPL, they should specify an other backend, and their problem is solved. Since their code doesn't call the GPL'ed libraries we provide, they should be fine.

Then I don't think you are being fair in the license that you imply when you you let people know that it is a "PSF based license".

For what its worth, the anaconda package also depends on PyQt >=5.9, so it too might be a GPL package. https://anaconda.org/anaconda/matplotlib/files

I know many of you have been saying that you rely that matplotlib is pulling in PyQt, but what packages depend on that behaviour? Maybe those packages should explicitely include the dependency themselves (and probably release their specific distribution as GPL).

@mingwandroid
Copy link

OK, well, I'm fine with splitting this then. One concern is that the tk backend on AD doesn't use freetype and what-not, so text looks pretty awful. We'll have to figure something out about that.

The reason tk doesn't use freetype is to do with build order untangling during bootstrapping introduced by tk being a dependency of python (that and that we've never enabled freetype with it AFAIK).

@hmaarrfk
Copy link
Contributor

Sorry @mingwandroid. I don’t understand what you mean by “splitting”. Can you please rephrase

@mingwandroid
Copy link

You cannot just drop a dependency like this trivially as people who update will end up being in a broken state should this work depend on matplotlib w/ qt. So what needs to happen is that matplotlib needs to be split into two, e.g. matplotlib-core and matplotlib.

Then recipes need to migrate to depending on the former (unless they really want the qt backend).

Conda build supports creating these split packages from a single recipe to save on build time, duplication of recipes and duplication of final artefacts.

@tacaswell
Copy link
Contributor

With matplotlib 3.0 we no longer require packagers to ship a .matplotlibrc and will find the GUI frameworks if they are installed (see https://matplotlib.org/devdocs/users/whats_new.html#improved-default-backend-selection). There is undoubtably more work to be done on making the backend selection / switching better, but we pushed that off to a future release.

A more recent thread on the GPL consequences of PyQt: https://riverbankcomputing.com/pipermail/pyqt/2018-July/040638.html . My (IANAL) summary: you can have BSD/MIT licensed code that depends on pyqt and it is up to the end user to ensure that they have a valid pyqt license for their use.

@marcelm
Copy link
Contributor Author

marcelm commented Aug 19, 2018

Conda build supports creating these split packages from a single recipe to save on build time, duplication of recipes and duplication of final artefacts.

I guess you have in mind the discussion at conda/conda-build#1338, where the idea was to split matplotlib itself. That is, the backend code within matplotlib that talks to PyQt (for example), would be moved to a matplotlib-qt package. However, this isn’t actually necessary. Matplotlib, as far as I understand, always includes all the backends, and whether they are usable or not is determined at runtime. So simply installing pyqt (or pyside) in addition to matplotlib-core will make the Qt5Agg backend work (I tested this).

I think the simplest way forward would be this:

  1. Rename matplotlib to matplotlib-core
  2. Remove the pyqt dependency from matplotlib-core (that is, apply this PR)
  3. Add a separate matplotlib recipe, which is just a metapackage with a dependency on matplotlib-core and pyqt.

The only remaining problem, as discussed above, is that matplotlib-core needs to set the default backend to TkAgg in matplotlibrc, and the matplotlib metapackage would not be able to override it (to Qt5Agg). However, Matplotlib 3 will have a backend fallback mechanism. So perhaps we just need to wait until that is released. (edit: sorry for overlap with @tacaswell’s comment, didn’t see it appear)

@marcelm
Copy link
Contributor Author

marcelm commented Aug 19, 2018

I wonder whether Matplotlib 3 going Python 3 only has implications relevant to this discussion.

@hmaarrfk
Copy link
Contributor

@tacaswell, I guess my questions is:

Is conda-forge the end user that is packaging this product as a whole? And if so, what license of PyQt are they using? Seems to me that conda-forge is using the GPL PyQt, and as such, everything that is depending on this PyQt library is now GPL isn't it?

I can't really receive matplotlib from conda-forge without also downloading a copy of PyQt5 GPL unless I use --force which would force me to crawl through all the other dependencies myself as a user of the library. Therefore, matplotlib could be seen as coming with PyQt5 GPL which would make the particular copy of matplotlib I obtained from conda-forge also protected by the GPL license that protects PyQt5

@tacaswell
Copy link
Contributor

However, this isn’t actually necessary. Matplotlib, as far as I understand, always includes all the backends, and whether they are usable or not is determined at runtime.

That is one way to package it, the big linux distributions tend to deconstruct the library and have matplotlib-qt include both the qt dependancy and the code to support that backend.

(edit: sorry for overlap with @tacaswell’s comment, didn’t see it appear)

Reassuring we are on the same page!

@hmaarrfk I am mostly concerned about the licensing from the libraries point of view (Q: Is Matplotlib GPL because we have import PyQt4 someplace in the code base A: no)

@hmaarrfk
Copy link
Contributor

@tacaswell got it. I am pretty excited to have a package (any name package) that doesn't depend on pyqt that has the power of the conda-forge bots behind it. I'll leave the discussion of the licensing of the package to a seperate issue. I think I've hijaked this one enough. Sorry for changing the focus of the conversations guys.

@marcelm
Copy link
Contributor Author

marcelm commented Aug 19, 2018

That is one way to package it, the big linux distributions tend to deconstruct the library and have matplotlib-qt include both the qt dependancy and the code to support that backend.

Ah ok, sure, in that case split packages would be the way to go. But then that would be a different change and can be done separately after splitting off matplotlib-core.

@tacaswell
Copy link
Contributor

Ah ok, sure, in that case split packages would be the way to go. But then that would be a different change and can be done separately after splitting off matplotlib-core.

👍

I am 💯 % on board with the plan in #157 (comment) for the packaging for mpl 3.0

@hmaarrfk
Copy link
Contributor

Is matplotlib going to be the first package that implements optional dependencies on conda-forge? is the - really the best naming convention?

@tacaswell
Copy link
Contributor

airflow has already been split into many build products from a single recipe.

@hmaarrfk
Copy link
Contributor

Man, -with-qt5 would drive me mad.
Are there other options? I think - might be used for exclusive build variants, but I could be 100% wrong.

@marcelm
Copy link
Contributor Author

marcelm commented Sep 19, 2018

Now after matplotlib 3.0.0 has been released, I wonder whether it is now possible to proceed.

I updated the recipe in this PR to work with current master and renamed the package to matplotlib-core, of course only in meta.yaml, renaming the repo would need to be done manually I guess. I also opened a PR for a matplotlib metapackage: conda-forge/staged-recipes#6697

@ocefpaf
Copy link
Member

ocefpaf commented Sep 19, 2018

renaming the repo would need to be done manually

We don't need to do that.

PR for a matplotlib metapackage: conda-forge/staged-recipes#6697

Let's try to avoid that use see if we can use split packages here. See https://github.com/conda-forge/gdal-feedstock/blob/master/recipe/meta.yaml for an example.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

See also the discussion at
#2
@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@marcelm
Copy link
Contributor Author

marcelm commented Sep 20, 2018

Please see #178 for a new PR implementing the split into matplotlib and matplotlib-core with a single recipe as suggested by @ocefpaf .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet