Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Retool the setup.py infrastructure #1454

Merged
merged 24 commits into from

8 participants

@mdboom
Owner

This is a complete reorganization of the setup.py/setupext.py infrastructure, and is the implementation of MEP11.

  • setuptools is now used to integrate with pip/easy_install.
  • Third-party Python dependencies (dateutil, pytz, six and pyparsing) are now installed automatically using setuptools, rather than maintaining local copies. This seems to work quite well for me, but will need to be tested in a number of environments (including with installers) to ensure it's a valid approach.
  • The install is modular: it's possible to turn on/off installing the tests, sample_data and toolkits.
  • We continue to ship Agg and PyCXX because they both require patches that haven't been merged upstream. However, the infrastructure is now present to make these optionally use the system libraries. That remains unimplemented, however, because doing so would produce a broken matplotlib.
  • On Python 3.x, the latest release of pyparsing (1.5.6) has a bug that makes it completely non-functional. This has already been fixed in pyparsing's SVN [1]. The installer will warn when using a version that is too old, and importing matplotlib will raise an exception.

There was some discussion on the mailing list about making dateutil and pyparsing optional dependencies. However, on actually trying to implement that, I'm not sure that makes sense. pyplot imports a number of things directly from dateutil, and we would have to deprecate those things in the API. pyparsing is required merely to parse a font descriptor -- so you'd be losing pretty core functionality without it. I have instead made them hard requirements. The installer will warn if they aren't around, try to install them with pip/easy_install, and failing that importing matplotlib without those dependencies will raise an exception.

[1] http://pyparsing.svn.sourceforge.net/viewvc/pyparsing?view=revision&revision=219

@mdboom
Owner

Sorry this is so hard to review. Most of the "639" changed files are just deleted dateutil and pytz. Github used to have a summary of files at the top, but they don't seem to anymore. The interesting files (obviously) are setup.py and setupext.py.

@WeatherGod
Collaborator

Just click "Show diff stats" in the Files Changed tab.

@mdboom
Owner

Aha! Thanks.

@mdboom
Owner

No comments on this? If not, I'd like to go ahead and merge this so it gets some testing in the wild.

@mdboom
Owner

As Mathew Topper pointed out on the mailing list, installing with pip should be mentioned in the docs. (Do we want to mention easy_install, or should we consider that obsoleted at this point?)

@WeatherGod
Collaborator

One thing that has always bothered me, and I haven't checked to see if it is still the case, is that any call to setup.py, with the exception of a "--help" option, makes an attempt to check for and/or build some of the code. So, if I am trying to call "python setup.py clean", it tries to build stuff first, or would fail if some things can't be found.

@mdboom
Owner

@WeatherGod: I don't think that's any longer the case, before or after this patch.

@cgohlke

This doesn't work for me on Windows: 1) getstatusoutput is not defined on win32 (patch below). 2) running setup.py bdist_wininst spawns new python.exe processes until the system runs out of resources.

diff --git a/setupext.py b/setupext.py
index c839455..547b825 100644
--- a/setupext.py
+++ b/setupext.py
@@ -684,9 +684,12 @@ class FreeType(SetupPackage):
     name = "freetype"

     def check(self):
-        status, output = getstatusoutput("freetype-config --version")
-        if status == 0:
-            version = output
+        if sys.platform != 'win32':
+            status, output = getstatusoutput("freetype-config --version")
+            if status == 0:
+                version = output
+            else:
+                version = None
         else:
             version = None
@ivanov
Collaborator

running setup.py install --user on this branch went out, fetched and installed distribute-0.6.28. I'm not sure what the consequences of that are, if any, but this was not the case with master.

@sandrotosi

Hi Michael,
sorry for jumping in that late; the perspective changes are really really awesome from a packaging POV (at least Debian side). I haven't checked the actual code though.

Can you point me to the changes that hadn't merge upstream for Agg and PyCXX? at least I can try to push them on Debian packages (so others may benefit from them) and then up until upstream. Even clarify what the base versions of those 2 softwares are might be enough, I'll extract the patches myself.

@mdboom
Owner

@sandrotosi : Not too late. This is a complex patch and it will be a while to get everything verified. The patch required to PyCXX is here:

https://github.com/matplotlib/matplotlib/pull/493/files

The patch to Agg is not in any sort of state that it could be permitted upstream -- it involves throwing a PyCXX exception from within Agg. I'll spend a little time seeing if that can be addressed in a better way that would be acceptable upstream -- or even acceptable as a Debian patch.

@ivanov: It shouldn't have installed distribute -- only fetched a local copy to use during the matplotlib install (if I understand correctly how all this stuff is supposed to work). That's so if dateutil and pyparsing aren't present, they can be automatically downloaded by setuptools.

@cgohlke: Do you have any instructions about how you set up your Windows build environment? Would you be interested in adding such to the docs? It looks like I'm going to need to fire up a virtual machine and get this going.

@mdboom
Owner

Note also that there should be a new release of pyparsing coming out shortly. 2.0.0 will be the minimum version on Python 3, and 1.5.6 will be the minimum version on Python 2.

@mdboom
Owner

@sandrotosi: I have made a better patch against Agg 2.4 here:

https://gist.github.com/4112950

We can submit it upstream, or at least consider it for inclusion in Debian's Agg package.

This requires a corresponding patch to matplotlib's _backend_agg.cpp (now included in this pull request).

Note that we don't support building against Agg 2.5 because it is GPL'd (not LGPL'd).

(We actually patch a few other things, but they are mainly for broader compiler support -- this is the only one that is critical).

@mdboom
Owner

@sandrotosi: Also to clarify: the PyCXX we include is based on 6.2.3. I just noticed there is a 6.2.4 (not on the website, but in the downloads list on Sourceforge). Unfortunately, it doesn't address the lack of Python 2.7 support for PyCapsule. (It completely prevents it from compiling matplotlib on Python 2.7).

@mdboom
Owner

@cgohlke: I now have the build "completing" on Windows. Would you mind conforming in your binary-building infrastructure when you have a chance?

@mdboom
Owner

@sandrotosi: An update on the PyCXX front. The lack of PyCapsule support for Python 2.7 is not a problem -- I was building things incorrectly leading me to believe that erroneously. The obstacle now is the lack of buffer support for Python 3.x. I'll whip up a patch for that and send it upstream to Barry Scott (maintainer of PyCXX). I'll also update our local copy to 6.2.4 + patches to work with Python 3.x, since that has a few minor improvements vs. what we have now.

@mdboom
Owner

@cgohlke, @sandrotosi: Let me know when you've had a chance to look at these recent changes. I'd like to merge this soon in the development cycle to give it lots of testing -- while not throwing something that breaks for a lot of people either ;)

@dmcdougall
Collaborator

@cgohlke You don't need to clone @mdboom's repo. If you have matplotlib/matplotlib.git cloned, you can just add mdboom as a remote:

$ git remote add mdboom https://github.com/mdboom/matplotlib.git
$ git fetch mdboom
$ get checkout mdboom/setup_rework

Hope that helps.

@cgohlke

This revision fails to find png.h and tk.h (mpl 1.2 finds them).

============================================================================
Edit setup.cfg to change the build options

BUILDING MATPLOTLIB
            matplotlib: yes [1.3.x]
                python: yes [2.7.3 (default, Apr 10 2012, 23:24:47) [MSC
                        v.1500 64 bit (AMD64)]]
              platform: yes [win32]

REQUIRED DEPENDENCIES AND EXTENSIONS
                 numpy: yes [version 1.6.2]
              dateutil: yes [using dateutil version 1.5]
             pyparsing: yes [using pyparsing version 1.5.6]
                 pycxx: yes [Couldn't import.  Using local copy.]
                libagg: yes [Requires patches that have not been merged
                        upstream. Using local copy.]
              freetype: yes [Unknown version]
                   png: no  [The C/C++ header for libpng (png.h) could not
                        be found.  You may need to install the development
                        package.]

OPTIONAL SUBPACKAGES
           sample_data: yes [installing]
              toolkits: yes [installing]
                 tests: yes [using nose version 1.2.1]

OPTIONAL BACKEND EXTENSIONS
                macosx: no  [skipping due to configuration]
                qt4agg: yes [Qt: 4.7.4, PyQt4: 4.8.6]
               gtk3agg: no  [Requires pygobject to be installed.]
             gtk3cairo: no  [Requires pygobject to be installed.]
                gtkagg: yes [Gtk: 2.22.1 pygtk: 2.22.0]
                 tkagg: no  [The C/C++ header for Tk (tk.h) could not be
                        found.  You may need to install the development
                        package.]
                 wxagg: yes [version 2.8.12.1]
                   gtk: yes [Gtk: 2.22.1 pygtk: 2.22.0]
                 qtagg: no  [not found]
                   agg: yes [installing]
                 cairo: yes [version 1.10.0]

OPTIONAL LATEX DEPENDENCIES
                dvipng: yes [version 1.12]
           ghostscript: yes [version 9.06]
                 latex: yes [version MiKTeX 2.9]
               pdftops: no  []

============================================================================
                        * The following required packages can not be built:
                        * png, tkagg
@mdboom
Owner

@cgohlke: On windows, it looks in win32_static/include for these header files (as with master). Should it also be looking somewhere else?

@cgohlke

@mdboom: Two issues: I don't use the hard coded win32_static/include directory but specify all the relevant include directories in the INCLUDE environment variable, which can depend on the Python and compiler environments. Second, GTK's pkgconfig works for me and there is no need for the C:/GTK paths.
The build succeeds on my system with the following patch:

diff --git a/setupext.py b/setupext.py
index ba2e20a..5d2a263 100644
--- a/setupext.py
+++ b/setupext.py
@@ -124,6 +124,8 @@ def check_include_file(include_dirs, filename, package):
     """
     Raises an exception if the given include file can not be found.
     """
+    if sys.platform == 'win32':
+        include_dirs.extend(os.getenv('INCLUDE', '.').split(';'))
     if not has_include_file(include_dirs, filename):
         raise CheckFailed(
             "The C/C++ header for %s (%s) could not be found.  You "
@@ -1272,24 +1274,6 @@ class BackendGtk(OptionalBackendPackage):

     def add_flags(self, ext):
         if sys.platform == 'win32':
-            # popen broken on my win32 plaform so I can't use pkgconfig
-            ext.library_dirs.extend(
-                ['C:/GTK/bin', 'C:/GTK/lib'])
-
-            ext.include_dirs.extend(
-                ['win32_static/include/pygtk-2.0',
-                 'C:/GTK/include',
-                 'C:/GTK/include/gobject',
-                 'C:/GTK/include/gext',
-                 'C:/GTK/include/glib',
-                 'C:/GTK/include/pango',
-                 'C:/GTK/include/atk',
-                 'C:/GTK/include/X11',
-                 'C:/GTK/include/cairo',
-                 'C:/GTK/include/gdk',
-                 'C:/GTK/include/gdk-pixbuf',
-                 'C:/GTK/include/gtk',
-                 ])

             def getoutput(s):
                 ret = os.popen(s).read().strip()
@@ -1297,7 +1281,26 @@ class BackendGtk(OptionalBackendPackage):

             if 'PKG_CONFIG_PATH' not in os.environ:
                 # If Gtk+ is installed, pkg-config is required to be installed
-                os.environ['PKG_CONFIG_PATH'] = 'C:\GTK\lib\pkgconfig'
+                os.environ['PKG_CONFIG_PATH'] = 'C:\\GTK\\lib\\pkgconfig'
+
+                # popen broken on my win32 plaform so I can't use pkgconfig
+                ext.library_dirs.extend(
+                    ['C:/GTK/bin', 'C:/GTK/lib'])
+
+                ext.include_dirs.extend(
+                    ['win32_static/include/pygtk-2.0',
+                     'C:/GTK/include',
+                     'C:/GTK/include/gobject',
+                     'C:/GTK/include/gext',
+                     'C:/GTK/include/glib',
+                     'C:/GTK/include/pango',
+                     'C:/GTK/include/atk',
+                     'C:/GTK/include/X11',
+                     'C:/GTK/include/cairo',
+                     'C:/GTK/include/gdk',
+                     'C:/GTK/include/gdk-pixbuf',
+                     'C:/GTK/include/gtk',
+                     ])

             pygtkIncludes = getoutput(
                 'pkg-config --cflags-only-I pygtk-2.0').split()
@mdboom
Owner

@cgohlke: Thanks -- that patch is an improvement over all. I think it probably also makes sense to turn pkgconfig on for everything on Windows, with fallbacks, of course.

@cgohlke

I'm not sure turning on pkgconfig for everything on Windows is a good idea. At least on my system, GTK and associates (PyGTK, PyGObject, PyCairo) are the only packages supporting pkgconfig. If pkgconfig is always on, freetype2 and libpng headers and libraries will be picked up from the GTK directories, but these libraries are outdated and inferior to my libpng and freetype2 builds, which are found through INCLUDE and LIB environment variables.

@mdboom
Owner

@cgohlke: In your patch above, why is INCLUDE explicitly handled? Isn't that handled by the Microsoft compiler directly? Any idea why popen isn't working for you? I'd rather fix that in the general case than have to one-off hard code paths here -- but maybe that's just not possible.

@cgohlke

@mdboom: On my system matplotlib's setupext.py does not find some header files and cancels, unless the INCLUDE paths are added to the include_dir list. The Microsoft Compiler do find the headers and libraries in the paths specified by INCLUDE and LIB environment variables without matplotlib specifying those paths again.
Regarding popen: it does work for me. pkgconfig does work too. The comment # popen broken on my win32 plaform so I can't use pkgconfig and the hard coded GTK paths are from someone else. I just moved it to another code path, as a fallback, but frankly I would remove it.

@mdboom
Owner

@cgohlke: Ok. Can you test this branch again? I think I've integrated what you've proposed (in a slightly different way) by handling the 'INCLUDE' environment variable for everything.

@cgohlke

This does not work. get_base_dirs._basedir_map gets corrupted during runtime and the headers are not found in the include sub-directory of the base_dirs. What's wrong with my original patch?

@mdboom
Owner

My concern with the original patch is that it special cases Gtk. If we're looking for headers in INCLUDE on Windows, we should do it on all packages.

How is get_base_dirs._basedir_map being corrupted?

Do you have instructions on setting up the build environment you are using? I have this working myself on Windows, but I'm sure my environment is not set up in the same way as yours, and there may be ways in which the binaries I'm producing are not as good... But unless I recreate your environment, I have to back-and-forth things with you like this, obviously.

@cgohlke

The build environment is simple: the paths to the libpng, zlib, freetype, and tcl/tk include files and link libraries are appended to the INCLUDE and LIB environment variables. PKG_CONFIG_PATH points to the GTK and PyGTK pkgconfig directories. To build matplotlib installers, run python setup.py bdist_wininst, which uses the default Visual Studio compilers. I don't see what's wrong with special casing GTK, which is a special case in that environment (contains conflicting headers and libraries; built with a different compiler/C runtime; introduces a dependency on a separate ~100 MB runtime). For mingw environments, pkgconfig might work for all dependencies but I did not test.

@mdboom
Owner

Ok -- that seems reasonable. I'll amend this with your original patch.

@cgohlke

OK. This works for me.

@cgohlke

setup.py bdist_egg works for me with this branch.

@mdboom
Owner

Now I just need to figure out why Travis is failing -- it's something pretty odd...

@NelleV
Collaborator

github indicates there are 650files changed. I'm guessing this branch hasn't been updated for a while. Maybe you could quickly do a rebase?

@mdboom
Owner

Actually, I just did a rebase. That number of files is actually correct, as this PR removes our local copy of dateutil, pytz (many many files) and pyparsing, among other things.

@mdboom
Owner

The Travis failures are bizarre, but presumably legitimate... still looking into it, as I'd like to have those resolved before merging this.

@NelleV NelleV commented on the diff
distribute_setup.py
((22 lines not shown))
+from distutils import log
+
+try:
+ from site import USER_SITE
+except ImportError:
+ USER_SITE = None
+
+try:
+ import subprocess
+
+ def _python_cmd(*args):
+ args = (sys.executable,) + args
+ return subprocess.call(args) == 0
+
+except ImportError:
+ # will be used for python 2.3
@NelleV Collaborator
NelleV added a note

I think this can be dropped.

@mdboom Owner
mdboom added a note

I'd prefer not to -- this file comes from distribute itself, I didn't write it. It's just used to bootstrap distribute if the user doesn't already have it installed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@pelson
Collaborator

@mdboom - Are all of the changes necessary at the same time? For instance, can PyCXX be updated in a separate PR? Can we remove Pytz & dateutil in a separate PR? There are changes here which surprise me a little (changes to some python 3 ifdef blocks for instance) - are they needed in this PR?

I'm all in favour of the re-tool, it's just impossible to see the wood from the trees as it stands. The only other feasible way to move this PR forward without reducing the size of the change, AFAICS, is to hit merge and see what happens...

@mdboom
Owner

The changes for review are almost entirely in setup.py, setupext.py, .travis.yml, INSTALL, setup.cfg.template and doc/faq/installing_faq.html. You can use the "Show Diff Stats" button at the top of the "Files Changed" page to navigate around. But for this PR, more than reviewing the code, it's important to test this on as many platforms and configurations as we can.

Unfortunately, it's hard to separate these things out, because once you say "we support building with external libraries", you have to make the changes so that our code compiles with vanilla copies of those libraries (hence the changes to _backend_agg.cpp, for example, and the updating of PyCXX.). If you do it piecemeal, you'd end up with a master that wouldn't build, because we'd have different behavior in the system libraries and the ones we ship.

The only changes to Py3 ifdef blocks that I see are within PyCXX, which is simply an updating to the current upstream version -- we shouldn't need to review those changes for style, etc., only confirm that they work by building, running our test suite etc.

About the only thing that could be reduced here, I think, is removing dateutil and pytz later -- but again, I think ideally that removal should all be tested together. It's important to know that after removing those files from our repo that matplotlib still installs and works.

@pelson
Collaborator

Ok. Thanks @mdboom .

FWIW this works well on OSX 10.8 (Mountain Lion) for python2.7.

The only other feasible way to move this PR forward without reducing the size of the change, AFAICS, is to hit merge and see what happens...

On the basis of your last post, I'd suggest we merge this and fix any fallout that might ensue.

@mdboom mdboom merged commit 7940245 into from
@dmcdougall
Collaborator

Does this change utilise pip install to find and install dependencies? I should point out that it has recently been discovered that pip on an untrusted network is not safe.

It might be worth shipping a warning with this.

@mdboom
Owner

Yes, it does. The fact that we ship copies of so many libraries has long been a problem. There were many good reasons to stop doing that, but also a strong desire to not make the installation from source (or pip) more inconvenient than it already is. pip is, unfortunately, the tool we have to do that.

The post you linked to is not as alarming as it sounds -- why is he running pip as root? Now that pip has virtualenv integration, it's unnecessary. The fact that pip doesn't sign packages still means that you could get malicious software installed, but at least it won't have root privileges. But I can't think that installing unsigned packages is much more dangerous than asking users to download and install those packages manually, as it's unlikely those packages' hashes will be checked either.

Maybe we could:

a) if the dependencies are missing and distribute is about to download them on our behalf...
b) check if the script is running as root...
c) if so, display a warning message about the security hazard of doing so and suggesting virtualenv, or installing the required dependencies manually.

@mdboom mdboom deleted the branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Feb 25, 2013
  1. @mdboom
  2. @mdboom
  3. @mdboom

    Fix a few bugs in the build.

    mdboom authored
  4. @mdboom

    More fixes

    mdboom authored
  5. @mdboom

    setupegg.py is now redundant.

    mdboom authored
  6. @mdboom

    Fix type in dates docs

    mdboom authored
  7. @mdboom
  8. @mdboom
  9. @mdboom
  10. @mdboom
  11. @mdboom

    Fix build on Python 2.6

    mdboom authored
  12. @mdboom

    Update Travis dependencies

    mdboom authored
  13. @mdboom
  14. @mdboom

    Fix agg overflow test

    mdboom authored
  15. @mdboom
  16. @mdboom
  17. @mdboom

    Update to PyCXX 6.2.4

    mdboom authored
  18. @mdboom
  19. @mdboom
  20. @mdboom

    Attempting to fix Travis tests

    mdboom authored
  21. @mdboom

    Windows fixes from cgohlke

    mdboom authored
  22. @mdboom

    Fixups after rebase

    mdboom authored
  23. @mdboom
  24. @mdboom
Something went wrong with that request. Please try again.