Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retool the setup.py infrastructure #1454

Merged
merged 24 commits into from Feb 26, 2013
Merged

Conversation

mdboom
Copy link
Member

@mdboom mdboom commented Nov 6, 2012

This is a complete reorganization of the setup.py/setupext.py infrastructure, and is the implementation of MEP11.

  • setuptools is now used to integrate with pip/easy_install.
  • Third-party Python dependencies (dateutil, pytz, six and pyparsing) are now installed automatically using setuptools, rather than maintaining local copies. This seems to work quite well for me, but will need to be tested in a number of environments (including with installers) to ensure it's a valid approach.
  • The install is modular: it's possible to turn on/off installing the tests, sample_data and toolkits.
  • We continue to ship Agg and PyCXX because they both require patches that haven't been merged upstream. However, the infrastructure is now present to make these optionally use the system libraries. That remains unimplemented, however, because doing so would produce a broken matplotlib.
  • On Python 3.x, the latest release of pyparsing (1.5.6) has a bug that makes it completely non-functional. This has already been fixed in pyparsing's SVN [1]. The installer will warn when using a version that is too old, and importing matplotlib will raise an exception.

There was some discussion on the mailing list about making dateutil and pyparsing optional dependencies. However, on actually trying to implement that, I'm not sure that makes sense. pyplot imports a number of things directly from dateutil, and we would have to deprecate those things in the API. pyparsing is required merely to parse a font descriptor -- so you'd be losing pretty core functionality without it. I have instead made them hard requirements. The installer will warn if they aren't around, try to install them with pip/easy_install, and failing that importing matplotlib without those dependencies will raise an exception.

[1] http://pyparsing.svn.sourceforge.net/viewvc/pyparsing?view=revision&revision=219

@mdboom
Copy link
Member Author

mdboom commented Nov 6, 2012

Sorry this is so hard to review. Most of the "639" changed files are just deleted dateutil and pytz. Github used to have a summary of files at the top, but they don't seem to anymore. The interesting files (obviously) are setup.py and setupext.py.

@WeatherGod
Copy link
Member

Just click "Show diff stats" in the Files Changed tab.

@mdboom
Copy link
Member Author

mdboom commented Nov 6, 2012

Aha! Thanks.

@mdboom
Copy link
Member Author

mdboom commented Nov 16, 2012

No comments on this? If not, I'd like to go ahead and merge this so it gets some testing in the wild.

@mdboom
Copy link
Member Author

mdboom commented Nov 16, 2012

As Mathew Topper pointed out on the mailing list, installing with pip should be mentioned in the docs. (Do we want to mention easy_install, or should we consider that obsoleted at this point?)

@WeatherGod
Copy link
Member

One thing that has always bothered me, and I haven't checked to see if it is still the case, is that any call to setup.py, with the exception of a "--help" option, makes an attempt to check for and/or build some of the code. So, if I am trying to call "python setup.py clean", it tries to build stuff first, or would fail if some things can't be found.

@mdboom
Copy link
Member Author

mdboom commented Nov 16, 2012

@WeatherGod: I don't think that's any longer the case, before or after this patch.

@cgohlke
Copy link
Contributor

cgohlke commented Nov 16, 2012

This doesn't work for me on Windows: 1) getstatusoutput is not defined on win32 (patch below). 2) running setup.py bdist_wininst spawns new python.exe processes until the system runs out of resources.

diff --git a/setupext.py b/setupext.py
index c839455..547b825 100644
--- a/setupext.py
+++ b/setupext.py
@@ -684,9 +684,12 @@ class FreeType(SetupPackage):
     name = "freetype"

     def check(self):
-        status, output = getstatusoutput("freetype-config --version")
-        if status == 0:
-            version = output
+        if sys.platform != 'win32':
+            status, output = getstatusoutput("freetype-config --version")
+            if status == 0:
+                version = output
+            else:
+                version = None
         else:
             version = None

@ivanov
Copy link
Member

ivanov commented Nov 17, 2012

running setup.py install --user on this branch went out, fetched and installed distribute-0.6.28. I'm not sure what the consequences of that are, if any, but this was not the case with master.

@sandrotosi
Copy link
Contributor

Hi Michael,
sorry for jumping in that late; the perspective changes are really really awesome from a packaging POV (at least Debian side). I haven't checked the actual code though.

Can you point me to the changes that hadn't merge upstream for Agg and PyCXX? at least I can try to push them on Debian packages (so others may benefit from them) and then up until upstream. Even clarify what the base versions of those 2 softwares are might be enough, I'll extract the patches myself.

@mdboom
Copy link
Member Author

mdboom commented Nov 19, 2012

@sandrotosi : Not too late. This is a complex patch and it will be a while to get everything verified. The patch required to PyCXX is here:

https://github.com/matplotlib/matplotlib/pull/493/files

The patch to Agg is not in any sort of state that it could be permitted upstream -- it involves throwing a PyCXX exception from within Agg. I'll spend a little time seeing if that can be addressed in a better way that would be acceptable upstream -- or even acceptable as a Debian patch.

@ivanov: It shouldn't have installed distribute -- only fetched a local copy to use during the matplotlib install (if I understand correctly how all this stuff is supposed to work). That's so if dateutil and pyparsing aren't present, they can be automatically downloaded by setuptools.

@cgohlke: Do you have any instructions about how you set up your Windows build environment? Would you be interested in adding such to the docs? It looks like I'm going to need to fire up a virtual machine and get this going.

@mdboom
Copy link
Member Author

mdboom commented Nov 19, 2012

Note also that there should be a new release of pyparsing coming out shortly. 2.0.0 will be the minimum version on Python 3, and 1.5.6 will be the minimum version on Python 2.

@mdboom
Copy link
Member Author

mdboom commented Nov 19, 2012

@sandrotosi: I have made a better patch against Agg 2.4 here:

https://gist.github.com/4112950

We can submit it upstream, or at least consider it for inclusion in Debian's Agg package.

This requires a corresponding patch to matplotlib's _backend_agg.cpp (now included in this pull request).

Note that we don't support building against Agg 2.5 because it is GPL'd (not LGPL'd).

(We actually patch a few other things, but they are mainly for broader compiler support -- this is the only one that is critical).

@mdboom
Copy link
Member Author

mdboom commented Nov 19, 2012

@sandrotosi: Also to clarify: the PyCXX we include is based on 6.2.3. I just noticed there is a 6.2.4 (not on the website, but in the downloads list on Sourceforge). Unfortunately, it doesn't address the lack of Python 2.7 support for PyCapsule. (It completely prevents it from compiling matplotlib on Python 2.7).

@mdboom
Copy link
Member Author

mdboom commented Nov 20, 2012

@cgohlke: I now have the build "completing" on Windows. Would you mind conforming in your binary-building infrastructure when you have a chance?

@mdboom
Copy link
Member Author

mdboom commented Nov 26, 2012

@sandrotosi: An update on the PyCXX front. The lack of PyCapsule support for Python 2.7 is not a problem -- I was building things incorrectly leading me to believe that erroneously. The obstacle now is the lack of buffer support for Python 3.x. I'll whip up a patch for that and send it upstream to Barry Scott (maintainer of PyCXX). I'll also update our local copy to 6.2.4 + patches to work with Python 3.x, since that has a few minor improvements vs. what we have now.

@mdboom
Copy link
Member Author

mdboom commented Dec 6, 2012

@cgohlke, @sandrotosi: Let me know when you've had a chance to look at these recent changes. I'd like to merge this soon in the development cycle to give it lots of testing -- while not throwing something that breaks for a lot of people either ;)

@dmcdougall
Copy link
Member

@cgohlke You don't need to clone @mdboom's repo. If you have matplotlib/matplotlib.git cloned, you can just add mdboom as a remote:

$ git remote add mdboom https://github.com/mdboom/matplotlib.git
$ git fetch mdboom
$ get checkout mdboom/setup_rework

Hope that helps.

@cgohlke
Copy link
Contributor

cgohlke commented Dec 6, 2012

This revision fails to find png.h and tk.h (mpl 1.2 finds them).

============================================================================
Edit setup.cfg to change the build options

BUILDING MATPLOTLIB
            matplotlib: yes [1.3.x]
                python: yes [2.7.3 (default, Apr 10 2012, 23:24:47) [MSC
                        v.1500 64 bit (AMD64)]]
              platform: yes [win32]

REQUIRED DEPENDENCIES AND EXTENSIONS
                 numpy: yes [version 1.6.2]
              dateutil: yes [using dateutil version 1.5]
             pyparsing: yes [using pyparsing version 1.5.6]
                 pycxx: yes [Couldn't import.  Using local copy.]
                libagg: yes [Requires patches that have not been merged
                        upstream. Using local copy.]
              freetype: yes [Unknown version]
                   png: no  [The C/C++ header for libpng (png.h) could not
                        be found.  You may need to install the development
                        package.]

OPTIONAL SUBPACKAGES
           sample_data: yes [installing]
              toolkits: yes [installing]
                 tests: yes [using nose version 1.2.1]

OPTIONAL BACKEND EXTENSIONS
                macosx: no  [skipping due to configuration]
                qt4agg: yes [Qt: 4.7.4, PyQt4: 4.8.6]
               gtk3agg: no  [Requires pygobject to be installed.]
             gtk3cairo: no  [Requires pygobject to be installed.]
                gtkagg: yes [Gtk: 2.22.1 pygtk: 2.22.0]
                 tkagg: no  [The C/C++ header for Tk (tk.h) could not be
                        found.  You may need to install the development
                        package.]
                 wxagg: yes [version 2.8.12.1]
                   gtk: yes [Gtk: 2.22.1 pygtk: 2.22.0]
                 qtagg: no  [not found]
                   agg: yes [installing]
                 cairo: yes [version 1.10.0]

OPTIONAL LATEX DEPENDENCIES
                dvipng: yes [version 1.12]
           ghostscript: yes [version 9.06]
                 latex: yes [version MiKTeX 2.9]
               pdftops: no  []

============================================================================
                        * The following required packages can not be built:
                        * png, tkagg

@mdboom
Copy link
Member Author

mdboom commented Dec 7, 2012

@cgohlke: On windows, it looks in win32_static/include for these header files (as with master). Should it also be looking somewhere else?

@cgohlke
Copy link
Contributor

cgohlke commented Dec 7, 2012

@mdboom: Two issues: I don't use the hard coded win32_static/include directory but specify all the relevant include directories in the INCLUDE environment variable, which can depend on the Python and compiler environments. Second, GTK's pkgconfig works for me and there is no need for the C:/GTK paths.
The build succeeds on my system with the following patch:

diff --git a/setupext.py b/setupext.py
index ba2e20a..5d2a263 100644
--- a/setupext.py
+++ b/setupext.py
@@ -124,6 +124,8 @@ def check_include_file(include_dirs, filename, package):
     """
     Raises an exception if the given include file can not be found.
     """
+    if sys.platform == 'win32':
+        include_dirs.extend(os.getenv('INCLUDE', '.').split(';'))
     if not has_include_file(include_dirs, filename):
         raise CheckFailed(
             "The C/C++ header for %s (%s) could not be found.  You "
@@ -1272,24 +1274,6 @@ class BackendGtk(OptionalBackendPackage):

     def add_flags(self, ext):
         if sys.platform == 'win32':
-            # popen broken on my win32 plaform so I can't use pkgconfig
-            ext.library_dirs.extend(
-                ['C:/GTK/bin', 'C:/GTK/lib'])
-
-            ext.include_dirs.extend(
-                ['win32_static/include/pygtk-2.0',
-                 'C:/GTK/include',
-                 'C:/GTK/include/gobject',
-                 'C:/GTK/include/gext',
-                 'C:/GTK/include/glib',
-                 'C:/GTK/include/pango',
-                 'C:/GTK/include/atk',
-                 'C:/GTK/include/X11',
-                 'C:/GTK/include/cairo',
-                 'C:/GTK/include/gdk',
-                 'C:/GTK/include/gdk-pixbuf',
-                 'C:/GTK/include/gtk',
-                 ])

             def getoutput(s):
                 ret = os.popen(s).read().strip()
@@ -1297,7 +1281,26 @@ class BackendGtk(OptionalBackendPackage):

             if 'PKG_CONFIG_PATH' not in os.environ:
                 # If Gtk+ is installed, pkg-config is required to be installed
-                os.environ['PKG_CONFIG_PATH'] = 'C:\GTK\lib\pkgconfig'
+                os.environ['PKG_CONFIG_PATH'] = 'C:\\GTK\\lib\\pkgconfig'
+
+                # popen broken on my win32 plaform so I can't use pkgconfig
+                ext.library_dirs.extend(
+                    ['C:/GTK/bin', 'C:/GTK/lib'])
+
+                ext.include_dirs.extend(
+                    ['win32_static/include/pygtk-2.0',
+                     'C:/GTK/include',
+                     'C:/GTK/include/gobject',
+                     'C:/GTK/include/gext',
+                     'C:/GTK/include/glib',
+                     'C:/GTK/include/pango',
+                     'C:/GTK/include/atk',
+                     'C:/GTK/include/X11',
+                     'C:/GTK/include/cairo',
+                     'C:/GTK/include/gdk',
+                     'C:/GTK/include/gdk-pixbuf',
+                     'C:/GTK/include/gtk',
+                     ])

             pygtkIncludes = getoutput(
                 'pkg-config --cflags-only-I pygtk-2.0').split()

@mdboom
Copy link
Member Author

mdboom commented Dec 7, 2012

@cgohlke: Thanks -- that patch is an improvement over all. I think it probably also makes sense to turn pkgconfig on for everything on Windows, with fallbacks, of course.

@cgohlke
Copy link
Contributor

cgohlke commented Dec 7, 2012

I'm not sure turning on pkgconfig for everything on Windows is a good idea. At least on my system, GTK and associates (PyGTK, PyGObject, PyCairo) are the only packages supporting pkgconfig. If pkgconfig is always on, freetype2 and libpng headers and libraries will be picked up from the GTK directories, but these libraries are outdated and inferior to my libpng and freetype2 builds, which are found through INCLUDE and LIB environment variables.

@mdboom
Copy link
Member Author

mdboom commented Dec 17, 2012

@cgohlke: In your patch above, why is INCLUDE explicitly handled? Isn't that handled by the Microsoft compiler directly? Any idea why popen isn't working for you? I'd rather fix that in the general case than have to one-off hard code paths here -- but maybe that's just not possible.

@cgohlke
Copy link
Contributor

cgohlke commented Dec 17, 2012

@mdboom: On my system matplotlib's setupext.py does not find some header files and cancels, unless the INCLUDE paths are added to the include_dir list. The Microsoft Compiler do find the headers and libraries in the paths specified by INCLUDE and LIB environment variables without matplotlib specifying those paths again.
Regarding popen: it does work for me. pkgconfig does work too. The comment # popen broken on my win32 plaform so I can't use pkgconfig and the hard coded GTK paths are from someone else. I just moved it to another code path, as a fallback, but frankly I would remove it.

@mdboom
Copy link
Member Author

mdboom commented Dec 17, 2012

@cgohlke: Ok. Can you test this branch again? I think I've integrated what you've proposed (in a slightly different way) by handling the 'INCLUDE' environment variable for everything.

@cgohlke
Copy link
Contributor

cgohlke commented Dec 18, 2012

This does not work. get_base_dirs._basedir_map gets corrupted during runtime and the headers are not found in the include sub-directory of the base_dirs. What's wrong with my original patch?

@mdboom
Copy link
Member Author

mdboom commented Dec 19, 2012

My concern with the original patch is that it special cases Gtk. If we're looking for headers in INCLUDE on Windows, we should do it on all packages.

How is get_base_dirs._basedir_map being corrupted?

Do you have instructions on setting up the build environment you are using? I have this working myself on Windows, but I'm sure my environment is not set up in the same way as yours, and there may be ways in which the binaries I'm producing are not as good... But unless I recreate your environment, I have to back-and-forth things with you like this, obviously.

@pelson
Copy link
Member

pelson commented Feb 26, 2013

@mdboom - Are all of the changes necessary at the same time? For instance, can PyCXX be updated in a separate PR? Can we remove Pytz & dateutil in a separate PR? There are changes here which surprise me a little (changes to some python 3 ifdef blocks for instance) - are they needed in this PR?

I'm all in favour of the re-tool, it's just impossible to see the wood from the trees as it stands. The only other feasible way to move this PR forward without reducing the size of the change, AFAICS, is to hit merge and see what happens...

@mdboom
Copy link
Member Author

mdboom commented Feb 26, 2013

The changes for review are almost entirely in setup.py, setupext.py, .travis.yml, INSTALL, setup.cfg.template and doc/faq/installing_faq.html. You can use the "Show Diff Stats" button at the top of the "Files Changed" page to navigate around. But for this PR, more than reviewing the code, it's important to test this on as many platforms and configurations as we can.

Unfortunately, it's hard to separate these things out, because once you say "we support building with external libraries", you have to make the changes so that our code compiles with vanilla copies of those libraries (hence the changes to _backend_agg.cpp, for example, and the updating of PyCXX.). If you do it piecemeal, you'd end up with a master that wouldn't build, because we'd have different behavior in the system libraries and the ones we ship.

The only changes to Py3 ifdef blocks that I see are within PyCXX, which is simply an updating to the current upstream version -- we shouldn't need to review those changes for style, etc., only confirm that they work by building, running our test suite etc.

About the only thing that could be reduced here, I think, is removing dateutil and pytz later -- but again, I think ideally that removal should all be tested together. It's important to know that after removing those files from our repo that matplotlib still installs and works.

@pelson
Copy link
Member

pelson commented Feb 26, 2013

Ok. Thanks @mdboom .

FWIW this works well on OSX 10.8 (Mountain Lion) for python2.7.

The only other feasible way to move this PR forward without reducing the size of the change, AFAICS, is to hit merge and see what happens...

On the basis of your last post, I'd suggest we merge this and fix any fallout that might ensue.

mdboom added a commit that referenced this pull request Feb 26, 2013
Retool the setup.py infrastructure
@mdboom mdboom merged commit 7940245 into matplotlib:master Feb 26, 2013
@dmcdougall
Copy link
Member

Does this change utilise pip install to find and install dependencies? I should point out that it has recently been discovered that pip on an untrusted network is not safe.

It might be worth shipping a warning with this.

@mdboom
Copy link
Member Author

mdboom commented Mar 1, 2013

Yes, it does. The fact that we ship copies of so many libraries has long been a problem. There were many good reasons to stop doing that, but also a strong desire to not make the installation from source (or pip) more inconvenient than it already is. pip is, unfortunately, the tool we have to do that.

The post you linked to is not as alarming as it sounds -- why is he running pip as root? Now that pip has virtualenv integration, it's unnecessary. The fact that pip doesn't sign packages still means that you could get malicious software installed, but at least it won't have root privileges. But I can't think that installing unsigned packages is much more dangerous than asking users to download and install those packages manually, as it's unlikely those packages' hashes will be checked either.

Maybe we could:

a) if the dependencies are missing and distribute is about to download them on our behalf...
b) check if the script is running as root...
c) if so, display a warning message about the security hazard of doing so and suggesting virtualenv, or installing the required dependencies manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants