Allow interpreter to execute a zip file #45107
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = 'https://github.com/ncoghlan'
closed_at = <Date 2007-11-19.18:39:55.770>
created_at = <Date 2007-06-19.03:40:37.000>
labels = ['interpreter-core']
title = 'Allow interpreter to execute a zip file'
updated_at = <Date 2008-02-23.05:56:24.830>
user = 'https://bugs.python.org/andy-chu'
activity = <Date 2008-02-23.05:56:24.830>
actor = 'ncoghlan'
assignee = 'ncoghlan'
closed = True
closed_date = <Date 2007-11-19.18:39:55.770>
closer = 'gvanrossum'
components = ['Interpreter Core']
creation = <Date 2007-06-19.03:40:37.000>
creator = 'andy-chu'
dependencies =
files = ['8053', '8054', '8055', '8056', '8335', '8770']
hgrepos =
issue_num = 1739468
keywords = ['patch']
message_count = 26.0
messages = ['52772', '52773', '52774', '52775', '52776', '52777', '52778', '52779', '52780', '52781', '52782', '52783', '55338', '55824', '55826', '55855', '57605', '57613', '57614', '57634', '57637', '57641', '57645', '57647', '57650', '62719']
nosy_count = 7.0
nosy_names = ['gvanrossum', 'loewis', 'paul.moore', 'pje', 'ncoghlan', 'andy-chu', 'alexandre.vassalotti']
pr_nums =
priority = 'normal'
resolution = 'accepted'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1739468'
versions = ['Python 2.6', 'Python 3.0']
The motivation for this is that distributing a zip file is a very light and easy way to distribute a python program with multiple packages/modules. I have done this on Linux, Mac and Windows and it works very nicely -- except that you need a few extra files to bootstrap it: set PYTHONPATH to the zip file and run the main function.
With this small patch, you get rid of the need for extra files. At the bottom is a demo on Linux.
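For illustration, the manual bootstrap described above (point PYTHONPATH at the zip file, then call the main function) might look like the following in Python. This is only a sketch: the module name `foo`, its `main()` function, and the helper names are assumptions, not taken from the patch.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Write a zip holding one hypothetical module, foo.py, and return its path."""
    zip_path = os.path.join(directory, "foo_exe.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(
            "foo.py",
            "import sys\n\ndef main():\n    print('args:', sys.argv[1:])\n",
        )
    return zip_path


def run_with_pythonpath(zip_path, args):
    """Do by hand what the extra bootstrap files do: set PYTHONPATH, call main()."""
    env = dict(os.environ, PYTHONPATH=zip_path)
    cmd = [sys.executable, "-c", "import foo; foo.main()", *args]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    print(run_with_pythonpath(build_demo_zip(tmp), ["foo", "bar"]), end="")
```

The wrapper command is exactly the kind of extra file the proposal wants to make unnecessary.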
On Windows, you can do a similar thing by making a file that is both a zip file and a batch file. The batch file will pass %~f0 (itself) to the -z flag of the Python interpreter.
I ran this by Guido and he seemed to think it was a fine idea. At Google, we have numerous platform-specific hacks in a program called "autopar" to solve this problem.
I have included the basic patch, but if you guys agree with this, I will add some tests and documentation. And I think it might be useful to include something in the Tools/ directory to do what update_zip.sh does below (add a __zipmain__ module and a shebang/batch file header to a zip file, to make it executable)?
I think this may also help to fix a bug with eggs:
IMPORTANT NOTE: Eggs with an "eggsecutable" header cannot be renamed, or invoked via symlinks. They must be invoked using their original filename, in order to ensure that, once running, pkg_resources will know what project and version is in use. The header script will check this and exit with an error if the .egg file has been renamed or is invoked via a symlink that changes its base name.
andychu testdata$ ls
# The main program you're going to run in "development mode"
andychu testdata$ ./foo.py foo bar
# Same program, packaged into a zip file
andychu testdata$ ./foo_exe.zip foo bar
# Contents of the zip file
andychu testdata$ unzip -l foo_exe.zip
# Demo script to build an executable zip file.
andychu testdata$ cat header.txt
andychu testdata$ cat update_zip.sh
# Make a regular zip file.
# Add a shebang line to it.
# Make it executable.
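The three build steps named in the comments above (make a regular zip, add a shebang line, make it executable) can be sketched in Python. The contents of header.txt and update_zip.sh are not shown here, so the header value below is an assumed placeholder, and `make_executable_zip` is a hypothetical helper, not the script from the patch.

```python
import io
import os
import stat
import tempfile
import zipfile


def make_executable_zip(target, modules, header=b"#!/usr/bin/env python\n"):
    """Build a zip from {archive_name: source}, prefixed by an assumed shebang header."""
    # Step 1: make a regular zip file (held in memory here).
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, source in modules.items():
            zf.writestr(name, source)
    # Step 2: put the shebang line in front of the zip data. Zip readers find
    # the archive index from the end of the file, and Python's zipfile module
    # accepts data prepended to the archive.
    with open(target, "wb") as f:
        f.write(header)
        f.write(buf.getvalue())
    # Step 3: make it executable (has little effect on Windows).
    mode = os.stat(target).st_mode
    os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target


with tempfile.TemporaryDirectory() as tmp:
    path = make_executable_zip(
        os.path.join(tmp, "foo_exe.zip"),
        {"__zipmain__.py": "print('hello from the zip')\n"},
    )
    print(zipfile.is_zipfile(path))
```

Whether the resulting file actually runs when invoked directly depends on the interpreter named in the header supporting zip execution, which is what this issue proposes.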
I like the general idea, but it should be possible to use runpy.run_module to get __name__ set correctly (as that is what happens when you execute a module from a zipfile with -m). Another advantage of using run_module is that it would allow runzip() to take a second argument (possibly defaulting to "__zipmain__") which would specify the module to be executed from the zipfile (the remaining 3 run_module arguments could also be passed in, and set appropriately from main.c).
Adding the new function as runpy.run_zip() (instead of adding a new module) would also be good.
For Windows, an alternative to making the zip file both a batch and a zip file would be to adopt a .pyz extension convention for these files - the file associations can then be set up to invoke the script appropriately with python -z (similar to the way that .pyw files are associated with pythonw instead of the standard python executable). That way the same file could be executed on both Linux (via an embedded shebang line) and on Windows (via filename association), as is the case with standard .py Python scripts.
My final question is whether the change to sys.path should be reverted once the module execution is complete - my suspicion is that it should, but I need to look into it a bit more before giving a definite answer (for the command line flag case, this behaviour obviously doesn't matter - it is only significant if the Python method is invoked directly in the context of a larger program).
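A minimal sketch of the suggestion above, assuming a `run_zip` function built on `runpy.run_module` that restores `sys.path` once the module has finished. The signature and the restore strategy are illustrative assumptions, not the code in the attached patches.

```python
import importlib
import os
import runpy
import sys
import tempfile
import zipfile


def run_zip(zip_path, mod_name="__zipmain__", run_name="__main__"):
    """Run mod_name from zip_path via runpy, then restore sys.path."""
    saved_path = list(sys.path)
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    try:
        # run_name controls the __name__ the executed module sees.
        return runpy.run_module(mod_name, run_name=run_name)
    finally:
        # Revert the sys.path change so a larger host program is unaffected.
        sys.path[:] = saved_path


with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("__zipmain__.py", "NAME = __name__\n")
    path_before = list(sys.path)
    namespace = run_zip(demo_zip)
    print(namespace["NAME"], sys.path == path_before)
```

Restoring the whole list rather than popping one entry is a design choice for the sketch: it also undoes any path changes made by the executed module itself, which may or may not be desirable.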
Nick, I've updated the code to use a new runpy.run_zip function, which calls run_module. This does make it a bit cleaner.
Let me know what you think. If the code is good I'll write some tests and documentation.
Also, I'm not sure if the '-c' is really appropriate in sys.argv[0], but that seems to be what the -m flag uses. It seems like it might make sense to have sys.argv[0] be the zip file, if it is really a first class executable.
And I think a script to build one of these files would be appropriate, which I can add. You could pass it the main module and main function, and it would generate a __zipmain__ stub and add it to the zip file. And it is a good idea if the file is cross platform, so a .pyz extension would work.
Sorry for the delayed response, I was a bit busy at work this week... but I'll respond sooner this time. : )
andychu trunk$ testdata/foo_exe.zip foo bar
File Added: runzip7.diff
Here is a script that documents how to make such files. I think the important part is just documenting the format. Then people can write whatever tools they need around it. Many people could get by with this simple tool, but others might want something more elaborate.
andychu testprog$ find
andychu testprog$ find -name "*.py" | xargs ../Tools/scripts/makepyz.py -a zip,pyz,unix -z foo.zip -p package1 -m foo -y /usr/local/google/clients/python/trunk/python
andychu testprog$ ./foo.zip
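makepyz.py itself is not reproduced here. As a rough modern analogue, the standard library's `zipapp` module (added in Python 3.5, long after this thread) builds such an archive from a source directory, a shebang interpreter, and a `package.module:function` entry point. The package and function names below mirror the `-p package1 -m foo` arguments in the demo but are otherwise assumptions, and `zipapp` generates a `__main__.py` stub rather than `__zipmain__`.

```python
import os
import subprocess
import sys
import tempfile
import zipapp


def build_pyz(workdir):
    """Create a tiny hypothetical package and bundle it into foo.pyz."""
    src = os.path.join(workdir, "src")
    pkg = os.path.join(src, "package1")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w"):
        pass
    with open(os.path.join(pkg, "foo.py"), "w") as f:
        f.write("def main():\n    print('hello from package1.foo')\n")
    target = os.path.join(workdir, "foo.pyz")
    # main= makes zipapp generate the stub that imports and calls the function.
    zipapp.create_archive(
        src,
        target,
        interpreter="/usr/bin/env python3",
        main="package1.foo:main",
    )
    return target


with tempfile.TemporaryDirectory() as tmp:
    pyz = build_pyz(tmp)
    done = subprocess.run([sys.executable, pyz], capture_output=True, text=True)
    print(done.stdout, end="")
```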
File Added: makepyz.py
The new patch looks much better - the only thing is that run_zip needs to do sys.path.pop(0) to correctly remove the zipfile from the front of the path.
However, I do see your point about whether or not including the current directory on sys.path is the right thing to do for this case - it may be better to set <zipfile_name>/__zipmain__.py as argv[0] before invoking PySys_SetArgv, and then use __zipmain__ as the module to be executed on the same code path as the -m switch normally uses.
Rather than continuing this discussion here on SF, it may be best to post your proposal to python-dev. I personally like the idea, but a new idiom for running Python scripts will need broader support than just me. Getting input from the py2exe and py2app folks that can be found on python-dev would also be good.
Good point, however I decided to set sys.path and sys.argv a little differently, based on some more testing, as you can see explained in the new patch I just uploaded.
Those are details; I'll post to python-dev and see what people think of the general idea. If it's accepted then we can figure out the details. For now I made the function very specific to the -z flag.
I'm not sure I have a use case for invoking a zip file from another python module. If we were to put that back in, it might be better to have 2 separate functions anyway, since this one is only 3 lines basically.
File Added: runzip8.diff
I don't see the need for that on Linux: you can do the same thing already with a shell script.
martin@mira:~$ cat runzip.sh
So unless that adds a functionality that I'm missing, I'm -1 on this patch.
I like the -z option - I'm in favour of that as it stands (you need to add documentation). This is what the patch covers, and I'd like to see it implemented as is.
The helper script is useful, but not essential. To include it in the distribution, you'd have to consider how to deploy it: module executable via -m, .py file in the Scripts directory, shell script/.bat file in the Scripts directory. Of these, only a module using -m is really portable. It may be easier just to have it as sample code in the documentation which can be cut and pasted as required. (That's what I'd recommend).
For Windows, if you expect to define a file extension for these files, you need to consider console vs GUI issues. File extensions are more useful in a GUI context, so maybe .pyz files should be executed with "pythonw -z". Or maybe there should be 2 extensions, .pyz (console) and .pwz (GUI)? I don't have an answer to this, and honestly, if there's any controversy, I wouldn't bother, but just leave it to the user to decide and implement a local solution (much as Python doesn't add its directory to %PATH%). If you wanted to define a standard, you'd need patches to the Windows MSI builder to implement it.
Patch implementing an alternate approach: support automatically
I like PJE's approach, and the patch works for me.
About the only thing I'd change is to switch the expression in
An optimising compiler is going to produce similar code either way, and
Adding a simple test of the functionality to test_cmd_line would also be
PJE's patch looks good to me too.
Attached an updated version of PJE's patch with the suggested cleanups
The basic tests and the directory tests are currently working, but for
I'm posting the patch anyway to see if anyone else can spot where it's
Actually the failures aren't OSX-specific:
Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 117, in test_directory
    self._check_script(script_dir, script_name, script_dir)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError: /usr/local/google/home/guido/python/py3k/python: '/tmp/tmpLGqOxc' is a directory, cannot continue

Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 124, in test_directory_compiled
    self._check_script(script_dir, compiled_name, script_dir)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError: /usr/local/google/home/guido/python/py3k/python: '/tmp/tmprNwPih' is a directory, cannot continue

Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 130, in test_zipfile
    self._check_script(zip_name, None, zip_name)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError:   File "/tmp/tmpInCAJO/test_zip.zip", line 1
    PK# statements being executed
    ^
SyntaxError: invalid syntax
[25429 refs]

Traceback (most recent call last):
  File "Lib/test/test_cmd_line_script.py", line 137, in test_zipfile_compiled
    self._check_script(zip_name, None, zip_name)
  File "Lib/test/test_cmd_line_script.py", line 96, in _check_script
    self.assertEqual(exit_code, 0, data)
AssertionError:   File "/tmp/tmpqh6g1C/test_zip.zip", line 1
SyntaxError: Non-UTF-8 code starting with '\xc8' in file /tmp/tmpqh6g1C/test_zip.zip on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
[25428 refs]
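In the failures above, the interpreter treats the directory or zip file as an ordinary script. On an interpreter where the feature works (it shipped in Python 2.6 and 3.0), a check in the spirit of `_check_script` can be sketched as below. The helper names are hypothetical, and the `__main__.py` entry point is the name used by released Python versions; this thread does not show what the test files contained.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

SCRIPT = "print('ran as', __name__)\n"


def run_target(path):
    """Run a directory or zip file as the script argument; return (exit code, output)."""
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def check_directory_and_zip(workdir):
    """Build a script directory and a zip, each with __main__.py, and run both."""
    script_dir = os.path.join(workdir, "script_dir")
    os.mkdir(script_dir)
    with open(os.path.join(script_dir, "__main__.py"), "w") as f:
        f.write(SCRIPT)
    zip_name = os.path.join(workdir, "test_zip.zip")
    with zipfile.ZipFile(zip_name, "w") as zf:
        zf.writestr("__main__.py", SCRIPT)
    return run_target(script_dir), run_target(zip_name)


with tempfile.TemporaryDirectory() as tmp:
    for exit_code, output in check_directory_and_zip(tmp):
        print(exit_code, output)
```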