Skip to content
This repository has been archived by the owner on Feb 25, 2018. It is now read-only.

Commit

Permalink
allow preloading data by putting it directly in the index
Browse files Browse the repository at this point in the history
  • Loading branch information
rfk committed Aug 23, 2010
1 parent 2b43748 commit 1fe1dd8
Show file tree
Hide file tree
Showing 2 changed files with 61 additions and 17 deletions.
30 changes: 23 additions & 7 deletions README.txt
Original file line number Diff line number Diff line change
Expand Up @@ -27,17 +27,33 @@ To create an index for a given zipfile, do the following::
from zipimportx import zipimporter
zipimporter("mylib.zip").write_index()

Depending on your platform, this will create either "mylib.zip.win32.idx" or
"mylib.zip.posix.idx" containing the pre-parsed zipfile directory information.
(Specifically, it will contain a marshalled dictionary similar to those found
in zipimport._zip_directory_cache.)
This will create the file "mylib.zip.idx" containing the pre-parsed zipfile
directory information (specifically, it will contain a marshalled dictionary
similar to those found in zipimport._zip_directory_cache).

In my tests, use of these indexes speeds up the initial loading of a zipfile by
about a factor of 3 on Linux, and a factor of 5 on Windows.

Note that this package uses nothing but builtin modules. To bootstrap zipfile
imports for a frozen application, you can inline the module's code directly
into your application's startup script. Do this somewhere in your build::

To further speed up the loading of a collection of modules, you can "preload"
the actual module data by including it in the index. This allows the data for
several modules to be loaded in a single sequential read rather than requiring
a separate read for each module. Preload module data like this::

from zipimportx import zipimporter
zipimporter("mylib.zip").write_index(preload=["mymod*","mypkg*"])


Note that imports will almost certainly *break* if the index does not reflect
the actual contents of the zipfile. This module is therefore most useful
for frozen apps and other situations where the zipfile is not expected to
change.


Note also that this package uses nothing but builtin modules. To bootstrap
zipfile imports for a frozen application, you can inline this module's code
directly into your application's startup script. Simply do something like
this in your build process::

import zipimportx
import inspect
Expand Down
48 changes: 38 additions & 10 deletions zipimportx/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,17 +29,33 @@
from zipimportx import zipimporter
zipimporter("mylib.zip").write_index()
Depending on your platform, this will create either "mylib.zip.win32.idx" or
"mylib.zip.posix.idx" containing the pre-parsed zipfile directory information.
(Specifically, it will contain a marshalled dictionary similar to those found
in zipimport._zip_directory_cache.)
This will create the file "mylib.zip.idx" containing the pre-parsed zipfile
directory information (specifically, it will contain a marshalled dictionary
similar to those found in zipimport._zip_directory_cache).
In my tests, use of these indexes speeds up the initial loading of a zipfile by
about a factor of 3 on Linux, and a factor of 5 on Windows.
Note that this package uses nothing but builtin modules. To bootstrap zipfile
imports for a frozen application, you can inline the module's code directly
into your application's startup script. Do this somewhere in your build::
To further speed up the loading of a collection of modules, you can "preload"
the actual module data by including it in the index. This allows the data for
several modules to be loaded in a single sequential read rather than requiring
a separate read for each module. Preload module data like this::
from zipimportx import zipimporter
zipimporter("mylib.zip").write_index(preload=["mymod*","mypkg*"])
Note that imports will almost certainly *break* if the index does not reflect
the actual contents of the zipfile. This module is therefore most useful
for frozen apps and other situations where the zipfile is not expected to
change.
Note also that this package uses nothing but builtin modules. To bootstrap
zipfile imports for a frozen application, you can inline this module's code
directly into your application's startup script. Simply do something like
this in your build process::
import zipimportx
import inspect
Expand Down Expand Up @@ -288,7 +304,7 @@ def _check_mtime(self,mtbytes,srctoc):
diff = -1*diff
return (diff <= 1)

def _get_data(self,path,toc=None):
def _get_data(self,path,toc=None,raw=False):
"""Helper method to read the data for a given path.
The path must be relative to the archive root, i.e. be exactly as
Expand Down Expand Up @@ -327,10 +343,11 @@ def _get_data(self,path,toc=None):
finally:
zf.close()
# Decompress if necessary, and return the data.
if not compress:
if raw:
return raw_data
else:
if compress:
return zlib.decompress(raw_data,-15)
return raw_data

def find_module(self,fullname,path=None):
"""find_module(fullname, path=None) -> self or None.
Expand Down Expand Up @@ -462,6 +479,17 @@ def write_index(self,platform=None,preload=[]):
index = {}
for (key,info) in posix_index.iteritems():
index[key.replace("/","\\")] = info
# Add any preload data to the index
if preload:
import fnmatch # not a builtin, import only as needed
if isinstance(preload,basestring):
preload = [preload]
for (key,info) in index.iteritems():
for pattern in preload:
if fnmatch.fnmatch(key,pattern):
data = self._get_data(key,info,raw=True)
index[key] = tuple(list(info) + [data])
break
# Write out to the appropriately-named index file.
with open(self.archive + archive_index,"wb") as f:
marshal.dump(index,f)
Expand Down

0 comments on commit 1fe1dd8

Please sign in to comment.