Initial commit
pR0Ps committed Jun 27, 2021
0 parents commit 72b2721
Showing 7 changed files with 2,228 additions and 0 deletions.
27 changes: 27 additions & 0 deletions .gitignore
@@ -0,0 +1,27 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
build/
develop-eggs/
dist/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Pytest
.coverage
.pytest_cache
201 changes: 201 additions & 0 deletions README.md
@@ -0,0 +1,201 @@
zipstream-ng
============
A modern and easy to use streamable zip file generator. It can package and stream many files and
folders on the fly without needing temporary files or excessive memory.

Includes the ability to calculate the total size of the stream before any data is actually added
(provided no compression is used). This makes it ideal for use in web applications since the total
size can be used to set the `Content-Length` header without having to generate the entire file first
(see examples below).
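
Why the size is predictable without compression can be illustrated with the standard library alone. The sketch below is not how zipstream-ng computes its size; it only shows that, for stored (uncompressed) entries written to a seekable buffer with no Zip64 or extra fields, the archive length follows from the file names and sizes. The helper names `predict_stored_size` and `build_stored_zip` are made up for this illustration, and a streaming writer would typically add further per-file records, so its constants would differ.

```python
import io
import zipfile

# Fixed record sizes defined by the zip format (no Zip64, no extra fields).
LOCAL_HEADER = 30         # precedes each file's data
CENTRAL_HEADER = 46       # one per file in the central directory
END_OF_CENTRAL_DIR = 22   # written once at the end of the archive


def predict_stored_size(files):
    """Predict the archive size from (name, size) pairs alone."""
    total = END_OF_CENTRAL_DIR
    for name, size in files:
        name_len = len(name.encode("utf-8"))
        total += LOCAL_HEADER + name_len + size   # header + name + raw data
        total += CENTRAL_HEADER + name_len        # central directory entry
    return total


def build_stored_zip(contents):
    """Build the archive in memory so the prediction can be checked."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in contents.items():
            zf.writestr(name, data)
    return buf.getvalue()


contents = {"a.txt": b"hello", "dir/b.bin": bytes(1000)}
predicted = predict_stored_size((n, len(d)) for n, d in contents.items())
actual = len(build_stored_zip(contents))
```

With compression enabled the compressed size of each file is only known after the data has been processed, which is why the total can't be computed up front in that case.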

Other features:
- Flexible API: Typical use cases are simple, complicated ones are possible.
- Supports zipping data from files, as well as any iterable objects (including strings and bytes).
- Threadsafe: won't mangle data if multiple threads are adding files or reading from the stream.
- Includes a clone of Python's `http.server` module with zip support added. Try `python -m zipstream.server`.
- Automatically handles Zip64 extensions: uses them if required, doesn't if not.
- Automatically handles out of spec dates (clamps them to the range that zip files support).
- No external dependencies.
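
To make the date clamping concrete: zip timestamps can only represent years 1980 through 2107, so anything outside that range has to be pulled back in before it can be stored. The following is a minimal sketch of the idea, not zipstream-ng's actual code; `clamp_zip_date` and the two constants are hypothetical names chosen for this example. The bounds are the same ones the standard library applies in `ZipInfo.from_file(..., strict_timestamps=False)`.

```python
from zipfile import ZipInfo

# Earliest and latest timestamps a zip entry can carry.
ZIP_MIN_DATE = (1980, 1, 1, 0, 0, 0)
ZIP_MAX_DATE = (2107, 12, 31, 23, 59, 59)


def clamp_zip_date(date_time):
    """Clamp a (year, month, day, hour, minute, second) tuple to the zip range."""
    # Tuples compare element by element, so min/max order them chronologically.
    return max(ZIP_MIN_DATE, min(tuple(date_time[:6]), ZIP_MAX_DATE))


# A file dated at the Unix epoch would be rejected by ZipInfo as-is;
# after clamping it is accepted.
epoch = clamp_zip_date((1970, 1, 1, 0, 0, 0))
info = ZipInfo("old-file.txt", date_time=epoch)
```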


Installation
------------
```
pip install git+https://github.com/pR0Ps/zipstream-ng
```


Examples
--------

### zipserver (included)

A fully-functional and useful example can be found in the included
[`zipstream.server`](./zipstream/server.py) module. It's a clone of Python's built-in `http.server`
with the added ability to serve multiple files and folders as a single zip file. Try it out by
installing the package and running `zipserver --help` or `python -m zipstream.server --help`.


### Integration with Flask

A [Flask](https://flask.palletsprojects.com/)-based file server that serves the file or folder at
the requested path as a zip file:

```python
import os.path
from flask import Flask, Response
from zipstream import ZipStream

app = Flask(__name__)

@app.route('/<path:path>', methods=['GET'])
def stream_zip(path):
    name = os.path.basename(os.path.normpath(path))
    zs = ZipStream.from_path(path, sized=True)
    return Response(
        zs,
        mimetype="application/zip",
        headers={
            "content-disposition": f"attachment; filename={name}.zip",
            "content-length": len(zs),
        }
    )

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5000)
```


### Create a local zip file (the boring use case)

```python
from zipstream import ZipStream

zs = ZipStream.from_path("/path/to/files")
with open("files.zip", "wb") as f:
    f.writelines(zs)
```


### Partial generation and last-minute file additions

It's possible to generate the stream up to the last added file without finalizing it. Doing this
enables adding something like a file manifest or compression log after all the files have been added.
`ZipStream` provides a `get_info` function that returns information on all the files that have been
added to the stream. In this example, all that information will be added to the zip in a file named
"manifest.json" before it's finalized.

```python
from zipstream import ZipStream
import json

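def gen_zipfile():
    zs = ZipStream.from_path("/path/to/files")
    # Stream everything added so far without finalizing the archive
    yield from zs.all_files()
    # Add the manifest last, once the info for every other file is known
    zs.add(
        json.dumps(
            zs.get_info(),
            indent=2
        ),
        "manifest.json"
    )
    yield from zs.finalize()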
```


Comparison to stdlib
--------------------
Since Python 3.6 it has actually been possible to generate zip files as a stream using just the
standard library; it just hasn't been very ergonomic or efficient. Consider the typical use case of
zipping up a directory of files while streaming it over a network connection.

Note that the size of the stream is not pre-calculated in this case, as doing so would make the
stdlib example way too long.

Using ZipStream:
```python
from zipstream import ZipStream

send_stream(
    ZipStream.from_path("/path/to/files/")
)
```

<details>
<summary>The same(ish) functionality using just the stdlib:</summary>

```python
import os
import io
from zipfile import ZipFile, ZipInfo

class Stream(io.RawIOBase):
    """An unseekable stream for the ZipFile to write to"""

    def __init__(self):
        self._buffer = bytearray()
        self._closed = False

    def close(self):
        self._closed = True

    def write(self, b):
        if self._closed:
            raise ValueError("Can't write to a closed stream")
        self._buffer += b
        return len(b)

    def readall(self):
        chunk = bytes(self._buffer)
        self._buffer.clear()
        return chunk

def iter_files(path):
    for dirpath, _, files in os.walk(path, followlinks=True):
        if not files:
            yield dirpath  # Preserve empty directories
        for f in files:
            yield os.path.join(dirpath, f)

def read_file(path):
    with open(path, 'rb') as fp:
        while True:
            buf = fp.read(1024 * 64)
            if not buf:
                break
            yield buf

def generate_zipstream(path):
    stream = Stream()
    with ZipFile(stream, mode='w') as zf:
        toplevel = os.path.basename(os.path.normpath(path))
        for f in iter_files(path):
            # Use the basename of the path to set the arcname
            arcname = os.path.join(toplevel, os.path.relpath(f, path))
            zinfo = ZipInfo.from_file(f, arcname)

            # Write data to the zip file then yield the stream content
            with zf.open(zinfo, mode='w') as fp:
                if zinfo.is_dir():
                    continue
                for buf in read_file(f):
                    fp.write(buf)
                    yield stream.readall()
    yield stream.readall()

send_stream(
    generate_zipstream("/path/to/files/")
)
```
</details>
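
Both snippets above call `send_stream`, which is a placeholder and not part of zipstream-ng or the stdlib. One hypothetical way to write it, assuming the destination is any object with a binary `write` method (a socket file, a response body, an open file), might look like this. The optional `sink` parameter is an assumption added here so the function can be tried without a network connection.

```python
import io
import sys


def send_stream(chunks, sink=None):
    """Write each chunk to `sink` as it is produced and return the byte count."""
    if sink is None:
        sink = sys.stdout.buffer  # Assumed default, purely for illustration
    sent = 0
    for chunk in chunks:
        if chunk:  # Generators may yield empty chunks; skip them
            sink.write(chunk)
            sent += len(chunk)
    return sent


# Demo with an in-memory sink instead of a real connection
sink = io.BytesIO()
sent = send_stream(iter([b"PK", b"", b"data"]), sink)
```

Because the chunks are consumed one at a time, memory use stays bounded by the largest chunk rather than by the size of the whole archive.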


Tests
-----
This package contains extensive tests. To run them, install `pytest` (`pip install pytest`) and run
`pytest` in the project directory.


License
-------
Licensed under the [GNU LGPLv3](https://www.gnu.org/licenses/lgpl-3.0.html).
38 changes: 38 additions & 0 deletions setup.py
@@ -0,0 +1,38 @@
#!/usr/bin/env python

import contextlib
from setuptools import setup
import os.path


try:
    DIR = os.path.abspath(os.path.dirname(__file__))
    with open(os.path.join(DIR, "README.md"), encoding='utf-8') as f:
        long_description = f.read()
except Exception:
    long_description = None


setup(
    name="zipstream-ng",
    version="1.0.0",
    description="A modern and easy to use streamable zip file generator",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/pR0Ps/zipstream-ng",
    license="LGPLv3",
    classifiers=[
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Operating System :: OS Independent",
        "Topic :: System :: Archiving :: Compression",
        "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)"
    ],
    packages=["zipstream"],
    entry_points={
        "console_scripts": ["zipserver=zipstream.server:main"]
    },
    python_requires=">=3.7.0",
)
Empty file added tests/__init__.py
Empty file.
