Merge pull request #55 from mdboom/encoding-simple
Simple compression specification
mdboom committed Mar 10, 2015
2 parents 1ce028d + b85b73e commit 2f8a166
Showing 15 changed files with 56 additions and 8 deletions.
Binary file modified reference_files/0.1.0/ascii.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/basic.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/complex.asdf
Binary file not shown.
Binary file added reference_files/0.1.0/compressed.asdf
Binary file not shown.
15 changes: 15 additions & 0 deletions reference_files/0.1.0/compressed.yaml
@@ -0,0 +1,15 @@
+#ASDF 0.1.0
+%YAML 1.1
+%TAG ! tag:stsci.edu:asdf/0.1.0/
+--- !core/asdf
+data: !core/ndarray
+  data: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
+    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
+    41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
+    61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
+    81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
+    101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
+    117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
+  datatype: int64
+  shape: [128]
+...
Binary file modified reference_files/0.1.0/exploded0000.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/float.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/int.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/shared.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/stream.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/unicode_bmp.asdf
Binary file not shown.
Binary file modified reference_files/0.1.0/unicode_spp.asdf
Binary file not shown.
13 changes: 12 additions & 1 deletion reference_files/generate/generate
@@ -140,7 +140,18 @@ def ref_exploded(fd):
         'data': np.arange(8)
     }
 
-    with pyasdf.AsdfFile(tree).write_to(fd, exploded=True):
+    with pyasdf.AsdfFile(tree).write_to(fd, all_array_storage='external'):
         pass
 
 
+def ref_compressed(fd):
+    tree = {
+        'data': np.arange(128)
+    }
+
+    ff = pyasdf.AsdfFile(tree)
+    ff.set_array_compression(tree['data'], 'zlib')
+    with ff.write_to(fd):
+        pass
+
+
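To give a feel for what the new `compressed.asdf` reference block holds, here is a minimal standard-library sketch of the payload for `np.arange(128)` with datatype `int64`. It does not use pyasdf. Two things are assumptions rather than facts from this commit: little-endian byte order, and that the block payload is a plain zlib stream as produced by Python's `zlib` module.

```python
import struct
import zlib

# np.arange(128) with datatype int64 is 128 eight-byte integers.
# Assumption: little-endian byte order; the YAML in this commit does not state it.
raw = struct.pack("<128q", *range(128))

# Assumption: the block payload is a plain zlib stream, as produced by
# Python's zlib module.
packed = zlib.compress(raw)

print(len(raw))                        # decoded size in bytes: 128 * 8 = 1024
print(len(packed) < len(raw))          # whether compression saved space
print(zlib.decompress(packed) == raw)  # lossless round trip
```

In terms of the header fields changed in `source/file_layout.rst`, `len(raw)` plays the role of `data_size` and `len(packed)` the role of `used_size`.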
34 changes: 27 additions & 7 deletions source/file_layout.rst
@@ -164,29 +164,49 @@ Each block begins with the following header:
 - ``flags`` (32-bit unsigned integer, big-endian): A bit field
   containing flags (described below).
 
+- ``compression`` (4-byte byte string): The name of the compression
+  algorithm, if any. Should be ``\0\0\0\0`` to indicate no
+  compression. See :ref:`compression` for valid values.
+
 - ``allocated_size`` (64-bit unsigned integer, big-endian): The amount
   of space allocated for the block (not including the header), in
   bytes.
 
 - ``used_size`` (64-bit unsigned integer, big-endian): The amount of
-  used space for the block (not including the header), in bytes.
+  used space for the block on disk (not including the header), in
+  bytes.
 
+- ``data_size`` (64-bit unsigned integer, big-endian): The size of the
+  block when decoded, in bytes. If ``compression`` is all zeros
+  (indicating no compression), it **must** be equal to ``used_size``.
+  If compression is being used, this is the size of the decoded block
+  data.
+
 - ``checksum`` (64-bit unsigned integer, big-endian): An optional MD5
   checksum of the used data in the block. The special value of 0
   indicates that no checksum verification should be performed. *TBD*.
 
-- ``encoding`` (16-byte character string): A way to indicate how the
-  buffer is compressed or encoded. *TBD*.
-
 Flags
 ^^^^^
 
 The following bit flags are understood in the ``flags`` field:
 
 - ``STREAMED`` (0x1): When set, the block is in streaming mode, and it
-  extends to the end of the file. When set, the ``allocated_size``
-  and ``used_size`` fields are ignored. By necessity, any block with
-  the ``STREAMED`` bit set must be the last block in the file.
+  extends to the end of the file. When set, the ``allocated_size``,
+  ``used_size`` and ``data_size`` fields are ignored. By necessity,
+  any block with the ``STREAMED`` bit set must be the last block in
+  the file.
 
+.. _compression:
+
+Compression
+^^^^^^^^^^^
+
+Currently, only one block compression type is supported:
+
+- ``zlib``: The zlib lossless compression algorithm. It is widely
+  used, patent-unencumbered, and has an implementation released under
+  a permissive license in `zlib <http://www.zlib.net/>`__.
+
 Block content
 ^^^^^^^^^^^^^
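The header fields touched by this hunk can be sketched with Python's `struct` module. This is an illustration under stated assumptions, not the pyasdf implementation: it covers only the fields visible in the hunk (`flags` through `checksum`), omits whatever header fields precede `flags`, leaves `checksum` at 0, and assumes the compressed payload is a plain zlib stream. The names `FIELDS`, `pack_fields` and `unpack_fields` are hypothetical.

```python
import struct
import zlib

# flags, compression, allocated_size, used_size, data_size, checksum.
# Header fields that precede ``flags`` are outside this hunk and omitted here.
FIELDS = struct.Struct(">I4sQQQQ")

STREAMED = 0x1
NO_COMPRESSION = b"\0\0\0\0"


def pack_fields(payload, compression=NO_COMPRESSION, flags=0):
    """Return (packed_fields, on_disk_bytes) for a raw payload."""
    if compression == b"zlib":
        on_disk = zlib.compress(payload)
    elif compression == NO_COMPRESSION:
        on_disk = payload
    else:
        raise ValueError("unsupported compression: %r" % (compression,))
    packed = FIELDS.pack(
        flags,
        compression,
        len(on_disk),  # allocated_size: no spare space in this sketch
        len(on_disk),  # used_size: bytes on disk
        len(payload),  # data_size: bytes after decoding
        0,             # checksum: 0 means "do not verify"
    )
    return packed, on_disk


def unpack_fields(packed, on_disk):
    """Decode a block payload using the packed fields."""
    flags, compression, _allocated, used, data_size, _checksum = FIELDS.unpack(packed)
    streamed = bool(flags & STREAMED)
    # For streamed blocks the three size fields are ignored.
    body = on_disk if streamed else on_disk[:used]
    if compression == NO_COMPRESSION:
        decoded = body
    elif compression == b"zlib":
        decoded = zlib.decompress(body)
    else:
        raise ValueError("unsupported compression: %r" % (compression,))
    if not streamed and len(decoded) != data_size:
        raise ValueError("decoded length does not match data_size")
    return decoded
```

With no compression, `used_size` and `data_size` come out equal, which is the invariant the new `data_size` entry requires; with `zlib` they differ and `data_size` tells a reader how large a buffer the decoded block needs.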
2 changes: 2 additions & 0 deletions source/intro.rst
@@ -102,6 +102,8 @@ The ASDF format is built on top of a number of existing standards:
 - `VOUnits (Units in the VO)
   <http://www.ivoa.net/documents/VOUnits/index.html>`__
 
+- `Zlib Deflate compression <http://www.zlib.net/feldspar.html>`__
+
 .. [Thomas2015] Thomas, B., Jenness. T. et al. 2015, "The Future of
    Astronomical Data Formats I. Learning from FITS".
    Astronomy & Computing, in press, arXiv e-print: 1502.00996.