Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMP decompression #38

Conversation

LennartNoordsij
Copy link
Contributor

This patch adds support in zfp for OpenMP decompression.
It does that by generating an "offset table" during the compression
phase, and then using that table for performing parallel variable-rate
decoding. The patch breaks the following tests:

  • testOmp: The test is broken because the behavior of chunk_size is
    changed. We need to align with the maintainers on what to
    do regarding that.

The patch has been tested locally for 1D, 2D, 3D (4D is not tested yet).
The tests will be added in a follow up once we align on the implementation.

LennartNoordsij and others added 2 commits January 9, 2019 16:14
Merge changes from upstream
This patch adds support in zfp for OpenMP decompression.
It does that by generating an "offset table" during the compression
phase, and then using that table for performing parallel variable-rate
decoding. The patch breaks the following tests:

- testOmp: The test is broken because the behavior of chunk_size is
           changed. We need to align with the maintainers on what to
           do regarding that.

The patch has been tested locally for 1D, 2D, 3D (4D is not tested yet).
The tests will be added in a follow up once we align on the implementation.
include/zfp.h Outdated
@@ -145,6 +145,7 @@ typedef struct {
int minexp; /* minimum floating point bit plane number to store */
bitstream* stream; /* compressed bit stream */
zfp_execution exec; /* execution policy and parameters */
unsigned long long * offset_table; /* table with the chunk offsets in bits */
Copy link
Contributor

@mohamed mohamed Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not size_t?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. unsigned long long is not C89 compliant. If this table is to be stored in the file, then a fixed-width type, such as uint64, would be more portable. size_t could be 32 or 64 bits, for instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I will change this to be a fixed width type. uint64_t is good for now.

@@ -81,6 +81,9 @@ size_t stream_flush(bitstream* stream);
/* copy n bits from one bit stream to another */
void stream_copy(bitstream* dst, bitstream* src, size_t n);

/* return pointer to start of data array used by this bitstream */
void * stream_begin(bitstream* s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the return type to const void *

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this function differ from stream_data()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not different, I must have missed that function. I will remove this function and use stream_data.

{
uint i;
void * buffer = stream_begin(zfp_stream_bit_stream(stream));
bitstream** bs = (bitstream**)malloc(chunks * sizeof(bitstream*));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if malloc fails? You are not checking if bs is not NULL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I believe this is the only place in zfp where malloc is not being checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This malloc is taken from line 51 of compress_init_par, where it is also unchecked. I will include a check for this in my function.
I will also raise an issue for the unchecked mallocs in compress_init_par.

utils/zfp.c Outdated
/* TODO: think of a clean way to check for offset header, possibly execution policy */
if (chunk_size) {
/* write the offset header to the file */
if (fwrite(offset_table, sizeof(unsigned long long), chunks, file) != chunks) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to align with @lindstro and @mclarsen on how to handle writing the offset table. Few questions that pop up:

  • Should it go to the header or a separate file?
  • If it goes to the header, do we version the header to preserve backwards-compatibility?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break compatibility in the sense that existing code will not be able to read a file that stores these offsets. We are planning on making several changes to the compressed format for our September release, and that would be the right time to address this. I'm afraid we can't officially adopt changes like this until then.

Perhaps we can make an experimental branch for evaluating parallel decompression and experimenting with different ways of encoding the offset table. For instance, I had a discussion with @LennartNoordsij about alternative solutions that are not based on storing offsets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An experimental branch seems like the best way to go forward for now. I will also add my CUDA decompression implementation which is based on the same offset table implementation as this OpenMP version. I will combine them both in a branch "Parallel decompression".

Copy link
Contributor

@mohamed mohamed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I leave the task of reviewing ompdecompress.c to the experts @lindstro and @mclarsen

Lennart Noordsij added 3 commits January 10, 2019 11:12
Default behavior of chunk_size is restored for fixed rate. Fixed accuracy and precision now give an error when chunk_size and threads are both 0, or a warning when only threads is nonzero
Fixed rate compression and decompression now work without the offset header
Original functions reintroduced, replacing the inline functionality based on chunk_size
Merged upstream patch for cygwin endtoend testing
testOmp is still not passing, as it expects zeros on output for OpenMP fixed accuracy and precision
@LennartNoordsij
Copy link
Contributor Author

Closed until the design objective and implementation are aligned

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants