Commit: added comments and macros for in-place (de)compression

Cyan4973 committed May 29, 2019 · 1 parent 4fc6b48 · commit b17f578
Showing 4 changed files with 83 additions and 10 deletions.
38 changes: 36 additions & 2 deletions doc/lz4_manual.html
@@ -40,9 +40,9 @@ <h1>1.9.1 Manual</h1>

Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
- This are required for compressed data to be self-contained and portable.
+ Embedding metadata is required for compressed data to be self-contained and portable.
  Frame format is delivered through a companion API, declared in lz4frame.h.
- Note that the `lz4` CLI can only manage frames.
+ The `lz4` CLI can only manage frames.
<BR></pre>

<a name="Chapter2"></a><h2>Version</h2><pre></pre>
@@ -357,6 +357,40 @@ <h1>1.9.1 Manual</h1>

</p></pre><BR>

<pre><b></b><p>
It's possible to have input and output sharing the same buffer,
for highly constrained memory environments.
In both cases, the input must lie at the end of the buffer,
and the buffer must include some margin, hence be larger than the final size.

This technique is more useful for decompression,
since the decompressed size is typically larger,
and the margin is mostly there to absorb the decoder's stripe overruns, so it stays short.

For compression though, the margin must cope with both
history preservation, which requires the input data to remain unmodified up to LZ4_DISTANCE_MAX bytes back,
and data expansion, which can happen when the input is not compressible.
As a consequence, buffer size requirements are much higher than the average compressed size,
hence memory savings are limited.

There are ways to limit this cost for compression:
- Reduce the history size, by lowering LZ4_DISTANCE_MAX.
  Lower values also reduce compression ratio, except when input_size < LZ4_DISTANCE_MAX,
  so it's a reasonable trick when inputs are known to be small.
- Require the compressor to deliver a "maximum compressed size".
  When this size is < LZ4_COMPRESSBOUND(inputSize), compression can fail,
  in which case the return code will be 0 (zero).
  The caller must be ready for this to happen,
  and typically designs a backup scheme to send the data uncompressed.
The combination of both techniques can significantly reduce
the amount of margin required for in-place compression.

</p></pre><BR>

<pre><b>#define LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) ( (decompressedSize) + LZ4_DECOMPRESS_INPLACE_MARGIN) </b>/**< note: presumes that compressedSize < decompressedSize */<b>
</b></pre><BR>
<pre><b>#define LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCompressedSize) ( (maxCompressedSize) + LZ4_COMPRESS_INPLACE_MARGIN) </b>/**< maxCompressedSize is generally LZ4_COMPRESSBOUND(inputSize), but can be set to any lower value, with the risk that compression can fail (return code 0(zero)) */<b>
</b></pre><BR>
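To make the buffer layout concrete, here is a minimal sketch of in-place decompression using the macro above. It assumes the decompressed size is known in advance and that compressedSize < decompressedSize; the helper name and error handling are illustrative, not part of the library (the in-place macros require LZ4_STATIC_LINKING_ONLY).

```c
#define LZ4_STATIC_LINKING_ONLY   /* the in-place macros live in the static-linking-only section */
#include "lz4.h"
#include <string.h>

/* Decompress `cSize` bytes of LZ4 block data in place.
 * `buffer` must hold at least LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(dSize) bytes:
 * the compressed data is moved to the end of the buffer, and decompression
 * writes from the beginning of that same buffer.
 * Returns the decompressed size, or a negative value on error. */
static int decompressInPlace(char* buffer, size_t bufferSize,
                             const char* compressed, size_t cSize, size_t dSize)
{
    char* const bufferEnd = buffer + bufferSize;
    if (bufferSize < (size_t)LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(dSize)) return -1;
    memmove(bufferEnd - cSize, compressed, cSize);          /* input lies at the end */
    return LZ4_decompress_safe(bufferEnd - cSize, buffer,   /* output starts at the beginning */
                               (int)cSize, (int)dSize);
}
```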
<a name="Chapter9"></a><h2>PRIVATE DEFINITIONS</h2><pre>
Do not use these definitions directly.
They are only exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
4 changes: 0 additions & 4 deletions lib/lz4.c
@@ -412,10 +412,6 @@ static const int LZ4_minLength = (MFLIMIT+1);
#define MB *(1 <<20)
#define GB *(1U<<30)

- #ifndef LZ4_DISTANCE_MAX /* can be user - defined at compile time */
- #  define LZ4_DISTANCE_MAX 65535
- #endif

#if (LZ4_DISTANCE_MAX > 65535) /* max supported by LZ4 format */
# error "LZ4_DISTANCE_MAX is too big : must be <= 65535"
#endif
48 changes: 46 additions & 2 deletions lib/lz4.h
@@ -65,9 +65,9 @@ extern "C" {
Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
- This are required for compressed data to be self-contained and portable.
+ Embedding metadata is required for compressed data to be self-contained and portable.
  Frame format is delivered through a companion API, declared in lz4frame.h.
- Note that the `lz4` CLI can only manage frames.
+ The `lz4` CLI can only manage frames.
*/

/*^***************************************************************
@@ -462,8 +462,51 @@ LZ4LIB_STATIC_API int LZ4_compress_fast_extState_fastReset (void* state, const c
*/
LZ4LIB_STATIC_API void LZ4_attach_dictionary(LZ4_stream_t* workingStream, const LZ4_stream_t* dictionaryStream);


/*! In-place compression and decompression
*
 * It's possible to have input and output sharing the same buffer,
 * for highly constrained memory environments.
 * In both cases, the input must lie at the end of the buffer,
 * and the buffer must include some margin, hence be larger than the final size.
 *
 * This technique is more useful for decompression,
 * since the decompressed size is typically larger,
 * and the margin is mostly there to absorb the decoder's stripe overruns, so it stays short.
 *
 * For compression though, the margin must cope with both
 * history preservation, which requires the input data to remain unmodified up to LZ4_DISTANCE_MAX bytes back,
 * and data expansion, which can happen when the input is not compressible.
 * As a consequence, buffer size requirements are much higher than the average compressed size,
 * hence memory savings are limited.
 *
 * There are ways to limit this cost for compression:
 * - Reduce the history size, by lowering LZ4_DISTANCE_MAX.
 *   Lower values also reduce compression ratio, except when input_size < LZ4_DISTANCE_MAX,
 *   so it's a reasonable trick when inputs are known to be small.
 * - Require the compressor to deliver a "maximum compressed size".
 *   When this size is < LZ4_COMPRESSBOUND(inputSize), compression can fail,
 *   in which case the return code will be 0 (zero).
 *   The caller must be ready for this to happen,
 *   and typically designs a backup scheme to send the data uncompressed.
 * The combination of both techniques can significantly reduce
 * the amount of margin required for in-place compression.
*/

#define LZ4_DECOMPRESS_INPLACE_MARGIN 32

@hsiangkao commented on May 30, 2019:

Hi Cyan,
I don't know if this is the same kind of in-place decompression as the EROFS case...
The EROFS use case is illustrated at https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/commit/?h=experimental&id=96f1f6f98564a01453ac8dc328b918a6415db42e.
As you can see, part of the input and output buffers overlap. That is because EROFS allocates several pages before decompression and reads the compressed data into the last page (which in most cases is completely full).
However, oend can be greater than ibegin, not ibegin >> oend as described above...

Thanks,
Gao Xiang

@hsiangkao commented on May 30, 2019:

More detail:

  1. some user program requests decompressed data from pages 6-9 (pages 6 and 9 are only partially requested);
  2. EROFS reads the compressed data into page 9 (since it is the last page; in short, EROFS will not, and does not get the chance to, allocate more buffers because of memory constraints);
  3. EROFS uses a decompress-in-place bit, set by erofs-mkfs, to decide whether it can decompress in place in such a case;
  4. if the decompress-in-place bit is not set, EROFS memcpy's the compressed data from the last page into a per-CPU buffer, and then does a traditional decompression;
  5. if the decompress-in-place bit is set, EROFS decompresses in place.
     My concern is that the decompress-in-place bit is set at mkfs time, but the kernel lz4 implementation could change in order to improve decompression speed, so the on-the-fly margins could change as well...
     The reason the decompress-in-place bit is set in mkfs is that we want to further improve the in-place success rate: when the formula expression is false, mkfs emulates the decompression process...

For the enwik9 dataset, EROFS has a 99.6% success rate with the lz4 1.8.3 margins...

@hsiangkao commented on May 30, 2019:

And I am not sure whether my understanding is correct...
If you mean any buffer size larger than LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE could do overlapped decompression.. I will investigate further....

@Cyan4973 (author) commented on May 30, 2019:

> If you mean any buffer size larger than LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE could do overlapped decompression

Yes, that's the intended meaning.
Maybe I could add some documentation to make sure it's correctly understood.

@Cyan4973 (author) commented on May 30, 2019:

I must admit that I'm not completely sure if I understand EROFS decompression stage correctly.

You mention that EROFS will allocate several pages (6-9 in your example), and start by writing compressed data into page 9.
This will work to decompress any page < 9, hence 6, 7 and 8. But it's not really "in-place", because output never overwrites the input.

However, for page 9 itself, this is less clear.
I suspect what you may mean by "margin" is that the page is not necessarily "full" after decompression.
In which case, the last bytes of the page can be considered a "margin", and if it's large enough, it makes page 9 in-place decompression possible. That would be correct.

But I'm speculating a lot here, and I'm not sure if I understood correctly what you meant.

@hsiangkao commented on May 30, 2019:

Hi Cyan,
It could be much better to add more words about it, because I tried to replace this line
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/tree/lib/compressor_lz4.c?h=experimental#n108
from `dst + dstsize + 18 < src + srcsize - ((srcsize - 14) / 15)` with something similar to LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE below, but I triggered a DBG_BUGON on the next line
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/tree/lib/compressor_lz4.c?h=experimental#n109...
I will think more about it...

@hsiangkao commented on May 30, 2019:

Let me give a bit more description about page 9.
At first (before decompression), page 9 is full of compressed data (meaning ~4k, in most cases exactly 4k).
The whole compressed data decompresses into pages 6, 7, 8 (you are right, no overlap there) and the beginning of page 9.
Page 9 is partially overwritten, from compressed data to decompressed data, on the fly while decompressing. That is our design... and the margin I mean is the necessary distance between iend and oend... I think it can be proved with some formula.

@hsiangkao commented on May 30, 2019:

> I suspect what you may mean by "margin" is that the page is not necessarily "full" after decompression.
> In which case, the last bytes of the page can be considered a "margin", and if it's large enough, it makes page 9 in-place decompression possible. That would be correct.

Yes, I think that is what I tried to say. After decompressing all the data, the last page is not "full" of decompressed data; oend is well before the page end, and that distance is what I call the margin. Otherwise I emulate the decompression to decide whether it is really safe to do overlapped decompression.

@Cyan4973 (author) commented on May 30, 2019:

> It could be much better to add more words about it

Sure, I completed the documentation in the inplace branch regarding both compression and decompression in-place.

> I tried to replace this line (...) to something similar to the below LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE, but I triggered a DBG_BUGON

Questions:

  • Can it happen that EROFS stores compressed data that is larger than the original data (data expansion)? This would indeed have an impact on how to calculate the margin.
  • What is the impact of DBG_BUGON? Is it some kind of assert(), i.e. the code should never reach that stage? Or just a trace?

@hsiangkao commented on May 30, 2019:

> • Can it happen that EROFS stores compressed data that is larger than the original data (data expansion)? This would indeed have an impact on how to calculate the margin.

The compressed data is smaller than the original data in all cases, because EROFS has its own index format. If there is no compression gain, the data is stored as an uncompressed 4k block (that is another path, the no-compression path).

> • What is the impact of DBG_BUGON? Is it some kind of assert(), i.e. the code should never reach that stage? Or just a trace?

I just want to verify that my formula is correct, since I have no idea whether it is absolutely correct and is the strict lower bound.

P.S. My English is not very good... sorry... Maybe I can quote the ATC19 paper for more details, which was written by other people, but it will only be published months from now...

Also, I want to optimize the decompress-in-place (DIP) mark: if the formula is correct, I can evaluate it directly in the kernel. If it is marked in mkfs, there could be some lz4 version mismatch... hmm...

@Cyan4973 (author) commented on May 30, 2019:

If I understand your explanations correctly, the only value you need for EROFS is LZ4_DECOMPRESS_INPLACE_MARGIN.
You just have to ensure that the distance between oend and iend is always >= the margin.

The margin was expanded from 16 bytes in v1.8.3 to 32 bytes in 1.9.0, as a consequence of using larger stripes during decompression.

Now, I think this margin is large enough for all cases, but it's only tested in a limited number of configurations, so there is always a chance that I might be wrong and forgot about some corner-case scenarios. It's likely that you will be able to test many more scenarios with EROFS than I can with my test program. If you ever find a problem with the proposed margin, I'll be glad to know, in order to update the formula and provide safer / more accurate margins.
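In code, the check being described here is roughly the following (a minimal sketch; the pointer names are illustrative, both ends are assumed to point into the same buffer, and the margin is the constant defined in this commit):

```c
#define LZ4_STATIC_LINKING_ONLY
#include "lz4.h"

/* In-place decompression is considered safe when the end of the compressed
 * input (iend) stays at least LZ4_DECOMPRESS_INPLACE_MARGIN bytes beyond the
 * end of the decompressed output (oend). */
static int inPlaceDecompressionIsSafe(const char* oend, const char* iend)
{
    return (iend - oend) >= LZ4_DECOMPRESS_INPLACE_MARGIN;
}
```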

@hsiangkao commented on May 30, 2019:

Yes, I will investigate more...

But I once considered the literal-length extra bytes, which are necessary for long literal runs. So the literal length can consume compressed data while producing no decompressed data at all.
I am not worried about matches (a 4-byte minmatch becomes 3 bytes of token + offset) or match-length extra bytes, because those have compression gain.

@hsiangkao commented on May 30, 2019:

My concern is this:
consider many sequences near the end of the compressed data, such as

abc..[15+255+255+255+255]..defg + 4-byte match
output: 15+255+255+255+255+4 = 1039 bytes of decompressed data
input: 1 (token) + 5 (extra literal-length bytes) + 15+255+255+255+255 = 1041 bytes of compressed data

Such sequences consume the "margin", which causes local data expansion.

However, the decompressed data as a whole is not expanded (i.e. decompressedSize > compressedSize still holds), since the other sequences are highly compressible. Is it possible to cause op > ip at some position?

I think the potential issue is the extra literal-length bytes, which could be taken into account in the formula...
However I have no idea how to construct the exact compressed data to prove that... Maybe I am wrong...
I have to sleep... If I am wrong, please ignore...

@Cyan4973 (author) commented on May 30, 2019:

That's a very interesting point.
Indeed, a long sequence of literals triggers a "local expansion", so is there a chance to overflow input with output during that period ?

Let's detail this scenario.
And let's start by reminding that we have a strong compressedSize < decompressedSize hypothesis.
That means that data expansion can only be "local", and is globally compensated by some other matches.

A first obvious scenario would be to start with a very large run of literals, followed by a match which is just large enough to satisfy the condition compressedSize < decompressedSize.
Let's assume that the LZ4_DECOMPRESS_INPLACE_MARGIN macro condition is respected, and now we have :

  • bufferSize = 4096
  • decompressedSize = 4064 (to allow 32 bytes of margin)
  • compressedSize = 4063 (<decompressedSize)
|0<------------------buffer----------------->|4096
      |33<------------compressed------------>|4096
|0<-------decompressed------->|4064
                              |4064<-margin->|4096

We start with a very large run of literals. Note that, since decompressedSize is limited, this run is limited too. For example : 15 + 15*255 +n = 3840+n bytes. This requires 16 additional length bytes.

Well, this is not a problem : lz4 will simply copy the long run of literals from compressed to decompressed. The source position is 33(start) + 1(token) +16(length) = 50 and destination position starts at 0, so we have actually increased the distance between ip and op. After the literals copy, ip is 17 bytes farther away from op, hence no risk of overwriting.

Actually, the distance between ip and op decreases when there is a match. So, to get into trouble, we would need to start with a match, and then end with a long run of literals.
That looks possible. Just invert the quantity.

The compressed block now has this content :
1(token) + 1(literal) + 2(offset==1) + 1(matchLen==22) + 1(token) + 16(litLen) + 4041(literals) == 4063 bytes

Now, after the first sequence, op is increased by 1 (literal) + 22 (matchLen), hence is at position 23.
While ip has just increased by 1 (token) + 1 (literal) + 2 (offset) + 1 (matchLen) == 5, starting from position 33, so now it is at position 38.

So we still have op(23) < ip(38), but that's not as safe as it sounds, because the distance is now 15, which is less than 32. And lz4 decoder works by stripes, of up to 32 bytes, so it could potentially overwrite as much as 32 bytes beyond its local position.

In practice, this specific example will still work correctly, because it happens that the overwriting operation will not reach dangerous territories (the 32 bytes stripe will be between positions 1 and 33, so it won't reach 38, and after that, the copy of last literals is safe, because the literals length will increase distance by 17 again, hence back to 32).

However, it underlines a potential scenario, where distances between ip and op are reduced, triggering a risk of overwriting input by output. All it takes is for decompressedSize to be large enough so that the initial match can reduce distance enough to trigger the overwriting impact. It's difficult with a limit of 4K, but larger blocks can likely cause it.

Therefore it seems safer to increase the margin by a variable amount, taking into consideration the risk that additional litLen bytes provide room for some early match to reduce distance between ip and op. It's a very convoluted scenario, but it seems possible.

I'll update the LZ4_DECOMPRESS_INPLACE_MARGIN macro accordingly, and try to add such a scenario in the tester.
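Purely to restate the arithmetic of this scenario in runnable form (the numbers are the ones computed above, nothing more):

```c
#include <stdio.h>

/* 4096-byte buffer, compressed data starting at offset 33; the first sequence
 * is 1 literal followed by a 22-byte match, as described above. */
int main(void)
{
    int const ip = 33 + 1 /*token*/ + 1 /*literal*/ + 2 /*offset*/ + 1 /*extra matchLen byte*/;  /* 38 */
    int const op =  0 + 1 /*literal*/ + 22 /*match*/;                                            /* 23 */
    printf("ip=%d op=%d distance=%d (wildcopy stripes can write up to 32 bytes)\n",
           ip, op, ip - op);
    return 0;
}
```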

@Cyan4973 (author) commented on May 30, 2019:

Added a test case featuring the worst-case in-place decompression scenario we have been discussing.

Updated LZ4_DECOMPRESS_INPLACE_MARGIN() to counter local expansion issue.

@hsiangkao commented on May 31, 2019:

Hi Cyan,
The "additional litLen bytes" can be bounded from the compressedSize (since all literals must appear as plaintext in the compressed data), which should be much less than the decompressedSize but is enough to estimate the number of literal bytes.

LZ4_DECOMPRESS_INPLACE_MARGIN could become LZ4_DECOMPRESS_INPLACE_MARGIN(compressedSize).

But I am not sure >> 8 is enough for us; that's why my formula uses / 14... However, I forgot the detailed risk, and I need some time to recall it...

> It's difficult with a limit of 4K, but larger blocks can likely cause it.

Yes, you are right. And I need to observe more data to make sure this upcoming formula is correct, because it could ship to millions of mobile phones...

Thanks,
Gao Xiang

@hsiangkao commented on May 31, 2019:

I think the worst case can be derived as follows:

[1] 1 token + 2 additional litLen bytes + (15 + 255) literals + 2-byte offset + 4-byte match: 5 bytes generate no data, against a 4-byte match, so this contributes 1 extra byte of margin, hence
32 + compressedSize / (255 + 15)

[2] 1 token + 3 additional litLen bytes + (15 + 255 + 255) literals + 2-byte offset + 4-byte match: 6 bytes generate no data, against a 4-byte match, so this contributes 2 extra bytes of margin, hence
32 + compressedSize / (255 * 2 + 15) * 2 == 32 + compressedSize / (255 + 15/2)
....
[3] 1 token + 4 additional litLen bytes + (15 + 255 + 255 + 255) literals + 2-byte offset + 4-byte match:
32 + compressedSize / (255 * 3 + 15) * 3 == 32 + compressedSize / (255 + 15/3)
...

Therefore I think 32 + compressedSize / 255 is the strict lower bound... I am not 100% sure, but I will try with
32 + compressedSize / 255.

And I hope this feature gets merged into lz4 upstream :)
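Written as a C expression, the candidate bound proposed here would look like this (the macro name is purely illustrative and not part of lz4.h):

```c
/* hsiangkao's candidate lower bound for the in-place decompression margin */
#define INPLACE_MARGIN_CANDIDATE(compressedSize)  (32 + (compressedSize) / 255)
```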

@Cyan4973 (author) commented on May 31, 2019:

As mentioned before, if the compressed block starts with a long run of literals, the distance between ip and op actually increases, therefore getting farther away from any risk of overwrite.

To get into trouble, the compressed block must start with a match.

Then, presuming we have this condition that compressedSize < decompressedSize, the maximum shrinkage perspective of this first match depends on how much local expansion is possible by a later run of literals. That's where the / 255 happens.
However, the / 255 proportion is on the number of literals, not on the total compressed nor decompressed size.

If one uses the decompressed size as base, the total size also includes the match length, which must be at least as large as the number of additional literal-length bytes, in order to respect the condition compressedSize < decompressedSize.

If one uses the compressed size as base, the number of literals cannot represent the whole compressed data: at most it represents 255/256 of the compressed data, because every 256th byte is an additional literal-length byte.

So, in both cases, / 256, hence >> 8, seems to work properly.

I can replace >> 8 with / 256 if that feels more proper. On unsigned values, the compiler should be smart enough to make this transformation. On signed values, that's less clear, because I'm not sure the rounding is the same. That could be considered a topic of secondary importance though.

Using compressedSize as base instead of decompressedSize has the perspective of making the margin smaller, which is good. However, it also requires the user to provide both values (compressedSize and decompressedSize) in order to calculate LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(), making this macro slightly more complex to use.
Yet that doesn't sound unreasonable. In order for in-place decompression to work, it seems mandatory to know both sizes anyway.
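A sketch of what such a compressedSize-based margin could look like, following the /256 reasoning above (this commit still defines the margin as the fixed constant 32, and the name below is illustrative):

```c
/* one extra byte of margin per 256 bytes of compressed input, on top of the 32-byte stripe */
#define INPLACE_MARGIN_FROM_CSIZE(compressedSize)  (((compressedSize) >> 8) + 32)
```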

@hsiangkao commented on May 31, 2019:

> If one uses the compressed size as base, the number of literals cannot represent the whole compressed data: at most it represents 255/256 of the compressed data, because every 256th byte is an additional literal-length byte.
>
> So, in both cases, / 256, hence >> 8, seems to work properly.

I think I agree with your detailed analysis.

> Using compressedSize as base instead of decompressedSize has the perspective of making the margin smaller, which is good. However, it also requires the user to provide both values (compressedSize and decompressedSize) in order to calculate LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(), making this macro slightly more complex to use.
> Yet that doesn't sound unreasonable. In order for in-place decompression to work, it seems mandatory to know both sizes anyway.

But I wonder why both compressedSize and decompressedSize are needed, if my understanding is correct.
Since decompressedSize > compressedSize, the user can give any value that is not less than compressedSize.

Could you give an exact expression in advance? I think we are nearing the end, and I will use it for my scenario...

@Cyan4973 (author) commented on May 31, 2019:

> But I wonder why both compressedSize and decompressedSize are needed, if my understanding is correct.
> Since decompressedSize > compressedSize, the user can give any value that is not less than compressedSize.

It's indeed not necessary.
The current LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE() macro only uses decompressedSize, resulting in a margin which is "large enough".

Using the compressedSize instead to calculate LZ4_DECOMPRESS_INPLACE_MARGIN is only useful to reduce the margin to some minimum length.
It's more a refinement than a necessity.

@hsiangkao commented on May 31, 2019:

> Using the compressedSize instead to calculate LZ4_DECOMPRESS_INPLACE_MARGIN is only useful to reduce the margin to some minimum length.
> It's more a refinement than a necessity.

Yes, it depends on you :) but EROFS will use LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE and pass compressedSize for our usage.

Thanks for taking the time and for your kind help. I think I have no problem with this feature. :)

Thanks,
Gao Xiang

@Cyan4973 (author) commented on May 31, 2019:

Note: you can use compressedSize for LZ4_DECOMPRESS_INPLACE_MARGIN(),
but don't use it for LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE()!
This would result in a wrong buffer size calculation.

As mentioned earlier, I suspect your use case only needs LZ4_DECOMPRESS_INPLACE_MARGIN(), where it's fine to use compressedSize.

I'll update the documentation to make sure this difference is clearly understood.

@hsiangkao commented on May 31, 2019:

Yes, I think I fully understand your idea and its underlying meaning.
I will use it in the correct way. That was a typo on my side: I meant LZ4_DECOMPRESS_INPLACE_MARGIN, of course.

Actually, I have tested mkfs with LZ4_DECOMPRESS_INPLACE_MARGIN(compressedSize) for about a day without any problem.

In case / 256 were not safe, I wanted to change it to / 255, but that seems unnecessary after your detailed analysis. So 32 + compressedSize / 256 is OK for me. That is all, thanks.

Thanks,
Gao Xiang

#define LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) ( (decompressedSize) + LZ4_DECOMPRESS_INPLACE_MARGIN) /**< note: presumes that compressedSize < decompressedSize */

#ifndef LZ4_DISTANCE_MAX /* history window size; can be user-defined at compile time */
# define LZ4_DISTANCE_MAX 65535 /* set to maximum value by default */
#endif

#define LZ4_COMPRESS_INPLACE_MARGIN (LZ4_DISTANCE_MAX + 32)
#define LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCompressedSize) ( (maxCompressedSize) + LZ4_COMPRESS_INPLACE_MARGIN) /**< maxCompressedSize is generally LZ4_COMPRESSBOUND(inputSize), but can be set to any lower value, with the risk that compression can fail (return code 0(zero)) */
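A minimal sketch of how these compression-side macros are meant to be used, mirroring the fuzzer test updated in this commit; the helper name and the convention of returning 0 to mean "store the data uncompressed" are assumptions, not part of the library:

```c
#define LZ4_STATIC_LINKING_ONLY
#include "lz4.h"

/* Compress `srcSize` bytes in place. The input is assumed to already occupy the
 * last `srcSize` bytes of `buffer`, which must hold at least
 * LZ4_COMPRESS_INPLACE_BUFFER_SIZE(LZ4_COMPRESSBOUND(srcSize)) bytes.
 * Returns the compressed size, or 0 when compression failed and the caller
 * should fall back to storing the data uncompressed. */
static int compressInPlace(char* buffer, size_t bufferSize, size_t srcSize)
{
    size_t const maxCSize = LZ4_COMPRESSBOUND(srcSize);
    if (bufferSize < (size_t)LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCSize)) return 0;
    return LZ4_compress_default(buffer + bufferSize - srcSize,  /* src: end of buffer */
                                buffer,                         /* dst: start of buffer */
                                (int)srcSize, (int)maxCSize);
}
```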


#endif /* LZ4_STATIC_LINKING_ONLY */



/*-************************************************************
* PRIVATE DEFINITIONS
@@ -567,6 +610,7 @@ union LZ4_streamDecode_u {
} ; /* previously typedef'd to LZ4_streamDecode_t */



/*-************************************
* Obsolete Functions
**************************************/
3 changes: 1 addition & 2 deletions tests/fuzzer.c
@@ -1017,8 +1017,7 @@ static void FUZ_unitTests(int compressionLevel)
DISPLAYLEVEL(3, "in-place compression using LZ4_compress_default() :");
{ size_t const sampleSize = 65 KB;
size_t const maxCSize = LZ4_COMPRESSBOUND(sampleSize);
-      size_t const margin = 64 KB;
-      size_t const outSize = maxCSize + margin;
+      size_t const outSize = LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCSize);
size_t const startIndex = outSize - sampleSize;
char* const startInput = testCompressed + startIndex;
XXH32_hash_t const crcOrig = XXH32(testInput, sampleSize, 0);
