Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZSTD_estimateCCtxSize() and ZSTD_compressSequences() #3747

Closed
encode84 opened this issue Sep 6, 2023 · 4 comments
Closed

ZSTD_estimateCCtxSize() and ZSTD_compressSequences() #3747

encode84 opened this issue Sep 6, 2023 · 4 comments
Assignees
Labels

Comments

@encode84
Copy link

encode84 commented Sep 6, 2023

If allocate memory according to ZSTD_estimateCCtxSize() with level 3 or below, the ZSTD_compressSequences( ) will return error -64 (ZSTD_error_memory_allocation)
With level 4 and above it works fine. Is it expected behavior?

(Of course I'm using it in pair with ZSTD_initStaticCCtx())

If use ZSTD_compressSequences() with ZSTD_createCCtx() no issues what so ever.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Sep 6, 2023

Hi Ilya,

the expectation is that ZSTD_estimateCCtxSize() returns a size which is suitable for the level specified and any level below it : https://github.com/facebook/zstd/blob/v1.5.5/lib/zstd.h#L1644 .

Therefore, if one requests a size for level 3, it will be fine for levels 1, 2 and 3, but there is no expectation for it to work for level 4.

Occasionally, it might be possible for a higher level to fit in, because, by chance, the technical parameters of this level happen to not require more memory, so it fits. But there is no guarantee, and as the technical definition of "levels" change overtime, this observation may also change across versions.
The technical definition of levels is hosted here : https://github.com/facebook/zstd/blob/dev/lib/compress/clevels.h .

@Cyan4973 Cyan4973 self-assigned this Sep 6, 2023
@encode84
Copy link
Author

encode84 commented Sep 7, 2023

Hello Yann,
The problem is that I pass the same compression level to all of these functions (via ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel), That's why I alarmed it.
ZSTD_estimateCCtxSize(3) memory will be not sufficient for compression with ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3) + ZSTD_compressSequences(cctx, ...

@Cyan4973
Copy link
Contributor

Cyan4973 commented Sep 7, 2023

ZSTD_estimateCCtxSize() was meant to be used with ZSTD_compressCCtx().

ZSTD_compressSequences() is a completely different entry point, and I'm not sure how it would work in this case.
I'm not even sure if ZSTD_compressSequences() is compatible with ZSTD_initStaticCCtx()...

Since ZSTD_compressSequences() will compress the sequences that you provide, it completely skips the match searching stage, which means it doesn't even need memory for the search tables.
So it actually uses less memory ? Well for search certainly, but on the other hand, it needs more memory specifically to handle these externally provided sequences, and build some internal representation of it.
In short, its memory requirement is somewhat unrelated to compressionLevel.

I don't even know how much it needs, since this combination with ZSTD_initStaticCCtx() was never tested.

edit : one method to get an appropriate size for ZSTD_initStaticCCtx() could be this one :

  • Allocate a normal ZSTD_CCtx* state (with full malloc capability)
  • Run your scenario as wanted (using ZSTD_compressSequences())
  • At the end, request ZSTD_sizeof_CCtx(cctx), this should return the state size
  • Assuming everything ZSTD_compressSequences() needs can fit into the state size, this should be a large enough value for ZSTD_initStaticCCtx().

There is still a risk that ZSTD_compressSequences() also requires some dynamic memory sizing capability outside of the scope of ZSTD_initStaticCCtx(). In which case, no size will ever be large enough. The best way to know that would be to test it with some dummy very large state size.

@encode84
Copy link
Author

encode84 commented Sep 8, 2023

Thank you, Yann!
It might be a good idea to add a special level for ZSTD_estimateCCtxSize(), e.g. level 255 or something to indicate entropy only operations for ZSTD_compressSequences()

Cyan4973 added a commit that referenced this issue Sep 13, 2023
gcflymoto pushed a commit to gcflymoto/zstd that referenced this issue Nov 2, 2023
hswong3i pushed a commit to alvistack/facebook-zstd that referenced this issue Mar 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants