memset() rather than reduceIndex() #1658

Merged: 3 commits, Jul 1, 2019
Conversation

Cyan4973 (Contributor) commented Jun 21, 2019

I recently received a user performance report stating that reduceIndex() can require a non-negligible delay to perform its job, especially when applied to a large context.

reduceIndex() is required in two scenarios:

  • when compressing a large data stream, typically > 4 GB
  • when reusing a context for many smaller data streams while constantly keeping the same parameters, which makes every compression use continue mode.

For the first case, there's nothing new.

For the second case, the issue is that reduceIndex() is a measurable load compared to a single compression job. The user reported delays in the ~80 ms range. I could reproduce similar delays on my laptop with llvm, in the 20-30 ms range, when using large tables (~30 MB). The user's results are exacerbated by the use of Visual Studio, which is worse at auto-vectorization (reduceIndex() performance is tied to auto-vectorization).

The thing is, in such cases, we do not actually need a reduceIndex() operation: a simpler memset() would work just as well. And of course, the big difference is that memset() is actually fast (I measured up to 10x faster), and performance differences between compilers are unlikely to be "large".
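
As an illustration of the gap, here is a minimal, self-contained sketch (not the actual zstd internals; the table size and the rescale logic are simplified stand-ins): a reduceIndex()-style pass has to read, adjust, and write back every table entry, while a reset is a single bulk memset().

    #include <stdint.h>
    #include <string.h>   /* memset */

    #define TABLE_SIZE (1 << 22)          /* ~4M entries of U32, ~16 MB: a "large table" */
    static uint32_t table[TABLE_SIZE];    /* stand-in for a hash/chain table of match indices */

    /* reduceIndex()-style rescale: every entry is visited and adjusted.
     * Throughput depends heavily on how well the compiler auto-vectorizes this loop. */
    static void rescaleTable(uint32_t reducerValue)
    {
        size_t i;
        for (i = 0; i < TABLE_SIZE; i++)
            table[i] = (table[i] < reducerValue) ? 0 : table[i] - reducerValue;
    }

    /* memset()-style reset: one bulk write, fast and largely compiler-independent. */
    static void resetTable(void)
    {
        memset(table, 0, sizeof(table));
    }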

This patch tries to favor memset() over reduceIndex() by testing currentIndex when starting a new compression (detected by triggering ZSTD_resetCCtx_internal()). It will disallow continue mode when currentIndex is considered "too close" to the limit.
The limit was already defined in zstd_compress_internal.h as ZSTD_CURRENT_MAX, and "too close" has been arbitrarily defined as < ZSTD_INDEXOVERFLOW_MARGIN (== 16 MB).
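
For reference, a sketch of that check under simplifying assumptions (the real function takes a ZSTD_window_t, and the real value of ZSTD_CURRENT_MAX comes from zstd_compress_internal.h; the struct, field name, and constant value below are placeholders):

    #include <stdint.h>

    #define MB (1U << 20)
    #define ZSTD_INDEXOVERFLOW_MARGIN (16 * MB)
    #define CURRENT_MAX_PLACEHOLDER (3U << 29)   /* placeholder for ZSTD_CURRENT_MAX */

    /* Simplified stand-in for ZSTD_window_t: only the current index matters here. */
    typedef struct { uint32_t currentIndex; } window_sketch_t;

    /* Continue mode stays allowed only while the index is not "too close" to the limit;
     * otherwise a fresh start (memset()) is forced, before reduceIndex() would ever be needed. */
    static int index_valid_for_continue(window_sketch_t w)
    {
        return w.currentIndex < (CURRENT_MAX_PLACEHOLDER - ZSTD_INDEXOVERFLOW_MARGIN);
    }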

Commit messages (excerpts):

  • like assert() but cannot be disabled.
    Proper separation of user contract errors (CONTROL()) and invariant verification (assert()).
  • …imit
    by disabling continue mode when index is close to limit.

A reviewer (Contributor) commented on this new code:

    * memset() will be triggered before reduceIndex().
    */
    #define ZSTD_INDEXOVERFLOW_MARGIN (16 MB)
    static int ZSTD_index_valid_for_continue(ZSTD_window_t w)

nit: ZSTD_indexValidForContinue() is more in line with the naming convention.

Also: minor speed optimization:
shortcut to ZSTD_reset_matchState() rather than the full reset process.
It still needs to be completed with ZSTD_continueCCtx() for proper initialization.

Also: changed the position of the LDM hash tables in the context,
so that the "regular" hash tables can be at a predictable position,
hence allowing the shortcut to ZSTD_reset_matchState() without complex conditions.
Cyan4973 (Contributor, Author) commented Jun 24, 2019

This last update implements an optimization:
rather than going through the full update process,
it shortcuts to ZSTD_reset_matchState():
since all parameters are the same, there is no need to re-arrange table sizes and positions.
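
Roughly, the shortcut corresponds to the following conceptual sketch (stubs only, not the actual zstd control flow; the step names merely echo ZSTD_reset_matchState() and ZSTD_continueCCtx() from the discussion):

    #include <stdio.h>

    typedef struct { int paramsChanged; } cctx_sketch;   /* placeholder context */

    static void full_reset(cctx_sketch* c)       { (void)c; puts("full reset: recompute table sizes and positions"); }
    static void reset_matchState(cctx_sketch* c) { (void)c; puts("shortcut: memset() the tables in place"); }
    static void continueCCtx(cctx_sketch* c)     { (void)c; puts("complete initialization, ZSTD_continueCCtx()-like step"); }

    static void resetForNewCompression(cctx_sketch* c)
    {
        if (c->paramsChanged) {
            full_reset(c);            /* sizes/positions may change: full process needed */
        } else {
            reset_matchState(c);      /* same parameters: tables keep their size and position */
            continueCCtx(c);          /* still needs completion for proper initialization */
        }
    }

    int main(void)
    {
        cctx_sketch c = { 0 };        /* same parameters as the previous compression */
        resetForNewCompression(&c);
        return 0;
    }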

It seems to work fine so far, but I'm not against a second look:
since I did not write ZSTD_reset_matchState(),
I could be missing some implied condition in the rest of the code.

In particular, I would like to be completely sure that it works well in combination with LDM.
So far, it seems fine, all tests pass, and I couldn't identify a failure scenario.
The new code skips the LDM memset(), but proceeds to ZSTD_continueCCtx() as usual, which does apply window_clear to the LDM tables. So I suspect it works, but the LDM tables will get a rescale() instead of a memset().

@terrelln likely has a better understanding of this part.

terrelln (Contributor) commented Jun 24, 2019

It should be fine. LDM manages its own window and indices, so this change shouldn't impact it.

I'll take a closer look in a bit, just to be sure.

terrelln (Contributor) left a comment:

Sorry this fell off my radar, LGTM!
