Summary
COVER_ctx_init() in lib/dictBuilder/cover.c computes a partial-suffix-array
size as trainingSamplesSize - MAX(d, sizeof(U64)) + 1 (a size_t subtraction)
without first checking that trainingSamplesSize is large enough. The function
does guard against an undersized corpus, but it guards on totalSamplesSize,
while it uses trainingSamplesSize (a fraction controlled by params.splitPoint)
for the subtraction. With splitPoint < 1.0 and a small corpus the two
quantities diverge and the subtraction underflows, producing
(size_t)-2 (= 0xfffffffffffffffe). The next malloc request then becomes
roughly 16 EB, which ASan rejects as allocation-size-too-big and which in
production builds returns NULL (followed by a SEGV at the next dereference).
The path is most easily reached from ZDICT_optimizeTrainFromBuffer_cover,
because that entry point documents params.d == 0 (and params.k == 0) as
"let optimize pick" and then iterates internally over candidate d values.
A library consumer that leaves d and k zero-initialized, as the documentation
invites, and sets splitPoint below 1.0 triggers the bug as soon as the
training portion of the corpus is small enough.
The zstd CLI is not affected: --train-cover validates and clamps cover
parameters before calling the API.
Root Cause
lib/dictBuilder/cover.c, COVER_ctx_init() (function defined at line 628).
The sanity check on the corpus uses totalSamplesSize:
// lib/dictBuilder/cover.c:641
if (totalSamplesSize < MAX(d, sizeof(U64)) ||
totalSamplesSize >= (size_t)COVER_MAX_SAMPLES_SIZE) {
...
return ERROR(srcSize_wrong);
}
But the subtraction immediately below operates on trainingSamplesSize, which
is a fraction (splitPoint) of totalSamplesSize:
// lib/dictBuilder/cover.c:669-670
ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
ctx->suffix = (U32 *)malloc(ctx->suffixSize * sizeof(U32));
When the caller sets splitPoint below 1.0 (0.5 in the PoC; optimize substitutes COVER_DEFAULT_SPLITPOINT only when splitPoint <= 0.0) and the corpus is small,
trainingSamplesSize < MAX(d, sizeof(U64)) even though totalSamplesSize
passes the check. The subtraction wraps to (size_t)-N, and malloc() is
called with (size_t)-N * sizeof(U32) ≈ 16 EB.
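The wraparound can be checked by hand with the PoC numbers (a 5-byte training split and d = 6). The sketch below is illustrative only: it uses uint64_t to model a 64-bit size_t, hard-codes sizeof(U64) as 8 and sizeof(U32) as 4, and the helper names are hypothetical, not zstd functions.

```c
#include <stdint.h>

/* Stand-in for MAX(d, sizeof(U64)); sizeof(U64) is assumed to be 8. */
static uint64_t min_dmer_span(unsigned d)
{
    return d > 8u ? d : 8u;
}

/* Unchecked suffix-size computation as described in the text, with
 * uint64_t modelling a 64-bit size_t. The result wraps around whenever
 * training_size < min_dmer_span(d). */
static uint64_t unchecked_suffix_size(uint64_t training_size, unsigned d)
{
    return training_size - min_dmer_span(d) + 1u;
}

/* Byte count the following malloc would request (sizeof(U32) assumed 4).
 * The multiplication wraps modulo 2^64 as well. */
static uint64_t unchecked_alloc_bytes(uint64_t training_size, unsigned d)
{
    return unchecked_suffix_size(training_size, d) * 4u;
}
```

With training_size = 5 and d = 6 the first helper yields 0xfffffffffffffffe and the second 0xfffffffffffffff8, which matches the size in the ASan report.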
How params.d == 0 makes this trivial to hit:
ZDICT_optimizeTrainFromBuffer_cover (cover.c:1197) maps a zero params.d
to a search range:
// lib/dictBuilder/cover.c:1206-1207
const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d;
const unsigned kMaxD = parameters->d == 0 ? 8 : parameters->d;
and then iterates for (d = kMinD; d <= kMaxD; d += 2) (line 1254), calling
COVER_ctx_init(..., d, splitPoint, ...) (line 1261) for each candidate. There
is no precondition on the corpus size before the loop, so a caller that hands
optimize a tiny sample set — perfectly valid as far as the documented "let
optimize pick" contract is concerned — drives the inner function into the
underflowing branch.
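To see which candidates are affected for a given training split, the search range can be replayed outside the library. This is a minimal sketch assuming the kMinD/kMaxD mapping and step of 2 quoted above; count_underflowing_candidates is a hypothetical helper, and sizeof(U64) is hard-coded as 8.

```c
#include <stddef.h>

/* Hypothetical helper: walks the same candidate range that optimize
 * derives from params.d (6..8 when d == 0, otherwise just d) and counts
 * the candidates for which MAX(d, 8) exceeds the training split size,
 * i.e. the ones that would make the suffix-size subtraction wrap. */
static unsigned count_underflowing_candidates(size_t training_size,
                                              unsigned param_d)
{
    const unsigned k_min_d = param_d == 0 ? 6u : param_d;
    const unsigned k_max_d = param_d == 0 ? 8u : param_d;
    unsigned hits = 0;
    for (unsigned d = k_min_d; d <= k_max_d; d += 2) {
        const size_t need = d > 8u ? d : 8u; /* MAX(d, sizeof(U64)) */
        if (training_size < need) {
            ++hits;
        }
    }
    return hits;
}
```

For a 5-byte training split and params.d == 0, both candidates (6 and 8) are counted, because MAX(d, 8) is 8 for each of them.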
PoC
Trigger file
crash_input is a 24-byte blob consumed by the AGF harness. The harness
parses the bytes as [8-byte header][16 bytes of sample data], then derives
nbSamples (≥5), per-sample sizes (summing to 16), and the cover parameters
from the header bits. The specific decoded shape that crashes:
- nbSamples = 10, ten one-byte samples
- params.d = 0 (zero-initialized; selects the optimize search)
- params.k = 0 (likewise)
- params.splitPoint = 0.5
- dictBufferCapacity = 1360
The header bit pattern leaves d and k at zero so the harness enters the
optimize code path with a very small corpus.
How to generate
Any input that ends up calling ZDICT_optimizeTrainFromBuffer_cover with
- nbSamples ≥ 5
- per-sample sizes such that COVER_sum(first nbTrainSamples) < 8 while the
total stays at or above 8 (e.g. ten 1-byte samples with splitPoint = 0.5)
- zero-initialized params.d and params.k
will reproduce. Minimal recipe: feed N=10 one-byte samples to optimize with
zeroed d and k and splitPoint = 0.5 (see app_realworld.c below).
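The size conditions in this recipe can be expressed as a small predicate for screening candidate inputs before running them. The sketch below rests on one assumption not stated in this report: that the training split consists of the first (unsigned)(nbSamples * splitPoint) samples when splitPoint < 1.0. Both helper names are hypothetical, and the predicate models only the two size conditions, not any other validation the library performs.

```c
#include <stddef.h>

/* Sum of the first n entries of sizes. */
static size_t sum_sizes(const size_t *sizes, unsigned n)
{
    size_t total = 0;
    for (unsigned i = 0; i < n; ++i) {
        total += sizes[i];
    }
    return total;
}

/* Returns 1 when the total size passes the MAX(d, 8) guard while the
 * training split stays below it, which is the shape that wraps.
 * ASSUMPTION: training split = first (unsigned)(nb_samples * split_point)
 * samples when split_point < 1.0, otherwise all samples. */
static int hits_underflow(const size_t *sizes, unsigned nb_samples,
                          double split_point, unsigned d)
{
    const size_t need = d > 8u ? d : 8u; /* MAX(d, sizeof(U64)) */
    const unsigned nb_train = split_point < 1.0
        ? (unsigned)((double)nb_samples * split_point)
        : nb_samples;
    const size_t total = sum_sizes(sizes, nb_samples);
    const size_t training = sum_sizes(sizes, nb_train);
    return total >= need && training < need;
}
```

Ten one-byte samples with split_point = 0.5 satisfy the predicate; the same samples with split_point = 1.0 do not, because the training split then equals the guarded total.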
Trigger Method 1: Direct libzstd API
app_realworld.c — a minimal libzstd consumer that calls the documented
optimize entry point with zero-initialized parameters and ten one-byte samples
(a realistic edge case: training on very short log lines).
/* app_realworld.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define ZDICT_STATIC_LINKING_ONLY
#include <zdict.h>
int main(void)
{
const size_t N = 10;
unsigned char samples[N];
size_t sizes[N];
for (size_t i = 0; i < N; ++i) {
samples[i] = (unsigned char)('A' + i);
sizes[i] = 1;
}
const size_t dictCap = 1360;
void *dict = malloc(dictCap);
ZDICT_cover_params_t params;
memset(&params, 0, sizeof(params));
/* Documented as "let optimize pick" — but underflows downstream. */
params.d = 0;
params.k = 0;
params.steps = 0;
params.nbThreads = 1;
params.splitPoint = 0.50;
params.zParams.compressionLevel = 0;
size_t r = ZDICT_optimizeTrainFromBuffer_cover(
dict, dictCap, samples, sizes, (unsigned)N, &params);
fprintf(stderr, "ZDICT_optimizeTrainFromBuffer_cover returned: %zu (isError=%d)\n",
r, ZDICT_isError(r));
free(dict);
return 0;
}
Build:
ZSTD_SRC=/path/to/zstd
clang -O1 -g -fsanitize=address \
-I"$ZSTD_SRC/lib" -I"$ZSTD_SRC/lib/dictBuilder" \
-DZDICT_STATIC_LINKING_ONLY \
app_realworld.c "$ZSTD_SRC/lib/libzstd.a" -pthread -o app_realworld
Run:
ASAN_OPTIONS=detect_leaks=0 ./app_realworld
Output:
==ERROR: AddressSanitizer: requested allocation size 0xfffffffffffffff8
exceeds maximum supported size of 0x10000000000
#0 0x...... in malloc
#1 0x...... in COVER_ctx_init lib/dictBuilder/cover.c:670:24
#2 0x...... in ZDICT_optimizeTrainFromBuffer_cover lib/dictBuilder/cover.c:1261
#3 0x...... in main app_realworld.c:...
SUMMARY: AddressSanitizer: allocation-size-too-big in malloc
Without ASan, malloc() returns NULL and the next access in COVER_ctx_init
SEGVs.
Trigger Method 2: Fuzzer (AGF harness)
simple_compress.c — the libFuzzer harness used by AGF. It deterministically
parses the input into samples and cover parameters and calls
ZDICT_optimizeTrainFromBuffer_cover.
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#define ZDICT_STATIC_LINKING_ONLY
#include "zdict.h"
#define MIN_DICT_CAPACITY 256U
#define MAX_DICT_CAPACITY 2048U
#define MAX_SAMPLE_BYTES 4096U
#define MAX_NB_SAMPLES 32U
#define HEADER_SIZE 8U
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
if (size < HEADER_SIZE + 8) return 0;
size_t sampleBytes = min_size(size - HEADER_SIZE, MAX_SAMPLE_BYTES);
if (sampleBytes < 8) return 0;
const uint8_t *samplesBuffer = data + HEADER_SIZE;
unsigned nbSamples = 5U + (unsigned)(data[0] % (MAX_NB_SAMPLES - 4U));
if ((size_t)nbSamples > sampleBytes) nbSamples = (unsigned)sampleBytes;
if (nbSamples < 5U) return 0;
size_t dictBufferCapacity = MIN_DICT_CAPACITY
+ ((size_t)data[1] | ((size_t)data[2] << 8))
% (MAX_DICT_CAPACITY - MIN_DICT_CAPACITY + 1U);
void *dictBuffer = malloc(dictBufferCapacity);
size_t *samplesSizes = (size_t *)malloc((size_t)nbSamples * sizeof(samplesSizes[0]));
if (!dictBuffer || !samplesSizes) { free(dictBuffer); free(samplesSizes); return 0; }
/* Build nbSamples non-empty samples whose sizes sum to sampleBytes. */
size_t remaining = sampleBytes - nbSamples;
for (unsigned i = 0; i + 1U < nbSamples; ++i) {
size_t add = remaining
? (size_t)data[HEADER_SIZE + (i % sampleBytes)] % (remaining + 1U)
: 0;
samplesSizes[i] = 1U + add;
remaining -= add;
}
samplesSizes[nbSamples - 1U] = 1U + remaining;
ZDICT_cover_params_t params;
memset(&params, 0, sizeof(params));
if ((data[3] & 1U) == 0)
params.d = 6U + 2U * (unsigned)((data[4] >> 4) % 3U); /* otherwise 0 */
if ((data[3] & 2U) == 0) {
unsigned minK = params.d == 0 ? 8U : params.d;
unsigned range = (unsigned)(dictBufferCapacity - minK + 1U);
params.k = minK + (unsigned)(data[4] % range);
}
if ((data[3] & 4U) == 0) params.steps = 1U + (unsigned)(data[5] % 4U);
params.nbThreads = 1U;
if ((data[3] & 8U) != 0 && nbSamples >= 10U)
params.splitPoint = 0.50 + ((double)(data[6] % 51U) / 100.0);
else if ((data[3] & 16U) != 0)
params.splitPoint = 1.0;
params.zParams.compressionLevel = (int)(data[7] % 10U);
(void)ZDICT_optimizeTrainFromBuffer_cover(dictBuffer, dictBufferCapacity,
samplesBuffer, samplesSizes,
nbSamples, &params);
free(samplesSizes);
free(dictBuffer);
return 0;
}
Build:
clang -fsanitize=fuzzer,address -g -O1 \
-I"$ZSTD_SRC/lib" -I"$ZSTD_SRC/lib/dictBuilder" \
-DZDICT_STATIC_LINKING_ONLY \
simple_compress.c "$ZSTD_SRC/lib/libzstd.a" -pthread -o cover_fuzzer
Run on crash_input:
ASAN_OPTIONS=detect_leaks=0 ./cover_fuzzer crash_input
Same allocation-size-too-big at lib/dictBuilder/cover.c:670 in
COVER_ctx_init.
Impact
| Aspect | Details |
| --- | --- |
| Type | Denial of Service (~16 EB malloc; abort under ASan, NULL-deref / SEGV otherwise) |
| Severity | Low |
| Attack Vector | Local; library consumer asks for dictionary optimization via the documented "auto" mode (params.d == 0 / params.k == 0) on a small corpus |
| Affected Components | libzstd dictBuilder (lib/dictBuilder/cover.c, ZDICT_optimizeTrainFromBuffer_cover → COVER_ctx_init). Not the zstd CLI: --train-cover validates parameters before the API call |
| Reachability | Programmatic only: applications that pass zero-initialized ZDICT_cover_params_t (or any small d paired with a small corpus and splitPoint < 1.0) into the optimize entry point |
| CWE | CWE-190 (Integer Overflow/Underflow) and CWE-789 (Memory Allocation with Excessive Size Value) |
Suggested Fix
The cleanest local fix is in COVER_ctx_init itself: gate the subtraction on
the same quantity that participates in it. In lib/dictBuilder/cover.c around
line 641, change the precondition from totalSamplesSize to
trainingSamplesSize (or add a second check), so the function bails out with
ERROR(srcSize_wrong) instead of underflowing at line 669:
/* lib/dictBuilder/cover.c, COVER_ctx_init */
if (trainingSamplesSize < MAX(d, sizeof(U64)) ||
totalSamplesSize < MAX(d, sizeof(U64)) ||
totalSamplesSize >= (size_t)COVER_MAX_SAMPLES_SIZE) {
DISPLAYLEVEL(1, "Total samples size is too small/large (...)");
return ERROR(srcSize_wrong);
}
...
ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
Alternatively, guard the subtraction directly:
if (trainingSamplesSize < MAX(d, sizeof(U64)))
return ERROR(srcSize_wrong);
ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
A complementary hardening at the API boundary, in
ZDICT_optimizeTrainFromBuffer_cover (cover.c:1197), would be to require
that the training split (the sum of the first nbTrainSamples sample sizes) is
at least MAX(kMaxD, sizeof(U64)) before entering the
for (d = kMinD; d <= kMaxD; d += 2) loop, so the "auto" mode is rejected up
front when the corpus is too small to support every candidate d.
Both changes are small and local; either alone closes the underflow.
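The guard pattern can also be isolated into a standalone helper, which makes the boundary cases easy to test. This is a sketch of the idea rather than a patch against cover.c: checked_suffix_size is a hypothetical name, sizeof(U64) is hard-coded as 8, and the -1 return stands in for ERROR(srcSize_wrong).

```c
#include <stddef.h>

/* Hypothetical helper illustrating the guarded computation.
 * On success stores training_size - MAX(d, 8) + 1 in *out and returns 0.
 * Returns -1 (standing in for ERROR(srcSize_wrong)) when the training
 * split is too small, leaving *out untouched. */
static int checked_suffix_size(size_t training_size, unsigned d, size_t *out)
{
    const size_t need = d > 8u ? d : 8u; /* MAX(d, sizeof(U64)) */
    if (training_size < need) {
        return -1; /* the subtraction below would wrap */
    }
    *out = training_size - need + 1;
    return 0;
}
```

A training split of exactly MAX(d, 8) bytes is accepted and yields a suffix count of 1; only strictly smaller splits are rejected.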
Credit
Aisle Research (Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Guido Vranken)