fix: benchmark infrastructure — JVM heap, Liquibase, MockProducer leak#48

Merged
endrju19 merged 2 commits into main from fix/benchmark-infra
May 17, 2026
@endrju19
Collaborator

Summary

Three independent fixes that make `./gradlew :okapi-benchmarks:jmh` complete cleanly. Without them, the JMH run fails with an out-of-memory error partway through. All three issues exist on main today, so the benchmark suite cannot currently finish there.

No test or production code is touched — pure benchmark infrastructure.

Fixes

1. Bump JMH JVM heap to `-Xmx8g`

Throughput-mode microbenchmarks call `deliver()` at ~1M ops/s; each call allocates Jackson + Kotlin reflection state for JSON deserialization. At the previous default of `-Xmx2g`, the allocation rate exceeds GC throughput and the JVM runs out of memory within the first measurement iteration.

2. Pass `-Dliquibase.duplicateFileMode=WARN` as JMH JVM arg

`okapi-postgres.jar` and the fat JMH jar both ship the changelog at the same classpath path (`com/softwaremill/okapi/db/postgres/changelog.xml`). Liquibase 4.x treats duplicate resources as an error by default, which aborts `PostgresBenchmarkSupport` setup. The two files are identical (same jar source on the classpath twice), so `WARN` is safe.
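
One way to confirm that a resource really is present on the classpath more than once is to ask the class loader for every URL that matches the path. The sketch below is only a diagnostic illustration and is not part of this PR; the helper name `findClasspathCopies` is hypothetical, and it uses only the JDK and Kotlin standard library.

```kotlin
import java.net.URL

// Hypothetical helper: returns every classpath location that provides the resource.
// More than one URL means the same path is shipped by several jars or directories.
fun findClasspathCopies(
    resourcePath: String,
    loader: ClassLoader = ClassLoader.getSystemClassLoader(),
): List<URL> = loader.getResources(resourcePath).toList()

fun main() {
    // Changelog path taken from the PR description.
    val path = "com/softwaremill/okapi/db/postgres/changelog.xml"
    val copies = findClasspathCopies(path)
    println("Found ${copies.size} copies of $path")
    copies.forEach { println("  $it") }
}
```

Run with the benchmark classpath, two printed URLs would match the duplicate described above; outside that classpath the list is simply empty.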

3. Subclass `MockProducer` in `DelivererMicroBenchmark` to `clear()` history after every `send()`

`MockProducer.history()` (backed by the internal `sent` list) retains every record sent so that tests can inspect it; nothing is ever evicted. In throughput mode at ~1M ops/s for 30s × forks × iterations, that list grew to gigabytes and the JVM ran out of memory regardless of heap size. Clearing after each call is safe because the microbenchmark measures only timing and never inspects what was sent.
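
A rough back-of-the-envelope estimate shows why an unbounded history reaches gigabytes so quickly. The per-record size (100 bytes) and the single 30-second iteration used below are assumptions for illustration only, not measurements from this PR; `estimateRetainedBytes` is a hypothetical helper.

```kotlin
// Hypothetical helper: bytes held by a history list that is never cleared.
// Every input is an assumption supplied by the caller.
fun estimateRetainedBytes(
    opsPerSecond: Long,
    secondsPerIteration: Long,
    iterations: Long,
    bytesPerRecord: Long,
): Long = opsPerSecond * secondsPerIteration * iterations * bytesPerRecord

fun main() {
    // ~1M ops/s for one 30 s iteration, assuming ~100 bytes retained per record.
    val bytes = estimateRetainedBytes(1_000_000, 30, 1, 100)
    val gib = bytes / (1024.0 * 1024.0 * 1024.0)
    println("retained: $bytes bytes (%.2f GiB)".format(gib))
}
```

Under these assumptions a single iteration retains 30 million records, or 3,000,000,000 bytes, and the total keeps growing with every further iteration that shares the same producer.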

With this fix, `DelivererMicroBenchmark.kafkaDeliver` now produces meaningful numbers (~2.3M ops/s ± <1%) instead of results whose reported error exceeded the score (`error > score`).
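
The override described in this fix could be sketched roughly as follows. This is a minimal illustration rather than the code from the PR: the class name `NonRetainingMockProducer`, the `String` key and value types, and the choice of `autoComplete = true` are all assumptions, and the snippet needs `kafka-clients` on the classpath.

```kotlin
import org.apache.kafka.clients.producer.Callback
import org.apache.kafka.clients.producer.MockProducer
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.clients.producer.RecordMetadata
import org.apache.kafka.common.serialization.StringSerializer
import java.util.concurrent.Future

// Hypothetical subclass: drops the recorded history after every send so the
// internal list cannot grow without bound during a throughput benchmark.
class NonRetainingMockProducer :
    MockProducer<String, String>(true, StringSerializer(), StringSerializer()) {

    override fun send(
        record: ProducerRecord<String, String>,
        callback: Callback?,
    ): Future<RecordMetadata> {
        val result = super.send(record, callback)
        // clear() resets the mock's recorded state, including the sent history.
        clear()
        return result
    }
}

fun main() {
    val producer = NonRetainingMockProducer()
    producer.send(ProducerRecord("example-topic", "key", "value"))
    println("history size after send: ${producer.history().size}")
}
```

Because `clear()` also discards pending completions, this sketch assumes auto-completion is enabled; a benchmark that completes sends manually would need a different approach.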

Files

  • `okapi-benchmarks/build.gradle.kts` — JVM args
  • `okapi-benchmarks/src/jmh/kotlin/.../DelivererMicroBenchmark.kt` — MockProducer override
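
For orientation, the JVM args change might look like the fragment below. It assumes the module uses the `me.champeau.jmh` Gradle plugin, whose `jmh` extension exposes `jvmArgs` as a list property; the PR text does not name the plugin, so treat the block as a sketch and not the actual diff.

```kotlin
// okapi-benchmarks/build.gradle.kts (fragment, assuming the me.champeau.jmh plugin)
jmh {
    jvmArgs.set(
        listOf(
            "-Xmx8g",                             // fix 1: larger heap for forked JMH JVMs
            "-Dliquibase.duplicateFileMode=WARN", // fix 2: tolerate the duplicated changelog
        ),
    )
}
```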

Why a separate PR

These are pure infrastructure fixes — completely independent of any specific benchmark or transport implementation. Carved out from PR #46 (KOJAK-82) so they can land on main right away, without waiting for the Kafka deliverBatch (#40) review cycle. PR #46 will then contain only the refreshed JMH numbers.

Test plan

  • `./gradlew :okapi-benchmarks:compileJmhKotlin` passes
  • Verified locally: full `./gradlew :okapi-benchmarks:jmh` run completes with `BUILD SUCCESSFUL` and no OOM

endrju19 added 2 commits May 17, 2026 10:08
Three independent fixes that make ./gradlew :okapi-benchmarks:jmh
complete cleanly. Before these, the JMH run OOMs partway through.

1. Bump JMH JVM heap from -Xmx2g to -Xmx8g.
   Throughput-mode microbenchmarks call deliver() at ~1M ops/s; each
   call allocates Jackson + Kotlin reflection state for deserialization.
   At 2GB the allocation rate exceeds GC throughput and OOMs within
   the first measurement iteration.

2. Pass -Dliquibase.duplicateFileMode=WARN as JMH JVM arg.
   okapi-postgres.jar and the fat JMH jar both ship the changelog at
   the same classpath path. Liquibase 4.x treats duplicate resources
   as an error by default, which aborts PostgresBenchmarkSupport
   setup. The two files are identical (same jar source on the classpath
   twice), so WARN is safe.

3. Subclass MockProducer in DelivererMicroBenchmark to clear() history
   after every send().
   MockProducer.history (internal `sent` list) retains every record
   sent for inspection — there is no eviction. In throughput mode at
   ~1M ops/s for 30s × forks × iterations that list grew to GBs and
   OOMed the JVM regardless of heap size. Discarding per call is safe
   because microbench doesn't inspect what was sent — only timing.

All three issues exist on main today; running the benchmark suite
without these fixes will fail. No test or production code is touched.
@endrju19 endrju19 merged commit d9aa3be into main May 17, 2026
8 checks passed
@endrju19 endrju19 deleted the fix/benchmark-infra branch May 17, 2026 08:18