Skip Hibernate deep copy for ValueEntity.data JSON column#87

Merged
willr3 merged 1 commit into Hyperfoil:main from stalep:immutable-json-data
May 13, 2026
@stalep stalep commented May 13, 2026

Summary

  • Add @Mutability(Immutability.class) to ValueEntity.data so Hibernate uses reference equality for dirty-checking instead of deserializing and comparing the full JSON tree

Problem

Hibernate's FormatMapperBasedJavaType.deepCopy deserializes and re-serializes the JSONB data column on every flush to create a snapshot for dirty comparison. Profiling showed this consumed ~50% of CPU during uploads (deepCopy + dirty-checking combined).

Fix

@Mutability(Immutability.class) tells Hibernate the field value is never mutated in-place — changes are only made by replacing the reference (data = newData). Hibernate then uses the original reference as the snapshot and compares with == instead of deep tree comparison.

All code paths that modify data use reference replacement, never in-place mutation (e.g., no data.put("key", val)), so reference equality is correct.
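
The difference between the two snapshot strategies can be sketched in plain Java. This is a simplified model, not Hibernate's implementation: the class and method names are hypothetical, and a `Map` stands in for the JSON tree held in `ValueEntity.data`.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of two snapshot strategies; all names here are hypothetical.
class SnapshotStrategyDemo {

    // Mutable-type strategy: copy the value when the entity is loaded.
    // (A real JSON deep copy is far more expensive than this flat-map copy.)
    static Map<String, Object> copySnapshot(Map<String, Object> value) {
        return new HashMap<>(value);
    }

    // Mutable-type dirty check: compare contents.
    static boolean isDirtyByContent(Map<String, Object> snapshot, Map<String, Object> current) {
        return !snapshot.equals(current);
    }

    // Immutable-type strategy: the value itself serves as the snapshot.
    static Map<String, Object> referenceSnapshot(Map<String, Object> value) {
        return value;
    }

    // Immutable-type dirty check: compare references only.
    static boolean isDirtyByReference(Map<String, Object> snapshot, Map<String, Object> current) {
        return snapshot != current;
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("runs", 5);

        Map<String, Object> copied = copySnapshot(data);
        Map<String, Object> shared = referenceSnapshot(data);

        // Nothing changed: both strategies report a clean field.
        System.out.println(isDirtyByContent(copied, data));   // false
        System.out.println(isDirtyByReference(shared, data)); // false

        // Reference replacement (data = newData): both strategies detect it.
        Map<String, Object> newData = new HashMap<>();
        newData.put("runs", 6);
        System.out.println(isDirtyByContent(copied, newData));   // true
        System.out.println(isDirtyByReference(shared, newData)); // true
    }
}
```

In this model the reference strategy allocates nothing and compares in constant time, which is the saving the PR targets.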

Benchmark — rhivos-perf-comprehensive legacy import (5 runs, 4 workers, PostgreSQL)

| Metric | Before | After | Change |
|---|---|---|---|
| Wall-clock | 1m36s | 1m20s | -17% |
| CPU user | 2m02s | 0m59s | -51% |
| dirty-checking samples | 3,946 | 2 | -99.97% |
| deepCopy samples | 2,169 | 0 | -100% |
| GC samples | 1,764 | 1,572 | -11% |

Test plan

  • All 207 existing tests pass
  • Verified no code path mutates data in-place — all are reference replacements
  • Recalculation tests confirm dirty detection still works for existingValue.data = newValue.data reassignment
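
The second check matters because a reference comparison cannot see in-place writes. The sketch below illustrates that hazard in plain Java and shows one possible fail-fast guard. It is an assumption for illustration, not something this PR adds: `freeze` and the class name are hypothetical, and a `Map` stands in for the JSON value.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustration only; names are hypothetical and not part of the PR.
class InPlaceMutationGuard {

    // Reference comparison, as used once a field is declared immutable.
    static boolean isDirtyByReference(Object snapshot, Object current) {
        return snapshot != current;
    }

    // Hypothetical guard: an unmodifiable copy makes in-place writes throw
    // instead of being silently skipped at flush time.
    static Map<String, Object> freeze(Map<String, Object> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("status", "new");
        Map<String, Object> snapshot = data; // snapshot is the same reference

        // In-place mutation: content changes, reference does not.
        data.put("status", "changed");
        System.out.println(isDirtyByReference(snapshot, data)); // false: change is missed

        // With a frozen value, the same mistake fails loudly.
        Map<String, Object> frozen = freeze(data);
        try {
            frozen.put("status", "again");
        } catch (UnsupportedOperationException e) {
            System.out.println("in-place write rejected");
        }

        // Reference replacement is still detected.
        Map<String, Object> replaced = freeze(Map.of("status", "replaced"));
        System.out.println(isDirtyByReference(frozen, replaced)); // true
    }
}
```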

Add @Mutability(Immutability.class) to the JSONB data field so Hibernate
uses reference equality instead of deserialize-compare-reserialize for
dirty checking. All code paths that modify data use reference replacement
(data = newData), never in-place mutation, so reference equality is
correct.

Profiling showed deepCopy + dirty-checking of the JSON column consumed
~50% of CPU during uploads. This eliminates both entirely — deepCopy
drops to 0 samples, dirty-checking to near-zero. Wall-clock improvement
of ~17% on a 4-worker rhivos import (1m36s to 1m20s) with CPU user time
cut in half (2m02s to 59s).
@stalep stalep requested a review from willr3 May 13, 2026 11:46
@willr3 willr3 merged commit 4e7ab8e into Hyperfoil:main May 13, 2026
1 check passed