Skip to content

[bugfix] NumericValue + BooleanValue: hashCode contract for op:same-key (PR #6333 family)#6336

Merged
duncdrum merged 1 commit into
eXist-db:developfrom
joewiz:bugfix/systemic-hashcode-sweep
May 12, 2026
Merged

[bugfix] NumericValue + BooleanValue: hashCode contract for op:same-key (PR #6333 family)#6336
duncdrum merged 1 commit into
eXist-db:developfrom
joewiz:bugfix/systemic-hashcode-sweep

Conversation

@joewiz
Copy link
Copy Markdown
Member

@joewiz joewiz commented May 10, 2026

Summary

Closes the family of AtomicValue subclasses that violate the equals/hashCode contract under op:same-key semantics. PR #6333 fixed xs:duration; this sweep covers xs:integer / xs:decimal / xs:double / xs:float and xs:boolean.

Without these guards, Bifurcan-backed map operations (map:contains, map:get, ...) bucket spec-equal cross-type keys differently and silently mis-report containment. The diagnostic signature is the same "1-in-N pass rate" symptom that produced map-contains-017 on xs:duration.

Why

In MapType.KEY_HASH_FN = AtomicValue::hashCode, Bifurcan uses each value's hashCode for bucketing and only consults sameKey for equality within a bucket. If two spec-equal cross-type values hash to different buckets, the equality check is never reached.

Confirmed contract violations on develop (b917e1a)

Pair equals hashCode
IntegerValue(1) vs DecimalValue(1) true 1 / 31 ❌
IntegerValue(1) vs DoubleValue(1.0) true 1 / 1072693248 ❌
DecimalValue(1.0) vs DoubleValue(1.0) true 31 / 1072693248 ❌
DoubleValue(1.0) vs FloatValue(1.0f) true 1072693248 / 1065353216 ❌
DoubleValue(+INF) vs FloatValue(+INF) true 2146435072 / 2139095040 ❌
new BooleanValue(true) vs new BooleanValue(true) true identity / identity ❌

What changed

File Change
NumericValue.java Add canonical hashCode(): NaN / ±INF sentinels, else ((DecimalValue) convertTo(DECIMAL)).getValue().hashCode() (trailing-zero-stripped at construction). Equals already uses op:numeric-equal, so the contract now holds.
IntegerValue.java, DecimalValue.java, DoubleValue.java, FloatValue.java Drop per-class hashCode() overrides; inherit the canonical form from NumericValue.
BooleanValue.java Add hashCode() = Boolean.hashCode(value). Non-singleton instances previously hashed by identity.
HashCodeContractTest.java (new) 14 contract assertions across the numeric and boolean clusters; 100-iter determinism guard.
maps.xqm 9 XQuery regression cases mirroring map-contains-017 (numeric cross-type, NaN, infinities, non-singleton booleans). mt:for-each updated to sort its result — the previous assertion relied on Bifurcan bucket iteration order, which has no spec guarantee.
custom-assertion.xqm Sort additional-keys diagnostic for determinism; one test expectation updated for new bucket distribution.

Inventory of AtomicValue subclasses surveyed

Class equals override hashCode override Verdict
IntegerValue inherited from NumericValue (cross-type numeric-eq) replaced by canonical NumericValue fixed
DecimalValue inherited replaced fixed
DoubleValue inherited replaced fixed
FloatValue inherited replaced fixed
BooleanValue by-value added fixed
DurationValue (+ OrderedDurationValue / YearMonthDurationValue / DayTimeDurationValue) by canonical tuple (added in PR #6333) scope of PR #6333
StringValue, AnyURIValue, UntypedAtomicValue by value by value already correct (codepoint-equal cluster delegates to String.hashCode)
AbstractDateTimeValue (+ all date/time/gXxx subclasses) by calendar by calendar.hashCode already correct within-class; cross-type sameKey requires lexical match so no cross-type collision needed
BinaryValue (+ BinaryValueFromBinaryString, BinaryValueFromFile, BinaryValueFromInputStream, Base64BinaryDocument) by getClass() + content by stream-read hash within-class consistent; cross-type (hexBinary vs base64Binary) returns false in equals and fn:deep-equal returns false too
QNameValue by qname by qname.hashCode already correct
FunctionReference, JavaObjectValue not used as map keys per spec n/a out of scope

Spec references

Test plan

  • HashCodeContractTest (14/14 pass)
  • MapTests XQuery suite (134 pass, was failing mt:for-each due to bucket order)
  • XQSuiteTests (90 pass after custom-assertion.xqm determinism fix)
  • Full mvn test -pl exist-core under shared lock (6615 tests, no hashCode-related failures)
  • XQTS QT4 / XQ 3.1 / FTTS - to be validated in CI; runner-JAR-shaded-exist-core hazard means local runs are unreliable for this change shape (cf. project_xqts_runner_info_drift_warnings.md)

Risk and rollback

Hash bucket layout for numeric and boolean keys changes. Map iteration order is implementation-defined per spec; tests that asserted specific orderings have been updated to sort. Each value-class change is independently revertable.

Closes the family of AtomicValue subclasses whose equals/hashCode contract is
violated under op:same-key, the same Bifurcan-bucket hazard PR eXist-db#6333 closed
on xs:duration:

  IntegerValue(1).equals(DecimalValue(1))             // true
  IntegerValue(1).hashCode() == 1
  DecimalValue(1).hashCode()  == 31                   // contract broken
  IntegerValue(1).hashCode() == DoubleValue(1.0).hashCode()  // 1 vs 1072693248

The numeric subclasses each inherited NumericValue.equals (cross-type
op:numeric-equal) but overrode hashCode with type-local BigInteger / BigDecimal
/ Double / Float hashes. Spec-equal cross-type pairs landed in different
Bifurcan buckets and map:contains / map:get silently mis-reported containment.

BooleanValue had the same shape: an equals override on value, no hashCode
override, so non-singleton instances hashed by identity.

Fix mirrors PR eXist-db#6333: a canonical hash invariant across the equals-equivalence
class.

- NumericValue: single hashCode() using NaN / +-INF sentinels and otherwise
  ((DecimalValue) convertTo(DECIMAL)).getValue().hashCode() over the
  trailing-zero-stripped canonical decimal form.
- IntegerValue / DecimalValue / DoubleValue / FloatValue: drop per-class
  hashCode overrides; inherit the canonical form.
- BooleanValue: add hashCode = Boolean.hashCode(value).

HashCodeContractTest (new, 14 assertions) covers the cross-type numeric pairs,
NaN, +-INF, non-singleton BooleanValue, and a 100-iter determinism guard.
maps.xqm adds 9 XQuery regression cases mirroring map-contains-017 across the
numeric cluster, NaN, infinities, and non-singleton booleans.

Two existing tests asserted specific map iteration orders. Per the XPath 3.1
spec map:keys and map:for-each are implementation-defined-order; the
assertions were guarding Bifurcan bucket layout, which changes with hashCode.
Updated:
- mt:for-each in maps.xqm now sorts its result before assertion.
- custom-assertion.xqm sorts its "Additional keys found" diagnostic and the
  ca:map-assertion-additional-key expectation is updated for the new bucket
  layout.

Out of scope: DurationValue family (closed by PR eXist-db#6333). Date/time, binary,
QName, and codepoint clusters already satisfy the contract (verified).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz joewiz requested a review from a team as a code owner May 10, 2026 19:48
@line-o line-o added the xquery issue is related to xquery implementation label May 11, 2026
@line-o line-o added this to Wave 2 and v7.0.0 May 11, 2026
@github-project-automation github-project-automation Bot moved this to Todo in Wave 2 May 11, 2026
@line-o line-o added this to the eXist-7.0.0 milestone May 11, 2026
Copy link
Copy Markdown
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@line-o line-o requested a review from a team May 11, 2026 19:58
@duncdrum duncdrum merged commit 17b6da9 into eXist-db:develop May 12, 2026
14 of 15 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in Wave 2 May 12, 2026
@github-project-automation github-project-automation Bot moved this to Done in v7.0.0 May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

xquery issue is related to xquery implementation

Projects

Status: Done
Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants