IGNITE-28430 Remove Objects.hash usages by valepakh · Pull Request #7914 · apache/ignite-3

valepakh · 2026-04-02T09:08:50Z

https://issues.apache.org/jira/browse/IGNITE-28430

The only meaningful change here is in MessageImplGenerator, where the generated network message implementation now builds its hashCode method using the same pattern as everywhere else.

...rc/main/java/org/apache/ignite/internal/network/processor/messages/MessageImplGenerator.java

PakhomovAlexander · 2026-04-03T11:28:50Z

modules/catalog/src/main/java/org/apache/ignite/internal/catalog/storage/SnapshotEntry.java

    @Override
    public int hashCode() {
-        return Objects.hash(version);
+        return version;


Hash code values change with this PR — please verify this is safe for all affected classes.

This is not a transparent refactor. Objects.hash() and the manual 31 * result + ... pattern produce different hash codes for the same inputs, because Objects.hash() (which delegates to Arrays.hashCode(Object[])) starts with an initial seed of 1, while the manual pattern seeds with the first field's hash directly.

Concretely:

Single-field case (this file, NullableValue, DisposableDeploymentUnit, etc.):

Before: Objects.hash(version) → Arrays.hashCode(new Object[]{version}) → 31 * 1 + version = 31 + version

After: return version; → just version

Two-field case (e.g., StoredRaftNodeId, ClusterTag, PartitionKey):

Before: Objects.hash(a, b) → 31 * (31 * 1 + hash(a)) + hash(b) = 961 + 31*hash(a) + hash(b)

After: 31 * hash(a) + hash(b)

N-field case — same pattern, the 31^N constant from the initial seed propagates through.
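The arithmetic above can be checked directly. This is only an illustrative sketch: the class `SeedOffsetDemo` and the helper `unseeded` are made-up names for this comment, not code from the PR.

```java
import java.util.Objects;

public class SeedOffsetDemo {
    /** Manual pattern seeded with the first field's hash (no initial seed of 1). */
    static int unseeded(Object a, Object b) {
        return 31 * Objects.hashCode(a) + Objects.hashCode(b);
    }

    public static void main(String[] args) {
        int version = 7;
        // Single field: Objects.hash adds 31 * 1 on top of the field's own hash.
        System.out.println(Objects.hash(version) - version); // 31

        String a = "node";
        Integer b = 5;
        // Two fields: the initial seed contributes 31 * 31 = 961.
        System.out.println(Objects.hash(a, b) - unseeded(a, b)); // 961
    }
}
```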

For purely in-memory HashMap/HashSet usage this is fine — the JVM makes no guarantees about hash stability across versions anyway. But in a distributed system like Ignite, it's worth explicitly verifying that none of the affected classes have their hashCode() result:

Persisted to disk (e.g., as part of a serialized data structure, snapshot, or index)

Transmitted over the network and compared/used on a remote node (e.g., for partition assignment or routing)

Used in rolling-upgrade scenarios where nodes running old code and new code must agree on hash values

Stored in external systems (e.g., logged and later grep'd, or used as cache keys in an external store)

Classes I'd look at most carefully given their names and locations:

SnapshotEntry (this file) — catalog storage, snapshot-related

Lease — placement driver leases

PartitionKey — partition replicator, raft snapshots

RaftGroupConfiguration — raft consensus config

CatalogSerializationChecker — serialization tests (the test itself may need updated expected values)

If any of these persist or transmit hash codes, this change could cause subtle issues during rolling upgrades (old nodes produce 31 + version, new nodes produce version).

If you've already verified this — a brief note in the PR description ("hash codes are not persisted or transmitted for any of these classes") would help future readers understand why the change is safe.

Fixed by keeping the same results as Objects.hash

PakhomovAlexander

MessageImplGenerator: field ordering change in hashCode

The old code hashes fields in primitives → objects → arrays order (lines 648-662), while the new code hashes in declaration order (message.getters()).

For any message where a non-primitive field is declared before a primitive field, this produces a different hash code.

Example — a message with String name(), int id() (in this declaration order):

Old: Objects.hash(this.id, this.name) → 31 * (31 + id) + Objects.hashCode(name)
New: declaration order → 31 * (31 + Objects.hashCode(name)) + id

With name="test" (hashCode=3556498), id=42:

Old: 31 * 73 + 3556498 = 3,558,761
New: 31 * 3556529 + 42 = 110,252,441

In practice this is likely safe — network message hashCode() is only used for in-memory collections, not persisted or sent over the wire, and generated code is rebuilt each compilation. But it is a behavioral change worth acknowledging.
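The worked example can be reproduced with a small program. It is a sketch under assumptions: `FieldOrderDemo`, `oldOrder` and `newOrder` are hypothetical names that imitate the two orderings described above, not the generated code itself.

```java
import java.util.Objects;

public class FieldOrderDemo {
    /** Old generated order: primitives first, then objects. */
    static int oldOrder(String name, int id) {
        return Objects.hash(id, name);
    }

    /** New order: declaration order (name, then id), still seeded like Objects.hash. */
    static int newOrder(String name, int id) {
        int result = 31 + Objects.hashCode(name);
        result = 31 * result + Integer.hashCode(id);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(oldOrder("test", 42)); // 3558761
        System.out.println(newOrder("test", 42)); // 110252441
    }
}
```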

valepakh · 2026-04-05T11:30:42Z

MessageImplGenerator: field ordering change in hashCode

ClusterNodeMessage is the only key message affected. Old: Objects.hash(port, id, name, host, userAttrs, sysAttrs, profiles). New: hash(id, name, host, port, userAttrs, sysAttrs, profiles).

Why it's safe despite the change

hashCode is NEVER serialized over the wire. Generated message serializers only write field values (writer.writeInt, writer.writeString, etc.). The hashCode value itself is never transmitted.
hashCode is NEVER persisted to disk. Not in raft logs, raft snapshots, metastorage, or catalog storage. Raft commands are stored as serialized byte buffers containing only field data.
Maps/Sets with message keys are fully reconstructed on deserialization. DirectByteBufferStreamImplV1.readMap() creates a new HashMap and inserts deserialized entries using map.put(), which computes hashCode locally on the receiver. The sender's hash bucket layout is not preserved.
No cross-node hashCode agreement is needed. No protocol compares hashCode values between nodes. Message routing uses groupType() and messageType() (short integer IDs), not hashCode.
Rolling upgrades are safe. Even if an old node sends a NodesLeaveCommand with Set to a new node via raft:

The Set is serialized as individual elements (field data only)
The receiving node deserializes each element and inserts into a new local HashSet using its own hashCode
The logical content is identical regardless of hash values

No deterministic iteration dependency. No code relies on HashMap/HashSet iteration order of message collections. Java makes no such guarantee anyway.
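The argument in the list above can be sketched in a few lines. Everything here is hypothetical: `Node`, its `newOrder` flag, and the `serialize`/`deserialize` helpers only simulate an old and a new code version inside one JVM, and they are not the Ignite serializers.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RebuildOnReceiveDemo {
    /** Hypothetical message key; 'newOrder' selects which hashCode formula is used. */
    static final class Node {
        final int port;
        final String name;
        final boolean newOrder;

        Node(int port, String name, boolean newOrder) {
            this.port = port;
            this.name = name;
            this.newOrder = newOrder;
        }

        @Override
        public boolean equals(Object o) {
            // The flag is compared too, so equal objects always share one hash formula.
            if (!(o instanceof Node)) {
                return false;
            }
            Node that = (Node) o;
            return port == that.port && name.equals(that.name) && newOrder == that.newOrder;
        }

        @Override
        public int hashCode() {
            return newOrder
                    ? 31 * (31 + name.hashCode()) + port   // declaration order
                    : 31 * (31 + port) + name.hashCode();  // primitives first
        }
    }

    /** Simulated wire format: field values only, no hash codes. */
    static List<Object[]> serialize(Set<Node> nodes) {
        List<Object[]> out = new ArrayList<>();
        for (Node n : nodes) {
            out.add(new Object[] {n.port, n.name});
        }
        return out;
    }

    /** The receiver rebuilds the set with add(), hashing with its own formula. */
    static Set<Node> deserialize(List<Object[]> wire, boolean newOrder) {
        Set<Node> set = new HashSet<>();
        for (Object[] fields : wire) {
            set.add(new Node((Integer) fields[0], (String) fields[1], newOrder));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<Node> sent = new HashSet<>();
        sent.add(new Node(3344, "a", false));
        sent.add(new Node(3345, "b", false));

        Set<Node> received = deserialize(serialize(sent), true);
        // Hash values differ between the two formulas...
        System.out.println(new Node(3344, "a", false).hashCode()
                != new Node(3344, "a", true).hashCode());              // true
        // ...but lookups on the receiver still work, because it hashed locally.
        System.out.println(received.contains(new Node(3344, "a", true))); // true
        System.out.println(received.size() == sent.size());               // true
    }
}
```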

The only theoretical risk

If some code path iterated over a Set and the iteration order happened to matter (e.g., producing deterministic output for comparison), the changed hashCode could alter iteration order. However, no such usage exists in the codebase: NodesLeaveCommand.nodes() is iterated to remove nodes from the logical topology, and order doesn't matter there (CmgRaftGroupListener line 316 streams and collects to a new set).

Conclusion

It is safe to change the hashCode field ordering for network messages. PakhomovAlexander's analysis of the behavioral change is technically correct (the hash values do change for messages with mixed field types), but it has no observable impact because hashCode is purely a JVM-local concern for these messages, never crossing serialization or process boundaries.

AMashenkov · 2026-04-06T09:50:09Z

Most changes in tests are not on the hot path.
Does it make sense to apply the rule to the tests?

Most of the objects are not persistent/serializable and their hash codes are not transferred.
So, there is no need to keep "compatibility".
E.g. all touched SQL objects, Nullable, QualifiedName.

valepakh · 2026-04-06T09:57:07Z

Most changes in tests are not on the hot path. Does it make sense to apply the rule to the tests?

It doesn't. Do you want me to revert all of the test changes?
Apparently, excluding a PMD rule from tests is not trivial.

Most of the objects are not persistent/serializable and their hash codes are not transferred. So, there is no need to keep "compatibility". E.g. all touched SQL objects, Nullable, QualifiedName.

It's true but I didn't want to go over each change and see if it needs to be kept the same or not, just made a bulk edit.

ivanzlenko · 2026-04-06T10:26:42Z

Most changes in tests are not on the hot path.
Does it make sense to apply the rule to the tests?

Most of the objects are not persistent/serializable and their hash codes are not transferred.
So, there is no need to keep "compatibility".
E.g. all touched SQL objects, Nullable, QualifiedName.

No, we shouldn't make any difference between test and production code base.

It creates inconsistencies
We should treat test code the same way we treat production
Any performance gains are valid especially for test code

AMashenkov · 2026-04-06T12:03:03Z

It's true but I didn't want to go over each change and see if it needs to be kept the same or not, just made a bulk edit.

"Go over each change and see if it needs" is exactly what an engineer should do: prove the change is reasonable, right?
As a reviewer, I went through every change and tried to understand why it was done the way it was.
A bulk edit saves your time, but not the reviewers'.

valepakh · 2026-04-06T12:06:51Z

It's true but I didn't want to go over each change and see if it needs to be kept the same or not, just made a bulk edit.

"Go over each change and see if it needs" is exactly what an engineer should do: prove the change is reasonable, right? As a reviewer, I went through every change and tried to understand why it was done the way it was. A bulk edit saves your time, but not the reviewers'.

In this case I think it's better to do it in bulk so that we don't make the same mistake again.
What are you suggesting, to not merge this at all?
Why?

...ecords-tests/src/testFixtures/java/org/apache/ignite/internal/schema/marshaller/Records.java

AMashenkov · 2026-04-06T14:39:54Z

Most changes in tests are not on the hot path.
Does it make sense to apply the rule to the tests?
Most of the objects are not persistent/serializable and their hash codes are not transferred.
So, there is no need to keep "compatibility".
E.g. all touched SQL objects, Nullable, QualifiedName.

No, we shouldn't make any difference between test and production code base.

It creates inconsistencies

We should treat test code the same way we treat production

Any performance gains are valid especially for test code

1, 2. Do you mean inconsistencies in style? This is not about style. I see performance vs readability here.
Readability in tests is more important.
3. In tests I see classes which implement an unused hashCode method.
The author just generated it, or followed the equals contract when implementing an equals method.

Has anyone measured performance, like we did in #7932?

It is not clear why hashCode uses the pattern int hash = 31 + .... everywhere. Should we always use this pattern?
Why are there no comments in the places where preserving compatibility is mandatory and the pattern was used?

ivanzlenko · 2026-04-06T15:09:24Z

1, 2. Do you mean inconsistencies in styles? This is not about a style. I see performance vs readability here.
Readability in tests is more important.

I'm not talking about code style. We have rules which we should stick to while writing code. They should be EQUAL throughout the WHOLE code base. Period.

ivanzlenko · 2026-04-06T15:10:22Z

In tests I see classes which implement an unused hashCode method.
The author just generated it, or followed the equals contract when implementing an equals method.

I do not get your point. If someone implemented an empty hashCode method, it should be flagged as a bug and fixed.

ivanzlenko · 2026-04-06T15:13:41Z

It is not clear why hashCode uses the pattern int hash = 31 + .... everywhere.

The pattern came from this book by some unknown author: https://www.amazon.com/dp/0321356683
Maybe he is somehow related to Java, maybe not. Legends tell different stories about this.

valepakh · 2026-04-06T15:21:14Z

Has anyone measured performance, like we did in #7932?

The original issue was a large number of allocations generated by this method, noticed in one of the traces when one of the objects appeared on the hot path.

It is not clear why hashCode uses the pattern int hash = 31 + .... everywhere. Should we always use this pattern? Why are there no comments in the places where preserving compatibility is mandatory and the pattern was used?

The pattern starts with 31 to keep the value identical to the Objects.hash implementation. Rather than going through each instance and trying to understand whether compatibility should be preserved or not, I just used the same pattern everywhere. It doesn't matter what the value is as long as it satisfies the contract, and I don't think anyone closely examines hashCode methods.
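For illustration, here is how the seeded pattern might look on a two-field class. `SeededPatternDemo.Key` and its fields are hypothetical, chosen only to show that starting from 31 reproduces the Objects.hash value, including for null fields.

```java
import java.util.Objects;

public class SeededPatternDemo {
    /** Hypothetical two-field key, not a class from the PR. */
    static final class Key {
        final int partitionId;
        final String zone;

        Key(int partitionId, String zone) {
            this.partitionId = partitionId;
            this.zone = zone;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key that = (Key) o;
            return partitionId == that.partitionId && Objects.equals(zone, that.zone);
        }

        @Override
        public int hashCode() {
            // 31 * 1 + hash(first field): the same seed that Arrays.hashCode uses.
            int hash = 31 + Integer.hashCode(partitionId);
            hash = 31 * hash + Objects.hashCode(zone);
            return hash;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Key(12, "zone-a").hashCode() == Objects.hash(12, "zone-a")); // true
        System.out.println(new Key(12, null).hashCode() == Objects.hash(12, null));         // true
    }
}
```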

AMashenkov · 2026-04-07T08:42:14Z

I'm ok with the changes and the int hash = 31 + .... pattern.
But I'm really confused by the argumentation, guys.

Imho, an engineer must dive into details; changes must be reasonable; performance must be proved; rules shouldn't exist just for the sake of rules.
Why don't we have rules to avoid using streams or loops over Iterable? They also create garbage....

valepakh · 2026-04-07T08:55:11Z

I'm ok with the changes and the int hash = 31 + .... pattern. But I'm really confused by the argumentation, guys.

Imho, an engineer must dive into details; changes must be reasonable; performance must be proved; rules shouldn't exist just for the sake of rules. Why don't we have rules to avoid using streams or loops over Iterable? They also create garbage....

My reasoning was that accidentally using Objects.hash on a hot path is really easy, and it's not immediately obvious that it will create additional garbage allocations. That's why I added a PMD rule: to prevent accidental uses on a hot path, which in turn required all of the usages to be removed.
I didn't bother proving the performance since this change simply eliminates the Object array allocations from this method altogether.
Well, I actually did test it, but in a broader context and in synthetic benchmarks, which showed a decrease in allocation count, but, again, this is obvious and the numbers greatly depend on the specific code path taken.
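To make the allocation point concrete, this is roughly what a varargs call to Objects.hash expands to. The class and method names are made up for the sketch, and whether the JIT manages to remove the array and the boxed Integer in a given code path is not something this example measures.

```java
import java.util.Arrays;
import java.util.Objects;

public class VarargsAllocationDemo {
    /** Roughly what the compiler emits for Objects.hash(id, name). */
    static int desugared(int id, String name) {
        // A new Object[] per call, plus a boxed Integer (cached only for small values).
        Object[] values = new Object[] {Integer.valueOf(id), name};
        return Arrays.hashCode(values);
    }

    /** Same value computed without the array or the boxing. */
    static int manual(int id, String name) {
        int hash = 31 + Integer.hashCode(id);
        return 31 * hash + Objects.hashCode(name);
    }

    public static void main(String[] args) {
        System.out.println(desugared(100_000, "node") == Objects.hash(100_000, "node")); // true
        System.out.println(manual(100_000, "node") == Objects.hash(100_000, "node"));    // true
    }
}
```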

valepakh added 3 commits April 2, 2026 12:22

IGNITE-28430 Remove Objects.hash usages

fc2d9db

Refactor message generator

7b04709

Add pmd rule check to prohibit Objects.hash() in hashCode()

adcde20

ivanzlenko approved these changes Apr 2, 2026

PakhomovAlexander reviewed Apr 3, 2026

PakhomovAlexander reviewed Apr 3, 2026

valepakh added 3 commits April 3, 2026 15:53

Keep existing calculations to be exactly the same as Objects.hash

273f3d9

Revert extra changes

dbab754

Revert extra change

37a0d7f

PakhomovAlexander reviewed Apr 3, 2026

AMashenkov reviewed Apr 6, 2026

valepakh closed this Apr 7, 2026

4 participants

valepakh commented Apr 2, 2026 •

edited

Loading

valepakh commented Apr 5, 2026 •

edited

Loading

valepakh commented Apr 6, 2026 •

edited

Loading

ivanzlenko commented Apr 6, 2026 •

edited

Loading

AMashenkov commented Apr 6, 2026 •

edited

Loading

ivanzlenko commented Apr 6, 2026 •

edited

Loading

valepakh commented Apr 7, 2026 •

edited

Loading