IGNITE-28430 Remove Objects.hash usages #7914
...rc/main/java/org/apache/ignite/internal/network/processor/messages/MessageImplGenerator.java
  @Override
  public int hashCode() {
-     return Objects.hash(version);
+     return version;
Hash code values change with this PR — please verify this is safe for all affected classes.
This is not a transparent refactor. Objects.hash() and the manual 31 * result + ... pattern produce different hash codes for the same inputs, because Objects.hash() (which delegates to Arrays.hashCode(Object[])) starts with an initial seed of 1, while the manual pattern seeds with the first field's hash directly.
Concretely:
Single-field case (this file, NullableValue, DisposableDeploymentUnit, etc.):
- Before: Objects.hash(version) → Arrays.hashCode(new Object[]{version}) → 31 * 1 + version = 31 + version
- After: return version; → just version
Two-field case (e.g., StoredRaftNodeId, ClusterTag, PartitionKey):
- Before: Objects.hash(a, b) → 31 * (31 * 1 + hash(a)) + hash(b) = 961 + 31*hash(a) + hash(b)
- After: 31 * hash(a) + hash(b)
N-field case — same pattern, the 31^N constant from the initial seed propagates through.
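To make the difference concrete, here is a minimal sketch. The class and method names are made up for illustration and are not taken from the PR; the "seedless" methods only mirror the shape of the replacement code described above.

```java
import java.util.Objects;

// Hypothetical demo class, not part of the Ignite code base.
final class HashSeedDemo {
    // Objects.hash delegates to Arrays.hashCode(Object[]), which starts from seed 1.
    static int objectsHashOne(int version) {
        return Objects.hash(version);
    }

    // Seedless single-field variant: the field value is returned as-is.
    static int seedlessOne(int version) {
        return version;
    }

    static int objectsHashTwo(Object a, Object b) {
        return Objects.hash(a, b);
    }

    // Seedless two-field variant: the first field's hash acts as the seed.
    static int seedlessTwo(Object a, Object b) {
        return 31 * Objects.hashCode(a) + Objects.hashCode(b);
    }

    public static void main(String[] args) {
        System.out.println(objectsHashOne(7));        // 31 * 1 + 7
        System.out.println(seedlessOne(7));           // 7
        System.out.println(objectsHashTwo("a", "b")); // 31 * (31 + 97) + 98
        System.out.println(seedlessTwo("a", "b"));    // 31 * 97 + 98
    }
}
```

For any pair of inputs the two-field results differ by exactly 961 (31 squared), modulo int overflow.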
For purely in-memory HashMap/HashSet usage this is fine — the JVM makes no guarantees about hash stability across versions anyway. But in a distributed system like Ignite, it's worth explicitly verifying that none of the affected classes have their hashCode() result:
- Persisted to disk (e.g., as part of a serialized data structure, snapshot, or index)
- Transmitted over the network and compared/used on a remote node (e.g., for partition assignment or routing)
- Used in rolling-upgrade scenarios where nodes running old code and new code must agree on hash values
- Stored in external systems (e.g., logged and later grep'd, or used as cache keys in an external store)
Classes I'd look at most carefully given their names and locations:
- SnapshotEntry (this file): catalog storage, snapshot-related
- Lease: placement driver leases
- PartitionKey: partition replicator, raft snapshots
- RaftGroupConfiguration: raft consensus config
- CatalogSerializationChecker: serialization tests (the test itself may need updated expected values)
If any of these persist or transmit hash codes, this change could cause subtle issues during rolling upgrades (old nodes produce 31 + version, new nodes produce version).
If you've already verified this — a brief note in the PR description ("hash codes are not persisted or transmitted for any of these classes") would help future readers understand why the change is safe.
Fixed by keeping the same results as Objects.hash
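A minimal sketch of what "keeping the same results" could look like, assuming a hypothetical class with an int field and a String field. The names are illustrative and the actual fix in the PR may differ. Seeding the accumulator with 1 reproduces the Objects.hash value without the varargs array or the boxing of the primitive.

```java
import java.util.Objects;

// Hypothetical example class, not part of the Ignite code base.
final class SeededHashDemo {
    private final int version;
    private final String name;

    SeededHashDemo(int version, String name) {
        this.version = version;
        this.name = name;
    }

    @Override
    public int hashCode() {
        // Start from 1, as Arrays.hashCode(Object[]) does, so the value matches Objects.hash.
        int result = 1;
        result = 31 * result + Integer.hashCode(version);
        result = 31 * result + Objects.hashCode(name); // null-safe, a null field contributes 0
        return result;
    }

    // Reference value computed the old way, kept only for comparison.
    int viaObjectsHash() {
        return Objects.hash(version, name);
    }

    public static void main(String[] args) {
        SeededHashDemo demo = new SeededHashDemo(42, "test");
        System.out.println(demo.hashCode() == demo.viaObjectsHash()); // true
    }
}
```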
PakhomovAlexander
left a comment
MessageImplGenerator: field ordering change in hashCode
The old code hashes fields in primitives → objects → arrays order (lines 648-662), while the new code hashes in declaration order (message.getters()).
For any message where a non-primitive field is declared before a primitive field, this produces a different hash code.
Example — a message with String name(), int id() (in this declaration order):
- Old: Objects.hash(this.id, this.name) → 31 * (31 + id) + Objects.hashCode(name)
- New: declaration order → 31 * (31 + Objects.hashCode(name)) + id
With name="test" (hashCode=3556498), id=42:
- Old: 31 * 73 + 3556498 = 3,558,761
- New: 31 * 3556529 + 42 = 110,252,441
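The numbers above can be reproduced with a short sketch. The class and method names are hypothetical, and Objects.hash is used for both orderings purely for illustration, since a seeded manual pattern yields the same values.

```java
import java.util.Objects;

// Hypothetical demo, not generated by MessageImplGenerator.
final class FieldOrderDemo {
    // Old ordering: primitives first, then objects.
    static int primitivesFirst(String name, int id) {
        return Objects.hash(id, name);
    }

    // New ordering: declaration order (name, then id).
    static int declarationOrder(String name, int id) {
        return Objects.hash(name, id);
    }

    public static void main(String[] args) {
        System.out.println("test".hashCode());              // 3556498
        System.out.println(primitivesFirst("test", 42));    // 3558761
        System.out.println(declarationOrder("test", 42));   // 110252441
    }
}
```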
In practice this is likely safe — network message hashCode() is only used for in-memory collections, not persisted or sent over the wire, and generated code is rebuilt each compilation. But it is a behavioral change worth acknowledging.
ClusterNodeMessage is the only key message affected. Old: Objects.hash(port, id, name, host, userAttrs, sysAttrs, profiles). New: hash(id, name, host, port, userAttrs, sysAttrs, profiles).
Why it's safe despite the change
The only theoretical risk
If some code path iterated over a Set and the iteration order happened to matter (e.g., producing deterministic output for comparison), the changed hashCode could alter iteration order. However, no such
Conclusion
It is safe to change the hashCode field ordering for network messages. PakhomovAlexander's analysis of the behavioral change is technically correct: the hash values do change for messages with mixed field types, but it has
Most of the changes in tests are not on the hot path. Most of the objects are not persistent/serializable, and their hash codes are not transferred.
It doesn't. Do you want me to revert all of the test changes?
It's true, but I didn't want to go over each change and see whether it needs to be kept the same or not; I just made a bulk edit.
No, we shouldn't make any distinction between the test and production code bases.
"go over each change and see if it needs": this is what an engineer should do, prove that the change is reasonable, right?
In this case I think it's better to do it in bulk so that we don't make the same mistake again.
...ecords-tests/src/testFixtures/java/org/apache/ignite/internal/schema/marshaller/Records.java
1, 2. Do you mean inconsistencies in style? This is not about style. I see performance vs. readability here. Has anyone measured performance, like we did in #7932? It is not clear why the hashcode uses pattern
I'm not talking about code style. We have rules to which we should stick while writing code. They should be EQUAL throughout the WHOLE code base. Period.
I do not get your point. If someone implemented an empty hash code method, it should be flagged as a bug and fixed.
The pattern came from this book by some unknown author: https://www.amazon.com/dp/0321356683
The original issue was a large number of allocations generated by this method noticed in one of the traces when one of the objects appeared in the hot path.
This pattern, starting with 31, is there to keep the value identical to the
I'm OK with the changes, and IMHO an engineer must dive into details; changes must be reasonable; performance must be proved; rules shouldn't exist just for the sake of rules.
My reasoning was that accidentally using
https://issues.apache.org/jira/browse/IGNITE-28430
The only meaningful change here is in MessageImplGenerator, where the network message implementation now generates the hashCode method using the same pattern as everywhere else.