More Compact Serialization of Metadata #82608

original-brownbear · 2022-01-14T12:14:29Z

Serialize the map of hashes to mappings once and then look each mapping up
from that map, instead of serializing the mappings over and over for each index.
This makes full cluster state transport messages much smaller in the common
case of many duplicate mappings.

This should make requests for the full cluster state (or at least the state including mappings) quite a bit cheaper on the master node in terms of memory, CPU, and network. It also saves a lot of buffers on the coordinating/sending node, as well as the CPU spent deduplicating mappings.

relates #77466
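
To illustrate the idea in the description above, here is a minimal, self-contained sketch of the wire layout: the hash-to-mapping table is written once, and each index then carries only its mapping hash. This is not the code from this PR. The class name, the method names, and the use of plain strings and `DataOutputStream` in place of Elasticsearch's stream classes and `MappingMetadata` are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: strings stand in for mapping metadata and hashes.
class DedupMappingsSketch {

    // Writes each distinct mapping once, keyed by hash, then one hash per index.
    static byte[] write(Map<String, String> indexToHash, Map<String, String> hashToMapping) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(hashToMapping.size());
            for (Map.Entry<String, String> e : hashToMapping.entrySet()) {
                out.writeUTF(e.getKey());   // mapping hash
                out.writeUTF(e.getValue()); // mapping body, written only once
            }
            out.writeInt(indexToHash.size());
            for (Map.Entry<String, String> e : indexToHash.entrySet()) {
                out.writeUTF(e.getKey());   // index name
                out.writeUTF(e.getValue()); // reference to the mapping by hash
            }
        }
        return bytes.toByteArray();
    }

    // Reads the hash table first, then resolves each index's mapping by lookup.
    static Map<String, String> read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int mappings = in.readInt();
            Map<String, String> byHash = new HashMap<>();
            for (int i = 0; i < mappings; i++) {
                String hash = in.readUTF();
                String mapping = in.readUTF();
                byHash.put(hash, mapping);
            }
            int indices = in.readInt();
            Map<String, String> indexToMapping = new HashMap<>();
            for (int i = 0; i < indices; i++) {
                String index = in.readUTF();
                String hash = in.readUTF();
                // Indices with the same hash share one deserialized instance.
                indexToMapping.put(index, byHash.get(hash));
            }
            return indexToMapping;
        }
    }
}
```

With many indices pointing at the same hash, the message size grows with the number of distinct mappings plus one short hash per index, and the reader ends up with shared instances without a separate deduplication pass.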

elasticmachine · 2022-01-14T12:14:33Z

Pinging @elastic/es-distributed (Team:Distributed)

arteam · 2022-01-14T13:28:23Z

server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java

+        if (in.getVersion().onOrAfter(MAPPINGS_AS_HASH_VERSION)) {
+            final int mappings = in.readVInt();
+            if (mappings > 0) {
+                final Map<String, MappingMetadata> mappingMetadataMap = new HashMap<>(mappings);


The HashMap constructor accepts the initial capacity, not the expected number of elements. With the default load factor of 0.75, the map needs to be sized somewhat higher than mappings; otherwise it may have to be resized/rehashed while it is being filled.

See https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Maps.java#L273

True, though I guess it might be worthwhile to have a general fix to this. We seem to always pre-size capacity == element count in deserialization. Technically, we probably could move to accounting for the load factor, but I wouldn't expect too much from it (especially when the key's hashcode is essentially free).
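
A minimal sketch of the load-factor accounting discussed in this thread, assuming the default `HashMap` load factor of 0.75. The class and method names are hypothetical and are not an existing Elasticsearch helper. The helper picks an initial capacity large enough that inserting the expected number of entries should not cross the resize threshold.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, shown only to illustrate the sizing arithmetic.
class PresizedMaps {

    // Smallest capacity c such that expectedSize <= c * 0.75 (default load factor).
    static int capacityFor(int expectedSize) {
        if (expectedSize < 0) {
            throw new IllegalArgumentException("expectedSize must be >= 0");
        }
        // The double-to-int cast saturates at Integer.MAX_VALUE for huge inputs.
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Creates a map that should hold expectedSize entries without rehashing.
    static <K, V> Map<K, V> newMapWithExpectedSize(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize));
    }
}
```

In the diff above, this would correspond to something like `PresizedMaps.newMapWithExpectedSize(mappings)` in place of `new HashMap<>(mappings)`. On JDK 19 and later, `HashMap.newHashMap(int numMappings)` performs equivalent sizing in the standard library.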

original-brownbear · 2022-01-14T14:25:35Z

Thanks Ievgen!

More Compact Serialization of Metadata

e9cb774

original-brownbear added >enhancement :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.1.0 labels Jan 14, 2022

elasticmachine added the Team:Distributed Meta label for distributed team label Jan 14, 2022

idegtiarenko approved these changes Jan 14, 2022

arteam reviewed Jan 14, 2022

original-brownbear merged commit 62db2ae into elastic:master Jan 14, 2022

original-brownbear deleted the efficient-serialization-metadata-over-wire branch January 14, 2022 14:25

original-brownbear mentioned this pull request Jan 14, 2022

Fix Large Shard Count Scalability Issues #77466

Open

97 tasks

original-brownbear mentioned this pull request Jan 27, 2022

A Node Joining a Cluster with a Large State Receives the Full Uncompressed State in a ValidateJoinRequest #83204

Closed

joegallo mentioned this pull request Jul 13, 2022

Deduplicate mappings in persisted cluster state #88479

Merged

original-brownbear restored the efficient-serialization-metadata-over-wire branch April 18, 2023 20:38