Segment and host names stored in ZooKeeper as part of cluster and data metadata can be very long. Their length depends on the table names and host names used in the cluster. A couple of examples:
pinot-controller-controller-0-0.pinot-pinot-controller-headless.cell-bzf7co-managed.svc.cluster.local_9000
nation_dm2_0_output_4341_csv_FileIngestionTask_1732618352252_3715
In a test setup with a table of 200K segments, there are 5 million String objects that take up 247 MB of memory.
A couple of stack traces of allocations:
↖{j.u.LinkedHashMap}.values
↖{j.u.TreeMap}.values
↖org.apache.helix.zookeeper.datamodel.ZNRecord.mapFields
↖org.apache.helix.model.CurrentState._record
↖{j.u.LinkedHashMap}.values
↖{j.u.TreeMap}.values
↖org.apache.helix.zookeeper.datamodel.ZNRecord.mapFields
↖org.apache.helix.model.ResourceConfig._record
↖{j.u.HashMap}.values
↖org.apache.helix.common.caches.PropertyCache._objMap
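To get a feel for what each name costs on the heap, here is a rough back-of-the-envelope sketch. It is not taken from Pinot or Helix. It assumes a 64-bit HotSpot JVM with compressed oops and compact strings (JDK 9+), where an ASCII String is stored as one byte per character. The class name `StringFootprint` and the size constants are illustrative assumptions; actual sizes depend on the JVM and its flags.

```java
public class StringFootprint {
    // Assumed layout (64-bit HotSpot, compressed oops, compact strings):
    // String object = 12-byte header + value ref + hash + coder + flag, padded to 24 bytes.
    static final int STRING_OBJECT_BYTES = 24;
    // byte[] header (mark word + class pointer + length field) = 16 bytes.
    static final int ARRAY_HEADER_BYTES = 16;

    // Objects are assumed to be aligned to 8-byte boundaries.
    static long alignTo8(long n) {
        return (n + 7) & ~7L;
    }

    // Estimated heap bytes for one String whose characters all fit in Latin-1.
    static long estimateBytes(String name) {
        return STRING_OBJECT_BYTES + alignTo8(ARRAY_HEADER_BYTES + name.length());
    }

    public static void main(String[] args) {
        String[] names = {
            "pinot-controller-controller-0-0.pinot-pinot-controller-headless.cell-bzf7co-managed.svc.cluster.local_9000",
            "nation_dm2_0_output_4341_csv_FileIngestionTask_1732618352252_3715"
        };
        for (String name : names) {
            System.out.printf("%d chars -> ~%d bytes%n", name.length(), estimateBytes(name));
        }
    }
}
```

Under these assumptions every String carries about 40 bytes of fixed overhead before any characters are counted, so each copy of a long name that is not shared between maps pays both the overhead and the full character payload.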
Long names also affect performance. Here is an example with a relatively short table name:
curl -s -S -n -H "Authorization: Bearer $SCALETEST_TOKEN" "https://$CONTROLLER_HOST:$CONTROLLER_PORT/segments/nation_OFFLINE" -o /dev/null -w "%{time_total},%{size_download},%{speed_download}\n" >> stats.log
❯ cat stats.log
8.461959,4698905,555297
The fields are total time in seconds, bytes downloaded, and average download speed in bytes per second: listing the segments of this table took about 8.5 seconds and returned about 4.7 MB.