mapFields in IS (IdealState) & EV (ExternalView) have a structure like:
"10000": {
"Server...8098": "ONLINE"
},
where
10000 is the segment name,
Server...8098 is the server name, and
ONLINE is the status of the segment on that server.
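The nested structure above can be sketched as a plain Java nested map. This is a minimal illustration (the class and variable names are ours, and the segment/server values are the example values from above, not real identifiers):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapFieldsSketch {
    public static void main(String[] args) {
        // mapFields: segment name -> (server name -> segment status on that server)
        Map<String, Map<String, String>> mapFields = new LinkedHashMap<>();

        // Inner map: one entry per server hosting the segment.
        Map<String, String> serverToStatus = new LinkedHashMap<>();
        serverToStatus.put("Server_8098", "ONLINE");

        mapFields.put("10000", serverToStatus);

        System.out.println(mapFields.get("10000").get("Server_8098")); // prints ONLINE
    }
}
```

With a replication factor of 1, each inner map holds exactly one entry, which is the single-item-map case discussed below.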
The state of each segment is represented using multiple hash maps in the controller, etc.
The root cause of the memory overhead is that the default capacity of each of these hash maps is 16, while the mean number of items per map is <= 3.
With replication and replica groups, the overhead increases further.
In an experiment, it was observed that for a table with 100K segments, 500MB of memory was lost to unused capacity. There were 300K single-item hash maps, e.g.:
j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
... (repeated for each of the ~300K single-item maps)
Additionally, the data-structure overhead due to the large number of hash maps is ~200MB for 100K segments.
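One way the unused capacity could be avoided (a sketch of the general technique, not necessarily the fix adopted here) is to use `Collections.singletonMap` for the common single-entry case, which allocates no backing table at all, and to presize maps when the expected item count is known:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SmallMapSketch {
    // Single-entry case: singletonMap stores the key/value directly,
    // with no 16-slot backing table.
    static Map<String, String> singleEntry(String server, String status) {
        return Collections.singletonMap(server, status);
    }

    // Known small sizes: choose an initial capacity so the backing table
    // holds just enough slots (expectedSize <= capacity * loadFactor).
    static Map<String, String> presized(int expectedSize) {
        int capacity = (int) Math.ceil(expectedSize / 0.75);
        return new LinkedHashMap<>(capacity, 0.75f);
    }

    public static void main(String[] args) {
        Map<String, String> single = singleEntry("Server_8098", "ONLINE");
        System.out.println(single); // prints {Server_8098=ONLINE}

        // Room for ~3 replicas without resizing; the table is 4 slots, not 16.
        Map<String, String> replicas = presized(3);
        replicas.put("Server_8098", "ONLINE");
        System.out.println(replicas.size());
    }
}
```

The trade-off: `singletonMap` is immutable, so it only fits state that is rebuilt rather than mutated in place.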
| #instances | Average object size | Total overhead per class | Class name |
|---|---|---|---|
| 6,412,230 | 40b | 75,143Kb (3.8%) | j.u.LinkedHashMap$Entry |
| 5,356,965 | 24b | 62,776Kb (3.2%) | String |
| 3,498,014 | 32b | 40,992Kb (2.1%) | j.u.HashMap$Node |
| 3,290,971 | 86b | 38,566Kb (2.0%) | j.u.HashMap$Node[] |
| 2,307,923 | 40b | 27,045Kb (1.4%) | j.u.TreeMap$Entry |
| 1,784,111 | 64b | 20,907Kb (1.1%) | j.u.LinkedHashMap |
| 1,518,617 | 48b | 17,796Kb (0.9%) | j.u.HashMap |
| 489,050 | 24b | 5,731Kb (0.3%) | j.u.LinkedHashMap$LinkedKeySet |
| 377,361 | 24b | 4,422Kb (0.2%) | j.u.LinkedHashMap$LinkedEntrySet |
| 325,425 | 16b | 3,813Kb (0.2%) | j.u.HashMap$KeySet |
| 315,886 | 32b | 3,701Kb (0.2%) | j.u.concurrent.ConcurrentHashMap$Node |
| 216,027 | 16b | 2,531Kb (0.1%) | j.u.HashMap$EntrySet |