Description
This lambda ...
Lines 116 to 117 in 39601ed:

    (l, trainedModelCacheInfoResponse) -> handleResponses(
        state,

... captures the entire ClusterState from the point at which the action started running, and retains it all the way until after the completion of both the TransportNodesStatsAction and then the TrainedModelCacheInfoAction. Since both of those actions fan out to multiple nodes, they could take a long time (tens of seconds) to complete in an overloaded or otherwise faulty cluster. That's too long to retain a ClusterState: there's a good chance it'll be replaced by newer ClusterState instances in this time, but the captured instance can't be garbage-collected while those actions are running. Moreover, it appears that we only need a few select parts of the ClusterState in handleResponses().
We should instead extract and retain just those parts of the ClusterState that are needed to compute the final response.
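As a rough illustration of the idea, the sketch below copies the needed fields into a small immutable holder before the asynchronous calls start, so the callbacks capture only that holder. Everything here is hypothetical: `FakeClusterState`, `StateSummary`, the two futures standing in for the node-stats and cache-info actions, and this `handleResponses` signature are all assumptions, and which parts of the real ClusterState are actually needed would have to be checked against the real `handleResponses()`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class StateCaptureSketch {

    // Hypothetical stand-in for ClusterState: large, and replaced frequently.
    public record FakeClusterState(long version, List<String> nodeIds, byte[] everythingElse) {}

    // Hypothetical holder for just the parts the final response needs.
    public record StateSummary(long version, List<String> nodeIds) {
        public static StateSummary from(FakeClusterState state) {
            // Copy the list so the summary keeps no reference into the full state.
            return new StateSummary(state.version(), List.copyOf(state.nodeIds()));
        }
    }

    // The two futures stand in for the node-stats and cache-info fan-out calls.
    public static CompletableFuture<String> buildResponse(
        FakeClusterState state,
        CompletableFuture<String> nodeStats,
        CompletableFuture<String> cacheInfo
    ) {
        // Extract up front: the lambdas below capture `summary`, never `state`.
        StateSummary summary = StateSummary.from(state);
        return nodeStats.thenCompose(
            stats -> cacheInfo.thenApply(cache -> handleResponses(summary, stats, cache))
        );
    }

    // Hypothetical signature: takes the summary instead of the whole state.
    public static String handleResponses(StateSummary summary, String stats, String cache) {
        return "v" + summary.version() + " nodes=" + summary.nodeIds()
            + " stats=" + stats + " cache=" + cache;
    }

    public static void main(String[] args) {
        FakeClusterState state = new FakeClusterState(7L, List.of("node-1", "node-2"), new byte[1024]);
        String response = buildResponse(
            state,
            CompletableFuture.completedFuture("stats-ok"),
            CompletableFuture.completedFuture("cache-ok")
        ).join();
        System.out.println(response);
    }
}
```

With this shape, the full state is referenced only for the duration of `buildResponse` itself, so it becomes collectable as soon as the cluster moves on, however long the two fan-out calls take.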