-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #15675: Leak in Cache of Node Compliance and NodeInfo and perfs improvement #2648
Conversation
do not merge for now, need testing |
|
||
writeTime = System.currentTimeMillis | ||
writtenNodeConfigs <- writeNodeConfigurations(rootNodeId, updatedNodeConfigIds, nodeConfigs, allLicenses, globalPolicyMode, generationTime, parallelism) ?~!"Cannot write nodes configuration" | ||
// we don't need nodeConfigs, but rather the nodeConfigs for updated nodes, and all nodesInfos | ||
_ <- writeNodeConfigurations(rootNodeId, updatedNodeConfigIds, updatedNodeConfigs, allLicenses, globalPolicyMode, generationTime, parallelism) ?~!"Cannot write nodes configuration" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change has bee overzealous, we still need allNodeConfigInfo here
it seems this change let me go from 12Go to 10Go heap size |
_ = PolicyLogger.debug(s"Reports computed in ${timeSetExpectedReport} ms") | ||
|
||
_ = PolicyLogger.debug(s"Size of expectedReports is ${memoryMeter.measure(expectedReports)} ") | ||
_ = PolicyLogger.debug(s"Size deepof expectedReports is ${memoryMeter.measureDeep(expectedReports)} ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[2019-12-15 21:15:44] DEBUG policy.generation - Size of allNodeInfos is 24
[2019-12-15 21:15:46] DEBUG policy.generation - Size deepof allNodeInfos is 26394576
[2019-12-15 21:15:46] DEBUG policy.generation - Size of allNodeModes is 24
[2019-12-15 21:15:46] DEBUG policy.generation - Size deepof allNodeModes is 1132752
[2019-12-15 21:15:46] DEBUG policy.generation - Size of nodeConfigCaches is 16
[2019-12-15 21:15:46] DEBUG policy.generation - Size deepof nodeConfigCaches is 16
[2019-12-15 21:15:56] DEBUG policy.generation - Size of ruleVals is 24
[2019-12-15 21:32:14] DEBUG policy.generation - Size deepof ruleVals is 1978959240
[2019-12-15 21:32:33] DEBUG policy.generation - Size of nodeContexts is 24
[2019-12-15 21:32:38] DEBUG policy.generation - Size deepof nodeContexts is 304977352
[2019-12-15 21:35:28] DEBUG policy.generation - Size of nodeConfigs is 24
[2019-12-15 21:56:58] DEBUG policy.generation - Size deepof nodeConfigs is 3474356896
[2019-12-15 21:57:13] DEBUG policy.generation - Size of updatedNodeConfigIds is 24
[2019-12-15 21:57:13] DEBUG policy.generation - Size deepof updatedNodeConfigIds is 1423528
[2019-12-15 21:57:13] DEBUG policy.generation - Size of updatedNodeConfigs is 24
[2019-12-15 22:18:08] DEBUG policy.generation - Size deepof updatedNodeConfigs is 3504843864
[2019-12-15 22:18:08] DEBUG policy.generation - Size of updatedNodeInfo is 24
[2019-12-15 22:18:09] DEBUG policy.generation - Size deepof updatedNodeInfo is 26395416
[2019-12-15 22:18:09] DEBUG policy.generation - Size of updatedNodesId is 24
[2019-12-15 22:18:09] DEBUG policy.generation - Size deepof updatedNodesId is 804984
[2019-12-15 22:28:57] DEBUG policy.generation - Size of expectedReports is 24
[2019-12-15 22:34:29] DEBUG policy.generation - Size deepof expectedReports is 241639688
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
largest objects are
updatedNodeConfigs = 3 504 843 864
nodeConfigs = 3 474 356 896
ruleVals = 1 978 959 240
nodeContexts = 304 977 352
expectedReports = 241 639 688
ping @fanf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's in byte or bite ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is in byte
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of the 2 668 075 048 within nodeConfigs, 2 582 780 816 are for the policies
@@ -357,7 +357,7 @@ final case class Policy( | |||
// == .toList (keep order) ==> List[List[Variable]] | |||
// == flatten (keep order) ==> List[Variable] | |||
val expandedVars = Policy.mergeVars(policyVars.map( _.expandedVars.values).toList.flatten) | |||
val originalVars = Policy.mergeVars(policyVars.map( _.originalVars.values).toList.flatten) | |||
//val originalVars = Policy.mergeVars(policyVars.map( _.originalVars.values).toList.flatten) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one seems to have a lot of impact
Size deepof nodeConfigs is 2 668 075 072
Size deepof ruleVals is 376 459 952
vs
Size deepof nodeConfigs is 3 474 356 896
Size deepof ruleVals is 748 776 080
but we need to validate as there is a lot of variability in ruleval (from 748 487 288 to 2 123 635 544)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodeconfig is stable at about 2 668 075 048 with this line vs 3 474 356 896 without this line
nodeContexts = nodeConfigs.map { case (_, nodeconfig) => nodeconfig.nodeContext} | ||
|
||
_ = PolicyLogger.debug(s"Size of nodeConfigs.nodeContext is ${memoryMeter.measure(nodeContexts)}") | ||
_ = PolicyLogger.debug(s"Size deepof nodeConfigs is ${memoryMeter.measureDeep(nodeContexts)} ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is 78 016 840 only
@@ -357,7 +357,7 @@ final case class Policy( | |||
// == .toList (keep order) ==> List[List[Variable]] | |||
// == flatten (keep order) ==> List[Variable] | |||
val expandedVars = Policy.mergeVars(policyVars.map( _.expandedVars.values).toList.flatten) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm really tempted to make a def out of this one, as it is used only in two different place
The fill template is the most expensive part of the writing of policies, by a factor 3. We could certainly do better, but I can't really see how (it adds up to more than Promises written, because the 3 before are multi-threaded) |
@@ -76,7 +76,7 @@ final class RuleStatusReport private ( | |||
, val overrides : List[OverridenPolicy] | |||
) extends StatusReport { | |||
val compliance = report.compliance | |||
val byNodes: Map[NodeId, AggregatedStatusReport] = report.reports.groupBy(_.nodeId).mapValues(AggregatedStatusReport(_)) | |||
val byNodes: Map[NodeId, AggregatedStatusReport] = report.reports.groupBy(_.nodeId).map{case (a, value) => (a, AggregatedStatusReport(value))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like this files should not have been changed, as the cost for homepage skyrockets
@@ -1226,13 +1226,13 @@ object ExecutionBatch extends Loggable { | |||
*/ | |||
ComponentStatusReport( | |||
expectedComponent.componentName | |||
, ComponentValueStatusReport.merge(unexpectedReportStatus ::: pairedReportStatus).mapValues { status => | |||
, ComponentValueStatusReport.merge(unexpectedReportStatus ::: pairedReportStatus).map { case (key, status) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to inspect perf cost here
val result = reports.mapValues { status => | ||
NodeStatusReport.filterByRules(status, ruleIds) | ||
val result = reports.map { case (nodeId, status) => | ||
(nodeId, NodeStatusReport.filterByRules(status, ruleIds)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to inspect cost here also
it need much more work to detect if mapValue change was worthy or not. Will revert this commit, and focus on measuring the impact of the extra collection created to have only the nodeconfiguration of node changed, which may (or not) double memory requierement in worst case. Then we'll call it a PR |
This looks good, so I won't change anything in this PR. I'll cherry pick parts in new PRs, more unitary |
Note: spikes in memory usage seems to come from
and from compute compliance |
Sharp spikes in moemory usages happens:
i'm a bit unsure if it is caused by the policy generation, or the pressure from compliance on top of policy generation |
Closing, as all has been backported in more sensible PRs |
https://issues.rudder.io/issues/15675