Performance issue with cartesian product of fact and shared aggregation #224
Comments
Interestingly, if I move the FactType1 match to the bottom, the rule doesn't exhibit this behaviour and finishes much quicker. But shifting the order of the patterns in my non-trivial version (which is on the NuGet version) makes the cost of Fire go up a lot, with many calls to JoinNode.MatchesConditions/BetaCondition.IsSatisfiedBy in an Update call on the equivalent of FactType2. I haven't investigated why yet.
@a046 I'll investigate what's going on and get back to you on this.
Thanks Sergiy! On a related note (though probably not the core problem), MultiKeySortedAggregator has some of the same problems discussed in #227 with variable-length parameters and array allocations, which accounts for some of the cost here. I'm really tempted (separately from this issue, which may be a genuine problem in its own right) to move towards a lazy data structure for sorted aggregators, as discussed towards the end of #171; I think it's worthwhile in light of this issue.
While expression performance is important, it's unlikely to be the root cause here, as you indicated. The issue is that the subnet built for FactType2 is joined at each level with FactType1, even though there is no real dependency: none of the expressions actually use FactType1. I think the optimizations I'm implementing in https://github.com/NRules/NRules/tree/feature-rete-optimization will solve this issue.
When a query/pattern does not depend on preceding matches, it should be separated into an independent sub-network, so that it is only evaluated when its inputs change, and not based on the preceding partial matches. The corollary is that a binding expression that does not depend on any preceding matches will only evaluate once (hence the fix to BindingEvaluationExceptionTest, which now introduces a dependency on the preceding fact match).
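To make the corollary concrete, here is a rough Python sketch (not NRules code; `join_binding` and its parameters are hypothetical names made up for illustration). It assumes a binding is a plain function and counts how often it runs when it does or does not depend on the preceding partial match.

```python
def join_binding(partial_matches, expression, depends_on_match):
    """Pair every partial match with the value of a binding expression.

    Hypothetical model: returns (results, number_of_evaluations).
    """
    calls = 0

    def evaluate(arg):
        nonlocal calls
        calls += 1
        return expression(arg)

    if depends_on_match:
        # The expression reads the partial match, so it must run per match.
        results = [(m, evaluate(m)) for m in partial_matches]
    else:
        # Independent expression: evaluate once, reuse the value for all matches.
        value = evaluate(None)
        results = [(m, value) for m in partial_matches]
    return results, calls


matches = ["a", "b", "c"]
independent, independent_calls = join_binding(matches, lambda _: 42, False)
dependent, dependent_calls = join_binding(matches, lambda m: m.upper(), True)
```

With three partial matches, the independent binding runs once and the dependent one runs three times, which is why a test that expects an evaluation per match needs the binding to reference the preceding fact.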
I have set up a simple rule with a Cartesian product of FactType1 and a sorted collection of FactType2.
From profiling the test, it is clear that an Aggregator is created for each Fact1, which leads to a large number of comparisons and sorting overhead in the SortedAggregator, as well as high memory usage. Since each sorted aggregator uses the same conditions etc., I would expect a single MultiKeySortedAggregator to be shared across the 20 Fact1s.
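As a rough illustration of the difference being described (a toy Python model, not NRules internals; `SortedAggregator`, `per_parent` and `shared` here are hypothetical stand-ins), the sketch below counts how many aggregators get created when one is built per parent fact versus when a single one is built from the child facts alone and then crossed with the parents.

```python
from bisect import insort


class SortedAggregator:
    """Toy stand-in for a sorted aggregator: keeps inserted values ordered."""

    instances = 0  # counts how many aggregators have been created

    def __init__(self):
        SortedAggregator.instances += 1
        self.items = []

    def add(self, value):
        insort(self.items, value)  # sorted insert into the list


def per_parent(parents, children):
    """One aggregator per parent fact (the behaviour seen in profiling)."""
    aggregators = {p: SortedAggregator() for p in parents}
    for child in children:
        for agg in aggregators.values():
            agg.add(child)  # every child is sorted once per parent
    return [(p, agg.items) for p, agg in aggregators.items()]


def shared(parents, children):
    """One aggregator built from the children, then paired with each parent."""
    agg = SortedAggregator()
    for child in children:
        agg.add(child)  # every child is sorted exactly once
    return [(p, agg.items) for p in parents]


parents = list(range(20))
children = [5, 3, 9, 1]

SortedAggregator.instances = 0
result_per_parent = per_parent(parents, children)
count_per_parent = SortedAggregator.instances

SortedAggregator.instances = 0
result_shared = shared(parents, children)
count_shared = SortedAggregator.instances
```

Both variants produce the same pairs of parent and sorted collection, but the first creates 20 aggregators and repeats the sorted inserts 20 times, while the second creates one.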