Performance issue with cartesian product of fact and shared aggregation #224

a046 · 2020-05-02T12:46:00Z

I have setup a simple rule with a cartesian product of FactType1 and a sorted collection of FactType2.

From profiling the test, it is clear that an Aggregator is being created for each Fact1 which leads to a large amount of comparisons and sorting overhead on the SortedAggregator but also high memory usage. As each sorted aggregator uses the same conditions etc., I would expect a single MultiKeySortedAggregator to be shared across the 20 Fact1s.

using System;
using System.Collections.Generic;
using NRules.Fluent.Dsl;
using NRules.IntegrationTests.TestAssets;
using NRules.RuleModel;
using Xunit;

namespace NRules.IntegrationTests
{
    public class MultiKeySortedCollectionCartesianProductTest : BaseRuleTestFixture
    {
        [Fact]
        public void Fire_Many()
        {
            const int fact1Count = 20;

            // Arrange
            var fact1s = new List<FactType1>();
            for (int i = 0; i < fact1Count; i++)
            {
                fact1s.Add(new FactType1(i.ToString()));
            }

            var fact2s = new List<FactType2>();
            for (int a = 0; a < 10; a++)
            {
                for (int b = 0; b < 10; b++)
                {
                    for (int c = 0; c < 10; c++)
                    {
                        for (int d = 0; d < 10; d++)
                        {
                            for (int e = 0; e < 10; e++)
                            {
                                fact2s.Add(new FactType2(a, b, c, d, e));
                            }
                        }
                    }
                }
            }

            var rng = new System.Random(1000);
            int n = fact2s.Count;
            while (n > 1)
            {
                n--;
                int k = rng.Next(n + 1);
                var value = fact2s[k];
                fact2s[k] = fact2s[n];
                fact2s[n] = value;
            }

            Session.InsertAll(fact1s);
            Session.InsertAll(fact2s);

            // Act
            Session.Fire();

            // Assert
            AssertFiredTimes(fact1Count);
        }

        protected override void SetUpRules()
        {
            SetUpRule<TestRule>();
        }

        public class FactType1
        {
            public FactType1(string name)
            {
                Name = name;
            }

            public string Name { get; set; }
        }

        public class FactType2
        {
            public FactType2(int testInt1, int testInt2, int testInt3, int testInt4, int testInt5)
            {
                TestPropertyInt1 = testInt1;
                TestPropertyInt2 = testInt2;
                TestPropertyInt3 = testInt3;
                TestPropertyInt4 = testInt4;
                TestPropertyInt5 = testInt5;
            }

            public int TestPropertyInt1 { get; set; }
            public int TestPropertyInt2 { get; set; }
            public int TestPropertyInt3 { get; set; }
            public int TestPropertyInt4 { get; set; }
            public int TestPropertyInt5 { get; set; }
        }

        public class TestRule : Rule
        {
            public Action Action = () => { };

            public override void Define()
            {
                FactType1 fact = null;
                IEnumerable<FactType2> collection = null;

                When()
                    .Match<FactType1>(() => fact)
                    .Query(() => collection, x => x
                        .Match<FactType2>()
                        .Collect()
                        .OrderBy(f => f.TestPropertyInt1)
                        .ThenByDescending(f => f.TestPropertyInt2)
                        .ThenBy(f => f.TestPropertyInt3)
                        .ThenByDescending(f => f.TestPropertyInt4)
                        .ThenBy(f => f.TestPropertyInt5));
                Then()
                    .Do(ctx => CallAction(ctx, fact, collection));
            }

            private void CallAction(IContext context, FactType1 fact1, IEnumerable<FactType2> collection2)
            {
                Action();
            }
        }
    }
}

The text was updated successfully, but these errors were encountered:

a046 · 2020-05-02T13:59:33Z

Interestingly, if I move the FactType1 match to the bottom it doesn't exhibit this behaviour and finishes much quicker.

But shifting the order of the patterns in my non-trivial version (which is on the NuGet version) results in the cost of Fire going up a lot, with a lot of calls to JoinNode.MatchesConditions/BetaCondition.IsSatisfiedBy in an Update call on the equivalent of FactType2. Haven't investigated why yet.

snikolayev · 2020-05-05T03:54:23Z

@a046 I'll investigate what's going on and get back to you on this

a046 · 2020-05-25T00:40:34Z

Thanks Sergiy!

Interestingly, similarly related to this issue (but probably not the core problem), MultiKeySortedAggregator has some of the same problems discussed in #227 with the variable length parameters and array allocations which is some of the cost here.

I'm really tempted (separate to this issue, which could have a problem) to move towards a lazy data structure for sorted aggregators as discussed towards the end of #171 as I think it's worthwhile in light of this issue.

snikolayev · 2020-05-26T00:53:17Z

While expressions performance is important, that's unlikely the root cause here, as you indicated. The issue here is that the subnet that's built for FactType2 is joined at each level with FactType1, even though there is really no dependency - none of the expressions are actually using FactType1. I think the optimizations I'm implementing in https://github.com/NRules/NRules/tree/feature-rete-optimization will solve this issue.

When a query/pattern does not depend on preceding matches, it should be separated into an independent sub-network, such that it's only evaluated based on it's inputs chaning, and not based on the preceding partial matches. The corolary to this is that a binding expression that does not depend on any preceding matches will only evaluate once (hence fixing the BindingEvaluationExceptionTest to introduce the dependency on the preceding fact match).

a046 mentioned this issue May 4, 2020

Support CancellationToken #223

Merged

snikolayev self-assigned this Nov 25, 2020

snikolayev added this to the 0.9.1 milestone Nov 25, 2020

snikolayev closed this as completed Nov 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance issue with cartesian product of fact and shared aggregation #224

Performance issue with cartesian product of fact and shared aggregation #224

a046 commented May 2, 2020

a046 commented May 2, 2020

snikolayev commented May 5, 2020

a046 commented May 25, 2020 •

edited

snikolayev commented May 26, 2020 •

edited

Performance issue with cartesian product of fact and shared aggregation #224

Performance issue with cartesian product of fact and shared aggregation #224

Comments

a046 commented May 2, 2020

a046 commented May 2, 2020

snikolayev commented May 5, 2020

a046 commented May 25, 2020 • edited

snikolayev commented May 26, 2020 • edited

a046 commented May 25, 2020 •

edited

snikolayev commented May 26, 2020 •

edited