Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue with cartesian product of fact and shared aggregation #224

Closed
a046 opened this issue May 2, 2020 · 4 comments
Closed
Assignees
Milestone

Comments

@a046
Copy link
Contributor

a046 commented May 2, 2020

I have setup a simple rule with a cartesian product of FactType1 and a sorted collection of FactType2.

From profiling the test, it is clear that an Aggregator is being created for each Fact1 which leads to a large amount of comparisons and sorting overhead on the SortedAggregator but also high memory usage. As each sorted aggregator uses the same conditions etc., I would expect a single MultiKeySortedAggregator to be shared across the 20 Fact1s.

image

using System;
using System.Collections.Generic;
using NRules.Fluent.Dsl;
using NRules.IntegrationTests.TestAssets;
using NRules.RuleModel;
using Xunit;

namespace NRules.IntegrationTests
{
    public class MultiKeySortedCollectionCartesianProductTest : BaseRuleTestFixture
    {
        [Fact]
        public void Fire_Many()
        {
            const int fact1Count = 20;

            // Arrange
            var fact1s = new List<FactType1>();
            for (int i = 0; i < fact1Count; i++)
            {
                fact1s.Add(new FactType1(i.ToString()));
            }

            var fact2s = new List<FactType2>();
            for (int a = 0; a < 10; a++)
            {
                for (int b = 0; b < 10; b++)
                {
                    for (int c = 0; c < 10; c++)
                    {
                        for (int d = 0; d < 10; d++)
                        {
                            for (int e = 0; e < 10; e++)
                            {
                                fact2s.Add(new FactType2(a, b, c, d, e));
                            }
                        }
                    }
                }
            }

            var rng = new System.Random(1000);
            int n = fact2s.Count;
            while (n > 1)
            {
                n--;
                int k = rng.Next(n + 1);
                var value = fact2s[k];
                fact2s[k] = fact2s[n];
                fact2s[n] = value;
            }

            Session.InsertAll(fact1s);
            Session.InsertAll(fact2s);

            // Act
            Session.Fire();

            // Assert
            AssertFiredTimes(fact1Count);
        }

        protected override void SetUpRules()
        {
            SetUpRule<TestRule>();
        }

        public class FactType1
        {
            public FactType1(string name)
            {
                Name = name;
            }

            public string Name { get; set; }
        }

        public class FactType2
        {
            public FactType2(int testInt1, int testInt2, int testInt3, int testInt4, int testInt5)
            {
                TestPropertyInt1 = testInt1;
                TestPropertyInt2 = testInt2;
                TestPropertyInt3 = testInt3;
                TestPropertyInt4 = testInt4;
                TestPropertyInt5 = testInt5;
            }

            public int TestPropertyInt1 { get; set; }
            public int TestPropertyInt2 { get; set; }
            public int TestPropertyInt3 { get; set; }
            public int TestPropertyInt4 { get; set; }
            public int TestPropertyInt5 { get; set; }
        }

        public class TestRule : Rule
        {
            public Action Action = () => { };

            public override void Define()
            {
                FactType1 fact = null;
                IEnumerable<FactType2> collection = null;

                When()
                    .Match<FactType1>(() => fact)
                    .Query(() => collection, x => x
                        .Match<FactType2>()
                        .Collect()
                        .OrderBy(f => f.TestPropertyInt1)
                        .ThenByDescending(f => f.TestPropertyInt2)
                        .ThenBy(f => f.TestPropertyInt3)
                        .ThenByDescending(f => f.TestPropertyInt4)
                        .ThenBy(f => f.TestPropertyInt5));
                Then()
                    .Do(ctx => CallAction(ctx, fact, collection));
            }

            private void CallAction(IContext context, FactType1 fact1, IEnumerable<FactType2> collection2)
            {
                Action();
            }
        }
    }
}
@a046
Copy link
Contributor Author

a046 commented May 2, 2020

Interestingly, if I move the FactType1 match to the bottom it doesn't exhibit this behaviour and finishes much quicker.

But shifting the order of the patterns in my non-trivial version (which is on the NuGet version) results in the cost of Fire going up a lot, with a lot of calls to JoinNode.MatchesConditions/BetaCondition.IsSatisfiedBy in an Update call on the equivalent of FactType2. Haven't investigated why yet.

@snikolayev
Copy link
Member

@a046 I'll investigate what's going on and get back to you on this

@a046
Copy link
Contributor Author

a046 commented May 25, 2020

Thanks Sergiy!

Interestingly, similarly related to this issue (but probably not the core problem), MultiKeySortedAggregator has some of the same problems discussed in #227 with the variable length parameters and array allocations which is some of the cost here.

I'm really tempted (separate to this issue, which could have a problem) to move towards a lazy data structure for sorted aggregators as discussed towards the end of #171 as I think it's worthwhile in light of this issue.

@snikolayev
Copy link
Member

snikolayev commented May 26, 2020

While expressions performance is important, that's unlikely the root cause here, as you indicated. The issue here is that the subnet that's built for FactType2 is joined at each level with FactType1, even though there is really no dependency - none of the expressions are actually using FactType1. I think the optimizations I'm implementing in https://github.com/NRules/NRules/tree/feature-rete-optimization will solve this issue.

snikolayev added a commit that referenced this issue Nov 25, 2020
When a query/pattern does not depend on preceding matches, it should be separated into an independent sub-network, such that it's only evaluated based on it's inputs chaning, and not based on the preceding partial matches.
The corolary to this is that a binding expression that does not depend on any preceding matches will only evaluate once (hence fixing the BindingEvaluationExceptionTest to introduce the dependency on the preceding fact match).
@snikolayev snikolayev self-assigned this Nov 25, 2020
@snikolayev snikolayev added this to the 0.9.1 milestone Nov 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants