Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for GroupBy key with nested structure for ODATA #14152

Closed
smitpatel opened this issue Dec 12, 2018 · 11 comments · Fixed by #17114 or #17186
Closed

Support for GroupBy key with nested structure for ODATA #14152

smitpatel opened this issue Dec 12, 2018 · 11 comments · Fixed by #17114 or #17186
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@smitpatel
Copy link
Member

Something like this

            var revs = context.WorkItemRevisions
                .GroupBy(o => new {
                    GroupByContainer = new {
                        Name = "WorkItemType",
                        Value = o.WorkItemType,
                        Next = new {
                            Name = "State",
                            Value = o.State } } })
                .Select(r => new { r.Key.GroupByContainer })
                .ToList();
@ajcvickers ajcvickers added this to the 3.0.0 milestone Dec 14, 2018
@divega divega modified the milestones: 3.0.0, Backlog Jun 21, 2019
@smitpatel smitpatel removed their assignment Jun 24, 2019
@smitpatel smitpatel added the verify-fixed This issue is likely fixed in new query pipeline. label Jul 2, 2019
@smitpatel smitpatel removed this from the Backlog milestone Jul 2, 2019
@ajcvickers ajcvickers added this to the 3.0.0 milestone Jul 8, 2019
@ajcvickers ajcvickers self-assigned this Jul 24, 2019
@kosinsky
Copy link
Contributor

kosinsky commented Jul 31, 2019

I've played with Preview7 for Linq query above.
Generated SQL looks like:

SELECT N'WorkItemType' AS [Name], [v].[WorkItemType] AS [Value]
FROM [AnalyticsModel].[vw_WorkItemRevision] AS [v]
GROUP BY N'WorkItemType', [v].[WorkItemType]

It fails with SQL error "Each GROUP BY expression must contain at least one column that is not an outer reference", because N'WorkItemType' in GROUP BY section that is translation for Name = "WorkItemType",

However, for SELECT queries of the same style

            var revsODataStyle = context.WorkItemRevisions
                                        .Select(o => new
                                        {
                                            Container = new
                                            {
                                                Name = "WorkItemType",
                                                Value = o.WorkItemType,
                                                Next = new
                                                {
                                                    Name = "Title",
                                                    Value = o.Title
                                                }
                                            }
                                        })
                                        .ToList();

Generated SQL in preview 5 used to omit constants:

SELECT [o].[WorkItemType] AS [Value0], [o].[Title] AS [Value]
FROM [AnalyticsModel].[vw_WorkItemRevision] AS [o]

Generated SQL for preview 7 includes them:

    SELECT N'WorkItemType' AS [Name], [v].[WorkItemType] AS [Value], N'Title' AS [Name], [v].[Title] AS [Value]
    FROM [AnalyticsModel].[vw_WorkItemRevision] AS [v]

Also, I noticed that Name and Value column names repeated twice now. Will it cause problem with reading the output?

@smitpatel
Copy link
Member Author

smitpatel commented Jul 31, 2019

Since Name is part of grouping key structure, we consider it one of the grouping key. Perhaps we can do something about it like how we deal with constants. Though Name does not seem to provide any value to grouping key. If it is just some metadata then it should not be part of expression tree in such way.

In preview7, we decided to send constants to server always. It can be omitted only on top level. If it is in subquery then it would cause a client eval and bad performance. To keep things simple we decided just to write the constant in SQL.
Also names are repeated twice since we just take name from last member name in the projection. It is clashing since it is top level. If you have some idea that it becomes unique on top level let us know. For subquery level where clash could cause invalid SQL, we make them unique.

@ajcvickers ajcvickers removed the verify-fixed This issue is likely fixed in new query pipeline. label Aug 1, 2019
@kosinsky
Copy link
Contributor

kosinsky commented Aug 2, 2019

Thanks. I played a bit with OData $select style queries results are correct

@smitpatel
Copy link
Member Author

Marked needs design. We may first need to figure out what exact translation we are going to do.
Keep in mind this is how group by constant works right now.

            return AssertQuery<Order>(
                isAsync,
                os => os.GroupBy(o => 2).Select(
                    g =>
                        new
                        {
                            Sum = g.Sum(o => o.OrderID),
                            Min = g.Min(o => o.OrderID),
                            g.Key,
                            Max = g.Max(o => o.OrderID),
                            Avg = g.Average(o => o.OrderID)
                        }),
                e => e.Min + " " + e.Max);
SELECT SUM([t].[OrderID]) AS [Sum], MIN([t].[OrderID]) AS [Min], [t].[Key], MAX([t].[OrderID]) AS [Max], AVG(CAST([t].[OrderID] AS float)) AS [Avg]
FROM (
    SELECT [o].[OrderID], [o].[CustomerID], [o].[EmployeeID], [o].[OrderDate], 2 AS [Key]
    FROM [Orders] AS [o]
) AS [t]
GROUP BY [t].[Key]

@kosinsky
Copy link
Contributor

From SQL point of view GROUP BY is redundant in the query above. We will help only one record in the output, as a result following query is semantically the same:

SELECT SUM([t].[OrderID]) AS [Sum], MIN([t].[OrderID]) AS [Min], 1 as [Key], MAX([t].[OrderID]) AS [Max], AVG(CAST([t].[OrderID] AS float)) AS [Avg]
FROM [Orders] AS [t]

I played with the query. If change it to have constant in new {}:

            return AssertQuery<Order>(
                isAsync,
                os => os.GroupBy(o => new { Name = 2 }).Select(
                    g =>
                        new
                        {
                            Sum = g.Sum(o => o.OrderID),
                            Min = g.Min(o => o.OrderID),
                            g.Key,
                            Max = g.Max(o => o.OrderID),
                            Avg = g.Average(o => o.OrderID)
                        }),
                e => e.Min + " " + e.Max);

I'm getting:
System.InvalidOperationException : Client projection contains reference to constant expression of type: .<>f__AnonymousType101. This could potentially cause memory leak.
at ConstantVerifyingExpressionVisitor.VisitConstant(ConstantExpression constantExpression) in ShapedQueryCompilingExpressionVisitor.cs line: 187

If I comment throwing the exception output will be (notice that Key is 1 not 2):

SELECT SUM([t].[OrderID]), MIN([t].[OrderID]), MAX([t].[OrderID]), AVG(CAST([t].[OrderID] AS float))
FROM (
    SELECT [o].[OrderID], [o].[CustomerID], [o].[EmployeeID], [o].[OrderDate], 1 AS [Key]
    FROM [Orders] AS [o]
) AS [t]
GROUP BY [t].[Key]

However, if I add second non-constant property to the group by no subquery will be generated (that is good):

            return AssertQuery<Order>(
                isAsync,
                os => os.GroupBy(o => new { Name = 2, Value=o.CustomerID }).Select(
                    g =>
                        new
                        {
                            Sum = g.Sum(o => o.OrderID),
                            Min = g.Min(o => o.OrderID),
                            g.Key,
                            Max = g.Max(o => o.OrderID),
                            Avg = g.Average(o => o.OrderID)
                        }),
                e => e.Min + " " + e.Max);
SELECT SUM([o].[OrderID]) AS [Sum], MIN([o].[OrderID]) AS [Min], 2 AS [Name], [o].[CustomerID] AS [Value], MAX([o].[OrderID]) AS [Max], AVG(CAST([o].[OrderID] AS float)) AS [Avg]
FROM [Orders] AS [o]
GROUP BY 2, [o].[CustomerID]

With change it my PR it will be

SELECT SUM([o].[OrderID]) AS [Sum], MIN([o].[OrderID]) AS [Min], 2 AS [Name], [o].[CustomerID] AS [Value], MAX([o].[OrderID]) AS [Max], AVG(CAST([o].[OrderID] AS float)) AS [Avg]
FROM [Orders] AS [o]
GROUP BY  [o].[CustomerID]

@ajcvickers
Copy link
Member

@kosinsky How critical is this to you for the 3.0 release? (We're locking down and this probably doesn't meet the bar, but wanted to give to a chance to make a case for it.)

@kosinsky
Copy link
Contributor

Without that feature OData aggregation will not work for ASP.NET Core 3.0/EF Core 3.0. In OData/WebAPI I can see a bunch of ask about ASP.NET Core 3.0 support as well as EF Core and aggregations (it kind of works with 2.2, but client evaluation).

On other hand I don't know how many other issues are hidden.
For sure SelectMany and Group by (https://github.com/aspnet/EntityFrameworkCore/blob/22711701cc41d53cb21350634246b30caf57033b/test/EFCore.Specification.Tests/Query/GroupByQueryTestBase.cs#L1395) will be an issue for advanced scenarios:

        [ConditionalTheory(Skip = "Issue#15711")]
        [MemberData(nameof(IsAsyncData))]
        public virtual Task SelectMany_GroupBy_Aggregate(bool isAsync)
        {
            return AssertQuery<Customer>(
                isAsync,
                cs =>
                    cs.SelectMany(c => c.Orders)
                        .GroupBy(o => o.EmployeeID)
                        .Select(
                            g => new
                            {
                                g.Key,
                                c = g.Count()
                            }),
                e => e.Key);
        } 

Service that I'm working on will do migrations in stages ASP.NET Classic/EF6 -> ASP.NET Core/EF6.3 -> ASP.NET Core/EF Core. It decreases criticality a bit for me, but still important for OData in general.

Adding a few folks from OData @madansr7 @KanishManuja-MS

@smitpatel
Copy link
Member Author

For sure SelectMany and Group by (

already fixed in daily builds.

@kosinsky
Copy link
Contributor

I've notice that :). And I found related case that fails with System.NotImplementedException : NonNavSource at NavigationExpandingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression):

            return AssertQuery<Customer>(
                isAsync,
                cs =>
                    cs.GroupBy(c => c.City)
                        .Select(
                            g => new
                            {
                                g.Key,
                                c = g.SelectMany(o=>o.Orders).Count()
                            }),
                e => e.Key);

My guess it caused by g.SelectMany(o=>o.Orders).Count() really being g.SelectMany(o=>o.Orders).AsQueryable().Count() expression and as a result (firstArgument is NavigationExpansionExpression source) is false

@smitpatel
Copy link
Member Author

That would be duplicate of #15249

@smitpatel
Copy link
Member Author

Re-opening to use approach of not adding constant for any grouping key and not just new expresison.

@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Aug 15, 2019
smitpatel added a commit that referenced this issue Aug 15, 2019
Constant/Parameter can be put in projection even if not appearing in grouping key in SQL.
So now we avoid putting Constant/parameter in GROUP BY clause but treat it as normal in other places.
When generating grouping key, wrap convert node to match types in initialization expression (as SQL tree ignores type nullability)
Also merged leftover async group by async tests in single version.

Resolves #14152
Resovles #16844
smitpatel added a commit that referenced this issue Aug 15, 2019
Constant/Parameter can be put in projection even if not appearing in grouping key in SQL.
So now we avoid putting Constant/parameter in GROUP BY clause but treat it as normal in other places.
When generating grouping key, wrap convert node to match types in initialization expression (as SQL tree ignores type nullability)
Also merged leftover async group by async tests in single version.

Resolves #14152
Resovles #16844
smitpatel added a commit that referenced this issue Aug 16, 2019
Constant/Parameter can be put in projection even if not appearing in grouping key in SQL.
So now we avoid putting Constant/parameter in GROUP BY clause but treat it as normal in other places.
When generating grouping key, wrap convert node to match types in initialization expression (as SQL tree ignores type nullability)
Also merged leftover async group by async tests in single version.

Resolves #14152
Resovles #16844
smitpatel added a commit that referenced this issue Aug 16, 2019
Constant/Parameter can be put in projection even if not appearing in grouping key in SQL.
So now we avoid putting Constant/parameter in GROUP BY clause but treat it as normal in other places.
When generating grouping key, wrap convert node to match types in initialization expression (as SQL tree ignores type nullability)
Also merged leftover async group by async tests in single version.

Resolves #14152
Resovles #16844
@ajcvickers ajcvickers modified the milestones: 3.0.0, 3.0.0-preview9 Aug 21, 2019
@ajcvickers ajcvickers modified the milestones: 3.0.0-preview9, 3.0.0 Nov 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
5 participants