Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workarounds for missing GroupBy translation for large data sets #4959

Closed
sergeitemkin opened this issue Apr 1, 2016 · 7 comments
Closed

Comments

@sergeitemkin
Copy link

As I understand, translating GroupBy into SQL hasn't been added to EF Core yet - #2341. However, I'm having difficulty coming up with good workarounds for something we're trying to do in our application.

Currently our goal is to dynamically select a property on an object to group by, and group by that returning a list of distinct groups and the number of items in each group. Basically something that would result in some form of the following in SQL (although I understand that EF would probably have to generate something more verbose):

SELECT SomeColumn, COUNT(*) FROM MyObjects GROUP BY SomeColumn

Asynchronicity aside, the original prototype for the method I envisioned would have been something like this:

public IEnumerable<Group> GetGroups(Expression<Func<MyObject, object>> groupBy)
{
    var groups = dbcontext.MyObjects
        .GroupBy(groupBy)
        .Select(g => new Group(g.Key, g.Items.Count()));
    return groups.ToList();
}

Obviously this won't work right now anyway unless GroupBy is evaluated client side, and I can't evaluate on client side in this case because this is a large table that can have hundreds of thousands of rows.

Our current direction on a workaround involves selecting a distinct set of entities I'm grouping by and then selecting the count for each group individually based on dynamic filter expressions, but I'm not sure if that's the right direction:

public IEnumerable<Group> GetGroups(Expression<Func<MyObject, object>> groupBy)
{
    // get a distinct list of "groups"
    var groupedBys = dbcontext.MyObjects
        .Select(groupBy)
        .Distinct();

    var groupByProperty = groupBy.GetPropertyAccess();
    var idProperty = groupByProperty
        .PropertyType
        .GetProperty("Id"); // a hack, in case the object is complex

    var parameter = Expression.Parameter(typeof(MyObject));
    // create the left side of the o => o.GroupByProperty == Value epxression
    var groupByExpression = Expression.MakeMemberAccess(parameter, groupByProperty);
    if (idProperty != null)
        groupByExpression = Expression.MakeMemberAccess(groupByExpression, idProperty);

    var groups = new List<Group>();
    foreach (var item in groupedBys)
    {
        // create the right side of the o => o.GroupByProperty == Value expression
        var itemValue = Expression.Constant(idProperty == null ? item : idProperty.GetValue(item));
        var groupFilter = 
            Expression.Lambda<Func<CRMTask, bool>>(Expression.Equal(groupByExpression, itemValue), parameter);

        var count = dbcontext.MyObjects
            .Where(groupFilter)
            .Count();
        groups.Add(new Group(item, count));
    }
    return groups;
}

This workaround does involve hitting the database N + 1 times where N is the number of groups, so it's not ideal, but it's still a lot faster than client side evaluation.

Is this a good workaround for the problem or is there something I'm missing? Also, once GroupBy translation to SQL is implemented, will something akin to dbcontext.MyObjects.GroupBy(...).Select(g => g.Items.Count()) be possible to get an item count (or other aggregate such as min, max, etc) for each group?

@rowanmiller
Copy link
Contributor

@anpete @maumar any thoughts on this one?

@anpete
Copy link
Contributor

anpete commented Apr 4, 2016

@sergeitemkin Did you consider using FromSql for this?

Update: I guess the problem is that right now FromSql needs an Entity Type. A workaround might be to create a fake Entity Type representing the desired result shape.

@sergeitemkin
Copy link
Author

@anpete We considered FromSql with a fake entity that represents the desired shape, but is there any way to create an entity that EF can hydrate without having EF also create a corresponding database table that wouldn't serve any purpose?

@anpete
Copy link
Contributor

anpete commented Apr 4, 2016

@sergeitemkin Yes, the usual way is simply to delete the corresponding CreateTable from the generated migration. Long term, we are going to make FromSql work with unmapped types.

@sergeitemkin
Copy link
Author

Thanks for letting me know, as we weren't aware we could use FromSql with entities that don't come directly from a table. Given the dynamic nature of our use-case and the difficulty of translating it to a dynamic query for FromSql, we're going to stick with what we've got for now. However, given that our solution is far from ideal, we'd like to voice our request that translation of the GroupBy clause be a priority. Does @rowanmiller's last post on the subject (#2341) still accurately indicate the state of the feature?

@rowanmiller
Copy link
Contributor

@sergeitemkin - yes, this is still a post v1.0.0 thing.

@sergeitemkin
Copy link
Author

Great, thanks for your help, everyone!

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants