Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The generic AsQueryableValues throws if options are provided.. #30

Closed
fiseni opened this issue Apr 5, 2023 · 18 comments
Closed

The generic AsQueryableValues throws if options are provided.. #30

fiseni opened this issue Apr 5, 2023 · 18 comments

Comments

@fiseni
Copy link

fiseni commented Apr 5, 2023

Hi,

Thanks for the great work on this library. I just tested the latest release with Json serialization and performance improvements are great!

I'm working on some extensions so I can support the in-memory provider (details here), and I noticed a small issue. I'm unable to use properly the generic overload.

This works

using var dbContext = new AppDbContextQueryableJSON();
var names = new string[] { "abc", "def" };
var query = dbContext.Customers.Where(x => dbContext.AsQueryableValues(names, true).Contains(x.Name));
var result = await query.ToListAsync();

This fails

using var dbContext = new AppDbContextQueryableJSON();
var names = new string[] { "abc", "def" };
var query = dbContext.Customers.Where(x => dbContext.AsQueryableValues(names, opt => opt.DefaultForIsUnicode(true)).Contains(x.Name));
var result = await query.ToListAsync();

Stack trace:

System.InvalidOperationException
  HResult=0x80131509
  Message=The LINQ expression '__8__locals2_dbContext_0
    .AsQueryableValues(
        values: __names_1, 
        configure: opt => opt.DefaultForIsUnicode(True))
    .Contains(NavigationTreeExpression
        Value: EntityReference: Customer
        Expression: c.Name)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
  Source=Microsoft.EntityFrameworkCore
  StackTrace:
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.ExpandNavigationsForSource(NavigationExpansionExpression source, Expression expression)
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.ProcessLambdaExpression(NavigationExpansionExpression source, LambdaExpression lambdaExpression)
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.ProcessWhere(NavigationExpansionExpression source, LambdaExpression predicate)
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.Expand(Expression query)
   at Microsoft.EntityFrameworkCore.Query.QueryTranslationPreprocessor.Process(Expression query)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[TResult](Expression query, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass12_0`1.<ExecuteAsync>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync[TResult](Expression query, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.ExecuteAsync[TResult](Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1.GetAsyncEnumerator(CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.ConfiguredCancelableAsyncEnumerable`1.GetAsyncEnumerator()
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.<ToListAsync>d__65`1.MoveNext()

Any feedback is appreciated.

@yv989c
Copy link
Owner

yv989c commented Apr 5, 2023

Hi @fiseni , thank you for the feedback!

Please use the overload meant for non-complex types in your example, like this:

using var dbContext = new AppDbContextQueryableJSON();
var names = new string[] { "abc", "def" };
var query = dbContext.Customers.Where(x => dbContext.AsQueryableValues(names, true).Contains(x.Name));
var result = await query.ToListAsync();

You can also try:

using var dbContext = new AppDbContextQueryableJSON();

var names = new string[] { "abc", "def" };
var queryableNames = dbContext.AsQueryableValues(names, true);

var query = dbContext.Customers.Where(x => queryableNames.Contains(x.Name));
var result = await query.ToListAsync();

Let me know if this works.

@fiseni
Copy link
Author

fiseni commented Apr 5, 2023

Hi @yv989c,

Yes, as I mentioned previously, that works perfectly fine. No problem there.
I wanted to utilize the following overload

public static IQueryable<TSource> AsQueryableValues<TSource>(this DbContext dbContext, IEnumerable<TSource> values, Action<EntityOptionsBuilder<TSource>>? configure = null) where TSource : notnull
{
	ValidateParameters(dbContext, values);
	return GetQueryableFactory(dbContext).Create(dbContext, values, configure);
}

If I understood you correctly, that one is meant only for complex types. I can't utilize that one if TSource is primitive or string, right? Then, it seems I have to create overloads for each specific type in the implementation here #2.

@yv989c
Copy link
Owner

yv989c commented Apr 5, 2023

@fiseni I see, sorry I did not catch that. IMO if your intention is to make something to support #2 , then every public API must be implemented to some degree.

I can't utilize that one if TSource is primitive or string, right?

It should still work that way though. I have a fallback in place in case of misuse.

Does this works? (not recommended)

using var dbContext = new AppDbContextQueryableJSON();

var names = new string[] { "abc", "def" };
var queryableNames = dbContext.AsQueryableValues(names, opt => opt.DefaultForIsUnicode(true));

var query = dbContext.Customers.Where(x => queryableNames.Contains(x.Name));
var result = await query.ToListAsync();

@fiseni
Copy link
Author

fiseni commented Apr 5, 2023

If you capture the IQueryable in a closure, then the evaluation doesn't throw. But, the config is not respected. Here is the generated SQL, it uses varchar

SELECT [c].[Id], [c].[Age], [c].[Name], [c].[NameNullable]
FROM [Customers] AS [c]
WHERE EXISTS (
  SELECT 1
  FROM (
	  SELECT TOP(@p1) [V] FROM OPENJSON(@p0) WITH ([V] varchar(max) '$', [_] BIT '$._') ORDER BY [_]
  ) AS [b]
  WHERE [b].[V] = [c].[Name])

@yv989c
Copy link
Owner

yv989c commented Apr 5, 2023

Yes, that's expected. Those options are meant for complex types. I guess some work could be done to propagate some of those options for simple types, but as I said, this overload is for complex types only. For simple types, you must use their respective overload.

@fiseni
Copy link
Author

fiseni commented Apr 5, 2023

Ok, it's clear now. I'm closing the issue. Thanks!

@fiseni fiseni closed this as completed Apr 5, 2023
@yv989c
Copy link
Owner

yv989c commented Apr 5, 2023

Btw @fiseni just a tip. When possible, always prefer a Join instead of using Contains. A join will likely give you better performance on SQL Server's side. The SQL generated by EF when using Contains —even with this library— is not optimal.

@fiseni
Copy link
Author

fiseni commented Apr 5, 2023

Thanks @yv989c . I'm aware, but writing the extension for Join is just tedious. Also, teams hate using it. Anyhow, I'm quite happy with the performance. Here are my benchmarks. The In is the exact same implementation that I provided in the PR. I don't see any downgrade in using the overload for complex types (I checked your code, and you fallback early in the case of simple types, so it's ok). I just need an overload for string so I can pass the flag for unicode.

image

Btw, once you decide what to do for the in-memory provider, I can try to be of help. The problem is not the in-memory provider per se, but also Sqlite. And Sqlite is used in tests extensively.

EDIT: I think I misunderstood your point, you meant not passing a list of values. Yes, definitely, whenever a join is feasible. This is just for cases where we don't have such an option, and getting the list from external sources.

@yv989c
Copy link
Owner

yv989c commented Apr 6, 2023

EDIT: I think I misunderstood your point, you meant not passing a list of values. Yes, definitely, whenever a join is feasible. This is just for cases where we don't have such an option, and getting the list from external sources.

@fiseni Now I'm a little confused tbh 🙃. I think you got my point initially. I was talking about choosing between the following two approaches:

var myQuery1 = 
    from i in dbContext.MyEntities
    where dbContext
        .AsQueryableValues(values)
        .Contains(i.MyEntityID)
    select new
    {
        i.MyEntityID,
        i.PropA
    };


var myQuery2 = 
    from i in dbContext.MyEntities
    join v in dbContext.AsQueryableValues(values) on i.MyEntityID equals v 
    select new
    {
        i.MyEntityID,
        i.PropA
    };

@fiseni
Copy link
Author

fiseni commented Apr 6, 2023

Haha.. yea that's what I was referring to initially. The Contains will generate WHERE EXISTS if I'm not wrong and it's not as efficient as a pure inner join. But, the LINQ method syntax for join is clunky, so people avoid using it.

As for the EDIT, I thought you perhaps referring to the cases where people read one table, then use the ids to query another table. If you can relate those tables just write a join. Usually, people think you must have navigation to write join in EF Core, which is not true.

@fiseni
Copy link
Author

fiseni commented May 5, 2023

Hey @yv989c,

I have a quick question. I'm trying to make it work with Join, but facing an issue if there is an additional Include. Let me provide an example:

This works:

using var dbContext = new AppDbContextQueryableJSON();
var ids = Enumerable.Range(1, 2).Select(x => Guid.NewGuid()).ToList();
var query1 = dbContext.Customers
    .Join(
        dbContext.AsQueryableValues(ids),
        x => x.Id,
        v => v,
        (x, v) => x
    );

var result1 = await query1.ToListAsync();

It fails if we include some navigation

using var dbContext = new AppDbContextQueryableJSON();
var ids = Enumerable.Range(1, 2).Select(x => Guid.NewGuid()).ToList();
var query1 = dbContext.Customers
    .Include(x => x.Addresses)
    .Join(
        dbContext.AsQueryableValues(ids),
        x => x.Id,
        v => v,
        (x, v) => x
    );

var result1 = await query1.ToListAsync();

The exception message:

System.InvalidOperationException: 'Unable to translate a collection subquery in a projection since either parent or the subquery doesn't project necessary information required to uniquely identify it and correctly generate results on the client side. This can happen when trying to correlate on keyless entity type. This can also happen for some cases of projection before 'Distinct' or some shapes of grouping key in case of 'GroupBy'. These should either contain all key properties of the entity that the operation is applied on, or only contain simple property access expressions.'

Usually, there is no problem in combining the Include and Join operations. But in this case, since the ids are converted to temp keyless table, it throws. Any suggestions? Have you faced this?

@yv989c
Copy link
Owner

yv989c commented May 8, 2023

Hey @fiseni, after some trial and error the only non-ideal solution I found was using Contains. It seems that in this join case where EF has to project the child collection (Addresses) it needs some metadata –key?– not present in the underlying entity I relying on to make QueryableValues work.

Well, I did found a way to make it work but it's less than ideal; if you return a flat object then it does work but that's very inconvenient, e.g.:

from c in dbContext.Customers
join id in dbContext.AsQueryableValues(ids) on c.Id equals id
from a in c.Addresses
select new
{
    Customer = c,
    Address = a
}

@yv989c
Copy link
Owner

yv989c commented May 8, 2023

I'll experiment adding a key to the underlying entity to see if that's enough to enable this scenario.

@fiseni
Copy link
Author

fiseni commented May 8, 2023

Hey @yv989c ,

I've been experimenting with this. Let me post my findings (hopefully it will be helpful for anyone reading this).

  • There is no issue if Contains approach is used. EF generates an inner "EXISTS" statement. It works but is not an ideal query.
  • If join approach is used, there is no issue in the case of 1:1 relationship, only in the case of 1:m (child collections). For instance, this works.
var query = from x in dbContext.Customers.Include(x => x.City)
                join v in dbContext.AsQueryableValues(ids) on x.Id equals v
                select x;

var query = dbContext.Customers
    .Include(x => x.City)
    .Join(dbContext.AsQueryableValues(ids), x => x.Id, v => v, (x, v) => x);
  • If you return a flat object, it works, and the objects are properly added to the tracker. But, it's not ideal since in order to get a list of customers, you will have to group/select.
  • The following approach works best in my opinion. Yes, there are 2 roundtrips (it's equivalent to AsSplitQuery), but that might be desirable to avoid Cartesian explosion.
var query = dbContext.Customers
    .Join(dbContext.AsQueryableValues(ids), x => x.Id, v => v, (x, v) => x);
	
// Load the navigation into EF tracker
_ = dbContext.Addresses
    .Join(dbContext.AsQueryableValues(ids), x => x.CustomerId, v => v, (x, v) => x);

@fiseni
Copy link
Author

fiseni commented May 8, 2023

Also, for anyone that needs to revert to classic EF Contains for tests (using In-Memory or Sqlite provider), here are few extensions

Usage:

var customers = await dbContext.Customers
    .InJoin(x => x.Id, ids, dbContext)
    .ToListAsync();

// Load child collections
_ = await dbContext.Addresses
    .InJoin(x => x.CustomerId, ids, dbContext)
    .ToListAsync();
public static class IQueryableValuesExtensions2
{
    private static readonly MethodInfo _containsEnumerableMethodInfo = typeof(Enumerable)
            .GetTypeInfo()
            .GetDeclaredMethods(nameof(Queryable.Contains))
            .Single(mi => mi.GetParameters().Length == 2
                && mi.GetGenericArguments().Length == 1
                && mi.GetParameters()[0].ParameterType.GetGenericTypeDefinition() == typeof(IEnumerable<>)
                && mi.GetParameters()[1].ParameterType.IsGenericParameter);

    public static IQueryable<TEntity> InJoin<TEntity, TKey>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, TKey>> keySelector,
        IEnumerable<TKey> values,
        DbContext dbContext,
        Action<EntityOptionsBuilder<TKey>>? configure = null) where TKey : notnull
        => dbContext.Database.IsSqlServer()
            ? source.Join(dbContext.AsQueryableValues(values, configure), keySelector, v => v, (x, v) => x)
            : InMemory(source, keySelector, values);

    public static IQueryable<TEntity> InJoin<TEntity>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, Guid?>> keySelector,
        IEnumerable<Guid> values,
        DbContext dbContext)
        => dbContext.Database.IsSqlServer()
            ? source.Join(dbContext.AsQueryableValues(values), keySelector, v => v, (x, v) => x)
            : InMemory(source, keySelector, values);

    public static IQueryable<TEntity> InJoin<TEntity>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, int?>> keySelector,
        IEnumerable<int> values,
        DbContext dbContext)
        => dbContext.Database.IsSqlServer()
            ? source.Join(dbContext.AsQueryableValues(values), keySelector, v => v, (x, v) => x)
            : InMemory(source, keySelector, values);

    public static IQueryable<TEntity> InJoin<TEntity>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, string?>> keySelector,
        IEnumerable<string> values,
        DbContext dbContext,
        bool isUnicode = true)
        => dbContext.Database.IsSqlServer()
            ? source.Join(dbContext.AsQueryableValues(values, isUnicode), keySelector, v => v, (x, v) => x)
            : InMemory(source, keySelector, values);

    private static IQueryable<TEntity> InMemory<TEntity, TKey1, TKey2>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, TKey1>> keySelector,
        IEnumerable<TKey2> values)
    {
        ArgumentNullException.ThrowIfNull(values);
        ArgumentNullException.ThrowIfNull(keySelector);

        var parameter = Expression.Parameter(typeof(TEntity), "x");
        var propertySelector = ParameterReplacerVisitor.Replace(keySelector, keySelector.Parameters[0], parameter) as LambdaExpression;
        _ = propertySelector ?? throw new InvalidExpressionException();

        // Create closures so EF can parameterize the query.
        var valuesAsExpression = ((Expression<Func<IEnumerable<TKey2>>>)(() => values)).Body;

        // Get MethodInfo for Contains.
        var containsMethod = _containsEnumerableMethodInfo.MakeGenericMethod(typeof(TKey1));

        // Create an expression for the Contains method.
        var containsExpression = Expression.Call(
            null,
            containsMethod,
            valuesAsExpression,
            propertySelector.Body);

        // Create the final lambda expression
        var whereLambda = Expression.Lambda<Func<TEntity, bool>>(containsExpression, parameter);

        // Use the expression with the Where method
        var result = source.Where(whereLambda);

        return result;
    }
}

public class ParameterReplacerVisitor : ExpressionVisitor
{
    private readonly Expression _newExpression;
    private readonly ParameterExpression _oldParameter;

    private ParameterReplacerVisitor(ParameterExpression oldParameter, Expression newExpression)
    {
        _oldParameter = oldParameter;
        _newExpression = newExpression;
    }

    internal static Expression Replace(Expression expression, ParameterExpression oldParameter, Expression newExpression)
    {
        return new ParameterReplacerVisitor(oldParameter, newExpression).Visit(expression);
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == _oldParameter ? _newExpression : node;
    }
}

@yv989c
Copy link
Owner

yv989c commented May 15, 2023

Hi @fiseni, thanks for your feedback. I found a way to fix this so a join can be used when projecting out a child collection. I'm going to create a new issue and link it to the fix.

@fiseni
Copy link
Author

fiseni commented May 16, 2023

That's great! Do you expect any performance degradation?

@yv989c
Copy link
Owner

yv989c commented May 16, 2023

That's great! Do you expect any performance degradation?

Great question. The changes I'm making may impact performance to a certain degree. I'll run the benchmarks and compare with the previous version to see if there's a regression in perf. I'm also addressing another potential issue around ordering guarantees. Will keep you posted in #31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants