Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warning/code analyzer that mapping via function call produces different SQL (selects all columns) than inline mapping #30162

Closed
janseris opened this issue Jan 27, 2023 · 12 comments

Comments

@janseris
Copy link

janseris commented Jan 27, 2023

While selecting for photo IDs from the database for a list of ID of records,
we noticed that columns which are not a part of the projection are also selected even if the projection selects only IDs when the mapping is done via an external function.
I assumed the effect is the same as in LINQ to SQL (this (calling a function to map the data without a penalty) "worked" there I believe).
This issue happens when the mapping is called as a function and not directly in the query.
Using a mapping function is essential in large systems (DRY principle).
We noticed it when the table contained large byte array data - instead of kBs, hundreds of MBs were coming because all columns from all the tables were selected instead of only these requested in the projection in code.

I suggest a "feature" request: Add warning or intellisense suggestion for this behavior because it is a big deal.

Image showing generated SQL:

pozor na mapping

Code:

public async Task<List<SeznamIDPrilohInfo>> GetPrilohyIDs(List<int> pozadavkyIDs)
        {
            using var db = DB;
            var view =
                db.POZADAVEK.Include(p => p.Zaznam).ThenInclude(z => z.Soubor);
            var query =
                from p in view
                where pozadavkyIDs.Contains(p.ID)
                select new SeznamIDPrilohInfo
                {
                    VlastnikID = p.ID,
                    PrilohyIDs = (from s in p.Zaznam.Soubor select s.ID).ToList()
                };
            return await query.AsNoTracking().ToListAsync();
        }

        public async Task<List<SeznamIDPrilohInfo>> GetPrilohyIDsWithMapper(List<int> pozadavkyIDs)
        {
            using var db = DB;
            var view =
                db.POZADAVEK.Include(p => p.Zaznam).ThenInclude(z => z.Soubor);
            var query =
                from p in view
                where pozadavkyIDs.Contains(p.ID)
                select prilohyIDsMapper.Map(p);
            return await query.AsNoTracking().ToListAsync();
        }

        public async Task<List<SeznamIDPrilohInfo>> GetPrilohyIDsWithFunctionInTheSameClass(List<int> pozadavkyIDs)
        {
            using var db = DB;
            var view =
                db.POZADAVEK.Include(p => p.Zaznam).ThenInclude(z => z.Soubor);
            var query =
                from p in view
                where pozadavkyIDs.Contains(p.ID)
                select Map(p);
            return await query.AsNoTracking().ToListAsync();
        }

        private static SeznamIDPrilohInfo Map(POZADAVEK p)
        {
            return new SeznamIDPrilohInfo
            {
                VlastnikID = p.ID,
                PrilohyIDs = (from s in p.Zaznam.Soubor select s.ID).ToList()
            };
        }

Schema with comments:
schema

Thank you

@janseris
Copy link
Author

janseris commented Jan 27, 2023

Btw solution to this performance issue is:
a) use only inline projection
b) use Expression<Func<T1,T2> instead of projection function
c) use projection function and then automatically add it inline in the query using some magic which is now more simple in EF Core 7+ thanks to #28505

@roji
Copy link
Member

roji commented Jan 28, 2023

@janseris are you saying that you're seeing more columns columns being projected out when using Map() in your queries? If so, Map() isn't part of EF Core, and so EF isn't involved in any of this... This comes from a 3rd-party package (Automapper?), you probably want to consult the their docs/maintainers.

@janseris
Copy link
Author

janseris commented Jan 28, 2023

@roji

are you saying that you're seeing more columns columns being projected out when using Map() in your queries

Yes

But Map() here is just a plain C# function call, no external library.
It is also well explained in this issue DentallApp/back-end#6
The Map function is visible in the end of the source code that I posted.

The prilohyIDsMapper is an instance of a class with that same function as well. I am sorry, the comment disappeared from the posted source code when I was editing.

If you do the mapping inline in the query, the projection is applied correctly - only required columns are selected in SQL.
When you do the same but it is extracted into a function, all columns from all tables from Include/ThenInclude are selected (and the projection is done on client).

All columns are selected even if I select only the int VlastnikID in the mapping function.

@janseris janseris changed the title Add warning/code analyzer that mapping in function call produces different SQL than direct mapping Add warning/code analyzer that mapping via function call produces different SQL (selects all columns) than inline mapping Jan 28, 2023
@stevendarby
Copy link
Contributor

stevendarby commented Jan 28, 2023

Related issue: #24509

Current issue is requesting a warning when it switches to client evaluation due to an opaque function. Linked issue is about throwing in that scenario.

@ajcvickers
Copy link
Member

@janseris The call into Map is opaque to EF because nothing inside the function is included in the expression tree that EF is translating. In other words, EF has no way of knowing which properties are being selected. This isn't specific to EF, but rather the way LINQ works in .NET.

If the function is to be translated, then it needs to contribute to the expression tree. One relatively easy way to do this is to write it in terms of IQueryables. For example:

public static IQueryable<SeznamIDPrilohInfo> Map(this IQueryable<POZADAVEK> poz)
    => poz.Select(p => new SeznamIDPrilohInfo
    {
        VlastnikID = p.ID,
        PrilohyIDs = (from s in p.Zaznam.Soubor select s.ID).ToList()
    });

@janseris
Copy link
Author

janseris commented Jan 30, 2023

@ajcvickers
Hi yes I know the cause is the expression tree and my proposal was that it would be nice if there was some sort of warning for that behavior.
That is what my issue suggests and also the issue #24509

It is not a bug.

Thanks for the sample with IQueryable, I didn't think of that way of using it (without the penalty of silently selecting all columns and then dropping them) like that, will try out!

@roji
Copy link
Member

roji commented Jan 30, 2023

@janseris in many cases, projecting out all the columns is what the user wants; even if in your particular case you don't want this, I don't think it would be appropriate to warn on this for everyone.

@janseris
Copy link
Author

janseris commented Jan 30, 2023

@janseris in many cases, projecting out all the columns is what the user wants; even if in your particular case you don't want this, I don't think it would be appropriate to warn on this for everyone.

Do you have an example?
I can see 16 upvotes for the "warning" behavior in the issue #24509 and no downwotes or mentions of downsides of the warning/throw behavior.

@roji
Copy link
Member

roji commented Jan 30, 2023

#24509 isn't quite the same as what we're discussing above - that would be a general opting out of any client evaluation (so queries would fail). If I understood you correctly, you were proposing a warning by default (without an opt-in) when a function is client-evaluated.

In any case, if #24509 is what you want, then we do agree that makes sense, and we can close this issue as a dup of that.

@ajcvickers
Copy link
Member

ajcvickers commented Feb 1, 2023

Note from triage: agreed that #24509 is sufficient here.

@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Feb 1, 2023
@janseris
Copy link
Author

janseris commented Feb 1, 2023

Hi, I am sorry, I didn't have time to respond earlier,
the issue #24509 is stale. It seems that noone from the EF team is paying attention to it according to no assigned persons and no milestone.
Are you planning to assign it to someone or include it in some planning?

@roji
Copy link
Member

roji commented Feb 1, 2023

@janseris #24509 is in the backlog milestone, and has a pretty low number of votes. This probably isn't something we're going to do in the immediate near future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants