Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.x] Incorrect FQCN for withAggregate() on self-relation #35389

Closed
wants to merge 1 commit into from
Closed

[8.x] Incorrect FQCN for withAggregate() on self-relation #35389

wants to merge 1 commit into from

Conversation

bjuppa
Copy link

@bjuppa bjuppa commented Nov 27, 2020

This PR provides a failing test case for when methods in the withAggregate()-family are used on an Eloquent "self-relation".

Aggregate queries on self-relations use a generated table alias, but the aggregate column will fail to be qualified to that alias, silently producing incorrect results, or an sql-error when sql_mode=only_full_group_by.

I not sure yet what the solution would be, but hopefully this failing-test PR can be a starting point to discuss a good approach!

Background

Laravel 8.13 included b0dc0b2 that added $relation->getRelated()->qualifyColumn($column) to qualify the column name in the aggregate subquery. This seems reasonable as that column name used to end up unqualified, and that is of course a risk.

Unfortunately the introduced qualification is done before knowing what the actual table alias will be in the final query.
It will always qualify using the table name from the relation for the aggregate column.

This goes unnoticed for common relationships between two different models, each with their own table...

What goes wrong?

For a relationship that references the same model (e.g. a student tutoring other students), the table in the subquery is aliased (see example in HasOneOrMany – variants are in other relationship types too).

So, the "child"-table of the subquery becomes aliased (a generated hash), while the selected aggregate column in the very same sub-query is still qualified to what at this point is actually the "parent" table. 😞

Example sql

This reflects current sql produced when ->withMin('created_at') is used on a self-relationship:

select `self_related_stubs`.*,

(select
min(`self_related_stubs`.`created_at`) -- This FQCN is incorrect, it should reference the alias set below!
from `self_related_stubs` as `self_alias_hash`
where `self_related_stubs`.`id` = `self_alias_hash`.`parent_id`
) as `child_foos_min_created_at`

from `self_related_stubs`

Sql error looks something like this:
In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'test_db.self_related_stubs.id'; this is incompatible with sql_mode=only_full_group_by

Possible solutions

Can we go back to not qualifying the aggregate column?

Between 8.12 (when the super-useful withAggregate-methods were introduced in 95d0c9a, thanks @khalilst!) and 8.13, aggregates on self-relations worked as expected (at least with mysql 8 that I've been running). The column name wasn't qualified at all and there was just a single from-table in the actual subquery part, so I guess it was unambiguous.

  • Was the unqualified column ambiguous in other sql-dialects than mysql?
  • Could there be cases where the aggregate subquery has joins or sub-sub-queries?
  • Perhaps contributor @dbakan has more information or examples as to why the FQCN was introduced?
  • Is there a policy to always qualify columns throughout illuminate/database?

What can we do if we want to keep the FQCN?

  • Can the qualification wait and be applied inside getRelationExistenceQueryForSelfRelation, where the final alias is in scope?
  • Is it possible to create a general "deferred-FQCN" functionality to be used throughout illuminate/database? So that one can pass an object containing an unqualified column name, indicating that builder methods may qualify at the point they're actually used in queries... This may not make sense... πŸ€” I just made it up! πŸ˜„

Thanks for reading, and thanks for collaborating on this wonderful framework! Please share your thoughts on how we can solve this!

@bjuppa bjuppa marked this pull request as draft November 27, 2020 14:59
@bjuppa
Copy link
Author

bjuppa commented Nov 27, 2020

PS. In my app I've started using withMax('siblings', 'created_at') on a HasMany siblings self-relationship, combined with a having clause on the selected aggregate to create a HasOne relationship on the parent model that only includes the latest item 🀯
An equivalent example would be a Post model having many Comment in a comments HasMany relationship, but also a latestComment HasOne relationship, that can be effectively eager loaded etc. The resulting query is complex but the PHP syntax in the models is really readable. And the query performs just fine with proper indices πŸ˜„ This is something I've been wanting to do for years, and withAggregate finally made it possible using nothing but standard Eloquent methods!

@driesvints driesvints changed the title Incorrect FQCN for withAggregate() on self-relation [8.x] Incorrect FQCN for withAggregate() on self-relation Nov 27, 2020
@dbakan
Copy link
Contributor

dbakan commented Nov 27, 2020

The reason was described in #35061 (comment)

When using the feature on a belongsToMany, the pivot table and the related table might both have a column of the requested name. That's why I added this qualifyColumn wrapper.

I'm sorry, if this introduced another bug. Is there a way to access the aliases for that kind of relation? Will check on that.

@khalilst
Copy link
Contributor

@bjuppa Thanks for your full descriptions. I am trying to find the solution.
@dbakan Please let me know if you are working on the solution for this test case.

@bjuppa
Copy link
Author

bjuppa commented Nov 27, 2020

@dbakan & @khalilst, big thanks to both of you for joining the effort, and thanks for your earlier contributions!

So the qualification was needed to handle aggregates on belongsToMany... that's a good reason! There's no way we can go back and drop the FQCN completely.

My other option was to move the qualification to later, in the getRelationExistenceQuery method. What do you think about that idea, is it even feasible?

The other way around, to somehow "guess" the alias before getRelationExistenceQuery seems very hard to me, as the final alias is accessible within getRelationExistenceQueryForSelfRelation only.

@khalilst
Copy link
Contributor

khalilst commented Nov 27, 2020

@bjuppa @dbakan
I found the solution. I will create a PR here.

@driesvints
Can you please tell me how to fork this branch? bjuppa:self-related-withAggregate
I have to create another PR?

@driesvints
Copy link
Member

Just create a fresh pr if you have to thanks.

@dbakan
Copy link
Contributor

dbakan commented Nov 27, 2020

@khalilst Thank you very much!

@khalilst
Copy link
Contributor

@dbakan My pleasure.

@driesvints Sorry for bothering. I have to include the test provided here by @bjuppa in #35392 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants