Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About ActiveRecord's new #missing and #associated methods #40719

Closed
MaxLap opened this issue Dec 1, 2020 · 8 comments
Closed

About ActiveRecord's new #missing and #associated methods #40719

MaxLap opened this issue Dec 1, 2020 · 8 comments

Comments

@MaxLap
Copy link
Contributor

MaxLap commented Dec 1, 2020

I would like to share some concerns about the new where.missing and where.associated methods in ActiveRecord.

First, easy problem, where.associated uses joins. This can result in duplicated results from the query when the associated is a has_many. Here is a repro: https://gist.github.com/MaxLap/4b66607c23c8c00483a13f06acfb6018

The correct way of doing this is to use the EXISTS SQL keyword. The wrong way would be to add a DISTINCT to the query, this would be an unexpected side-effects when people build complex queries.

Note: I think the symmetry of associated and missing should be kept, so if associated is changed to EXISTS , the same should be done for missing. There is less benefit, but it unifies the code.


The other issue is bit more philosophical:

There already exist many ways of doing a partial equivalent of WHERE EXISTS in ActiveRecord. I wrote a document on all of their issues.

These new methods only offer a slightly more intent-oriented name, but are still very limited. They can't receive conditions (unless you put them directly on a new association... you know someone will do that), and they can't go to a deeper association (again, unless you make a new associatoin).

In 2015, a more complete version of this was proposed and strongly refused: #21438

Since then, there are 2 gems that handle the whole problem:

They both allow having conditions on the association being queried, ex:

my_post.comments.where_assoc_not_exists(:author, is_admin: true)
Post.where_assoc_exists(:comments) { |comments| comments.not_spam }

So, my thinking is, maybe it would be better to endorse (or merge) one or both of the existing gems instead of adding a another partial way of doing it? I don't know if there is a precedent for that. Another idea could be to add mention of the gems in the documentation for these 2 new methods, saying they are more powerful and allow specifying conditions on the records.

Thoughts?

Related PRs:
#40696
#34727

@kaspth
Copy link
Contributor

kaspth commented Dec 1, 2020

Hey,

The API proposed in those gems are vastly inferior from Core's perspective. We won't be adding those or linking them.

The correct way of doing this is to use the EXISTS SQL keyword. The wrong way would be to add a DISTINCT to the query, this would be an unexpected side-effects when people build complex queries.

lol, what a treat to have you show up in this town, are you sure it's big enough for the two of us? 🤠😂Appreciate that you've thought a lot about this particular problem, but there's rarely just a correct and incorrect way of doing things — if it achieves the desired outcome we want for the API, then it's really fine.

Matter of fact, distinct sounds like exactly what we need for what's in scope for this feature, which is the intent-oriented name and not adding EXISTS queries and solving everything under the sun about them. Thanks for the second look and thoughts for improving here, unfortunately I don't think there's much we want to apply though.

@kaspth kaspth closed this as completed Dec 1, 2020
@EugZol
Copy link

EugZol commented Dec 1, 2020

I was referenced here by a ping to my email.

@kaspth Can you please elaborate on who are "we" and what is exactly that "we need" in your acute reply? As there were no arguments regarding technical details in your reply (such arguments were provided in the initial post), it is hard to understand your answer (apart from the general tone) without those clarifications.

To save time, let me suggest my understanding, which you can just shortly confirm if it is correct. Given the information from the previous PR linked above:

Extracted from an app where we had this [...]
#40696

...would it be correct to assume that "we" = Basecamp and "what we need" is "the features of our product/Basecamp"?

Thank you for your time in advance. Please allow others to save theirs as well by giving definitive reply to the two questions above.

@marcandre
Copy link
Contributor

marcandre commented Dec 1, 2020

I'll add to @EugZol's questions this remark:

lol, what a treat to have you show up in this town, are you sure it's big enough for the two of us?

I'm not certain what was intended, but this comment is out of place and can easily be interpreted as derogatory, intimidating or offensive. Hopefully you are not saying that this is your town and that there's no more place for @MaxLap in it? Better than a reply might be to simply edit your comment.

@kaspth
Copy link
Contributor

kaspth commented Dec 1, 2020

@kaspth Can you please elaborate on who are "we" and what is exactly that "we need" in your acute reply? As there were no arguments regarding technical details in your reply

We, in this case, is Rails. Since Rails is all about intention revealing methods we (technically I as a member of the Core Team), have added this method. The first tenet of the Rails Doctrine, Programmer Happiness, is an implicit call to subjectivity. And not every API will perfectly match every Rails users' sensibility and that's ok. Based on our doctrine we don't always consider the most technical argument to be automatically right — there are many graphs where Ruby is not the "best" language and Rails is not the "best" framework, but there are also plenty of other merits that aren't always technical. Having a technical argument is not a slam-dunk here.

It's this charge in the original comment that I'm disagreeing with. This feature happened to cross over with an area that @MaxLap is clearly very invested in, he's got a gem doing something similar to this. But flatly stating that we're wrong for doing something that very clearly works in a production app is way out of line. This is the attitude I'm paraphrasing when I'm feeding it back — I'm not saying that I've come to take over the town, I'm saying that's the attitude built in to the original comment. There's simply a limit to how much decision making you can commandeer over us through issues. We allow participation and second-opinions and we appreciate them most of the time, but we have our limits.

@EugZol
Copy link

EugZol commented Dec 1, 2020

We, in this case, is Rails [...]

If I deduced correctly from that broad paragraph, we = Rails' core team. Thank you for that clarification.

Based on our doctrine we don't always consider the most technical argument to be automatically right

I understand that: you don't argue with the technical concerns, but dismiss them due to some "doctrine"-level concerns. Thank you for clarifying that technical concerns are not at issue here.

but there are also plenty of other merits that aren't always technical

Namely, if you will? Not in general, but which actually were relevant for this issue.

This feature happened to cross over with an area that @MaxLap is clearly very invested in, he's got a gem doing something similar to this.

That is posed backwards. Let me restore the actual chronology.

My solution was, as I remember, first. I did PR to the Rails, which was rejected without code review, because the feature was deemed not necessary. That non-technical reason, of course, seemed subjectively arguable but objectively valid to me.

Then @MaxLap did his own work on the same problem and justified building separate solution due to several points, which he lists in the documentation of his gem. All these differences are minor, when one considers what is common between the two approaches. So, two developers who solved real-world problem, came up with two solutions which are similar in several important aspects.

It should be noted that both provided solutions have somewhat minor, but dedicated number of users. Developers of several projects. Which is, formally, more than one project/company, which you represent (when you play the role of PR author, not reviewer).

Then you build a "not invented here" solution which is, and you don't argue with that, at some relevant and important points is technically inferior.

So what justifies the displayed(?) superiority(?) in technical expertise (that impression is clarified below) is not clear.

But flatly stating that we're wrong

Are you saying that technical concerns in the initial post are valid or invalid? If you dismiss them without consideration, then you should dismiss that point which is connected to those technical concerns in the initial post as well.

You can't express disagreement with technical concerns if you refuse to provide technical counter-argument, otherwise you show that you treat them as way below your level of expertise.

(I, personally, don't think that it is an open question whether having mandatory "DISTINCT" in your query, when you mean "EXISTS", is a good thing or not. However, I'm interested in understanding the aforementioned non-technical merits.)

for doing something that very clearly works in a production app is way out of line.

By default, it should be assumed that his work (and mine – I support the initial post in general) is used in production apps as well. Otherwise, that would show that you mistreat the level of expertise of your peers whom that line is addressed to.

I explicitly stated production use of the referred gems above, however, to get around possible lack of such intuitive courtesy or technical insight.

This is the attitude I'm paraphrasing when I'm feeding it back

I think any interested person could review your posts (and others', for that matter), should they desire, and make their own conclusion who has what attitude. I don't think that should be at issue here. We are who we are.

There's simply a limit to how much decision making you can commandeer over us through issues.

Your subjective decision-making, which I'm interested in, is "to review prior work or not" when you work on some feature. Please allow me to refine the previous question (what is that "we, i.e. Rails core team, need") in the following manner: what actions of a potential contributor would nudge you (i.e., Rails' core team member/representative) to review prior work, adopted and used by the community for years, when you implement some or another feature for Rails (taking this feature as example)?

@MaxLap
Copy link
Contributor Author

MaxLap commented Dec 2, 2020

What is the expected count for these 2 queries?

Post.create!(p_int: 42) # Just an int to allow making conditions

Post.where.associated(:comments).or(Post.where(p_int: 42)).count
Post.where(p_int: 42).or(Post.where.associated(:comments)).count

The first returns 0
The second one fails because the two relations are not structurally equivalent.
I would personally expect both results to be 1.
https://gist.github.com/MaxLap/e80301bb6136227fb2a429029c7dc3ae

I didn't say you had to "[solve] everything under the sun about [EXISTS]". Only that using EXISTS for these 2 APIs would be better. They would mesh better with other APIs and would avoid unexpected side-effects (needing distinct to the query).

But flatly stating that we're wrong for doing something that very clearly works in a production app is way out of line.

I never said anything about your app. This pattern likely still is in my production apps too, and we are using my gem. It's not wrong by itself, it's the interactions with other APIs that is counter-intuitive.

@kaspth
Copy link
Contributor

kaspth commented Dec 2, 2020

Only that using EXISTS for these 2 APIs would be better. They would mesh better with other APIs and would avoid unexpected side-effects (needing distinct to the query).

@MaxLap awesome, I really like this and I'd be happy to see a PR that investigates this! I'm sorry we got off on the wrong foot, I was taking this unnecessarily harshly because I haven't gotten to contribute a feature to Rails in the past 6 months, while reviewing and merging 30+ PRs from other contributors. And so seeing the right/wrong split in the message felt like you were strong-arming me and telling me what to do, when I didn't feel like I'd gotten to do very much personally this past half year. I'm grateful that we get to have you here and that you want to help make Rails better, thank you ❤️

@MaxLap
Copy link
Contributor Author

MaxLap commented Dec 4, 2020

I'd be happy to see a PR that investigates this

I'll consider it. Honestly it wasn't my plan to be the one doing it. I thought it would be better if the Rails team got a little familiar with it EXISTS themselves so that they resort less to JOIN and DISTINCT and possibly find other useful APIs with EXISTS.

Would that be something that is backported to 6.1 or is this doomed to be in 6.2? Since the documentation explicitly says it does a joins, would it be a breaking change? Not sure there can be those in a release candidate / patch update.

The API proposed in those gems are vastly inferior from Core's perspective. We won't be adding those or linking them.

Maybe our APIs are more complex, but they solve real (and complex) problems. Just google rails condition on other table and rails condition on association. You'll find joins, includes, nested subqueries, raw sql. They work, but just don't compose as well as EXISTS and can lead to subtle bugs and incompatibilities between different parts of code.

Since the documentation for associated and missing mentions the SQL tool used, which would become EXISTS, I think it would be helpful to add just a few lines to bottom of their documentation:

If you want to specify extra conditions on the associations (in the EXISTS), consider using one of these gems:
https://github.com/MaxLap/activerecord_where_assoc
https://github.com/EugZol/where_exists

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants