Skip to content

[CALCITE-6249] RelNode::estimatedRowCount should not be used in computeSelfCost#3674

Merged
rubenada merged 1 commit intoapache:mainfrom
sap-contributions:CALCITE-6249
Feb 19, 2024
Merged

[CALCITE-6249] RelNode::estimatedRowCount should not be used in computeSelfCost#3674
rubenada merged 1 commit intoapache:mainfrom
sap-contributions:CALCITE-6249

Conversation

@kramerul
Copy link
Contributor

@kramerul kramerul commented Feb 7, 2024

No description provided.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 7, 2024

@rubenada
Copy link
Contributor

rubenada commented Feb 7, 2024

LGTM

@rubenada rubenada self-assigned this Feb 7, 2024
@rubenada rubenada added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Feb 7, 2024
@mihaibudiu
Copy link
Contributor

This looks fine, but no test is affected, so it's hard to gauge the actual impact of this change.

final RelMetadataQuery mq) {
double rowCount = mq.getRowCount(this);

final double rightRowCount = right.estimateRowCount(mq);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please write a test for this? If it shouldn't be used then a test would be helpful to show why, and how this commit corrects that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@rubenada
Copy link
Contributor

rubenada commented Feb 8, 2024

@mihaibudiu this change is subtle, but it has an impact (although we do not see it in Calcite tests).

By default, this change will have zero effect, because Calcite's default implementation of mq.getRowCount(rel) simply calls rel.estimateRowCount(mq); so it seems we are getting to the same place with one extra step; and indeed we are (that's the reason why we do not see any impact on our tests).

However, the fact of using mq.getRowCount(rel) allows downstream projects to plug their own metadata rowCount and "override" the default rowCount computation proposed by Calcite. If we call directly rel.estimateRowCount(mq), this flexibility is lost. In fact, this PR is just applying our own Calcite rules, see RelNode#estimateRowCount javadoc:

Don't call this method directly. Instead, use
{@link RelMetadataQuery#getRowCount}, which gives plugins a chance to
override the rel's default ideas about row count.

So all these calls to rel.estimateRowCount(mq) should have been mq.getRowCount(rel) from the beginning (and we are likely dealing with a copy-paste of sub-optimal code propagated). As I said before, at that time this might have been difficult to catch, because by default the effect is the same.

@tanclary @JiajunBernoulli Writing tests for these changes is possible, but cumbersome. See e.g. the recent commit f7069cc which performed a similar change in one location, and how "artificial" it was to show the impact on a unit test.
Personally, I understand the logic behind this change, and bearing in mind that is not straightforward to provide tests for this, I'd consider exceptionally merging the PR without tests. But if others consider that this is not acceptable, I'll accept that decision.

@kramerul
Copy link
Contributor Author

kramerul commented Feb 13, 2024

Thanks @rubenada for your detailed explanation.
For me, this PR is more like a refactoring. Therefore, I would also prefer not writing a unit test. The existing unit tests ensure that everything works as before. The changed code locations are touched in many unit tests.

Copy link
Member

@asolimando asolimando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too even without additional tests

@tanclary
Copy link
Contributor

Thanks @rubenada for your detailed explanation. For me, this PR is more like a refactoring. Therefore, I would also prefer not writing a unit test. The existing unit tests ensure that everything works as before. The changed code locations are touched in many unit tests.

What is the existing test that verifies the behavior impacted by this PR? The title says something should not be used, but there is no test to tell me that the claim of the JIRA case is true or has been fixed by this PR. If it is hard to write a test, perhaps we should work on improving the appropriate test fixture, rather than neglecting testing altogether.

@kramerul
Copy link
Contributor Author

The title says something should not be used, but there is no test to tell me that the claim of the JIRA case is true or has been fixed by this PR.

In the current code base, mq.getRowCount is used in most cases inside computeSelfCost. On the other hand, estimatedRowCount is only used in some rare cases, which are fixed by this PR. As @rubenada already pointed out, there is also a javadoc, which states that the claim of the JIRA case is true. In my point of view, there is no need to prove this by a unit test.

@tanclary
Copy link
Contributor

tanclary commented Feb 14, 2024

The title says something should not be used, but there is no test to tell me that the claim of the JIRA case is true or has been fixed by this PR.

In the current code base, mq.getRowCount is used in most cases inside computeSelfCost. On the other hand, estimatedRowCount is only used in some rare cases, which are fixed by this PR. As @rubenada already pointed out, there is also a javadoc, which states that the claim of the JIRA case is true. In my point of view, there is no need to prove this by a unit test.

A javadoc and a JIRA case don't have any impact on the execution of certain code though. I'm not trying to block this commit from getting merged, I just don't think it's good practice to make changes without adding tests, especially when there's no existing tests.

Calcite's test suite is extensive, there are test classes and fixtures for every issue I've come across. If there's not a good way to write a test, I think the solution is to make test-writing easier, not to ignore it.

@rubenada
Copy link
Contributor

@tanclary , I think what @kramerul means is that there are many tests that go trough the Calcite's metadata logic and cost computation (basically every test where a query is executed via VolcanoPlanner), and the fact that all these tests are still working is a sign that this change introduces no regression (as expected), with the advantage of allowing metadata pluggability in downstream projects.

@tanclary
Copy link
Contributor

@tanclary , I think what @kramerul means is that there are many tests that go trough the Calcite's metadata logic and cost computation (basically every test where a query is executed via VolcanoPlanner), and the fact that all these tests are still working is a sign that this change introduces no regression (as expected), with the advantage of allowing metadata pluggability in downstream projects.

Like I said, I'm not trying to block this. If you want to merge it please feel free. What I'm saying is that "it didn't break anything" isn't a good reason to not write tests. I also don't think "writing a test is hard" is a good reason to not write a test. That's just my opinion you can of course choose to ignore it.

@rubenada
Copy link
Contributor

I understand your point of view and I appreciate your comments @tanclary , and I also appreciate the fact that your intention is not to block this.
In principle, I agree with you: generally speaking, we should not accept PRs without tests, especially for bugs. Perhaps this issue should have been filed as an improvement/task rather than a bug (there is no error or exception in the current code, just a lack of pluggability), so it might be open to interpretation.

As I said before, exceptionally, I think we can merge the current PR as it is. Since we have 3 committers who have approved/LGTM'ed, and no explicit veto or -1; I propose to merge the PR, if no further comment (or veto) appears in the next 48h.

@rubenada rubenada merged commit a19a954 into apache:main Feb 19, 2024
@caicancai
Copy link
Member

@rubenada @tanclary Maybe I can try reading this pr change to see if there are any tests that need to be added, I agree with clary to some extent

@rubenada
Copy link
Contributor

That would be great @caicancai .
Apart from this PR, you can also take a look at f7069cc , which was a similar (more localized) change, which included a unit test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants