[SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files #36815

Closed

dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Jun 9, 2022

What changes were proposed in this pull request?

This is a follow-up of #34929 that updates the TPCDS plan test golden files to account for the ID number changes.

Why are the changes needed?

Currently, branch-3.2 is broken.

[Screenshot, 2022-06-08 at 7:25:46 PM]

[Screenshot, 2022-06-08 at 7:25:55 PM]

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

@dongjoon-hyun
Member Author

cc @maryannxue , @cloud-fan , @HyukjinKwon , @sunchao

@HyukjinKwon
Member

👍

@dongjoon-hyun
Member Author

Thank you, @HyukjinKwon !

Member

@sunchao sunchao left a comment

LGTM pending CI

Contributor

@cloud-fan cloud-fan left a comment

thanks!

@dongjoon-hyun
Member Author

Thank you, @sunchao and @cloud-fan .

@dongjoon-hyun
Member Author

The q5 failure is fixed, but q4 seems to be nondeterministic in branch-3.2 for some reason. The result seems to oscillate between the new values and the old values. Let me take a closer look.

- check simplified (tpcds-v1.4/q4) *** FAILED *** (1 second, 36 milliseconds)
...
*** 1 TEST FAILED ***
Failed: Total 9599, Failed 1, Errors 0, Passed 9598, Ignored 29

@dongjoon-hyun
Member Author

Here is the summary.

  • The original PR proposed to make CTE IDs more deterministic by starting from 0 for each query.
  • However, the current state generates nondeterministic IDs: the following two commands produce different IDs for q4.
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *PlanStabilitySuite -- -z (tpcds-v1.4/q4)"
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *PlanStabilitySuite"

However, the query plan structure itself looks identical. The differing IDs alone are the root cause of the failure here.
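
To make the "same structure, different IDs" point concrete, here is a minimal Python sketch. It is not Spark code and is not how PlanStabilitySuite compares plans; the plan strings and the `normalize_ids` helper are made up for illustration. It renumbers every `#<digits>` token by order of first appearance, so two plans that differ only in their IDs compare equal after renumbering.

```python
import re


def normalize_ids(plan: str) -> str:
    """Renumber every '#<digits>' token by order of first appearance.

    Hypothetical helper for illustration only; not taken from Spark.
    """
    mapping = {}

    def renumber(match):
        original = match.group(0)
        if original not in mapping:
            mapping[original] = f"#{len(mapping)}"
        return mapping[original]

    return re.sub(r"#\d+", renumber, plan)


# Made-up plan strings: same shape, different raw IDs.
plan_a = "WithCTE #0 -> Project [c_customer_id#12] -> CTERef #0"
plan_b = "WithCTE #3 -> Project [c_customer_id#47] -> CTERef #3"

same_raw = plan_a == plan_b
same_after_normalizing = normalize_ids(plan_a) == normalize_ids(plan_b)
```

With these inputs the raw strings differ while the renumbered strings match, which is the situation described above: the mismatch comes from the IDs, not from the plan shape.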

@cloud-fan
Contributor

Imagine that there is a global counter for CTE IDs: running a query alone will produce different IDs than running it after many other queries. I think we need to regenerate the golden files with SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *PlanStabilitySuite" to match GitHub Actions.
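
A rough Python sketch of that effect, assuming a single shared counter. `CteIdGenerator` and `plan_query` are hypothetical names, not Spark classes, and the number of CTEs per query is invented for the example.

```python
import itertools


class CteIdGenerator:
    """Hypothetical stand-in for a process-wide CTE ID counter (not Spark's class)."""

    def __init__(self):
        self._counter = itertools.count(0)

    def next_id(self) -> int:
        return next(self._counter)


def plan_query(generator: CteIdGenerator, num_ctes: int) -> list:
    """Pretend to plan a query by allocating one ID per CTE it contains."""
    return [generator.next_id() for _ in range(num_ctes)]


# Case 1: the query runs alone, so it sees a fresh counter.
ids_alone = plan_query(CteIdGenerator(), 1)

# Case 2: the query runs after another query that shares the same counter.
shared = CteIdGenerator()
plan_query(shared, 2)  # an earlier query consumes IDs 0 and 1
ids_after_others = plan_query(shared, 1)
```

Here `ids_alone` is `[0]` and `ids_after_others` is `[2]`, so a golden file recorded in one mode would not match a run in the other mode.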

@dongjoon-hyun
Member Author

The golden files in branch-3.2 were already generated by that method, @cloud-fan . And they still fail in the GitHub Actions environment.

@dongjoon-hyun
Member Author

Let me comment on the original PR.

@dongjoon-hyun
Member Author

As an alternative, a revert PR is also being tested.

@dongjoon-hyun
Member Author

Thank you, everyone. This is closed in favor of @cloud-fan 's approach.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-37670 branch June 9, 2022 16:08
5 participants