Skip to content

[GLUTEN-1858][CORE] Add PlanOneRowRelation to make gluten work with OneRowRelation#1859

Merged
zhouyuan merged 1 commit intoapache:mainfrom
ulysses-you:one-row-relation
Jun 9, 2023
Merged

[GLUTEN-1858][CORE] Add PlanOneRowRelation to make gluten work with OneRowRelation#1859
zhouyuan merged 1 commit intoapache:mainfrom
ulysses-you:one-row-relation

Conversation

@ulysses-you
Copy link
Contributor

@ulysses-you ulysses-you commented Jun 6, 2023

What changes were proposed in this pull request?

This pr adds a new rule PlanOneRowRelation which inject a fake schema for OneRowRelation. As gluten does not support work with empty output relation.

BTW, if we want to validate data result with vanilla spark and gluten, please disable constant folding
set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConstantFolding

(Fixes: #1858)

How was this patch tested?

add test

@github-actions
Copy link

github-actions bot commented Jun 6, 2023

Run Gluten Clickhouse CI

@github-actions
Copy link

github-actions bot commented Jun 6, 2023

#1858

@ulysses-you ulysses-you changed the title [GLUTEN-1858][CORE] Add PlanOneRowRelation to make gluten work [GLUTEN-1858][CORE] Add PlanOneRowRelation to make gluten work with OneRowRelation Jun 6, 2023
@ulysses-you
Copy link
Contributor Author

cc @zhztheplayer @zhouyuan thank you

@github-actions
Copy link

github-actions bot commented Jun 6, 2023

Run Gluten Clickhouse CI

@ulysses-you ulysses-you force-pushed the one-row-relation branch 2 times, most recently from 0cabf48 to 671610d Compare June 6, 2023 12:47
@github-actions
Copy link

github-actions bot commented Jun 6, 2023

Run Gluten Clickhouse CI

zhouyuan
zhouyuan previously approved these changes Jun 7, 2023
Copy link
Member

@zhouyuan zhouyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 works for me

val plan = df.queryExecution.executedPlan
assert(plan.find(_.isInstanceOf[RDDScanExec]).isDefined)
assert(plan.find(_.isInstanceOf[ProjectExecTransformer]).isDefined)
assert(plan.find(_.isInstanceOf[GlutenRowToArrowColumnarExec]).isDefined)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a refactor on class name on main branch

Suggested change
assert(plan.find(_.isInstanceOf[GlutenRowToArrowColumnarExec]).isDefined)
assert(plan.find(_.isInstanceOf[RowToArrowColumnarExec]).isDefined)

Copy link
Member

@zhouyuan zhouyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulysses-you can you please do a rebase?

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Run Gluten Clickhouse CI

@ulysses-you
Copy link
Contributor Author

thank you @zhouyuan , rebased

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Run Gluten Clickhouse CI

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Run Gluten Clickhouse CI

@ulysses-you
Copy link
Contributor Author

@zhouyuan any idea about the failed test ? I'm not sure what's wrong

@zhouyuan
Copy link
Member

zhouyuan commented Jun 8, 2023

@ulysses-you yes, the failure seems due to one change on velox/substrait this morning, fix up ASAP
-yuan

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Run Gluten Clickhouse CI

@github-actions
Copy link

github-actions bot commented Jun 9, 2023

Run Gluten Clickhouse CI

@ulysses-you
Copy link
Contributor Author

@zhouyuan all tests passed !

Copy link
Member

@zhouyuan zhouyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@zhouyuan zhouyuan merged commit ef63820 into apache:main Jun 9, 2023
@zhouyuan
Copy link
Member

zhouyuan commented Jun 9, 2023

@ulysses-you Thank you for improving on this part! Yes, in the UT gluten should disabled ConstantFolding already. CC: @PHILO-HE

@ulysses-you
Copy link
Contributor Author

yeas, I think it can help improve test coverage a bit. Some tests in Spark are written with select expr(xxx).

@ulysses-you ulysses-you deleted the one-row-relation branch June 9, 2023 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make gluten work with OneRowRelation

2 participants