[Spark] Support Gluten Vectorized Engine #374

Merged: 7 commits into lakesoul-io:main from the spark_gluten branch on Dec 16, 2023

Conversation

xuchen-plus
Contributor

@xuchen-plus xuchen-plus commented Dec 12, 2023

Close #373
Project Gluten uses Velox to replace Spark's physical plan with native vectorized execution.
LakeSoul's reader already returns Arrow vectors, which Gluten supports as input. However, Gluten identifies LakeSoul as a vanilla Spark columnar reader and inserts RowToVeloxColumnarExec(ColumnarToRowExec(BatchScanExec)), which can be inefficient because the columnar batches are converted to rows and then back to columnar format.
We added a new post-columnar rule that removes these two conversions when the Gluten plugin is detected at runtime.
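
To illustrate the idea only, here is a minimal sketch of such a rewrite on a toy plan tree. The case classes, the `producesArrow` flag, and the `glutenEnabled` parameter are hypothetical stand-ins invented for this example; they are not Spark's, Gluten's, or LakeSoul's actual classes, and the real rule in this PR may be structured differently.

```scala
// Toy model of a physical plan tree. These are NOT Spark's or Gluten's real classes;
// they only mirror the node names mentioned in the description above.
sealed trait Plan
case class BatchScan(table: String, producesArrow: Boolean) extends Plan
case class ColumnarToRow(child: Plan) extends Plan
case class RowToVeloxColumnar(child: Plan) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan

object RemoveRedundantConversions {
  // `glutenEnabled` is a hypothetical flag standing in for the runtime plugin check.
  def apply(plan: Plan, glutenEnabled: Boolean): Plan =
    if (!glutenEnabled) plan else rewrite(plan)

  private def rewrite(plan: Plan): Plan = plan match {
    // The scan already yields Arrow batches, so both conversions can be dropped.
    case RowToVeloxColumnar(ColumnarToRow(scan @ BatchScan(_, true))) => scan
    // Otherwise keep the node and recurse into its child.
    case RowToVeloxColumnar(child) => RowToVeloxColumnar(rewrite(child))
    case ColumnarToRow(child)      => ColumnarToRow(rewrite(child))
    case Project(cols, child)      => Project(cols, rewrite(child))
    case scan: BatchScan           => scan
  }
}
```

In this sketch the plan is left untouched when the flag is off, so a job without the plugin keeps the usual conversion nodes.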

@xuchen-plus xuchen-plus added the enhancement (New feature or request), spark (spark support into lakesoul), and native-io labels Dec 12, 2023
Signed-off-by: chenxu <chenxu@dmetasoul.com>
@xuchen-plus xuchen-plus merged commit 74c28a2 into lakesoul-io:main Dec 16, 2023
14 checks passed
@xuchen-plus xuchen-plus deleted the spark_gluten branch December 16, 2023 07:52
Labels
enhancement (New feature or request), native-io, spark (spark support into lakesoul)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Spark] Support Gluten Vectorized Engine
2 participants