[GLUTEN-8580][CORE][Part-1] Clean up unnecessary code related to input file expression #8584

Merged
zml1206 merged 2 commits into apache:main from zml1206:8580-1
Jan 23, 2025

@zml1206 zml1206 commented Jan 21, 2025

What changes were proposed in this pull request?

(Fixes: #8580)

How was this patch tested?

@github-actions github-actions bot added the CORE (works for Gluten Core) and CLICKHOUSE labels Jan 21, 2025
@github-actions

#8580

@github-actions

Run Gluten Clickhouse CI on x86

@zhztheplayer
Member

The change looks good to me. @baibaichen Can you also take a look?

Comment on lines -66 to -75
    // To support input_file_name(). According to semantic we should return
    // the exact file name a row belongs to. However in columnar engine it's
    // not easy to accomplish this. so we return a list of file(part) names
    split match {
      case FirstZippedPartitionsPartition(_, g: GlutenPartition, _) =>
        InputFileBlockHolderProxy.set(g.files.mkString(","))
      case _ =>
        InputFileBlockHolderProxy.unset()
    }

Contributor

Not sure whether we can delete this code. @gaoyangxiaozhu

Contributor Author

The previous input file expression implementation had problems. #7124 improved the approach by pushing the input file expression down to scanTransform or to the project before the scan. The results now come from the native scan or from a Spark thread local, so there is no need to retain the information in InputFileBlockHolder.
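
To make the difference concrete, here is a minimal, self-contained Scala sketch. It is not Gluten or Spark code: `InputFileHolderSketch`, `SketchPartition`, and `InputFileSketch` are hypothetical names invented for illustration, and the "scan" is simulated with plain collections. It only contrasts the removed pattern (publishing a comma-joined list of a partition's files through a thread local) with the pushed-down idea (the scan itself attaching the file name to each row).

```scala
// Hypothetical stand-in for a thread-local input-file holder.
// This is NOT Spark's InputFileBlockHolder or Gluten's InputFileBlockHolderProxy.
object InputFileHolderSketch {
  // Each thread sees its own value; the default is the empty string.
  private val current = new ThreadLocal[String] {
    override def initialValue(): String = ""
  }

  def set(fileName: String): Unit = current.set(fileName)
  def unset(): Unit = current.remove()
  def get: String = current.get()
}

// Hypothetical partition type that only records the files it covers.
final case class SketchPartition(files: Seq[String])

object InputFileSketch {
  // Removed pattern: publish all files of the partition as one comma-joined
  // string, so a row cannot be traced back to the exact file it came from.
  def legacyInputFileName(p: SketchPartition): String = {
    InputFileHolderSketch.set(p.files.mkString(","))
    try InputFileHolderSketch.get
    finally InputFileHolderSketch.unset()
  }

  // Pushed-down idea (simplified assumption): the scan emits the file name
  // alongside every row, so no shared holder is needed.
  def scanWithFileName(p: SketchPartition, rowsPerFile: Int): Seq[(String, Int)] =
    for (f <- p.files; i <- 0 until rowsPerFile) yield (f, i)
}
```

With a partition covering two files, `legacyInputFileName` yields a single joined string for every row, while `scanWithFileName` pairs each row with exactly one file name.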

@zml1206
Contributor Author

zml1206 commented Jan 23, 2025

Thanks for the review. Are there any other comments? If not, I will merge this later. @Yohahaha @baibaichen

@Yohahaha
Contributor

> Thanks for the review. Are there any other comments? If not, I will merge this later. @Yohahaha @baibaichen

I have no more comments, thank you!

@zml1206 zml1206 merged commit 059845c into apache:main Jan 23, 2025
47 checks passed
baibaichen pushed a commit to baibaichen/gluten that referenced this pull request Feb 1, 2025
@zml1206 zml1206 deleted the 8580-1 branch December 9, 2025 08:13

Development

Successfully merging this pull request may close these issues.

[Core] Don't report 'Not supported to map spark function name to substrait function name: input_file_name(), class name: InputFileName.'
