[GLUTEN-10072][UT] Enable hive execution tests#12005

Merged
rui-mo merged 4 commits into apache:main from wecharyu:GLUTEN-10072 on May 15, 2026

@wecharyu
Contributor

What changes are proposed in this pull request?

Enable unit tests in Spark sql/hive/src/test/scala/org/apache/spark/sql/hive/execution:

  • Extract spark-hive test resources to spark.test.home in the install-spark-resources.sh script
  • Use GlutenHiveResourcePathSupport to load test resources from spark.test.home
  • Use GlutenTestHiveTables to register Hive qtest tables
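
As a rough illustration of the second bullet, a resource lookup rooted at spark.test.home could look like the Java sketch below. `TestResourceLocator` is a hypothetical name and does not reflect the actual GlutenHiveResourcePathSupport implementation. The sketch assumes spark.test.home is passed to the test JVM as a system property.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not the Gluten API.
final class TestResourceLocator {
    private TestResourceLocator() {}

    // Resolves a path relative to the directory named by the spark.test.home
    // system property, failing fast if the property or the target is missing.
    static Path resolve(String relativePath) {
        String home = System.getProperty("spark.test.home");
        if (home == null || home.isEmpty()) {
            throw new IllegalStateException("spark.test.home is not set");
        }
        Path resolved = Paths.get(home).resolve(relativePath).normalize();
        if (!Files.exists(resolved)) {
            throw new IllegalStateException("Missing test resource: " + resolved);
        }
        return resolved;
    }
}
```

Failing fast with a clear message makes a missing extraction step in CI easier to diagnose than a later file-not-found error inside a test.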

#10072

How was this patch tested?

All new tests pass.

Was this patch authored or co-authored using generative AI tooling?

Codex gpt-5.5

@github-actions

Run Gluten Clickhouse CI on x86

@github-actions github-actions Bot added the CORE (works for Gluten Core) and INFRA labels Apr 28, 2026
Comment thread pom.xml
<delta.version>2.3.0</delta.version>
<delta.binary.version>23</delta.binary.version>
<antlr4.version>4.8</antlr4.version>
<hadoop.version>3.3.2</hadoop.version>
Contributor Author

Spark 3.3.1 uses Hadoop 3.3.2, so we need to declare it explicitly here. Otherwise there are version conflicts:

  • hadoop-common-2.7.4 (default version)
  • hadoop-client-api/runtime-3.3.4 (from spark-hive test)

That mix is unsafe. The old Hadoop 2.7.4 FsUrlStreamHandlerFactory can be loaded together with the Hadoop 3.3.4 HTTP/HTTPS filesystem classes. As a result, HTTPS URLs used by Hive test jar resolution and Ivy ADD JAR resolution may be incorrectly handled as Hadoop filesystem URLs, which can recurse through HttpsFileSystem.open() and fail with a StackOverflowError.
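
One way to spot this kind of mix before it reaches a test run is to scan the jar names on the test classpath for more than one Hadoop version. The following is a minimal Java sketch under that assumption. `HadoopVersionCheck` and its methods are hypothetical names for illustration, not part of Gluten or Hadoop, and the jar-name pattern is a simplification.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class HadoopVersionCheck {
    // Matches names such as hadoop-common-2.7.4.jar or hadoop-client-api-3.3.4.jar
    // and captures the trailing version number.
    private static final Pattern HADOOP_JAR =
        Pattern.compile("hadoop-[a-z-]+-(\\d+(?:\\.\\d+)*)\\.jar");

    private HadoopVersionCheck() {}

    // Returns the distinct Hadoop versions found among the given jar file names.
    static Set<String> hadoopVersions(List<String> jarNames) {
        Set<String> versions = new TreeSet<>();
        for (String name : jarNames) {
            Matcher m = HADOOP_JAR.matcher(name);
            if (m.matches()) {
                versions.add(m.group(1));
            }
        }
        return versions;
    }

    // More than one distinct version suggests a mixed classpath.
    static boolean hasConflict(List<String> jarNames) {
        return hadoopVersions(jarNames).size() > 1;
    }
}
```

For the jars listed above (hadoop-common-2.7.4, hadoop-client-api-3.3.4, hadoop-client-runtime-3.3.4), `hasConflict` would report a conflict. Separately versioned artifacts whose names also start with `hadoop-` would need to be excluded in a real check.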

@github-actions

Run Gluten Clickhouse CI on x86

@github-actions github-actions Bot added the BUILD label Apr 29, 2026
@github-actions

Run Gluten Clickhouse CI on x86

@wecharyu
Contributor Author

wecharyu commented May 8, 2026

@zhouyuan @rui-mo could you help review this patch when you have time? Thanks!

Contributor

@rui-mo rui-mo left a comment

Thanks for this work! The change looks good to me. Would you mind sharing some progress details on enabling the Hive execution tests? For example, how many suites were added in this PR, and how many still remain to be added?

spark-test-spark35-slow:
  needs: build-native-lib-centos-7
  runs-on: ubuntu-22.04
  env: #TODO remove after image update
Contributor

Could you provide a bit more detail in the PR description on how we should follow up? Thanks.

Contributor Author

This is an existing pain point with Gluten image updates:

  • The Docker image is pushed to the apache/gluten repo when a related commit is merged to the main branch
  • A PR that needs to update the image is still tested on the existing 'old' image

Thus we need to add scripts to the test action before the PR is merged, and remove them once the PR is merged and the image is updated.

I'm considering letting any PR that updates the image build a snapshot image and run its tests on that snapshot. In that case we would not need to add extra scripts to the test action or remove them in a follow-up.

@wecharyu
Contributor Author

Would you mind sharing some progress details on enabling the Hive execution tests?

This PR adds 25 Hive execution tests from Spark's sql/hive/src/test/scala/org/apache/spark/sql/hive/execution. Some other Hive tests could also be added:

  • sql/hive/src/test/scala/org/apache/spark/sql/hive/orc
  • sql/hive/src/test/scala/org/apache/spark/sql/hive

The other tests are related to the Hive client and DDL operations, which do not need to be tested in Gluten.

@rui-mo
Contributor

rui-mo commented May 15, 2026

@wecharyu Thanks. I’m okay with landing this PR. Before merging, could you check whether the CH workflow failure is related?

@github-actions

Run Gluten Clickhouse CI on x86

@github-actions

Run Gluten Clickhouse CI on x86

@wecharyu
Contributor Author

@rui-mo CI is green now.

@rui-mo rui-mo changed the title from "[GLUTEN-10072] Enable hive execution tests" to "[GLUTEN-10072][UT] Enable hive execution tests" May 15, 2026
@rui-mo rui-mo merged commit 356fda0 into apache:main May 15, 2026
64 checks passed