[GLUTEN-10072][UT] Enable hive execution tests #12005
Conversation
Run Gluten Clickhouse CI on x86
```xml
<delta.version>2.3.0</delta.version>
<delta.binary.version>23</delta.binary.version>
<antlr4.version>4.8</antlr4.version>
<hadoop.version>3.3.2</hadoop.version>
```
The Hadoop version of Spark 3.3.1 is 3.3.2; we need to declare it explicitly here. Otherwise there are version conflicts:
- hadoop-common-2.7.4 (default version)
- hadoop-client-api/runtime-3.3.4 (from spark-hive test)
That mix is unsafe: the old Hadoop 2.7.4 FsUrlStreamHandlerFactory can be loaded together with the Hadoop 3.3.4 HTTP/HTTPS filesystem classes. As a result, HTTPS URLs used by Hive test jar resolution and Ivy ADD JAR resolution may be incorrectly handled as Hadoop filesystem URLs, which can recurse through HttpsFileSystem.open() and fail with a StackOverflowError.
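To keep a single Hadoop line on the test classpath, the fix pins `hadoop.version`. A sketch of how the transitive client artifacts could additionally be pinned via `dependencyManagement` (the artifact list here is illustrative; the PR itself only sets the property):

```xml
<!-- Hypothetical sketch: force the Hadoop artifacts that spark-hive's test
     dependencies pull in to the same version Spark 3.3.1 ships with. -->
<properties>
  <hadoop.version>3.3.2</hadoop.version>
</properties>
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-api</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-runtime</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```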
rui-mo left a comment:
Thanks for this work! The change looks good to me. Would you mind sharing some progress details on enabling the Hive execution tests? For example, how many suites were added in this PR, and how many still remain to be added?
```yaml
spark-test-spark35-slow:
  needs: build-native-lib-centos-7
  runs-on: ubuntu-22.04
  env: #TODO remove after image update
```
Could you provide a bit more detail in the PR description on how we should follow up? Thanks.
This is an existing pain point in Gluten image updates:
- The Docker image is pushed to the apache/gluten repo only when a related commit is merged to the main branch
- The PR that needs the updated image is therefore still tested on the existing 'old' image

Thus we need to add scripts to the test action before the PR is merged, and remove them once the PR is merged and the image is updated.
I'm considering letting such an image-updating PR build a snapshot image and run its tests on that snapshot; in that case we would need neither the extra scripts in the test action nor the follow-up removal.
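The snapshot-image idea could look roughly like this as a GitHub Actions job (job names, image tag, and the test script are hypothetical, not the actual apache/gluten workflow):

```yaml
# Hypothetical sketch: build the image from the PR's Dockerfile and run the
# tests inside it, so the PR is validated against its own image.
jobs:
  test-on-snapshot-image:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - name: Build a snapshot image from this PR
        run: docker build -t gluten-ci:snapshot .
      - name: Run the test suite inside the snapshot image
        run: docker run --rm -v "$PWD:/work" -w /work gluten-ci:snapshot ./run-tests.sh
```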
This PR adds 25 Hive execution test suites from Spark.
The other tests relate to the Hive client and DDL operations, which do not need to be tested in Gluten.
@wecharyu Thanks. I'm okay with landing this PR. Before merging, could you check whether the CH workflow failure is related?
@rui-mo CI is green now.
What changes are proposed in this pull request?
Enable unit tests in Spark `sql/hive/src/test/scala/org/apache/spark/sql/hive/execution`:
- Add `spark-hive` test resources to `spark.test.home` in script `install-spark-resources.sh`
- Add `GlutenHiveResourcePathSupport` to load test resources from `spark.test.home`
- Add `GlutenTestHiveTables` to register hive qtest tables

#10072
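The resource-staging step can be sketched as a small shell script (the directory layout mirrors Spark's source tree; the file names and paths are illustrative, not the actual `install-spark-resources.sh` contents):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of staging Hive test resources under spark.test.home.
set -euo pipefail

SPARK_TEST_HOME="$(mktemp -d)"
RES_DIR="${SPARK_TEST_HOME}/sql/hive/src/test/resources/data/files"
mkdir -p "${RES_DIR}"

# In the real script the resources come from the downloaded spark-hive test
# artifacts; here we stage a placeholder so the expected layout is visible.
echo "sample" > "${RES_DIR}/sample.txt"

echo "spark.test.home=${SPARK_TEST_HOME}"
```

Tests that mix in a resource-path trait can then resolve files relative to the `spark.test.home` system property instead of a hard-coded checkout path.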
How was this patch tested?
Pass all new tests.
Was this patch authored or co-authored using generative AI tooling?
Codex gpt-5.5