Add spark3-hudi image #136

codope · 2022-07-15T09:49:44Z

Add spark3-hudi image for product tests

Installs Spark 3.2.1
Installs hudi-spark3-bundle
Add Hive user

ebyhr · 2022-10-12T06:02:15Z

@codope Is this PR ready for review?

codope · 2022-10-12T08:28:08Z

Not yet. I'll update it by tomorrow.

codope · 2022-10-12T12:05:25Z

@ebyhr @findinpath I have updated the PR. When I try to build locally using make testing/spark3-hudi, I get this error:

Dockerfile:13
--------------------
  11 |     # limitations under the License.
  12 |
  13 | >>> FROM testing/centos7-oj11:unlabelled
  14 |
  15 |     ARG SPARK_VERSION=3.2.1
--------------------
error: failed to solve: testing/centos7-oj11:unlabelled: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

I also tried building the base image first with make testing/centos7-oj11, but I still get the same error.

etc/compose/spark3-hudi/docker-compose.yml

nineinchnick · 2022-10-12T12:14:52Z

.github/workflows/ci.yml

@@ -20,6 +20,7 @@ jobs:
            test: spark3-iceberg
          - image: spark3-delta
            test: spark3-delta
+          - image: spark3-hudi


Can you enable tests for this image? We just recently added them to almost all images. See bin/test.sh. The test is just a simple smoke test to see if a container using this image will start.
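
For readers unfamiliar with this kind of check, here is a minimal sketch of a "does the container start" smoke test. It is not the content of bin/test.sh; the function name `smoke_test`, the `SMOKE_WAIT` variable, and the 10-second default are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical smoke test: start a container from IMAGE and check that it is
# still running after a short wait. This is NOT the content of bin/test.sh.
SMOKE_WAIT=${SMOKE_WAIT:-10}   # seconds to wait before checking (assumed value)

smoke_test() {
    image=$1
    # Start the container detached; docker prints the new container ID.
    cid=$(docker run -d "$image") || return 1
    sleep "$SMOKE_WAIT"
    # .State.Running is "true" while the container's main process is alive.
    running=$(docker inspect -f '{{.State.Running}}' "$cid")
    # Always remove the container, whatever the result.
    docker rm -f "$cid" >/dev/null
    [ "$running" = "true" ]
}

# Example: smoke_test testing/spark3-hudi:latest && echo "container started"
```

The function only reports whether the main process stayed alive; it says nothing about whether Spark or Hudi inside the container actually work.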

testing/spark3-hudi/Dockerfile

nineinchnick · 2022-10-12T12:18:33Z

@codope I can't reproduce the issue. Can you paste the full output from make? It should recognize this dependency and build the base image first. Maybe you could start a thread on Slack where we could debug this together.

findinpath · 2022-10-12T12:19:35Z

@codope I just tried to build the image locally and the build went fine.
Make sure to rebase your changes on top of master before running make testing/spark3-hudi again.

findinpath · 2022-10-12T12:22:29Z

testing/spark3-hudi/Dockerfile

+
+FROM testing/centos7-oj11:unlabelled
+
+ARG SPARK_VERSION=3.2.1


From https://hudi.apache.org/docs/quick-start-guide/

| Hudi | Supported Spark 3 version |
|---|---|
| 0.12.x | 3.3.x (default build), 3.2.x, 3.1.x |
| 0.11.x | 3.2.x (default build, Spark bundle only), 3.1.x |

Since Hudi is a new connector in the Trino ecosystem, let's use the latest Hudi 0.12.x, which supports the latest Spark version, 3.3.0. Spark 3.3.0 can also run on Java 8, 11, or 17.

Let's therefore build on top of the testing/centos7-oj17:unlabelled base image.

I have kept the same version as in trino-hudi. The plan is to upgrade to Hudi 0.12.1, which will be out very soon. Then I'll make the changes here as well in a follow-up PR.

@codope Doesn't Hudi support different versions between server and client?

codope · 2022-10-12T14:36:15Z

@nineinchnick @findinpath I still get the error after rebasing. Here's a gist with the full output of the make command: https://gist.github.com/codope/921469ef910e3a76314d251b35060a6e

nineinchnick · 2022-10-12T14:46:55Z

Thanks, can you also include the output of docker info? It looks like the new Docker builder (buildx) can't see images in the local repo.

nineinchnick · 2022-10-12T14:51:04Z

Can you also include the output of docker images | grep testing? It looks like the docker tag command runs fine, but subsequent docker buildx commands can't see the new image built locally. Can you also check if you're not running out of disk space?
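
A quick way to check whether a specific tag is visible in the local image store is `docker image inspect`, which exits non-zero when the image is absent. The sketch below wraps that in a hypothetical helper named `image_exists`; the two tags are the ones mentioned in this thread.

```shell
#!/bin/sh
# Succeeds if the local Docker image store has the given image:tag.
# (image_exists is a hypothetical helper name, not part of this repository.)
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Tags taken from the thread; the loop only reports, it does not build anything.
for ref in testing/centos7-oj11:unlabelled testing/centos7-oj11:latest; do
    if image_exists "$ref"; then
        echo "found:   $ref"
    else
        echo "missing: $ref"
    fi
done
```

If the base tag is reported missing right after a build that appeared to succeed, the build result probably never reached the local image store.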

codope · 2022-10-12T14:54:41Z

> Can you also include the output of docker images | grep testing? It looks like the docker tag command runs fine, but subsequent docker buildx commands can't see the new image built locally. Can you also check if you're not running out of disk space?

@nineinchnick Thanks for the pointers. It looks like I'm running out of space; I have just 0.5 GB left. I'll clean up and retry in a while. How much space is typically required?
Output of docker info: https://gist.github.com/codope/921469ef910e3a76314d251b35060a6e
Output of docker images | grep testing is below.

docker images | grep testing
ghcr.io/trinodb/testing/hdp3.1-hive                         65                             90bbaa186970   2 months ago    4.5GB
ghcr.io/trinodb/testing/hdp2.6-hive                         65                             953bc3fc663d   2 months ago    3.23GB
testing/spark3-hudi                                         latest                         76ec389112ca   2 months ago    1.37GB
testing/centos7-oj11                                        latest                         e84dea2f6000   2 months ago    966MB
testing/centos7-oj11                                        unlabelled                     e84dea2f6000   2 months ago    966MB
ghcr.io/trinodb/testing/hdp3.1-hive                         64                             4b99d1995fbd   3 months ago    4.5GB
ghcr.io/trinodb/testing/spark3.0-iceberg                    64                             cd45c86b997d   3 months ago    1.33GB
ghcr.io/trinodb/testing/centos7-oj11                        64                             e822b8bf2137   3 months ago    966MB

nineinchnick · 2022-10-12T15:17:34Z

These images are pretty heavy, so unfortunately you need multiple gigabytes. You probably need at least 5 GB.
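
A rough pre-build check for free space can be sketched as follows. The helper name `has_free_space` is hypothetical, and the 5 GB threshold is only the estimate given in this thread. The check looks at the filesystem holding the given directory, which is not necessarily where Docker stores its images (for example, on Docker Desktop the images live inside a VM disk), so treat the result as a hint.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the filesystem holding DIR
# (default: current directory) has at least MIN_GB gigabytes available.
has_free_space() {
    min_gb=$1
    dir=${2:-.}
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

if has_free_space 5; then
    echo "enough space to build"
else
    echo "less than 5 GB free; consider pruning unused images" >&2
fi
```

`docker system df` additionally shows how much space images, containers, and the build cache currently occupy.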

codope · 2022-10-12T16:43:08Z

I pruned all older images, removed the build cache, and reclaimed enough space. This time it's a different error:
Error response from daemon: No such image: testing/centos7-oj11:latest. Full log below:

make testing/spark3-hudi

bin/depend.sh -p unlabelled testing/spark3-hudi/Dockerfile testing/kerberos:unlabelled testing/cdh5.12-hive-kerberized:unlabelled testing/cdh5.15-hive-kerberized:unlabelled testing/accumulo:unlabelled testing/centos7-oj11:unlabelled testing/centos7-oj17-openldap-referrals:unlabelled testing/centos7-oj17:unlabelled testing/hdp2.6-hive-kerberized-2:unlabelled testing/gpdb-6:unlabelled testing/spark3-iceberg:unlabelled testing/hdp2.6-hive:unlabelled testing/cdh5.15-hive-kerberized-kms:unlabelled testing/hive3.1-hive:unlabelled testing/spark3-hudi:unlabelled testing/phoenix5:unlabelled testing/centos7-oj17-openldap:unlabelled testing/hdp3.1-hive:unlabelled testing/cdh5.12-hive:unlabelled testing/cdh5.15-hive:unlabelled testing/dns:unlabelled testing/hdp3.1-hive-kerberized:unlabelled testing/hdp2.6-hive-kerberized:unlabelled testing/spark3-delta:unlabelled
testing/centos7-oj11:unlabelled
bin/depend.sh -p unlabelled testing/centos7-oj11/Dockerfile testing/kerberos:unlabelled testing/cdh5.12-hive-kerberized:unlabelled testing/cdh5.15-hive-kerberized:unlabelled testing/accumulo:unlabelled testing/centos7-oj11:unlabelled testing/centos7-oj17-openldap-referrals:unlabelled testing/centos7-oj17:unlabelled testing/hdp2.6-hive-kerberized-2:unlabelled testing/gpdb-6:unlabelled testing/spark3-iceberg:unlabelled testing/hdp2.6-hive:unlabelled testing/cdh5.15-hive-kerberized-kms:unlabelled testing/hive3.1-hive:unlabelled testing/spark3-hudi:unlabelled testing/phoenix5:unlabelled testing/centos7-oj17-openldap:unlabelled testing/hdp3.1-hive:unlabelled testing/cdh5.12-hive:unlabelled testing/cdh5.15-hive:unlabelled testing/dns:unlabelled testing/hdp3.1-hive-kerberized:unlabelled testing/hdp2.6-hive-kerberized:unlabelled testing/spark3-delta:unlabelled
docker pull library/centos:7
7: Pulling from library/centos
Digest: sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
Status: Image is up to date for centos:7
docker.io/library/centos:7

Building [testing/centos7-oj11@latest] image using buildkit

cd testing/centos7-oj11 && time /bin/sh -c "( docker buildx build --compress --progress=plain --add-host hadoop-master:127.0.0.2   -t testing/centos7-oj11:latest --label io.trino.git.hash=f231f96dc51f183d7b87a6ce7b7da27f1efa5e02 . )"
WARNING: No output specified for docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 1.81kB done
#1 DONE 0.2s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.2s

#3 [internal] load metadata for docker.io/library/centos:7
#3 DONE 8.6s

#4 [internal] load build context
#4 DONE 0.0s

#5 [1/3] FROM docker.io/library/centos:7@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#5 resolve docker.io/library/centos:7@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#5 resolve docker.io/library/centos:7@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407 5.2s done
#5 DONE 5.7s

#4 [internal] load build context
#4 transferring context: 385B done
#4 DONE 0.5s

#6 [2/3] COPY ./files /
#6 CACHED

#7 [3/3] RUN     set -xeu &&     yum install -y         nc         wget         &&         rpm -i https://cdn.azul.com/zulu$([ "$(arch)" != "aarch64" ] || echo "-embedded")/bin/zulu11.58.23-ca-jdk11.0.16.1-linux."$(arch)".rpm &&     alternatives --set java /usr/lib/jvm/zulu-11/bin/java &&     alternatives --set javac /usr/lib/jvm/zulu-11/bin/javac &&         yum --enablerepo=extras install -y setuptools epel-release &&     yum install -y python-pip &&     pip install supervisor &&         yum install -y         less `# helpful when troubleshooting product tests`         net-tools `# netstat is required by run_on_docker.sh`         sudo         telnet `# helpful when troubleshooting product tests`         vim `# helpful when troubleshooting product tests`         jq `# helpful json processing tool`         &&     yum -y clean all && rm -rf /tmp/* /var/tmp/*
#7 CACHED

real	0m15.684s
user	0m0.209s
sys	0m0.186s
docker history testing/centos7-oj11:latest
Error response from daemon: No such image: testing/centos7-oj11:latest
make: *** [testing/centos7-oj11@latest] Error 1

nineinchnick · 2022-10-12T17:27:30Z

The warning there looks related. Your Docker server version looks like the latest one, but your buildx plugin does not. Can you check whether you can update it?

codope · 2022-10-14T05:35:23Z

I am still facing the same issue after upgrading buildx. Let me check with others. Maybe it has got something to do with my local environment. Just to ensure I'm doing it right, there is no step other than running make testing/spark3-hudi for this PR right?

codope · 2022-10-14T05:36:31Z

Also, I see that the spark3-hudi image was built and the smoke test ran successfully: https://github.com/trinodb/docker-images/actions/runs/3235524333/jobs/5300075791
Can we merge this PR? @nineinchnick

nineinchnick · 2022-10-14T05:45:37Z

Yes, but we have to wait for a maintainer to do so

nineinchnick · 2022-10-14T05:52:27Z

> Just to ensure I'm doing it right, there is no step other than running make testing/spark3-hudi for this PR right?

That's right, there are no other steps required. I saw some issues reported against Docker saying that buildx does not load the image into the local repo after building it, but that is supposed to be fixed. If you have the latest version, as we do, you should not be affected.

Building the image locally is only required if you want to use it to run product tests in Trino, which I assume you do, since you're working on adding new tests.

You could try building the images without make; make prints all the docker commands it executes.
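
As a sketch of building by hand: the warning in the log above says that, with the docker-container driver, the build result stays in the build cache unless `--load` (or `--push`) is given. The hypothetical helper below runs `docker buildx build` with `--load`, so the tag ends up in the local image store. The `DOCKER` override variable and the function name `build_and_load` are assumptions of this sketch, and the other flags make passes (such as `--add-host` and `--label`) are omitted for brevity.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER=echo) to print the command
# instead of running it; this variable is an assumption of the sketch.
DOCKER=${DOCKER:-docker}

# Build CONTEXT_DIR with buildx and load the result into the local image
# store under TAG, so that later `docker history` or FROM lines can see it.
build_and_load() {
    context_dir=$1
    tag=$2
    "$DOCKER" buildx build --load -t "$tag" "$context_dir"
}

# Example: build_and_load testing/centos7-oj11 testing/centos7-oj11:latest
```

`docker buildx ls` shows which builder and driver are currently active, which may help explain why the image is not loaded by default.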

codope · 2022-10-14T08:55:50Z

Thanks for the help, @nineinchnick.
Actually, my colleague can build it with the make command, so it's my local environment that is causing the issue. I am going to uninstall Docker and start from scratch.
For product tests, once the spark3-hudi image is released, I can pull and use it (even if I can't build it locally).
@ebyhr @findinpath This PR is ready to land now. It would be great if you could help with releasing the image.

findinpath · 2022-10-14T12:52:08Z

@codope Can you please add a basic product test in trinodb/trino#14365 that creates a table via Spark Hudi and reads it via Trino, using the newly created image? It is OK to reference the image you have locally in the code; I just want to test it myself as well. That way we can verify that the image basically works. Otherwise, we may find that something is missing and need to do yet another release.

findinpath · 2022-10-17T22:03:39Z

@codope The draft PR trinodb/trino#14669 is a working draft that you can use as a template for building the tests required for trinodb/trino#14365.

I based the tests (same as for Delta Lake OSS) on a MinIO data lake because I think (@codope correct me if I'm wrong) this kind of setup is close to the integration with AWS S3 which we want to test with Trino for Hudi.

@ebyhr please do test the draft PR on your machine as well and if you're ok with the results, I think we can release a new version of the docker images so that @codope can continue the work on the product tests.

findinpath · 2022-10-17T22:04:30Z

@codope Please squash the commits and add a relevant commit message.

ebyhr · 2022-10-18T08:14:18Z

I'm testing the draft PR based on this. Let me merge once it's complete.

ebyhr · 2022-10-18T11:37:54Z

@findinpath I got the exception below in my environment. Is this the same command you executed?

testing/trino-product-tests-launcher/bin/run-launcher suite run --config config-hdp3 --suite suite-hudi

spark               | 22/10/18 10:13:45 ERROR SparkExecuteStatementOperation: Error executing query with e71ce1da-cc32-4665-91b8-34003887d616, currentState RUNNING,
spark               | java.lang.ClassNotFoundException:
spark               | Failed to find data source: hudi. Please find packages at
spark               | http://spark.apache.org/third-party-projects.html
spark               |
spark               | 	at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:443)
spark               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:670)
spark               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:720)
spark               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.org$apache$spark$sql$catalyst$analysis$ResolveSessionCatalog$$isV2Provider(ResolveSessionCatalog.scala:636)
spark               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:165)
spark               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:48)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
spark               | 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
spark               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:48)
spark               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:42)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
spark               | 	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
spark               | 	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
spark               | 	at scala.collection.immutable.List.foldLeft(List.scala:91)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
spark               | 	at scala.collection.immutable.List.foreach(List.scala:431)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
spark               | 	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
spark               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
spark               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
spark               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
spark               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
spark               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
spark               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
spark               | 	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
spark               | 	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
spark               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
spark               | 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
spark               | 	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
spark               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
spark               | 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
spark               | 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:230)
spark               | 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:230)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:225)
spark               | 	at java.base/java.security.AccessController.doPrivileged(Native Method)
spark               | 	at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
spark               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
spark               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:239)
spark               | 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
spark               | 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
spark               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
spark               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
spark               | 	at java.base/java.lang.Thread.run(Thread.java:829)
spark               | Caused by: java.lang.ClassNotFoundException: hudi.DefaultSource
spark               | 	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
spark               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
spark               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
spark               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:656)
spark               | 	at scala.util.Try$.apply(Try.scala:213)
spark               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:656)
spark               | 	at scala.util.Failure.orElse(Try.scala:224)
spark               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:656)
spark               | 	... 67 more
spark               | 22/10/18 10:13:45 INFO DAGScheduler: Asked to cancel job group e71ce1da-cc32-4665-91b8-34003887d616
spark               | 22/10/18 10:13:45 INFO SparkExecuteStatementOperation: Close statement with e71ce1da-cc32-4665-91b8-34003887d616
tests               | 2022-10-18 15:58:45 INFO: FlakyTestRetryAnalyzer not enabled: CONTINUOUS_INTEGRATION environment is not detected or system property 'io.trino.testng.services.FlakyTestRetryAnalyzer.enabled' is not set to 'true' (actual: <not set>)
spark               | 22/10/18 10:13:45 INFO metastore: Closed a connection to metastore, current connections: 1
spark               | 22/10/18 10:13:45 INFO metastore: Trying to connect to metastore with URI thrift://hadoop-master:9083
spark               | 22/10/18 10:13:45 INFO metastore: Opened a connection to metastore, current connections: 2
spark               | 22/10/18 10:13:45 INFO metastore: Connected to metastore.
spark               | 22/10/18 10:13:45 INFO metastore: Closed a connection to metastore, current connections: 1
spark               | 22/10/18 10:13:45 ERROR TThreadPoolServer: Thrift error occurred during processing of message.
spark               | org.apache.thrift.transport.TTransportException
spark               | 	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
spark               | 	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
tests               | 2022-10-18 15:58:45 INFO: FAILURE     /    io.trino.tests.product.hudi.TestHudiCompatibility.testDemo (Groups: profile_specific_tests, hudi) took 4.2 seconds
spark               | 	at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
spark               | 	at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
spark               | 	at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
spark               | 	at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
spark               | 	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
spark               | 	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
spark               | 	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
spark               | 	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
spark               | 	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
spark               | 	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:52)
spark               | 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
spark               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
spark               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
spark               | 	at java.base/java.lang.Thread.run(Thread.java:829)
tests               | 2022-10-18 15:58:45 SEVERE: Failure cause:
tests               | io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.ClassNotFoundException:
tests               | Failed to find data source: hudi. Please find packages at
tests               | http://spark.apache.org/third-party-projects.html
tests               |
tests               | 	at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:44)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:230)
tests               | 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:230)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:225)
tests               | 	at java.base/java.security.AccessController.doPrivileged(Native Method)
tests               | 	at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
tests               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:239)
tests               | 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
tests               | 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:829)
tests               | Caused by: java.lang.ClassNotFoundException:
tests               | Failed to find data source: hudi. Please find packages at
tests               | http://spark.apache.org/third-party-projects.html
tests               |
tests               | 	at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:443)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:670)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:720)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.org$apache$spark$sql$catalyst$analysis$ResolveSessionCatalog$$isV2Provider(ResolveSessionCatalog.scala:636)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:165)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:48)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
tests               | 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:48)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:42)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
tests               | 	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
tests               | 	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
tests               | 	at scala.collection.immutable.List.foldLeft(List.scala:91)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
tests               | 	at scala.collection.immutable.List.foreach(List.scala:431)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
tests               | 	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
tests               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
tests               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
tests               | 	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
tests               | 	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
tests               | 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
tests               | 	... 16 more
tests               | Caused by: java.lang.ClassNotFoundException: hudi.DefaultSource
tests               | 	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
tests               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
tests               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:656)
tests               | 	at scala.util.Try$.apply(Try.scala:213)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:656)
tests               | 	at scala.util.Failure.orElse(Try.scala:224)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:656)
tests               | 	... 67 more
tests               |
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
tests               | 	at io.trino.tests.product.utils.QueryExecutors$4.executeQuery(QueryExecutors.java:168)
tests               | 	at io.trino.tests.product.hudi.TestHudiCompatibility.testDemo(TestHudiCompatibility.java:51)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
tests               | 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
tests               | 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
tests               | 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
tests               | 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
tests               | 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
tests               | 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:833)
tests               | Caused by: java.sql.SQLException: org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.ClassNotFoundException:
tests               | Failed to find data source: hudi. Please find packages at
tests               | http://spark.apache.org/third-party-projects.html
tests               |
tests               | 	at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:44)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:230)
tests               | 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:230)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:225)
tests               | 	at java.base/java.security.AccessController.doPrivileged(Native Method)
tests               | 	at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
tests               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:239)
tests               | 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
tests               | 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:829)
tests               | Caused by: java.lang.ClassNotFoundException:
tests               | Failed to find data source: hudi. Please find packages at
tests               | http://spark.apache.org/third-party-projects.html
tests               |
tests               | 	at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:443)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:670)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:720)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.org$apache$spark$sql$catalyst$analysis$ResolveSessionCatalog$$isV2Provider(ResolveSessionCatalog.scala:636)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:165)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:48)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
tests               | 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:48)
tests               | 	at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:42)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
tests               | 	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
tests               | 	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
tests               | 	at scala.collection.immutable.List.foldLeft(List.scala:91)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
tests               | 	at scala.collection.immutable.List.foreach(List.scala:431)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
tests               | 	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
tests               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
tests               | 	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
tests               | 	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
tests               | 	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
tests               | 	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
tests               | 	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
tests               | 	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
tests               | 	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
tests               | 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
tests               | 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
tests               | 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
tests               | 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
tests               | 	... 16 more
tests               | Caused by: java.lang.ClassNotFoundException: hudi.DefaultSource
tests               | 	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
tests               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
tests               | 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:656)
tests               | 	at scala.util.Try$.apply(Try.scala:213)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:656)
tests               | 	at scala.util.Failure.orElse(Try.scala:224)
tests               | 	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:656)
tests               | 	... 67 more
tests               |
tests               | 	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:128)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:112)
tests               | 	... 16 more
tests               | 	Suppressed: java.lang.Exception: Query: CREATE TABLE default.test_hudi_demo_kqy3etbz6v5z (uuid int, col string) USING hudi LOCATION 's3://presto-ci-test/hudi-compatibility-test-test_hudi_demo_kqy3etbz6v5z'
tests               | 		at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:136)
tests               | 		... 17 more

findinpath · 2022-10-18T13:15:16Z

@ebyhr It was a copy-paste issue in SuiteHudi, which has been corrected in the meantime: EnvSinglenodeDeltaLakeOss -> EnvSinglenodeHudi

ebyhr · 2022-10-19T10:52:42Z

Merged, thanks!

cla-bot bot added the cla-signed label Jul 15, 2022

codope mentioned this pull request Sep 26, 2022

Add Hudi connector trinodb/trino#10228

Merged

findinpath self-requested a review October 3, 2022 11:03

findinpath mentioned this pull request Oct 3, 2022

Add product test for Hudi trinodb/trino#14365

Closed

codope force-pushed the hudi-image branch from 569b479 to eeac17c Compare October 12, 2022 12:00

nineinchnick reviewed Oct 12, 2022

findinpath reviewed Oct 12, 2022

findinpath approved these changes Oct 17, 2022

ebyhr approved these changes Oct 19, 2022

Add spark3-hudi image

9904925

ebyhr force-pushed the hudi-image branch from f231f96 to 9904925 Compare October 19, 2022 10:23

ebyhr merged commit c8429c9 into trinodb:master Oct 19, 2022

findinpath mentioned this pull request Nov 4, 2022

Draft: Add singlenode-hudi product test environment trinodb/trino#14669

Closed

Hudi	Supported Spark 3 version
0.12.x	3.3.x (default build), 3.2.x, 3.1.x
0.11.x	3.2.x (default build, Spark bundle only), 3.1.x
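
For anyone who wants to sanity-check a Spark/Hudi pairing against the table above, here is a rough sketch that encodes the two rows as a lookup. The names `SUPPORTED_SPARK3`, `minor_line`, and `is_supported` are hypothetical and not part of Hudi or this repository, and the data only covers the two release lines listed in the table.

```python
# Support matrix transcribed from the table above (assumption: only these
# two Hudi release lines matter). Keys are Hudi "major.minor" lines, values
# are the Spark 3 "major.minor" lines listed as supported.
SUPPORTED_SPARK3 = {
    "0.12": ("3.3", "3.2", "3.1"),
    "0.11": ("3.2", "3.1"),
}


def minor_line(version: str) -> str:
    """Reduce a full version such as '3.2.1' to its 'major.minor' line."""
    major, minor, *_ = version.split(".")
    return f"{major}.{minor}"


def is_supported(hudi_version: str, spark_version: str) -> bool:
    """Return True if the table lists this Spark line for this Hudi line."""
    spark_lines = SUPPORTED_SPARK3.get(minor_line(hudi_version), ())
    return minor_line(spark_version) in spark_lines


if __name__ == "__main__":
    # Spark 3.2.1 is the version installed by this image.
    print(is_supported("0.12.0", "3.2.1"))
    print(is_supported("0.11.1", "3.3.0"))
```

Unknown Hudi release lines return `False` rather than raising, so the check fails closed for versions the table does not mention.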
