-
Notifications
You must be signed in to change notification settings - Fork 5k
Closed
Labels
Description
Search before asking
- I had searched in the issues and found no similar issues.
What happened
When using the seatunel engine to submit Spark jobs, selecting Spark as the master did not concatenate the master address in the generated job submission script, resulting in job failure
it generated a script like this:
[INFO] 2023-07-14 00:39:14.376 +0000 - SeaTunnel task command: ${SEATUNNEL_HOME}/bin/start-seatunnel-spark-3-connector-v2.sh --config /tmp/dolphinscheduler/exec/process/root/10215811485920/10215863853024_2/9/9/seatunnel_9_9.conf --deploy-mode client --master spark:// sparkhost:port
and report this exception:
23/07/14 08:39:18 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Invalid master URL: spark://
at org.apache.spark.util.Utils$.extractHostPortFromSparkUrl(Utils.scala:2551)
at org.apache.spark.rpc.RpcAddress$.fromSparkURL(RpcAddress.scala:47)
at org.apache.spark.deploy.client.StandaloneAppClient.$anonfun$masterRpcAddresses$1(StandaloneAppClient.scala:53)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
at org.apache.spark.deploy.client.StandaloneAppClient.<init>(StandaloneAppClient.scala:53)
at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.start(StandaloneSchedulerBackend.scala:123)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:222)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:595)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2714)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
at org.apache.seatunnel.core.starter.spark.execution.SparkRuntimeEnvironment.prepare(SparkRuntimeEnvironment.java:113)
at org.apache.seatunnel.core.starter.spark.execution.SparkRuntimeEnvironment.prepare(SparkRuntimeEnvironment.java:38)
at org.apache.seatunnel.core.starter.execution.RuntimeEnvironment.initialize(RuntimeEnvironment.java:48)
at org.apache.seatunnel.core.starter.spark.execution.SparkRuntimeEnvironment.<init>(SparkRuntimeEnvironment.java:60)
at org.apache.seatunnel.core.starter.spark.execution.SparkRuntimeEnvironment.getInstance(SparkRuntimeEnvironment.java:173)
at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.<init>(SparkExecution.java:50)
at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:59)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.URISyntaxException: Expected authority at index 8: spark://
at java.net.URI$Parser.fail(URI.java:2847)
at java.net.URI$Parser.failExpecting(URI.java:2853)
at java.net.URI$Parser.parseHierarchical(URI.java:3101)
at java.net.URI$Parser.parse(URI.java:3052)
at java.net.URI.<init>(URI.java:588)
at org.apache.spark.util.Utils$.extractHostPortFromSparkUrl(Utils.scala:2536)
... 38 more
What you expected to happen
There should be no spaces in the master address, like this
--master spark://sparkhost:port
How to reproduce
the code of generating master args in org.apache.dolphinscheduler.plugin.task.seatunnel.spark.SeatunnelSparkTask: buildOptions() needs to be modified
if master is spark or mesos, the master command needs to be concatenated with the url
like this
args.add(MASTER_OPTIONS);
if (MasterTypeEnum.SPARK.equals(master) || MasterTypeEnum.MESOS.equals(master)) {
args.add(master.getCommand() + seatunnelParameters.getMasterUrl());
} else {
args.add(master.getCommand());
}
Anything else
No response
Version
3.1.x
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
Reactions are currently unavailable