
I got Stuck when i submitted the spark task (TFOS MNIST the official sample) #217

@skyWalker1997

Description

I am a complete newbie to TFOS, and here is the problem I ran into.
I hope someone can help me solve it. Thanks in advance!

I got stuck when I submitted the Spark task. I used Spark standalone mode, and here is my configuration:
hadoop-2.6.5
spark-2.2.0 for hadoop-2.6
jdk1.8.0
tensorflow 1.5.0rc0
tensorflowonspark 1.1.0
I have two PCs running Ubuntu 16.04 LTS. One of them is the master (namenode && datanode) and also a slave (worker); the other one is just a slave.

Before I got stuck, I successfully ran the CSV conversion program (mnist_data_setup.py) and got the image folder and label folder onto my HDFS. While it ran, both of my slaves did the work: TFOS assigned tasks to executor 0 (ip .158, which is the master) and executor 1 (ip .159, which is the slave).
I took that as a sign that my Hadoop + Spark + TensorFlow setup was OK.
But when I went on to train MNIST, it got stuck. The submit command below is based on many blogs and answers to similar questions, but none of them worked for me ;(
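For reference, the data-prep step that did succeed was along these lines (reconstructed from the TFoS MNIST example docs, so the exact flags may differ from what I actually typed):

```shell
# Convert the MNIST zip files into CSV images/labels on HDFS
# (output path matches the hdfs://Master:9000/user/ubuntu/examples/mnist/csv
#  paths used in the training submit below)
${SPARK_HOME}/bin/spark-submit \
  --master spark://master:7077 \
  ${TFoS_HOME}/examples/mnist/mnist_data_setup.py \
  --output examples/mnist/csv \
  --format csv
```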

Here is the submit command:
export PYTHON_ROOT=/usr/bin/python
export LD_LIBRARY_PATH=${PATH}
export PYSPARK_PYTHON=/usr/bin/python
export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/usr/bin/python"
export PATH=${PYTHON_ROOT}/bin/:$PATH
export QUEUE=default
export LIB_HDFS=$HADOOP_PREFIX/lib/native/Linux-amd64-64
export LIB_JVM=$JAVA_HOME/jre/lib/amd64/server
export MASTER=${MASTER}
export SPARK_WORKER_INSTANCES=2
export CORES_PER_WORKER=1
export TOTAL_CORES=$((${CORES_PER_WORKER}*${SPARK_WORKER_INSTANCES}))
${SPARK_HOME}/sbin/start-master.sh; ${SPARK_HOME}/sbin/start-slaves.sh -c $CORES_PER_WORKER -m 1G ${MASTER}

(BTW, the official document for standalone TFOS may have a mistake: when starting Spark, it uses "start-slave.sh", which leads to 2 workers on the master but no worker on the slave. "start-slaves.sh" starts the slave correctly.)
(The problem above may also be my own mistake, but "start-slaves.sh" works for me.)
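To illustrate the difference as I understand it (a simplified sketch, not the actual script contents; the host list and loop are only examples):

```shell
# start-slave.sh launches ONE worker, on the machine it is run on:
#   ${SPARK_HOME}/sbin/start-slave.sh spark://master:7077
#
# start-slaves.sh instead reads ${SPARK_HOME}/conf/slaves and launches
# one worker per listed host. Roughly:
printf '192.168.1.158\n192.168.1.159\n' > /tmp/slaves   # example host list
while read -r host; do
  echo "would run on ${host}: start-slave.sh spark://master:7077"
done < /tmp/slaves
```

So running start-slave.sh on the master alone can never bring up a worker on the second machine.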

${SPARK_HOME}/bin/spark-submit \
--master=spark://master:7077 \
--conf spark.executorEnv.LD_LIBRARY_PATH="${JAVA_HOME}/jre/lib/amd64/server" \
--conf spark.executorEnv.CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath --glob):${CLASSPATH}" \
--py-files ${TFoS_HOME}/examples/mnist/spark/mnist_dist.py,${TFoS_HOME}/tfspark.zip \
--conf spark.cores.max=2 \
--conf spark.task.cpus=1 \
${TFoS_HOME}/examples/mnist/spark/mnist_spark.py \
--cluster_size 2 \
--images hdfs://Master:9000/user/ubuntu/examples/mnist/csv/train/images \
--labels hdfs://Master:9000/user/ubuntu/examples/mnist/csv/train/labels \
--format csv \
--mode train \
--model mnist_model
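For context on the numbers above (this is my understanding of the TFOS sizing convention, not something from the official docs): each TF node, including the ps, occupies one Spark task, so spark.cores.max has to cover cluster_size tasks of spark.task.cpus cores each.

```shell
# Sizing sketch: total cores must cover cluster_size TF nodes,
# each pinned to one Spark task of TASK_CPUS cores.
CLUSTER_SIZE=2       # value of --cluster_size passed to mnist_spark.py
TASK_CPUS=1          # value of spark.task.cpus
CORES_MAX=$((CLUSTER_SIZE * TASK_CPUS))
echo "spark.cores.max=${CORES_MAX}"   # prints spark.cores.max=2
```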

The command-line output is below:

18/02/01 17:10:04 INFO spark.SparkContext: Running Spark version 2.2.0
18/02/01 17:10:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/02/01 17:10:04 INFO spark.SparkContext: Submitted application: mnist_spark
18/02/01 17:10:04 INFO spark.SecurityManager: Changing view acls to: ubuntu
18/02/01 17:10:04 INFO spark.SecurityManager: Changing modify acls to: ubuntu
18/02/01 17:10:04 INFO spark.SecurityManager: Changing view acls groups to:
18/02/01 17:10:04 INFO spark.SecurityManager: Changing modify acls groups to:
18/02/01 17:10:04 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set()
18/02/01 17:10:04 INFO util.Utils: Successfully started service 'sparkDriver' on port 36233.
18/02/01 17:10:04 INFO spark.SparkEnv: Registering MapOutputTracker
18/02/01 17:10:04 INFO spark.SparkEnv: Registering BlockManagerMaster
18/02/01 17:10:04 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/02/01 17:10:04 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/02/01 17:10:04 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-63ebc0f7-8382-4ede-b45f-391ee95b8f55
18/02/01 17:10:04 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
18/02/01 17:10:04 INFO spark.SparkEnv: Registering OutputCommitCoordinator
18/02/01 17:10:05 INFO util.log: Logging initialized @1444ms
18/02/01 17:10:05 INFO server.Server: jetty-9.3.z-SNAPSHOT
18/02/01 17:10:05 INFO server.Server: Started @1491ms
18/02/01 17:10:05 INFO server.AbstractConnector: Started ServerConnector@5eaf53af{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/02/01 17:10:05 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20addad8{/jobs,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@a0b04f9{/jobs/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24f86289{/jobs/job,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@71352c8a{/jobs/job/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66f9b986{/stages,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6b8f78de{/stages/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@21eeda6d{/stages/stage,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e80441f{/stages/stage/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@302da50a{/stages/pool,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@33031d8c{/stages/pool/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@379d3a55{/storage,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3be78f3f{/storage/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@214877da{/storage/rdd,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2e848d1d{/storage/rdd/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ed4a9db{/environment,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@36667fce{/environment/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@762505a1{/executors,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@480addf1{/executors/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef33ef{/executors/threadDump,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@75e8c5e5{/executors/threadDump/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3f38b3{/static,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@682ae9af{/,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@154274b2{/api,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@57946975{/jobs/job/kill,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@59c84ef4{/stages/stage/kill,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.158:4040
18/02/01 17:10:05 INFO spark.SparkContext: Added file file:/home/ubuntu/TensorFlowOnSpark/examples/mnist/spark/mnist_spark.py at spark://192.168.1.158:36233/files/mnist_spark.py with timestamp 1517476205189
18/02/01 17:10:05 INFO util.Utils: Copying /home/ubuntu/TensorFlowOnSpark/examples/mnist/spark/mnist_spark.py to /tmp/spark-fcec7a5a-e968-4e64-b5ee-d8752b121deb/userFiles-c929132a-b5fc-43b6-9f9a-24a4b37f1698/mnist_spark.py
18/02/01 17:10:05 INFO spark.SparkContext: Added file file:/home/ubuntu/TensorFlowOnSpark/examples/mnist/spark/mnist_dist.py at spark://192.168.1.158:36233/files/mnist_dist.py with timestamp 1517476205196
18/02/01 17:10:05 INFO util.Utils: Copying /home/ubuntu/TensorFlowOnSpark/examples/mnist/spark/mnist_dist.py to /tmp/spark-fcec7a5a-e968-4e64-b5ee-d8752b121deb/userFiles-c929132a-b5fc-43b6-9f9a-24a4b37f1698/mnist_dist.py
18/02/01 17:10:05 INFO spark.SparkContext: Added file file:/home/ubuntu/TensorFlowOnSpark/tfspark.zip at spark://192.168.1.158:36233/files/tfspark.zip with timestamp 1517476205198
18/02/01 17:10:05 INFO util.Utils: Copying /home/ubuntu/TensorFlowOnSpark/tfspark.zip to /tmp/spark-fcec7a5a-e968-4e64-b5ee-d8752b121deb/userFiles-c929132a-b5fc-43b6-9f9a-24a4b37f1698/tfspark.zip
18/02/01 17:10:05 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://master:7077...
18/02/01 17:10:05 INFO client.TransportClientFactory: Successfully created connection to Master/192.168.1.158:7077 after 21 ms (0 ms spent in bootstraps)
18/02/01 17:10:05 INFO cluster.StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20180201171005-0005
18/02/01 17:10:05 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180201171005-0005/0 on worker-20180201163215-192.168.1.159-45847 (192.168.1.159:45847) with 1 cores
18/02/01 17:10:05 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180201171005-0005/0 on hostPort 192.168.1.159:45847 with 1 cores, 1024.0 MB RAM
18/02/01 17:10:05 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180201171005-0005/1 on worker-20180201163215-192.168.1.158-43047 (192.168.1.158:43047) with 1 cores
18/02/01 17:10:05 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180201171005-0005/1 on hostPort 192.168.1.158:43047 with 1 cores, 1024.0 MB RAM
18/02/01 17:10:05 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37081.
18/02/01 17:10:05 INFO netty.NettyBlockTransferService: Server created on 192.168.1.158:37081
18/02/01 17:10:05 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/02/01 17:10:05 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.158, 37081, None)
18/02/01 17:10:05 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.1.158:37081 with 366.3 MB RAM, BlockManagerId(driver, 192.168.1.158, 37081, None)
18/02/01 17:10:05 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180201171005-0005/0 is now RUNNING
18/02/01 17:10:05 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.158, 37081, None)
18/02/01 17:10:05 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180201171005-0005/1 is now RUNNING
18/02/01 17:10:05 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.158, 37081, None)
18/02/01 17:10:05 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@47677170{/metrics/json,null,AVAILABLE,@Spark}
18/02/01 17:10:05 INFO cluster.StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
args: Namespace(batch_size=100, cluster_size=2, epochs=1, format='csv', images='hdfs://Master:9000/user/ubuntu/examples/mnist/csv/train/images', labels='hdfs://Master:9000/user/ubuntu/examples/mnist/csv/train/labels', mode='train', model='mnist_model', output='predictions', rdma=False, readers=1, steps=1000, tensorboard=False)
2018-02-01T17:10:05.715826 ===== Start
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 218.5 KB, free 366.1 MB)
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.6 KB, free 366.1 MB)
18/02/01 17:10:06 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.158:37081 (size: 20.6 KB, free: 366.3 MB)
18/02/01 17:10:06 INFO spark.SparkContext: Created broadcast 0 from textFile at NativeMethodAccessorImpl.java:0
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 218.5 KB, free 365.9 MB)
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 20.6 KB, free 365.8 MB)
18/02/01 17:10:06 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.1.158:37081 (size: 20.6 KB, free: 366.3 MB)
18/02/01 17:10:06 INFO spark.SparkContext: Created broadcast 1 from textFile at NativeMethodAccessorImpl.java:0
zipping images and labels
18/02/01 17:10:06 INFO mapred.FileInputFormat: Total input paths to process : 10
18/02/01 17:10:06 INFO mapred.FileInputFormat: Total input paths to process : 10
2018-02-01 17:10:06,523 INFO (MainThread-11280) Reserving TFSparkNodes
2018-02-01 17:10:06,525 INFO (MainThread-11280) listening for reservations at ('192.168.1.158', 42071)
2018-02-01 17:10:06,525 INFO (MainThread-11280) Starting TensorFlow on executors
2018-02-01 17:10:06,533 INFO (MainThread-11280) Waiting for TFSparkNodes to start
2018-02-01 17:10:06,533 INFO (MainThread-11280) waiting for 2 reservations
18/02/01 17:10:06 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.1.159:36946) with ID 0
18/02/01 17:10:06 INFO spark.SparkContext: Starting job: foreachPartition at /usr/local/lib/python2.7/dist-packages/tensorflowonspark/TFCluster.py:247
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Got job 0 (foreachPartition at /usr/local/lib/python2.7/dist-packages/tensorflowonspark/TFCluster.py:247) with 2 output partitions
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (foreachPartition at /usr/local/lib/python2.7/dist-packages/tensorflowonspark/TFCluster.py:247)
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Missing parents: List()
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (PythonRDD[8] at foreachPartition at /usr/local/lib/python2.7/dist-packages/tensorflowonspark/TFCluster.py:247), which has no missing parents
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 24.5 KB, free 365.8 MB)
18/02/01 17:10:06 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.1.159:44733 with 366.3 MB RAM, BlockManagerId(0, 192.168.1.159, 44733, None)
18/02/01 17:10:06 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 10.3 KB, free 365.8 MB)
18/02/01 17:10:06 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.158:37081 (size: 10.3 KB, free: 366.2 MB)
18/02/01 17:10:06 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
18/02/01 17:10:06 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (PythonRDD[8] at foreachPartition at /usr/local/lib/python2.7/dist-packages/tensorflowonspark/TFCluster.py:247) (first 15 tasks are for partitions Vector(0, 1))
18/02/01 17:10:06 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
18/02/01 17:10:06 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.1.159, executor 0, partition 0, PROCESS_LOCAL, 4834 bytes)
18/02/01 17:10:06 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.1.158:51760) with ID 1
18/02/01 17:10:06 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 192.168.1.158, executor 1, partition 1, PROCESS_LOCAL, 4834 bytes)
18/02/01 17:10:06 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.1.158:41971 with 366.3 MB RAM, BlockManagerId(1, 192.168.1.158, 41971, None)
18/02/01 17:10:07 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.158:41971 (size: 10.3 KB, free: 366.3 MB)
2018-02-01 17:10:07,534 INFO (MainThread-11280) waiting for 2 reservations
2018-02-01 17:10:08,535 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:09,536 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:10,538 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:11,539 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:12,540 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:13,541 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:14,542 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:15,543 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:16,545 INFO (MainThread-11280) waiting for 1 reservations
2018-02-01 17:10:17,546 INFO (MainThread-11280) waiting for 1 reservations
18/02/01 17:10:17 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.159:44733 (size: 10.3 KB, free: 366.3 MB)
2018-02-01 17:10:18,547 INFO (MainThread-11280) all reservations completed
2018-02-01 17:10:18,547 INFO (MainThread-11280) All TFSparkNodes started
2018-02-01 17:10:18,548 INFO (MainThread-11280) {'addr': '/tmp/pymp-tLjuIS/listener-mfx0pd', 'task_index': 0, 'port': 43545, 'authkey': '6\x11b\x99\xaf\x9eCU\xbba"{\xef\x18\xe3\x83', 'worker_num': 1, 'host': '192.168.1.158', 'ppid': 11424, 'job_name': 'worker', 'tb_pid': 0, 'tb_port': 0}
2018-02-01 17:10:18,548 INFO (MainThread-11280) {'addr': ('192.168.1.159', 39975), 'task_index': 0, 'port': 45045, 'authkey': '.\x0e\xac\x1e#\x8bEk\x93\x8f\x83y\x10fnt', 'worker_num': 0, 'host': '192.168.1.159', 'ppid': 15620, 'job_name': 'ps', 'tb_pid': 0, 'tb_port': 0}
2018-02-01 17:10:18,548 INFO (MainThread-11280) Feeding training data
18/02/01 17:10:18 INFO spark.SparkContext: Starting job: collect at PythonRDD.scala:458
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Got job 1 (collect at PythonRDD.scala:458) with 10 output partitions
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (collect at PythonRDD.scala:458)
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Missing parents: List()
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (PythonRDD[10] at RDD at PythonRDD.scala:48), which has no missing parents
18/02/01 17:10:18 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 52.5 KB, free 365.7 MB)
18/02/01 17:10:18 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 12.5 KB, free 365.7 MB)
18/02/01 17:10:18 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.1.158:37081 (size: 12.5 KB, free: 366.2 MB)
18/02/01 17:10:18 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1006
18/02/01 17:10:18 INFO scheduler.DAGScheduler: Submitting 10 missing tasks from ResultStage 1 (PythonRDD[10] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
18/02/01 17:10:18 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 10 tasks
18/02/01 17:10:19 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, 192.168.1.158, executor 1, partition 0, ANY, 5496 bytes)
18/02/01 17:10:19 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 12790 ms on 192.168.1.158 (executor 1) (1/2)
18/02/01 17:10:19 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.1.158:41971 (size: 12.5 KB, free: 366.3 MB)
18/02/01 17:10:19 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.158:41971 (size: 20.6 KB, free: 366.3 MB)
18/02/01 17:10:20 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.1.158:41971 (size: 20.6 KB, free: 366.2 MB)

It got stuck here, and I had to force it to quit with Ctrl+C.
So what caught me?
