Deltastreamer not getting auto triggered in continuous mode #3826

@stackls

Description

I ran spark-submit for DeltaStreamer with --continuous and --min-sync-interval-seconds 180.

The job is not getting triggered at the expected 3-minute interval.

spark-submit \
  --jars "/usr/lib/hudi/hudi-utilities-bundle_2.11-0.5.2-incubating.jar,/usr/lib/spark/external/lib/spark-avro_2.11-2.4.5-amzn-0.jar" \
  --master yarn --deploy-mode cluster \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.hive.convertMetastoreParquet=false' \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer /usr/lib/hudi/hudi-utilities-bundle_2.11-0.5.2-incubating.jar \
  --props /test/base_test.properties \
  --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
  --table-type MERGE_ON_READ \
  --target-base-path targetpath --target-table testhudi \
  --hoodie-conf hoodie.datasource.write.recordkey.field=id \
  --hoodie-conf hoodie.deltastreamer.source.dfs.root=sourcepath \
  --hoodie-conf hoodie.datasource.write.partitionpath.field='org.apache.hudi.keygen.NonpartitionedKeyGenerator' \
  --continuous --min-sync-interval-seconds 180
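
For context on the last two flags: as far as I understand continuous mode, a single long-lived Spark application loops over sync rounds and spark-submit itself is not launched again. The minimum sync interval would then only pad rounds that finish early. The sketch below illustrates that pacing idea in plain shell, under that assumption. It is not Hudi code, and `remaining_sleep_seconds`, `run_sync_round`, and `continuous_loop` are hypothetical names made up for illustration.

```shell
# Illustrative pacing sketch only; none of these functions exist in Hudi.

# Print how long to wait after a round that took $2 seconds,
# given a minimum sync interval of $1 seconds (never negative).
remaining_sleep_seconds() {
  remaining=$(( $1 - $2 ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  echo "$remaining"
}

# Hypothetical stand-in for one ingest round (does nothing here).
run_sync_round() { :; }

# Run $2 rounds inside one process, keeping at least $1 seconds
# between the start of consecutive rounds.
continuous_loop() {
  min_interval=$1
  rounds=$2
  i=0
  while [ "$i" -lt "$rounds" ]; do
    start=$(date +%s)
    run_sync_round
    elapsed=$(( $(date +%s) - start ))
    sleep "$(remaining_sleep_seconds "$min_interval" "$elapsed")"
    i=$(( i + 1 ))
  done
}
```

Under this reading, with a 180-second interval a round that takes 40 seconds is followed by a 140-second wait, while a round that takes longer than 180 seconds is followed immediately by the next one.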

Environment Description

  • Hudi version : 0.5.2

  • Spark version : 2.4.7

  • Storage (HDFS/S3/GCS..) : s3

  • Running on Docker? (yes/no) : spark cli command

    Labels

    priority:medium (Moderate impact; usability gaps)