Running Apache Beam pipeline on Azure Databricks #21572

@damccorm

I'm trying to build a simple streaming app with Apache Beam that reads data from one Azure Event Hub and produces messages to another Azure Event Hub.

I'm creating and running Spark jobs on Azure Databricks.
The problem is that the consumer (which uses SparkRunner) is not receiving any messages from the Event Hub (topic). There is no activity and there are no errors on the Spark cluster.
I tried consuming Event Hub messages without Apache Beam on the same cluster, and that works without any issues. I am also able to produce messages from the same cluster using Apache Beam's Kafka IO.

I'm not sure whether this is an issue in Kafka IO or in the Spark runner. Could anyone help with this?
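
For reference, here is a minimal sketch of the consumer settings that a Kafka client typically needs when reading from the Event Hubs Kafka-compatible endpoint. The report does not include its actual configuration, so the class name `EventHubsKafkaConfig`, both helper methods, and all placeholder values are hypothetical. In a Beam pipeline, a map like this would presumably be passed to `KafkaIO.read()` through `withConsumerConfigUpdates`, with the endpoint passed through `withBootstrapServers`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original report.
public class EventHubsKafkaConfig {

    // Event Hubs exposes its Kafka-compatible endpoint on port 9093 of the namespace host.
    static String bootstrapServers(String namespace) {
        return namespace + ".servicebus.windows.net:9093";
    }

    // Builds the SASL/SSL consumer properties for the Event Hubs Kafka endpoint.
    // The connection string and group id are placeholders supplied by the caller.
    static Map<String, Object> consumerConfig(String connectionString, String groupId) {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("security.protocol", "SASL_SSL");
        config.put("sasl.mechanism", "PLAIN");
        // The user name is the literal string "$ConnectionString";
        // the password is the Event Hubs connection string itself.
        config.put(
            "sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"$ConnectionString\" "
                + "password=\"" + connectionString + "\";");
        config.put("group.id", groupId);
        return config;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(bootstrapServers("my-namespace"));
        consumerConfig("<connection-string>", "beam-consumer")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Comparing these values between the working non-Beam consumer and the Beam consumer on the same cluster may help narrow down whether the difference lies in the client configuration or in the runner.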

Imported from Jira BEAM-14120. Original Jira may contain additional context.
Reported by: mdumanoj.
