DataFusion: support stopping a specific pipeline run (CDAP Stop a Program Run) in DataFusionHook / Stop operator #61224

@shahar1

Description

The Google provider’s Data Fusion integration can start a pipeline and return a run_id (called “pipeline_id” in the Airflow operators), but its stop functionality only stops the program, not a specific run. CDAP/Data Fusion supports stopping a specific run via “Stop a Program Run”, and Airflow should expose that so a stop request does not hit an arbitrary run when multiple runs are active.

Use case/motivation

In CDAP, workflows can have multiple concurrent runs. The current “Stop a Program” endpoint:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/stop

“…will stop one of the runs, but not all of the runs.” (CDAP Lifecycle Microservices docs)

Airflow already tracks a specific runId returned by DataFusionStartPipelineOperator (via XCom / returned value). Users need to stop that specific run deterministically, e.g. on DAG cancellation, failure cleanup, or manual stop workflows.

CDAP provides a precise endpoint:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/runs/<run-id>/stop

Airflow should support calling that endpoint when a runId is available.
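
To illustrate the difference between the two endpoints, here is a minimal sketch of a path builder that uses the run-specific endpoint when a runId is available and falls back to the program-level one otherwise. The function name `build_stop_path` and its parameters are hypothetical and are not part of the existing DataFusionHook API. The example values (`default`, `workflows`, `DataPipelineWorkflow`) are only illustrative.

```python
from typing import Optional
from urllib.parse import quote


def build_stop_path(
    namespace: str,
    app_id: str,
    program_type: str,
    program_id: str,
    run_id: Optional[str] = None,
) -> str:
    """Build the CDAP lifecycle path for a stop request (hypothetical helper).

    With a non-empty run_id the path targets "Stop a Program Run";
    otherwise it targets the program-level "Stop a Program" endpoint.
    """
    segments = ["v3", "namespaces", namespace, "apps", app_id, program_type, program_id]
    if run_id:
        # Run-specific stop: .../<program-id>/runs/<run-id>/stop
        segments += ["runs", run_id]
    segments.append("stop")
    # Percent-encode each segment so special characters cannot alter the path.
    return "/" + "/".join(quote(segment, safe="") for segment in segments)


# Program-level stop (current behaviour described above).
program_path = build_stop_path("default", "my_pipeline", "workflows", "DataPipelineWorkflow")

# Run-specific stop, using a made-up run id for illustration.
run_path = build_stop_path(
    "default", "my_pipeline", "workflows", "DataPipelineWorkflow", run_id="abc-123"
)
```

The resulting path would be appended to the instance's API endpoint and sent as a POST request. Keeping `run_id` optional would leave existing callers that only stop the program unaffected.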

Related issues

#60688

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Disclaimer: This issue was generated by GPT 5.2, under my supervision.
