Skip to content

[SPARK-52663][SDP] Introduce name field to pipeline spec #51353

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

sryza
Copy link
Contributor

@sryza sryza commented Jul 3, 2025

What changes were proposed in this pull request?

The Declarative Pipelines SPIP included a "name" field in the pipeline spec, but we left that out in the earlier implementation. This adds it in.

The name field is required. This matches behavior for similar systems, like dbt.

Why are the changes needed?

See above.

Does this PR introduce any user-facing change?

Yes, but only to unreleased code.

How was this patch tested?

Updated existing tests, and added tests for proper error when the name is missing.

Was this patch authored or co-authored using generative AI tooling?

@gengliangwang
Copy link
Member

@sryza shall we update the programming guide as well?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the above comment about documentations.

+1 for the code too.

@github-actions github-actions bot added the DOCS label Jul 5, 2025
@sryza
Copy link
Contributor Author

sryza commented Jul 5, 2025

👍 thanks for the review @HyukjinKwon @gengliangwang and @dongjoon-hyun – I updated the programming guide to include the name property in the pipeline spec example, and I also filed a JIRA to track adding explanations of each of the properties: https://issues.apache.org/jira/browse/SPARK-52690.

@@ -99,7 +100,7 @@ The `spark-pipelines` command line interface (CLI) is the primary way to execute

### `spark-pipelines init`

`spark-pipelines init` generates a simple pipeline project, including a spec file and example definitions.
`spark-pipelines init --name my_pipeline` generates a simple pipeline project, inside a directory named "my_pipeline", including a spec file and example definitions.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the --name option was already required here, but wasn't documented as such.

@gengliangwang
Copy link
Member

Thanks, merging to master!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants