[ISSUE] databricks sdk jobs. how to create dependency task /lineage using python #504

Open
shivatharun opened this issue Jan 10, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

shivatharun commented Jan 10, 2024

How can I create dependent jobs / lineage using the Databricks SDK?
I only found documentation for creating a single job:

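import time
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()
# cluster_id is assumed to hold the ID of an existing cluster
created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[
                                jobs.Task(description="test",
                                          existing_cluster_id=cluster_id,
                                          notebook_task=jobs.NotebookTask(notebook_path="test_run"),
                                          task_key="test",
                                          timeout_seconds=0)
                            ])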

Let's say I have a main notebook, and within it I create a job named test and pass it the "test_run" notebook to trigger. I want to run the test_run notebook with different parameters. How do I create lineage using the Python SDK?
Could you please share any references? I couldn't find any.

@tanmay-db
Contributor

Hi @shivatharun, lineage isn't currently supported in the SDK. However, you could update the job with different parameters, for example: https://github.com/databricks/databricks-sdk-py/blob/main/examples/jobs/update_jobs_api_full_integration.py, where you could pass a different JobSettings. Does this work for your use case?
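
If "lineage" here means ordering between tasks inside one job, a rough sketch of one possible approach follows: create one task per parameter set for the same notebook and chain them with `depends_on`. The helper names `build_task_specs` and `to_sdk_tasks`, the `test_run_<i>` task keys, and the `env` parameter are hypothetical and only for illustration. The conversion step assumes the `databricks-sdk` package is installed and relies on `jobs.TaskDependency` and the `base_parameters` argument of `jobs.NotebookTask`.

```python
def build_task_specs(notebook_path, param_sets, sequential=True):
    """Describe one task per parameter set as plain dicts.

    With sequential=True each task depends on the previous one;
    with sequential=False no task has an upstream dependency.
    """
    specs = []
    for i, params in enumerate(param_sets):
        upstream = [specs[-1]["task_key"]] if sequential and specs else []
        specs.append({
            "task_key": f"test_run_{i}",  # hypothetical naming scheme
            "notebook_path": notebook_path,
            "base_parameters": params,
            "depends_on": upstream,
        })
    return specs


def to_sdk_tasks(specs, cluster_id):
    """Convert the dicts into SDK task objects (requires databricks-sdk)."""
    from databricks.sdk.service import jobs

    return [
        jobs.Task(
            task_key=s["task_key"],
            existing_cluster_id=cluster_id,
            notebook_task=jobs.NotebookTask(
                notebook_path=s["notebook_path"],
                base_parameters=s["base_parameters"],
            ),
            depends_on=[jobs.TaskDependency(task_key=k) for k in s["depends_on"]],
        )
        for s in specs
    ]


# Hypothetical parameter sets for the same notebook.
specs = build_task_specs("test_run", [{"env": "dev"}, {"env": "prod"}])
for s in specs:
    print(s["task_key"], "depends on", s["depends_on"])
```

The resulting list could then be passed as `tasks=to_sdk_tasks(specs, cluster_id)` to `w.jobs.create(...)`, with `w` and `cluster_id` defined as in the original snippet.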

tanmay-db added the enhancement (New feature or request) label on Jan 15, 2024
@shivatharun
Author

Hi @tanmay-db - may I know how tasks can run in parallel within a job, without any dependencies? Is there a limit on the number of tasks?

created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[task1, task2, task3, ........])
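
To make the question concrete, here is a small local sketch of how `depends_on` shapes which tasks become ready together. It is only a model built on the standard-library `graphlib` module (Python 3.9+), not the Databricks scheduler, and the `execution_waves` helper and the dict layout are assumptions for illustration. It says nothing about any limit on the number of tasks, which should be checked in the Jobs documentation.

```python
from graphlib import TopologicalSorter


def execution_waves(tasks):
    """Group task keys into waves.

    Every task in a wave has all of its dependencies in earlier waves,
    so tasks in the same wave do not wait on each other.
    """
    graph = {t["task_key"]: set(t.get("depends_on", [])) for t in tasks}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# No depends_on anywhere: all three tasks are ready at once.
independent = [{"task_key": "task1"}, {"task_key": "task2"}, {"task_key": "task3"}]

# task2 and task3 both wait for task1, then are ready together.
fan_out = [
    {"task_key": "task1"},
    {"task_key": "task2", "depends_on": ["task1"]},
    {"task_key": "task3", "depends_on": ["task1"]},
]

print(execution_waves(independent))  # [['task1', 'task2', 'task3']]
print(execution_waves(fan_out))      # [['task1'], ['task2', 'task3']]
```

In this model, leaving `depends_on` unset on every task, as in the `tasks=[task1, task2, task3, ........]` call above, puts all tasks in a single wave. How many actually run at the same moment would still depend on the cluster and workspace configuration.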
