Skip to content

Conversation

@dheerajkumar-solanki
Copy link
Contributor

Fixing the DatabricksNotebookOperator invalid dependency graph issue, this comes when dependencies are defined in DatabricksWorkflowTaskGroup. Issue is child task DatabricksNotebookOperator and DatabricksTaskOperator task referencing itself. Issue in _generate_databricks_task_key function which is incorrectly generating depends_on task value.

closes: #47983


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@boring-cyborg
Copy link

boring-cyborg bot commented Mar 28, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@jroachgolf84
Copy link
Collaborator

@dheerajkumar-solanki, nice stuff! I was going to approach this in a bit of a different way, but I like what you did. Question; were you able to run that DAG that you originally had posted?

@jroachgolf84
Copy link
Collaborator

There were a few other places where I think we might need to make changes.

@dheerajkumar-solanki
Copy link
Contributor Author

There were a few other places where I think we might need to make changes.

Which places, can you provide some context, so I can update the PR.

@dheerajkumar-solanki
Copy link
Contributor Author

@dheerajkumar-solanki, nice stuff! I was going to approach this in a bit of a different way, but I like what you did. Question; were you able to run that DAG that you originally had posted?

Yes, I was able to run the failed DAG.

@ghost
Copy link

ghost commented Mar 31, 2025

This was one of the places I believe would need a change (and was part of the original PR that we believe may have been "breaking").

https://github.com/apache/airflow/blob/3dbed17b0ef52b3a46483219feb4b311251990eb/providers/databricks/src/airflow/providers/databricks/plugins/databricks_workflow.py#L321C1-L337C115

@jroachgolf84
Copy link
Collaborator

@dheerajkumar-solanki, how are things going on your side? Have those unit-tests been run?

@dheerajkumar-solanki
Copy link
Contributor Author

This was one of the places I believe would need a change (and was part of the original PR that we believe may have been "breaking").

https://github.com/apache/airflow/blob/3dbed17b0ef52b3a46483219feb4b311251990eb/providers/databricks/src/airflow/providers/databricks/plugins/databricks_workflow.py#L321C1-L337C115

I checked the code and previous MR, there is no change required as we have not change the function definition of databricks_task_key.

All the databricks unit test files are running fine.

@dheerajkumar-solanki
Copy link
Contributor Author

dheerajkumar-solanki commented Apr 6, 2025

@pankajkoti @jroachgolf84 One of the tests pipeline is failing Tests / CI image checks / MyPy checks (mypy-providers) (pull_request), doesn't look like from the changes in this PR. Is it just due to my branch is not up to date with base branch or any other issue?

Also, there is another MR on this same issue (#48623), I like his solution but I think some pieces are missing in the MR. what would be next step in this MR now?

@potiuk
Copy link
Member

potiuk commented Apr 9, 2025

rebase first and make it green

image

@dheerajkumar-solanki
Copy link
Contributor Author

rebase first and make it green

image

@potiuk rebase is done, waiting for workflow approval to run the test again.

@potiuk potiuk merged commit 9dcce2f into apache:main Apr 13, 2025
60 checks passed
@boring-cyborg
Copy link

boring-cyborg bot commented Apr 13, 2025

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@potiuk
Copy link
Member

potiuk commented Apr 13, 2025

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DatabricksNotebookOperator generating invalid dependency graph

3 participants