Rules/Request: Airflow DAGs checks #4421

Open
brucearctor opened this issue May 13, 2023 · 10 comments
Labels
accepted (Ready for implementation), plugin (Implementing a known but unsupported plugin)

Comments

@brucearctor
Contributor

Many things can go wrong with an Airflow DAG, for example:

  • Duplicate DAG name
  • Duplicate Task ID
  • Duplicate Task Dependency
  • Task without DAG
  • Cycles (a DAG should be acyclic)
  • etc.

It seems we might want new rules to detect these problems.
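
To make two of these checks concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not how Ruff (which is written in Rust) or pylint-airflow implements anything: the function names `find_duplicate_task_ids` and `has_cycle` are hypothetical, only literal `task_id="..."` keyword arguments are considered, and the dependency edges are assumed to have been extracted already.

```python
import ast
from collections import Counter
from graphlib import CycleError, TopologicalSorter


def find_duplicate_task_ids(source):
    """Return task_id string literals that appear more than once in `source`.

    Simplification (assumption): every call with a literal `task_id=` keyword
    is counted, regardless of which DAG the task belongs to.
    """
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "task_id"
                and isinstance(kw.value, ast.Constant)
                and isinstance(kw.value.value, str)
            ):
                counts[kw.value.value] += 1
    return sorted(task_id for task_id, n in counts.items() if n > 1)


def has_cycle(edges):
    """Return True if the (upstream, downstream) pairs contain a cycle."""
    graph = {}
    for upstream, downstream in edges:
        # TopologicalSorter expects a mapping of node -> its predecessors.
        graph.setdefault(downstream, set()).add(upstream)
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False


# The source is only parsed, never executed, so Airflow need not be installed.
EXAMPLE = '''
with DAG(dag_id="example"):
    a = EmptyOperator(task_id="extract")
    b = EmptyOperator(task_id="load")
    c = EmptyOperator(task_id="extract")
'''

print(find_duplicate_task_ids(EXAMPLE))
print(has_cycle([("extract", "load"), ("load", "extract")]))
```

A real rule would also need to scope task IDs per DAG and resolve dependencies declared with `>>` or `set_upstream`, which this sketch deliberately leaves out.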

@qdegraaf
Contributor

The rules defined in https://github.com/BasPH/pylint-airflow seem to cover all those cases plus a bit more. Would a port of that Pylint plugin be a good start?

@charliermarsh
Member

I'd had some discussion with @jlaneve on Twitter about this! I'd like to support Airflow-specific rules, but we need guidance on what those rules should contain.

@jlaneve -- any further thoughts here? Is the pylint-airflow plugin a good starting point?

@charliermarsh charliermarsh added the plugin Implementing a known but unsupported plugin label May 17, 2023
@jlaneve
Contributor

jlaneve commented May 17, 2023

Bas' pylint-airflow is definitely a good start, but worth noting it was designed for Airflow 1.x and we should aim to get ruff supporting Airflow 2.x, so there are likely a few changes we'll want to make. Maybe I can open a draft PR with a few of the easy rules (no duplicate DAG names, no empty DAGs, etc) to get something going, and then we can open issues for specific rules afterwards?

@charliermarsh - what do you think?

@charliermarsh
Member

@jlaneve - Yeah, that's perfect.

@brucearctor
Contributor Author

brucearctor commented May 17, 2023

Exactly. I had seen pylint-airflow, which covers many of the ideal rules. Airflow 2.x is the version to target, and I imagine ignoring 1.x entirely (for ruff) is OK.

@fritz-astronomer

https://docs.astronomer.io/learn/dag-best-practices
https://airflow.apache.org/docs/apache-airflow/stable/faq.html
https://airflow.apache.org/docs/apache-airflow/stable/best-practices.html

Some opinionated options (which could certainly be contentious):

  • not mixing the with DAG(...) context manager and the @dag decorator in the same repo
  • DAGs having owners
  • remove DAG as dag if airflow>X.Y
  • remove globals()[dag_id] = dag if airflow>X.Y
  • Some way to check for top-level code? That'd probably be really tricky
    • avoid top-level Variable.get
  • dag_id is the same as the filename
  • start_date is a static datetime
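
As a rough illustration of how two of these options could be checked statically, here is a minimal Python sketch built on the standard `ast` module. The helper names `top_level_variable_get_lines` and `dag_ids_not_matching_filename` are hypothetical, and the matching is intentionally naive: it assumes the names `Variable` and `DAG` are used unaliased and that `dag_id` is passed as a literal keyword argument.

```python
import ast
from pathlib import Path


def top_level_variable_get_lines(source):
    """Line numbers of `Variable.get(...)` calls that run at import time.

    Function and lambda bodies are skipped because they only run when called.
    Simplification: decorators and default arguments are skipped too, even
    though they are evaluated at import time.
    """
    lines = []

    def visit(node):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
                continue
            if (
                isinstance(child, ast.Call)
                and isinstance(child.func, ast.Attribute)
                and child.func.attr == "get"
                and isinstance(child.func.value, ast.Name)
                and child.func.value.id == "Variable"
            ):
                lines.append(child.lineno)
            visit(child)

    visit(ast.parse(source))
    return sorted(lines)


def dag_ids_not_matching_filename(source, filename):
    """Return literal dag_id values that differ from the file's stem."""
    expected = Path(filename).stem
    mismatches = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id != "DAG":
            continue
        for kw in node.keywords:
            if (
                kw.arg == "dag_id"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value != expected
            ):
                mismatches.append(kw.value.value)
    return mismatches


# The source is only parsed, never executed, so Airflow need not be installed.
EXAMPLE = '''
from airflow import DAG
from airflow.models import Variable

bucket = Variable.get("bucket")  # evaluated on every parse of the DAG file

def load():
    return Variable.get("bucket")  # evaluated only when the task runs

with DAG(dag_id="daily_report"):
    pass
'''

print(top_level_variable_get_lines(EXAMPLE))
print(dag_ids_not_matching_filename(EXAMPLE, "dags/etl.py"))
```

Skipping function bodies is the design choice that separates parse-time code from run-time code here; a production rule would need proper import resolution to handle aliases such as `from airflow.models import Variable as V`.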

@fritz-astronomer

Some Astronomer folks might be able to help development :)

@daniel-bartley

Is there a roadmap timeline for this?

@usethia

usethia commented Apr 3, 2024

What is the essential difference between ruff check and ruff lint?
Running ruff lint gives this error, which I can't get resolved:
lint:1:1: E902 No such file or directory (os error 2)
Any reason?

@MichaReiser
Member

MichaReiser commented Apr 4, 2024

@usethia your question seems unrelated to this issue. I'll reply anyway but please open a new issue if you need more help to avoid sidetracking this issue.

ruff lint isn't a Ruff command. Today, ruff <path> is an alias for ruff check <path>, so the command ruff lint is the same as ruff check lint, but you don't have a file or directory called lint in your project. We're about to remove the alias in the next minor version of Ruff because it can be confusing (as in your case).

To lint your project, run ruff check.
