The grid view shows incorrect DAG runs #26505
Comments
Hi there! Following up from the dev mailing list. We hit exactly the same issue for all our DAGs after upgrading from 2.3.4 to 2.4.0. All of them have a long history, going back to 2016, so there are a lot of DAG runs. Somehow the grid view shows runs from 2021.11 as the latest, while the other views (calendar, graph) show all of them, including the most recent ones. The exact fix which solved this problem is:
It feels like there is a mismatch between the data and the logic that operates on it, but I'm not sure where or what it is. |
@uranusjr Can you take a look at this ASAP please so we can get a fix for this out in 2.4.1? |
@kxepal What is the timetable/schedule parameter for the DAG(s) with this problem? |
@ashb we have good old timedelta schedule interval value everywhere. No timetables at all. |
@zachliu That PR contains a lot of changes, but for us it was enough to revert only the changes shown in the diff. The rest of them cause no trouble. |
@kxepal taking a wild guess here, maybe a |
@zachliu probably. We don't use timetables, so it was a clear fix for us. |
@kxepal same here, i'm waiting for the distant future "calendar"-like scheduling system 😉 |
Everything is converted to a timetable now under the hood (and had been since timetables were introduced) so it's not as simple as an |
aww... wrong guess 😿 it was based on the fact that some old dag runs don't have the |
@ashb What information could be useful to debug this issue? |
@kxepal Creating a minimal stand-alone reproduction case if you can ("here's a dag, run it/backfill it for X" sort of thing) |
I tried that. Running the same DAG many times in a short period (backfilling) doesn't help; the runs are ordered properly, I guess because they are all converted to timetables uniformly. It seems the issue only occurs on DAGs that are older than the introduction of timetables. It's basically

    ordering = (DagRun.__table__.columns[name].desc() for name in dag.timetable.run_ordering)
    dag_runs = query.order_by(*ordering, DagRun.id.desc()).limit(num_runs).all()

    SELECT dag_id,
           run_id,
           data_interval_end
    FROM dag_run
    WHERE dag_id = '<dag_name>'
    ORDER BY data_interval_end DESC,
             execution_date DESC
    LIMIT 25;

vs

    dag_runs = query.order_by(DagRun.execution_date.desc()).limit(num_runs).all()

    SELECT dag_id,
           run_id,
           data_interval_end
    FROM dag_run
    WHERE dag_id = '<dag_name>'
    ORDER BY execution_date DESC
    LIMIT 25;
|
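One way to see how the two orderings above could diverge is the following sketch. It is an illustration under assumptions, not a confirmed diagnosis: it assumes that runs created before data intervals existed have `data_interval_end` set to NULL, and that the metadata database sorts NULLs first under `DESC` (PostgreSQL's default behaviour). SQLite sorts NULLs last under `DESC`, so the sketch emulates the first behaviour with an explicit `IS NULL` sort key. The table is a toy with only the columns used in the queries above, and the DAG name, run IDs, and dates are made up.

```python
import sqlite3


def build_toy_dag_run_table():
    """Create an in-memory toy dag_run table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run ("
        " id INTEGER PRIMARY KEY,"
        " dag_id TEXT,"
        " run_id TEXT,"
        " execution_date TEXT,"
        " data_interval_end TEXT)"
    )
    rows = [
        # Assumed: old runs have no data interval recorded.
        ("demo", "old_2016", "2016-01-01", None),
        ("demo", "old_2021", "2021-11-01", None),
        # Newer runs carry a data interval end.
        ("demo", "new_2022_a", "2022-09-18", "2022-09-19"),
        ("demo", "new_2022_b", "2022-09-19", "2022-09-20"),
    ]
    conn.executemany(
        "INSERT INTO dag_run (dag_id, run_id, execution_date, data_interval_end)"
        " VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def latest_by_data_interval_end(conn, limit):
    """Order by data_interval_end DESC, with NULLs sorted first (emulated)."""
    sql = (
        "SELECT run_id FROM dag_run WHERE dag_id = 'demo' "
        "ORDER BY (data_interval_end IS NULL) DESC, "
        "         data_interval_end DESC, "
        "         execution_date DESC "
        "LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (limit,))]


def latest_by_execution_date(conn, limit):
    """Order by execution_date DESC only."""
    sql = (
        "SELECT run_id FROM dag_run WHERE dag_id = 'demo' "
        "ORDER BY execution_date DESC "
        "LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (limit,))]


if __name__ == "__main__":
    conn = build_toy_dag_run_table()
    print(latest_by_data_interval_end(conn, 2))
    print(latest_by_execution_date(conn, 2))
```

Under these assumptions the first query fills its `LIMIT` with the old NULL-interval runs before any recent run appears, while the second returns the two most recent runs. With a mix of NULL and non-NULL rows and a larger limit, old and new runs would appear together.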
Obvious, but it would be quite hard to provide. It feels like @zachliu already walked this path with no luck. Still, it looks like the problem could be solved by a simple new migration script. But which cases should it handle, and how should it fix them? |
can the migration script simply fill all the NULL |
But why doesn't it already? It doesn't feel like a problem with the current logic, which works fine, but rather with something in the changes between the two versions. Actually, running new DAG runs on 2.4 without the fix doesn't change anything. |
Filling all data interval with execution date would be excruciatingly slow, which is why it’s not done. Doing some fancy |
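One possible way to avoid rewriting old rows is to fall back to `execution_date` at query time whenever `data_interval_end` is NULL. The sketch below only illustrates that idea with SQL `COALESCE` on a toy SQLite table. It is not taken from the PR mentioned in this thread, and the table layout, DAG name, run IDs, and dates are all made up for the example.

```python
import sqlite3


def make_runs_table():
    """Create an in-memory toy dag_run table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run ("
        " id INTEGER PRIMARY KEY,"
        " dag_id TEXT,"
        " run_id TEXT,"
        " execution_date TEXT,"
        " data_interval_end TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag_run (dag_id, run_id, execution_date, data_interval_end)"
        " VALUES (?, ?, ?, ?)",
        [
            # Assumed: old runs have a NULL data interval.
            ("demo", "old_2016", "2016-01-01", None),
            ("demo", "old_2021", "2021-11-01", None),
            ("demo", "new_2022_a", "2022-09-18", "2022-09-19"),
            ("demo", "new_2022_b", "2022-09-19", "2022-09-20"),
        ],
    )
    return conn


def latest_with_fallback(conn, limit):
    """Sort by data_interval_end, using execution_date where it is NULL."""
    sql = (
        "SELECT run_id FROM dag_run WHERE dag_id = 'demo' "
        "ORDER BY COALESCE(data_interval_end, execution_date) DESC, "
        "         id DESC "
        "LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (limit,))]


if __name__ == "__main__":
    print(latest_with_fallback(make_runs_table(), 2))
```

Because the sort key is never NULL, the result does not depend on how a given database orders NULLs, and no existing rows need to be updated. Whether such an expression can use an index efficiently on a large `dag_run` table is a separate question that this sketch does not address.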
I created a PR for this. Would be awesome if someone could test it! |
Fixed by #26626 |
Apache Airflow version
2.4.0
What happened
Airflow 2.4.0's grid view has an issue:
The grid view stopped showing the latest DAG runs in 2.4.0 (regardless of schedule, it shows old DAG runs)
case 1 (all 25 are old dag runs)
![2022-09-19_16-12](https://user-images.githubusercontent.com/14293802/191109057-53522453-907d-4476-b7ef-cc40a9818422.png)
case 2 (old and new dag runs are mixed)
![2022-09-20_17-59](https://user-images.githubusercontent.com/14293802/191372514-e0d87eb9-92be-4c89-bae4-d4361a5f582b.png)
![2022-09-20_17-50](https://user-images.githubusercontent.com/14293802/191371326-b1a81a54-d1f3-4f55-b668-6ad563a73102.png)
What you think should happen instead
No response
How to reproduce
Operating System
Linux Mint 20.3 Una
Versions of Apache Airflow Providers
No response
Deployment
Other Docker-based deployment
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct