Fetch log file from multiple worker #21230
You need to rebase
@potiuk Done
This is a very wrong solution.
Yeah. It will work fine for 3 hosts and a small number of task instances, but what if you have 50 hostnames and millions of task instances?
This is an extremely "brute-force" solution to the problem. The query below has terrible characteristics: it basically performs a full table scan over the potentially HUGE TaskInstance table in order to retrieve the unique hostnames used during the last two days (!), without any index on hostname. This will kill the database and will run for minutes on a huge deployment.
    hosts = (
        session.query(TaskInstance.hostname)
        .filter(
            TaskInstance.hostname != '',
            TaskInstance.end_date > (timezone.utcnow() - LOG_SEARCH_INTERVAL),
        )
        .distinct()
    )
And it seems that fixing it "properly", i.e. storing a list of the hosts that different retries of the same task instance used, should be relatively easy.
Simply make the "hostname" field accept either a single hostname or an array of hostnames (comma-separated, for example). This should be a rather easy change. You can also easily make it round-robin if the total array length gets too long (1000 characters is the current limit), or we could even increase the limit of the hostname field if we are worried about it. It is not part of any index, nor is it searched on, so it could even be unlimited.
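To make the suggestion concrete, here is a minimal sketch of how a comma-separated hostname value could be maintained. The helper names `append_hostname` and `parse_hostnames` are hypothetical and not part of Airflow, and "round-robin" is interpreted here as dropping the oldest hosts once the value would exceed the limit; that interpretation is an assumption.

```python
from typing import List

# Assumption: mirrors the 1000-character limit mentioned in the comment above.
MAX_HOSTNAME_LENGTH = 1000


def parse_hostnames(stored: str) -> List[str]:
    """Split the stored comma-separated value into hostnames, oldest first."""
    return [host for host in stored.split(",") if host]


def append_hostname(stored: str, new_host: str, max_length: int = MAX_HOSTNAME_LENGTH) -> str:
    """Return the stored value with new_host recorded as the most recent host."""
    # If the host is already known, move it to the end instead of storing it twice.
    hosts = [host for host in parse_hostnames(stored) if host != new_host]
    hosts.append(new_host)
    # Drop the oldest hosts until the serialized value fits into the field.
    while len(hosts) > 1 and len(",".join(hosts)) > max_length:
        hosts.pop(0)
    return ",".join(hosts)
```

A plain single hostname is still a valid value in this scheme, so existing rows would not need to be rewritten for the parsing side to work.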
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
closes: #16472
A robust solution requires TI try history. I'm not sure it's worth it. So this PR is a workaround: we try to find an appropriate host among the hosts of all task instances. Some hosts can disappear from the TI table, but I think that is a rare case.
The solution was tested on a 3-host cluster.
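The idea of the workaround can be sketched roughly as follows: given a list of candidate hosts, try each one until the log is found. This is only an illustration, not the code in this PR; `fetch_log_from_hosts` and the `fetch` callable are hypothetical names, and the assumption is that a failed fetch either raises `OSError` or returns `None`.

```python
from typing import Callable, Iterable, Optional, Tuple


def fetch_log_from_hosts(
    hosts: Iterable[str],
    fetch: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (host, log) for the first host that serves the log, or None."""
    for host in hosts:
        try:
            log = fetch(host)
        except OSError:
            # Host unreachable or gone; try the next candidate.
            continue
        if log is not None:
            return host, log
    return None
```

The cost of this approach grows with the number of candidate hosts, since in the worst case every host is contacted before the log is found or the search gives up.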