-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-18048: [Dev][Archery][Crossbow] Comment bot waits for a while before generate a report #14412
Conversation
…efore generate a report If we generate a report immediately after we submit CI tasks, the generated report doesn't have suitable task CI task URL. Because CI tasks aren't started. We need to wait for a while before we generate a report to collect suitable CI task URLs. Before: apache#14409 (comment) https://github.com/ursacomputing/crossbow/tree/actions-1e49219a52-github-r-binary-packages is used. After: apache#14409 (comment) https://github.com/ursacomputing/crossbow/actions/runs/3248297802/jobs/5329340988 is used.
|
@raulcd What do you think about this (heuristic) approach? |
dev/archery/archery/bot.py
Outdated
@@ -269,6 +272,10 @@ def submit(obj, tasks, groups, params, arrow_version): | |||
queue.put(job, prefix="actions", increment_job_id=False) | |||
queue.push() | |||
|
|||
# # wait for tasks of the job are triggered to collect more | |||
# # suitable task URLs | |||
time.sleep(wait) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with the heuristic implementation but I think I would prefer this to be implemented inside the CommentReport
and the task_url
itself as I can see us requiring to wait for the job to trigger on other places. I had something like this patch in mind:
diff --git a/dev/archery/archery/bot.py b/dev/archery/archery/bot.py
index c548e9a..8b02dc7 100644
--- a/dev/archery/archery/bot.py
+++ b/dev/archery/archery/bot.py
@@ -270,7 +270,7 @@ def submit(obj, tasks, groups, params, arrow_version):
queue.push()
# render the response comment's content
- report = CommentReport(job, crossbow_repo=crossbow_repo)
+ report = CommentReport(job, crossbow_repo=crossbow_repo, wait_for_task=wait)
# send the response
pull_request.create_issue_comment(report.show())
diff --git a/dev/archery/archery/crossbow/reports.py b/dev/archery/archery/crossbow/reports.py
index a3958d8..d887574 100644
--- a/dev/archery/archery/crossbow/reports.py
+++ b/dev/archery/archery/crossbow/reports.py
@@ -20,6 +20,7 @@ import csv
import operator
import fnmatch
import functools
+import time
import click
import requests
@@ -41,7 +42,7 @@ class Report:
"arrow_commit",
]
- def __init__(self, job, task_filters=None):
+ def __init__(self, job, task_filters=None, wait_for_task=None):
self.job = job
tasks = sorted(job.tasks.items())
@@ -53,6 +54,7 @@ class Report:
tasks = [(name, task) for name, task in tasks if name in filtered]
self._tasks = dict(tasks)
+ self._wait_for_task = wait_for_task
@property
def repo_url(self):
@@ -66,6 +68,8 @@ class Report:
return '{}/tree/{}'.format(self.repo_url, branch)
def task_url(self, task):
+ if self._wait_for_task:
+ time.wait(self._wait_for_task)
if task.status().build_links:
# show link to the actual build, some CI providers implement
# the statuses API others implement the checks API, retrieve any.
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I changed to use the approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... It seems that the approach may do N wait
sleeps when we run N tasks. For example, crossbow submit -g wheel
will sleep 9 times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are correct, sorry about that, we probably should only wait if there are no links to the individual task. In that case I would expect the links to be available and the wait to be only for the first one:
if not task.status().build_links and self._wait_for_task:
time.sleep(self.__wait_for_task)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry this won't work because if present it would be a string. I might take a look tomorrow if is still not solved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kou !
Benchmark runs are scheduled for baseline = fc01a9c and contender = d1a8f4b. d1a8f4b is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
If we generate a report immediately after we submit CI tasks, the generated report doesn't have suitable task CI task URL. Because CI tasks aren't started. We need to wait for a while before we generate a report to collect suitable CI task URLs.
Before:
#14409 (comment)
https://github.com/ursacomputing/crossbow/tree/actions-1e49219a52-github-r-binary-packages is used.
After:
#14409 (comment)
https://github.com/ursacomputing/crossbow/actions/runs/3248297802/jobs/5329340988 is used.