DM-42528: Fix reporting request status #65

Merged 2 commits on Jan 19, 2024
1 change: 1 addition & 0 deletions doc/changes/DM-42528.bugfix.rst
@@ -0,0 +1 @@
Fix the report function used when checking request status.
44 changes: 24 additions & 20 deletions python/lsst/ctrl/bps/panda/panda_service.py
@@ -179,12 +179,22 @@
WmsStates.PRUNED: "output_missing_files",
}

# workflow status to report as SUCCEEDED
wf_status = ["Finished", "SubFinished", "Transforming"]

wf_succeed = False

tasks.sort(key=lambda x: x["transform_workload_id"])
workflow_status = head["status"]["attributes"]["_name_"]

if workflow_status in ["Finished", "SubFinished"]:
Contributor

What does SubFinished mean, specifically? I'd like to understand why it is treated as succeeded.

Contributor

Does this give the correct status for when the final job hasn't shown up yet in PanDA tasks?

Contributor Author

Yes. It is the workflow status in iDDS. The iDDS result carries two kinds of status: (1) 'status', the workflow/request status, which accounts for the last task even if it has not been created yet; (2) 'transform_status', the per-task status.

WmsStates has no state that distinguishes 'SubFinished'. That's why I mapped it to SUCCEEDED.
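
As a rough illustration of the mapping discussed in this thread, the sketch below expresses the workflow-status branches as a lookup table. `WmsStates` here is a stand-in enum defined only for this example (the real one lives in `lsst.ctrl.bps`), and `WORKFLOW_STATE_MAP` and `workflow_state` are hypothetical names, not part of the change.

```python
from enum import Enum, auto


class WmsStates(Enum):
    """Stand-in for the real WmsStates enum; only the members used here."""

    RUNNING = auto()
    SUCCEEDED = auto()
    FAILED = auto()
    DELETED = auto()
    HELD = auto()


# iDDS workflow/request status name -> WMS state, mirroring the branches
# in the diff. The dictionary itself is an illustration, not the PR's code.
WORKFLOW_STATE_MAP = {
    "Finished": WmsStates.SUCCEEDED,
    "SubFinished": WmsStates.SUCCEEDED,  # no closer WmsStates member exists
    "Failed": WmsStates.FAILED,
    "Expired": WmsStates.FAILED,
    "Cancelled": WmsStates.DELETED,
    "Suspended": WmsStates.HELD,
}


def workflow_state(status_name):
    """Return the WMS state for an iDDS workflow status, defaulting to RUNNING."""
    return WORKFLOW_STATE_MAP.get(status_name, WmsStates.RUNNING)
```

Any status name not listed, such as "Transforming", falls through to RUNNING, which matches the `else` branch of the change.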

wms_report.state = WmsStates.SUCCEEDED

elif workflow_status in ["Failed", "Expired"]:
wms_report.state = WmsStates.FAILED

elif workflow_status in ["Cancelled"]:
wms_report.state = WmsStates.DELETED

elif workflow_status in ["Suspended"]:
wms_report.state = WmsStates.HELD

else:
wms_report.state = WmsStates.RUNNING


try:

tasks.sort(key=lambda x: x["transform_workload_id"])
except Exception:

tasks.sort(key=lambda x: x["transform_id"])

exit_codes_all = {}
# Loop over all tasks data returned by idds_client
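
The fallback sort above can be tried in isolation. This is a minimal sketch with made-up task dictionaries; `sort_tasks` is a hypothetical helper name. It catches `KeyError` and `TypeError` rather than the broad `Exception` used in the diff, on the assumption that a missing or `None` `transform_workload_id` is the failure being guarded against.

```python
def sort_tasks(tasks):
    """Sort tasks in place by workload id, falling back to transform id."""
    try:
        tasks.sort(key=lambda t: t["transform_workload_id"])
    except (KeyError, TypeError):
        # Key missing, or ids not mutually comparable (e.g. None vs int).
        tasks.sort(key=lambda t: t["transform_id"])
    return tasks


# Invented data: the second task has no workload id assigned yet.
tasks = [
    {"transform_id": 2, "transform_workload_id": 20},
    {"transform_id": 1, "transform_workload_id": None},
]
sort_tasks(tasks)

# Invented data: the workload id key is absent altogether.
bare_tasks = [{"transform_id": 3}, {"transform_id": 1}]
sort_tasks(bare_tasks)
```

Narrowing the exception types is a design choice of this sketch only; the change itself keeps `except Exception`.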
@@ -228,13 +238,14 @@
for state in WmsStates:
njobs = 0
# Each WmsState has many iDDS statuses mapped to it.
for mappedstate in state_map[status]:
if state in file_map and mappedstate == state:
if task[file_map[mappedstate]] is not None:
njobs = task[file_map[mappedstate]]
if state == WmsStates.RUNNING:
njobs += task["output_new_files"] - task["input_new_files"]
break
if status in state_map:
for mappedstate in state_map[status]:
if state in file_map and mappedstate == state:
if task[file_map[mappedstate]] is not None:
njobs = task[file_map[mappedstate]]

if state == WmsStates.RUNNING:
njobs += task["output_new_files"] - task["input_new_files"]
break

wms_report.job_state_counts[state] += njobs
taskstatus[state] = njobs
wms_report.job_summary[tasklabel] = taskstatus
@@ -244,13 +255,6 @@
wms_report.run_summary += ";"
wms_report.run_summary += f"{tasklabel}:{str(totaljobs)}"

if status in wf_status:
wf_succeed = True
wms_report.state = state_map[status][0]

# All tasks have failed, set the workflow FAILED
if not wf_succeed:
wms_report.state = WmsStates.FAILED
wms_report.exit_code_summary = exit_codes_all
run_reports.append(wms_report)
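
The `if status in state_map` guard in the second hunk keeps an unmapped task status from raising `KeyError`. Below is a minimal sketch of that counting logic as a standalone function. It uses plain strings in place of `WmsStates` members; `count_jobs`, the sample maps, and the file-count key names other than `output_new_files` and `input_new_files` are invented for illustration.

```python
def count_jobs(task, status, state, state_map, file_map):
    """Return the job count for one WMS state of one task (0 if unmapped)."""
    njobs = 0
    if status in state_map:  # unknown task statuses contribute nothing
        for mapped in state_map[status]:
            if state in file_map and mapped == state:
                if task[file_map[mapped]] is not None:
                    njobs = task[file_map[mapped]]
                if state == "RUNNING":
                    njobs += task["output_new_files"] - task["input_new_files"]
                break
    return njobs


# Invented sample data, not the real maps from panda_service.py.
state_map = {"Finished": ["SUCCEEDED"], "Transforming": ["RUNNING"]}
file_map = {"SUCCEEDED": "done_files", "RUNNING": "active_files"}
task = {
    "done_files": 5,
    "active_files": 2,
    "output_new_files": 3,
    "input_new_files": 1,
}

succeeded = count_jobs(task, "Finished", "SUCCEEDED", state_map, file_map)
running = count_jobs(task, "Transforming", "RUNNING", state_map, file_map)
unknown = count_jobs(task, "NotAStatus", "SUCCEEDED", state_map, file_map)
```

With this data, an unmapped status such as "NotAStatus" yields a count of zero instead of an exception.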
