`task_sandbox` broken for raptor tasks #2802

eirrgang · 2022-12-20T13:43:16Z

I believe this is a known problem, but I don't see that it is explicitly tracked, and it is likely to cause confusion.

The task_sandbox property of a raptor task generated on a client has a constructed path that is never created or used in the execution environment.
File staging directives with unqualified paths (relative paths instead of URIs) do not resolve to the directory in which raptor tasks actually run.
The task:/// URI scheme does not work in staging directives for raptor tasks.


eirrgang · 2023-03-29T09:57:49Z

I think it was discussed elsewhere, but a corollary that @andre-merzky may be intending to tackle:

1.1 Explicitly setting the task_sandbox property sets the working directory for a traditional RP Task, but not for a raptor task (which has separate code for the launch method, which involves one of several ways of spawning new processes from a running Worker task).
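
To illustrate the distinction, here is a minimal sketch of how a worker that spawns new processes could honor an explicitly set sandbox, by passing it as `cwd` when launching the child. This is not RP's actual worker code; the helper name `spawn_in_sandbox` is hypothetical, and the child command is only a stand-in for a task payload.

```python
import os
import subprocess
import sys
import tempfile


def spawn_in_sandbox(sandbox: str) -> str:
    """Spawn a child process in `sandbox` and return the cwd it reports.

    Hypothetical helper for illustration; not part of radical.pilot.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=sandbox,          # without this, the child inherits the worker's cwd
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as sandbox:
        reported = spawn_in_sandbox(sandbox)
        # Compare resolved paths, since temp directories may sit behind symlinks.
        print(os.path.realpath(reported) == os.path.realpath(sandbox))
```

If the launch code omits the `cwd` argument, the child simply runs wherever the worker process happens to be, which matches the behavior described for raptor tasks above.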

Ref radical-cybertools/radical.pilot#2802

andre-merzky · 2023-04-10T21:33:41Z

@eirrgang : can this be closed after the last iteration?

eirrgang · 2023-04-11T10:18:32Z

> @eirrgang : can this be closed after the last iteration?

I believe so, but I haven't personally checked all cases. Can you confirm that the following have been addressed or won't be addressed?

task_sandbox returns the effective working directory for tasks with scheduler set, regardless of task mode.
sandbox can be set (in the TaskDescription) to determine the working directory for tasks with scheduler set. The value will be used as the working directory. The Task will fail if the path is not valid and accessible. (updated)
task:///path is equivalent to {task_sandbox}/path
relative paths in staging directives equate to task:/// URIs, where appropriate
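
The path semantics in the last two checklist items can be sketched as a small resolver. This is only an illustration of the expected behavior, not RP's staging code: the function `resolve_staging_path` and the example sandbox path are hypothetical, and the sketch deliberately handles only `task:///` URIs and plain paths.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_staging_path(path: str, task_sandbox: str) -> str:
    """Resolve a staging source or target against the task sandbox.

    Hypothetical helper for illustration; not part of radical.pilot.
    """
    parsed = urlparse(path)
    if parsed.scheme == "task":
        # task:///path is equivalent to {task_sandbox}/path
        return str(PurePosixPath(task_sandbox) / parsed.path.lstrip("/"))
    if parsed.scheme == "":
        plain = PurePosixPath(path)
        if plain.is_absolute():
            return str(plain)
        # A relative path is treated like a task:/// URI.
        return str(PurePosixPath(task_sandbox) / plain)
    raise ValueError(f"scheme not handled in this sketch: {parsed.scheme!r}")


if __name__ == "__main__":
    sandbox = "/pilot.0000/task.000000"  # made-up sandbox path
    print(resolve_staging_path("task:///data/in.dat", sandbox))
    print(resolve_staging_path("data/in.dat", sandbox))
```

Under these assumptions both calls resolve to the same location inside the task sandbox, which is the equivalence the checklist asks for.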

andre-merzky · 2023-04-11T11:39:16Z

> task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

eirrgang · 2023-04-11T11:55:03Z

> > task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

> That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

eirrgang · 2023-04-11T11:56:43Z

> > > task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

> > That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

> Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

Oh. Excuse me, maybe. I'm not asking for a new feature---just predictability with respect to standard use cases.

My impression was that setting task_sandbox was a normally supported use case. If not, then this point is irrelevant (assuming that task_sandbox is never inappropriately user-assignable). However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

andre-merzky · 2023-04-11T12:35:07Z

> Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

> Oh. Excuse me, maybe. I'm not asking for a new feature---just predictability with respect to standard use cases.

> My impression was that setting task_sandbox was a normally supported use case. If not, then this point is irrelevant (assuming that task_sandbox is never inappropriately user-assignable). However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

We may be talking at cross-purposes, I think, or at least I may be missing the point. Let me try to do this stepwise.

1. specifying a sandbox in a task description is a normally supported use case (now also for raptor tasks)
2. any value set here is a non-default value (unless you manually retrace what the implementation would do to create the default sandbox path, which would be possible but a bit useless).
3. we don't raise errors on such non-default values, as that is the purpose of the attribute.

The last statement reads like an ipso facto, so I am likely missing something?

andre-merzky · 2023-04-11T12:36:16Z

Duh, I should read a bit more carefully. You also wrote:

> However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

That was corrected in the docs: the sandbox does not need to be relative to the pilot sandbox anymore; that restriction was lifted a while ago.

eirrgang · 2023-04-11T12:39:50Z

> That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

Which part is not possible? The error? If there is some part of the protocol in which a raptor task is less able to perform error checking than a traditional task, then we should document it. If assignment to sandbox for a raptor task (a task with "scheduler" set) behaves the same as for non-raptor tasks, then this item is fine. (I updated the checklist item text in an attempt to clarify)

andre-merzky · 2023-04-11T12:47:33Z

Sorry, some quoting went wrong. Let me reply to the changed text:

> The Task will fail if the path is not valid and accessible.

Yes, the task will move to FAILED state in that case. That will not happen during submission though, but either during data staging or during execution. Checking path validity of the remote path during submission would be costly, and would race with the actual state of the file system anyway...

PS: This holds for all tasks, not only Raptor tasks.
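
A minimal local sketch of why a submission-time check would not be enough: the sandbox can be valid when it is checked and gone by the time the task runs, so the failure only shows up at execution. The function `execute_in_sandbox` and the returned state strings are illustrative assumptions, not RP's API, and the local directory stands in for a remote file system.

```python
import os
import tempfile


def execute_in_sandbox(sandbox: str) -> str:
    """Return 'DONE' if the sandbox can be entered at execution time, else 'FAILED'.

    Hypothetical helper for illustration; not part of radical.pilot.
    """
    original = os.getcwd()
    try:
        os.chdir(sandbox)      # path validity is only discovered here
    except OSError:
        return "FAILED"
    try:
        return "DONE"          # the task payload would run at this point
    finally:
        os.chdir(original)     # restore the caller's working directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        sandbox = os.path.join(parent, "task.000000")
        os.mkdir(sandbox)
        valid_at_submission = os.path.isdir(sandbox)  # a check at submit time passes
        os.rmdir(sandbox)                             # file system changes afterwards
        print(valid_at_submission, execute_in_sandbox(sandbox))
```

The early check reports a valid path, yet the later execution attempt still fails, which is the race mentioned above.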

eirrgang added layer:rp type:bug labels Dec 20, 2022

andre-merzky self-assigned this Mar 23, 2023

andre-merzky mentioned this issue Mar 31, 2023

support sandboxes for raptor tasks #2885

Merged

eirrgang added a commit to SCALE-MS/scale-ms that referenced this issue Apr 7, 2023

Test RP fix for task directories.

de25eb3

Ref radical-cybertools/radical.pilot#2802

andre-merzky closed this as completed in #2885 Apr 12, 2023
