task_sandbox broken for raptor tasks #2802

Closed
eirrgang opened this issue Dec 20, 2022 · 10 comments · Fixed by #2885

@eirrgang
Contributor

I believe this is a known problem, but I don't see it explicitly tracked, and it is likely to cause confusion.

  1. The task_sandbox property of a raptor task generated on a client has a constructed path that is never created or used in the execution environment.
  2. File staging directives with unqualified paths (relative paths instead of URIs) do not resolve to the directory in which raptor tasks actually run.
  3. The task:/// URI scheme does not work in staging directives for raptor tasks.
@eirrgang
Contributor Author

I think it was discussed elsewhere, but a corollary that @andre-merzky may be intending to tackle:

1.1 Explicitly setting the task_sandbox property sets the working directory for a traditional RP Task, but not for a raptor task (which has separate code for the launch method, which involves one of several ways of spawning new processes from a running Worker task).
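
For illustration only, here is a minimal sketch of what "the sandbox sets the working directory" means for a process spawned from an already running worker. This is not RP's launch code: `spawn_in_sandbox` is a hypothetical helper, and the sandbox is just a temporary directory.

```python
import os
import subprocess
import sys
import tempfile


def spawn_in_sandbox(argv, task_sandbox):
    """Run argv as a child process whose working directory is task_sandbox.

    Hypothetical stand-in for a worker's launch step, not RP API.
    """
    return subprocess.run(
        argv, cwd=task_sandbox, capture_output=True, text=True, check=True
    )


with tempfile.TemporaryDirectory() as sandbox:
    # The child reports its own working directory.
    proc = spawn_in_sandbox(
        [sys.executable, "-c", "import os; print(os.getcwd())"], sandbox
    )
    reported = proc.stdout.strip()
    # Compare resolved paths, since temp directories may sit behind symlinks.
    same = os.path.realpath(reported) == os.path.realpath(sandbox)
```

If the launch path never passes the sandbox on as `cwd` (or an equivalent `chdir`), the child inherits the worker's directory instead, which is the mismatch described in 1.1.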

@andre-merzky
Member

@eirrgang : can this be closed after the last iteration?

@eirrgang
Contributor Author

eirrgang commented Apr 11, 2023

@eirrgang : can this be closed after the last iteration?

I believe so, but I haven't personally checked all cases. Can you confirm that the following have been addressed or won't be addressed?

  • task_sandbox returns the effective working directory for tasks with scheduler set, regardless of task mode.
  • sandbox can be set (in the TaskDescription) to determine the working directory for tasks with scheduler set. The value will be used as the working directory. The Task will fail if the path is not valid and accessible. (updated)
  • task:///path is equivalent to {task_sandbox}/path
  • relative paths in staging directives equate to task:/// URIs, where appropriate
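
To make the last two items concrete, here is a rough sketch of the resolution rule I have in mind. It is not RP code: `resolve_staging_path` is a hypothetical helper, and leaving schemeless absolute paths unchanged is my assumption about what "where appropriate" should mean.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_staging_path(path: str, task_sandbox: str) -> str:
    """Map a staging path to an absolute path (hypothetical helper, not RP API)."""
    parsed = urlparse(path)
    if parsed.scheme == "task":
        # task:///a/b is treated as {task_sandbox}/a/b
        relative = parsed.path.lstrip("/")
        return str(PurePosixPath(task_sandbox) / relative)
    if parsed.scheme == "":
        plain = PurePosixPath(path)
        if plain.is_absolute():
            # Assumption: absolute paths without a scheme are left alone.
            return str(plain)
        # Unqualified relative paths behave like task:/// URIs.
        return str(PurePosixPath(task_sandbox) / plain)
    # Other schemes are outside the scope of this sketch.
    raise ValueError(f"unsupported scheme in this sketch: {parsed.scheme!r}")
```

With this rule, `resolve_staging_path("input.dat", sandbox)` and `resolve_staging_path("task:///input.dat", sandbox)` give the same result for any sandbox, which is the equivalence the checklist asks for.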

@andre-merzky
Member

task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

@eirrgang
Contributor Author

eirrgang commented Apr 11, 2023

task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

@eirrgang
Contributor Author

eirrgang commented Apr 11, 2023

task_sandbox can be set to determine the working directory for tasks with scheduler set and will either set the working directory or produce an error no later than the attempt to submit the task.

That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

Oh. Excuse me, maybe. I'm not asking for a new feature---just predictability with respect to standard use cases.

My impression was that setting task_sandbox was a normally supported use case. If not, then this point is irrelevant (assuming that task_sandbox is never inappropriately user-assignable). However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

@andre-merzky
Member

Does a non-default value of task_sandbox raise an error (when scheduler is a non-empty string) that prevents acquisition of a broken Task object?

Oh. Excuse me, maybe. I'm not asking for a new feature---just predictability with respect to standard use cases.

My impression was that setting task_sandbox was a normally supported use case. If not, then this point is irrelevant (assuming that task_sandbox is never inappropriately user-assignable). However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

We may be talking at cross-purposes, I think, or at least I may be missing the point. Let me try to do this stepwise.

  • specifying a sandbox in a task description is a normally supported use case (now also for raptor tasks)
  • any value set here is a non-default value (unless you manually retrace what the implementation would do to create the default sandbox path, which would be possible but rather pointless).
  • we don't raise errors on such non-default values, as that is the purpose of the attribute.

The last statement reads like an ipso facto, so I am likely missing something?

@andre-merzky
Member

andre-merzky commented Apr 11, 2023

Duh, I should read a bit more carefully. You also wrote:

However, the current docs allow a sandbox field in the TaskDescription as long as it is relative to the pilot sandbox.

That was corrected in the docs: the sandbox no longer needs to be relative to the pilot sandbox; that restriction was lifted a while ago.

@eirrgang
Contributor Author

eirrgang commented Apr 11, 2023

That is not possible in our current approach: the task sandbox is interpreted on the remote resource, and thus its validity can only be ascertained once the task reaches the agent (and, in this case specifically, the raptor worker).

Which part is not possible? The error? If there is some part of the protocol in which a raptor task is less able to perform error checking than a traditional task, then we should document it. If assignment to sandbox for a raptor task (a task with "scheduler" set) behaves the same as for non-raptor tasks, then this item is fine. (I updated the checklist item text in an attempt to clarify)

@andre-merzky
Member

andre-merzky commented Apr 11, 2023

Sorry, some quoting went wrong. Let me reply to the changed text:

The Task will fail if the path is not valid and accessible.

Yes, the task will move to FAILED state in that case. That will not happen during submission though, but either during data staging or during execution. Checking the validity of the remote path during submission would be costly, and would race with the actual state of the file system anyway...

PS: This holds for all tasks, not only Raptor tasks.
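
As a rough sketch of that behavior (not the actual agent code): submission accepts any sandbox value unchecked, and the path is only exercised at execution time, where a failure moves the task to FAILED. The `submit` and `execute` functions and the dict-based task are hypothetical, and the NEW and DONE state names are used purely for illustration.

```python
import os
import tempfile


def submit(task: dict) -> dict:
    """Hypothetical submission step: no check of the remote path is made."""
    task["state"] = "NEW"
    return task


def execute(task: dict) -> dict:
    """Hypothetical execution step: enter the sandbox or mark the task FAILED."""
    previous = os.getcwd()
    try:
        # Path validity is only discovered here, by attempting to use it.
        os.chdir(task["sandbox"])
    except OSError as exc:
        task["state"] = "FAILED"
        task["error"] = str(exc)
        return task
    try:
        task["state"] = "DONE"  # the real work would run here
    finally:
        os.chdir(previous)
    return task


with tempfile.TemporaryDirectory() as base:
    good = execute(submit({"sandbox": base}))
    bad = execute(submit({"sandbox": os.path.join(base, "missing")}))
```

Attempting the `chdir` and handling the error, rather than testing the path up front, also sidesteps the race: a check made at submission could be stale by the time the task actually runs.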


2 participants