Requiring shared directory for remote execution of tasks #1535

Closed
BoPeng opened this issue Feb 7, 2024 · 3 comments

BoPeng commented Feb 7, 2024

Currently the sos task execution system is complicated because it needs to copy files around when they reside in directories that are not shared by the job submission machine and the head node. A lot of work, such as signature checking, directory mapping, and file synchronization, has gone into a system that is powerful, yet complicated, confusing, and easy to break.

I propose that we remove the file mapping feature entirely and require file systems to be shared. In this way, input and output files have to exist on both local and remote file systems, and there is no need to map directories and copy files around. The system will be much easier to understand and configure, and a lot more robust.


gaow commented Feb 7, 2024

Agreed, considering most of our use cases so far are the simpler case of completely local or completely remote execution.

BoPeng mentioned this issue Feb 9, 2024

BoPeng commented Feb 9, 2024

One problem with this approach is that the $HOME directory can differ between systems (although there can be shared mounts). It is then necessary to allow ~/.sos/tasks to be pointed to a directory that is shared. Because it can be a bad idea to mix jobs executed on different hosts, the tasks directory should preferably be host-specific.

For backward compatibility, let us keep tasks in the home directory. In this way, sos installed on remote servers does not need to be updated.
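
To make the idea concrete, here is a minimal sketch of how a tasks directory might be resolved, assuming a hypothetical `shared_root` setting and a hypothetical `resolve_tasks_dir` helper. Neither name is an existing sos option or function. Without the setting, tasks stay under the home directory; with it, each host gets its own subdirectory.

```python
from pathlib import Path
from typing import Optional


def resolve_tasks_dir(host_alias: str, shared_root: Optional[str] = None) -> Path:
    """Return the directory where task files for `host_alias` would live.

    `shared_root` is a hypothetical setting used for illustration only;
    it is not an existing sos configuration key.
    """
    if shared_root is None:
        # Backward-compatible default: tasks stay under the home directory.
        return Path.home() / ".sos" / "tasks"
    # One subdirectory per host, so jobs from different hosts do not mix.
    return Path(shared_root).expanduser() / "tasks" / host_alias
```

For example, `resolve_tasks_dir("cluster", "/shared/sos")` would give `/shared/sos/tasks/cluster`, while omitting the second argument falls back to `~/.sos/tasks`.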

We need to

  1. Confirm that input files are in one of the shared directories.
  2. Confirm that workdir etc. are in one of the shared directories.
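
The two checks above could be sketched roughly as follows. This is an illustration only: `is_under_shared` and `find_unshared` are hypothetical helpers, not part of sos, and the list of shared directories is assumed to come from the host configuration.

```python
import os
from typing import Iterable, List


def is_under_shared(path: str, shared_dirs: Iterable[str]) -> bool:
    """Return True if `path` is inside (or equal to) one of `shared_dirs`."""
    target = os.path.realpath(os.path.expanduser(path))
    for shared in shared_dirs:
        root = os.path.realpath(os.path.expanduser(shared))
        try:
            # commonpath compares whole path components, so /data2 is
            # not mistaken for a subdirectory of /data.
            if os.path.commonpath([target, root]) == root:
                return True
        except ValueError:
            # Raised for paths on different drives (Windows); not shared.
            continue
    return False


def find_unshared(paths: Iterable[str], shared_dirs: Iterable[str]) -> List[str]:
    """Return the paths (inputs, workdir, ...) outside every shared directory."""
    shared = list(shared_dirs)
    return [p for p in paths if not is_under_shared(p, shared)]
```

A task would then be rejected before submission if `find_unshared` returns anything for its input files or workdir.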

BoPeng added a commit to vatlab/sos-docs that referenced this issue Feb 17, 2024

BoPeng commented Feb 17, 2024

Done, with a new version of sos released. #1536

BoPeng closed this as completed Feb 17, 2024