# Path translation and file synchronization

* **Difficulty level**: intermediate
* **Time needed to learn**: 20 minutes or less
* **Key points**:
  * A remote host might have different paths from the local host, making the execution of tasks difficult
  * SoS automatically translates paths specified in `_input`, `_depends` and `_output` according to host configurations
  * Options `to_host` and `from_host` specify files and directories to be sent before task execution and retrieved after task execution, respectively
  * Using named paths can make your workflow more portable and easier to read

## Translation of input and output paths

When local and remote hosts do not share file systems (or share only some file systems), things can get a bit complicated because SoS will need to decide what paths to use on the remote host. There are a few things to understand here:

**The current project directory, and all input, output and dependent files that are involved, need to be under paths defined for both the local and remote hosts.** This is usually not a problem if you are working under your home directory and you have `home` defined under `paths` of both local and remote hosts, but it can become more complicated if your tasks involve system directories such as `resource`, `temp`, and `scratch` that are outside of `home`. In these cases, all involved directories need to be defined for both local and remote hosts.
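To make this requirement concrete, here is a minimal Python sketch of prefix-based path translation. It is not how SoS implements translation internally; the function `translate_path`, the dictionaries `local_paths` and `remote_paths`, and all directories in the example are hypothetical and only mirror the idea of named `paths` defined for both hosts.

```python
import posixpath


def translate_path(path, local_paths, remote_paths):
    """Map an absolute local path to its remote counterpart.

    Hypothetical helper for illustration only; not part of SoS.
    local_paths and remote_paths map path names (e.g. "home") to directories.
    """
    path = posixpath.normpath(path)
    # Check the longest local directory first so that nested definitions win.
    names = sorted(local_paths, key=lambda n: len(local_paths[n]), reverse=True)
    for name in names:
        local_root = posixpath.normpath(local_paths[name])
        if path == local_root or path.startswith(local_root + "/"):
            if name not in remote_paths:
                raise ValueError(f"path '{name}' is not defined for the remote host")
            rel = posixpath.relpath(path, local_root)
            return posixpath.normpath(posixpath.join(remote_paths[name], rel))
    raise ValueError(f"{path} is not under any path defined for the local host")


# Assumed example configuration: the remote `home` points to a scratch area.
local_paths = {"home": "/Users/me", "resource": "/Volumes/resource"}
remote_paths = {"home": "/home/me/sos_temp", "resource": "/shared/resource"}

remote_file = translate_path("/Users/me/project/data.txt", local_paths, remote_paths)
```

In this sketch a file outside every defined directory (for example under `/tmp`) cannot be translated and raises an error, which is the situation the paragraph above warns about.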

**Unless specified otherwise, tasks are executed under the remote version of the current working directory.** That is to say, the execution of tasks will leave files on remote hosts that will not be automatically removed and, in a worse scenario, **might overwrite remote files without warning**. This is why we recommend that you set the remote `home` to a directory other than the true `home` (e.g. `/home/user_name/scratch` or `/home/user_name/sos_temp`). In this way SoS will write to SoS-specified directories on remote hosts and will not contaminate your real `home` directory.

**Unless specified otherwise, input and dependent files will be copied to remote host before execution, and output files will be copied to local host after the completion of the task.** It is therefore important for you to plan ahead and avoid synchronization of large files that should stay on remote hosts.

## Working directory of tasks (Option `workdir`)

The `workdir` of a task defaults to the current working directory or, in the case of remote execution, the remote counterpart of the current working directory.

Option `workdir` controls the working directory of the task. For example, the following step downloads a file to the `resource` directory using [action `download`](download.html).

```
task: queue='localhost', workdir='resource'

download:
  ftp://speedtest.tele2.net/512KB.zip
```
Task `85ea891331ab4bcb` (`5057fa441d6e1755scratch_0user_guide`): ran for < 5 seconds, completed


```
!ls resource
```

```
512KB.zip
```


## Sending additional files before task execution (Option `to_host`)

Option `to_host` specifies additional files or directories that are synchronized to the remote host before tasks are executed. It can be specified as

* A single file or directory (with respect to the local file system), or
* A list of files or directories

The files or directories will be translated using the host-specific path maps. Note that if a symbolic link is specified in `to_host`, both the symbolic link and the path it refers to are synchronized to the remote host.
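The symbolic link behavior can be pictured with a short Python sketch. This is an illustration under assumptions, not SoS code: `expand_symlinks` is a hypothetical helper that accepts a single path or a list, as `to_host` does, and adds the resolved target of every symbolic link to the list of paths to send.

```python
import os
import tempfile


def expand_symlinks(paths):
    """Return the paths to synchronize, adding the target of each symbolic link.

    Hypothetical helper for illustration only; not part of SoS.
    """
    if isinstance(paths, str):
        paths = [paths]  # a single file or directory
    result = []
    for p in paths:
        result.append(p)
        if os.path.islink(p):
            # Also send the file or directory that the link refers to.
            target = os.path.realpath(p)
            if target not in result:
                result.append(target)
    return result


# Demonstration with a temporary file and a symbolic link pointing to it.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "reference.fa")
    link = os.path.join(tmp, "current.fa")
    open(target, "w").close()
    os.symlink(target, link)
    synced = expand_symlinks(link)
    expected = [link, os.path.realpath(target)]
```

Regular files pass through unchanged, so only links enlarge the set of files that would be transferred.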


## Retrieving additional files after task completion (Option `from_host`)

Option `from_host` specifies additional files or directories that are synchronized from the remote host after tasks are executed. It can be specified as

* A single file or directory (with respect to the local file system), or
* A list of files or directories

The files or directories will be translated using the host-specific path maps to determine what remote files to retrieve.
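Putting the defaults and the two options together, the following Python sketch lists what would be sent and retrieved for one task. It only models the behavior described on this page; `plan_transfers` and the file names are hypothetical, and the actual synchronization logic in SoS may differ in detail.

```python
def plan_transfers(inputs, depends, outputs, to_host=None, from_host=None):
    """Return (paths sent before execution, paths retrieved after completion).

    Hypothetical helper for illustration only; not part of SoS.
    """
    def as_list(value):
        # Accept nothing, a single path, or a list of paths.
        if value is None:
            return []
        if isinstance(value, str):
            return [value]
        return list(value)

    # Input and dependent files are sent, plus anything listed in to_host.
    send = as_list(inputs) + as_list(depends) + as_list(to_host)
    # Output files are retrieved, plus anything listed in from_host.
    retrieve = as_list(outputs) + as_list(from_host)
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(send)), list(dict.fromkeys(retrieve))


send, retrieve = plan_transfers(
    inputs=["a.fastq"],
    depends="ref.fa",
    outputs="a.bam",
    to_host="ref.fa.fai",
    from_host=["a.log"],
)
```

Listing the transfers this way before running a workflow makes it easier to spot large files that should stay on the remote host.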

## Further reading

* [The `task_statement`](task_statement.html)