# How to execute tasks on a remote server without a batch system

* **Difficulty level**: easy
* **Time needed to learn**: 10 minutes or less
* **Key points**:

  

#### `queue_type`

Option `queue_type` determines the type of remote server or job queue. SoS currently supports the following types of job queues:

1. **`process`**: This is the default queue type. Tasks are executed directly, either on the local host or on a server.
2. **`pbs`**: A PBS/MOAB/LSF/Slurm cluster system where tasks are submitted using commands such as `qsub`.
3. **`rq`**: A redis queue where tasks are submitted to the rq server and monitored through rq-dashboard.

### Common host configuration

SoS needs to know how to connect to a remote host, how to synchronize files between local and remote hosts, and how to execute commands on the remote host. These should be defined with the following keys:

#### `address`

IP address or URL of the host. Note that

* `address` can be omitted for hosts that do not accept remote execution, such as your desktop.
* If you have a different user name on the remote host, specify the `address` in the format of `username@hostaddress`.
* SoS does not support username/password authentication, so **public key authentication between the local and remote hosts is required**.
* SoS currently does not support remote execution on Windows hosts, so no `address` is needed for Windows hosts.

#### `hostname`

The `hostname` of the machine, as reported by the command `hostname` or Python's `socket.gethostname()`. This entry is used to identify the machine on which SoS starts. It is needed only if the hostname differs from both `alias` and `address`.

#### `port`

SSH port of the remote host, which is `22` by default.

#### `shared`

Option `shared` tells SoS which file systems are shared between localhost and some remote hosts so that it does not have to synchronize files under these directories between the hosts.

The `shared` entry should be defined with `name` and `path` pairs in the format of

```
hosts:
    desktop:
    server:
        shared:
            project: /myprojects
            HTFS: /
    worker:
        shared:
            HTFS: /
    server1:
        shared:
            project: /scratch/myprojects
            data: /scratch/data
```

The configuration above says:

1. `desktop` does not share any volume with any other machine, so all files need to be transferred.
2. `server` and `worker` share `HTFS` at directory `/`, so all files are shared.
3. `server` and `server1` share a `project` volume, but the volume is mounted at different locations. Files under `/myprojects` are therefore not synchronized when you submit jobs from `server` to `server1`, and files under `/scratch/myprojects` are not synchronized when you submit jobs from `server1` to `server`.
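
The three cases above can be reproduced with a short Python sketch. It is only an illustration of matching volumes by name, not SoS's actual implementation; the `hosts` dictionary and the helper `shared_dirs` are hypothetical names introduced here.

```python
# Hypothetical in-memory copy of the `shared` entries from the configuration above.
hosts = {
    "desktop": {},
    "server": {"shared": {"project": "/myprojects", "HTFS": "/"}},
    "worker": {"shared": {"HTFS": "/"}},
    "server1": {"shared": {"project": "/scratch/myprojects", "data": "/scratch/data"}},
}


def shared_dirs(local, remote):
    """Return {volume name: (local path, remote path)} for volumes both hosts define."""
    local_shared = hosts[local].get("shared", {})
    remote_shared = hosts[remote].get("shared", {})
    common = sorted(set(local_shared) & set(remote_shared))
    return {name: (local_shared[name], remote_shared[name]) for name in common}


print(shared_dirs("desktop", "server"))   # no common volume
print(shared_dirs("server", "worker"))    # HTFS mounted at / on both
print(shared_dirs("server", "server1"))   # project mounted at different locations
```

Volumes are matched by name rather than by path, which is why `project` is recognized as shared even though it is mounted at `/myprojects` on one host and `/scratch/myprojects` on the other.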


#### `paths`

`paths` defines paths that will be translated when a task is executed remotely. For example, your input file on a Mac might be `/Users/myuser/project/KS28.fa`, but it should be named `/home/myuser/project/KS28.fa` when it is processed on a remote server. In this case, you should define directories `/Users/myuser` and `/home/myuser` as equivalent directories on the two hosts, using

```
hosts:
    desktop:
        paths:
            home: /Users/myuser
    server:
        paths:
            home: /home/myuser
```

Multiple entries can be defined, and files are mapped using the longest matching path. For example, if you have a shared location for all resources on the server, you could define

```
hosts:
    desktop:
        paths:            
            home: /Users/myuser
            resource: /Users/myuser/resources
    server:
        paths:
            home: /home/myuser
            resource: /shared/resources
```

so that `/Users/myuser/resources/hg19.fa` would be mapped to `/shared/resources/hg19.fa` on the server. Note that `/Users/myuser/resources/hg19.fa` is mapped using `resource` instead of `home` because `resource` matches a longer portion of the input path.
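
The longest-match rule can be sketched in Python as follows. This is a minimal illustration under the assumption that paths are POSIX-style and compared component by component; `map_path` is a hypothetical helper, not a function provided by SoS.

```python
from pathlib import PurePosixPath

# `paths` entries of the two hosts, copied from the example above.
local_paths = {"home": "/Users/myuser", "resource": "/Users/myuser/resources"}
remote_paths = {"home": "/home/myuser", "resource": "/shared/resources"}


def map_path(path, local_paths, remote_paths):
    """Translate a local path to its remote equivalent using the longest matching prefix."""
    target = PurePosixPath(path)
    best = None
    for name, local_dir in local_paths.items():
        if name not in remote_paths:
            continue  # the remote host has no counterpart for this entry
        prefix = PurePosixPath(local_dir)
        if target == prefix or prefix in target.parents:
            # Prefer the entry whose local directory has more components.
            if best is None or len(prefix.parts) > len(PurePosixPath(local_paths[best]).parts):
                best = name
    if best is None:
        return path  # no entry applies, so the path is left unchanged
    relative = target.relative_to(local_paths[best])
    return str(PurePosixPath(remote_paths[best]) / relative)


print(map_path("/Users/myuser/resources/hg19.fa", local_paths, remote_paths))
print(map_path("/Users/myuser/project/KS28.fa", local_paths, remote_paths))
```

Comparing whole path components, rather than raw string prefixes, avoids treating a directory such as `/Users/myuser2` as if it were under `/Users/myuser`.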

A remote host can be accessible from a local host only if the remote host defines all paths defined by the local host. More specifically, if host A defines path `home` and host B defines paths `home` and `resource`, it is possible to connect from host A to host B using `home`, but not from host B to A because SoS does not know how to map paths under `resource`.
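
The accessibility rule amounts to a subset test on the names defined under `paths`. The sketch below illustrates this with the host A and host B example; `can_connect` is a hypothetical helper and the directory values are placeholders chosen for illustration.

```python
def can_connect(local_paths, remote_paths):
    """True if the remote host defines every path name that the local host defines."""
    return set(local_paths) <= set(remote_paths)


# Host A defines only `home`; host B defines `home` and `resource`.
host_a = {"home": "/Users/myuser"}
host_b = {"home": "/home/myuser", "resource": "/shared/resources"}

print(can_connect(host_a, host_b))  # A -> B
print(can_connect(host_b, host_a))  # B -> A, where `resource` has no counterpart on A
```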

#### `path_map` (derived)

With `shared` and `paths` defined on both hosts, SoS derives a `path_map` between each pair of hosts from the keys that the hosts have in common in `shared` and `paths`. When you run

```
sos status -q -v3
```

to list all host configurations, SoS would list all hosts accessible from `localhost`, with host-specific `path_map`, which is a list of directory mappings between local and remote directories. For example, the `path_map` from `desktop` to `server` using the above example would be

```
/Users/myuser -> /home/myuser
/Users/myuser/resources -> /shared/resources
```
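
As a rough illustration of how such a list could be derived from common keys, consider the Python sketch below. It is an assumption about the general idea rather than SoS's actual algorithm; `derive_path_map` and the two configuration dictionaries are hypothetical.

```python
def derive_path_map(local_cfg, remote_cfg):
    """Pair up directories that share a key in `shared` or `paths` on both hosts."""
    path_map = []
    for section in ("shared", "paths"):
        local = local_cfg.get(section) or {}
        remote = remote_cfg.get(section) or {}
        for name, local_dir in local.items():
            if name in remote:
                entry = f"{local_dir} -> {remote[name]}"
                if entry not in path_map:
                    path_map.append(entry)
    return path_map


# `paths` entries for desktop and server from the earlier example.
desktop = {"paths": {"home": "/Users/myuser", "resource": "/Users/myuser/resources"}}
server = {"paths": {"home": "/home/myuser", "resource": "/shared/resources"}}

for line in derive_path_map(desktop, server):
    print(line)
```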

Note that if a directory appears in both `shared` and `paths` (e.g. `/scratch` in `shared` and `/scratch/user` in `paths`), files can still be synchronized to a different directory according to `path_map`, even though they are shared and already present on the remote server.

## Further reading

* 