# Configuration Files

* **Difficulty level**: easy
* **Time needed to learn**: 10 minutes or less
* **Key points**:
  * a
  

## Configuration files <a id="Configuration_files"></a>

SoS reads configurations from:
* A site configuration file `site_config.yml` under the sos package directory
* A host configuration file `~/.sos/hosts.yml`
* A global sos configuration file `~/.sos/config.yml`
* And a configuration file specified by command line option `-c`.

The configuration files should be in [YAML format](http://www.yaml.org/start.html). Dictionaries defined in all these configuration files are merged to form a single dictionary that is available to SoS as a dictionary named `CONFIG`.
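
The merging can be pictured with a short Python sketch. The helper `merge_config` is hypothetical and only illustrates the behavior described here; it is not SoS's actual implementation, and it assumes that files are applied in load order so that later files take precedence.

```python
def merge_config(base, update):
    """Recursively merge `update` into a copy of `base` (hypothetical helper)."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dictionaries: merge them key by key
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the later value replaces the earlier one
            merged[key] = value
    return merged


global_cfg = {'A': {'B': 'old', 'C': 'old'}}   # e.g. from ~/.sos/config.yml
custom_cfg = {'A': {'B': 'new', 'D': 'new'}}   # e.g. from a file passed with -c
CONFIG = merge_config(global_cfg, custom_cfg)
print(CONFIG['A'])
```

Keys that appear in only one file are preserved, while keys that appear in both take the value from the later file.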

Note that:
* All configurations from the aforementioned files are merged into a single dictionary. A dictionary could therefore contain keys defined in different configuration files, and a later file could overwrite keys defined in a previous file. For example, if 
  * `{'A': {'B': 'old', 'C': 'old'}}` is defined in `~/.sos/config.yml` and
  * `{'A': {'B': 'new', 'D': 'new'}}` is defined in `my_config.yml`, then
  * dictionary `A` in `CONFIG` would have value `{'B': 'new', 'C': 'old', 'D': 'new'}`. 
* SoS interpolates string values in `CONFIG` if they contain `{ }`. The expressions enclosed by `{ }` are evaluated with a local namespace that is the dictionary in which the key exists, and a global namespace that is the complete `CONFIG` dictionary. That is to say, if a configuration file contains
  ```
  user_name: user
  hosts:
    cluster:
      address: "{user_name}@domain.com:{port}"
      port: 123
   ```
  `CONFIG['hosts']['cluster']['address']` would be interpolated with `port` from `CONFIG['hosts']['cluster']` and `user_name` from the top-level `CONFIG['user_name']`. You will need to double the braces (`{{ }}`) to include literal `{ }` in the config file.
* Because key `user_name` is frequently used in `hosts.yml`, SoS automatically defines `user_name` as the local user ID (all lower case) in `CONFIG` if it is not defined in any of the configuration files.
* A special key `based_on` will be processed after all configuration files are loaded. The value of `based_on` should be one or more keys to other dictionaries in the configuration (e.g. `hosts.cluster`). The items from the referred dictionaries are merged into the present dictionary if they do not already exist there. This allows you to derive a dictionary from an existing one. For example, 
  ```
  hosts:
    head_node:
      description: head_node of cluster
      address: "{user_name}@domain.com:{port}"
      port: 123
      paths:
          home:   "/home/{user_name}"
    cluster:
      description: Cluster
      based_on: hosts.head_node
      queue_type: pbs
   ```
   allows `hosts["cluster"]` to be derived from `hosts["head_node"]`.
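
The last two notes can be illustrated with a minimal Python sketch. Both `interpolate` and `apply_based_on` are hypothetical helpers written for this example, not SoS functions; the sketch assumes that interpolation behaves like a Python f-string and that `based_on` names a single dotted key.

```python
def interpolate(value, local_ns, global_ns):
    """Evaluate `{ }` expressions in `value` as an f-string (illustrative only)."""
    # eval() looks names up in local_ns first, then in global_ns;
    # a copy of global_ns is passed because eval() adds __builtins__ to it
    return eval('f' + repr(value), dict(global_ns), local_ns)


def apply_based_on(config, section):
    """Copy items missing from `section` from the dictionary named by `based_on`."""
    parent = config
    for key in section['based_on'].split('.'):   # e.g. "hosts.head_node"
        parent = parent[key]
    for key, value in parent.items():
        section.setdefault(key, value)           # existing items are kept


CONFIG = {
    'user_name': 'user',
    'hosts': {
        'head_node': {
            'description': 'head_node of cluster',
            'address': '{user_name}@domain.com:{port}',
            'port': 123,
        },
        'cluster': {
            'description': 'Cluster',
            'based_on': 'hosts.head_node',
            'queue_type': 'pbs',
        },
    },
}

cluster = CONFIG['hosts']['cluster']
apply_based_on(CONFIG, cluster)
address = interpolate(cluster['address'], cluster, CONFIG)
print(cluster['description'], address)
```

In this sketch `cluster` keeps its own `description` but inherits `address` and `port` from `head_node`, and `user_name` is found in the top-level dictionary because it is not defined locally.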

SoS allows you to store parameters in a number of configuration files. There are three kinds of configuration files:

1. Global user-specific configuration file `~/.sos/config.yml`.
2. Local project-specific configuration file `./config.yml` (under the current working directory).
3. Configuration file specified by option `-c`.

The configuration files should be in the format of [`YAML`](http://yaml.org/) or its subset format [`JSON`](http://json-schema.org/implementations.html). When a SoS script is loaded, SoS looks for and parses the global and project-specific configuration files, and a configuration file specified by option `-c`. The results are stored in a global variable `CONFIG` that is available to the script.

## Use of configuration file

The configuration file should be in `YAML` format, which is a superset of JSON so any configuration file in JSON format is also acceptable. 

Let us create a YAML file with some simple content:

In [1]:
run:
    cat << EOF > myconfig.yml
    # A list of tasty fruits
    martin:
        name: Martin D'vloper
        job: Developer
        skill: Elite
    manager: martin
    EOF

The configuration file looks like

In [2]:
!cat myconfig.yml

# A list of tasty fruits
martin:
    name: Martin D'vloper
    job: Developer
    skill: Elite
manager: martin


Variables defined in the configuration file are available in a SoS script as a dictionary `CONFIG`. You can retrieve its values as you would from a regular dictionary, although writing to this dictionary is prohibited. For convenience, attribute syntax can also be used to access dictionary items.

In [3]:
%run -c myconfig.yml
print(CONFIG['martin'])
print(CONFIG.martin['name'])
print(CONFIG.manager)

{'name': "Martin D'vloper", 'skill': 'Elite', 'job': 'Developer'}
Martin D'vloper
martin
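
To make the combination of dictionary access, attribute access, and write protection concrete, here is a minimal Python sketch. The class `ReadOnlyConfig` is hypothetical and is not how SoS implements `CONFIG`; it only blocks item and attribute assignment, which is enough to show the idea.

```python
class ReadOnlyConfig(dict):
    """A dictionary with attribute access that rejects assignment (hypothetical)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setitem__(self, key, value):
        raise TypeError('CONFIG is read-only')

    # Attribute assignment is rejected in the same way as item assignment
    __setattr__ = __setitem__


config = ReadOnlyConfig({'martin': {'name': "Martin D'vloper"}, 'manager': 'martin'})
print(config['martin']['name'])   # regular dictionary access
print(config.manager)             # attribute access
print(config.get('cutoff', 0.5))  # dict methods such as get() remain available
```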


Configuration files are frequently used to specify system configurations. For example,

In [4]:
%run -c myconfig.yml
manager = CONFIG.manager

allows you to define the name of the `manager` in a configuration file. If you do not want to require a configuration file, you can define `manager` with a default value as

In [5]:
manager = CONFIG.get('manager', 'Bob')
print(manager)

Bob


So the manager would be `Bob` without a configuration file, and `martin` with the configuration file.

In [6]:
%run -c myconfig.yml
manager = CONFIG.get('manager', 'Bob')
print(manager)

martin


If you also want to allow this value to be modified from the command line, you can place this definition after `parameter`.

In [7]:
parameter: manager = CONFIG.get('manager', 'Bob')
print(manager)

Bob


In this way, users are free to use the default value, define a value in a configuration file, or provide another value from the command line.

In [8]:
%rerun

Bob


In [9]:
%rerun -c myconfig.yml

martin


In [10]:
%rerun -c myconfig.yml --manager Joe

Joe


In [11]:
%rerun --manager Joe

Joe
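
The order of precedence shown in these runs (default value, then configuration file, then command line) can be mimicked in plain Python. The sketch below uses the standard `argparse` module and a hypothetical function `resolve_manager`; it is an analogy for what `parameter:` does, not SoS's own mechanism.

```python
import argparse


def resolve_manager(config, argv):
    """Return the manager from the command line, the config, or the default."""
    parser = argparse.ArgumentParser()
    # The configuration value, or 'Bob' if absent, becomes the default
    parser.add_argument('--manager', default=config.get('manager', 'Bob'))
    return parser.parse_args(argv).manager


print(resolve_manager({}, []))                                    # no config, no option
print(resolve_manager({'manager': 'martin'}, []))                 # config only
print(resolve_manager({'manager': 'martin'}, ['--manager', 'Joe']))  # option wins
```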


## Command `sos config`

Although `yaml` is not a difficult format to learn, it is often easier to use the command `sos config` to check and set values in configuration files, especially for complex data types.

`sos config` by default works on the local `config.yml` file. For example

In [12]:
!sos config --set cutoff 0.5

Set cutoff to 0.5


creates `config.yml` with content

In [13]:
!cat config.yml

cutoff: 0.5


It can also work on the global configuration file with option `--global`, or on a specified configuration file with option `--config file` (given as `-c` in the following example). For example,

In [14]:
!sos config -c myconfig.yml --set cutoff '{"low":1, "high":10}'

Set cutoff to {'high': 10, 'low': 1}


adds the key `cutoff` to the existing configuration file `myconfig.yml`

In [15]:
!cat myconfig.yml

cutoff:
  high: 10
  low: 1
manager: martin
martin:
  job: Developer
  name: Martin D'vloper
  skill: Elite


The command can also handle partial values (e.g. a single item of a dictionary). For example,

In [16]:
!sos config -c myconfig.yml --set cutoff.low 2

Set cutoff to {'high': 10, 'low': 2}


The command updates only the item `low` of dictionary `cutoff` and leaves `high` unchanged.
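
The effect of a dotted key such as `cutoff.low` can be sketched in Python as follows. The function `set_nested` is a hypothetical helper for illustration and does not reproduce the actual implementation of `sos config`.

```python
def set_nested(config, dotted_key, value):
    """Set `value` at the path given by a dotted key such as "cutoff.low"."""
    *parents, last = dotted_key.split('.')
    target = config
    for key in parents:
        # Walk down the path, creating intermediate dictionaries as needed
        target = target.setdefault(key, {})
    target[last] = value


config = {'cutoff': {'low': 1, 'high': 10}, 'manager': 'martin'}
set_nested(config, 'cutoff.low', 2)
print(config['cutoff'])
```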

In addition to `--set`, you can check the content of a configuration file using option `--get`. For example, the following command gets all values defined in `config.yml`,

In [17]:
!sos config --get

cutoff	0.5


and the following command gets the value of `manager` from `myconfig.yml`.

In [18]:
!sos config -c myconfig.yml --get manager

manager	'martin'


Wildcard characters are allowed to specify a subset of keys, although the pattern should be quoted to avoid shell expansion.

In [19]:
!sos config -c myconfig.yml --get 'c*'

cutoff	{'low': 2, 'high': 10}
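
Wildcard selection of keys can be approximated in Python with the standard `fnmatch` module. The function `get_matching` below is hypothetical, and the sketch assumes shell-style patterns applied to top-level keys only; the matching rules of `sos config` itself may differ.

```python
from fnmatch import fnmatchcase


def get_matching(config, pattern):
    """Return the items of `config` whose keys match a shell-style pattern."""
    return {key: value for key, value in config.items()
            if fnmatchcase(key, pattern)}


config = {'cutoff': {'low': 2, 'high': 10}, 'manager': 'martin'}
for key, value in get_matching(config, 'c*').items():
    print(f'{key}\t{value!r}')
```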


Finally, if you would like to remove a key from a configuration file, you can use option `--unset`.

In [20]:
!sos config -c myconfig.yml --unset martin

Unset martin


In [21]:
!cat myconfig.yml

cutoff:
  high: 10
  low: 2
manager: martin


In [22]:
# clean up
!rm config.yml myconfig.yml