## The `CompsynConfig` class

`compsyn.config.CompsynConfig` provides a convenient way to set up your runtime configuration through code.

```python
class CompsynConfig:
    def __init__(self, **kwargs: Dict[str, str]) -> None:
        self.config = dict()
        # fill argument values according to argparse config
        for key, val in self.args.items():
            set_env_var(key, val)
            self.config[key] = val
        # overwrite argparse values with those passed
        for key, val in kwargs.items():
            set_env_var(key, val)  # sets passed config values in os.environ
            self.config[key] = val # store on self for convenience
```

It is possible to configure compsyn entirely using environment variables, but this class provides a more code-centric way to set relevant environment variables. `CompsynConfig.args` are collected from the various `get_<component>_args` methods found throughout compsyn. `kwargs` passed to `CompsynConfig.__init__` will take precedence over those gathered from argparse. 
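The precedence rule can be illustrated with a small standalone sketch. It does not use compsyn itself: `resolve_config` and `ENV_PREFIX` are hypothetical names introduced only for this example, and the mapping from a config key to an upper-cased `COMPSYN_`-prefixed variable name is an assumption based on variable names such as `COMPSYN_DRIVER_PATH`.

```python
import os
from typing import Dict, Optional

# Assumption: config keys map to upper-cased, COMPSYN_-prefixed variable names.
ENV_PREFIX = "COMPSYN_"


def resolve_config(
    defaults: Dict[str, Optional[str]], **overrides: str
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: merge defaults with keyword overrides (overrides win)."""
    config = dict(defaults)
    config.update(overrides)  # values passed in code replace the defaults
    for key, val in config.items():
        if val is not None:
            # export each set value so other code can read it from os.environ
            os.environ[ENV_PREFIX + key.upper()] = str(val)
    return config


defaults = {"driver_browser": "Chrome", "driver_path": "chromedriver", "log_file": None}
config = resolve_config(defaults, driver_path="/opt/bin/chromedriver")

print(config["driver_path"])              # the keyword override
print(os.environ["COMPSYN_DRIVER_PATH"])  # the same value, visible process-wide
```

Exporting the merged values means any code that reads `os.environ` later sees the same configuration, without needing a reference to the config object.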

In [2]:
from compsyn.config import CompsynConfig

# the host running this notebook has many compsyn environment variables set, so the CompsynConfig will see them.
print(CompsynConfig(jzazbz_array='../jzazbz_array.npy'))

CompsynConfig
	work_dir                       = //anaconda3/lib/python3.7/site-packages
	jzazbz_array                   = ../jzazbz_array.npy
	google_application_credentials = None
	driver_browser                 = Chrome
	driver_path                    = chromedriver
	s3_bucket                      = None
	s3_region_name                 = None
	s3_endpoint_url                = None
	s3_access_key_id               = None
	s3_secret_access_key           = None
	log_level                      = 20
	log_file                       = None


The `CompsynConfig` class sets these values in `os.environ`, so that other parts of the code can access them. You can also set config values through code by passing them as keyword arguments when instantiating `CompsynConfig`.

In [3]:
from compsyn.trial import get_trial_from_env, Trial

config = CompsynConfig(
    driver_browser="Chrome", 
    driver_path="chromedriver", 
    hostname="my-id"
)

print(config)

CompsynConfig
	work_dir                       = //anaconda3/lib/python3.7/site-packages
	jzazbz_array                   = jzazbz_array.npy
	google_application_credentials = None
	driver_browser                 = Chrome
	driver_path                    = chromedriver
	s3_bucket                      = None
	s3_region_name                 = None
	s3_endpoint_url                = None
	s3_access_key_id               = None
	s3_secret_access_key           = None
	log_level                      = 20
	log_file                       = None
	hostname                       = my-id


## Purpose of `CompsynConfig`

The `CompsynConfig` class is a convenient mechanism for setting up the environment `compsyn` code uses to do its multi-modal analyses. The values set through `CompsynConfig` are required for *the code to run successfully*.

To facilitate using compsyn as an experimental framework, further configuration may be achieved through the `Trial` class (see the associated notebook `trial_and_vector.ipynb`). The values set in `Trial` should not be considered part of the `CompsynConfig`, as a given compsyn user may be analyzing data across multiple trials. The values set through `Trial` are required to *implement experimental designs*.


__Note__: The config values are unlikely to change, so they can be set in the environment. If you are using Jupyter notebooks, this means the environment of the shell running the Jupyter notebook server.

__Note__: Using `CompsynConfig` is optional: defaults are provided for the core functionality of the `compsyn` package. More advanced features, like the shared S3 backend, require configuration to be set. Here we show those defaults by clearing the environment of this kernel:


In [3]:
import os

# iterate over a snapshot of the keys, so deleting entries while looping is safe
for key in list(os.environ):
    if key.startswith("COMPSYN_"):
        del os.environ[key] # simulate an unset environment


default_config = CompsynConfig()
default_trial = get_trial_from_env()

print("default", default_trial)
print()
print("default", default_config)

[1616905443] (compsyn.Trial)  INFO: experiment: default-experiment
[1616905443] (compsyn.Trial)  INFO: trial_id: default-trial
[1616905443] (compsyn.Trial)  INFO: hostname: default-hostname
default Trial
	experiment_name = default-experiment
	trial_id        = default-trial
	hostname        = default-hostname
	trial_timestamp = 2021-03-28

default CompsynConfig
	work_dir                       = /Users/tasker/checkout/comp-syn
	jzazbz_array                   = jzazbz_array.npy
	google_application_credentials = None
	driver_browser                 = Chrome
	driver_path                    = chromedriver
	s3_bucket                      = None
	s3_region_name                 = None
	s3_endpoint_url                = None
	s3_access_key_id               = None
	s3_secret_access_key           = None
	log_level                      = 20
	log_file                       = None


*__Note__: the default work_dir will be the root of wherever you have the comp-syn repository cloned.*

## Common configuration patterns

Storing data in the default work directory, which is wherever the comp-syn repository is cloned, can quickly get messy. It is usually a good idea to use a `work_dir` that exists outside of the repo. For instance, if you are collecting a large amount of data, you may wish to use a `work_dir` located on an external hard drive:

In [4]:
config = CompsynConfig(
    work_dir="/Volumes/LACIE/compsyn/data/zth"
)

print()
print(config)


CompsynConfig
	work_dir                       = /Volumes/LACIE/compsyn/data/zth
	jzazbz_array                   = jzazbz_array.npy
	google_application_credentials = None
	driver_browser                 = Chrome
	driver_path                    = chromedriver
	s3_bucket                      = None
	s3_region_name                 = None
	s3_endpoint_url                = None
	s3_access_key_id               = None
	s3_secret_access_key           = None
	log_level                      = 20
	log_file                       = None


### Use the environment for defaults

A further improvement on this would be to set your desired `work_dir` in the environment running the jupyter notebook server. All of the `CompsynConfig` values can be set by environment variables named with a `COMPSYN_` prefix, for example `COMPSYN_WORK_DIR` and `COMPSYN_DRIVER_PATH`. 
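As a rough illustration of the naming convention, the sketch below collects every `COMPSYN_`-prefixed variable from a mapping and converts it to a lower-case config key. `config_from_env` is a hypothetical helper written for this example, not part of compsyn, and the lower-casing rule is an assumption inferred from the two variable names mentioned above.

```python
from typing import Dict, Mapping


def config_from_env(environ: Mapping[str, str], prefix: str = "COMPSYN_") -> Dict[str, str]:
    """Hypothetical helper: map prefixed environment variables to config keys."""
    return {
        key[len(prefix):].lower(): val  # e.g. COMPSYN_WORK_DIR -> work_dir
        for key, val in environ.items()
        if key.startswith(prefix)
    }


# A stand-in for os.environ, so the example does not depend on the host shell.
example_env = {
    "COMPSYN_WORK_DIR": "/data/compsyn",
    "COMPSYN_DRIVER_PATH": "chromedriver",
    "HOME": "/home/user",  # ignored: no COMPSYN_ prefix
}

env_config = config_from_env(example_env)
print(env_config)
```

Passing `os.environ` instead of `example_env` would show which compsyn-related variables are visible to the current kernel, which can help when a notebook does not pick up the values you expected.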

Unlike the `CompsynConfig` values, there may be multiple sets of `Trial` values in a given jupyter notebook (or other workflow), so you should usually use the `Trial` class to set trial values directly in code:

In [5]:
# toy example trial for participating in some geolocation-sensitive experiment
trial = Trial(
    experiment_name="regional-differences",
    trial_id="phase-0",
    hostname="toronto",
)

print()
print(trial)

[1616905443] (compsyn.Trial)  INFO: experiment: regional-differences
[1616905443] (compsyn.Trial)  INFO: trial_id: phase-0
[1616905443] (compsyn.Trial)  INFO: hostname: toronto

Trial
	experiment_name = regional-differences
	trial_id        = phase-0
	hostname        = toronto
	trial_timestamp = 2021-03-28


__Note__: Environment variables for the trial values are supported as well, to facilitate programmatic execution of compsyn experiments.