# Setting Up Your Local System

We now need to set up your local system.

### Setting the local_system configuration file

First you'll create a configuration file.

I suggest using [mine](../cfg/local_system/generic_lab.yaml) as an example. Open that file and take a look. It has two keys.

The `datasets_root` is where the datasets themselves are written. At first, a dataset's folder contains only the simulation and the logs generated while producing that simulation. As more of the pipeline is run, later stages create their own folders alongside the simulation folder.

The `assets_dir` is only for the science assets (maps used for noise, instrument parameters, cosmological parameter distributions). It is used once.

Set those according to your local system.
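
Before moving on, it can help to confirm that both directories exist and are writable. The following is a minimal sketch, not part of CMB-ML: `check_local_system` is a hypothetical helper, and the paths are placeholders to replace with the values from your own yaml.

```python
import os
from pathlib import Path


def check_local_system(paths):
    """Return a dict mapping each configuration key to a short status string.

    `paths` is a plain dict such as
    {"datasets_root": "...", "assets_dir": "..."}.
    """
    report = {}
    for key, value in paths.items():
        directory = Path(value).expanduser()
        if not directory.is_dir():
            report[key] = "missing"
        elif not os.access(directory, os.W_OK):
            report[key] = "not writable"
        else:
            report[key] = "ok"
    return report


# Placeholder paths: substitute the values from your local_system yaml.
local_system = {
    "datasets_root": "/data/generic_user/CMB_Data/Datasets/",
    "assets_dir": "/data/generic_user/CMB_Data/Assets/",
}
for key, status in check_local_system(local_system).items():
    print(f"{key}: {status}")
```

A status of `missing` simply means the directory has not been created yet.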

If more granularity of file storage is needed (e.g., you want to store models on a faster drive or analysis results on a slower drive), this can also be done in the pipeline yamls.

### Setting the top level configuration file

We also need to let your system know where that yaml is. This information goes in the top-level configuration files, e.g. [config_setup.yaml](../cfg/config_setup.yaml), which look like:

```yaml
defaults:
  - local_system: ${oc.env:CMB_ML_LOCAL_SYSTEM}
  - file_system : common_fs
  - override hydra/job_logging: custom_log
  - _self_
```

For `local_system`, either change the value to the name of your local system yaml file, e.g.:

```yaml
  - local_system: generic_lab.yaml
```

Or add an environment variable to your system. On Linux, when working with Python scripts, add the command `export CMB_ML_LOCAL_SYSTEM=generic_lab.yaml` to your shell startup script. In Jupyter notebooks, set the variable through the `os` library. This option is very useful for researchers using the dataset on multiple systems.
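
Because the defaults list above interpolates `${oc.env:CMB_ML_LOCAL_SYSTEM}` without a fallback value, composition fails if the variable is unset. A small guard like the following sketch can give a clearer message. `require_env` is a hypothetical helper, not part of CMB-ML or Hydra.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell startup script, "
            f"or set os.environ[{name!r}] before composing the configuration."
        )
    return value


# In a notebook, set the variable for this session only if it is not
# already defined, then confirm that it can be read back.
os.environ.setdefault("CMB_ML_LOCAL_SYSTEM", "generic_lab")
print(require_env("CMB_ML_LOCAL_SYSTEM"))
```

Using `setdefault` keeps any value already exported by the shell, so the same notebook can run unchanged on several systems.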

### Checking the configuration

Set this up now for both your local system configuration and [config_setup.yaml](../cfg/config_setup.yaml). Let's see how it looks:

```python
import os
import hydra
from hydra import compose, initialize
from omegaconf import OmegaConf

# Set the environment variable, only effective for this notebook.
os.environ['CMB_ML_LOCAL_SYSTEM'] = 'generic_lab'
```

```python
# Clear the global Hydra instance in case initialize() has already been called.
hydra.core.global_hydra.GlobalHydra.instance().clear()

initialize(version_base=None, config_path="../cfg")

cfg = compose(config_name='config_setup.yaml')

print(OmegaConf.to_yaml(cfg))
```

```yaml
local_system:
  datasets_root: /data/generic_user/CMB_Data/Datasets/
  assets_dir: /data/generic_user/CMB_Data/Assets/
file_system:
  sim_folder_prefix: sim
  sim_str_num_digits: 4
  dataset_template_str: '{root}/{dataset}/'
  default_dataset_template_str: '{root}/{dataset}/{stage}/{split}/{sim}'
  working_dataset_template_str: '{root}/{dataset}/{working}{stage}/{split}/{sim}'
  subdir_for_log_scripts: scripts
  log_dataset_template_str: '{root}/{dataset}/{hydra_run_dir}'
  log_stage_template_str: '{root}/{dataset}/{working}{stage}/{hydra_run_dir}'
  top_level_work_template_str: '{root}/{dataset}/{stage}/{hydra_run_dir}'
  wmap_chains_dir: WMAP/wmap_lcdm_mnu_wmap9_chains_v5
```

Those look good to me.
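
The `file_system` entries are path templates. As a rough illustration only, assuming the placeholders are filled with Python's `str.format` and that a simulation folder name is the prefix followed by a zero-padded number (neither is confirmed here), a path could be assembled as in the sketch below. The dataset, stage, and split names are made up for the example.

```python
# Values taken from the printed configuration above
# (trailing slash dropped from the root to avoid a doubled separator).
datasets_root = "/data/generic_user/CMB_Data/Datasets"
sim_folder_prefix = "sim"
sim_str_num_digits = 4
default_dataset_template_str = "{root}/{dataset}/{stage}/{split}/{sim}"


def sim_folder_name(sim_num):
    """Assumed naming scheme: prefix plus zero-padded simulation number."""
    return f"{sim_folder_prefix}{sim_num:0{sim_str_num_digits}d}"


def example_path(dataset, stage, split, sim_num):
    """Fill the template with str.format (an assumption about the pipeline)."""
    return default_dataset_template_str.format(
        root=datasets_root,
        dataset=dataset,
        stage=stage,
        split=split,
        sim=sim_folder_name(sim_num),
    )


# Hypothetical dataset, stage, and split names, for illustration only.
print(example_path("ExampleDataset", "Simulation", "Train", 7))
```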

# Getting Science Assets

<!-- We now need to get either:
- All science assets for running simulations
- Just the asset containing the mask used for analysis -->

### All Science Assets

The simplest method is to run the [get_data/get_assets.py](../get_data/get_assets.py) script. It downloads the assets from ESA's Planck Legacy Archive and from NASA's LAMBDA archive.

Downloads may be slow. There is also a CMB-ML data mirror for these files, but links are not currently available. Please contact us through the GitHub repository for more information.

# Continuing

Your system is now set up to use CMB-ML.

Next, we'll look at a couple of simulations to better understand the data, in [the next demonstration notebook](./D_getting_dataset_instances.ipynb).