# Introduction

The pieces of the CMB-ML framework were a bit difficult to explain in isolation from one another. Now that they've been explained [for single stages](demonstrations/E_CMB_ML_framework.ipynb) and [for the whole pipeline](demonstrations/F_CMB_ML_pipeline.ipynb), I can illustrate how I use them with less worry about overwhelming the reader.

In this notebook, I want to tackle multiple simulations and splits. This should:
- Drive home the advantage of using Assets with path_templates
- Introduce the `Split` object
- Show a few different structures for Executors
- Point out a few other practices that I think are good

I'll cover three different Executors, which differ in how they process multiple simulations:
- [A simple Executor](#serially-iterating-executor) that iterates relatively slowly
- An Executor that uses multiprocessing to iterate quickly
- An Executor that sets up and uses a PyTorch DataLoader

## Example

In this notebook, I'll continue with the example of converting a power spectrum into a map. This time, though, there's a single power spectrum, from which I'll produce many realizations for hypothetical Training, Validation, and Test splits. (This is admittedly contrived. Some independent variable is needed so that a network is trained to predict something, but this suffices for now.)
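
To make "many realizations across splits" concrete before touching the framework, here is a minimal sketch in plain Python. The split sizes, the path template, and the `iter_sim_paths` helper are all hypothetical stand-ins that I made up for illustration; they are not CMB-ML's `Split` object or its actual directory layout.

```python
# Hypothetical split sizes; in practice these would come from the Hydra config.
SPLIT_SIZES = {"Train": 3, "Valid": 2, "Test": 2}

# Hypothetical path template with one placeholder per level of iteration.
PATH_TEMPLATE = "{root}/{split}/sim{sim_num:04d}/cmb_map.fits"


def iter_sim_paths(root, split_sizes, template=PATH_TEMPLATE):
    """Yield (split, sim_num, path) for every simulation in every split."""
    for split, n_sims in split_sizes.items():
        for sim_num in range(n_sims):
            path = template.format(root=root, split=split, sim_num=sim_num)
            yield split, sim_num, path


paths = list(iter_sim_paths("Datasets/Demo", SPLIT_SIZES))
for split, sim_num, path in paths:
    print(split, sim_num, path)
```

One template describes every file, so adding a split or changing the number of simulations doesn't require touching any path-building code.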

# Set-Up

I again have to set a few things up so that the CMB-ML framework plays nicely with Jupyter.

In [1]:
# Ignore this cell. 
# It's needed for the notebook to work, not something to learn.

import sys
import os

# Set the local_system
os.environ["CMB_ML_LOCAL_SYSTEM"] = "generic_lab"

# Add the path to the parent directory so I can import cmbml
repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
sys.path.insert(0, repo_root)

In [2]:
import logging
from hydra import compose, initialize
import numpy as np
import healpy as hp

from cmbml.core import BaseStageExecutor, Asset
from cmbml.core.asset_handlers import TextPowerSpectrum, HealpyMap

In [3]:
logger = logging.getLogger("F_Tutorial")
logger.setLevel(logging.DEBUG)

# Outside of a notebook, Hydra will handle the logging. 
handler = logging.StreamHandler()  # StreamHandler sends logs to sys.stderr by default
handler.setLevel(logging.DEBUG)
logger.addHandler(handler)

In [4]:
from omegaconf import OmegaConf
from hydra.core.hydra_config import HydraConfig

with initialize(version_base=None, config_path="../cfg"):
    cfg = compose(config_name="config_demoG_framework")

# Basic Executor

I'll use the basic Executor below to write out the single power spectrum.

In [9]:
class MakePSExecutor(BaseStageExecutor):
    def __init__(self, cfg):
        super().__init__(cfg, stage_str="ps_setup")

        self.out_cmb_ps: Asset = self.assets_out["cmb_ps"]
        # I note the handler for assets. When using an IDE, this makes
        #   it easier to navigate to the handler's code.
        out_cmb_ps_handler: TextPowerSpectrum

        # The config file has a list of values for the power spectrum model
        #   (used in execute() as polynomial coefficients).
        self.ps_model = cfg.model.ps

    def execute(self):
        ell = np.arange(200)
        # This is a naive model for the CMB power spectrum.
        #   It's just a polynomial fit up to ell=200 for the 
        #   Planck 2018 power spectrum.
        ps = np.poly1d(self.ps_model)
        self.out_cmb_ps.write(data=ps(ell))
        logger.info(f"CMB power spectrum written to {self.out_cmb_ps.path}")
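
As a quick aside on how `np.poly1d` reads the list from the config, here is a small sketch. The coefficients are made up for illustration and are not the values in the config file; the point is only that `np.poly1d` takes coefficients with the highest degree first.

```python
import numpy as np

# Hypothetical coefficients, highest degree first: 2*ell**2 - 3*ell + 1
coeffs = [2.0, -3.0, 1.0]
model = np.poly1d(coeffs)

# Evaluate the polynomial at each multipole, as execute() does above.
ell = np.arange(5)
ps = model(ell)
print(ps)
```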

# Serially Iterating Executor

The simplest form of Executor iterates naively, handling one simulation at a time.

I seldom need to think much about my Namer object beyond calling `set_context()` or `set_contexts()`. However, there are a few important points to keep in mind:
- Define templates clearly: Use meaningful and consistent placeholders in your YAML configuration.
- Update context carefully: Ensure the Namer’s context is set correctly before accessing file paths.
- Set context before descending: When iterating over splits or simulations, call `set_context()` before entering an inner loop. For example:

```python
def execute(self):
    for split in self.splits:
        with self.name_tracker.set_context('split', split.name):
            self.process_split(split)
```

This practice keeps indentation shallow: each method handles one level of the loop and hands the next level to another method.
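
To show the pattern end to end without the framework, here is a minimal, self-contained sketch. `DemoNameTracker` and `DemoSerialExecutor` are hypothetical stand-ins that I wrote for this illustration; they are not the CMB-ML Namer or `BaseStageExecutor`, and the real `set_context()` may behave differently. The sketch assumes only that `set_context()` acts as a context manager that restores the previous context on exit.

```python
from contextlib import contextmanager


class DemoNameTracker:
    """Hypothetical stand-in for a Namer; NOT the CMB-ML implementation."""

    def __init__(self, template):
        self.template = template
        self.context = {}

    @contextmanager
    def set_context(self, key, value):
        # Remember any previous value so the context is restored on exit.
        missing = object()
        previous = self.context.get(key, missing)
        self.context[key] = value
        try:
            yield self
        finally:
            if previous is missing:
                del self.context[key]
            else:
                self.context[key] = previous

    def path(self):
        # Raises KeyError if a placeholder has no context set yet.
        return self.template.format(**self.context)


class DemoSerialExecutor:
    def __init__(self, splits):
        self.splits = splits  # mapping of split name -> number of simulations
        self.name_tracker = DemoNameTracker("{split}/sim{sim_num:04d}/map.fits")
        self.visited = []

    def execute(self):
        # Set the split context, then descend into the inner loop.
        for split_name, n_sims in self.splits.items():
            with self.name_tracker.set_context("split", split_name):
                self.process_split(n_sims)

    def process_split(self, n_sims):
        for sim_num in range(n_sims):
            with self.name_tracker.set_context("sim_num", sim_num):
                self.process_sim()

    def process_sim(self):
        # Both placeholders are set by the time this method runs.
        self.visited.append(self.name_tracker.path())


executor = DemoSerialExecutor({"Train": 2, "Test": 1})
executor.execute()
print(executor.visited)
```

Each method sits at one level of nesting, and the context is cleared automatically when a `with` block exits, so a stale split or simulation number can't leak into a later path.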