# Setup for model experiments

We'll do the following here:

- Create distinct catalogs
- Document the parameter changes that accompany each experiment
- Create a YAML config for each experiment

## Initial experiments - FTW baseline model, FTW dataset

The first tests vary a few parameters and settings of the existing FTW baseline model, using only the full FTW dataset.

### Catalog

In [36]:
import pandas as pd
from pathlib import Path

catalog = pd.read_csv("../data/ftw-mappingafrica-combined-catalog.csv")
catalog.query("dataset == 'ftw'").to_csv("../data/ftw-catalog.csv", index=False)
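A quick sanity check on the filter can catch an empty or mislabelled subset before any training run. The sketch below uses a small in-memory frame in place of the real catalog. Only the `dataset` column and the `'ftw'` value come from the code above; the `name` column and the `mappingafrica` label are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the combined catalog.
# Only `dataset` and the value 'ftw' are taken from the notebook.
toy_catalog = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d"],
        "dataset": ["ftw", "mappingafrica", "ftw", "mappingafrica"],
    }
)

ftw_only = toy_catalog.query("dataset == 'ftw'")

# The subset should be non-empty and contain only FTW rows.
assert not ftw_only.empty
assert set(ftw_only["dataset"]) == {"ftw"}
```

Running the same two assertions on the real `ftw-catalog.csv` would confirm that the written file contains what the experiments expect.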

### Config adjustor

In [37]:
import yaml

def write_yaml(template_path: str, output_path: str, updates: dict = None):
    """
    Write a YAML file from a template file, with optional updates.

    Args:
        template_path (str): Path to the base YAML template file.
        output_path (str): Path to the output YAML file.
        updates (dict, optional): Dictionary of keys/values to update.
    """

    def recursive_update(d, u):
        for k, v in u.items():
            if isinstance(v, dict) and isinstance(d.get(k), dict):
                recursive_update(d[k], v)
            else:
                d[k] = v

    with open(template_path, 'r') as f:
        config = yaml.safe_load(f)
        if updates:
            recursive_update(config, updates)

    class IndentDumper(yaml.SafeDumper):
        def increase_indent(self, flow=False, indentless=False):
            return super().increase_indent(flow, False)

    with open(output_path, 'w') as f:
        yaml.dump(
            config,
            f,
            Dumper=IndentDumper,
            default_flow_style=False,
            sort_keys=False,
            indent=2,
            allow_unicode=True
        )

# Example usage (the first argument is a path to a template file, not a dict;
# file names here are illustrative):
# write_yaml('template.yaml', 'experiment.yaml', updates={'param1': 20})
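The nested merge is what lets each experiment override a single key without restating its siblings. The standalone sketch below reproduces the `recursive_update` logic on plain dictionaries. The template keys and values used here (`max_epochs`, `loss="ce"`, and so on) are hypothetical and are not taken from the real template.

```python
import copy


def recursive_update(d: dict, u: dict) -> None:
    """Merge u into d in place, descending into nested dicts."""
    for k, v in u.items():
        if isinstance(v, dict) and isinstance(d.get(k), dict):
            recursive_update(d[k], v)
        else:
            d[k] = v


# Hypothetical template contents, for illustration only.
template = {
    "trainer": {"max_epochs": 100, "default_root_dir": "~/models/base"},
    "model": {"init_args": {"loss": "ce"}},
}

config = copy.deepcopy(template)
recursive_update(config, {"trainer": {"default_root_dir": "~/models/exp1"}})

# Only the targeted key changes; its sibling survives.
assert config["trainer"]["max_epochs"] == 100
assert config["trainer"]["default_root_dir"] == "~/models/exp1"

# A non-dict value (including None) replaces whatever was there.
recursive_update(config, {"model": {"init_args": None}})
assert config["model"]["init_args"] is None
```

Note that lists are replaced wholesale rather than merged, so an override such as `aug_list` must always give the complete list.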

### Experiments

All experiments here are FTW baseline model, window B only, on the FTW dataset.

Each experiment changes a single parameter, or none:

1. FTW defaults (for comparison with FTW's results)
2. Locally weighted Tversky focal loss
3. Min-max normalization, lab
4. Min-max normalization, gab
5. Photometric augmentation package
6. `satslidemix`
7. `rescale`

#### Setup

Below we set up a YAML config for each experiment. Each cell provides the following:

- `cfg_name`: name of the config/experiment file (without .yaml)
- `update`: dictionary of changes to make to the base config

Also define a global `home_dir`: the directory that contains the repo holding the catalog. It is defined once, in the cell for experiment 1.
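Every experiment sets the same `default_root_dir` pattern and catalog path, so the shared parts could be factored out. The following is a sketch of one way to do that. `make_update` and its `data_args` and `model_args` parameters are hypothetical helpers, not part of the notebook or the repo, and the `home_dir` value in the usage line is a placeholder.

```python
from typing import Optional


def make_update(
    cfg_name: str,
    home_dir: str,
    data_args: Optional[dict] = None,
    model_args: Optional[dict] = None,
) -> dict:
    """Build the `update` dict for one experiment (hypothetical helper)."""
    update = {
        "trainer": {"default_root_dir": f"~/models/{cfg_name}"},
        "data": {
            "init_args": {
                # Experiment-specific data settings, if any, come first.
                **(data_args or {}),
                "catalog": f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
            }
        },
    }
    # Only add a model section when the experiment changes the model.
    if model_args:
        update["model"] = {"init_args": dict(model_args)}
    return update


# Placeholder home_dir; the experiment name and loss mirror experiment 2.
example = make_update(
    "ftwbaseline-localtversky-exp2",
    "/path/to/projects",
    model_args={"loss": "localtversky"},
)
```

The returned dictionary has the same shape as the hand-written `update` dictionaries and can be passed to `write_yaml` through its `updates` argument.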

#### # 1

In [38]:
home_dir = "/home/airg/lestes/projects"
cfg_name = "ftwbaseline-exp1"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    data=dict(
        init_args=dict(
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)

#### # 2


In [40]:
cfg_name = "ftwbaseline-localtversky-exp2"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    model = dict(
        init_args=dict(
            loss="localtversky"
        )
    ), 
    data=dict(
        init_args=dict(
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)

#### # 3


In [41]:
cfg_name = "ftwbaseline-minmax_lab-exp3"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    data = dict(
        init_args=dict(
            normalization_strategy="min_max",
            normalization_stat_procedure="lab",
            global_stats=None,
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)

#### # 4

Not yet run: the global stats need to be calculated first.


#### # 5

In [42]:
augs = ["rotation", "hflip", "vflip", "sharpness"]
cfg_name = "ftwbaseline-photometric-exp5"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    data = dict(
        init_args=dict(
            aug_list=augs + ["brightness", "contrast", "gaussian_noise"],
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)

#### # 6

In [44]:
augs = ["rotation", "hflip", "vflip", "sharpness"]
cfg_name = "ftwbaseline-satslide-exp6"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    data = dict(
        init_args=dict(
            aug_list=augs + ["satslidemix"],
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)

#### # 7

In [None]:
augs = ["rotation", "hflip", "vflip", "sharpness"]
cfg_name = "ftwbaseline-rescale-exp7"
update = dict(
    trainer=dict(
        default_root_dir=f"~/models/{cfg_name}"
    ), 
    data = dict(
        init_args=dict(
            aug_list=augs + ["rescale"],
            catalog=f"{home_dir}/ftw-mappingafrica-integration/data/ftw-catalog.csv",
        )
    )
)
write_yaml("../configs/template-hpc-config.yaml", 
           f"../configs/{cfg_name}.yaml", 
           updates=update)
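Once the configs are written, it can be worth reloading one to confirm that an override survived serialization. The sketch below does a round trip through a temporary file with PyYAML's `safe_dump` and `safe_load`. It stands in for `write_yaml` rather than calling it, and the override shown is a hypothetical example in the same shape as the `update` dictionaries above.

```python
import tempfile
from pathlib import Path

import yaml

# Hypothetical override, shaped like the `update` dicts in this notebook.
expected = {"data": {"init_args": {"aug_list": ["rotation", "hflip", "rescale"]}}}

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "example.yaml"
    # Stand-in for write_yaml: dump a config that contains the override.
    cfg_path.write_text(yaml.safe_dump(expected, sort_keys=False))

    # Reload the file and keep the parsed result for checking.
    loaded = yaml.safe_load(cfg_path.read_text())

# The reloaded value should match what was written.
assert loaded["data"]["init_args"]["aug_list"] == ["rotation", "hflip", "rescale"]
```

The same check could be pointed at a real file under `../configs/` by loading it with `yaml.safe_load` and comparing the keys set in the corresponding `update`.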