# Goal

Understand and use exca in a basic scenario: define a simple model, configure it, and cache its results.



### The Philosophy

#### Pure Python

The tools here do not provide a script API but a way to do everything directly from Python. Specific script APIs can be easily composed on top of it if needed.
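
As an illustration of composing a script API on top, here is a minimal sketch that derives command-line flags from a pydantic config with `argparse`. The `TrainCfg` class and `parse_cli` helper are hypothetical names introduced for this example; they are not part of exca.

```python
import argparse

import pydantic


class TrainCfg(pydantic.BaseModel):  # hypothetical config, for illustration only
    lr: float = 1e-3
    epochs: int = 10


def parse_cli(argv: list[str]) -> TrainCfg:
    """Build a TrainCfg from command-line flags (one flag per config field)."""
    parser = argparse.ArgumentParser(description="hypothetical training script")
    for name, field in TrainCfg.model_fields.items():
        parser.add_argument(f"--{name}", default=field.default)
    args = parser.parse_args(argv)
    # flags arrive as strings; pydantic converts and validates them
    return TrainCfg(**vars(args))


cfg = parse_cli(["--lr", "0.01"])
print(cfg)
```

The config class stays the single source of truth: adding a field to `TrainCfg` adds the matching flag without touching the parser.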

#### Parameter Validation

Configurations should be validated before running to avoid discovering bugs later (e.g., missing parameters, inconsistent parameters, wrong types, etc.). We achieve this by using `pydantic.BaseModel`, which works like dataclasses but validates all parameters.
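
The sketch below shows the kind of mistakes this catches at config creation time. `OptimCfg` and `validation_errors` are hypothetical names used only for this example.

```python
import pydantic


class OptimCfg(pydantic.BaseModel):  # hypothetical config, for illustration only
    model_config = pydantic.ConfigDict(extra="forbid")  # reject unknown parameters
    lr: float = 1e-3
    steps: int = 1000


def validation_errors(**params) -> list[str]:
    """Return the names of the fields that failed validation (empty if valid)."""
    try:
        OptimCfg(**params)
    except pydantic.ValidationError as e:
        return [str(err["loc"][0]) for err in e.errors()]
    return []


print(validation_errors(lr=0.01))  # valid config → []
print(validation_errors(steps="many"))  # wrong type → ['steps']
print(validation_errors(learning_rate=0.01))  # misspelled name → ['learning_rate']
```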

#### Fast Configs

Running a grid search requires creating numerous configurations, so they should be easy and fast to create. This means deferring the loading of data, PyTorch models, etc., until they are actually needed.
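
A minimal sketch of this idea, assuming a hypothetical `HeavyModel` that stands in for anything expensive to create (a PyTorch model, a dataset): the whole grid of configs is created and validated without building a single model.

```python
import itertools

import pydantic

BUILD_COUNT = 0  # counts how many "expensive" objects were actually built


class HeavyModel:  # hypothetical stand-in for a torch model or a dataset
    def __init__(self, cfg: "HeavyCfg") -> None:
        global BUILD_COUNT
        BUILD_COUNT += 1
        self.cfg = cfg


class HeavyCfg(pydantic.BaseModel):
    layers: int = 12
    channels: int = 128

    def build(self) -> HeavyModel:
        return HeavyModel(self)  # the expensive part only happens here


# creating the grid only validates parameters
grid = [
    HeavyCfg(layers=layers, channels=channels)
    for layers, channels in itertools.product([4, 8, 12], [64, 128])
]
print(len(grid), BUILD_COUNT)  # 6 configs, nothing built yet

model = grid[0].build()  # build only the one that is needed
print(BUILD_COUNT)
```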

#### No Parameter Duplication - Easy to Extend

Configurations hold the underlying actual function/class parameters. To avoid duplicating parameters, we couple configs with the actual classes/functions:

```python
class MyClassCfg(pydantic.BaseModel):
    x: int = 12
    y: str = "hello"

    def build(self) -> "MyClass":
        return MyClass(self)

class MyClass:
    def __init__(self, cfg: MyClassCfg):
        self.cfg = cfg
```
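
To illustrate the "easy to extend" part, the sketch below adds a parameter `z` to the pattern above. The parameter is declared once in the config, and the class signature stays unchanged. The `z` field and the `describe` method are hypothetical additions for this example.

```python
import pydantic


class MyClassCfg(pydantic.BaseModel):
    x: int = 12
    y: str = "hello"
    z: float = 0.5  # new parameter: declared once, here

    def build(self) -> "MyClass":
        return MyClass(self)


class MyClass:
    def __init__(self, cfg: MyClassCfg) -> None:
        self.cfg = cfg  # signature does not change when parameters are added

    def describe(self) -> str:  # hypothetical method, for illustration
        return f"{self.cfg.y} x={self.cfg.x} z={self.cfg.z}"


obj = MyClassCfg(x=3).build()
print(obj.describe())  # → hello x=3 z=0.5
```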



## A first example

Validate configurations, and understand why it matters.

Before anything:
```
pip install exca torch pydantic pyyaml scikit-learn
```

In [None]:
import pydantic
import torch

"""
Let's create a simple model configuration class using pydantic.
This class will define the model's hyperparameters and provide a method to build the model.
"""
class ConvCfg(pydantic.BaseModel):
    layers: int = 12
    kernel: int = 5
    channels: int = 128

    def build(self) -> torch.nn.Module:
        # instantiate when needed
        # (do not slow down config initialization)
        return ConvModel(self)  


class ConvModel(torch.nn.Module):

    def __init__(self, cfg: ConvCfg) -> None:
        super().__init__()  # initialize torch.nn.Module before setting attributes
        self.cfg = cfg

# then in your code
model = ConvCfg().build()  # Works ! :-)

In [None]:
# A YAML config with a parameter of the wrong type is rejected at validation time,
# before anything is sent to slurm!
import yaml

config = """
model:
  layers: 12
  kernel: 5
  channels: "hi"
"""
config = yaml.safe_load(config)
model = ConvCfg(**config['model']).build()  # Raises a pydantic.ValidationError: channels is not an int

### Discriminated unions (one step further)

In [None]:
import typing as tp

import pydantic
import torch
import yaml
from exca import TaskInfra
from torch import nn

class TransformerModel(nn.Module):
    def __init__(self, cfg: "TransformerCfg") -> None:
        super().__init__()
        self.cfg = cfg
        # define your transformer model here

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # define the forward pass
        raise NotImplementedError


class ConvCfg(pydantic.BaseModel):
    name: tp.Literal["conv"] = "conv"  # special discriminator field
    layers: int = 12
    kernel: int = 5
    channels: int = 128

    def build(self) -> torch.nn.Module:
        return ConvModel(self)  # instantiate when needed

...

class TransformerCfg(pydantic.BaseModel):
    model_config = pydantic.ConfigDict(extra="forbid")  # pydantic boilerplate: safer
    name: tp.Literal["transformer"] = "transformer"  # special discriminator field
    layers: int = 12
    embeddings: int = 128

    def build(self) -> torch.nn.Module:
        return TransformerModel(self)

...

class Trainer(pydantic.BaseModel):
    model: ConvCfg | TransformerCfg = pydantic.Field(..., discriminator="name")
    optimizer: str = "Adam"
    infra: TaskInfra = TaskInfra()

    @infra.apply
    def run(self) -> float:
        model = self.model.build()  # builds whichever model the config selected
        # specific location for this very config (requires infra.folder to be set):
        ckpt_path = self.infra.uid_folder() / "checkpoint.pt"
        if ckpt_path.exists():
            # load
            ...
        ...
        # for batch in loader:
        #     ...
        # return accuracy
        return 100.0


string = """
model:
  name: transformer  # specifies which model
  embeddings: 256  # only accepts transformer specific parameters
optimizer: SGD
"""
trainer = Trainer(**yaml.safe_load(string))

isinstance(trainer.model, TransformerCfg)

In [None]:
# We can do it with a ConvCfg too
string = """
model:
  name: conv  # specifies which model
  layers: 12
  kernel: 5
  channels: 128
optimizer: Adam
"""
trainer = Trainer(**yaml.safe_load(string))
isinstance(trainer.model, ConvCfg)

The `name` field in the config is enough to select the right class: no need to import or instantiate the model-specific objects yourself anymore!
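
The discriminator also determines which parameters are accepted. Below is a torch-free sketch of this behavior, in which `Experiment` is a hypothetical stand-in for `Trainer`, and both configs use `extra="forbid"` (an assumption, since the `ConvCfg` above does not set it).

```python
import typing as tp

import pydantic


class ConvCfg(pydantic.BaseModel):
    model_config = pydantic.ConfigDict(extra="forbid")
    name: tp.Literal["conv"] = "conv"
    kernel: int = 5


class TransformerCfg(pydantic.BaseModel):
    model_config = pydantic.ConfigDict(extra="forbid")
    name: tp.Literal["transformer"] = "transformer"
    embeddings: int = 128


class Experiment(pydantic.BaseModel):  # hypothetical stand-in for Trainer
    model: tp.Union[ConvCfg, TransformerCfg] = pydantic.Field(..., discriminator="name")


def is_valid(model_params: dict) -> bool:
    """Check whether the parameters form a valid model config."""
    try:
        Experiment(model=model_params)
    except pydantic.ValidationError:
        return False
    return True


print(is_valid({"name": "transformer", "embeddings": 256}))  # → True
print(is_valid({"name": "transformer", "kernel": 3}))  # conv parameter → False
print(is_valid({"name": "rnn"}))  # unknown model name → False
```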

## Do it on a slurm cluster

Let's play with the most important part: launching jobs on a cluster, and caching the results.

This also works locally!

In [None]:
from pathlib import Path
tmp_path = Path("/tmp")
string = f"""
model:
  name: transformer  # specifies which model
  embeddings: 256
optimizer: SGD
infra:
  gpus_per_node: 8
  cpus_per_task: 80
  slurm_constraint: volta32gb
  folder: {tmp_path}
  cluster: auto
  slurm_partition: learnfair
  workdir:
    copied:
      - . # copies current working directory into a dedicated workdir
      # - whatever_other_file_or_folder
"""

trainer = Trainer(**yaml.safe_load(string))
with trainer.infra.job_array() as array:
    for layers in [12, 14, 15]:
        array.append(trainer.infra.clone_obj({"model.layers": layers}))
# leaving the context submits all trainings in a job array
# and is non-blocking

# show one of the slurm jobs
print(array[0].infra.job())

In [None]:
rst = array[0].run()  # run the first job

In [None]:
rst

In [None]:
ls /tmp/__main__.Trainer.run,0/

What are those folders? 

## Add hierarchical classes, and a "real-life" example

In [None]:
"""
A minimalist example with sklearn to show how to develop and explore a model with exca.
"""
import exca
import numpy as np
import pydantic
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


class Dataset(pydantic.BaseModel):
    n_samples: int = 100
    noise: float = 0.1
    random_state: int = 42
    test_size: float = 0.2
    model_config = pydantic.ConfigDict(extra="forbid")

    def get(self) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
        # Generate synthetic data
        X, y = make_regression(
            n_samples=self.n_samples,
            noise=self.noise,
            random_state=self.random_state
        )
        # Split into training and testing datasets
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, 
            test_size=self.test_size, 
            random_state=self.random_state
        )
        return X_train, X_test, y_train, y_test


class Model(pydantic.BaseModel):
    data: Dataset = Dataset()
    alpha: float = 1.0
    max_iter: int = 1000
    infra: exca.TaskInfra = exca.TaskInfra(folder='.cache/', version='v1.0')

    @infra.apply
    def score(self) -> float:
        # Get data
        X_train, X_test, y_train, y_test = self.data.get()

        # Train a Ridge regression model
        print('Fit...')
        model = Ridge(alpha=self.alpha, max_iter=self.max_iter)
        model.fit(X_train, y_train)

        # Evaluate
        print('Score...')
        y_pred = model.predict(X_test)
        mse = mean_squared_error(y_test, y_pred)
        return mse


if __name__ == "__main__":
    # Validate config
    basic_config = {"alpha": 1.0, "max_iter": 1000}
    config = exca.ConfDict(basic_config)
    model = Model(**config)
    print(model.infra.config)

    # Score
    mse = model.score()
    print(mse)

In [None]:
ls .cache/

## Update the scoring function: make it a new version!

In [None]:

class Model(pydantic.BaseModel):
    data: Dataset = Dataset()
    alpha: float = 1.0
    max_iter: int = 1000
    infra: exca.TaskInfra = exca.TaskInfra(folder='.cache/', version='v2.0')

    @infra.apply
    def score(self) -> float:
        # Get data
        X_train, X_test, y_train, y_test = self.data.get()

        # Train the model
        print('Fit...')
        # NEW VERSION: use a LinearRegression model instead of a Ridge model
        # model = Ridge(alpha=self.alpha, max_iter=self.max_iter)
        from sklearn.linear_model import LinearRegression
        model = LinearRegression()
        model.fit(X_train, y_train)

        # Evaluate
        print('Score...')
        y_pred = model.predict(X_test)
        mse = mean_squared_error(y_test, y_pred)
        return mse


if __name__ == "__main__":
    # Validate config
    basic_config = {"alpha": 1.0, "max_iter": 1000}
    config = exca.ConfDict(basic_config)
    model = Model(**config)
    print(model.infra.config)

    # Score
    mse = model.score()
    print(mse)

In [None]:
ls .cache/
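
Why bumping the version triggers a recomputation can be pictured with a toy cache keyed on both the version and the parameters. This is only a conceptual sketch using the standard library, not exca's actual implementation or folder layout. `expensive_score` and `cached_score` are hypothetical names.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

CALLS = 0  # number of times the expensive function really ran


def expensive_score(alpha: float) -> float:  # hypothetical computation
    global CALLS
    CALLS += 1
    return alpha * 2.0


def cached_score(folder: Path, version: str, alpha: float) -> float:
    # the cache key combines the version and the parameters
    key = json.dumps({"version": version, "alpha": alpha}, sort_keys=True)
    uid = hashlib.sha256(key.encode()).hexdigest()[:16]
    path = folder / uid / "result.pkl"
    if path.exists():
        return pickle.loads(path.read_bytes())  # cache hit: nothing is recomputed
    result = expensive_score(alpha)
    path.parent.mkdir(parents=True)
    path.write_bytes(pickle.dumps(result))
    return result


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    cached_score(folder, "v1.0", alpha=1.0)  # computed
    cached_score(folder, "v1.0", alpha=1.0)  # read from the cache
    calls_after_v1 = CALLS
    cached_score(folder, "v2.0", alpha=1.0)  # new version: recomputed
    calls_after_v2 = CALLS
    n_folders = len(list(folder.iterdir()))

print(calls_after_v1, calls_after_v2, n_folders)  # → 1 2 2
```

In this toy version, a new version string produces a new key and therefore a new folder, while results stored under the old version stay on disk untouched.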

## Other important topics

### MapInfra

If you want to iterate over many items and cache each item's result separately, use `MapInfra` instead of `TaskInfra`.