# Project Index

[Custom Model Notebook](../../../notebooks/custom_model.ipynb)  
[Training Notebook](../../../notebooks/train.ipynb)  
[Project Config Notebook](../../../notebooks/project_config.ipynb)  
[Forgather Notebook](../../../notebooks/forgather.ipynb)  

In [21]:
import forgather.nb.notebooks as nb

# Setting materialize to False skips constructing the configuration.
# Constructing it downloads the required dataset, so it may take a moment
# the first time it is constructed.
nb.display_project_index(config_template="", materialize=False, pp_first=False)

# Training for Fashion

This project reproduces the configuration from a PyTorch tutorial, where a simple ML model is created and trained to recognize categories of clothing from the FashionMNIST dataset.

https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html

This was chosen because it is a relatively simple project that can be kept relatively self-contained. Still, it is far more complex than the previous examples.

The model itself does not require any custom code. It's simply a stack of PyTorch layers, chained together with an nn.Sequential. If you would like to know more about the model itself, see:

https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html
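
As a quick check on the size of this model, the parameter count can be worked out from the layer shapes that appear in the configuration further down (784 -> 512 -> 512 -> 10). The following is a small arithmetic sketch in plain Python, not part of the project. The helper name `linear_params` is made up for illustration, and it assumes each Linear layer has a bias, as the printed model later in this notebook reports.

```python
# Each Linear(in, out) layer holds an (out x in) weight matrix plus `out` biases.
def linear_params(in_features: int, out_features: int) -> int:
    return in_features * out_features + out_features

# FashionMNIST images are 28 x 28 grayscale, so Flatten yields 784 values.
layer_shapes = [(28 * 28, 512), (512, 512), (512, 10)]
per_layer = [linear_params(i, o) for i, o in layer_shapes]
total_params = sum(per_layer)

print(per_layer)     # [401920, 262656, 5130]
print(total_params)  # 669706
```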

## Custom Code

While Forgather is good at assembling objects, its configuration language is not practical for defining logic. For this, we have defined a custom "trainer" class in the project's 'src' directory, and we use Forgather to dynamically import this code and inject all the required dependencies.

Unlike the previous projects, you will note that the "Modules" section is not empty and has a link to the Trainer definition.
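
To make the idea of injected dependencies concrete, here is a minimal sketch of the general shape such a class could take. It is not the project's actual Trainer (see src/trainer.py for that): the class name `SketchTrainer`, its fields, and the `step_fn` callback are hypothetical, and plain Python stand-ins replace the PyTorch objects so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SketchTrainer:
    # Hypothetical stand-in: every collaborator arrives through the constructor.
    batches: Iterable[float]
    step_fn: Callable[[float], float]  # computes a "loss" for one batch
    epochs: int = 1
    logging_steps: int = 100

    def __call__(self) -> List[float]:
        """Run the loop; calling the instance starts training."""
        losses = []
        step = 0
        for _ in range(self.epochs):
            for batch in self.batches:
                losses.append(self.step_fn(batch))
                step += 1
                if step % self.logging_steps == 0:
                    print(f"step {step}: loss={losses[-1]:.3f}")
        return losses


# The caller decides which pieces to plug in; the class only runs the loop.
trainer = SketchTrainer(
    batches=[1.0, 2.0, 3.0],
    step_fn=lambda b: b * 0.5,
    epochs=2,
    logging_steps=3,
)
losses = trainer()
```

Because the class never constructs its own collaborators, swapping one of them (a different optimizer, for example) only requires a change in the configuration, not in the class.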

## Project Structure

Like the previous example, this project makes use of template inheritance, where there is a common 'project.yaml' file from which all of the configurations are derived.

The template provides the basic structure, with 'blocks' which may be substituted or extended by child templates. We use this functionality in the configurations to override various components of the base configuration.
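
The effect of block substitution can be modeled in a few lines of plain Python. This is only an analogy, assuming a block behaves like a named slot whose child definition shadows the parent's. It does not use Forgather's template engine, it covers substitution but not extension, and the block names and values are hypothetical.

```python
from collections import ChainMap

# Parent template: a default definition for every block.
base_blocks = {
    "optimizer": "torch.optim:SGD",
    "loss_fn": "torch.nn:CrossEntropyLoss",
}

# Child template: overrides only the block it wants to change.
child_blocks = {"optimizer": "torch.optim:Adam"}

# Lookups try the child first, then fall back to the parent.
resolved = dict(ChainMap(child_blocks, base_blocks))
print(resolved)
```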

This is still a relatively simple project, as it does not reference any external template libraries. We will get to that in the next example.

## Code Generation

Note the output of the code generator. It has detected the inclusion of a dynamic import, thus it has automatically defined a function for importing dynamic modules.

Also note that it knows how to translate some of the rather clunky expressions from the original YAML file, like calling a method, into relatively clean Python code.
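
Two of these translations can be reproduced without Forgather. The sketch below is plain Python with a made-up `Layer` class standing in for a PyTorch module. It shows that the getattr-then-call pair from the YAML is equivalent to an ordinary method call, and that a factory (emitted as a lambda in the generated code) yields a fresh object on each use while a singleton is built once and shared.

```python
class Layer:
    """Made-up stand-in for a module with a parameters() method."""

    def parameters(self):
        return ["weight", "bias"]


# Singleton: constructed once; every reference sees the same object.
model = Layer()

# Factory: a callable that builds a new object on every use.
activation_fn = lambda: Layer()

# YAML form: call(getattr(model, "parameters")) ...
via_getattr = getattr(model, "parameters")()
# ... which the generator writes as a plain method call.
via_method = model.parameters()

same_result = via_getattr == via_method
distinct_instances = activation_fn() is not activation_fn()
```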

---



#### Project Directory: "/home/dinalt/ai_assets/forgather/tutorials/basic_projects/project_gamma"

## Meta Config
Meta Config: [/home/dinalt/ai_assets/forgather/tutorials/basic_projects/project_gamma/meta.yaml](meta.yaml)

- [meta.yaml](meta.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/tutorials/basic_projects/project_gamma/templates](templates)

## Available Configurations
- [wider.yaml](templates/experiments/wider.yaml)
- [deeper.yaml](templates/experiments/deeper.yaml)
- [adam.yaml](templates/experiments/adam.yaml)
- [baseline.yaml](templates/experiments/baseline.yaml)

Default Configuration: baseline.yaml

Active Configuration: baseline.yaml

## Available Templates
- [experiments/adam.yaml](templates/experiments/adam.yaml)
- [experiments/baseline.yaml](templates/experiments/baseline.yaml)
- [experiments/deeper.yaml](templates/experiments/deeper.yaml)
- [experiments/wider.yaml](templates/experiments/wider.yaml)
- [formatting.yaml](templates/formatting.yaml)
- [project.yaml](templates/project.yaml)

## Included Templates
- [experiments/baseline.yaml](templates/experiments/baseline.yaml)
    - [project.yaml](templates/project.yaml)
        - [formatting.yaml](templates/formatting.yaml)
### Config Metadata:

```python
{'batch_size': 64,
 'citation': 'https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html',
 'description': 'Base configuration, based on Torch tutorial parameters.',
 'epochs': 5,
 'hidden_dim': 512,
 'logging_steps': 100,
 'lr': 0.001,
 'name': 'Fashion MNIST Trainer'}

```

## Modules
- [./src/trainer.py](src/trainer.py) : Trainer
    - [/home/dinalt/ai_assets/forgather/tutorials/basic_projects/project_gamma/./src/trainer.py](src/trainer.py) : trainer
## Preprocessed Config

```yaml

#---------------------------------------
#          Fashion MNIST Trainer         
#---------------------------------------
# 2024-08-12T06:16:12
# Description: Base configuration, based on Torch tutorial parameters.
# Project Dir: /home/dinalt/ai_assets/forgather/tutorials/basic_projects/project_gamma
# Citation: https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html
#---------------------------------------


############# Config Vars ##############

# ns.hidden_dim = 512
# ns.epochs = 5
# ns.batch_size = 64
# ns.lr = 0.001
# ns.logging_steps = 100



########### Model Definition ###########

.define: &activation_fn !factory:torch.nn:ReLU@activation_fn []

.define: &model !singleton:torch.nn:Sequential@model
    - !factory:torch.nn:Flatten []
    - !factory:torch.nn:Linear [ 784, 512 ]
    - *activation_fn
    - !factory:torch.nn:Linear [ 512, 512 ]
    - *activation_fn
    - !factory:torch.nn:Linear [ 512, 10 ]

############### Dataset ################

.define: &transform !factory:torchvision.transforms:ToTensor@transform []

.define: &training_data !singleton:torchvision.datasets:FashionMNIST
    root: "data"
    train: True
    download: True
    transform: *transform

.define: &test_data !singleton:torchvision.datasets:FashionMNIST
    root: "data"
    train: False
    download: True
    transform: *transform

.define: &train_dataloader !singleton:torch.utils.data:DataLoader
    args: [ *training_data ]
    kwargs: { batch_size: 64 }

.define: &test_dataloader !singleton:torch.utils.data:DataLoader
    args: [ *test_data ]
    kwargs: { batch_size: 64 }

############### Trainer ################

.define: &model_params !singleton:call [ !singleton:getattr [ *model, "parameters" ] ]

# **Optimizer**

.define: &optimizer !singleton:torch.optim:SGD
    args: [ *model_params ]
    kwargs:
        lr: 0.001

# **Loss Function**

.define: &loss_fn !singleton:torch.nn:CrossEntropyLoss []

# **Trainer**

.define: &trainer !singleton:./src/trainer.py:Trainer@trainer
    train_dataloader: *train_dataloader
    test_dataloader: *test_dataloader
    model: *model
    loss_fn: *loss_fn
    optimizer: *optimizer
    epochs: 5
    batch_size: 64
    logging_steps: 100

################ Output ################

meta:
    name: "Fashion MNIST Trainer"
    description: "Base configuration, based on Torch tutorial parameters."
    citation: "https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html"
    hidden_dim: 512
    epochs: 5
    batch_size: 64
    logging_steps: 100
    lr: 0.001

main:
    model: *model
    trainer: *trainer

```

## Loaded Configuration to YAML

```yaml
.define: &activation_fn !factory:torch.nn:ReLU@activation_fn []

.define: &model !singleton:torch.nn:Sequential@model
    - !factory:torch.nn:Flatten []
    - !factory:torch.nn:Linear
        - 784
        - 512
    - *activation_fn
    - !factory:torch.nn:Linear
        - 512
        - 512
    - *activation_fn
    - !factory:torch.nn:Linear
        - 512
        - 10

.define: &transform !factory:torchvision.transforms:ToTensor@transform []

.define: &trainer !singleton:./src/trainer.py:Trainer@trainer
    train_dataloader: !singleton:torch.utils.data:DataLoader
        args:
            - !singleton:torchvision.datasets:FashionMNIST
                root: 'data'
                train: True
                download: True
                transform: *transform
        kwargs:
            batch_size: 64
    test_dataloader: !singleton:torch.utils.data:DataLoader
        args:
            - !singleton:torchvision.datasets:FashionMNIST
                root: 'data'
                train: False
                download: True
                transform: *transform
        kwargs:
            batch_size: 64
    model: *model
    loss_fn: !singleton:torch.nn:CrossEntropyLoss []
    optimizer: !singleton:torch.optim:SGD
        args:
            - !singleton:call
                - !singleton:getattr
                    - *model
                    - 'parameters'
        kwargs:
            lr: 0.001
    epochs: 5
    batch_size: 64
    logging_steps: 100


meta: 
    name: 'Fashion MNIST Trainer'
    description: 'Base configuration, based on Torch tutorial parameters.'
    citation: 'https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html'
    hidden_dim: 512
    epochs: 5
    batch_size: 64
    logging_steps: 100
    lr: 0.001
main: 
    model: *model
    trainer: *trainer

```

### Generated Source Code

```python
from torch.nn import Linear
from torch.nn import Flatten
from torch.nn import CrossEntropyLoss
from torch.nn import Sequential
from torch.utils.data import DataLoader
from torch.optim import SGD
from torch.nn import ReLU
from torchvision.transforms import ToTensor
from torchvision.datasets import FashionMNIST
from importlib.util import spec_from_file_location, module_from_spec
import os
import sys

# Import a dynamic module.
def dynimport(module, name, searchpath):
    module_path = module
    module_name = os.path.basename(module).split(".")[0]
    module_spec = spec_from_file_location(
        module_name,
        module_path,
        submodule_search_locations=searchpath,
    )
    mod = module_from_spec(module_spec)
    sys.modules[module_name] = mod
    module_spec.loader.exec_module(mod)
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod

Trainer = lambda: dynimport("./src/trainer.py", "Trainer", [])

def construct(
):
    activation_fn = lambda: ReLU()

    model = Sequential(
        Flatten(),
        Linear(
            784,
            512,
        ),
        activation_fn(),
        Linear(
            512,
            512,
        ),
        activation_fn(),
        Linear(
            512,
            10,
        ),
    )

    transform = lambda: ToTensor()

    trainer = Trainer()(
        train_dataloader=DataLoader(
            FashionMNIST(
                root='data',
                train=True,
                download=True,
                transform=transform(),
            ),
            batch_size=64,
        ),
        test_dataloader=DataLoader(
            FashionMNIST(
                root='data',
                train=False,
                download=True,
                transform=transform(),
            ),
            batch_size=64,
        ),
        model=model,
        loss_fn=CrossEntropyLoss(),
        optimizer=SGD(
            model.parameters(),
            lr=0.001,
        ),
        epochs=5,
        batch_size=64,
        logging_steps=100,
    )
    
    return {
        'meta': {
            'name': 'Fashion MNIST Trainer',
            'description': 'Base configuration, based on Torch tutorial parameters.',
            'citation': 'https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html',
            'hidden_dim': 512,
            'epochs': 5,
            'batch_size': 64,
            'logging_steps': 100,
            'lr': 0.001,
        },
        'main': {
            'model': model,
            'trainer': trainer,
        },
    }

```
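
The dynamic import helper in the generated code can be tried in isolation. The sketch below repeats that helper and points it at a throwaway module written to a temporary directory. The file name `sketch_plugin.py` and the `Greeter` class are hypothetical and exist only for this demonstration.

```python
import os
import sys
import tempfile
from importlib.util import spec_from_file_location, module_from_spec


def dynimport(module, name, searchpath):
    # Same logic as the generated helper: load a file as a module,
    # register it, then walk the dotted name to the requested symbol.
    module_name = os.path.basename(module).split(".")[0]
    module_spec = spec_from_file_location(
        module_name,
        module,
        submodule_search_locations=searchpath,
    )
    mod = module_from_spec(module_spec)
    sys.modules[module_name] = mod
    module_spec.loader.exec_module(mod)
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sketch_plugin.py")
    with open(path, "w") as f:
        f.write("class Greeter:\n    def hello(self):\n        return 'hello'\n")
    # Returns the class object itself; instantiate it like any other class.
    Greeter = dynimport(path, "Greeter", [])
    greeting = Greeter().hello()

print(greeting)
```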



## Construct Baseline Configuration

In [12]:
from forgather.project import Project
from forgather.dotdict import DotDict
from pprint import pp

# Load default baseline config
proj = Project(config_name="")

output = proj()
pp(output)

# Wrap the main output in DotDict for easier to read/type syntax by making dictionary keys look like attributes.
config = DotDict(output['main'])

{'meta': {'name': 'Fashion MNIST Trainer',
          'description': 'Base configuration, based on Torch tutorial '
                         'parameters.',
          'citation': 'https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html',
          'hidden_dim': 512,
          'epochs': 5,
          'batch_size': 64,
          'logging_steps': 100,
          'lr': 0.001},
 'main': {'model': Sequential(
  (0): Flatten(start_dim=1, end_dim=-1)
  (1): Linear(in_features=784, out_features=512, bias=True)
  (2): ReLU()
  (3): Linear(in_features=512, out_features=512, bias=True)
  (4): ReLU()
  (5): Linear(in_features=512, out_features=10, bias=True)
),
          'trainer': Trainer(train_dataloader=<torch.utils.data.dataloader.DataLoader object at 0x7fb14c7a71f0>, test_dataloader=<torch.utils.data.dataloader.DataLoader object at 0x7fb14c7a6e00>, model=Sequential(
  (0): Flatten(start_dim=1, end_dim=-1)
  (1): Linear(in_features=784, out_features=512, bias=True)
  (2): ReLU()
  

## Train Model

The trainer is started by simply calling it.

In [None]:
config.trainer()
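
The attribute-style access used here (config.trainer rather than config['trainer']) can be approximated with a small dict subclass. This is a minimal sketch of the idea, not the implementation in forgather.dotdict. The class name `AttrDict` and the placeholder values are hypothetical.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value


# Placeholder output with the same 'main' layout as the project's result.
output = {"main": {"model": "model-object", "trainer": lambda: "training started"}}
config = AttrDict(output["main"])
result = config.trainer()
```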

## Construct and Train with Adam Optimizer

This goes straight from configuration to training, using the 'adam.yaml' configuration, where we have replaced the SGD optimizer with Adam.

Take a look at the actual config definition to see what changes were required to accomplish this.

In [None]:
Project(config_name="adam.yaml")()['main']['trainer']()

## Run all Project Configurations

You can load only the meta-config and use it to find and iterate over all configurations in the project.

In [None]:
from forgather.meta_config import MetaConfig

meta = MetaConfig()
for config_name, _ in meta.find_templates(meta.config_prefix):
    proj = Project(config_name=config_name)
    print(f"{ ' Starting ' + proj.config_name + ' ':-^60}")
    proj()['main']['trainer']()