# Configuration Notebook
Useful for debugging configurations and viewing project configuration details.

## Setup
Configure defaults and select a project.

In [1]:
# Set defaults
#default_projects_directory = '/home/dinalt/ai_assets/projects/experiments'
default_projects_directory = '../examples/trainers'
default_project = "dynamic_models"
config_template = ""

from ipyfilechooser import FileChooser
import os
fc = FileChooser(
    os.path.join(default_projects_directory, default_project), show_only_dirs=True,
    title="Select a Project Directory", select_default=True)
display(fc)

FileChooser(path='/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models', filename='', title='Sele…

## Project Info

This cell loads the import dependencies and the project meta-data from the selected project directory.

In [3]:
import sys, os
modules_path = os.path.join('..', 'src')
if modules_path not in sys.path: sys.path.insert(0, modules_path)
from pprint import pformat, pp
from IPython import display as ds
from forgather import Latent
from forgather.config import ConfigEnvironment
from forgather.codegen import generate_code
from aiws.config import preprocessor_globals, MetaConfig
import aiws.notebooks as nb

assert os.path.exists(fc.selected_path), "Project directory does not exist."
nb.show_project_readme(fc.selected_path)

# Get meta-config for project
meta = MetaConfig(fc.selected_path)

nb.display_meta(meta, "### Meta Config\n")
nb.list_templates(meta.find_templates(meta.config_prefix), "### Available Configurations\n")

# Get default config for project
default_config = meta.default_config()
print('-' * 60)
print(f"Default Configuration: {default_config}")

# Get the full name of the selected config template in the template name-space.
# If empty, meta.config_path() will return the default template path.
config_template_path = meta.config_path(config_template)
print(f"Selected Template Name: {config_template_path}")

## Dynamic Models

This is a demonstration of how to perform model architecture experiments by using the configuration system to dynamically change module types.

As in most of the examples, we use "Tiny Causal" as a baseline, then make various changes for comparison.

### Common Configuration
- Tokenizer: tokenizers/tiny_2k_bpe.yaml
    - Vocabulary Size: 2000
    - Maximum Model Sequence: 2048
- Dataset: datasets/tiny/tiny_stories_abridged.yaml
    - Dataset ID: roneneldan/TinyStories
    - Reference: https://arxiv.org/abs/2305.07759
    - Train Select Range: 10% 
- Model:
    - Model Dimension: 256
    - MLP Dimension: 1024
    - Layers: 4
    - Heads: 2
    - All Dropout Probabilities: 0.0
- Trainer:
    - Class: aiws.trainer.Trainer
    - Epochs: 1
    - Initial Learning Rate: 1.0e-3
    - Train Batch Size: 32
    - LR Scheduler: Cosine
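
As a rough sanity check on the model size implied by these settings, the sketch below estimates the trainable parameter count. It assumes untied input and output embeddings, bias terms on every linear layer, two LayerNorms per layer, and a positional encoder with no learned parameters. The configuration summary confirms none of these details, so treat the result as an estimate rather than the exact size of the model.

```python
def estimate_params(vocab_size, d_model, d_ff, n_layers):
    """Rough trainable-parameter estimate for a small decoder-only transformer.

    Assumptions (not taken from the project): untied embeddings, biases on all
    linear layers, two LayerNorms per layer, no learned positional parameters.
    """
    embedding = vocab_size * d_model
    output = d_model * vocab_size + vocab_size
    attention = 4 * (d_model * d_model + d_model)  # Q, K, V and output projections
    feedforward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    layer_norms = 2 * (2 * d_model)  # weight and bias for each norm
    per_layer = attention + feedforward + layer_norms
    return embedding + output + n_layers * per_layer

total = estimate_params(vocab_size=2000, d_model=256, d_ff=1024, n_layers=4)
print(f"{total:,} parameters (estimate)")
```

Under these assumptions the estimate comes to roughly 4.2 million parameters. The number of attention heads does not change the count, since the heads split the same projection matrices.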

### Meta Config
Project Directory: /home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models

Meta Config: [/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/meta.yaml](../examples/trainers/dynamic_models/meta.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/templates](../examples/trainers/dynamic_models/templates)
- [/home/dinalt/ai_assets/forgather/templates](../templates)


### Available Configurations
- [pre_ln.yaml](../examples/trainers/dynamic_models/templates/experiments/pre_ln.yaml)
- [walsh_pe.yaml](../examples/trainers/dynamic_models/templates/experiments/walsh_pe.yaml)
- [relu-glu.yaml](../examples/trainers/dynamic_models/templates/experiments/relu-glu.yaml)
- [swish.yaml](../examples/trainers/dynamic_models/templates/experiments/swish.yaml)
- [swi-glu.yaml](../examples/trainers/dynamic_models/templates/experiments/swi-glu.yaml)
- [control.yaml](../examples/trainers/dynamic_models/templates/experiments/control.yaml)


------------------------------------------------------------
Default Configuration: control.yaml
Selected Template Name: experiments/control.yaml


## List Available Templates
This lists all templates found on the template search path.

In [None]:
def list_templates(prefix):
    nb.list_templates(meta.find_templates(prefix), "### Templates\n")
list_templates('')

### Show Referenced Templates

This cell uses `environment`, which is created in the configuration environment cell further down; run that cell first.

In [None]:
nb.display_referenced_templates_tree(environment, config_template_path, "### Included Templates\n")

## Init Config Environment

In [4]:
# Create configuration environment
environment = ConfigEnvironment(
    searchpath=meta.searchpath,
    global_vars=preprocessor_globals(fc.selected_path),
)

## Preprocess Configuration
Not required, but can be useful for diagnostics prior to YAML parsing.

In [None]:
pp_config = environment.preprocess(config_template_path)
display(ds.Markdown(f"#### Preprocessed Config\n" f"```yaml\n{pp_config}\n```\n"))

## Load Configuration

This preprocesses the template and parses the resulting YAML in a single step, returning both the node graph and the preprocessed config.

In [5]:
config, pp_config = environment.load(config_template_path).get()

### Show Referenced Source Files

Show referenced sub-modules within the same package.  
For accurate results, the configuration must be instantiated.

In [None]:
nb.display_referenced_source_list(config, "### Included Sources\n")

### Render Configuration as YAML

Note: The configuration graph is language independent. This merely translates the graph to YAML.

In [None]:
display(ds.Markdown(f"### Loaded Configuration\n```yaml\n{Latent.to_yaml(config)}\n```"))

### Render Configuration as Python

This will display the configuration graph as Python code. This even works correctly for recursively generated code.

While the renderer tries to faithfully generate code which is identical to what Latent.materialize(config) would do, it's an interpretation and may not always produce identical results.

One known issue is that LambdaNodes that take arguments are not rendered correctly. They work fine with 'materialize()', but the lambdas in the generated code don't accept arguments from their caller. While fixable, doing so is fairly complicated, and the author lacks an infinite supply of time.
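
The limitation can be illustrated with plain Python, independent of Forgather. In this hypothetical sketch, `make_layer` stands in for an object built by a lambda node that expects an argument from its caller. The first factory forwards that argument, while the second mirrors the shape of the generated code: a zero-argument lambda. The exact failure in real generated code may differ.

```python
def make_layer(width, scale=1):
    # Hypothetical stand-in for an object built by a lambda node.
    return {"width": width * scale}

# What materialization effectively provides: the caller's argument is forwarded.
forwarding_factory = lambda scale: make_layer(width=256, scale=scale)

# The shape of the generated code: the lambda takes no arguments.
zero_arg_factory = lambda: make_layer(width=256)

forwarded = forwarding_factory(2)

try:
    zero_arg_factory(2)
    error = None
except TypeError as exc:
    # Passing an argument to a zero-argument lambda fails at call time.
    error = exc

print(forwarded, type(error).__name__)
```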


In [6]:
generated_code = generate_code(config)
display(ds.Markdown(f"### Generated Source Code\n```python\n{generated_code}\n```"))

### Generated Source Code
```python
from aiws.datasets import tokenize_dataset
from aiws.json_logger import JsonLogger
from aiws.tb_logger import TBLogger
from torch.utils.tensorboard import SummaryWriter
from aiws.distributed import DistributedEnvironment
from aiws.trainer import Trainer
from aiws.training_script import TrainingScript
from aiws.construct import write_file
from aiws.textgen_callback import TextgenCallback
from aiws.trainer_types import TrainingArguments
from datasets import load_dataset
from aiws.construct import dependency_list
from aiws.construct import copy_package_files
from transformers import DataCollatorForLanguageModeling
from aiws.construct import load_from_config
from importlib.util import spec_from_file_location, module_from_spec
import os
import sys

# Import a dynamic module.
def dynimport(module, name, searchpath):
    module_path = module
    module_name = os.path.basename(module).split(".")[0]
    module_spec = spec_from_file_location(
        module_name,
        module_path,
        submodule_search_locations=searchpath,
    )
    mod = module_from_spec(module_spec)
    sys.modules[module_name] = mod
    module_spec.loader.exec_module(mod)
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod

DynamicCasualLM = lambda: dynimport("/home/dinalt/ai_assets/forgather/model_src/dynamic_causal_lm.py", "DynamicCasualLM", ['/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal', '/home/dinalt/ai_assets/forgather/model_src/bits'])
DynamicCausalLMConfig = lambda: dynimport("/home/dinalt/ai_assets/forgather/model_src/dynamic_causal_lm.py", "DynamicCausalLMConfig", ['/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal', '/home/dinalt/ai_assets/forgather/model_src/bits'])

def construct(
    pp_config,
):
    meta = {
        'config_name': 'Control',
        'config_description': 'Tiny Causal; the baseline control',
        'project_dir': '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models',
        'models_dir': '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models',
        'tokenizers_dir': '/home/dinalt/ai_assets/forgather/tokenizers',
        'datasets_dir': '/home/dinalt/ai_assets/forgather/datasets',
        'output_dir': '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal',
        'model_src_dir': '/home/dinalt/ai_assets/forgather/model_src',
        'logging_dir': '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal/runs/control_2024-08-05T23-38-38',
        'create_new_model': 'True',
        'save_model': 'False',
        'train': 'True',
        'eval': 'False',
    }

    distributed_env = DistributedEnvironment()

    tokenizer = load_from_config(
        project_dir='/home/dinalt/ai_assets/forgather/examples/tokenizers/tiny_stories_bpe',
        config_template='2k.yaml',
    )

    model_code_writer = write_file(
        data=(
            'from torch.nn import Linear\n'
            'from torch.nn import LayerNorm\n'
            'from .init_weights import InitWeights\n'
            'from .causal_layer_stack import CausalLayerStack\n'
            'from .causal_lm import CasualLM\n'
            'from .sinusoidal_pe import SinusoidalPE\n'
            'from .input_encoder import InputEncoder\n'
            'from .causal_multihead_attn import CausalMultiheadAttn\n'
            'from .causal_loss import CausalLoss\n'
            'from .post_ln_layer import PostLNLayer\n'
            'from .feedforward_layer import FeedforwardLayer\n'
            '\n'
            'def construct_model(\n'
            '    num_hidden_layers,\n'
            '    initializer_range,\n'
            '    dim_feedforward,\n'
            '    num_attention_heads,\n'
            '    embedding_dropout,\n'
            '    hidden_size,\n'
            '    vocab_size,\n'
            '    layer_dropout,\n'
            '    activation_dropout,\n'
            '    attention_dropout,\n'
            '    max_sequence_length,\n'
            '    residual_dropout,\n'
            '    **kwargs\n'
            '):\n'
            '    loss_fn_factory = lambda: CausalLoss()\n'
            '\n'
            '    positional_encoder_factory = lambda: SinusoidalPE(\n'
            '        d_model=hidden_size,\n'
            '        max_sequence_length=max_sequence_length,\n'
            '    )\n'
            '\n'
            '    input_encoder_factory = lambda: InputEncoder(\n'
            '        d_model=hidden_size,\n'
            '        vocab_size=vocab_size,\n'
            '        dropout=embedding_dropout,\n'
            '        positional_encoder=positional_encoder_factory(),\n'
            '    )\n'
            '\n'
            '    output_decoder_factory = lambda: Linear(\n'
            '        hidden_size,\n'
            '        vocab_size,\n'
            '    )\n'
            '\n'
            '    feedforward_factory = lambda: FeedforwardLayer(\n'
            '        d_model=hidden_size,\n'
            '        d_feedforward=dim_feedforward,\n'
            '        dropout=activation_dropout,\n'
            '    )\n'
            '\n'
            '    attention_factory = lambda: CausalMultiheadAttn(\n'
            '        d_model=hidden_size,\n'
            '        num_heads=num_attention_heads,\n'
            '        dropout=attention_dropout,\n'
            '    )\n'
            '\n'
            '    layer_norm_factory = lambda: LayerNorm(\n'
            '        normalized_shape=hidden_size,\n'
            '    )\n'
            '\n'
            '    layer_factory = lambda: PostLNLayer(\n'
            '        feedforward=feedforward_factory(),\n'
            '        attention=attention_factory(),\n'
            '        norm1=layer_norm_factory(),\n'
            '        norm2=layer_norm_factory(),\n'
            '        dropout=layer_dropout,\n'
            '        residual_dropout=residual_dropout,\n'
            '    )\n'
            '\n'
            '    layer_stack_factory = lambda: CausalLayerStack(\n'
            '        layer_factory=layer_factory,\n'
            '        num_hidden_layers=num_hidden_layers,\n'
            '    )\n'
            '\n'
            '    init_weights_factory = lambda: InitWeights(\n'
            '        std=initializer_range,\n'
            '    )\n'
            '\n'
            '    model_factory = CasualLM(\n'
            '        loss_fn=loss_fn_factory(),\n'
            '        input_encoder=input_encoder_factory(),\n'
            '        output_decoder=output_decoder_factory(),\n'
            '        layer_stack=layer_stack_factory(),\n'
            '        init_weights=init_weights_factory(),\n'
            '    )\n'
            '    \n'
            '    return model_factory'
        ),
        output_file='/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal/model_factory.py',
        return_value='Model constructor generated by Forgather 1.0',
    )

    model_config = DynamicCausalLMConfig()(
        auto_map={
            'AutoConfig': 'dynamic_causal_lm.DynamicCausalLMConfig',
            'AutoModel': 'dynamic_causal_lm.DynamicCasualLM',
        },
        vocab_size=len(
            tokenizer,
        ),
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        code_generator=model_code_writer,
        hidden_size=256,
        num_attention_heads=2,
        num_hidden_layers=4,
        max_sequence_length=tokenizer.model_max_length,
        dim_feedforward=1024,
        initializer_range=0.02,
        embedding_dropout=0.0,
        layer_dropout=0.0,
        residual_dropout=0.0,
        attention_dropout=0.0,
        activation_dropout=0.0,
    )

    pretrained_model = DynamicCasualLM()(
        model_config,
    )

    model = dependency_list(
        pretrained_model,
        copy_package_files(
            '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal',
            model_config,
        ),
        copy_package_files(
            '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal',
            pretrained_model,
        ),
    )

    trainer_args = TrainingArguments(
        output_dir='/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal',
        logging_dir='/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal/runs/control_2024-08-05T23-38-38',
        overwrite_output_dir=True,
        per_device_train_batch_size=32,
        per_device_eval_batch_size=64,
        learning_rate=0.001,
        num_train_epochs=1,
        eval_steps=500,
        logging_steps=100,
        eval_strategy='steps',
        save_strategy='no',
        logging_strategy='steps',
        lr_scheduler_type='cosine',
    )

    data_collator = DataCollatorForLanguageModeling(
        tokenizer,
        mlm=False,
        return_tensors='pt',
    )

    train_source_dataset = load_dataset(
        'roneneldan/TinyStories',
    )

    train_dataset = tokenize_dataset(
        dataset=train_source_dataset['train'],
        tokenizer=tokenizer,
        select_range=0.1,
        desc='Tokenizing train',
        fn_kwargs={
            'truncation': True,
        },
    )

    eval_dataset = tokenize_dataset(
        dataset=train_source_dataset['validation'],
        tokenizer=tokenizer,
        select_range=500,
        desc='Tokenizing validation split',
        fn_kwargs={
            'truncation': True,
        },
    )

    testprompts = [
        'Alice was so tired when she got back home so she went',
        'Jack and Lily liked to watch the moon at night. They noticed that the moon changed its shape every night. Sometimes the moon was big and round, and sometimes it was',
        'Jack and Lily saw a rainbow after a rainy day.They were amazed by the colors. Jack said, "Look, Lily. A rainbow has',
        'Jack wanted to read a book, so he went to',
        '"Can cows fly?" Alice asked her mother.',
        '"What do birds like to eat?" Tom asked his mother.',
        '"What language do they speak in France?" Tom asked his mother.',
        'If I throw a ball up in the air, eventually it will',
        'It was winter and cold outside so his mother told him, "You should',
        'Lily likes cats and dogs. She asked her mom for a dog and her mom said no, so instead she asked',
        'Jack told Mary, "If you give me your banana, I\'ll give you my apple." Mary gave Jack her Banana, so',
        'On weekends Jack went to visit his grandmother whereas on weekdays he would go to school. Last weekend, when Jack was on his way to',
        'Lily and Ben were having an argument. Ben said that cake is much better than ice cream and Lily said that',
        'Lily and Ben are having an argument. They are trying to decide between the park and the swimming pool. Ben says, "I want to go to the park". Lily says',
        "Jack's mother was not home, and his father was at home. When Jack came home, he said hello to",
        "Lily doesn't like swimming. When her father wants to take her to the swimming pool, she says",
        'Both Ben and Lily wanted cake. Father said that there was only one piece of cake left. They',
        'Ben went to visit Lily in her house, but she was not at home. Ben knocked on the door,',
    ]

    generation_config = {
        'identity': 'generation_config',
        'do_sample': True,
        'top_k': 20,
        'top_p': 0.9,
        'temperature': 0.7,
        'repitition_penalty': 1.15,
    }

    trainer_callbacks = [
        JsonLogger(
            date='2024-08-05T23:38:38',
            name='Control',
            description='Tiny Causal; the baseline control',
            config=pp_config,
            versions={
                'python': '3.10.13',
                'torch': '2.3.1',
                'transformers': '4.41.2',
                'accelerate': '0.31.0',
            },
        ),
        TBLogger(
            SummaryWriter(
                '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal/runs/control_2024-08-05T23-38-38',
            ),
            date='2024-08-05T23:38:38',
            name='Control',
            description='Tiny Causal; the baseline control',
            config=pp_config,
            versions={
                'python': '3.10.13',
                'torch': '2.3.1',
                'transformers': '4.41.2',
                'accelerate': '0.31.0',
            },
        ),
        TextgenCallback(
            summary_writer=SummaryWriter(
                '/home/dinalt/ai_assets/forgather/examples/trainers/dynamic_models/output_models/tiny_causal/runs/control_2024-08-05T23-38-38',
            ),
            prompts=testprompts,
            generation_config=generation_config,
            max_new_tokens=40,
            generation_steps=2000,
        ),
    ]

    trainer = Trainer(
        model=model,
        args=trainer_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        tokenizer=tokenizer,
        callbacks=trainer_callbacks,
    )

    training_script = TrainingScript(
        meta=meta,
        do_save=False,
        do_train=True,
        do_eval=False,
        distributed_env=distributed_env,
        trainer=trainer,
    )
    
    return {
        'meta': meta,
        'main': training_script,
    }
```
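
The `dynimport` helper in the generated code relies on `importlib.util.spec_from_file_location` and `module_from_spec`. The following sketch applies the same pattern to a throwaway module written to a temporary directory. The file name `toy_module.py`, its `Greeter` class, and the helper name `import_symbol` are made up for the illustration.

```python
import os
import sys
import tempfile
from importlib.util import spec_from_file_location, module_from_spec

def import_symbol(module_path, name):
    """Load a module from a file path and return the named attribute."""
    module_name = os.path.splitext(os.path.basename(module_path))[0]
    spec = spec_from_file_location(module_name, module_path)
    mod = module_from_spec(spec)
    # Register before executing so imports inside the module can resolve it.
    sys.modules[module_name] = mod
    spec.loader.exec_module(mod)
    # Walk dotted names, e.g. "Outer.Inner".
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_module.py")
    with open(path, "w") as f:
        f.write("class Greeter:\n    def hello(self):\n        return 'hello'\n")
    Greeter = import_symbol(path, "Greeter")
    greeting = Greeter().hello()

print(greeting)
```

Unlike the generated helper, this sketch omits `submodule_search_locations`. The generated code passes it, presumably so that the relative imports in the model source can be resolved against the listed directories.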

## Materialized Configuration

Instantiate the configuration from the definition.
This loads all of the referenced modules and instantiates the main output. Some configurations will run preprocessing when loaded, so this can take a moment.

And don't run this if you don't trust the source of the configuration!
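
To make the idea of materializing a definition concrete, here is a toy sketch of a lazily evaluated node graph. It is not Forgather's implementation. The `Node` class and `materialize` function below are hypothetical and only show why instantiating a configuration can run arbitrary code: every node's target is called while the graph is walked.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Node:
    """Hypothetical deferred call: a callable plus the arguments to pass it."""
    target: Callable[..., Any]
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

def materialize(obj):
    """Recursively replace nodes with the results of calling their targets."""
    if isinstance(obj, Node):
        args = [materialize(a) for a in obj.args]
        kwargs = {k: materialize(v) for k, v in obj.kwargs.items()}
        return obj.target(*args, **kwargs)
    if isinstance(obj, dict):
        return {k: materialize(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(materialize(v) for v in obj)
    return obj

# Nothing is constructed until materialize() walks the graph.
graph = {
    "main": Node(dict, kwargs={"layers": Node(list, args=(Node(range, args=(4,)),))}),
}
built = materialize(graph)
print(built)
```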

In [None]:
#from loguru import logger
#logger.enable("forgather.latent")

config, pp_config = environment.load(meta.config_path(config_template)).get()

# Note: We inject the pre-processed config as an argument, which can then be used to log this information.
main_output = Latent.materialize(config, pp_config=pp_config)['main']
pp(main_output)

## Execute Generated Code

This executes the generated code and calls 'construct()', the default factory function, to instantiate the configuration.

In theory, the output should be identical to calling Latent.materialize(config), but there are known differences (see the section on code generation for details).

In [None]:
exec(generated_code)
main_output = construct(pp_config=pp_config)['main']
pp(main_output)
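
Calling `exec()` with no namespace argument defines `construct` and every import directly in the notebook's globals. A variation, sketched here with a small stand-in string rather than the real generated code, executes the source in a dedicated dictionary and looks up the factory from it. The name `toy_source` and its contents are invented for the example.

```python
# Stand-in for generated_code; the real string comes from generate_code(config).
toy_source = '''
def construct(pp_config):
    return {
        "meta": {"config_name": "toy"},
        "main": f"built from {len(pp_config)} chars",
    }
'''

namespace = {}
# Only execute source you trust: exec runs arbitrary Python.
exec(toy_source, namespace)

result = namespace["construct"](pp_config="a: 1")
print(result["main"])
```

Keeping the executed names in their own dictionary avoids overwriting notebook variables such as `meta`, and makes it easy to discard everything afterwards.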

### Run Configuration

Assuming that the output object has a 'run' method (training scripts do), the following will run it.

For a more robust approach, see: [train.ipynb](train.ipynb)

In [None]:
main_output.run()

### Cleanup
Note: These will show the target directory and ask for confirmation before proceeding.

#### Delete All

In [None]:
nb.delete_dir(config.meta['models_dir'], "Delete all models in project")

#### Delete Configuration Output Directory
This will delete the model and logs for the current configuration.

In [None]:
nb.delete_dir(config.meta['output_dir'], "Delete output directory")
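
The behavior described for the cleanup helpers, showing the target and asking before deleting, can be approximated with the standard library. This is a hypothetical sketch and not the implementation of `nb.delete_dir`; the function name `confirm_and_delete` and its `ask` parameter are invented here.

```python
import os
import shutil
import tempfile

def confirm_and_delete(path, prompt, ask=input):
    """Show the target directory and delete it only after an explicit 'y'."""
    if not os.path.isdir(path):
        print(f"Nothing to delete: {path}")
        return False
    answer = ask(f"{prompt}: '{path}' [y/N]? ")
    if answer.strip().lower() != "y":
        print("Skipped.")
        return False
    shutil.rmtree(path)
    print(f"Deleted {path}")
    return True

# Demonstration on a temporary directory, with canned answers instead of input().
target = tempfile.mkdtemp()
declined = confirm_and_delete(target, "Delete output directory", ask=lambda _: "n")
still_there = os.path.isdir(target)
deleted = confirm_and_delete(target, "Delete output directory", ask=lambda _: "y")
```

Defaulting to "no" on anything other than an explicit `y` is a deliberate choice in this sketch, since `shutil.rmtree` cannot be undone.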