# Project Index

[Custom Model Notebook](../../../notebooks/custom_model.ipynb)  
[Training Notebook](../../../notebooks/train.ipynb)  
[Project Config Notebook](../../../notebooks/project_config.ipynb)  
[Forgather Notebook](../../../notebooks/forgather.ipynb)  

```python
import forgather.ml.notebooks as nb

nb.display_project_index(config_template="causal_transformer.yaml", materialize=True, pp_first=False)
```

## Causal Transformer

A custom transformer constructed from composable 'bits.'

## Meta Config
Project Directory: /home/dinalt/ai_assets/forgather/examples/models/transformers

Meta Config: [/home/dinalt/ai_assets/forgather/examples/models/transformers/meta.yaml](meta.yaml)

- [meta.yaml](meta.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/examples/models/transformers/templates](templates)
- [/home/dinalt/ai_assets/forgather/templates](../../../templates)

## Available Configurations
- [walsh_pe.yaml](templates/configs/walsh_pe.yaml)
- [causal_transformer.yaml](templates/configs/causal_transformer.yaml)
- [swi-glu.yaml](templates/configs/swi-glu.yaml)
- [tiny.yaml](templates/configs/tiny.yaml)
- [base_dynamic.yaml](templates/configs/base_dynamic.yaml)

Default Configuration: base_dynamic.yaml

Active Configuration: causal_transformer.yaml

## Available Templates
- [project.yaml](templates/project.yaml)
- [configs/walsh_pe.yaml](templates/configs/walsh_pe.yaml)
- [configs/causal_transformer.yaml](templates/configs/causal_transformer.yaml)
- [configs/swi-glu.yaml](templates/configs/swi-glu.yaml)
- [configs/tiny.yaml](templates/configs/tiny.yaml)
- [configs/base_dynamic.yaml](templates/configs/base_dynamic.yaml)
- [trainers/accel_trainer.yaml](../../../templates/trainers/accel_trainer.yaml)
- [trainers/trainer.yaml](../../../templates/trainers/trainer.yaml)
- [trainers/hf_trainer.yaml](../../../templates/trainers/hf_trainer.yaml)
- [trainers/base_trainer.yaml](../../../templates/trainers/base_trainer.yaml)
- [model_ctor/args.yaml](../../../templates/model_ctor/args.yaml)
- [projects/tiny.yaml](../../../templates/projects/tiny.yaml)
- [datasets/abstract/pretokenized_dataset.yaml](../../../templates/datasets/abstract/pretokenized_dataset.yaml)
- [datasets/abstract/base_datasets.yaml](../../../templates/datasets/abstract/base_datasets.yaml)
- [datasets/tiny/tiny_stories.yaml](../../../templates/datasets/tiny/tiny_stories.yaml)
- [datasets/tiny/tiny_stories_abridged.yaml](../../../templates/datasets/tiny/tiny_stories_abridged.yaml)
- [models/dynamic_lm.yaml](../../../templates/models/dynamic_lm.yaml)
- [models/causal_transformer.yaml](../../../templates/models/causal_transformer.yaml)
- [models/gpt2.yaml](../../../templates/models/gpt2.yaml)
- [models/llama.yaml](../../../templates/models/llama.yaml)
- [models/abstract/causal_lm_from_config.yaml](../../../templates/models/abstract/causal_lm_from_config.yaml)
- [models/abstract/base_language_model.yaml](../../../templates/models/abstract/base_language_model.yaml)
- [models/abstract/custom_causal_lm.yaml](../../../templates/models/abstract/custom_causal_lm.yaml)
- [models/abstract/causal_lm_from_pretrained.yaml](../../../templates/models/abstract/causal_lm_from_pretrained.yaml)
- [models/abstract/load_model.yaml](../../../templates/models/abstract/load_model.yaml)
- [models/tiny/tiny_causal.yaml](../../../templates/models/tiny/tiny_causal.yaml)
- [models/tiny/tiny_gpt2.yaml](../../../templates/models/tiny/tiny_gpt2.yaml)
- [models/tiny/tiny_llama.yaml](../../../templates/models/tiny/tiny_llama.yaml)
- [models/tiny/tiny_d128_l2.yaml](../../../templates/models/tiny/tiny_d128_l2.yaml)
- [prompts/tiny_stories.yaml](../../../templates/prompts/tiny_stories.yaml)
- [callbacks/base_callbacks.yaml](../../../templates/callbacks/base_callbacks.yaml)
- [callbacks/loggers.yaml](../../../templates/callbacks/loggers.yaml)
- [types/meta_template.yaml](../../../templates/types/meta_template.yaml)
- [types/type.yaml](../../../templates/types/type.yaml)
- [types/tokenizer/tokenizer.yaml](../../../templates/types/tokenizer/tokenizer.yaml)
- [types/tokenizer/bpe/bpe.yaml](../../../templates/types/tokenizer/bpe/bpe.yaml)
- [types/model/model_type.yaml](../../../templates/types/model/model_type.yaml)
- [types/training_script/training_script.yaml](../../../templates/types/training_script/training_script.yaml)
- [types/training_script/causal_lm/causal_lm.yaml](../../../templates/types/training_script/causal_lm/causal_lm.yaml)
- [paths/example_paths.yaml](../../../templates/paths/example_paths.yaml)
- [tokenizers/tiny_2k.yaml](../../../templates/tokenizers/tiny_2k.yaml)
- [tokenizers/tiny_8k.yaml](../../../templates/tokenizers/tiny_8k.yaml)

## Included Templates
- [configs/causal_transformer.yaml](templates/configs/causal_transformer.yaml)
    - [models/causal_transformer.yaml](../../../templates/models/causal_transformer.yaml)
        - [tokenizers/tiny_8k.yaml](../../../templates/tokenizers/tiny_8k.yaml)
        - [models/abstract/custom_causal_lm.yaml](../../../templates/models/abstract/custom_causal_lm.yaml)
            - [models/abstract/base_language_model.yaml](../../../templates/models/abstract/base_language_model.yaml)
                - [inc/formatting.jinja](../../../templates/inc/formatting.jinja)
    - [project.yaml](templates/project.yaml)
        - [paths/example_paths.yaml](../../../templates/paths/example_paths.yaml)
        - [types/model/model_type.yaml](../../../templates/types/model/model_type.yaml)
            - [types/type.yaml](../../../templates/types/type.yaml)

### Config Metadata

```python
{'config_description': '',
 'config_name': 'Default Causal Transformer',
 'datasets_dir': '../../../datasets',
 'model_src_dir': '../../../model_src',
 'models_dir': 'output_models',
 'output_dir': 'output_models/causal_transformer',
 'project_dir': '.',
 'tokenizers_dir': '../../../tokenizers'}

```

## Modules
- [../../../model_src/causal_transformer.py](../../../model_src/causal_transformer.py) : CausalTransformer
    - [/home/dinalt/ai_assets/forgather/examples/models/transformers/../../../model_src/causal_transformer.py](../../../model_src/causal_transformer.py) : causal_transformer
- [../../../model_src/causal_transformer.py](../../../model_src/causal_transformer.py) : CausalTransformerConfig

## Preprocessed Config

```yaml
#---------------------------------------
#       Default Causal Transformer       
#---------------------------------------
# 2024-08-08T01:30:46
# Description: 
# Project Dir: .
# Current Working Dir: "/home/dinalt/ai_assets/forgather/examples/models/transformers"
# Forgather Config Dir: "/home/dinalt/.config/forgather"
# Model: causal_transformer

############# Config Vars ##############

# ns.models_dir: "output_models"
# ns.tokenizers_dir: "../../../tokenizers"
# ns.datasets_dir: "../../../datasets"
# ns.model_src_dir: "../../../model_src"
# ns.output_dir: "output_models/causal_transformer"

################ Model #################

.define: &model_constructor_args {}

# Name: Causal Transformer
# Description: A causal transformer model, based upon 'Attention is All You Need'
# model_def.cls = "CausalTransformer"
# model_def.cfg_cls = "CausalTransformerConfig"
# model_def.config_path = "../../../model_src/causal_transformer.py"
# model_def.model_path = "../../../model_src/causal_transformer.py"

# **Tokenizer**

# Load custom tokenizer from sub-project definition
.define: &tokenizer !singleton:forgather.ml.construct:load_from_config@tokenizer
    project_dir: "../../../examples/tokenizers/tiny_stories_bpe"
    config_template: "8k.yaml"

# **Model Config**

# Model is entirely self-contained; no sub-modules.
.define: &model_submodule_searchpath []

# Model does not have dynamically generated code
.define: &model_code_generator null

.define: &model_config !singleton:../../../model_src/causal_transformer.py:CausalTransformerConfig@model_config
    submodule_searchpath: *model_submodule_searchpath
    # Set auto-map for custom model; this ensures that the source code stays with the model.
    auto_map:
        AutoConfig: "causal_transformer.CausalTransformerConfig"
        AutoModel: "causal_transformer.CausalTransformer"
    # Get the vocab-size from the tokenizer definition.
    vocab_size: !singleton:len [ *tokenizer ]
    pad_token_id: !singleton:getattr [ *tokenizer, 'pad_token_id' ]
    bos_token_id: !singleton:getattr [ *tokenizer, 'bos_token_id' ]
    eos_token_id: !singleton:getattr [ *tokenizer, 'eos_token_id' ]
    hidden_size: 512
    num_attention_heads: 8
    num_hidden_layers: 6
    max_sequence_length: !singleton:getattr
        - *tokenizer
        - "model_max_length"
    dim_feedforward: 2048
    initializer_range: 0.02
    embedding_dropout: 0.10
    layer_dropout: 0.10
    residual_dropout: 0.0
    attention_dropout: 0.0
    activation_dropout: 0.0

# **Model Constructor**

.define: &pretrained_model !singleton:../../../model_src/causal_transformer.py:CausalTransformer@pretrained_model
    args:
        - *model_config
    kwargs:
        submodule_searchpath: *model_submodule_searchpath
        <<: *model_constructor_args

.define: &model !singleton:forgather.ml.construct:dependency_list@model
    - *pretrained_model
    - !singleton:forgather.ml.construct:copy_package_files
        - "output_models/causal_transformer"
        - *model_config
    - !singleton:forgather.ml.construct:copy_package_files
        - "output_models/causal_transformer"
        - *pretrained_model

#---------------------------------------
#          Configuration Output          
#---------------------------------------
meta: &meta_output !dict:@meta
    config_name: "Default Causal Transformer"
    config_description: ""
    project_dir: "."
    models_dir: "output_models"
    tokenizers_dir: "../../../tokenizers"
    datasets_dir: "../../../datasets"
    output_dir: "output_models/causal_transformer"
    model_src_dir: "../../../model_src"

main:
    model: *model
    tokenizer: *tokenizer
    model_config: *model_config
    generated_code: *model_code_generator

```

## Loaded Configuration to YAML

```yaml
.define: &meta !singleton:named_dict@meta
    config_name: 'Default Causal Transformer'
    config_description: ''
    project_dir: '.'
    models_dir: 'output_models'
    tokenizers_dir: '../../../tokenizers'
    datasets_dir: '../../../datasets'
    output_dir: 'output_models/causal_transformer'
    model_src_dir: '../../../model_src'

.define: &tokenizer !singleton:forgather.ml.construct:load_from_config@tokenizer
    project_dir: '../../../examples/tokenizers/tiny_stories_bpe'
    config_template: '8k.yaml'

.define: &model_config !singleton:../../../model_src/causal_transformer.py:CausalTransformerConfig@model_config
    auto_map: 
        AutoConfig: 'causal_transformer.CausalTransformerConfig'
        AutoModel: 'causal_transformer.CausalTransformer'
    vocab_size: !singleton:len
        - *tokenizer
    pad_token_id: !singleton:getattr
        - *tokenizer
        - 'pad_token_id'
    bos_token_id: !singleton:getattr
        - *tokenizer
        - 'bos_token_id'
    eos_token_id: !singleton:getattr
        - *tokenizer
        - 'eos_token_id'
    hidden_size: 512
    num_attention_heads: 8
    num_hidden_layers: 6
    max_sequence_length: !singleton:getattr
        - *tokenizer
        - 'model_max_length'
    dim_feedforward: 2048
    initializer_range: 0.02
    embedding_dropout: 0.1
    layer_dropout: 0.1
    residual_dropout: 0.0
    attention_dropout: 0.0
    activation_dropout: 0.0

.define: &pretrained_model !singleton:../../../model_src/causal_transformer.py:CausalTransformer@pretrained_model
    - *model_config

.define: &model !singleton:forgather.ml.construct:dependency_list@model
    - *pretrained_model
    - !singleton:forgather.ml.construct:copy_package_files
        - 'output_models/causal_transformer'
        - *model_config
    - !singleton:forgather.ml.construct:copy_package_files
        - 'output_models/causal_transformer'
        - *pretrained_model


meta: *meta
main: 
    model: *model
    tokenizer: *tokenizer
    model_config: *model_config
    generated_code: null

```
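
One way to read the `!singleton` tags and `&anchor`/`*alias` pairs above is that each tagged node is built once and every alias refers to that same object; this reading is consistent with the generated source code, which binds `tokenizer` once and passes it to several arguments. The sketch below illustrates that idea only. The `Singleton` class, `resolve`, and `load_toy_tokenizer` are hypothetical names invented for this example and are not Forgather's implementation.

```python
class Singleton:
    """Hypothetical lazy node: builds its value on first use, then caches it."""

    _UNSET = object()

    def __init__(self, factory, *args, **kwargs):
        self.factory = factory
        self.args = args
        self.kwargs = kwargs
        self._value = Singleton._UNSET

    def __call__(self):
        if self._value is Singleton._UNSET:
            # Resolve nested nodes first, then call the factory exactly once.
            args = [resolve(a) for a in self.args]
            kwargs = {k: resolve(v) for k, v in self.kwargs.items()}
            self._value = self.factory(*args, **kwargs)
        return self._value


def resolve(value):
    """Materialize a node; pass plain values through unchanged."""
    return value() if isinstance(value, Singleton) else value


build_count = {"tokenizer": 0}


def load_toy_tokenizer():
    # Stand-in for a real tokenizer loader; counts how often it runs.
    build_count["tokenizer"] += 1
    return ["<|BOS|>", "<|PAD|>", "<|EOS|>", "<|UNK|>"]


tokenizer = Singleton(load_toy_tokenizer)
model_config = Singleton(
    dict,
    vocab_size=Singleton(len, tokenizer),  # mirrors `!singleton:len [ *tokenizer ]`
    tokenizer=tokenizer,
)

config = model_config()
print(config["vocab_size"], build_count["tokenizer"])
```

Although the toy tokenizer node is referenced twice, its factory runs only once, which is the property the aliases in the YAML appear to rely on.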

### Generated Source Code

```python
from forgather.ml.construct import dependency_list
from forgather.ml.construct import copy_package_files
from forgather.ml.construct import load_from_config
from importlib.util import spec_from_file_location, module_from_spec
import os
import sys

# Import a dynamic module.
def dynimport(module, name, searchpath):
    module_path = module
    module_name = os.path.basename(module).split(".")[0]
    module_spec = spec_from_file_location(
        module_name,
        module_path,
        submodule_search_locations=searchpath,
    )
    mod = module_from_spec(module_spec)
    sys.modules[module_name] = mod
    module_spec.loader.exec_module(mod)
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod

CausalTransformerConfig = lambda: dynimport("../../../model_src/causal_transformer.py", "CausalTransformerConfig", ['/home/dinalt/ai_assets/forgather/examples/models/transformers/../../../model_src'])
CausalTransformer = lambda: dynimport("../../../model_src/causal_transformer.py", "CausalTransformer", ['/home/dinalt/ai_assets/forgather/examples/models/transformers/../../../model_src'])

def construct(
):
    meta = {
        'config_name': 'Default Causal Transformer',
        'config_description': '',
        'project_dir': '.',
        'models_dir': 'output_models',
        'tokenizers_dir': '../../../tokenizers',
        'datasets_dir': '../../../datasets',
        'output_dir': 'output_models/causal_transformer',
        'model_src_dir': '../../../model_src',
    }

    tokenizer = load_from_config(
        project_dir='../../../examples/tokenizers/tiny_stories_bpe',
        config_template='8k.yaml',
    )

    model_config = CausalTransformerConfig()(
        auto_map={
            'AutoConfig': 'causal_transformer.CausalTransformerConfig',
            'AutoModel': 'causal_transformer.CausalTransformer',
        },
        vocab_size=len(
            tokenizer,
        ),
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        hidden_size=512,
        num_attention_heads=8,
        num_hidden_layers=6,
        max_sequence_length=tokenizer.model_max_length,
        dim_feedforward=2048,
        initializer_range=0.02,
        embedding_dropout=0.1,
        layer_dropout=0.1,
        residual_dropout=0.0,
        attention_dropout=0.0,
        activation_dropout=0.0,
    )

    pretrained_model = CausalTransformer()(
        model_config,
    )

    model = dependency_list(
        pretrained_model,
        copy_package_files(
            'output_models/causal_transformer',
            model_config,
        ),
        copy_package_files(
            'output_models/causal_transformer',
            pretrained_model,
        ),
    )
    
    return {
        'meta': meta,
        'main': {
            'model': model,
            'tokenizer': tokenizer,
            'model_config': model_config,
            'generated_code': None,
        },
    }

```
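
The `dynimport` helper in the generated code loads a class from a source file path rather than from an installed package. To try that pattern in isolation, the sketch below reuses the same `importlib` calls against a throwaway module written to a temporary directory. The file name `toy_model.py` and the class `ToyConfig` are assumptions made up for this example; they stand in for `causal_transformer.py` and `CausalTransformerConfig`.

```python
import os
import sys
import tempfile
from importlib.util import module_from_spec, spec_from_file_location


def dynimport(module, name, searchpath):
    """Load the file `module` and return the (possibly dotted) attribute `name`."""
    module_name = os.path.basename(module).split(".")[0]
    spec = spec_from_file_location(
        module_name,
        module,
        submodule_search_locations=searchpath,
    )
    mod = module_from_spec(spec)
    sys.modules[module_name] = mod  # register the module before executing it
    spec.loader.exec_module(mod)
    for symbol in name.split("."):
        mod = getattr(mod, symbol)
    return mod


TOY_SOURCE = (
    "class ToyConfig:\n"
    "    def __init__(self, hidden_size):\n"
    "        self.hidden_size = hidden_size\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_model.py")
    with open(path, "w") as f:
        f.write(TOY_SOURCE)

    # Same shape as the generated code: a lambda that imports on first call.
    ToyConfig = lambda: dynimport(path, "ToyConfig", [tmp])
    config = ToyConfig()(hidden_size=512)

print(type(config).__name__, config.hidden_size)
```

Wrapping the import in a lambda, as the generated code does, defers loading the file until `construct()` actually needs the class.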

## Constructed Project

```python
{'main': {'generated_code': None,
          'model': CausalTransformer(
  (input_encoder): InputEncoder(
    d_model=512, vocab_size=8000, embedding_scale=22.627416997969522
    (dropout): Dropout(p=0.1, inplace=False)
    (embedding): Embedding(8000, 512)
    (positional_encoder): SinusoidalPE(d_model=512, max_sequence_length=2048)
  )
  (output_decoder): Linear(in_features=512, out_features=8000, bias=True)
  (layer_stack): CausalLayerStack(
    (layers): ModuleList(
      (0-5): 6 x PostLNLayer(
        (feedforward): FeedforwardLayer(
          d_model=512, d_feedforward=2048
          (linear1): Linear(in_features=512, out_features=2048, bias=True)
          (dropout): Identity()
          (activation): ReLU()
          (linear2): Linear(in_features=2048, out_features=512, bias=True)
        )
        (attention): CausalMultiheadAttn(
          d_model=512, num_heads=8
          (query_linear): Linear(in_features=512, out_features=512, bias=True)
          (key_linear): Linear(in_features=512, out_features=512, bias=True)
          (value_linear): Linear(in_features=512, out_features=512, bias=True)
          (output_linear): Linear(in_features=512, out_features=512, bias=True)
          (dropout): Identity()
        )
        (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
        (residual_dropout): Identity()
      )
    )
  )
),
          'model_config': CausalTransformerConfig {
  "activation_dropout": 0.0,
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "causal_transformer.CausalTransformerConfig",
    "AutoModel": "causal_transformer.CausalTransformer"
  },
  "bos_token_id": 0,
  "dim_feedforward": 2048,
  "embedding_dropout": 0.1,
  "eos_token_id": 2,
  "hidden_size": 512,
  "initializer_range": 0.02,
  "layer_dropout": 0.1,
  "max_sequence_length": 2048,
  "model_type": "forgather-causal-transformer",
  "num_attention_heads": 8,
  "num_hidden_layers": 6,
  "pad_token_id": 1,
  "residual_dropout": 0.0,
  "transformers_version": "4.41.2",
  "vocab_size": 8000
}
,
          'tokenizer': PreTrainedTokenizerFast(name_or_path='../../../tokenizers/tiny_stories_8k', vocab_size=8000, model_max_length=2048, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<|BOS|>', 'eos_token': '<|EOS|>', 'unk_token': '<|UNK|>', 'pad_token': '<|PAD|>'}, clean_up_tokenization_spaces=True),  added_tokens_decoder={
	0: AddedToken("<|BOS|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	1: AddedToken("<|PAD|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	2: AddedToken("<|EOS|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	3: AddedToken("<|UNK|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}},
 'meta': {'config_description': '',
          'config_name': 'Default Causal Transformer',
          'datasets_dir': '../../../datasets',
          'model_src_dir': '../../../model_src',
          'models_dir': 'output_models',
          'output_dir': 'output_models/causal_transformer',
          'project_dir': '.',
          'tokenizers_dir': '../../../tokenizers'}}

```
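
As a sanity check on the printed module tree, the parameter count can be worked out by hand from the layer shapes shown above. The sketch below does that arithmetic in plain Python. It assumes that `SinusoidalPE` holds no trainable parameters and that `embedding` and `output_decoder` do not share weights; neither is confirmed by the output itself. It also checks that the printed `embedding_scale` equals the square root of `d_model`.

```python
import math

# Values read from the model_config and module tree printed above.
vocab_size = 8000
hidden_size = 512
dim_feedforward = 2048
num_hidden_layers = 6
num_attention_heads = 8


def linear_params(in_features, out_features, bias=True):
    """Weights plus optional bias of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


embedding = vocab_size * hidden_size
output_decoder = linear_params(hidden_size, vocab_size)

# Query, key, value, and output projections, each 512 -> 512 with bias.
attention = 4 * linear_params(hidden_size, hidden_size)
feedforward = linear_params(hidden_size, dim_feedforward) + linear_params(
    dim_feedforward, hidden_size
)
# Two LayerNorms per layer, each with a weight and a bias vector.
layer_norms = 2 * (2 * hidden_size)

per_layer = attention + feedforward + layer_norms
total = embedding + output_decoder + num_hidden_layers * per_layer

head_dim = hidden_size // num_attention_heads
embedding_scale = math.sqrt(hidden_size)

print(f"per layer: {per_layer:,}")
print(f"total:     {total:,}")
print(f"head dim:  {head_dim}")
print(f"embedding scale: {embedding_scale}")
```

Under those assumptions the model has 3,152,384 parameters per layer and 27,114,304 in total, with 64 dimensions per attention head.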

