# Project Index

[Custom Model Notebook](../../../notebooks/custom_model.ipynb)  
[Training Notebook](../../../notebooks/train.ipynb)  
[Project Config Notebook](../../../notebooks/project_config.ipynb)  
[Forgather Notebook](../../../notebooks/forgather.ipynb)  

In [9]:
import forgather.nb.notebooks as nb

nb.display_project_index(
    config_template="lower_lr.yaml",
    show_available_templates=False,
    show_pp_config=True,
    show_loaded_config=False,
    show_generated_code=False,
    materialize=False,
    pp_first=True,
)

## Test Effectiveness of Deepnet Init

### Control

Uses a deep model with our baseline init method.


#### Project Directory: "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet"

## Meta Config
Meta Config: [/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/meta.yaml](meta.yaml)

- [meta.yaml](meta.yaml)
    - [meta_defaults.yaml](../../../forgather_workspace/meta_defaults.yaml)
        - [base_directories.yaml](../../../forgather_workspace/base_directories.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/templates](templates)
- [/home/dinalt/ai_assets/forgather/forgather_workspace](../../../forgather_workspace)
- [/home/dinalt/ai_assets/forgather/templates/tiny_experiments](../../../templates/tiny_experiments)
- [/home/dinalt/ai_assets/forgather/templates/modellib](../../../templates/modellib)
- [/home/dinalt/ai_assets/forgather/templates/base](../../../templates/base)

## Available Configurations
- [deepnet_torch.yaml](templates/configs/deepnet_torch.yaml)
- [deepnet.yaml](templates/configs/deepnet.yaml)
- [lower_lr.yaml](templates/configs/lower_lr.yaml)
- [deepnet_init.yaml](templates/configs/deepnet_init.yaml)
- [control.yaml](templates/configs/control.yaml)

Default Configuration: control.yaml

Active Configuration: lower_lr.yaml

## Included Templates
- [configs/lower_lr.yaml](templates/configs/lower_lr.yaml)
    - [project.yaml](templates/project.yaml)
        - [projects/tiny.yaml](../../../templates/tiny_experiments/projects/tiny.yaml)
            - [prompts/tiny_stories.yaml](../../../templates/tiny_experiments/prompts/tiny_stories.yaml)
            - [types/training_script/causal_lm/causal_lm.yaml](../../../templates/base/types/training_script/causal_lm/causal_lm.yaml)
                - [trainers/trainer.yaml](../../../templates/base/trainers/trainer.yaml)
                    - [trainers/base_trainer.yaml](../../../templates/base/trainers/base_trainer.yaml)
                        - [trainers/minimal_trainer.yaml](../../../templates/base/trainers/minimal_trainer.yaml)
                - [callbacks/loggers.yaml](../../../templates/base/callbacks/loggers.yaml)
                    - [callbacks/base_callbacks.yaml](../../../templates/base/callbacks/base_callbacks.yaml)
                - [models/abstract/load_model.yaml](../../../templates/base/models/abstract/load_model.yaml)
                    - [models/abstract/causal_lm_from_pretrained.yaml](../../../templates/base/models/abstract/causal_lm_from_pretrained.yaml)
                        - [models/abstract/base_language_model.yaml](../../../templates/base/models/abstract/base_language_model.yaml)
                - [types/training_script/training_script.yaml](../../../templates/base/types/training_script/training_script.yaml)
                    - [types/type.yaml](../../../templates/base/types/type.yaml)
                        - [base_directories.yaml](../../../forgather_workspace/base_directories.yaml)
                - [inc/formatting.jinja](../../../templates/base/inc/formatting.jinja)
            - [tiny.callbacks](../../../templates/tiny_experiments/projects/tiny.yaml)
            - [tiny.model_config](../../../templates/tiny_experiments/projects/tiny.yaml)
                - [models/tiny/tiny_causal.yaml](../../../templates/tiny_experiments/models/tiny/tiny_causal.yaml)
                    - [tokenizers/tiny_2k.yaml](../../../templates/tiny_experiments/tokenizers/tiny_2k.yaml)
                    - [models/dynamic_causal_transformer.yaml](../../../templates/modellib/models/dynamic_causal_transformer.yaml)
                        - [models/abstract/dynamic_causal_lm.yaml](../../../templates/base/models/abstract/dynamic_causal_lm.yaml)
                            - [models/abstract/custom_causal_lm.yaml](../../../templates/base/models/abstract/custom_causal_lm.yaml)
            - [tiny.trainer_config](../../../templates/tiny_experiments/projects/tiny.yaml)
            - [tiny.dataset_config](../../../templates/tiny_experiments/projects/tiny.yaml)
                - [datasets/tiny/tiny_stories_abridged.yaml](../../../templates/tiny_experiments/datasets/tiny/tiny_stories_abridged.yaml)
                    - [datasets/tiny/tiny_stories.yaml](../../../templates/tiny_experiments/datasets/tiny/tiny_stories.yaml)
                        - [datasets/abstract/base_datasets.yaml](../../../templates/base/datasets/abstract/base_datasets.yaml)
        - [project.model_config](templates/project.yaml)
        - [project.trainer_config](templates/project.yaml)
    - [experiment.trainer_config](templates/configs/lower_lr.yaml)
    - [experiment.model_config](templates/configs/lower_lr.yaml)

## Preprocessed Config

```yaml
#---------------------------------------
#         Control with lowered LR        
#---------------------------------------
# 2025-05-24T20:55:33
# Description: Baseline Simple Init
# Project Dir: /home/dinalt/ai_assets/forgather/examples/trainers/deepnet
# Current Working Dir: "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet"
# Forgather Config Dir: "/home/dinalt/.config/forgather"
# Model: lower_lr
# Hostname: hal9000
# Versions:
#     python: 3.10.13
#     torch: 2.7.0
#     transformers: 4.51.3
#     accelerate: 1.7.0

############# Config Vars ##############

# ns.forgather_dir: "/home/dinalt/ai_assets/forgather"
# ns.models_dir: "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/output_models"
# ns.project_model_src_dir: "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/model_src"
# ns.tokenizers_dir: "/home/dinalt/ai_assets/forgather/tokenizers"
# ns.datasets_dir: "/home/dinalt/ai_assets/forgather/datasets"
# ns.model_src_dir: "/home/dinalt/ai_assets/forgather/model_src"
# ns.output_dir: "./output_models/lower_lr"
# ns.logging_dir: "./output_models/lower_lr/runs/lower_lr_2025-05-24T20-55-33"
# ns.create_new_model: True
# ns.save_model: False
# ns.train: True
# ns.eval: False
# ns.trust_remote_code: True

####### Distributed Environment ########

.define: &distributed_env !singleton:forgather.ml.distributed:DistributedEnvironment@distributed_env

############# Dependencies #############

# The model will be given the following prompts for text-gen at regular intervals.
.define: &testprompts !list:@testprompts
    # Test prompts from "https://arxiv.org/abs/2305.07759"
    - "Alice was so tired when she got back home so she went"
    - "Jack and Lily liked to watch the moon at night. They noticed that the moon changed its shape every night. Sometimes the moon was big and round, and sometimes it was"
    - "Jack and Lily saw a rainbow after a rainy day.They were amazed by the colors. Jack said, \"Look, Lily. A rainbow has"
    - "Jack wanted to read a book, so he went to"
    - "\"Can cows fly?\" Alice asked her mother."
    - "\"What do birds like to eat?\" Tom asked his mother."
    - "\"What language do they speak in France?\" Tom asked his mother."
    - "If I throw a ball up in the air, eventually it will"
    - "It was winter and cold outside so his mother told him, \"You should"
    - "Lily likes cats and dogs. She asked her mom for a dog and her mom said no, so instead she asked"
    - "Jack told Mary, \"If you give me your banana, I'll give you my apple.\" Mary gave Jack her Banana, so"
    - "On weekends Jack went to visit his grandmother whereas on weekdays he would go to school. Last weekend, when Jack was on his way to"
    - "Lily and Ben were having an argument. Ben said that cake is much better than ice cream and Lily said that"
    - "Lily and Ben are having an argument. They are trying to decide between the park and the swimming pool. Ben says, \"I want to go to the park\". Lily says"
    - "Jack's mother was not home, and his father was at home. When Jack came home, he said hello to"
    - "Lily doesn't like swimming. When her father wants to take her to the swimming pool, she says"
    - "Both Ben and Lily wanted cake. Father said that there was only one piece of cake left. They"
    - "Ben went to visit Lily in her house, but she was not at home. Ben knocked on the door,"

# Conservative text-generation parameters.
.define: &generation_config !dict:@generation_config
    identity: generation_config
    do_sample: True
    top_k: 20
    top_p: 0.9
    temperature: 0.7
    repitition_penalty: 1.15

################ Model #################

# https://huggingface.co/docs/transformers/en/model_doc/auto
.define: &model_constructor_args {}

# Name: Tiny Causal
# Description: A scaled-down version of the base Causal Transformer
# model_def.cls = "DynamicCasualLM"
# model_def.cfg_cls = "DynamicCausalLMConfig"
# model_def.config_path = "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/output_models/lower_lr/tiny_causal_transformer.py"
# model_def.model_path = "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/output_models/lower_lr/tiny_causal_transformer.py"
# model_def.short_name = "tiny_causal_transformer"
# model_def.model_type = "forgather-dynamic-causal-tiny_causal_transformer"
# model_def.model_path = "./output_models/lower_lr/tiny_causal_transformer.py"
# model_def.model_template_searchpath = "/home/dinalt/ai_assets/forgather/templates/dynamic_models"
# model_def.model_template_name = "causal_lm.py"
# model_def.name_policy = "named"

# **Tokenizer**

# Load custom tokenizer from sub-project definition
.define: &tokenizer !singleton:forgather.ml.construct:load_from_config@tokenizer
    project_dir: "/home/dinalt/ai_assets/forgather/examples/tokenizers/tiny_stories_bpe"
    config_template: "2k.yaml"

# **Model Config**

.define: &model_submodule_searchpath
    - "./model_src"
    - "/home/dinalt/ai_assets/forgather/model_src/bits"
    - "./output_models/lower_lr"

.define: &loss_fn !singleton:.causal_loss:CausalLoss@loss_fn []

.define: &layer_norm_factory !lambda:torch.nn:LayerNorm@layer_norm_factory
    normalized_shape: !var "hidden_size"

.define: &feedforward_factory !lambda:.feedforward_layer:FeedforwardLayer@feedforward_factory
    d_model: !var "hidden_size"
    d_feedforward: !var "dim_feedforward"
    dropout: !var "activation_dropout"

.define: &attention_factory !lambda:.causal_multihead_attn:CausalMultiheadAttn@attention_factory
    d_model: !var "hidden_size"
    num_heads: !var "num_attention_heads"
    dropout: !var "attention_dropout"

.define: &layer_factory !lambda:.post_ln_layer:PostLNLayer@layer_factory
    feedforward_factory: *feedforward_factory
    attention_factory: *attention_factory
    norm_factory: *layer_norm_factory
    dropout: !var "layer_dropout"
    residual_dropout: !var "residual_dropout"

.define: &layer_stack !singleton:.layer_stack:LayerStack@layer_stack
    layer_factory: *layer_factory
    num_hidden_layers: !var "num_hidden_layers"

.define: &output_decoder !singleton:torch.nn:Linear@output_decoder
    - !var "hidden_size"
    - !var "vocab_size"

.define: &positional_encoder !singleton:.sinusoidal_pe:SinusoidalPE@positional_encoder
    d_model: !var "hidden_size"
    max_sequence_length: !var "max_sequence_length"

.define: &input_encoder !singleton:.input_encoder:InputEncoder@input_encoder
    d_model: !var "hidden_size"
    vocab_size: !var "vocab_size"
    dropout: !var "embedding_dropout"
    positional_encoder: *positional_encoder

.define: &init_weights !lambda:.init_weights:simple_weight_init@init_weights []

.define: &model_factory !singleton:.causal_lm:CasualLM@model_factory
    loss_fn: *loss_fn
    input_encoder: *input_encoder
    output_decoder: *output_decoder
    layer_stack: *layer_stack
    init_weights: *init_weights

.define: &model_code_generator !meta:forgather.codegen:generate_code@model_code_generator
    searchpath: "/home/dinalt/ai_assets/forgather/templates/dynamic_models"
    template_name: "causal_lm.py"
    name_policy: "named"
    obj: *model_factory
    # Template args
    model_type: "forgather-dynamic-causal-tiny_causal_transformer"

.define: &model_code_writer !singleton:forgather.ml.construct:write_file@model_code_writer
    data: *model_code_generator
    output_file: "./output_models/lower_lr/tiny_causal_transformer.py"
    return_value: "Model constructor generated by Forgather 1.0"    

.define: &model_config !singleton:./output_models/lower_lr/tiny_causal_transformer.py:DynamicCausalLMConfig@model_config
    submodule_searchpath: *model_submodule_searchpath
    # Set auto-map for custom model; this ensures that the source code stays with the model.
    auto_map:
        AutoConfig: "tiny_causal_transformer.DynamicCausalLMConfig"
        AutoModel: "tiny_causal_transformer.DynamicCasualLM"
    # Get the vocab-size from the tokenizer definition.
    vocab_size: !singleton:len [ *tokenizer ]
    pad_token_id: !singleton:getattr [ *tokenizer, 'pad_token_id' ]
    bos_token_id: !singleton:getattr [ *tokenizer, 'bos_token_id' ]
    eos_token_id: !singleton:getattr [ *tokenizer, 'eos_token_id' ]
    # Add dependency on code generator
    code_generator: *model_code_writer
    hidden_size: 512
    num_attention_heads: 8
    num_hidden_layers: 6
    max_sequence_length: !singleton:getattr
        - *tokenizer
        - "model_max_length"
    dim_feedforward: 2048
    embedding_dropout: 0.10
    layer_dropout: 0.10
    residual_dropout: 0.0
    attention_dropout: 0.0
    activation_dropout: 0.0
    
    # Tiny Causal overrides
    hidden_size: 256
    dim_feedforward: 1024
    num_attention_heads: 2
    num_hidden_layers: 4
    embedding_dropout: 0.0
    layer_dropout: 0.0
    
    # Project Overrides
    num_attention_heads: 4
    num_hidden_layers: 20
    embedding_dropout: 0.1
    layer_dropout: 0.1

    # Control with lowered LR Overrides

# **Model Constructor**

.define: &pretrained_model !singleton:./output_models/lower_lr/tiny_causal_transformer.py:DynamicCasualLM@pretrained_model
    args:
        - *model_config
    kwargs:
        submodule_searchpath: *model_submodule_searchpath
        <<: *model_constructor_args

.define: &model !singleton:forgather.ml.construct:dependency_list@model
    - *pretrained_model
    - !singleton:forgather.ml.construct:copy_package_files
        - "./output_models/lower_lr"
        - *model_config
    - !singleton:forgather.ml.construct:copy_package_files
        - "./output_models/lower_lr"
        - *pretrained_model

############### Datasets ###############

# Name: TinyStories Abridged
# Define: Abridged to 10% of original size; Dataset containing synthetically generated (by GPT-3.5 and GPT-4) short stories that only use a small vocabulary.
# Source: https://arxiv.org/abs/2305.07759
# Train Dataset: "roneneldan/TinyStories" : "train"
# Eval Dataset: "roneneldan/TinyStories" : "validation"

# **Source Datasets**

.define: &train_source_dataset !singleton:datasets:load_dataset@train_source_dataset
    - "roneneldan/TinyStories"

.define: &eval_source_dataset !singleton:datasets:load_dataset@eval_source_dataset
    - "roneneldan/TinyStories"

# **Dataset Splits**

.define: &train_dataset_split !singleton:operator:getitem
    - *train_source_dataset
    - "train"

.define: &eval_dataset_split !singleton:operator:getitem
    - *train_source_dataset
    - "validation"

# **Preprocess Dataset Args**

.define: &preprocess_args
    truncation: True

# **Preprocessed Datasets**

.define: &train_dataset !singleton:forgather.ml.datasets:preprocess_dataset@train_dataset
    dataset: *train_dataset_split
    tokenizer: *tokenizer
    select_range: 0.1
    desc: "Tokenizing train"
    fn_kwargs:
        <<: *preprocess_args

.define: &eval_dataset !singleton:forgather.ml.datasets:preprocess_dataset@eval_dataset
    dataset: *eval_dataset_split
    tokenizer: *tokenizer
    select_range: 500
    desc: "Tokenizing validation split"
    fn_kwargs:
        <<: *preprocess_args

############ Data Collator #############

# Data collator for causal model
# Batches are dynamically padded to longest sequence
# labels are set to input_ids, with pad tokens set to -100
# https://huggingface.co/docs/transformers/en/main_classes/data_collator#transformers.DataCollatorForLanguageModeling
.define: &data_collator !singleton:transformers:DataCollatorForLanguageModeling@data_collator
    args:
        - *tokenizer
    kwargs:
        mlm: False
        return_tensors: pt

########## Trainer Callbacks ###########

# **Dependencies**

# Experiment tracking: Tensorboard SummaryWriter
.define: &summary_writer !singleton:torch.utils.tensorboard:SummaryWriter
    - "./output_models/lower_lr/runs/lower_lr_2025-05-24T20-55-33"

# Additional data to record to experiment loggers
.define: &experiment_info !dict:@experiment_info
    date: "2025-05-24T20:55:33"
    name: "Control with lowered LR"
    description: "Baseline Simple Init"
    config: !var "pp_config"
    versions: {'python': '3.10.13', 'torch': '2.7.0', 'transformers': '4.51.3', 'accelerate': '1.7.0'}

.define: &text_gen_callback_args
    summary_writer: *summary_writer
    prompts: *testprompts
    generation_config: *generation_config
    max_new_tokens: 40
    generation_steps: 2000

# **Callback List**

.define: &trainer_callbacks !list:@trainer_callbacks
    # Log all training output to JSON
    - !singleton:forgather.ml.json_logger:JsonLogger
        <<: *experiment_info
    # Log configuration and metrics to Tensorboard file
    - !singleton:forgather.ml.tb_logger:TBLogger
        args: [ *summary_writer ]
        kwargs:
            <<: *experiment_info
    - !singleton:forgather.ml.textgen_callback:TextgenCallback
        <<: *text_gen_callback_args

############## Optimizer ###############

.define: &optimizer ~
############# LR Scheduler #############

.define: &lr_scheduler ~

############### Trainer ################

# Name: Custom forgather.ml.trainer.Trainer
# Description: A lightweight, extensible trainer; does not support multiple GPUs

# **Trainer Args**

.define: &trainer_args
    # Minimal Trainer Defaults
    # https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.TrainingArguments
    output_dir: "./output_models/lower_lr"
    logging_dir: "./output_models/lower_lr/runs/lower_lr_2025-05-24T20-55-33"
    logging_steps: 500
    per_device_train_batch_size: 16
    per_device_eval_batch_size: 32
    learning_rate: 5.0e-5
    num_train_epochs: 1
    # Base Trainer Defaults
    # https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.TrainingArguments
    overwrite_output_dir: True
    eval_steps: 100
    eval_strategy: "steps"
    save_strategy: "no"
    logging_strategy: "steps"

    # Tiny Project Overrides
    per_device_train_batch_size: 32
    per_device_eval_batch_size: 64
    logging_steps: 100
    eval_steps: 500
    learning_rate: 1.0e-3
    num_train_epochs: 1

    # project overrides
    per_device_train_batch_size: 8
    per_device_eval_batch_size: 16
    logging_steps: 100
    eval_steps: 500
    learning_rate: 1.0e-3
    num_train_epochs: 1


    # Experiment Overrides
    learning_rate: 1.0e-4

# **Trainer Constructor**

.define: &trainer !singleton:forgather.ml.trainer:Trainer@trainer
    model: *model
    args: !singleton:forgather.ml.trainer_types:TrainingArguments@trainer_args
        <<: *trainer_args
    data_collator: *data_collator
    train_dataset: *train_dataset
    eval_dataset: *eval_dataset
    processing_class: *tokenizer
    callbacks: *trainer_callbacks
    optimizer_factory: *optimizer
    lr_scheduler_factory: *lr_scheduler

#---------------------------------------
#          Configuration Output          
#---------------------------------------
meta: &meta_output !dict:@meta
    config_name: "Control with lowered LR"
    config_description: "Baseline Simple Init"
    config_class: "type.training_script.causal_lm"
    project_dir: "."
    workspace_root: "/home/dinalt/ai_assets/forgather"
    forgather_dir: "/home/dinalt/ai_assets/forgather"
    models_dir: "./output_models"
    tokenizers_dir: "/home/dinalt/ai_assets/forgather/tokenizers"
    datasets_dir: "/home/dinalt/ai_assets/forgather/datasets"
    output_dir: "./output_models/lower_lr"
    model_src_dir: "/home/dinalt/ai_assets/forgather/model_src"
    logging_dir: "./output_models/lower_lr/runs/lower_lr_2025-05-24T20-55-33"
    create_new_model: "True"
    save_model: "False"
    train: "True"
    eval: "False"

main: !singleton:forgather.ml.training_script:TrainingScript@training_script
    meta: *meta_output
    do_save: False
    do_train: True
    do_eval: False
    # Initialize the distributed environment before initializing anything that depends on it.
    distributed_env: *distributed_env
    trainer: *trainer
    pp_config: !var "pp_config"

model_code_writer: *model_code_writer
distributed_env: *distributed_env
model: *model
trainer: *trainer
train_dataset: *train_dataset
eval_dataset: *eval_dataset
data_collator: *data_collator
trainer_callbacks: *trainer_callbacks
trainer_args: *trainer_args
optimizer: *optimizer
lr_scheduler: *lr_scheduler
model_constructor_args: *model_constructor_args
tokenizer: *tokenizer

```
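The preprocessed config repeats keys such as `learning_rate` and `num_hidden_layers` once per template layer (defaults, tiny project, project, experiment). The following is a minimal sketch of how those layers resolve, assuming the YAML loader keeps the last occurrence of a repeated key (PyYAML's default behaviour; that Forgather loads the config this way is an assumption). The `resolve` helper and the layer lists are illustrative only and are not part of Forgather; the values are copied from the `trainer_args` and `model_config` blocks above.

```python
# Override layers copied from the `trainer_args` block, in file order.
trainer_layers = [
    # Minimal and base trainer defaults
    {"per_device_train_batch_size": 16, "per_device_eval_batch_size": 32,
     "logging_steps": 500, "eval_steps": 100, "learning_rate": 5.0e-5},
    # Tiny project overrides
    {"per_device_train_batch_size": 32, "per_device_eval_batch_size": 64,
     "logging_steps": 100, "eval_steps": 500, "learning_rate": 1.0e-3},
    # Project overrides
    {"per_device_train_batch_size": 8, "per_device_eval_batch_size": 16,
     "logging_steps": 100, "eval_steps": 500, "learning_rate": 1.0e-3},
    # Experiment overrides (lower_lr.yaml)
    {"learning_rate": 1.0e-4},
]

# Override layers copied from the `model_config` block, in file order.
model_layers = [
    # Base model defaults
    {"hidden_size": 512, "dim_feedforward": 2048, "num_attention_heads": 8,
     "num_hidden_layers": 6, "embedding_dropout": 0.1, "layer_dropout": 0.1},
    # Tiny Causal overrides
    {"hidden_size": 256, "dim_feedforward": 1024, "num_attention_heads": 2,
     "num_hidden_layers": 4, "embedding_dropout": 0.0, "layer_dropout": 0.0},
    # Project overrides
    {"num_attention_heads": 4, "num_hidden_layers": 20,
     "embedding_dropout": 0.1, "layer_dropout": 0.1},
]


def resolve(layers):
    """Merge layers in order; a later layer wins for any repeated key."""
    effective = {}
    for layer in layers:
        effective.update(layer)
    return effective


effective_trainer_args = resolve(trainer_layers)
effective_model_config = resolve(model_layers)
print(effective_trainer_args)
print(effective_model_config)
```

Under that assumption, this configuration trains a 20-layer model with a hidden size of 256 at a learning rate of 1.0e-4, with a per-device train batch size of 8.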

### Config Metadata:

```python
{'config_class': 'type.training_script.causal_lm',
 'config_description': 'Baseline Simple Init',
 'config_name': 'Control with lowered LR',
 'create_new_model': 'True',
 'datasets_dir': '/home/dinalt/ai_assets/forgather/datasets',
 'eval': 'False',
 'forgather_dir': '/home/dinalt/ai_assets/forgather',
 'logging_dir': './output_models/lower_lr/runs/lower_lr_2025-05-24T20-55-33',
 'model_src_dir': '/home/dinalt/ai_assets/forgather/model_src',
 'models_dir': './output_models',
 'output_dir': './output_models/lower_lr',
 'project_dir': '.',
 'save_model': 'False',
 'tokenizers_dir': '/home/dinalt/ai_assets/forgather/tokenizers',
 'train': 'True',
 'workspace_root': '/home/dinalt/ai_assets/forgather'}

```

## Modules
- [./output_models/lower_lr/tiny_causal_transformer.py](output_models/lower_lr/tiny_causal_transformer.py) : DynamicCasualLM
    - [/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/./output_models/lower_lr/tiny_causal_transformer.py](output_models/lower_lr/tiny_causal_transformer.py) : tiny_causal_transformer
        - [/home/dinalt/ai_assets/forgather/model_src/bits/causal_lm.py](../../../model_src/bits/causal_lm.py) : tiny_causal_transformer.causal_lm
        - [/home/dinalt/ai_assets/forgather/model_src/bits/causal_loss.py](../../../model_src/bits/causal_loss.py) : tiny_causal_transformer.causal_loss
        - [/home/dinalt/ai_assets/forgather/model_src/bits/causal_multihead_attn.py](../../../model_src/bits/causal_multihead_attn.py) : tiny_causal_transformer.causal_multihead_attn
        - [/home/dinalt/ai_assets/forgather/model_src/bits/feedforward_layer.py](../../../model_src/bits/feedforward_layer.py) : tiny_causal_transformer.feedforward_layer
        - [/home/dinalt/ai_assets/forgather/model_src/bits/init_weights.py](../../../model_src/bits/init_weights.py) : tiny_causal_transformer.init_weights
        - [/home/dinalt/ai_assets/forgather/model_src/bits/input_encoder.py](../../../model_src/bits/input_encoder.py) : tiny_causal_transformer.input_encoder
        - [/home/dinalt/ai_assets/forgather/model_src/bits/layer_stack.py](../../../model_src/bits/layer_stack.py) : tiny_causal_transformer.layer_stack
        - [/home/dinalt/ai_assets/forgather/model_src/bits/post_ln_layer.py](../../../model_src/bits/post_ln_layer.py) : tiny_causal_transformer.post_ln_layer
        - [/home/dinalt/ai_assets/forgather/model_src/bits/sinusoidal_pe.py](../../../model_src/bits/sinusoidal_pe.py) : tiny_causal_transformer.sinusoidal_pe
- [./output_models/lower_lr/tiny_causal_transformer.py](output_models/lower_lr/tiny_causal_transformer.py) : DynamicCausalLMConfig

## Output Targets
- meta
- main
- model_code_writer
- distributed_env
- model
- trainer
- train_dataset
- eval_dataset
- data_collator
- trainer_callbacks
- trainer_args
- optimizer
- lr_scheduler
- model_constructor_args
- tokenizer



## Construct Project

In [10]:
import forgather.nb.notebooks as nb
from forgather import Project

# Pass config name
proj = Project("lower_lr.yaml")

## Dump Model Param Names

In [None]:
model = proj("model")
for name, param in model.named_parameters():
    print(name)

## Train Model in Notebook
Running training inside the notebook only works with a single GPU.

In [None]:
# Use default config and default output target (training script, in this example).
training_script = proj()
training_script.run()

## Start Tensorboard

In [4]:
# Show the command to run TensorBoard. Set local_host=False if TensorBoard should listen on all network interfaces.
nb.display_tb_command(proj, local_host=False)

#### Tensorboard Command

```bash
tensorboard --bind_all --logdir "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet/output_models/simple_init"
```

## Generate Training Script
The preferred way to run training is from the command line. This generates a simple bash script that trains the model.

In [11]:
# The second argument lists the GPUs that may be used; for example, "0,2" exposes only the first and third GPU.
# Multi-GPU training requires a trainer implementation that supports it, e.g. "accel_trainer".
nb.generate_trainingscript(proj, "0")

#### Generated Shell Script
[lower_lr.sh](lower_lr.sh)
```bash
#!/bin/bash
CUDA_VISIBLE_DEVICES='0' torchrun --standalone --nproc-per-node 'gpu' '/home/dinalt/ai_assets/forgather/scripts/train_script.py' -p '/home/dinalt/ai_assets/forgather/examples/trainers/deepnet' "lower_lr.yaml"

```
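The generated script restricts the visible GPUs with `CUDA_VISIBLE_DEVICES` and lets `torchrun` start one process per visible GPU. As a rough sketch of how such a line could be assembled, the hypothetical helper `build_train_command` below (not part of Forgather, and not necessarily how `nb.generate_trainingscript` works) reuses the flags and argument order of the script shown above.

```python
import shlex


def build_train_command(train_script, project_dir, config_name, devices="0"):
    """Hypothetical helper: build a torchrun command line for one project config.

    `devices` is a comma-separated list of GPU indices, e.g. "0,2".
    """
    # Restrict the GPUs visible to the launched processes.
    env = f"CUDA_VISIBLE_DEVICES={shlex.quote(devices)}"
    # Flags mirror the generated script: single-node run, one process per visible GPU.
    args = [
        "torchrun", "--standalone", "--nproc-per-node", "gpu",
        train_script, "-p", project_dir, config_name,
    ]
    return env + " " + " ".join(shlex.quote(arg) for arg in args)


cmd = build_train_command(
    "/home/dinalt/ai_assets/forgather/scripts/train_script.py",
    "/home/dinalt/ai_assets/forgather/examples/trainers/deepnet",
    "lower_lr.yaml",
    devices="0,2",
)
print(cmd)
```

With `devices="0,2"`, `torchrun` would start two processes, which only makes sense with a trainer implementation that supports multiple GPUs.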