# Project Index

In [1]:
import forgather.nb.notebooks as nb
nb.display_project_index()

## Tiny Models

A collection of tiny models to train on the Tiny Stories dataset with the tiny_stories_2k tokenizer.

Because every model shares the same tokenizer, dataset, and trainer settings, the model architectures can be compared directly.

### Featuring
- Tiny Causal Transformer -- an example custom transformer model
- Tiny Llama
- Tiny GPT2

### Common Configuration
- Tokenizer: tokenizers/tiny_2k_bpe.yaml
    - Vocabulary Size: 2000
    - Maximum Model Sequence: 2048
- Dataset: datasets/tiny/tiny_stories_abridged.yaml
    - Dataset ID: roneneldan/TinyStories
    - Reference: https://arxiv.org/abs/2305.07759
    - Train Select Range: 10% 
- Model:
    - Model Dimension: 256
    - MLP Dimension: 1024
    - Layers: 4
    - Heads: 2
    - All Dropout Probabilities: 0.0
- Trainer:
    - Class: aiws.trainer.Trainer
    - Epochs: 1
    - Learning Rate: 1.0e-3
    - Batch Size: 16
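
To give a sense of scale for the shared model settings above, the following is a rough back-of-the-envelope parameter estimate. It is a sketch under stated assumptions, not the way Forgather counts parameters: `estimate_params` is a hypothetical helper, it assumes a plain transformer block (four `d_model x d_model` attention projections and a two-matrix MLP), and it ignores biases, normalization weights, and positional embeddings. Llama-style gated MLPs use three matrices, so the real counts will differ between the three models.

```python
def estimate_params(d_model, d_mlp, n_layers, vocab_size, tie_embeddings=True):
    """Approximate weight count for a plain decoder-only transformer.

    Assumption: biases, norm weights, and positional embeddings are ignored.
    """
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp = 2 * d_model * d_mlp      # up- and down-projection (non-gated MLP)
    embed = vocab_size * d_model   # token embedding table
    # An untied output head adds a second vocab_size x d_model matrix.
    head = 0 if tie_embeddings else vocab_size * d_model
    return n_layers * (attn + mlp) + embed + head


# Values taken from the Common Configuration list.
total = estimate_params(d_model=256, d_mlp=1024, n_layers=4, vocab_size=2000)
head_dim = 256 // 2  # model dimension divided by the number of heads

print(f"approximate parameters: {total:,}")
print(f"per-head dimension: {head_dim}")
```

Under these assumptions the models land in the low millions of parameters, which is why a single epoch over 10% of the dataset is a practical comparison run.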

#### Project Directory: "/home/dinalt/ai_assets/forgather/examples/tiny_experiments/tiny_models"

## Meta Config
Meta Config: [/home/dinalt/ai_assets/forgather/examples/tiny_experiments/tiny_models/meta.yaml](meta.yaml)

- [meta.yaml](meta.yaml)
    - [meta_defaults.yaml](../forgather_workspace/meta_defaults.yaml)
        - [base_directories.yaml](../forgather_workspace/base_directories.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/examples/tiny_experiments/tiny_models/templates](templates)
- [/home/dinalt/ai_assets/forgather/examples/tiny_experiments/forgather_workspace](../forgather_workspace)
- [/home/dinalt/ai_assets/forgather/examples/tiny_experiments/tiny_templates](../tiny_templates)
- [/home/dinalt/ai_assets/forgather/templatelib/modellib](../../../templatelib/modellib)
- [/home/dinalt/ai_assets/forgather/templatelib/examples](../../../templatelib/examples)
- [/home/dinalt/ai_assets/forgather/templatelib/base](../../../templatelib/base)

## Available Configurations
- [tiny_causal.yaml](templates/configs/tiny_causal.yaml)
- [tiny_gpt2.yaml](templates/configs/tiny_gpt2.yaml)
- [tiny_llama.yaml](templates/configs/tiny_llama.yaml)

Default Configuration: tiny_causal.yaml



In [6]:
nb.display_config(config_template="tiny_llama.yaml", show_pp_config=False, show_generated_code=False)

## Included Templates
- [configs/tiny_llama.yaml](templates/configs/tiny_llama.yaml)
    - [project.yaml](templates/project.yaml)
        - [projects/tiny.yaml](../tiny_templates/projects/tiny.yaml)
            - [prompts/tiny_stories.yaml](../tiny_templates/prompts/tiny_stories.yaml)
            - [types/training_script/causal_lm/causal_lm.yaml](../../../templatelib/base/types/training_script/causal_lm/causal_lm.yaml)
                - [trainers/trainer.yaml](../../../templatelib/base/trainers/trainer.yaml)
                    - [trainers/base_trainer.yaml](../../../templatelib/base/trainers/base_trainer.yaml)
                        - [trainers/minimal_trainer.yaml](../../../templatelib/base/trainers/minimal_trainer.yaml)
                - [callbacks/loggers.yaml](../../../templatelib/base/callbacks/loggers.yaml)
                    - [callbacks/base_callbacks.yaml](../../../templatelib/base/callbacks/base_callbacks.yaml)
                - [models/causal_lm/load_model.yaml](../../../templatelib/base/models/causal_lm/load_model.yaml)
                    - [models/causal_lm/from_pretrained.yaml](../../../templatelib/base/models/causal_lm/from_pretrained.yaml)
                        - [models/base_language_model.yaml](../../../templatelib/base/models/base_language_model.yaml)
                - [types/training_script/training_script.yaml](../../../templatelib/base/types/training_script/training_script.yaml)
                    - [types/type.yaml](../../../templatelib/base/types/type.yaml)
                        - [base_directories.yaml](../forgather_workspace/base_directories.yaml)
                - [inc/formatting.jinja](../../../templatelib/base/inc/formatting.jinja)
            - [tiny.callbacks](../tiny_templates/projects/tiny.yaml)
            - [tiny.model_config](../tiny_templates/projects/tiny.yaml)
                - [models/tiny/tiny_causal.yaml](../tiny_templates/models/tiny/tiny_causal.yaml)
                    - [tokenizers/tiny_2k.yaml](../../../templatelib/examples/tokenizers/tiny_2k.yaml)
                    - [models/dynamic_causal_transformer.yaml](../../../templatelib/examples/models/dynamic_causal_transformer.yaml)
                        - [models/causal_lm/custom_dynamic.yaml](../../../templatelib/base/models/causal_lm/custom_dynamic.yaml)
                            - [models/causal_lm/custom.yaml](../../../templatelib/base/models/causal_lm/custom.yaml)
            - [tiny.trainer_config](../tiny_templates/projects/tiny.yaml)
            - [tiny.dataset_config](../tiny_templates/projects/tiny.yaml)
                - [datasets/tinystories/tinystories_abridged.yaml](../../../templatelib/examples/datasets/tinystories/tinystories_abridged.yaml)
                    - [datasets/tinystories/tinystories.yaml](../../../templatelib/examples/datasets/tinystories/tinystories.yaml)
                        - [datasets//base_datasets.yaml](../../../templatelib/base/datasets/base_datasets.yaml)
        - [project.trainer_config](templates/project.yaml)
    - [experiment.model_config](templates/configs/tiny_llama.yaml)
        - [models/tiny/tiny_llama.yaml](../tiny_templates/models/tiny/tiny_llama.yaml)
            - [models/llama.yaml](../../../templatelib/examples/models/llama.yaml)
                - [models/causal_lm/from_config.yaml](../../../templatelib/base/models/causal_lm/from_config.yaml)
### Config Metadata:

```python
{'config_class': 'type.training_script.causal_lm',
 'config_description': 'A tiny llama model.',
 'config_name': 'Tiny LLama',
 'create_new_model': 'True',
 'datasets_dir': '/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../../datasets',
 'eval': 'False',
 'forgather_dir': '/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../..',
 'logging_dir': './output_models/tiny_llama/runs/log_2025-06-21T23-26-27',
 'model_src_dir': '/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../../model_src',
 'models_dir': './output_models',
 'output_dir': './output_models/tiny_llama',
 'project_dir': '.',
 'save_model': 'False',
 'tokenizers_dir': '/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../../tokenizers',
 'train': 'True',
 'workspace_root': '/home/dinalt/ai_assets/forgather/examples/tiny_experiments'}

```
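
Two details of the metadata above are easy to miss: several directories contain unresolved `..` segments, and the flags are stored as the strings `'True'` and `'False'` rather than booleans. The sketch below shows one way to normalize both using only the standard library. The dictionary is copied by hand from the output above; how Forgather itself consumes these values is not shown here, so treat this purely as an illustration.

```python
import posixpath

# A subset of the metadata shown above, copied verbatim.
metadata = {
    "forgather_dir": "/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../..",
    "datasets_dir": "/home/dinalt/ai_assets/forgather/examples/tiny_experiments/../../datasets",
    "train": "True",
    "eval": "False",
}

# Collapse the ".." segments (purely lexical; the filesystem is not touched).
paths = {
    key: posixpath.normpath(value)
    for key, value in metadata.items()
    if key.endswith("_dir")
}

# Convert the string flags into real booleans.
flags = {
    key: value == "True"
    for key, value in metadata.items()
    if value in ("True", "False")
}

print(paths)
print(flags)
```

After normalization, `forgather_dir` resolves to the repository root and `datasets_dir` to the `datasets` directory directly beneath it.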

## Modules
## Output Targets
- distributed_env
- testprompts
- generation_config
- model_constructor_args
- tokenizer
- model_config
- model
- train_source_dataset
- eval_source_dataset
- train_dataset_split
- eval_dataset_split
- preprocess_args
- train_dataset
- eval_dataset
- data_collator
- experiment_info
- text_gen_callback_args
- trainer_callbacks
- optimizer
- lr_scheduler
- trainer_args
- model_preprocessor
- trainer
- meta
- main



In [1]:
from forgather.project import Project
import forgather.nb.notebooks as nb

proj = Project("tiny_llama.yaml")

In [None]:
# Build the training script object from the project, then run it.
training_script = proj()
training_script.run()

In [None]:
# Show the command for launching TensorBoard. Set local_host=False to make TensorBoard listen on all network interfaces.
nb.display_tb_command(proj, local_host=False)
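
For readers unfamiliar with TensorBoard, the sketch below shows roughly what such a command can look like. `tensorboard_command` is a hypothetical helper written for this illustration and is not the implementation of `nb.display_tb_command`; the log directory is assumed to be the parent of the `logging_dir` shown in the config metadata. The `--logdir` and `--bind_all` options are standard TensorBoard flags.

```python
import shlex


def tensorboard_command(logdir, local_host=True):
    """Build a TensorBoard command line (hypothetical helper)."""
    args = ["tensorboard", "--logdir", logdir]
    if not local_host:
        # --bind_all makes TensorBoard listen on all network interfaces
        # instead of only on localhost.
        args.append("--bind_all")
    return shlex.join(args)


# Assumed log location: the "runs" directory under the model's output_dir.
command = tensorboard_command("./output_models/tiny_llama/runs", local_host=False)
print(command)
```

Binding to all interfaces is convenient when training on a remote machine, but it also exposes the dashboard to anyone who can reach that host, so the localhost default is the safer choice on shared networks.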

In [None]:
nb.generate_trainingscript(proj, "0")