# Project Index

[Custom Model Notebook](../../../notebooks/custom_model.ipynb)  
[Training Notebook](../../../notebooks/train.ipynb)  
[Project Config Notebook](../../../notebooks/project_config.ipynb)  
[Forgather Notebook](../../../notebooks/forgather.ipynb)  

```python
import forgather.nb.notebooks as nb

# Display the project index for the tiny_llama.yaml configuration.
nb.display_project_index(
    config_template="tiny_llama.yaml",
    show_pp_config=False,
    show_generated_code=False,
    pp_first=False,
)
```

## Tiny Models

A collection of tiny models to train on the Tiny Stories dataset with the tiny_stories_2k tokenizer.

Because every model shares the same tokenizer, dataset, and trainer settings, the model architectures can be compared directly.

### Featuring
- Tiny Causal Transformer -- an example custom transformer model
- Tiny Llama
- Tiny GPT2

### Common Configuration
- Tokenizer: tokenizers/tiny_2k_bpe.yaml
    - Vocabulary Size: 2000
    - Maximum Model Sequence: 2048
- Dataset: datasets/tiny/tiny_stories_abridged.yaml
    - Dataset ID: roneneldan/TinyStories
    - Reference: https://arxiv.org/abs/2305.07759
    - Train Select Range: 10% 
- Model:
    - Model Dimension: 256
    - MLP Dimension: 1024
    - Layers: 4
    - Heads: 2
    - All Dropout Probabilities: 0.0
- Trainer:
    - Class: aiws.trainer.Trainer
    - Epochs: 1
    - Learning Rate: 1.0e-3
    - Batch Size: 16
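
To get a feel for how small these models are, the shared dimensions above can be turned into a rough parameter estimate. This is only a sketch under stated assumptions: it assumes a generic decoder-only transformer with four `d_model x d_model` attention projections per layer, ignores biases, normalization weights, and positional embeddings, and assumes tied input/output embeddings. The function `estimate_parameters` and its arguments are hypothetical names for illustration, not part of Forgather.

```python
def estimate_parameters(d_model, d_mlp, n_layers, vocab_size,
                        mlp_matrices=2, tied_embeddings=True):
    """Rough weight count for a decoder-only transformer (assumptions in comments)."""
    # Assumption: Q, K, V and output projections, each d_model x d_model.
    attention = 4 * d_model * d_model
    # Assumption: 2 matrices for a plain MLP, 3 for a gated (Llama-style) MLP.
    mlp = mlp_matrices * d_model * d_mlp
    # Token embedding table; doubled if the output head is not tied to it.
    embeddings = vocab_size * d_model
    if not tied_embeddings:
        embeddings *= 2
    return n_layers * (attention + mlp) + embeddings


# Values taken from the Common Configuration list above.
plain = estimate_parameters(d_model=256, d_mlp=1024, n_layers=4, vocab_size=2000)
gated = estimate_parameters(d_model=256, d_mlp=1024, n_layers=4, vocab_size=2000,
                            mlp_matrices=3)
print(f"plain MLP: {plain:,} parameters")
print(f"gated MLP: {gated:,} parameters")
```

Under these assumptions the estimate comes to roughly 3.7 million weights with a two-matrix MLP and roughly 4.7 million with a gated MLP. The actual counts depend on each model's implementation.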

#### Project Directory: "/home/dinalt/ai_assets/forgather/examples/trainers/tiny_models"

## Meta Config
Meta Config: [/home/dinalt/ai_assets/forgather/examples/trainers/tiny_models/meta.yaml](meta.yaml)

- [meta.yaml](meta.yaml)
    - [meta_defaults.yaml](../../../forgather_workspace/meta_defaults.yaml)
        - [base_directories.yaml](../../../forgather_workspace/base_directories.yaml)

Template Search Paths:
- [/home/dinalt/ai_assets/forgather/examples/trainers/tiny_models/templates](templates)
- [/home/dinalt/ai_assets/forgather/forgather_workspace](../../../forgather_workspace)
- [/home/dinalt/ai_assets/forgather/templates/tiny_experiments](../../../templates/tiny_experiments)
- [/home/dinalt/ai_assets/forgather/templates/modellib](../../../templates/modellib)
- [/home/dinalt/ai_assets/forgather/templates/base](../../../templates/base)
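
A search-path list like the one above is typically consulted in order, so a template in the project's own `templates` directory would shadow a same-named template in a shared library directory. The sketch below illustrates that first-match-wins idea using only the standard library. It assumes this ordering applies here, which the listing does not confirm, and `resolve_template` is a hypothetical helper, not a Forgather function.

```python
import tempfile
from pathlib import Path


def resolve_template(name, search_paths):
    """Return the first existing file called `name` under the ordered search paths."""
    for root in search_paths:
        candidate = Path(root) / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name!r} not found in {[str(p) for p in search_paths]}")


# Small demonstration with two temporary directories standing in for a
# project-level and a library-level template directory.
with tempfile.TemporaryDirectory() as project_dir, tempfile.TemporaryDirectory() as base_dir:
    (Path(base_dir) / "project.yaml").write_text("# library version\n")
    (Path(project_dir) / "project.yaml").write_text("# project version\n")
    found = resolve_template("project.yaml", [project_dir, base_dir])
    print(found.read_text().strip())
```

In the demonstration the project-level file is returned because its directory is listed first.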

## Available Configurations
- [tiny_causal.yaml](templates/configs/tiny_causal.yaml)
- [tiny_gpt2.yaml](templates/configs/tiny_gpt2.yaml)
- [tiny_llama.yaml](templates/configs/tiny_llama.yaml)

Default Configuration: tiny_causal.yaml

Active Configuration: tiny_llama.yaml

## Included Templates
- [configs/tiny_llama.yaml](templates/configs/tiny_llama.yaml)
    - [model_ctor/args.yaml](../../../templates/modellib/model_ctor/args.yaml)
    - [project.yaml](templates/project.yaml)
        - [projects/tiny.yaml](../../../templates/tiny_experiments/projects/tiny.yaml)
            - [datasets/tiny/tiny_stories_abridged.yaml](../../../templates/tiny_experiments/datasets/tiny/tiny_stories_abridged.yaml)
                - [datasets/tiny/tiny_stories.yaml](../../../templates/tiny_experiments/datasets/tiny/tiny_stories.yaml)
                    - [datasets/abstract/base_datasets.yaml](../../../templates/base/datasets/abstract/base_datasets.yaml)
                        - [inc/formatting.jinja](../../../templates/base/inc/formatting.jinja)
            - [prompts/tiny_stories.yaml](../../../templates/tiny_experiments/prompts/tiny_stories.yaml)
            - [types/training_script/causal_lm/causal_lm.yaml](../../../templates/base/types/training_script/causal_lm/causal_lm.yaml)
                - [trainers/trainer.yaml](../../../templates/base/trainers/trainer.yaml)
                    - [trainers/base_trainer.yaml](../../../templates/base/trainers/base_trainer.yaml)
                        - [trainers/minimal_trainer.yaml](../../../templates/base/trainers/minimal_trainer.yaml)
                - [callbacks/loggers.yaml](../../../templates/base/callbacks/loggers.yaml)
                    - [callbacks/base_callbacks.yaml](../../../templates/base/callbacks/base_callbacks.yaml)
                - [models/abstract/load_model.yaml](../../../templates/base/models/abstract/load_model.yaml)
                    - [models/abstract/causal_lm_from_pretrained.yaml](../../../templates/base/models/abstract/causal_lm_from_pretrained.yaml)
                        - [models/abstract/base_language_model.yaml](../../../templates/base/models/abstract/base_language_model.yaml)
                - [types/training_script/training_script.yaml](../../../templates/base/types/training_script/training_script.yaml)
                    - [types/type.yaml](../../../templates/base/types/type.yaml)
                        - [base_directories.yaml](../../../forgather_workspace/base_directories.yaml)
            - [tiny.callbacks](../../../templates/tiny_experiments/projects/tiny.yaml)
            - [tiny.model_config](../../../templates/tiny_experiments/projects/tiny.yaml)
                - [models/tiny/tiny_causal.yaml](../../../templates/tiny_experiments/models/tiny/tiny_causal.yaml)
                    - [tokenizers/tiny_2k.yaml](../../../templates/tiny_experiments/tokenizers/tiny_2k.yaml)
                    - [models/dynamic_causal_transformer.yaml](../../../templates/modellib/models/dynamic_causal_transformer.yaml)
                        - [models/abstract/dynamic_causal_lm.yaml](../../../templates/base/models/abstract/dynamic_causal_lm.yaml)
                            - [models/abstract/custom_causal_lm.yaml](../../../templates/base/models/abstract/custom_causal_lm.yaml)
            - [tiny.trainer_config](../../../templates/tiny_experiments/projects/tiny.yaml)
        - [project.trainer_config](templates/project.yaml)
    - [experiment.model_config](templates/configs/tiny_llama.yaml)
        - [models/tiny/tiny_llama.yaml](../../../templates/tiny_experiments/models/tiny/tiny_llama.yaml)
            - [models/llama.yaml](../../../templates/modellib/models/llama.yaml)
                - [models/abstract/causal_lm_from_config.yaml](../../../templates/base/models/abstract/causal_lm_from_config.yaml)
### Config Metadata:

```python
{'config_class': 'type.training_script.causal_lm',
 'config_description': 'A tiny llama model.',
 'config_name': 'Tiny LLama',
 'create_new_model': 'False',
 'datasets_dir': '/home/dinalt/ai_assets/forgather/datasets',
 'eval': 'False',
 'forgather_dir': '/home/dinalt/ai_assets/forgather',
 'logging_dir': './output_models/tiny_llama/runs/log_2025-05-18T17-30-11',
 'model_src_dir': '/home/dinalt/ai_assets/forgather/model_src',
 'models_dir': './output_models',
 'output_dir': './output_models/tiny_llama',
 'project_dir': '.',
 'save_model': 'False',
 'tokenizers_dir': '/home/dinalt/ai_assets/forgather/tokenizers',
 'train': 'True',
 'workspace_root': '/home/dinalt/ai_assets/forgather'}

```
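
Note that the flags in the metadata above (`train`, `eval`, `save_model`, `create_new_model`) are shown as the strings `'True'` and `'False'`, not as booleans. If such a dictionary is inspected directly in Python, `bool('False')` evaluates to `True`, so a small conversion step is needed. The sketch below shows one way to do this. `parse_flag` is a hypothetical helper, and whether Forgather converts these values internally is not shown here.

```python
def parse_flag(value):
    """Interpret 'True'/'False'-style strings as booleans (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean flag")


# Flag values copied from the Config Metadata shown above.
metadata = {
    "create_new_model": "False",
    "eval": "False",
    "save_model": "False",
    "train": "True",
}

flags = {key: parse_flag(value) for key, value in metadata.items()}
print(flags)
```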

## Modules
## Output Targets
- meta
- main
- model_code_writer
- distributed_env
- model
- trainer
- train_dataset
- eval_dataset
- data_collator
- trainer_callbacks
- trainer_args
- optimizer
- lr_scheduler
- model_constructor_args
- tokenizer



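```python
from forgather.project import Project

# Load the project using the tiny_llama.yaml configuration.
proj = Project("tiny_llama.yaml")

# Calling the project object returns the training script, which is then run.
training_script = proj()
training_script.run()
```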