# Overview

The current configuration design in TransformerLens has a lot of limitations. It does not allow
outside users to pass in configurations that are not officially supported, and it is very bug
prone, with something as simple as a typo potentially giving you a massive headache. There are also
a number of hidden rules that are not clearly documented, which can go unnoticed until different
pieces of TransformerLens are activated. Allowing users to pass in an optional configuration object
with no further changes does solve a couple of these problems, but it does not solve the bigger
issues. It also introduces a new problem: users could pass in architectures that are not supported,
with no clear way to tell them what isn't supported.

My proposal for resolving all of these problems is to fundamentally revamp the configuration to
allow for something that I like to call configuration composition. From a technical perspective,
this involves creating a centralized class that describes every configuration supported by
TransformerLens. This class would then be used to construct the specific configurations for all
models that are currently supported. Anyone could then see, in a single place, all configuration
features supported by TransformerLens, and could read the code to understand how to create their
own configurations. That covers both submitting new models to TransformerLens and configuring a
model that is not officially supported when TransformerLens already happens to support all of its
architectural pieces separately.

This could simply be an overhaul of the existing HookedTransformerConfig. Everything I am
describing here could be made compatible with that class to give it a more usable interface that
the end user then interacts with directly. At the moment, that class is not really built to be
interacted with, and is instead used as a wrapper around spreading configured anonymous objects.
Overhauling this class to do what I am about to describe is a viable path. However, keeping it as
it is and making a new class meant to be used by the end user would be a way to maintain
compatibility, avoid refactors, and keep model configuration focused only on putting together
configuration for models, as opposed to configuring the full settings needed by HookedTransformer,
which includes checking the available environment.

A very rough, basic example of how this would look in end-user code can be seen immediately below.
I will go into the details of each piece in this document.

In [None]:
config = ModelConfig(
    d_model=4096,
    d_head=8192 // 64,
    n_heads=64,
    act_fn="silu"
    # Other universally required properties across all models go here in the constructor
)
# Enabling specific features not universal among all models
config.enabled_gated_mlp()
# Customizing optional attributes
config.set_positional_embedding_type("alibi")

# and so on, until the full configuration is set


## The constructor

The first piece I want to talk about is what will be passed into the constructor. It should take
everything that is absolutely required by all models, and nothing else. This keeps the code easy
for someone to understand, without adding too much clutter. All fields should be required. If there
is ever an idea that a field should be in the constructor as an option, that is probably an
indication that there is a good case for adding a function that configures that variable elsewhere
in the class. An example of what this would look like can be seen below.

In [None]:
from typing import Literal

# Make it easy for someone to see which activation functions are supported. This would be moved
# from HookedTransformerConfig.
ActivationFunction = Literal["silu", "gelu"]


class ModelConfig:
    def __init__(
        self,
        d_model: int,
        eps: float,
        act_fn: ActivationFunction,
        remaining_required_attributes,  # placeholder for the other required fields
    ):
        self.d_model = d_model
        self.eps = eps
        self.act_fn = act_fn
        # Set defaults for any remaining supported attributes that are not required here
        self.gated_mlp = False


## Boolean Variables

Within the TransformerLens config, anything that is a boolean variable is essentially a feature
flag. This means that all features would have default values at the time of construction, most
likely set to false. They then get toggled on with an `enable_feature`-style function call on the
config object. Having these functions will make it very clear to someone less familiar with
TransformerLens which features are available. It also allows us to decorate these calls, which is
very important. There are some instances where if one boolean is true, a different one cannot be
true, but this requirement is not clear anywhere without analyzing the code. Decorating these
functions allows us to make sure these sorts of bugs are not possible. I will use `gated_mlp` as an
example here, but it is not meant to be a real implementation.

In [None]:
def enabled_gated_mlp(self: ModelConfig) -> ModelConfig:
    self.gated_mlp = True
    # Configure any side effects caused by enabling of a feature
    self.another_feature = False
    # Returning self allows someone to chain together config calls
    return self

ModelConfig.enabled_gated_mlp = enabled_gated_mlp
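
To make the "one flag excludes another" rule concrete, here is a minimal sketch of how such a guard
could be written as a Python decorator. Everything in it is hypothetical: `conflicts_with`,
`DemoFlagConfig`, `feature_a`, and `feature_b` are illustrative names, not existing TransformerLens
flags or APIs, and the exclusion rule itself is invented for the example.

```python
from functools import wraps


def conflicts_with(*flags: str):
    """Reject the call if any of the named boolean flags is already enabled."""

    def decorator(method):
        @wraps(method)
        def wrapper(self, *args, **kwargs):
            # Collect every conflicting flag that is currently switched on
            active = [flag for flag in flags if getattr(self, flag, False)]
            if active:
                raise ValueError(
                    f"{method.__name__} cannot be combined with: {', '.join(active)}"
                )
            return method(self, *args, **kwargs)

        return wrapper

    return decorator


class DemoFlagConfig:
    """Hypothetical stand-in for the proposed config class."""

    def __init__(self) -> None:
        # Feature flags default to off
        self.feature_a = False
        self.feature_b = False

    @conflicts_with("feature_b")
    def enable_feature_a(self) -> "DemoFlagConfig":
        self.feature_a = True
        return self

    @conflicts_with("feature_a")
    def enable_feature_b(self) -> "DemoFlagConfig":
        self.feature_b = True
        return self


config = DemoFlagConfig().enable_feature_a()
try:
    config.enable_feature_b()
except ValueError as error:
    # The conflict is reported at configuration time, not deep inside model code
    print(error)
```

With a guard like this, the incompatibility is stated once, next to the feature it belongs to, and
the user gets an error at the moment they make the invalid call.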

## Additional Options

Any other options would similarly have their own functions to configure them. This allows for the
same kind of decoration as with feature flags, and in a way it also documents the architectural
capabilities of TransformerLens in a single place. If there are groups of options that are always
required together, this gives us a way to require all of them at once, as opposed to having them
all configured separately at the root level. It also allows us to change other attributes that are
affected as a side effect of some values being set, which again makes it harder for people to
introduce bugs and creates code that documents itself. Another off-the-cuff example of something
like this can be seen below.

In [None]:
def set_rotary_dim(self: ModelConfig, rotary_dim: int) -> ModelConfig:
    self.rotary_dim = rotary_dim
    # Additional settings that seem to be present whenever rotary_dim is set
    self.positional_embedding_type = "rotary"
    self.rotary_adjacent_pairs = False
    return self

ModelConfig.set_rotary_dim = set_rotary_dim
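
The point about options that are always required together could look something like the sketch
below. It is only an illustration under assumptions: `DemoRotaryConfig` and `set_rotary` are
hypothetical names, the pairing of `rotary_dim` with `rotary_base` is assumed for the sake of the
example, and the `rotary_dim <= d_head` check is an assumed validation rule rather than a documented
one.

```python
from typing import Optional


class DemoRotaryConfig:
    """Hypothetical stand-in for the proposed config class."""

    def __init__(self, d_head: int) -> None:
        self.d_head = d_head
        # Defaults for optional attributes
        self.positional_embedding_type = "standard"
        self.rotary_dim: Optional[int] = None
        self.rotary_base: Optional[int] = None
        self.rotary_adjacent_pairs = False

    def set_rotary(self, *, rotary_dim: int, rotary_base: int) -> "DemoRotaryConfig":
        # Both parameters are keyword-only and have no defaults, so leaving
        # either one out raises a TypeError instead of silently half-configuring
        if rotary_dim > self.d_head:
            raise ValueError("rotary_dim cannot be larger than d_head")
        self.rotary_dim = rotary_dim
        self.rotary_base = rotary_base
        # Side effect that follows from choosing rotary embeddings
        self.positional_embedding_type = "rotary"
        return self


config = DemoRotaryConfig(d_head=128).set_rotary(rotary_dim=64, rotary_base=10000)
```

Grouping the values into one call means a partially specified group is rejected by Python itself,
before any TransformerLens code has to check for it.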

## Config Final Thoughts

The best way to describe this idea is configuration composition, because the user is essentially
composing a model configuration by setting the base and then combining various options from
predefined functions. Doing it like this has a lot of advantages. One is that far less memorization
is needed about how architectural options should be combined. For example, maybe it's not that hard
to remember that `rotary_adjacent_pairs` should be False when `rotary_dim` is set, but these sorts
of combinations accumulate. Exposing them through an interface gives everyone a place to look to
see how parts of the configuration work in isolation, without needing to memorize a large number of
rules.

This would also allow us to more easily mock out fake configurations and enable specific features
in order to test that functionality in isolation. It should also make it easier for someone to
understand, at a glance, which models and features are compatible with TransformerLens and where
the limitations are, since there would be a single file where they are all listed and documented.

As for compatibility, this change would be 100% compatible with the existing structure. The objects
I am suggesting are abstractions of the existing configuration dictionaries for the purpose of
communication and ease of use. This means that they can be passed around just like the current
anonymous dictionaries.
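
As a rough sketch of that compatibility, the config object could expose its attributes as a plain
dictionary so that existing dictionary-based code keeps working. The names here are assumptions
made for illustration: `DemoDictConfig`, `to_dict`, and `legacy_consumer` are hypothetical and do
not exist in TransformerLens.

```python
from typing import Any, Dict, List


class DemoDictConfig:
    """Hypothetical stand-in for the proposed config class."""

    def __init__(self, d_model: int, n_heads: int, act_fn: str) -> None:
        self.d_model = d_model
        self.n_heads = n_heads
        self.act_fn = act_fn
        self.gated_mlp = False

    def enable_gated_mlp(self) -> "DemoDictConfig":
        self.gated_mlp = True
        return self

    def to_dict(self) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the config through the dict
        return dict(vars(self))


def legacy_consumer(cfg_dict: Dict[str, Any]) -> List[str]:
    """Stand-in for existing code that expects a plain configuration dict."""
    return sorted(cfg_dict)


cfg_dict = (
    DemoDictConfig(d_model=4096, n_heads=32, act_fn="silu")
    .enable_gated_mlp()
    .to_dict()
)
keys_seen = legacy_consumer(cfg_dict)
```

The composed object is only used while building the configuration. What leaves it is the same
shape of dictionary that is passed around today.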

## Further Changes

With this in place, there are a number of changes that I would like to make to the actual
`loading_from_pretrained` file, so that it is ready for the possibility of rapidly supporting new
models. The biggest change would be to break out what is now a configuration dictionary for every
model into its own module, where one of these configuration objects would be constructed. That
object would then be exposed so that it can be imported into `loading_from_pretrained`. We would
then create a dictionary that maps the official name of each model to its configuration object,
completely eliminating the giant if/else statement and replacing it with a simple lookup in the
dictionary. The configurations themselves would then live in a directory structure like so...

config/ <- where the ModelConfig file lives
config/meta-llama/ <- directory for all models from the group
config/meta-llama/Llama-2-13b.py <- name matching Hugging Face to make it really easy to find the
                                    configuration
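
The dictionary lookup described above could be sketched roughly as follows. This is not the
existing `loading_from_pretrained` code: `DemoRegistryConfig`, `MODEL_CONFIGS`, and
`get_model_config` are hypothetical names, and the two entries are defined inline only to keep the
example self-contained, whereas in the proposal each would be imported from its own module.

```python
from typing import Dict


class DemoRegistryConfig:
    """Hypothetical stand-in for the proposed config class."""

    def __init__(self, d_model: int, n_heads: int) -> None:
        self.d_model = d_model
        self.n_heads = n_heads


# In the proposal, each object would be built in its own module under config/
# and imported here; they are constructed inline for illustration only.
MODEL_CONFIGS: Dict[str, DemoRegistryConfig] = {
    "meta-llama/Llama-2-7b": DemoRegistryConfig(d_model=4096, n_heads=32),
    "meta-llama/Llama-2-13b": DemoRegistryConfig(d_model=5120, n_heads=40),
}


def get_model_config(official_name: str) -> DemoRegistryConfig:
    """Replace the if/else chain with a single dictionary lookup."""
    try:
        return MODEL_CONFIGS[official_name]
    except KeyError:
        raise ValueError(
            f"No configuration registered for model: {official_name}"
        ) from None


llama_13b = get_model_config("meta-llama/Llama-2-13b")
```

One practical detail to keep in mind: file and directory names containing hyphens, such as
`Llama-2-13b.py`, cannot be loaded with a plain `import` statement, so the real layout would
probably need either `importlib` or underscore-based module names.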

## Impact on Testing

This change would allow us to interact directly with these configuration objects. That makes it
easier to assert that configurations are set properly, and easier to access these configurations in
tests for the purpose of writing better unit tests.
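
As an illustration of what such tests might look like, here is a small sketch written as plain
test functions that a runner such as pytest could discover. `DemoTestConfig` is a hypothetical
stand-in that mirrors the `set_rotary_dim` example from earlier in this document. It is not the
real class, and the default values it uses are assumptions.

```python
from typing import Optional


class DemoTestConfig:
    """Hypothetical stand-in mirroring the set_rotary_dim example above."""

    def __init__(self, d_model: int) -> None:
        self.d_model = d_model
        # Assumed defaults for the optional attributes
        self.gated_mlp = False
        self.rotary_dim: Optional[int] = None
        self.positional_embedding_type = "standard"
        self.rotary_adjacent_pairs = False

    def set_rotary_dim(self, rotary_dim: int) -> "DemoTestConfig":
        self.rotary_dim = rotary_dim
        self.positional_embedding_type = "rotary"
        self.rotary_adjacent_pairs = False
        return self


def test_flags_default_to_off() -> None:
    config = DemoTestConfig(d_model=64)
    assert config.gated_mlp is False
    assert config.rotary_dim is None


def test_set_rotary_dim_applies_side_effects() -> None:
    config = DemoTestConfig(d_model=64).set_rotary_dim(16)
    assert config.rotary_dim == 16
    assert config.positional_embedding_type == "rotary"
    assert config.rotary_adjacent_pairs is False


# A test runner would normally call these; they are invoked directly here
# so the sketch can be run as a plain script.
test_flags_default_to_off()
test_set_rotary_dim_applies_side_effects()
```

Each test builds a tiny configuration, enables exactly one option, and checks the resulting
attributes, without loading any model weights.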

## Summary

This change should solve a lot of problems. It may be a big change at first from what currently
exists, but in time I think most people will find it more elegant and easier to understand.