ft_exp

Refactoring MLPs to be more interpretable by distilling them into larger, sparse MLPs.

The main analysis is in analysis.ipynb.

Training code is in train.py and model definitions are in model.py.
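To make the idea of distilling an MLP into a larger, sparse one concrete, here is a minimal sketch of one possible objective: match the original (teacher) MLP's output while penalizing the hidden activations of a wider (student) MLP with an L1 term. It is an illustration under stated assumptions, not the code in train.py or model.py. The function names, layer sizes, the absence of biases, and the `l1_coeff` value are all hypothetical, and it uses only the standard library so that it runs on its own.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mlp_forward(W_in, W_out, x):
    """One-hidden-layer MLP without biases; returns (output, hidden activations)."""
    hidden = relu(matvec(W_in, x))
    return matvec(W_out, hidden), hidden


def distill_loss(teacher, student, x, l1_coeff=0.01):
    """Assumed objective: output MSE against the teacher plus an L1 sparsity penalty."""
    t_out, _ = mlp_forward(teacher[0], teacher[1], x)
    s_out, s_hidden = mlp_forward(student[0], student[1], x)
    mse = sum((s - t) ** 2 for s, t in zip(s_out, t_out)) / len(t_out)
    l1 = sum(abs(h) for h in s_hidden) / len(s_hidden)
    return mse + l1_coeff * l1


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_teacher, d_student = 4, 8, 32  # hypothetical sizes; the student is wider

# Each model is a pair (W_in, W_out).
teacher = (random_matrix(d_teacher, d_model, rng), random_matrix(d_model, d_teacher, rng))
student = (random_matrix(d_student, d_model, rng), random_matrix(d_model, d_student, rng))

x = [rng.uniform(-1.0, 1.0) for _ in range(d_model)]
loss = distill_loss(teacher, student, x)
```

In a real training loop this loss would be minimized over the student's weights with an autodiff framework. The L1 term is what pushes most hidden units toward zero on any given input.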

