This is the codebase for the Guarantees-Based Mechanistic Interpretability MARS stream. Successor to https://github.com/JasonGross/neural-net-coq-interp.
The code can be run in any environment with Python 3.9 or above.

We use [poetry](https://python-poetry.org/docs/#installation) for dependency management, which can be installed by following the instructions at that link.

To build a virtual environment with the required packages, simply run
```sh
poetry config virtualenvs.in-project true
poetry install
```
Notes
- On some systems you may need to set the environment variable `PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring` to avoid keyring-related errors.
- The first line tells poetry to create the virtual environment in the project directory, which allows VS Code to find the virtual environment.
- If you are using caches from other machines and you see errors like `dbm.error: db type is dbm.gnu, but the module is not available`, you can probably solve the issue by following instructions from StackOverflow:

  ```sh
  sudo apt-get install libgdbm-dev python3-gdbm
  ```
- If you are using conda or some other Python version manager, you can inspect the output of `dpkg -L python3-gdbm` and copy the `lib-dynload/_gdbm.cpython-*-x86_64-linux-gnu.so` file to the corresponding `lib/` directory associated with the Python you are using.
To open a Jupyter notebook, run
```sh
poetry run jupyter lab
```
If this doesn't work (e.g. you have multiple Jupyter kernels already installed on your system), you may need to make a new kernel for this project:
```sh
poetry run python -m ipykernel install --user --name=gbmi
```
Models for existing experiments can be trained by running, e.g.,

```sh
poetry run python -m gbmi.exp_max_of_n.train
```

or by running, e.g.,

```python
from gbmi.exp_max_of_n.train import MAX_OF_10_CONFIG
from gbmi.model import train_or_load_model

rundata, model = train_or_load_model(MAX_OF_10_CONFIG)
```

from a Jupyter notebook.
This function will attempt to pull a trained model with the specified config from Weights and Biases; if such a model does not exist, it will train the relevant model and save the weights to Weights and Biases.
The convention for this codebase is to store experiment-specific code in an `exp_[NAME]/` folder, with

- `exp_[NAME]/analysis.py` storing functions for visualisation / interpretability
- `exp_[NAME]/verification.py` storing functions for verification
- `exp_[NAME]/train.py` storing training / dataset code

See the `exp_template` directory for more details.
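The convention above can be pictured as a directory tree (the experiment name and the `__init__.py` package marker here are illustrative assumptions, not taken from the repo):

```
gbmi/
└── exp_my_experiment/
    ├── __init__.py      # package marker (assumed)
    ├── analysis.py      # visualisation / interpretability
    ├── train.py         # training / dataset code
    └── verification.py  # verification
```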
To add new dependencies, run

```sh
poetry add my-package
```
We use black to format our code. To set up the pre-commit hooks that enforce code formatting, run
```sh
make pre-commit-install
```
This codebase advocates for expect tests in machine learning, and as such uses [@ezyang's `expecttest` library](https://github.com/ezyang/expecttest) for unit and regression tests.
[TODO: add tests?]