Physiological modeling into the metaverse of Mycobacterium tuberculosis beta CA inhibition mechanism
A project that simulates a game of shuffling cups with a hidden ball underneath one of them. It also trains a Transformer-based deep learning model to predict the final position of the ball after a series of swaps.
Implementation for the paper "Understanding and Patching Compositional Reasoning in LLMs" (ACL 2024 Findings, Bangkok, Thailand).
Reverse-engineered Transformer models as a benchmark for interpretability methods
Organizer's repository for the Transformer Interpretability CodaBench competition
Solutions to the ML assignments from the Alignment Research Engineering Accelerator (ARENA) in-person program
Visualising (self)-attention as a vector field: exploring and building intuition. Based on anvaka.github.io/fieldplay.
Exploring length generalization in the context of indirect object identification (IOI) task for mechanistic interpretability.
A replication of "Toy Models of Superposition," a groundbreaking machine learning research paper published by authors affiliated with Anthropic and Harvard in 2022.
Interpretability on 1-layer Transformer models that converge on the Bayesian-optimal solution for statistical tasks
Starting Kit for the CodaBench competition on Transformer Interpretability
This repository contains the code used for the experiments in the paper "Discovering Variable Binding Circuitry with Desiderata".
Identifying the circuit behind pronoun prediction in GPT-2 Small
graphpatch is a library for activation patching on PyTorch neural network models (a minimal sketch of activation patching appears after this list).
A mechanistic interpretability study investigating a sequential model trained to play the board game Othello
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals
CoSy: Evaluating Textual Explanations
🦠 DeepDecipher: An open-source API to MLP neurons
PyTorch and NNsight implementation of AtP* (Kramár et al., 2024, DeepMind)
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking".