In-context Language Learning: Architectures and Algorithms [WIP]

This repo contains the experiments for the paper:

Title: In-context Language Learning: Architectures and Algorithms

Authors: Ekin Akyürek, Bailin Wang, Yoon Kim, Jacob Andreas

Setup

conda create -n seq_icl python=3.11
conda activate seq_icl
pip install -r requirements.txt
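
As a quick sanity check (assuming PyTorch is pinned in requirements.txt, which the Mamba/causal-conv1d dependencies below imply), you can confirm the environment sees your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"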

Experiments

Experiments on DFA

To run training:

python -m train experiment=dfa/lstm
python -m train experiment=dfa/retnet
python -m train experiment=dfa/gla
python -m train experiment=dfa/transformer+
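
The experiment=... syntax suggests the Hydra-based config setup inherited from safari; if so, all four architectures can likely be launched as a single sweep with Hydra's built-in --multirun flag (a sketch, not a confirmed entry point):

python -m train --multirun experiment=dfa/lstm,dfa/retnet,dfa/gla,dfa/transformer+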

Troubleshooting

  • Add export PATH=$PATH:/usr/local/sbin:/usr/sbin:/sbin to your shell environment so that ldconfig can be found.
  • The MHA in simple_lm.py uses num_heads, while other modules use n_heads. The names should be unified for consistency, but are kept as is for now.
  • You might need to build causal-conv1d from source, following the commands from this issue (a quick import check is shown after the commands below):
git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2  # this is the highest compatible version allowed by Mamba
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
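
Assuming the build above succeeds, the installed package should be importable as causal_conv1d; a minimal check:

python -c "import causal_conv1d; print('causal_conv1d imported successfully')"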

Acknowledgements

This repo is adapted from safari. The Triton implementations are taken from linear rnn.
