- Markov-LLM contains the full transformer model for binary first-order Markov sources.
- Markov-LLM-k contains the full transformer model for binary k-th order Markov sources.
- Markov-LLM-m contains the full transformer model for Markov sources with arbitrary vocabulary size.
- Markov-LLM-depth contains code to run experiments that also save parameter weights across iterations.
- Markov-Simple contains a simplified transformer model for first-order Markov sources without layer norm.
- Markov-RPE contains the full transformer model with relative positional embeddings.
- Markov-Fixed contains a simple three-parameter architecture that mimics a transformer model with rank-one parameter initialization.
The script to run the experiments is in src/main.py.
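
For readers unfamiliar with the data these models are trained on, here is a minimal sketch of sampling from a binary first-order Markov source. It is not taken from this repository: the function name `sample_binary_markov` and the switching probabilities `p` (0 to 1) and `q` (1 to 0) are illustrative assumptions, and the actual data pipeline may differ.

```python
import random


def sample_binary_markov(length, p, q, seed=None):
    """Sample a binary sequence from a first-order Markov chain.

    Hypothetical parameters (not from the repository):
      p: probability of switching from state 0 to state 1
      q: probability of switching from state 1 to state 0
    Requires p + q > 0 and length >= 1.
    """
    rng = random.Random(seed)
    # Draw the first symbol from the stationary distribution,
    # where P(x = 1) = p / (p + q).
    pi1 = p / (p + q)
    x = 1 if rng.random() < pi1 else 0
    seq = [x]
    for _ in range(length - 1):
        # The next symbol depends only on the current one.
        switch_prob = p if x == 0 else q
        if rng.random() < switch_prob:
            x = 1 - x
        seq.append(x)
    return seq
```

For example, `sample_binary_markov(16, p=0.2, q=0.3, seed=0)` returns a list of 16 bits in which each symbol depends only on its predecessor.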