Code for experiments on transformers using Markovian data.

Bond1995/Markov

Code organization

A few pointers

  • Markov-LLM contains the full transformer model for binary first-order Markov sources.
  • Markov-LLM-k contains the full transformer model for binary k-th order Markov sources.
  • Markov-LLM-m contains the full transformer model for Markov sources with arbitrary vocabulary size.
  • Markov-LLM-depth contains code to run experiments that also save parameter weights across iterations.
  • Markov-Simple contains a simplified transformer model for first-order Markov sources without layer norm.
  • Markov-RPE contains the full transformer model with relative positional embeddings.
  • Markov-Fixed contains a three-parameter simple architecture that mimics a transformer model with rank-one parameter initialization.

The script to run the experiments is in src/main.py.
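
For orientation, here is a minimal sketch of the kind of data a binary first-order Markov source produces. It is not taken from this repository: the function name `sample_binary_markov`, the parameters `p` and `q` (the two flip probabilities), and the choice to start from the stationary distribution are all assumptions made for illustration.

```python
import random


def sample_binary_markov(p, q, length, seed=None):
    """Sample a sequence from a binary first-order Markov chain.

    Hypothetical helper, not part of the repository.
    p = P(next = 1 | current = 0), q = P(next = 0 | current = 1).
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("p and q must be probabilities in [0, 1]")
    if p + q == 0:
        raise ValueError("p + q must be positive so the stationary law is defined")
    if length <= 0:
        return []

    rng = random.Random(seed)

    # Assumption: draw the first symbol from the stationary distribution,
    # which for this chain puts mass p / (p + q) on symbol 1.
    state = 1 if rng.random() < p / (p + q) else 0
    sequence = [state]

    for _ in range(length - 1):
        # The probability of leaving the current state depends only on
        # that state, which is what makes the chain first-order.
        flip_probability = p if state == 0 else q
        if rng.random() < flip_probability:
            state = 1 - state
        sequence.append(state)

    return sequence


if __name__ == "__main__":
    print(sample_binary_markov(p=0.2, q=0.3, length=20, seed=0))
```

Small values of `p` and `q` give long runs of the same symbol, while values close to 1 give sequences that alternate almost every step. A k-th order source would condition each flip on the previous k symbols instead of only the last one.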
