
transformer_topic_model_LDA

Source code for our ICML 2023 paper How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding.

Documentation (under construction)

lda_bert_demo.ipynb: trains a BERT model on LDA (topic modeling) data, plots its attention patterns, and saves other information such as attention score statistics, embedding dot products, and model parameter visualizations. A sketch of the LDA data-generating process appears below.

config/: config files are auto-generated when you run the notebook above with your chosen hyperparameters.
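For orientation, below is a minimal sketch of the standard LDA generative process that produces the kind of synthetic topic-model data the notebook trains on. All parameter names and values here (vocabulary size, number of topics, Dirichlet concentrations, etc.) are hypothetical placeholders; the actual hyperparameters are set inside lda_bert_demo.ipynb.

import numpy as np

def sample_lda_corpus(num_docs=1000, doc_len=128, vocab_size=500,
                      num_topics=10, alpha=0.1, beta=0.1, seed=0):
    """Sample documents from the standard LDA generative process (illustrative only)."""
    rng = np.random.default_rng(seed)
    # One categorical distribution over the vocabulary per topic.
    topic_word = rng.dirichlet(np.full(vocab_size, beta), size=num_topics)
    docs = []
    for _ in range(num_docs):
        # Per-document topic mixture drawn from a symmetric Dirichlet prior.
        theta = rng.dirichlet(np.full(num_topics, alpha))
        # For each token position: pick a topic, then a word from that topic.
        topics = rng.choice(num_topics, size=doc_len, p=theta)
        words = [int(rng.choice(vocab_size, p=topic_word[z])) for z in topics]
        docs.append(words)
    return docs, topic_word

docs, topic_word = sample_lda_corpus()
print(len(docs), len(docs[0]))  # 1000 documents of 128 tokens each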

Acknowledgements

The code heavily borrows from dyck-transformer and dyckkm-learning. Many thanks to their authors!

Citations

If you find our paper or code useful, please cite the paper and star this repo. Thank you!

Feel free to contact yuchenl4@cs.cmu.edu if you have any questions.

@misc{li2023transformers,
  doi = {10.48550/ARXIV.2303.04245},
  url = {https://arxiv.org/abs/2303.04245},
  author = {Li, Yuchen and Li, Yuanzhi and Risteski, Andrej},
  keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL), Machine Learning (stat.ML), FOS: Computer and information sciences},
  title = {How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding},
  publisher = {arXiv},
  year = {2023},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
