This repository holds the code for the NAACL 2021 paper "D2S: Document-to-Slide Generation Via Query-Based Text Summarization".
It contains:
- sciduet-build: code to reconstruct the training dataset from NLP/ML papers in PDF format and their corresponding slides
- SciDuet-ACL: preprocessed ACL training data
- Derivability annotations together with the trained classifier
- d2s-model: code to train and evaluate the automatic slide generation system
Edward Sun, Yufang Hou, Dakuo Wang, Yunfeng Zhang, Nancy X.R. Wang. D2S: Document-to-Slide Generation Via Query-Based Text Summarization. In Proceedings of the 18th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), Online, 6-11 June 2021.