The Code for "A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation"
- ubuntu 16.04
- python 3.5
- tensorflow 1.4.1
- nltk 3.2.5
Compression dataset: The original dataset is too large, so we use only a subset of it. The processed data can be found at data/trainfeature02.json, data/testfeature02.json, and data/validfeature02.json.
Storytelling dataset: The dataset is located at data/story/train_process.txt, data/story/valid_process.txt, and data/story/test_process.txt.
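Before training, it can help to confirm that the data files named above are actually present. The following is a minimal sketch, not part of this repository: the helper name `missing_files` and the `DATA_FILES` list are introduced here for illustration, and the paths are assumed to be relative to the repository root.

```python
import os

# Paths exactly as listed in this README, assumed relative to the repository root.
DATA_FILES = [
    "data/trainfeature02.json",
    "data/testfeature02.json",
    "data/validfeature02.json",
    "data/story/train_process.txt",
    "data/story/valid_process.txt",
    "data/story/test_process.txt",
]


def missing_files(root, paths=DATA_FILES):
    """Return the listed paths that do not exist as files under `root`.

    Hypothetical helper for illustration; it is not provided by this repository.
    """
    return [p for p in paths if not os.path.isfile(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing data files:")
        for path in missing:
            print("  {}".format(path))
    else:
        print("All listed data files are present.")
```

Run it from the repository root. The script only checks for existence and does not validate file contents.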
- First, we pre-train a sentence compression module.
- Second, we use the pre-trained compression module to extract skeletons for the storytelling dataset. The feature files used for skeleton extraction are data/story/train_sc.txt, data/story/valid_sc.txt, and data/story/test_sc.txt. The extracted skeleton files are data/0/train_skeleton.txt, data/0/valid_skeleton.txt, and data/0/test_skeleton.txt.
- Third, we use the extracted skeletons to train the input-to-skeleton module and the skeleton-to-sentence module.
- Finally, we connect all modules by reinforcement learning.
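After the second step, a quick sanity check is to compare the feature files with the extracted skeleton files. The sketch below assumes, which this README does not state, that each file holds one example per line and that every feature line yields exactly one skeleton line. If the actual file format differs, the comparison is not meaningful. The names `SPLITS`, `count_lines`, and `mismatched_splits` are hypothetical and not part of this repository.

```python
import os

# (feature file, skeleton file) pairs as listed in this README.
SPLITS = {
    "train": ("data/story/train_sc.txt", "data/0/train_skeleton.txt"),
    "valid": ("data/story/valid_sc.txt", "data/0/valid_skeleton.txt"),
    "test": ("data/story/test_sc.txt", "data/0/test_skeleton.txt"),
}


def count_lines(path):
    """Count the lines in a text file without loading it into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def mismatched_splits(root="."):
    """Return {split: (n_feature_lines, n_skeleton_lines)} where the counts differ.

    ASSUMPTION: one example per line in both files, aligned one to one.
    """
    mismatches = {}
    for split, (feature_path, skeleton_path) in sorted(SPLITS.items()):
        n_feat = count_lines(os.path.join(root, feature_path))
        n_skel = count_lines(os.path.join(root, skeleton_path))
        if n_feat != n_skel:
            mismatches[split] = (n_feat, n_skel)
    return mismatches


if __name__ == "__main__":
    bad = mismatched_splits(".")
    if bad:
        for split, (n_feat, n_skel) in sorted(bad.items()):
            print("{}: {} feature lines vs {} skeleton lines".format(split, n_feat, n_skel))
    else:
        print("Feature and skeleton files have matching line counts.")
```

An empty result only indicates that the line counts agree under the stated assumption. It says nothing about the quality of the extracted skeletons.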
CUDA_VISIBLE_DEVICES=2 nohup bash > log_train.txt &
CUDA_VISIBLE_DEVICES=2 nohup bash > log_test.txt &
If you use this code, please cite the following paper:
Jingjing Xu, Yi Zhang, Qi Zeng, Xuancheng Ren, Xiaoyan Cai, Xu Sun.
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation. EMNLP 2018.
author = {Jingjing Xu and Yi Zhang and Qi Zeng and Xuancheng Ren and Xiaoyan Cai and Xu Sun},
title = {A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation},
booktitle = {EMNLP},
year = {2018}