TitleStylist_Zh

The following Repo is Hard fork of TitleStylist

The code uses the Chinese generation model ProphetNet-Dialog-Zh for initializing the base encoder-decoder architecture.

Model uses ideas from TitleStylist (Style Layer Normalization and Style-Guided Encoder Attention) with base pretrained model as ProphetNet-Dialog-Zh with Masked Sequence to Sequence Pretraining inspire from MASS

Pretrained ProphetNet-Dialog-Zh

Requirements

Python packages

Pytorch
fairseq
blingfire

In order to install them, you can run this command:

pip install -r requirements.txt

Bash commands

In order to evaluate the generated headlines by ROUGE scores, you need to install the "files2rouge" package. To do so, run the following commands (provided by this repository):

pip install -U git+https://github.com/pltrdy/pyrouge
git clone https://github.com/pltrdy/files2rouge.git     
cd files2rouge
python setup_rouge.py
python setup.py install

Usage

All data including the combination of CNN and NYT article and headline pairs, and the three style-specific corpora (humor, romance, and clickbait) mentioned in the paper have been placed in the folder "data".
Please download the pretrained model parameters of ProphetNet-Dialog-Zh from this link, unzip it, and put the unzipped files into the folder "pretrained_model/prophet_zh/".

cd pretrained_model
mkdir prophet_zh
wget https://msraprophetnet.blob.core.windows.net/prophetnet/release_checkpoints/prophetnet_dialog_zh.pt
tar -xvzf prophetnet_dialog_zh.pt
cd ..

To train a headline generation model that can simultaneously generated a facutal and a stylistic headline, you can run the following command:

./train_mix_Zh_News_X.sh --style ZH_poem

Here the arugment YOUR_TARGET_STYLE specifies any style you would like to have, in this paper, we provide three options: humor, romance, clickbait.

After running this command, the trained model parameters will be saved into the folder "tmp/exp".

If you want to evaluate the trained model and generate headlines (both factual and stylistic) using this model, please run the following command:

./evaluate_mix_Zh_News_X.sh --style style ZH_poem --model_dir MODEL_STORED_DIRCTORY

In this command, the argument MODEL_STORED_DIRCTORY specifies the directory which stores the trained model.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
data		data
data_zh		data_zh
mass		mass
pretrained_model		pretrained_model
prophetnet		prophetnet
.DS_Store		.DS_Store
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
data_process.sh		data_process.sh
data_process_zh.sh		data_process_zh.sh
encode.py		encode.py
evaluate_mix_CNN_NYT_X.sh		evaluate_mix_CNN_NYT_X.sh
evaluate_mix_CNN_NYT_multiX.sh		evaluate_mix_CNN_NYT_multiX.sh
evaluate_mix_Zh_News_X.sh		evaluate_mix_Zh_News_X.sh
file_utils.py		file_utils.py
files_merger.py		files_merger.py
requirements.txt		requirements.txt
tokenization_bert.py		tokenization_bert.py
tokenization_utils.py		tokenization_utils.py
train.py		train.py
train_mix_CNN_NYT_X.sh		train_mix_CNN_NYT_X.sh
train_mix_CNN_NYT_multiX.sh		train_mix_CNN_NYT_multiX.sh
train_mix_Zh_News_X.sh		train_mix_Zh_News_X.sh

License

tejasvaidhyadev/TitleStylist_Zh

Folders and files

Latest commit

History

Repository files navigation

TitleStylist_Zh

Requirements

Python packages

Bash commands

Usage

About

Resources

License

Stars

Watchers

Forks

Languages