- Make sure you install PyTorch together with a compatible CUDA version.
- pip3 install -r requirements.txt
- First, use train_vocab_representation.sh on each year's papers to obtain a language model for each year.
- Then use generate_year_vocab_representation.py to generate the temporal vocabulary word representations.
- We provide a pre-generated temporal vocabulary word representation; download it into the project's main directory to use it.
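For intuition only, a temporal vocabulary word representation can be thought of as one vector per word per year, stacked in year order. The sketch below is a hypothetical illustration of that structure; the names and file format do not reflect the repository's actual representation.

```python
# Hypothetical sketch of a temporal vocabulary word representation:
# one vector per word per year, stacked into a (num_years, dim) matrix.
# All names here are illustrative, not the repo's actual format.
import numpy as np

def build_temporal_repr(yearly_embeddings):
    """Stack each word's yearly vectors into a single temporal matrix.

    yearly_embeddings: dict year -> dict word -> 1-D vector.
    Returns dict word -> array of shape (num_years, dim), year-ordered.
    Only words present in every year are kept.
    """
    years = sorted(yearly_embeddings)
    words = set.intersection(*(set(e) for e in yearly_embeddings.values()))
    return {
        w: np.stack([yearly_embeddings[y][w] for y in years])
        for w in words
    }

# Toy example: two years, 3-dimensional vectors.
toy = {
    2019: {"transformer": np.ones(3), "rnn": np.zeros(3)},
    2020: {"transformer": 2 * np.ones(3), "rnn": np.zeros(3)},
}
repr_ = build_temporal_repr(toy)
print(repr_["transformer"].shape)  # (2, 3)
```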
- Use train_future_language_model.sh to train; remember to set the correct model_type and the correct save-model path. You can also tune other hyperparameters as you like.
The trainer parameter supports several model types, since we experimented with different models:
- GPT-2: model_type: gpt2
- The word frequency model: model_type: gpt2-unigram-rnn-window
- The $contextual$ model: model_type: gpt2-vocab-repr-rnn-window-weight
- The $contextual^2$ model: model_type: gpt2-vocab-repr-rnn-window-sigmoid-attention

We also tried other ablation models but did not include them in the paper; refer to future_language_model_trainer.py for the corresponding models.
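The trainer presumably selects a model class based on the model_type string. The sketch below shows one common way such a dispatch can look; the class names are hypothetical stand-ins, not the actual classes in future_language_model_trainer.py.

```python
# Minimal sketch of dispatching on model_type. The mapped names are
# hypothetical placeholders, not the repository's actual classes.
MODEL_REGISTRY = {
    "gpt2": "GPT2Baseline",
    "gpt2-unigram-rnn-window": "WordFrequencyModel",
    "gpt2-vocab-repr-rnn-window-weight": "ContextualModel",
    "gpt2-vocab-repr-rnn-window-sigmoid-attention": "ContextualSquaredModel",
}

def resolve_model(model_type: str) -> str:
    """Look up the model for a model_type string, failing loudly on typos."""
    try:
        return MODEL_REGISTRY[model_type]
    except KeyError:
        raise ValueError(f"Unknown model_type: {model_type!r}")

print(resolve_model("gpt2-unigram-rnn-window"))  # WordFrequencyModel
```

Failing loudly on an unknown string is useful here because the model_type names are long and easy to mistype.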
- Use generate.sh to generate; remember to set the correct model_type and the correct saved-model path.
We provide several evaluation tools in evaluation_tools.
For the simple perplexity score, use train_future_language_model.sh with only -eval to obtain the loss, and then exponentiate the loss to get the perplexity.
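Since the evaluation reports an average cross-entropy loss in nats per token, perplexity is simply its exponential:

```python
import math

def perplexity(avg_cross_entropy_loss: float) -> float:
    """Perplexity from a mean cross-entropy loss in nats per token."""
    return math.exp(avg_cross_entropy_loss)

# e.g. a reported loss of 3.0 corresponds to a perplexity of about 20.09
print(round(perplexity(3.0), 4))
```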
For the content perplexity score, use compute_perplexity.sh; include the correct saved-model path. You can change the test file on line 360 of compute_perplexity.py.
For the content METEOR score, use evaluate.sh; include the correct generated-file path and test-file path. This generates a results pickle file, which you can view with analysis.py.
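The exact contents of the results pickle are defined by evaluate.sh and analysis.py. As a generic illustration only (the list-of-scores structure below is hypothetical), such a pickle can be written and inspected like this:

```python
import os
import pickle
import statistics
import tempfile

# Hypothetical results structure: a list of per-example METEOR scores.
scores = [0.21, 0.34, 0.28]

path = os.path.join(tempfile.mkdtemp(), "results.pkl")
with open(path, "wb") as f:
    pickle.dump(scores, f)

# Load the pickle back and summarize it, as an analysis script might.
with open(path, "rb") as f:
    loaded = pickle.load(f)
print(round(statistics.mean(loaded), 4))  # 0.2767
```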
If you find this paper or code helpful for your research, please cite:
```
@inproceedings{li2024future,
title={Future Language Modeling from Temporal Document History},
author={Changmao Li and Jeffrey Flanigan},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=bRLed9prWC}
}
```