SPRUCE

This code trains and applies a model that improves the representations of rare and unknown words in deep contextualized models such as BERT. For more information, please see our NAACL Findings paper: https://aclanthology.org/2024.findings-naacl.88/

To see how this code is run, see train_bertram_on_pca_embs.sh.

The SPRUCE model is defined in bertram_variants.py.

This code is based on the code from https://github.com/timoschick/bertram

Here is a summary of how to use it:

Build a corpus using preprocess from https://github.com/timoschick/form-context-model.
Train a context model and a subword model using the commands in train_bertram_on_pca_embs.sh (define the directory variables first).
Fuse the models (see train_bertram_on_pca_embs.sh).
Run the full model (see train_bertram_on_pca_embs.sh).
To estimate embeddings for rare words (for whichever task), run preprocess from https://github.com/timoschick/form-context-model on the task corpus and the list of rare words, then call infer_vectors_fixed.
Use BERTRAMWrapper (for details, see bertram.py and https://github.com/timoschick/bertram) to use the estimated rare-word embeddings in the final task.
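
The last two steps need a list of rare words and a way to apply the estimated vectors. The following is a minimal, self-contained sketch of that idea on toy data, not the code in this repository. The function names (find_rare_words, parse_vectors, override_embeddings), the frequency threshold, and the "word v1 v2 ..." text format are all assumptions made for illustration. In the actual pipeline, preprocess, infer_vectors_fixed and BERTRAMWrapper handle these steps.

```python
from collections import Counter


def find_rare_words(sentences, max_count=1):
    """Return words that occur at most `max_count` times in a tokenized corpus.

    Hypothetical helper: the threshold is an illustrative choice.
    """
    counts = Counter(token for sentence in sentences for token in sentence)
    return sorted(word for word, count in counts.items() if count <= max_count)


def parse_vectors(lines):
    """Parse lines of the ASSUMED form 'word v1 v2 ... vn' into a dict."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def override_embeddings(base, estimated):
    """Return a copy of `base` in which estimated vectors replace or extend entries."""
    merged = dict(base)
    merged.update(estimated)
    return merged


# Toy task corpus, already tokenized.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "kumquat", "fell"],
]
rare_words = find_rare_words(corpus, max_count=1)

# Toy stand-ins for estimated vectors and for the model's original embeddings.
estimated = parse_vectors(["kumquat 0.5 -0.25", "cat 1.0 0.0"])
base = {"the": [0.1, 0.2], "cat": [0.0, 0.0]}
embeddings = override_embeddings(base, estimated)
```

The sketch returns a new lookup table rather than modifying the original one, so the unmodified embeddings stay available for comparison.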

For the evaluation tasks, please refer to the sources cited in the paper.

