grig001/Sentence-Encoder-Unum

BERT pre-training for retrieval

This project is a language-model (LM) pre-training pipeline aimed at retrieval.
It includes:


  • Evaluation tasks for retrieval (MRPC, STS-B), both from the GLUE benchmark
  • Dataset preparation scripts
  • Pre-training with the masked LM task on Wikipedia data
  • Weights & Biases (wandb) logging
  • Multi-GPU training code
  • Checkpointing
  • Fine-tuning code using contrastive learning
  • Reported results and checkpoints
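
To make the two training objectives in the list above concrete, here is a minimal pure-Python sketch of both. It is an illustration under stated assumptions, not this repository's code: the names `mask_tokens` and `info_nce_loss`, the toy `VOCAB`, and the defaults `mask_prob=0.15` and `temperature=0.05` are all hypothetical. The masking follows the commonly used BERT recipe (of the selected positions, 80% become `[MASK]`, 10% a random token, 10% stay unchanged), and the contrastive loss is the usual in-batch-negatives (InfoNCE-style) formulation; the actual scripts may differ.

```python
import math
import random

MASK = "[MASK]"
VOCAB = ["cat", "dog", "tree", "car", "sky"]  # toy vocabulary (assumption)


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """BERT-style masking sketch.

    Each position is selected with probability mask_prob. A selected
    position becomes [MASK] 80% of the time, a random vocabulary token
    10% of the time, and is left unchanged 10% of the time.
    Returns (inputs, labels); labels[i] is None where no prediction
    is required.
    """
    rng = rng or random.Random()
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK
        elif roll < 0.9:
            inputs[i] = rng.choice(VOCAB)
        # otherwise keep the original token in the input
    return inputs, labels


def info_nce_loss(queries, positives, temperature=0.05):
    """In-batch-negatives contrastive loss sketch.

    positives[i] is the match for queries[i]; every other row of
    positives serves as a negative. Embeddings are L2-normalized, so
    the similarity is cosine similarity scaled by 1/temperature.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, pj)) / temperature for pj in p]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]  # cross-entropy with target index i
    return total / len(q)
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairs are mismatched, which is the signal contrastive fine-tuning relies on to pull matching sentences together.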
