An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning
This repository contains the code and pre-trained models for our paper An Information Minimization Based Contrastive Learning Model for
Unsupervised Sentence Embeddings Learning.
*************************************************Updates************************************************************
- Overview
- Train
- Requirements
- Training
- Evaluation
- Language Models
- Bugs or Questions
- Citation
Overview
We propose a contrastive learning model, InforMin-CL, that discards redundant information during the pre-training phase. InforMin-CL keeps important information and forgets redundant information through contrast and reconstruction operations. The following figure is an illustration of our model.
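As a rough sketch of what those two operations can look like in code (an illustration under our own assumptions, not the exact objective or function names from the paper or this repository), the contrast step can be an in-batch InfoNCE loss between two views of each sentence, and the reconstruction step a regression of a decoded vector back onto the encoder output:

```python
import torch
import torch.nn.functional as F

def contrast_loss(z1, z2, temperature=0.05):
    # Illustrative in-batch InfoNCE: z1 and z2 are [batch, dim] embeddings of
    # two views of the same sentences; diagonal pairs are positives, all other
    # in-batch pairs act as negatives.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)

def reconstruction_loss(decoded, encoded):
    # Illustrative reconstruction term: a decoded vector is regressed back onto
    # the (detached) encoder output, so the embedding keeps the information
    # needed for reconstruction while the contrastive term discards the rest.
    return F.mse_loss(decoded, encoded.detach())
```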
Train
In the following sections, we describe how to train an InforMin-CL model using our code.
Requirements
First, install PyTorch by following the instructions on the official website. To faithfully reproduce our results, please use the correct 1.7.1 version corresponding to your platform/CUDA version. PyTorch versions higher than 1.7.1 should also work. For example, if you use Linux and CUDA 11, install PyTorch with the following command:
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
If you instead use CUDA<11 or CPU, install PyTorch with the following command:
pip install torch==1.7.1
Then run the following script to install the remaining dependencies:
pip install -r requirements.txt
Training
python train.py \
    --model_name_or_path bert-base-uncased
Evaluation
Our evaluation code for sentence embeddings is based on a modified version of SentEval. It evaluates sentence embeddings on unsupervised tasks (semantic textual similarity, STS) and supervised tasks. For unsupervised tasks, our evaluation takes the "all" setting and reports Spearman's correlation.
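To make the metric concrete, here is a minimal sketch (illustrative only, not our evaluation code) of how Spearman's correlation is obtained for an STS task: the cosine similarities between the two embeddings of each sentence pair are compared with the human-annotated gold scores.

```python
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(emb1, emb2, gold_scores):
    # emb1, emb2: [n_pairs, dim] embeddings of the first/second sentence of
    # each pair; gold_scores: human similarity annotations for the pairs.
    emb1 = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    emb2 = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    cosine = (emb1 * emb2).sum(axis=1)
    # Spearman's rank correlation between model similarities and gold scores.
    return spearmanr(cosine, gold_scores).correlation
```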
Before evaluation, please download the evaluation datasets by running
cd SentEval/data/downstream/
bash download_dataset.sh
Then come back to the root directory; you can evaluate any transformers-based pre-trained model using our evaluation code. For example,
python evaluation.py \
    --model_name_or_path informin-cl-bert-base-uncased \
    --pooler cls \
    --text_set sts \
    --mode test
Language Models
The language models whose performance is reported in the paper are available in the Hugging Face Model Repository.
You can load the models in Python with the transformers library; just replace the model name, as in the sketch below.
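A minimal loading sketch (the model identifier below simply reuses the name from the evaluation example and may need the appropriate Hugging Face namespace prefix; [CLS] pooling mirrors the --pooler cls option above):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Placeholder identifier; replace with the name of the released checkpoint.
model_name = "informin-cl-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentences = ["A man is playing a guitar.", "A person plays an instrument."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# [CLS] pooling: take the first token's hidden state as the sentence embedding.
embeddings = outputs.last_hidden_state[:, 0]
similarity = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(similarity.item())
```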
With these models one should be able to reproduce the results on the benchmarks reported in the paper.
Bugs or Questions
If you have any questions related to the code or the paper, feel free to contact Shaobin (chenshaobin000001@gmail.com). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please try to describe the problem in detail so we can give you a hand!
Citation
Please cite our paper if you use InforMin-CL in your work:
@inproceedings{chen2022informin-cl,
  title={An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning},
  author={Chen, Shaobin and Zhou, Jie and Sun, Yuling and He, Liang},
  booktitle={International Conference on Computational Linguistics (COLING)},
  year={2022}
}