Supervised-Contrastive-Learning

Implementation of Supervised Contrastive Learning for pretrained language models (https://arxiv.org/abs/2011.01403).

  • In this project, we tried to reproduce the paper's results on the SST-2 and CoLA datasets using the Hugging Face RoBERTa-large model (a sketch of the combined training objective is given after this list).
  • We could not reproduce the results in the few-shot settings (20 and 50 training samples), but when training on the whole dataset we obtained a 1.5% gain in test accuracy. The test set is sampled from the original dev set and contains 500 examples with an equal distribution of labels, as described in the paper.
  • All other hyper-parameters are the same as given in the paper.
  • One challenge we faced: in the few-shot setting the model quickly overfits the training set, and the paper does not mention how the authors resolved overfitting when training on fewer than 50 samples.
  • We tried dropout, layer normalization, LeakyReLU activation, etc., but none of these resolved the overfitting issue.
  • Below are visualizations of the learned sentence embeddings for models trained with and without the SCL loss (a minimal plotting sketch also follows this list).
  • With the supervised contrastive loss, the model cleanly separates the two clusters (positive and negative sentiment for SST-2). (figure: sentence embeddings trained with SCL loss)
  • Learned sentence embeddings without the supervised contrastive loss. (figure: sentence embeddings trained without SCL loss)
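For reference, below is a minimal PyTorch sketch of the combined objective described in the paper: a weighted sum of cross-entropy and a supervised contrastive term computed over L2-normalized [CLS] embeddings with a temperature. The function names (`scl_loss`, `combined_loss`) and the default values of `lam` and `temperature` are illustrative placeholders, not the exact settings used in this repo or in the paper.

```python
import torch
import torch.nn.functional as F

def scl_loss(embeddings, labels, temperature=0.3):
    """Supervised contrastive loss over a batch of sentence embeddings."""
    z = F.normalize(embeddings, dim=1)          # L2-normalize so dot products are cosine similarities
    sim = z @ z.T / temperature                 # (B, B) pairwise similarity matrix
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)

    # Denominator: log-sum-exp over all other examples k != i
    denom = torch.logsumexp(sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    log_prob = sim - denom

    # Positives: other examples j != i that share the anchor's label
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_count = pos_mask.sum(dim=1)

    loss_per_anchor = -(log_prob * pos_mask).sum(dim=1) / pos_count.clamp(min=1)
    has_pos = pos_count > 0                     # anchors with no positive in the batch are skipped
    return loss_per_anchor[has_pos].mean()

def combined_loss(logits, embeddings, labels, lam=0.9, temperature=0.3):
    """(1 - lam) * cross-entropy + lam * supervised contrastive loss."""
    ce = F.cross_entropy(logits, labels)
    return (1.0 - lam) * ce + lam * scl_loss(embeddings, labels, temperature)
```

In a training loop, `embeddings` would be the pooled [CLS] representations from RoBERTa-large for the batch and `logits` the classifier output for the same batch.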
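The repository does not state how the embedding plots above were produced; one common way, sketched below, is to project the dev-set sentence embeddings to 2-D with scikit-learn's t-SNE and colour the points by label. The function name and the SST-2 label names are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_sentence_embeddings(embeddings, labels, title, out_path):
    """Project sentence embeddings (N, D) to 2-D with t-SNE and colour by class."""
    coords = TSNE(n_components=2, random_state=0).fit_transform(embeddings)
    for label_id, name in [(0, "negative"), (1, "positive")]:   # SST-2 class names assumed
        mask = labels == label_id
        plt.scatter(coords[mask, 0], coords[mask, 1], s=8, label=name)
    plt.legend()
    plt.title(title)
    plt.savefig(out_path, dpi=150)
    plt.close()
```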

References

  • Gunel et al., Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning. https://arxiv.org/abs/2011.01403
