computing alignment and uniformity #120
Comments
Uniformity drops very quickly at the start of training. Can you specify your initialization and the stride at which you calculate uniformity?
I didn't change anything except adding a few lines to calculate alignment and uniformity (as mentioned before). More specifically, from run_unsup_example.sh
For initialisation, I didn't change the random seed, so I guess it's the Hugging Face default of 42 (I'm not sure, I may be wrong).
If I understand correctly, you calculate alignment/uniformity every 125 steps (the same interval as validation). In the original paper, we calculate them every 10 steps because, as I mentioned, uniformity drops very fast at the beginning of training.
Ah, so you mean every 10 update steps / batches? I thought it was every 10 * 125 batches. But even if that's the case, I'm not sure Figure 2 provides a good explanation here: after 125 steps (or 12 little red stars in Figure 2), the score on the STS-B dev set is only around 60%, which is much lower than the 82.5% in the paper. So I think Figure 2 can explain what happens in the very first training phase, but the remaining gap of 82.5 - 60 = 22.5 points is not explained.
You can use Figure 3 as a reference (although it's not a rigorous comparison, because the figure doesn't include the BERT [CLS] representation, which is the initialization for SimCSE). It's the uniformity that makes a huge difference.
That makes sense. Thanks.
I'm following Wang and Isola to compute alignment and uniformity (using the code given in their Fig 5, http://proceedings.mlr.press/v119/wang20k/wang20k.pdf) to reproduce Fig 2 in your paper, but I failed. What I saw is that alignment decreases whereas uniformity is almost unchanged, which is completely different from Fig 2. Details are below.
To compute alignment and uniformity, I changed lines 66-79 of SimCSE/blob/main/SentEval/senteval/sts.py by adding the code from Wang and Isola:
The output (which also shows Spearman correlation on the STS-B dev set) is
We can see that alignment drops from 0.26 to less than 0.20, whereas uniformity stays around -2.55. This suggests that reducing alignment is key, not uniformity. This trend is completely different from Fig 2.
Did you also use the code from Wang and Isola like I did? If possible, could you please provide the code for reproducing alignment and uniformity?
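For readers following along, here is a minimal sketch of the two metrics as defined by Wang and Isola: alignment is the mean of ||x_i - y_i||^alpha over positive pairs, and uniformity is the log of the mean of exp(-t * ||x_i - x_j||^2) over all distinct pairs. Their published snippet is written in PyTorch; the version below is a dependency-free restatement in plain Python, and it is not the code that was added to sts.py in this issue. The helper `l2_normalize` and the defaults `alpha=2`, `t=2` are assumptions of this sketch (the metrics are defined on unit-norm embeddings, so normalization is applied first).

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit L2 norm (hypothetical helper for this sketch)."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def align_loss(x, y, alpha=2):
    """Mean of ||x_i - y_i||^alpha over positive pairs (x_i, y_i).

    x and y are equal-length sequences of unit-norm embeddings;
    x[i] and y[i] form a positive pair.
    """
    return sum(math.dist(a, b) ** alpha for a, b in zip(x, y)) / len(x)


def uniform_loss(x, t=2):
    """Log of the mean of exp(-t * ||x_i - x_j||^2) over all distinct pairs."""
    pairs = list(combinations(x, 2))
    total = sum(math.exp(-t * math.dist(a, b) ** 2) for a, b in pairs)
    return math.log(total / len(pairs))


# Toy embeddings purely for illustration (not data from this issue).
xs = [l2_normalize(v) for v in [(1.0, 0.0), (0.0, 2.0)]]
ys = [l2_normalize(v) for v in [(3.0, 0.0), (0.0, 1.0)]]

alignment = align_loss(xs, ys)   # positive pairs coincide after normalization
uniformity = uniform_loss(xs)    # two orthogonal unit vectors
```

Lower is better for both quantities. Because uniformity is computed over all pairs in whatever set of embeddings is passed in, the value depends on which sentences are included and on how often it is logged during training, which is worth checking when comparing against a published curve.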