May I ask whether the BGE-M3-embedding and BGE-reranker-MiniCPM models use InfoNCE loss during fine-tuning?
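Judging from the code, the loss is implemented as cross-entropy. Is it correct to read this as InfoNCE written in cross-entropy form, that is, cross-entropy over the scores of one positive and its negatives with the positive as the target class, so that the two are essentially equivalent in principle?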
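
To make the question concrete, here is a minimal sketch of the equivalence I have in mind. It is not taken from the repository: the function names `info_nce` and `cross_entropy`, the example scores, and the temperature value are all my own illustrative assumptions. It only checks that, for a single query, InfoNCE over one positive and several negatives gives the same number as softmax cross-entropy over the temperature-scaled scores with the positive as the target class.

```python
import math

def info_nce(pos_score, neg_scores, temperature=1.0):
    """InfoNCE for one query: -log(exp(s_pos/t) / sum_j exp(s_j/t)).

    Hypothetical helper for illustration; the sum runs over the
    positive and all negatives.
    """
    scaled = [s / temperature for s in [pos_score] + list(neg_scores)]
    m = max(scaled)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -(scaled[0] - log_denominator)

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example with an integer class label."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

# Illustrative similarity scores (assumed values, not real model outputs).
pos, negs, temp = 0.82, [0.31, 0.10, -0.25], 0.02

# Put the positive at index 0 and use the scaled scores as logits.
logits = [s / temp for s in [pos] + negs]

loss_nce = info_nce(pos, negs, temp)
loss_ce = cross_entropy(logits, target=0)

print(abs(loss_nce - loss_ce) < 1e-9)
```

If this reading is right, the cross-entropy call in the training code would just be a convenient way to compute InfoNCE, with the candidate passages playing the role of classes. Whether the same holds for the reranker's loss is part of what I am asking.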