What's the difference between USE and SBERT? #64

Closed
Cumberbatch08 opened this issue Nov 29, 2019 · 4 comments

Comments

@Cumberbatch08

First, many thanks for your paper and code.
I read the Universal Sentence Encoder (USE) paper: its architecture also looks like a Siamese network, and it also uses the SNLI dataset.
Yet your results perform much better, so I'm very interested in your work.

@nreimers
Member

Hi @Cumberbatch08
Sadly, the USE papers (at least the ones I know) are extremely high-level and don't really go into the details. So it is unclear which architecture exactly they used and how the training was done (exact datasets, exact loss function, etc.).

Differences:

  • USE and SBERT both use transformer networks. For USE, it is sadly not clear how many layers they use (most technical details are not provided). USE was trained from scratch (as far as I can tell from the paper), while SBERT uses the BERT / RoBERTa pre-trained weights and just fine-tunes them to produce sentence embeddings.

  • I think the main difference is in the pre-training. USE uses a wide variety of datasets (exact details not provided), specifically targeted at generating sentence embeddings. BERT was pre-trained on a book corpus and on Wikipedia to produce a language model (see the BERT paper). SBERT then fine-tunes BERT to produce sensible sentence embeddings.

  • USE is in TensorFlow, and tuning it for your use case is not straightforward (the source code is not available; you only get the compiled model from tensorflow-hub). SBERT is based on PyTorch, and the goal of this repository is that fine-tuning for your use case is as simple as possible (see the sketch after this list).
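For illustration, here is a minimal fine-tuning sketch with the sentence-transformers library. The model name, sentence pairs, and similarity labels are placeholders, not taken from this thread:

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Start from a pre-trained SBERT model (the model name is just an example)
model = SentenceTransformer('bert-base-nli-mean-tokens')

# Toy sentence pairs with similarity labels (placeholder data)
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man is eating a meal.'], label=0.9),
    InputExample(texts=['A man is eating food.', 'The girl is playing the guitar.'], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# CosineSimilarityLoss is one of several available losses; pick the one that fits your data
train_loss = losses.CosineSimilarityLoss(model)

# Fine-tune for one epoch on the toy data
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```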

@Cumberbatch08
Author

Haha, yes, I absolutely agree with what you said. USE doesn't publish many details, such as the number of layers, the datasets, the loss, etc.
I found some information about the architecture:
[image: USE architecture]
Just as you said, maybe the pre-training is what matters.

@Gurutva

Gurutva commented Jan 27, 2022

Which would work best for good semantic search results: USE (https://tfhub.dev/google/universal-sentence-encoder/4) or the SBERT models (https://huggingface.co/sentence-transformers)?

@nreimers
Member

@Gurutva SBERT works much better: https://arxiv.org/pdf/2104.08663v1.pdf
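For reference, a minimal semantic-search sketch with sentence-transformers; the model name, corpus, and query below are placeholders:

```python
from sentence_transformers import SentenceTransformer, util

# The model name is just an example of a pre-trained SBERT model
model = SentenceTransformer('all-MiniLM-L6-v2')

corpus = [
    'A man is eating food.',
    'A cheetah is running behind its prey.',
    'The girl is carrying a baby.',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query = 'What is the cheetah chasing?'
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank corpus sentences by cosine similarity to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
for hit in hits:
    print(corpus[hit['corpus_id']], hit['score'])
```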
