Check out this tutorial with the Notebook Companion:
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

---

Training or fine-tuning a Sentence Transformers model depends heavily on the available data and the target task. The key is twofold:
1. Understand how to input data into the model and prepare your dataset accordingly.
2. Know the different loss functions and how they relate to the dataset.

In a Sentence Transformer model, you map a variable-length text (or image pixels) to a fixed-size embedding representing that input's meaning.
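
To make that mapping concrete, here is a minimal sketch; the `all-MiniLM-L6-v2` checkpoint is an assumed example, not one the tutorial prescribes. Inputs of different lengths all encode to vectors of the same size:

```python
from sentence_transformers import SentenceTransformer

# Any pretrained Sentence Transformer maps text of arbitrary length
# to a single fixed-size vector.
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed example checkpoint

short_vec = model.encode("Hello!")
long_vec = model.encode(
    "A much longer input sentence still produces a vector of the same size."
)
assert short_vec.shape == long_vec.shape  # (384,) for this model
```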
This is how the Sentence Transformers models work:

1. **Layer 1** – The input text is passed through a pre-trained Transformer model that can be obtained directly from the [Hugging Face Hub](https://huggingface.co/models?pipeline_tag=fill-mask&sort=downloads). This tutorial will use the "[distilroberta-base](https://huggingface.co/distilroberta-base)" model. The Transformer outputs are contextualized word embeddings for all input tokens; imagine an embedding for each token of the text.
2. **Layer 2** – The embeddings go through a pooling layer to get a single fixed-length embedding for all the text. For example, mean pooling averages the embeddings generated by the model (see the code sketch after this list).
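
To make the two layers concrete, here is a minimal sketch that stacks them with the `sentence-transformers` `models` API. The "distilroberta-base" checkpoint and mean pooling follow the example above; the printed shape is just for illustration:

```python
from sentence_transformers import SentenceTransformer, models

# Layer 1: a pre-trained Transformer from the Hugging Face Hub.
word_embedding_model = models.Transformer("distilroberta-base")

# Layer 2: mean pooling averages the token embeddings into one
# fixed-length vector for the whole text.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embedding = model.encode("How do Sentence Transformers work?")
print(embedding.shape)  # (768,) for distilroberta-base
```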

This figure summarizes the process:
