## Get started

We first initialize the [sbert operator](https://towhee.io/sentence-embedding/sbert), which takes a sentence or a list of sentences (as strings) as input. For each sentence it generates an embedding vector, returned as a `numpy.ndarray`, that captures the sentence's core semantic content.
Then we fine-tune the operator on the Semantic Textual Similarity (STS) task, which assigns a score to the similarity of two texts. We use the [STSbenchmark](https://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark) dataset as training data.
To train on this task, we only need to construct an operator instance and pass it a training configuration.
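
To make the idea of a similarity score concrete, here is a minimal sketch of cosine similarity, a common way to compare two embedding vectors. It is an illustration only, not necessarily the scoring used inside the operator or the training script. The function name `cosine_similarity` and the toy vectors are assumptions made for this example. It uses only the standard library and accepts any sequence of floats, including 1-D `numpy.ndarray` embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real sentence embeddings.
emb_a = [1.0, 0.0, 1.0]
emb_b = [2.0, 0.0, 2.0]  # same direction as emb_a
emb_c = [0.0, 1.0, 0.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, so embeddings of sentences with similar meaning should score higher than embeddings of unrelated sentences.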

In [None]:
import towhee
import os
from sentence_transformers import util

# Build the sbert operator from a pretrained model.
op = towhee.ops.sentence_embedding.sbert(model_name='nli-distilroberta-base-v2').get_op()

sts_dataset_path = 'datasets/stsbenchmark.tsv.gz'

# Download the STSbenchmark dataset unless a local copy already exists.
if not os.path.exists(sts_dataset_path):
    util.http_get('https://sbert.net/datasets/stsbenchmark.tsv.gz', sts_dataset_path)

# Directory where the fine-tuned model is saved.
model_save_path = './output'
training_config = {
    'sts_dataset_path': sts_dataset_path,
    'train_batch_size': 16,
    'num_epochs': 4,
    'model_save_path': model_save_path
}
op.train(training_config)

## Load trained weights

In [None]:
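# Initialize a new operator from the trained folder under `model_save_path`.
# os.listdir returns entries in arbitrary order, so sort before taking the last one
# (this assumes the folder names sort in the order the runs were created).
model_path = os.path.join(model_save_path, sorted(os.listdir(model_save_path))[-1])
new_op = towhee.ops.sentence_embedding.sbert(model_name=model_path).get_op()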
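
Because `os.listdir` returns entries in arbitrary order, picking the trained folder by position can be fragile when several runs share the same `model_save_path`. One alternative, sketched below, is to select the most recently modified sub-directory. The helper `latest_subdir` is a hypothetical name introduced for this example and is not part of towhee. It also assumes the newest folder is the run you want.

```python
import os
from typing import Optional


def latest_subdir(root: str) -> Optional[str]:
    """Return the path of the most recently modified sub-directory of `root`.

    Returns None when `root` contains no sub-directories.
    """
    subdirs = [entry.path for entry in os.scandir(root) if entry.is_dir()]
    if not subdirs:
        return None
    # Compare by modification time rather than by name or listing order.
    return max(subdirs, key=os.path.getmtime)
```

The returned path could then be passed as `model_name` in place of `model_path` above, after checking that it is not `None`.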

## Dive deep and customize your training
You can modify the [training script](https://towhee.io/sentence-embedding/sbert/src/branch/main/train_sts_task.py) to suit your needs, or refer to the original [sbert training guide](https://www.sbert.net/docs/training/overview.html) and [code examples](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training) for more information.
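
If you customize the data loading, a useful starting point is reading the STSbenchmark file yourself. The sketch below is not taken from the towhee training script. It assumes the file is a gzip-compressed, tab-separated table with a header row containing `split`, `score`, `sentence1` and `sentence2` columns, and that scores lie on a 0 to 5 scale, as in the sentence-transformers STS examples. The function name `load_sts_pairs` is hypothetical.

```python
import csv
import gzip
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str, float]


def load_sts_pairs(path: str, max_score: float = 5.0) -> Dict[str, List[Pair]]:
    """Group (sentence1, sentence2, normalized_score) tuples by dataset split."""
    splits: Dict[str, List[Pair]] = defaultdict(list)
    with gzip.open(path, 'rt', encoding='utf8') as f:
        # QUOTE_NONE keeps quotation marks inside sentences as literal text.
        reader = csv.DictReader(f, delimiter='\t', quoting=csv.QUOTE_NONE)
        for row in reader:
            score = float(row['score']) / max_score  # rescale to the range 0..1
            splits[row['split']].append((row['sentence1'], row['sentence2'], score))
    return dict(splits)
```

Calling `load_sts_pairs(sts_dataset_path)` on a file with that layout would return a dictionary keyed by split name, which you can then convert into whatever example objects your training code expects.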