Train paraphrase recognition models for detecting related search queries. Queries are considered close if they relate to the solution of the same user task.
Dataset wes collect using toloka.ai Examples could be found here:
Project settings:
Pool settings:
Using pytorch-lightning
with base bert-base-multilingual-cased
model
achieved good quality on test dataset, and it cost just 10$.