This repo contains the code and slides for the talk "Representation is King: The Journey to Quality Dialog Embeddings", presented at EuroPython 2024.
In natural language processing, embeddings are crucial for understanding textual data. In this talk, we’ll explore sentence embeddings and their application in dialog systems. We’ll focus on a use case involving the classification of dialogs.
We’ll show why sentence transformers are needed for this problem, using one of the top-performing small sentence transformers. We’ll then show how to fine-tune this model on both labeled and unlabeled dialog data with the SentenceTransformers Python framework.
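
To give a rough feel for the dialog classification use case, here is a minimal sketch and not the code from the talk. It assumes the `sentence-transformers` package is installed. The model name `all-MiniLM-L6-v2`, the choice to join a dialog's turns into one string, and the nearest-centroid classifier are all illustrative assumptions, as are the helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(embedding, centroids):
    """Return the label whose centroid is most similar to the embedding."""
    return max(
        centroids,
        key=lambda label: cosine_similarity(embedding, centroids[label]),
    )


def embed_dialogs(dialogs, model_name="all-MiniLM-L6-v2"):
    """Embed dialogs, each given as a list of turn strings.

    Assumptions: the model name is only an example of a small sentence
    transformer, and joining turns with spaces is one simple way to turn
    a dialog into a single input text.
    """
    # Imported lazily so the pure-Python helpers above work without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    texts = [" ".join(turns) for turns in dialogs]
    # encode() returns a NumPy array by default; convert to plain lists.
    return model.encode(texts).tolist()
```

In use, one would embed a handful of labeled dialogs per class with `embed_dialogs`, build one centroid per label with `centroid`, and then call `classify` on the embedding of a new dialog. A trained classifier on top of the embeddings would be a natural alternative to the nearest-centroid rule.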
This talk is practical, packed with easy-to-follow examples, and aimed at building intuition around this topic. Basic knowledge of Transformers is helpful but not required, and newcomers are welcome.