### Embedding Steering

Embedding steering is inspired by attention steering in LLMs and by hypothetical document generation in RAG pipelines [paper ref here].  
In this first iteration, I propose leaving the inner layers of the embedding model untouched and steering only the final embedding of a given text.

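Suppose we have an embedding model that captures the semantics of text well but is not fine-tuned for QA or query–passage retrieval. With such a model, a question and the passage that answers it may not end up close to each other in the embedding space.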

As one way to address this without fine-tuning, I propose:  

1. Use an LLM to generate a set of possible queries for a passage;  
2. Embed the original passage and the set of possible queries;  
3. Fuse the embeddings of the queries and the passage into a single vector (for example, by weighted averaging or a custom function);  
4. Store the original passage and the fused embedding for retrieval.  
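
As a minimal sketch of step 3, the snippet below implements the weighted-averaging option. The function name `fuse_embeddings`, the weight `alpha`, and the choice to L2-normalize before and after fusing are illustrative assumptions, not a fixed part of the proposal. Plain Python lists stand in for the vectors an embedding model would return.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_embeddings(passage_emb, query_embs, alpha=0.5):
    """Fuse a passage embedding with the embeddings of its generated queries.

    alpha (hypothetical parameter) is the weight on the passage embedding;
    the mean of the query embeddings receives weight 1 - alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    passage = l2_normalize(passage_emb)
    if not query_embs:
        # No generated queries: fall back to the plain passage embedding.
        return passage

    # Normalize each query so that no single query dominates the mean.
    queries = [l2_normalize(q) for q in query_embs]
    dim = len(passage)
    mean_query = [sum(q[d] for q in queries) / len(queries) for d in range(dim)]

    fused = [alpha * passage[d] + (1.0 - alpha) * mean_query[d] for d in range(dim)]
    # Re-normalize so the stored vector is usable with cosine similarity.
    return l2_normalize(fused)


# Toy 3-dimensional vectors standing in for real model outputs.
passage_vec = [0.9, 0.1, 0.0]
query_vecs = [[0.2, 0.8, 0.0], [0.1, 0.7, 0.2]]
print(fuse_embeddings(passage_vec, query_vecs, alpha=0.7))
```

With `alpha=1.0` the fused vector reduces to the normalized passage embedding, which gives a natural baseline, and sweeping `alpha` shows how much the generated queries help.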

To evaluate the proposed method, I suggest using the Natural Questions dataset or MS MARCO.

As a metric, I propose using [nDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain), the same metric used in the MTEB benchmark for [NQ](https://research.google/pubs/natural-questions-a-benchmark-for-question-answering-research/) and [MS MARCO](https://github.com/microsoft/MSMARCO-Passage-Ranking). It measures how highly the relevant passages are ranked and is defined as:  

$$
\text{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}
$$

$$
\text{nDCG}_p = \frac{\text{DCG}_p}{\text{IDCG}_p},
$$

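where $rel_i$ is the relevance label of the passage at rank $i$, $p$ is the cutoff rank (10 for nDCG@10), and $\text{IDCG}_p$ is the ideal discounted cumulative gain, i.e. the $\text{DCG}_p$ of the best possible ranking for the query.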
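
The formulas above can be sketched directly in Python. The function names `dcg_at_k` and `ndcg_at_k` and the optional `all_relevances` argument are illustrative assumptions. An actual evaluation would more likely rely on an existing evaluation library, whose gain function may differ from $2^{rel_i} - 1$; for binary relevance labels the two common variants coincide.

```python
import math


def dcg_at_k(relevances, k):
    """DCG over the first k relevance labels, given in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, k, all_relevances=None):
    """nDCG@k for a single query.

    ranked_relevances: labels of the retrieved passages, in retrieved order.
    all_relevances (hypothetical argument): labels of every judged passage for
    the query, used to build the ideal ranking. If omitted, the ideal ranking
    is built from the retrieved passages only, which overestimates nDCG when
    relevant passages were not retrieved at all.
    """
    pool = ranked_relevances if all_relevances is None else all_relevances
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal == 0:
        return 0.0  # the query has no relevant passage in the pool
    return dcg_at_k(ranked_relevances, k) / ideal


# One relevant passage, retrieved at rank 2 instead of rank 1.
print(ndcg_at_k([0, 1, 0], k=10))
```

In this example the single relevant passage sits at rank 2, so the score is $1 / \log_2 3 \approx 0.63$ rather than 1. The benchmark score is the mean of this per-query value over all queries.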

I would prefer to use MS MARCO as the evaluation dataset, since `sentence-transformers` provides open-source [models](https://sbert.net/docs/sentence_transformer/pretrained_models.html) fine-tuned on this dataset. This makes it possible to compare a fine-tuned model against the embedding steering method applied to the base model.
