# Reranking

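In retrieval-augmented generation (RAG), reranking is a key step that refines the initial retrieval results according to how relevant each retrieved document is to the input query. It rescores the retrieved documents with a more expressive model, such as a cross-encoder, which reads the query and the document together and can therefore capture their semantic relationship better than the first-stage retriever. The reranked list is then passed to the generation model, so that the final output draws on the most relevant retrieved passages.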

![Cross-encoder diagram](https://raw.githubusercontent.com/UKPLab/sentence-transformers/master/docs/img/CrossEncoder.png)

For more information, see the [Sentence Transformers retrieve and re-rank guide](https://www.sbert.net/examples/applications/retrieve_rerank/README.html).
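
As a rough illustration of the rescoring step, the sketch below reranks a candidate list with a pluggable scoring function. The `rerank` helper and `overlap_score` are hypothetical names introduced here, and the token-overlap scorer is only a toy stand-in for a cross-encoder so that the example runs without downloading a model. The candidate sentences are made-up sample data.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that also occur in the text.

    This is NOT a cross-encoder; it is a stand-in so the sketch runs offline.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and return the top_k by score."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up candidates standing in for first-stage retrieval output.
candidates = [
    "Llamas are members of the camelid family.",
    "Mixtral supports a context size of 32k tokens.",
    "The context of a query matters.",
]
reranked = rerank("context size of Mixtral", candidates, overlap_score, top_k=2)
```

In the notebook itself, the scoring function is replaced by the cross-encoder loaded in the next section.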

The steps are:
* [Load the reranking model](#loading-the-reranking-model)
* [Load the retrieval results](#loading-retrieval-results)
* [Calculate the reranking scores](#calculating-the-re-ranking-scores)
* [Generate a reply from the reranked documents](#using-merged-results-to-generate-a-reply)

## Visualization improvements

In [None]:
from rich.console import Console
from rich_theme_manager import Theme, ThemeManager
import pathlib

theme_dir = pathlib.Path("themes")
theme_manager = ThemeManager(theme_dir=theme_dir)
dark = theme_manager.get("dark")

# Create a console with the dark theme
console = Console(theme=dark)

In [None]:
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

## Loading the reranking model

In [None]:
from sentence_transformers import CrossEncoder 
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
console.print(cross_encoder.model)

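## Loading retrieval results

We load the retrieval results saved by the earlier hybrid search notebook instead of running retrieval again. The dense and sparse index scores can be ignored, because the reranking scores are computed from the text of each document or chunk.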

In [None]:
import json
hybrid_search_results = {}
with open('data/dense_results.json') as f:
    dense_results = json.load(f)
    for doc in dense_results:
        hybrid_search_results[doc['id']] = doc
with open('data/sparse_results.json') as f:
    sparse_results = json.load(f)
    for doc in sparse_results:
        hybrid_search_results[doc['id']] = doc
console.print(hybrid_search_results)
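
The merge above keys documents by `id`, so a document returned by both retrievers appears only once, with the sparse copy overwriting the dense one. A minimal sketch of the same idea on in-memory data follows. The sample records, the `score` field, and the `merge_by_id` helper are assumptions made for illustration; the notebook only shows that each record has `id` and `text`.

```python
from typing import Dict, Iterable


def merge_by_id(*result_lists: Iterable[dict]) -> Dict[str, dict]:
    """Merge retrieval results, keeping one record per document id.

    Later lists overwrite earlier ones on a duplicate id, and only the
    fields needed for reranking (id and text) are kept.
    """
    merged: Dict[str, dict] = {}
    for results in result_lists:
        for doc in results:
            merged[doc["id"]] = {"id": doc["id"], "text": doc["text"]}
    return merged


# Hypothetical sample data; the "score" field is assumed, not taken from the notebook.
dense_sample = [
    {"id": "a", "text": "alpha", "score": 0.91},
    {"id": "b", "text": "beta", "score": 0.80},
]
sparse_sample = [
    {"id": "b", "text": "beta", "score": 12.3},
    {"id": "c", "text": "gamma", "score": 9.7},
]

merged = merge_by_id(dense_sample, sparse_sample)
```

Dropping the retriever scores at this point is optional; it simply makes explicit that the reranker works from the text alone.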

In [None]:
# This is the query that we used for the retrieval of the above documents
query = "What is context size of Mixtral?"

## Calculating the re-ranking scores

We use `cross_encoder` to compute a relevance score for each (query, document) pair.

In [None]:
pairs = [[query, doc['text']] for doc in hybrid_search_results.values()] 
scores = cross_encoder.predict(pairs) 

console.print(scores)
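
Depending on the model and library version, `predict` may return unbounded scores rather than probabilities. If that is the case, one option is to map them through a sigmoid for display. The sketch below works under that assumption, and the sample values are made up. The mapping is monotonic, so the ranking is unchanged.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_probabilities(raw_scores: Sequence[float]) -> List[float]:
    """Map each raw score to the open interval (0, 1)."""
    return [sigmoid(float(s)) for s in raw_scores]


# Made-up raw scores, assumed to be unbounded logits.
raw = [7.2, -3.5, 0.0]
probs = to_probabilities(raw)
```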

## Selecting the top 3 reranked documents

In [None]:
# Combine scores with corresponding document IDs
results_with_scores = [
    (doc_id, hybrid_search_results[doc_id]['text'], score)
    for doc_id, score in zip(hybrid_search_results.keys(), scores)
]

# Sort results by score in descending order and take the top 3
top_results = sorted(results_with_scores, key=lambda x: x[2], reverse=True)[:3]
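
Sorting the whole list is fine for a handful of candidates. For larger candidate sets, `heapq.nlargest` from the standard library selects the top entries without a full sort. The tuples below are hypothetical placeholders in the same `(doc_id, text, score)` layout, and the names `sample_results` and `top_sample` are introduced only for this sketch.

```python
import heapq

# Hypothetical (doc_id, text, score) tuples, not real retrieval output.
sample_results = [
    ("doc-1", "first passage", 0.12),
    ("doc-2", "second passage", 0.87),
    ("doc-3", "third passage", 0.45),
    ("doc-4", "fourth passage", 0.93),
]

# Keep the three highest-scoring entries, ordered from best to worst.
top_sample = heapq.nlargest(3, sample_results, key=lambda item: item[2])
```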


In [None]:
from rich.table import Table
table = Table(title="Top 3 Documents after Reranking", show_lines=True)

table.add_column("ID", justify="right", style="cyan", no_wrap=True)
table.add_column("Score", justify="right", style="green", no_wrap=True)
table.add_column("Document", style="#e87d3e")

# Add rows to the table with top 3 results
for doc_id, text, score in top_results:
    table.add_row(str(doc_id), f"{score:.4f}", text)

console.print(table)

## Using merged results to generate a reply

We can now take the reranked results and call the LLM to generate a reply to the user's query.

In [None]:
# define a variable to hold the search results for the generation model
search_results = [doc[1] for doc in top_results]

In [None]:
from dotenv import load_dotenv

load_dotenv()

In [None]:
# Now time to connect to the large language model
from openai import OpenAI
from rich.text import Text

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a chatbot and a research expert. Your top priority is to help users understand research papers."},
        {"role": "user", "content": query},
        {"role": "assistant", "content": str(search_results)}
    ]
)

response_text = Text(completion.choices[0].message.content)
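
The call above passes the retrieved passages as an `assistant` message. An alternative used in some RAG setups is to place the passages in the user turn as numbered context. The sketch below only builds the message list and makes no API call; `build_messages` and the prompt wording are assumptions of this example, not part of the original notebook.

```python
from typing import Dict, List


def build_messages(query: str, passages: List[str]) -> List[Dict[str, str]]:
    """Build chat messages with retrieved passages as numbered context."""
    context = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return [
        {
            "role": "system",
            "content": (
                "Answer using only the provided context. "
                "Say so if the context does not contain the answer."
            ),
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ]


# Placeholder passages; in the notebook these would come from search_results.
messages = build_messages(
    "What is context size of Mixtral?", ["passage one", "passage two"]
)
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.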

In [None]:
from rich.panel import Panel

panel = Panel(response_text, title=f"Hybrid Search with Reranking Reply to \"{query}\"")
console.print(panel)