# RAG

## Environment setup

In [1]:
from Utils import *

RUNNABLE_BASE_URL:  http://localhost:8000


In [None]:
gpt35("""I can't load modified Python code in JupyterLab. Is something being cached?""")

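### Best practices

When implementing Retrieval-Augmented Generation (RAG), obtaining and using high-quality question-answer pairs deserves real attention. RAG combines two steps, retrieval and generation, to improve the quality of the generated answers. Some best practices:

1. **High-quality Q&A pairs**: The quality of the Q&A pairs is critical. High-quality pairs supply more accurate and more relevant information, which helps the generation model produce better answers. The pairs should cover a broad range of topics, and the answers should be accurate and informative.

2. **Question encoding and matching**: In a RAG system, the user's question is usually compared against the questions in the Q&A pairs rather than directly against the answers, because a user's question and a stored question are easier to match semantically. Once the best-matching question is found, its answer can be used to help the generation model produce a response.

3. **Vector database**: Use efficient vector search to store and retrieve the vector representations of questions. This usually involves a library such as FAISS to speed up similarity search. The vectors should capture semantic information so that the retrieval stage can find the most relevant Q&A pairs.

4. **Context encoding**: Taking a question's context into account when encoding it can improve retrieval accuracy. That means encoding not only the question itself but also related context (surrounding text or additional background) into the vector.

5. **Continual learning**: The Q&A store should be updated and extended over time to include new information and data. The retrieval and generation models can also be fine-tuned through continual learning to maintain their performance.

6. **Multimodal data**: Where possible, consider enriching the Q&A pairs with multimodal data (text, images, tables, and so on). This provides more complete information and helps generate more accurate answers.

7. **User feedback**: Use user feedback to evaluate and improve the system. User satisfaction with the generated answers is an important metric for guiding iteration and optimization.

8. **Evaluation and testing**: Evaluate and test the system thoroughly and regularly to confirm that it performs as expected. Standardized metrics and test sets help track progress.

In short, RAG best practices include obtaining and using high-quality Q&A pairs, effective question encoding and matching, continuous model optimization and updates, and regular evaluation with user feedback. Together these help the system deliver high-quality, relevant answers.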
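
The question-to-question matching described in item 2 can be sketched as follows. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, and the Q&A pairs are made up for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (hypothetical stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

qa_pairs = [
    {"question": "how do i install langserve", "answer": 'Run pip install "langserve[all]".'},
    {"question": "what is a retriever", "answer": "A component that returns documents for a query."},
]

# Index the stored *questions*, not the answers.
index = [(embed(pair["question"]), pair) for pair in qa_pairs]

def best_match(user_question: str) -> dict:
    """Return the Q&A pair whose stored question is closest to the user's question."""
    query_vec = embed(user_question)
    return max(index, key=lambda item: cosine(query_vec, item[0]))[1]

match = best_match("how can i install langserve")
print(match["answer"])
```

The matched answer would then be passed to the generation model as supporting context.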

## Technical examples

### Vector store-backed retriever

In [None]:
retriever = db.as_retriever()
retriever = db.as_retriever(search_type="mmr")
retriever = db.as_retriever(
    search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5}
)
retriever = db.as_retriever(search_kwargs={"k": 1})

In [None]:
retriever.get_relevant_documents("what did he say about ketanji brown jackson")
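
To make the `search_type="mmr"` option more concrete, here is a small pure-Python sketch of maximal marginal relevance (MMR) selection. It is a simplified illustration under assumed toy vectors, not the vector store's own implementation; the name `lambda_mult` is borrowed for readability.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Return indices of k documents, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none selected yet).
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # documents 0 and 1 are near-duplicates
picked = mmr(query, docs, k=2, lambda_mult=0.3)
print(picked)
```

With a low `lambda_mult` the second pick skips the near-duplicate and favors the more distinct document, which is the behavior MMR is meant to provide.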

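### MultiQueryRetriever: expand the question first, then run vector queries

- generate_queries: uses the internal LLM to generate a list of new queries
- get_relevant_documents: returns a list of relevant documents for a given query
- invoke: runs the query and merges the results of the expanded queries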
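
The expand-then-merge flow can be sketched in plain Python. Everything here is an assumption made for illustration: `generate_queries` is a hard-coded stand-in for the LLM call, and `search` is a toy keyword matcher over a made-up corpus rather than a vector query.

```python
def generate_queries(question: str) -> list[str]:
    """Hypothetical stand-in for the LLM call that rewrites the question."""
    return [question, question.replace("install", "set up"), question + " guide"]

CORPUS = {
    "doc-install": "install langserve with pip",
    "doc-setup": "set up langserve on a server",
    "doc-guide": "langserve deployment guide",
    "doc-other": "vector store basics",
}

def search(query: str) -> list[str]:
    """Toy retriever: ids of documents sharing at least two words with the query."""
    words = set(query.lower().split())
    return [doc_id for doc_id, text in CORPUS.items() if len(words & set(text.split())) >= 2]

def multi_query_retrieve(question: str) -> list[str]:
    merged = []
    for query in generate_queries(question):
        for doc_id in search(query):
            if doc_id not in merged:  # unique union, in first-seen order
                merged.append(doc_id)
    return merged

results = multi_query_retrieve("install langserve")
print(results)
```

The original question alone finds one document; the expanded queries bring in two more, which is the point of querying with several phrasings.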

### EnsembleRetriever: combined keyword and vector retrieval

In [None]:
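from langchain.retrievers import BM25Retriever, EnsembleRetriever
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Placeholder documents; replace with your own texts
doc_list = ["document 1", "document 2", "document 3"]

# Initialize the BM25 (keyword) retriever; this needs the rank_bm25 package
bm25_retriever = BM25Retriever.from_texts(doc_list)

# Initialize the vector store; any embedding model works here,
# OpenAIEmbeddings is chosen because it is used later in this notebook
embedding = OpenAIEmbeddings()
faiss_vectorstore = FAISS.from_texts(doc_list, embedding)

# Turn the vector store into a retriever
faiss_retriever = faiss_vectorstore.as_retriever()

# Create the EnsembleRetriever
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever],
    weights=[0.5, 0.5]
)

# Retrieve with the EnsembleRetriever
docs = ensemble_retriever.get_relevant_documents("your query")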
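
To my understanding, EnsembleRetriever merges the ranked lists from its retrievers with weighted Reciprocal Rank Fusion. The sketch below shows that idea on made-up document ids; the constant `c=60` is the commonly used default and is an assumption here, not something taken from this notebook.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists of document ids with weighted Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of a list contribute more to the fused score.
            scores[doc_id] += weight / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["a", "b", "c"]    # keyword results, best first
vector_ranking = ["c", "a", "d"]  # vector results, best first
fused = weighted_rrf([bm25_ranking, vector_ranking], weights=[0.5, 0.5])
print(fused)
```

Documents that appear high in both lists rise to the top, while documents found by only one retriever still make it into the fused result.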

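### LongContextReorder: over-retrieve first, then reorder

In [None]:
from langchain_community.document_transformers import LongContextReorder

# Retrieve 50 results from the vector store (`db` and `query` are assumed to be defined)
retriever = db.as_retriever(search_kwargs={"k": 50})
docs = retriever.get_relevant_documents(query)

# Keep only the 5 most similar results; the retriever returns them in relevance order
# (a reranker such as the CrossEncoder used later in this notebook could rescore `docs` first)
top_docs = docs[:5]

# LongContextReorder takes no model and does not score documents: it moves the most
# relevant documents to the start and end of the list and the least relevant to the middle
reorder = LongContextReorder()
final_docs = reorder.transform_documents(top_docs)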
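
As far as I know, LongContextReorder itself does not score documents; it only rearranges an already ranked list so that the most relevant items sit at the start and end, where long-context models tend to attend best. A minimal pure-Python sketch of that reordering, with made-up document labels:

```python
def reorder_for_long_context(docs_by_relevance):
    """Place the most relevant items at both ends and the least relevant in the middle."""
    reordered = []
    for i, doc in enumerate(reversed(docs_by_relevance)):
        if i % 2 == 1:
            reordered.append(doc)      # goes toward the end
        else:
            reordered.insert(0, doc)   # goes toward the start
    return reordered

ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
reordered = reorder_for_long_context(ranked)
print(reordered)
```

Any truncation to a top-N therefore has to happen before this step, since slicing the reordered list would keep a mix of strong and weak documents.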

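### Comparing methods for optimizing retrieval results

ContextualCompressionRetriever, LLMChainFilter, and LongContextReorder all post-process retrieval results. Their main similarities and differences are as follows:

#### Similarities

- All aim to improve the quality of the retrieval results
- All typically run after the results are retrieved and before they are passed to the model

#### Differences

**How they work**

- ContextualCompressionRetriever: compresses each document, extracting the relevant parts
- LLMChainFilter: drops irrelevant documents entirely
- LongContextReorder: changes the order of the relevant documents


**What they depend on**

- ContextualCompressionRetriever: needs a document compressor
- LLMChainFilter: needs an LLM chain to judge relevance
- LongContextReorder: needs documents already ranked by similarity


**What they focus on**

- ContextualCompressionRetriever: raises the density of relevant information
- LLMChainFilter: reduces irrelevant noise
- LongContextReorder: makes the relevant information easier for the model to use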
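
The difference between filtering and compressing can be shown on toy data. This sketch is illustrative only: `is_relevant` is a keyword check standing in for the LLM judgment, and the documents are invented.

```python
def is_relevant(query: str, text: str) -> bool:
    """Hypothetical relevance check; a real filter would ask an LLM."""
    return any(word in text.lower() for word in query.lower().split())

def filter_docs(query, docs):
    # Filtering: keep or drop whole documents.
    return [d for d in docs if is_relevant(query, d)]

def compress_docs(query, docs):
    # Compression: keep only the relevant sentences of each document.
    compressed = []
    for d in docs:
        kept = [s.strip() for s in d.split(".") if s.strip() and is_relevant(query, s)]
        if kept:
            compressed.append(". ".join(kept) + ".")
    return compressed

docs = [
    "LangServe deploys chains as a REST API. The weather is nice today.",
    "FAISS is a vector search library.",
]
filtered = filter_docs("langserve", docs)
compressed = compress_docs("langserve", docs)
print(filtered)
print(compressed)
```

Filtering returns the first document unchanged, while compression also trims the unrelated sentence from it.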

## Building the LangChain knowledge base

<div class="alert alert-warning">
<b>Compatibility issue:</b><br/>
    The recent BeautifulSoup release 4.12.3 worked well with Python 3.10 here; with Python 3.9 or 3.12 it could not find lxml or html5lib.
</div>

In [2]:
from bs4 import BeautifulSoup, SoupStrainer
from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader
from langchain_community.document_loaders.sitemap import SitemapLoader
from langchain_core.utils.html import PREFIXES_TO_IGNORE_REGEX, SUFFIXES_TO_IGNORE_REGEX
import re

In [3]:
# Only needed inside Jupyter
import nest_asyncio
nest_asyncio.apply()

### Extract the LangChain documentation

#### Extract the LangChain Docs

In [7]:
def metadata_extractor(meta: dict, soup: BeautifulSoup) -> dict:
    title = soup.find("title")
    description = soup.find("meta", attrs={"name": "description"})
    html = soup.find("html")
    return {
        "source": meta["loc"],
        "title": title.get_text() if title else "",
        "description": description.get("content", "") if description else "",
        "language": html.get("lang", "") if html else "",
        **meta,
    }

def load_langchain_docs():
    return SitemapLoader(
        "https://python.langchain.com/sitemap.xml",
        filter_urls = ["https://python.langchain.com/"],
        parsing_function = web_page_extractor,
        default_parser = "lxml",
        bs_kwargs = {
            "parse_only": SoupStrainer(
                name = ("article", "title", "html", "lang", "content")
            ),
        },
        meta_function = metadata_extractor,
    ).load()

In [5]:
langchain_docs = load_langchain_docs()

Fetching pages: 100%|##########| 1180/1180 [07:51<00:00,  2.50it/s]


#### Extract the LangChain API reference

In [8]:
def simple_extractor(html: str) -> str:
    soup = BeautifulSoup(html, "lxml")
    return re.sub(r"\n\n+", "\n\n", soup.text).strip()

def load_api_docs():
    return RecursiveUrlLoader(
        url = "https://api.python.langchain.com/en/stable/langchain_api_reference.html",
        max_depth = 8,
        extractor = simple_extractor,
        prevent_outside = True,
        use_async = True,
        timeout = 600,
        # Drop trailing / to avoid duplicate pages.
        link_regex = (
            f"href=[\"']{PREFIXES_TO_IGNORE_REGEX}((?:{SUFFIXES_TO_IGNORE_REGEX}.)*?)"
            r"(?:[\#'\"]|\/[\#'\"])"
        ),
        check_response_status = True,
        # Excluded paths must share the "en/stable" prefix of `url` to have any effect.
        exclude_dirs = (
            "https://api.python.langchain.com/en/stable/_sources",
            "https://api.python.langchain.com/en/stable/_modules",
        ),
    ).load()

In [9]:
api_docs = load_api_docs()

#### Extract the LangSmith docs

In [8]:
def load_langsmith_docs():
    return RecursiveUrlLoader(
        url = "https://docs.smith.langchain.com/",
        max_depth = 8,
        extractor = simple_extractor,
        prevent_outside = True,
        use_async = True,
        timeout = 600,
        # Drop trailing / to avoid duplicate pages.
        link_regex = (
            f"href=[\"']{PREFIXES_TO_IGNORE_REGEX}((?:{SUFFIXES_TO_IGNORE_REGEX}.)*?)"
            r"(?:[\#'\"]|\/[\#'\"])"
        ),
        check_response_status = True,
    ).load()

In [13]:
langsmith_docs = load_langsmith_docs()



### Store the documents in DuckDB

#### Connect to DuckDB

In [2]:
web_store = WebPageDataset(db_name = "data/langchain.duckdb")

#### Save to DuckDB

In [10]:
# https://python.langchain.com/
for d in langchain_docs:
    print(".", end = "")
    web_store.upsert(d, topic = "langchain_docs")

........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

In [12]:
web_store.upsert(api_docs[0], topic = "langchain_api_docs")

In [17]:
# https://docs.smith.langchain.com/
for d in langsmith_docs:
    print(".", end = "")
    web_store.upsert(d, topic = "langsmith_docs")

.....................................................................

#### Query DuckDB

In [3]:
import re

In [4]:
docs = web_store.read_data(topic = "langchain_docs")

In [5]:
pattern = re.compile('lancedb', re.IGNORECASE)
result = [obj for obj in docs if pattern.search(obj.source)]
for obj in result:
    print(obj.source)

https://python.langchain.com/docs/integrations/providers/lancedb
https://python.langchain.com/docs/integrations/vectorstores/lancedb


### Split into text chunks

#### Load the stored documents

In [2]:
langchain_ds = WebPageDataset(db_name = "data/langchain.duckdb")
docs = langchain_ds.read_data(topic = None)

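#### Remove text that adds nothing to RAG

<div class="alert alert-success">
<b>Observed document sizes:</b><br/>
    Many documents exceed 50K characters, and the largest reaches about 200K.<br>
    Some contain base64-encoded images and others contain the printed output of example code, neither of which helps RAG much.
</div>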

In [3]:
dict_list = [{ "source": obj.source, "len": len(obj.page_content) } for obj in docs]
sorted_dict_list = sorted(dict_list, key = lambda x: x['len'], reverse = True)
for obj in sorted_dict_list:
    print(obj['len'], " >> ", obj['source'])

195919  >>  https://python.langchain.com/docs/integrations/retrievers/activeloop
150150  >>  https://python.langchain.com/docs/use_cases/question_answering/citations
74417  >>  https://python.langchain.com/docs/integrations/document_loaders/dropbox
73698  >>  https://python.langchain.com/docs/integrations/vectorstores/timescalevector
66009  >>  https://python.langchain.com/docs/integrations/document_loaders/docugami
65474  >>  https://python.langchain.com/docs/use_cases/code_understanding
63441  >>  https://python.langchain.com/docs/integrations/tools/google_lens
62074  >>  https://python.langchain.com/docs/integrations/chat/ollama
60867  >>  https://python.langchain.com/docs/expression_language/cookbook/prompt_size
59026  >>  https://python.langchain.com/docs/guides/debugging
57438  >>  https://python.langchain.com/docs/integrations/llms/ollama
54666  >>  https://api.python.langchain.com/en/stable/langchain_api_reference.html
53959  >>  https://python.langchain.com/docs/modules/agents

Strip the printed output blocks and the base64-encoded image data:

In [4]:
langchain_new_docs = [{
        "content": remove_text_blocks(remove_base64(obj.page_content)), 
        "source": obj.source, 
        "title": obj.title, 
        "description": obj.description
    } for obj in docs]
newDocs = sort_list_by_len(langchain_new_docs, "content")
for obj in newDocs:
    print(obj[1], " >> ", obj[0]['source'])

54666  >>  https://api.python.langchain.com/en/stable/langchain_api_reference.html
30920  >>  https://python.langchain.com/docs/integrations/vectorstores/timescalevector
29616  >>  https://python.langchain.com/docs/modules/data_connection/document_loaders/file_directory
28836  >>  https://docs.smith.langchain.com/tracing/tracing-faq
28812  >>  https://python.langchain.com/docs/integrations/tools/google_lens
26134  >>  https://python.langchain.com/docs/langgraph
24125  >>  https://python.langchain.com/docs/langserve
23943  >>  https://docs.smith.langchain.com/evaluation/quickstart
23617  >>  https://python.langchain.com/docs/integrations/toolkits/github
22722  >>  https://python.langchain.com/docs/integrations/vectorstores/redis
22675  >>  https://python.langchain.com/docs/guides/safety/amazon_comprehend_chain
22587  >>  https://docs.smith.langchain.com/cookbook/testing-examples/comparing-runs
22480  >>  https://python.langchain.com/docs/get_started/quickstart
22014  >>  https://python.
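
The helpers `remove_base64` and `sort_list_by_len` come from `Utils` and are not shown in this notebook. The sketch below is only a guess at what they might look like, inferred from how they are called; the `_sketch` names, the data-URI pattern, and the sample page are all assumptions.

```python
import re

# Matches data URIs such as "data:image/png;base64,iVBORw0..." (assumed format of the embedded images).
BASE64_IMAGE = re.compile(r"data:image/[\w.+-]+;base64,[A-Za-z0-9+/=]+")

def remove_base64_sketch(text: str) -> str:
    """Drop inline base64 image payloads, leaving the surrounding text."""
    return BASE64_IMAGE.sub("", text)

def sort_list_by_len_sketch(items: list, key: str) -> list:
    """Return (item, length) pairs, longest first, matching the obj[0] / obj[1] usage above."""
    pairs = [(item, len(item[key])) for item in items]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

page = "intro ![img](data:image/png;base64,iVBORw0KGgoAAAANSUhEUg==) outro"
cleaned = remove_base64_sketch(page)
ranked = sort_list_by_len_sketch([{"content": "ab"}, {"content": "abcd"}], "content")
print(cleaned)
print([length for _, length in ranked])
```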

#### Split the text: by Markdown headings first, then by length for sections that are still too long

In [5]:
from langchain.text_splitter import MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter

In [6]:
headers_to_split_on = [
    ("#", "H1"),
    ("##", "H2"),
    ("###", "H3"),
]

In [7]:
markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on = headers_to_split_on)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 2000,
    chunk_overlap = 200,
    length_function = len,
    is_separator_regex = False,
)

md_header_splits = []
page_index = 0

for obj in newDocs:
    final_texts = []
    # Step 1: split on Markdown headings
    first_texts = markdown_splitter.split_text(obj[0]["content"])
    # Step 2: split sections that are still too long by character length
    for x in first_texts:
        page_index += 1
        # print(x.metadata)

        if len(x.page_content) < 2000:
            # Short section: keep it whole and mark it with chunk_index -1
            final_texts.append({
                "page_index": page_index,
                "chunk_index": -1,
                "H1": x.metadata.get("H1", ""),
                "H2": x.metadata.get("H2", ""),
                "H3": x.metadata.get("H3", ""),
                "content": x.page_content
            })
            # print(" < 2000, appended to final_texts as is", end = ": ")
            # print(len(final_texts))
        else:
            # Long section: split further and number the pieces from 1
            texts_split = [{
                "page_index": page_index,
                "chunk_index": index + 1,
                "H1": x.metadata.get("H1", ""),
                "H2": x.metadata.get("H2", ""),
                "H3": x.metadata.get("H3", ""),
                "content": piece.page_content
            } for index, piece in enumerate(text_splitter.create_documents([x.page_content]))]
            final_texts += texts_split
            # print(" >= 2000, split and then appended to final_texts", end = ": ")
            # print(len(final_texts))
    #
    md_header_splits += [{
        "page_index": chunk["page_index"],
        "chunk_index": chunk["chunk_index"],
        "title": obj[0]["title"],
        "description": obj[0]["description"],
        "source": obj[0]["source"],
        "H1": chunk["H1"],
        "H2": chunk["H2"],
        "H3": chunk["H3"],
        "content": chunk["content"]} for chunk in final_texts]
    #
    print(len(md_header_splits), end = " ")
    

31 55 78 98 116 150 182 198 221 242 260 275 299 328 344 357 375 388 424 441 452 466 481 492 504 520 535 548 559 571 582 592 603 613 623 635 654 665 674 685 705 720 731 743 751 770 786 795 807 821 837 849 865 875 882 900 913 926 935 945 954 961 970 983 996 1007 1016 1027 1035 1042 1051 1063 1079 1093 1101 1113 1120 1129 1138 1146 1153 1167 1176 1191 1199 1211 1222 1239 1248 1258 1265 1275 1288 1297 1305 1314 1321 1329 1336 1342 1350 1357 1365 1372 1377 1398 1408 1413 1421 1430 1438 1450 1456 1467 1472 1478 1487 1495 1502 1507 1516 1525 1531 1537 1547 1554 1560 1570 1577 1583 1588 1596 1602 1611 1618 1626 1633 1638 1643 1655 1661 1669 1674 1682 1692 1698 1704 1711 1716 1724 1730 1735 1743 1752 1762 1768 1775 1781 1786 1794 1799 1806 1811 1817 1824 1835 1840 1850 1858 1863 1873 1878 1882 1888 1895 1902 1910 1917 1924 1932 1939 1949 1956 1961 1969 1980 1986 1994 2003 2009 2015 2024 2028 2035 2045 2052 2057 2064 2072 2079 2088 2095 2101 2106 2113 2120 2129 2136 2142 2150 2162 2167 2173 2183
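
The two-stage strategy can be illustrated without LangChain. This is a simplified sketch: it splits on heading lines and then cuts fixed-size overlapping character windows, whereas RecursiveCharacterTextSplitter prefers natural separators. The function names and the sample text are invented for the example.

```python
import re

def split_by_headers(markdown: str):
    """Split on lines starting with #, ## or ###; return (heading, body) pairs."""
    sections, heading, lines = [], "", []
    for line in markdown.splitlines():
        if re.match(r"#{1,3} ", line):
            if lines:
                sections.append((heading, "\n".join(lines)))
            heading, lines = line.lstrip("# "), []
        else:
            lines.append(line)
    if lines:
        sections.append((heading, "\n".join(lines)))
    return sections

def split_by_length(text: str, chunk_size: int, overlap: int):
    """Fixed-size character windows that overlap by `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def two_stage_split(markdown: str, chunk_size: int = 2000, overlap: int = 200):
    chunks = []
    for heading, body in split_by_headers(markdown):
        if len(body) < chunk_size:
            # Short section: kept whole, marked with chunk_index -1 as in the cell above.
            chunks.append({"heading": heading, "chunk_index": -1, "content": body})
        else:
            for n, piece in enumerate(split_by_length(body, chunk_size, overlap), start=1):
                chunks.append({"heading": heading, "chunk_index": n, "content": piece})
    return chunks

sample = "# Intro\nshort text\n## Details\n" + "x" * 50
chunks = two_stage_split(sample, chunk_size=20, overlap=5)
print([(c["heading"], c["chunk_index"], len(c["content"])) for c in chunks])
```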

#### Save to DuckDB

In [22]:
text_block_ds = TextBlockDataset(db_name = "data/langchain.duckdb", drop = False)

In [23]:
for obj in md_header_splits:
    print(obj["chunk_index"], ":", obj["page_index"], ":", obj["source"]) 

1 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
2 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
3 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
4 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
5 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
6 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
7 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
8 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
9 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
10 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
11 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
12 : 1 : https://api.python.langchain.com/en/stable/langchain_api_reference.html
13 : 1 : https://api.python.langchain

In [24]:
def check_duplicates(data):
    keys = set()
    for item in data:
        key = (item['source'], item['page_index'], item['chunk_index'])
        if key in keys:
            print(f'Duplicate key found: {key}')
            return True
        else:
            keys.add(key)
    print('No duplicate keys found.')
    return False

In [25]:
import duckdb
conn = duckdb.connect("data/langchain.duckdb") 
def check_primary_key(conn, table_name):
    cursor = conn.cursor()
    cursor.execute(f"PRAGMA table_info({table_name})")
    columns = cursor.fetchall()
    for column in columns:
        if column[5]:  # column[5] is the pk flag; it is truthy when the column belongs to the primary key
            print(f"Primary key is set on column: {column[1]}")
check_primary_key(conn, "text_blocks")

Primary key is set on column: source
Primary key is set on column: page_index
Primary key is set on column: chunk_index


In [26]:
check_duplicates(md_header_splits)

No duplicate keys found.


False

In [27]:
for d in md_header_splits:
    print(".", end = "")
    text_block_ds.upsert(d)
print("finish!")

........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
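
`TextBlockDataset.upsert` is defined in `Utils` and not shown here. Assuming it behaves like a SQL upsert on the composite primary key reported above (`source`, `page_index`, `chunk_index`), the idea can be sketched with the standard-library `sqlite3` module as a stand-in for DuckDB. The table layout and sample rows are assumptions for illustration.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    """
    CREATE TABLE text_blocks (
        source TEXT, page_index INTEGER, chunk_index INTEGER, content TEXT,
        PRIMARY KEY (source, page_index, chunk_index)
    )
    """
)

def upsert(row: dict) -> None:
    # Insert the row, or overwrite its content when the composite key already exists.
    demo_conn.execute(
        """
        INSERT INTO text_blocks (source, page_index, chunk_index, content)
        VALUES (:source, :page_index, :chunk_index, :content)
        ON CONFLICT (source, page_index, chunk_index)
        DO UPDATE SET content = excluded.content
        """,
        row,
    )

upsert({"source": "https://example.com/a", "page_index": 1, "chunk_index": -1, "content": "v1"})
upsert({"source": "https://example.com/a", "page_index": 1, "chunk_index": -1, "content": "v2"})
rows = demo_conn.execute("SELECT content FROM text_blocks").fetchall()
print(rows)
```

Re-running the ingestion loop with such an upsert updates existing chunks instead of failing on the primary key, which is why the duplicate check before the loop matters mainly within a single batch.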

### Store the text in LanceDB

#### Embed the text

In [35]:
from langchain_openai import OpenAIEmbeddings
embeddings_model = OpenAIEmbeddings()

In [56]:
all_texts = text_block_ds.read_data()
print("length of all_texts: ", len(all_texts))

length of all_texts:  5602


In [33]:
text_block_ds.read_data()[4000]

TextBlock(page_index=3179, chunk_index=-1, title='Installation | 🦜️🔗 Langchain', description='Official release', source='https://python.langchain.com/docs/get_started/installation', H1='', H2='LangServe\u200b', H3='', content='LangServe helps developers deploy LangChain runnables and chains as a REST API.\nLangServe is automatically installed by LangChain CLI.\nIf not using LangChain CLI, install with:  \n```bash\npip install "langserve[all]"\n```  \nfor both client and server dependencies. Or `pip install "langserve[client]"` for client code, and `pip install "langserve[server]"` for server code.', timestamp=datetime.datetime(2024, 1, 31, 23, 29, 33, 741125))

In [53]:
batch_size = 100
embeddings = []

for i in range(0, len(all_texts), batch_size):
    # Prefix each chunk with its page metadata and heading path before embedding
    to_embed = [
        f'''
        meta:
        - source: {t.source}
        - title: {t.title}
        - description: {t.description}

        # {t.H1 or t.title}
        ## {t.H2 or "--"}
        ### {t.H3 or "--"}
        {t.content}
        '''
        for t in all_texts[i:i+batch_size]
    ]
    embeddings.extend(embeddings_model.embed_documents(to_embed))

In [55]:
print("length of embeddings: ", len(embeddings))

length of embeddings:  5602
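
The batching loop above can be checked offline with a stand-in embedder. In this sketch `fake_embed_documents` is hypothetical and replaces `embeddings_model.embed_documents`, the blocks are synthetic, and the template is dedented before formatting so that the embedded text carries no leading indentation (a design choice of this sketch, not of the original cell).

```python
from textwrap import dedent

TEMPLATE = dedent(
    """\
    meta:
    - source: {source}
    - title: {title}
    - description: {description}

    # {h1}
    ## {h2}
    ### {h3}
    {content}"""
)

def build_embedding_text(block: dict) -> str:
    """Prefix the chunk with page metadata and its heading path."""
    return TEMPLATE.format(
        source=block["source"], title=block["title"], description=block["description"],
        h1=block["H1"] or block["title"], h2=block["H2"] or "--", h3=block["H3"] or "--",
        content=block["content"],
    )

def batched(items: list, batch_size: int):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def fake_embed_documents(texts: list) -> list:
    """Hypothetical stand-in for embeddings_model.embed_documents."""
    return [[float(len(t))] for t in texts]

blocks = [
    {"source": f"https://example.com/{i}", "title": "T", "description": "",
     "H1": "", "H2": "", "H3": "", "content": f"chunk {i}"}
    for i in range(250)
]
vectors = []
for batch in batched(blocks, 100):
    vectors.extend(fake_embed_documents([build_embedding_text(b) for b in batch]))
print(len(vectors))
```

One vector per block, in the same order, is what the later `zip(all_texts, embeddings)` step relies on.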


#### Connect to LanceDB

In [59]:
import lancedb
uri = "data/langchain.lancedb"
db = lancedb.connect(uri)

In [64]:
combined = [
    {
        "page_index": t.page_index,
        "chunk_index": t.chunk_index,
        "source": t.source,
        "title": t.title,
        "description": t.description,        
        "content": t.content,
        "vector": e
    } 
    for t, e in zip(all_texts, embeddings)
]

In [102]:
combined[1000]

{'page_index': 611,
 'chunk_index': -1,
 'source': 'https://python.langchain.com/docs/modules/data_connection/indexing',
 'title': 'Indexing | 🦜️🔗 Langchain',
 'description': 'Here, we will look at a basic indexing workflow using the LangChain',
 'content': 'The record manager relies on a time-based mechanism to determine what\ncontent can be cleaned up (when using `full` or `incremental` cleanup\nmodes).  \nIf two tasks run back-to-back, and the first task finishes before the\nclock time changes, then the second task may not be able to clean up\ncontent.  \nThis is unlikely to be an issue in actual settings for the following\nreasons:  \n1. The RecordManager uses higher resolution timestamps.  \n2. The data would need to change between the first and the second tasks\nruns, which becomes unlikely if the time interval between the tasks\nis small.  \n3. Indexing tasks typically take more than a few ms.',
 'vector': [0.0008069827281021049,
  0.004571527610236608,
  -0.0040509527611311,
  

In [100]:
combined[0]

{'page_index': 1,
 'chunk_index': 1,
 'source': 'https://api.python.langchain.com/en/stable/langchain_api_reference.html',
 'title': 'langchain 0.1.4 — 🦜🔗 LangChain 0.1.4',
 'description': '',
 'content': 'langchain 0.1.4 — 🦜🔗 LangChain 0.1.4  \nLangChain  \nCore  \nCommunity  \nExperimental  \ngoogle-vertexai  \nrobocorp  \ngoogle-genai  \nanthropic  \nnvidia-trt  \nopenai  \nmistralai  \ntogether  \nnvidia-ai-endpoints  \nexa  \nPartner libs  \ngoogle-vertexai\nrobocorp\ngoogle-genai\nanthropic\nnvidia-trt\nopenai\nmistralai\ntogether\nnvidia-ai-endpoints\nexa  \nDocs  \nToggle Menu  \nPrev\nUp\nNext  \nlangchain 0.1.4\nlangchain.agents\nClasses\nFunctions  \nlangchain.callbacks\nClasses  \nlangchain.chains\nClasses\nFunctions  \nlangchain.embeddings\nClasses\nFunctions  \nlangchain.evaluation\nClasses\nFunctions  \nlangchain.hub\nFunctions  \nlangchain.indexes\nClasses\nFunctions  \nlangchain.memory\nClasses\nFunctions  \nlangchain.model_laboratory\nClasses  \nlangchain.output_parse

In [66]:
tbl = db.create_table("langchain-docs", data = combined)

In [67]:
tbl

LanceTable(langchain-docs)

#### Search LanceDB

In [181]:
# English: "Can I specify which methods to use in the chain returned by langserve?"
user_query = "我可以在langserve返回的chain中指定使用哪些方法吗？"
question = embeddings_model.embed_query(user_query)

In [182]:
import lancedb
import numpy as np
result = tbl.search(question).metric("cosine").limit(20).to_list()
for r in result:
    print(r['page_index'], r['chunk_index'], r['title'], r['source'], r['_distance'])

1799 -1 self-query-qdrant | 🦜️🔗 Langchain https://python.langchain.com/docs/templates/self-query-qdrant 0.21836364269256592
69 -1 🦜️🏓 LangServe | 🦜️🔗 Langchain https://python.langchain.com/docs/langserve 0.232219398021698
129 -1 Quickstart | 🦜️🔗 Langchain https://python.langchain.com/docs/get_started/quickstart 0.23326295614242554
141 1 Quickstart | 🦜️🔗 Langchain https://python.langchain.com/docs/get_started/quickstart 0.2356204390525818
2413 -1 Quick Start | 🦜️🔗 Langchain https://python.langchain.com/docs/modules/model_io/llms/quick_start 0.23933225870132446
239 -1 Quickstart | 🦜️🔗 Langchain https://python.langchain.com/docs/use_cases/question_answering/quickstart 0.23980873823165894
1020 -1 Quickstart | 🦜️🔗 Langchain https://python.langchain.com/docs/use_cases/tool_use/quickstart 0.24083775281906128
268 -1 Why use LCEL | 🦜️🔗 Langchain https://python.langchain.com/docs/expression_language/why 0.24112391471862793
2514 1 rag-google-cloud-sensitive-data-protection | 🦜️🔗 Langchain https:/
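
The `_distance` column above is smaller for closer matches. Assuming the cosine metric reports one minus the cosine similarity, a brute-force equivalent of the search can be sketched on toy vectors (the rows here are invented; a real table would use an index rather than a full scan):

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def search(query_vec, rows, limit=2):
    """Brute-force nearest neighbours: smallest cosine distance first."""
    scored = [{**row, "_distance": cosine_distance(query_vec, row["vector"])} for row in rows]
    return sorted(scored, key=lambda r: r["_distance"])[:limit]

rows = [
    {"page_index": 1, "vector": [1.0, 0.0]},
    {"page_index": 2, "vector": [0.0, 1.0]},
    {"page_index": 3, "vector": [1.0, 1.0]},
]
hits = search([2.0, 0.0], rows)
for hit in hits:
    print(hit["page_index"], round(hit["_distance"], 4))
```

Cosine distance ignores vector length, so the query `[2.0, 0.0]` matches `[1.0, 0.0]` exactly.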

In [183]:
for r in result[0:5]:
    print("\n", r['page_index'], r['title'], "【",  r['description'], "】\n", r['content'])


 1799 self-query-qdrant | 🦜️🔗 Langchain 【 This template performs self-querying 】
 (Optional) If you have access to LangSmith, configure it to help trace, monitor and debug LangChain applications. If you don't have access, skip this section.  
```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project>  # if not specified, defaults to "default"
```  
If you are inside this directory, then you can spin up a LangServe instance directly by:  
```shell
langchain serve
```

 69 🦜️🏓 LangServe | 🦜️🔗 Langchain 【 Release Notes 】
 Python SDK  
```python

from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.in

In [115]:
from langchain_openai import OpenAIEmbeddings
embeddings_model = OpenAIEmbeddings()

In [118]:
demo_vectors = embeddings_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)

In [119]:
embedded_query = embeddings_model.embed_query("What was the name mentioned in the conversation?")
embedded_query[:5]

[0.0053546813655943075,
 -0.0005715346531097275,
 0.038875909934336914,
 -0.0029596003572924623,
 -0.008966285328704282]

In [117]:
import lancedb
uri = "data/demo.lancedb"
demo_db = lancedb.connect(uri)

#### Rerank

In [167]:
!poetry add sentence_transformers

Using version ^2.3.1 for sentence-transformers

Updating dependencies
Resolving dependencies...

In [179]:
from sentence_transformers import CrossEncoder

model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', max_length=512)

In [186]:
scores = model.predict([(user_query, r['content']) for r in result])
# Sort by score, highest first
sorted_list = sorted(
    zip(scores, result), key=lambda x: x[0], reverse=True)
for score, chunk in sorted_list[:10]:
    print(f"{score}\t {chunk['page_index']}\t {chunk['source']}\n {chunk['title']}\n")

-2.792736053466797	 126	 https://python.langchain.com/docs/get_started/quickstart
 Quickstart | 🦜️🔗 Langchain

-4.101958751678467	 141	 https://python.langchain.com/docs/get_started/quickstart
 Quickstart | 🦜️🔗 Langchain

-4.8970537185668945	 65	 https://python.langchain.com/docs/langserve
 🦜️🏓 LangServe | 🦜️🔗 Langchain

-5.01893949508667	 1020	 https://python.langchain.com/docs/use_cases/tool_use/quickstart
 Quickstart | 🦜️🔗 Langchain

-5.048722743988037	 3691	 https://python.langchain.com/docs/templates/plate-chain
 plate-chain | 🦜️🔗 Langchain

-5.153246879577637	 3431	 https://python.langchain.com/docs/templates/research-assistant
 research-assistant | 🦜️🔗 Langchain

-5.198843479156494	 131	 https://python.langchain.com/docs/get_started/quickstart
 Quickstart | 🦜️🔗 Langchain

-5.369041442871094	 67	 https://python.langchain.com/docs/langserve
 🦜️🏓 LangServe | 🦜️🔗 Langchain

-5.623697280883789	 69	 https://python.langchain.com/docs/langserve
 🦜️🏓 LangServe | 🦜️🔗 Langchain

-5.7625250
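
The retrieve-then-rerank pattern does not depend on a particular scoring model. In the sketch below, `overlap_score` is a hypothetical word-overlap scorer standing in for `model.predict`, and the candidate passages are short excerpts of content shown earlier in this notebook.

```python
def overlap_score(query: str, passage: str) -> float:
    """Hypothetical scorer: fraction of query words present in the passage."""
    words = query.lower().split()
    return sum(w in passage.lower() for w in words) / len(words)

def rerank(query, candidates, score_fn, top_n=2):
    """Score every (query, passage) pair and keep the top_n highest-scoring candidates."""
    scores = [score_fn(query, c["content"]) for c in candidates]
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_n]

candidates = [
    {"page_index": 1, "content": "This template performs self-querying."},
    {"page_index": 2, "content": "LangServe helps developers deploy LangChain runnables and chains as a REST API."},
    {"page_index": 3, "content": "Serve your application with LangServe."},
]
top = rerank("deploy chains with langserve", candidates, overlap_score)
for score, candidate in top:
    print(score, candidate["page_index"])
```

One thing worth checking in the real run: the query above is written in Chinese, while the MS MARCO cross-encoders are trained on English passage ranking, so a multilingual scorer might rank these passages differently.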

In [187]:
for score, chunk in sorted_list[:10]:
    print(f"{score}\t {chunk['page_index']}\t {chunk['source']}\n {chunk['content']}\n")

-2.792736053466797	 126	 https://python.langchain.com/docs/get_started/quickstart
 Quickstart | 🦜️🔗 Langchain  
[Skip to main content](#__docusaurus_skipToContent_fallback)# Quickstart  
In this quickstart we'll show you how to:  
- Get setup with LangChain, LangSmith and LangServe  
- Use the most basic and common components of LangChain: prompt templates, models, and output parsers  
- Use LangChain Expression Language, the protocol that LangChain is built on and which facilitates component chaining  
- Build a simple application with LangChain  
- Trace your application with LangSmith  
- Serve your application with LangServe  
That's a fair amount to cover! Let's dive in.

-4.101958751678467	 141	 https://python.langchain.com/docs/get_started/quickstart
 class Input(BaseModel):
input: str
chat_history: List[BaseMessage] = Field(
...,
extra={"widget": {"type": "chat", "input": "location"}},
)  
class Output(BaseModel):
output: str  
add_routes(
app,
agent_executor.with_types(input_t