## 10th Sky Hackathon

### Building an Intelligent Conversational Bot with RAG


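The goal of this hackathon is to use NVIDIA AI Endpoints and the NIM tools, combined with RAG (retrieval-augmented generation), to build a chatbot that answers from a local knowledge base.

* NVIDIA AI Endpoints introduction: https://python.langchain.com/v0.1/docs/integrations/chat/nvidia_ai_endpoints/
* NVIDIA NIM page: https://build.nvidia.com/explore/discover
* NVIDIA DLI course materials: https://www.nvidia.cn/training/online/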
------------------

![](https://v.png.pub/imgs/2024/06/24/d64b7856c05fa5d4.png)


### Step 1: Import the required libraries

In [1]:
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import UnstructuredFileLoader
from langchain_community.document_transformers import LongContextReorder
from langchain_core.runnables import RunnableLambda
from langchain_core.runnables.passthrough import RunnableAssign
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from faiss import IndexFlatL2
from langchain_community.docstore.in_memory import InMemoryDocstore

import gradio as gr
from functools import partial
from operator import itemgetter
import os

### Step 2: Set the API key

Note: you need to request an API key on the NVIDIA NIM page: https://build.nvidia.com/explore/discover

In [2]:
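import getpass
import os

# Read the key from the NVIDIA_API_KEY environment variable, or prompt for it.
# Do not hard-code a real key in a notebook that will be shared.
nvidia_api_key = os.environ.get("NVIDIA_API_KEY") or getpass.getpass("NVIDIA API key: ")
assert nvidia_api_key.startswith("nvapi-"), f"{nvidia_api_key[:5]}... is not a valid key"
os.environ["NVIDIA_API_KEY"] = nvidia_api_key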

### Step 3: Set up the chat model

Choose a chat model here and test it.

In [3]:
llm = ChatNVIDIA(model="ai-nemotron-4-340b-instruct")
result = llm.invoke("what is nemo")
print(result.content)

Nemo is a character from the animated film "Finding Nemo," produced by Pixar Animation Studios and released by Walt Disney Pictures in 2003. Nemo is a young clownfish who gets separated from his father, Marlin, and ends up in a fish tank in a dentist's office in Sydney, Australia. The movie follows Marlin's journey to find and rescue Nemo, with the help of a forgetful blue tang fish named Dory. Nemo is a curious, adventurous, and determined character who learns important lessons about trust, independence, and the importance of family throughout the film.


### Step 4: Set up the embedding model

Choose an embedding model here and test it.

In [4]:
embedder = NVIDIAEmbeddings(model="ai-embed-qa-4")
embedder.embed_query("test")

[-0.0190887451171875,
 -0.017364501953125,
 -0.034393310546875,
 -0.005931854248046875,
 0.03228759765625,
 0.046234130859375,
 0.0012292861938476562,
 -0.01788330078125,
 0.0220184326171875,
 0.004703521728515625,
 0.08740234375,
 0.0169830322265625,
 0.0224151611328125,
 -0.03936767578125,
 0.0138702392578125,
 0.0213623046875,
 -0.0003600120544433594,
 -0.0188751220703125,
 -0.02789306640625,
 0.0257110595703125,
 0.014892578125,
 0.0013837814331054688,
 0.052215576171875,
 -0.00373077392578125,
 0.0179595947265625,
 0.0169830322265625,
 -0.01444244384765625,
 -0.034393310546875,
 0.034576416015625,
 0.0621337890625,
 -0.034637451171875,
 -0.0909423828125,
 0.0003077983856201172,
 0.00475311279296875,
 0.0156402587890625,
 -0.0426025390625,
 0.04205322265625,
 0.0124969482421875,
 -0.034881591796875,
 0.0009160041809082031,
 0.00859832763671875,
 -0.0204315185546875,
 -0.00238037109375,
 0.0194854736328125,
 -0.0233001708984375,
 0.00341033935546875,
 -0.00611114501953125,
 -0.00788
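
The list above is what `embed_query` returns: a plain list of floats. Retrieval works by comparing such vectors. As a minimal sketch (pure Python, with made-up three-dimensional vectors standing in for real model output), cosine similarity between two vectors can be computed as follows. Note that the FAISS index built later in this notebook uses L2 distance rather than cosine similarity; the helper name `cosine_similarity` is an illustrative assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings (illustrative values only).
query_vec = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]   # same direction as the query
doc_far = [0.0, 1.0, 0.0]     # orthogonal to the query

sim_close = cosine_similarity(query_vec, doc_close)
sim_far = cosine_similarity(query_vec, doc_far)
```

A vector pointing in the same direction as the query scores close to 1, and an orthogonal one scores 0.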

### Step 5: Read the text data

In [5]:
import os
from tqdm import tqdm
from pathlib import Path

# Read the text data and prepare it for the vector store
ps = os.listdir("./zh_data/")
data = []
sources = []
docs_name = []
for p in ps:
    if p.endswith('.txt'):
        path2file="./zh_data/"+p
        docs_name.append(path2file)
        with open(path2file,encoding="utf-8") as f:
            lines=f.readlines()
            for line in lines:
                if len(line)>=1:
                    data.append(line)
                    sources.append(path2file)

documents=[d for d in data if d != '\n']
len(data), len(documents), data[0]

(2022,
 1849,
 '泰坦尼克号是一部 1997 年美国史诗爱情灾难片，由詹姆斯·卡梅隆执导、编剧、制作和联合剪辑。该片以 1912 年泰坦尼克号沉没事件为基础，融合了历史和虚构的元素。凯特·温斯莱特和莱昂纳多·迪卡普里奥饰演不同社会阶层的成员，他们在泰坦尼克号的处女航中坠入爱河。比利·赞恩、凯西·贝茨、弗朗西斯·费舍尔、格洛丽亚·斯图尔特、伯纳德·希尔、乔纳森·海德、维克多·加伯和比尔·帕克斯顿也参演了这部电影。\n')
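
The loop above collects lines and their source paths in parallel lists. A possible variant, sketched here under the assumption that blank lines should be skipped while reading, keeps the two lists aligned from the start. The function name `load_lines_with_sources` and the temporary demo files are hypothetical; the filter uses `line.strip()`, which also drops whitespace-only lines and so differs slightly from the `d != '\n'` filter used in this step.

```python
import tempfile
from pathlib import Path

def load_lines_with_sources(folder, pattern="*.txt"):
    """Return (lines, sources): non-blank lines and, for each, the file it came from."""
    lines, sources = [], []
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():          # skip blank lines up front
                    lines.append(line)
                    sources.append(str(path))
    return lines, sources

# Demo on a throwaway folder so the sketch runs without ./zh_data/.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("first\n\nsecond\n", encoding="utf-8")
    Path(tmp, "b.md").write_text("ignored\n", encoding="utf-8")
    demo_lines, demo_sources = load_lines_with_sources(tmp)
```

Because every kept line gets its source appended at the same moment, `demo_lines[i]` always belongs to `demo_sources[i]`.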

### Step 6: Create the vector store

In [6]:
from operator import itemgetter
from langchain.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain.text_splitter import CharacterTextSplitter
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import faiss
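# Split each document on spaces into chunks targeting 400 characters
text_splitter = CharacterTextSplitter(chunk_size=400, separator=" ")
docs = []
metadatas = []

# `sources` is aligned with `data`, so apply the same blank-line filter
# that produced `documents` to keep each document paired with its file
doc_sources = [s for d, s in zip(data, sources) if d != '\n']

for d, source in zip(documents, doc_sources):
    splits = text_splitter.split_text(d)
    docs.extend(splits)
    metadatas.extend([{"source": source}] * len(splits))

# Save the vector store (embeddings and index) locally
store = FAISS.from_texts(docs, embedder, metadatas=metadatas)
store.save_local('./embed')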
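
The cell above splits on spaces only. Chinese text contains few spaces, so a chunk may end up longer than `chunk_size`. As a rough illustration of the chunking idea, the sketch below cuts text into fixed-width pieces with optional overlap and builds one metadata entry per piece. It is a plain fixed-width splitter, not the algorithm used by `CharacterTextSplitter`; `chunk_text`, `texts`, and `origins` are hypothetical names.

```python
def chunk_text(text, chunk_size=400, overlap=0):
    """Split text into pieces of at most chunk_size characters.

    Consecutive pieces share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous piece.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

# Hypothetical inputs: two short "documents" and the file each came from.
texts = ["abcdefghij", "xyz"]
origins = ["a.txt", "b.txt"]

chunks, chunk_metadatas = [], []
for text, origin in zip(texts, origins):
    pieces = chunk_text(text, chunk_size=4, overlap=1)
    chunks.extend(pieces)
    # One metadata dict per chunk keeps the two lists aligned.
    chunk_metadatas.extend({"source": origin} for _ in pieces)
```

Here the first text yields three overlapping pieces and the second yields one, each paired with its own `source` entry.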

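**Note: the vector store created in the step above does not need to be rebuilt the next time you run this workflow; the code below loads the saved copy directly from disk.**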

In [7]:
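# Load the previously saved vector store from disk
# (recent langchain versions may also require allow_dangerous_deserialization=True here)
vecstores = [FAISS.load_local(folder_path="/home/nvidia/10th_hackathon/embed/", embeddings=embedder)]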
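
One way to avoid choosing between the two cells by hand is to check whether a saved index already exists. The sketch below assumes the index file is named `index.faiss` (the default name written by `save_local`) and uses stand-in callables rather than real FAISS calls so that it runs on its own; `load_or_build`, `fake_load`, and `fake_build` are hypothetical names.

```python
import tempfile
from pathlib import Path

def load_or_build(folder, load_fn, build_fn, index_file="index.faiss"):
    """Call load_fn(folder) if a saved index exists there, else build_fn(folder)."""
    if (Path(folder) / index_file).exists():
        return load_fn(folder)
    return build_fn(folder)

# Stand-ins for the real build and load steps, for demonstration only.
def fake_build(folder):
    (Path(folder) / "index.faiss").write_bytes(b"")
    return "built"

def fake_load(folder):
    return "loaded"

with tempfile.TemporaryDirectory() as tmp:
    first_run = load_or_build(tmp, fake_load, fake_build)    # nothing saved yet
    second_run = load_or_build(tmp, fake_load, fake_build)   # index file now exists
```

In this notebook, `load_fn` could wrap `FAISS.load_local` and `build_fn` could wrap `FAISS.from_texts` followed by `save_local`.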


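* `default_FAISS()` initializes an empty vector store; `convstore`, created from it, stores the conversation vectors.
* `aggregate_vstores()` merges the loaded vector stores into a single store, kept in the `docstore` variable.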

In [8]:
embed_dims = len(embedder.embed_query("test"))
def default_FAISS():
    '''Useful utility for making an empty FAISS vectorstore'''
    return FAISS(
        embedding_function=embedder,
        index=IndexFlatL2(embed_dims),
        docstore=InMemoryDocstore(),
        index_to_docstore_id={},
        normalize_L2=False
    )

def aggregate_vstores(vectorstores):
    ## Initialize an empty FAISS index and merge the other indexes into it
    agg_vstore = default_FAISS()
    for vstore in vectorstores:
        agg_vstore.merge_from(vstore)
    return agg_vstore

if 'docstore' not in globals():
    docstore = aggregate_vstores(vecstores)

print(f"Constructed aggregate docstore with {len(docstore.docstore._dict)} chunks")

Constructed aggregate docstore with 1838 chunks
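
To make the roles of the empty store, the merge, and the flat L2 index more concrete, here is a toy pure-Python analogue. It is only a sketch: the stores are plain lists of `(vector, text)` pairs with made-up values, merging is list concatenation, and search is a brute-force scan by squared Euclidean distance. The names `squared_l2`, `flat_search`, `store_a`, and `store_b` are hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(query, vectors, k=2):
    """Return the k (distance, index) pairs closest to query, nearest first."""
    scored = [(squared_l2(query, v), i) for i, v in enumerate(vectors)]
    return sorted(scored)[:k]

# Two toy stores of (vector, text) pairs.
store_a = [([0.0, 0.0], "origin"), ([1.0, 1.0], "diagonal")]
store_b = [([0.0, 2.0], "up")]

merged = []                      # start empty, in the spirit of default_FAISS()
for store in (store_a, store_b):
    merged.extend(store)         # loosely analogous to merge_from

vectors = [vec for vec, _ in merged]
hits = flat_search([0.0, 1.5], vectors, k=2)
nearest_texts = [merged[i][1] for _, i in hits]
```

A flat index compares the query against every stored vector, which is simple and exact but scales linearly with the number of chunks.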


In [9]:
llm = ChatNVIDIA(model="ai-nemotron-4-340b-instruct") | StrOutputParser()
convstore = default_FAISS()

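### Step 7: Build the RAG chain

* Automatic conversation storage: the `save_memory_and_get_output` function adds new entries to the conversation store.
* `docs2str`: converts retrieved text chunks into a single context string.
* The prompt and its structure.
* Build `retrieval_chain`.
* Define the chat generation function `chat_gen()`.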
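
The retrieval chain in the next cell starts from a dictionary holding the user input and then adds a `history` key and a `context` key, each computed from that input. A plain-Python analogue of this pattern is sketched below. The `assign` helper and the word-overlap `fake_retriever` are hypothetical stand-ins, not LangChain APIs, and the stored sentences are invented for the demo.

```python
def assign(state, key, fn):
    """Return a copy of state with state[key] = fn(state); the input is not mutated."""
    new_state = dict(state)
    new_state[key] = fn(state)
    return new_state

def fake_retriever(corpus):
    """Stand-in retriever: keep entries that share at least one word with the query."""
    def retrieve(query):
        words = set(query.lower().split())
        return [text for text in corpus if words & set(text.lower().split())]
    return retrieve

history_store = ["user asked about the ship", "agent described the iceberg"]
document_store = ["the ship sank in 1912", "the film was released in 1997"]

state = {"input": "ship 1912"}
state = assign(state, "history",
               lambda s: "\n".join(fake_retriever(history_store)(s["input"])))
state = assign(state, "context",
               lambda s: "\n".join(fake_retriever(document_store)(s["input"])))
```

After both steps, `state` holds the three keys that the prompt template expects: `input`, `history`, and `context`.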

In [10]:
doc_names_string = "\n"
for doc_name in docs_name:
    doc_names_string += doc_name+"\n"
    
def save_memory_and_get_output(d, vstore):
    """Accepts 'input'/'output' dictionary and saves to convstore"""
    vstore.add_texts([
        f"User previously responded with {d.get('input')}",
        f"Agent previously responded with {d.get('output')}"
    ])
    return d.get('output')

initial_msg = (
    "Hello! I am a document chat agent here to help the user!"
    f" I have access to the following documents: {doc_names_string}\n\nHow can I help you?"
)

chat_prompt = ChatPromptTemplate.from_messages([("system",
    "You are a document chatbot. Help the user as they ask questions about documents."
    " The user just asked: {input}\n\n"
    " From this, we have retrieved the following potentially useful info: "
    " Conversation History Retrieval:\n{history}\n\n"
    " Document Retrieval:\n{context}\n\n"
    " (Answer only from retrieval. Only cite sources that are used. Make your response conversational. The reply must be more than 100 words.)"
), ('user', '{input}')])

## Utility Runnables/Methods
def RPrint(preface=""):
    """Simple passthrough "prints, then returns" chain"""
    def print_and_return(x, preface):
        print(f"{preface}{x}")
        return x
    return RunnableLambda(partial(print_and_return, preface=preface))

def docs2str(docs, title="Document"):
    """Useful utility for making chunks into context string. Optional, but useful"""
    out_str = ""
    for doc in docs:
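        meta = getattr(doc, 'metadata', {})
        # Chunks in this notebook carry a 'source' key, so fall back to it when 'Title' is absent
        doc_name = meta.get('Title') or meta.get('source', title)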
        if doc_name:
            out_str += f"[Quote from {doc_name}] "
        out_str += getattr(doc, 'page_content', str(doc)) + "\n"
    return out_str

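## LongContextReorder places the most relevant documents at the start and end of the list and the least relevant in the middle; RunnableLambda wraps the function so it can run as a step in the chain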
long_reorder = RunnableLambda(LongContextReorder().transform_documents)

retrieval_chain = (
    {'input' : (lambda x: x)}
    | RunnableAssign({'history' : itemgetter('input') | convstore.as_retriever() | long_reorder | docs2str})
    | RunnableAssign({'context' : itemgetter('input') | docstore.as_retriever()  | long_reorder | docs2str})
    | RPrint()
)
stream_chain = chat_prompt | llm

def chat_gen(message, history=[], return_buffer=True):
    buffer = ""
    ## First, run retrieval on the input message
    retrieval = retrieval_chain.invoke(message)
    line_buffer = ""

    ## Then stream the results of stream_chain
    for token in stream_chain.stream(retrieval):
        buffer += token
        ## Wrap long lines when tokens are printed one at a time
        if not return_buffer:
            line_buffer += token
            if "\n" in line_buffer:
                line_buffer = ""
            if ((len(line_buffer)>84 and token and token[0] == " ") or len(line_buffer)>100):
                line_buffer = ""
                yield "\n"
                token = "  " + token.lstrip()
        yield buffer if return_buffer else token

    ## Finally, save the exchange to the conversation memory store
    save_memory_and_get_output({'input':  message, 'output': buffer}, convstore)
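
The cell above passes retrieved documents through `LongContextReorder` before formatting them. The idea is that models tend to use information at the beginning and end of a long context better than information in the middle. The sketch below illustrates one reordering with that property; it is an approximation written for this explanation, not a copy of the library's code, and `reorder_for_long_context` is a hypothetical name.

```python
def reorder_for_long_context(docs):
    """Given docs sorted most-relevant-first, move the best ones to both ends."""
    reordered = []
    # Walk from least to most relevant, alternating between front and back,
    # so the most relevant items are placed last and end up at the edges.
    for i, doc in enumerate(reversed(docs)):
        if i % 2 == 1:
            reordered.append(doc)
        else:
            reordered.insert(0, doc)
    return reordered

ranked = ["d1", "d2", "d3", "d4", "d5"]   # d1 is the most relevant
result = reorder_for_long_context(ranked)
```

With five ranked documents, the two most relevant land at the two ends and the least relevant sits in the middle.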

### Step 8: Build the RAG chatbot front end with Gradio and interact with it

In [11]:
chatbot = gr.Chatbot(value = [[None, initial_msg]])
demo = gr.ChatInterface(chat_gen, chatbot=chatbot).queue()

try:
    demo.launch(debug=True, share=False, show_api=False, server_port=5000, server_name="0.0.0.0")
    demo.close()
except Exception as e:
    demo.close()
    print(e)
    raise e

Running on local URL:  http://0.0.0.0:5000

To create a public link, set `share=True` in `launch()`.


Keyboard interruption in main thread... closing server.
Closing server running on port: 5000
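
`chat_gen` is a generator, and with `return_buffer=True` it yields the whole reply accumulated so far on every step, which is the form a streaming chat UI can display directly. The stripped-down sketch below shows the two modes side by side. The token list is invented and `stream_reply` is a hypothetical name; no model or Gradio app is involved.

```python
def stream_reply(tokens, return_buffer=True):
    """Yield either the accumulated text (for a UI) or each token (for a console)."""
    buffer = ""
    for token in tokens:
        buffer += token
        yield buffer if return_buffer else token

tokens = ["The ", "ship ", "sank."]
ui_updates = list(stream_reply(tokens))                            # growing message
console_tokens = list(stream_reply(tokens, return_buffer=False))   # raw tokens
```

Each UI update replaces the previous one, so the displayed message grows token by token, while the console mode emits only the new piece each time.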
