# The Basic RAG Pipeline

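### Choosing a framework
- The basic version is developed with the langchain framework, with the aim of understanding langchain's strengths and weaknesses.

- langchain offers many components and tools and is very popular. As product requirements grow more complex, however, langchain becomes inflexible: it deliberately abstracts away many details, which makes it harder to understand, and its nested abstractions come at the cost of simplicity and flexibility.
- A further drawback for development: it has no interface for external monitoring.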

[Why not use langchain](https://www.octomind.dev/blog/why-we-no-longer-use-langchain-for-building-our-ai-agents)
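
For comparison, external monitoring is easy to add when the model call is a plain function. The following is a minimal sketch, not langchain's API: `monitored_call`, `fake_llm`, and the event fields are hypothetical names chosen for illustration. It wraps an arbitrary LLM callable so that every call reports its latency and message count to an external hook.

```python
import time
from typing import Callable, Dict, List

Message = Dict[str, str]  # hypothetical shape: {"role": ..., "content": ...}


def monitored_call(
    llm: Callable[[List[Message]], str],
    messages: List[Message],
    on_event: Callable[[dict], None],
) -> str:
    """Call `llm` and report basic metrics to `on_event` (a hypothetical hook)."""
    start = time.perf_counter()
    reply = llm(messages)
    on_event(
        {
            "num_messages": len(messages),
            "reply_chars": len(reply),
            "latency_s": time.perf_counter() - start,
        }
    )
    return reply


def fake_llm(messages: List[Message]) -> str:
    """Stand-in for a real model call; it just echoes the last message."""
    return "echo: " + messages[-1]["content"]


events: List[dict] = []
reply = monitored_call(fake_llm, [{"role": "user", "content": "hello"}], events.append)
print(reply)
print(events[0]["num_messages"], events[0]["reply_chars"])
```

Any callable can serve as the hook, for example one that writes to a log file or pushes to a metrics service.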

## The LLM

#### Loading a model through modelscope

In [None]:
from modelscope import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Note: The default behavior now has injection attack prevention off.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-1_8B-Chat", revision='master', trust_remote_code=True)

# use bf16
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-1_8B-Chat", device_map="auto", trust_remote_code=True, bf16=True).eval()
# use fp16
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-1_8B-Chat", device_map="auto", trust_remote_code=True, fp16=True).eval()
# use cpu only
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-1_8B-Chat", device_map="cpu", trust_remote_code=True).eval()
# use auto mode, automatically select precision based on the device.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-1_8B-Chat", revision='master', device_map="auto", trust_remote_code=True).eval()

# Specify hyperparameters for generation. But if you use transformers>=4.32.0, there is no need to do this.
# model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-1_8B-Chat", trust_remote_code=True) # generation length, top_p, and other hyperparameters can be set here

# 1st dialogue turn: greet the model ("你好" means "Hello")
response, history = model.chat(tokenizer, "你好", history=None)
print(response)
# 你好！很高兴为你提供帮助。

# 2nd dialogue turn: ask for a story about a young person who works hard, starts a business, and eventually succeeds
response, history = model.chat(tokenizer, "给我讲一个年轻人奋斗创业最终取得成功的故事。", history=history)
print(response)
# 这是一个关于一个年轻人奋斗创业最终取得成功的故事。
# 故事的主人公叫李明，他来自一个普通的家庭，父母都是普通的工人。从小，李明就立下了一个目标：要成为一名成功的企业家。
# 为了实现这个目标，李明勤奋学习，考上了大学。在大学期间，他积极参加各种创业比赛，获得了不少奖项。他还利用课余时间去实习，积累了宝贵的经验。
# 毕业后，李明决定开始自己的创业之路。他开始寻找投资机会，但多次都被拒绝了。然而，他并没有放弃。他继续努力，不断改进自己的创业计划，并寻找新的投资机会。
# 最终，李明成功地获得了一笔投资，开始了自己的创业之路。他成立了一家科技公司，专注于开发新型软件。在他的领导下，公司迅速发展起来，成为了一家成功的科技企业。
# 李明的成功并不是偶然的。他勤奋、坚韧、勇于冒险，不断学习和改进自己。他的成功也证明了，只要努力奋斗，任何人都有可能取得成功。

# 3rd dialogue turn: ask for a title for the story
response, history = model.chat(tokenizer, "给这个故事起一个标题", history=history)
print(response)
# 《奋斗创业：一个年轻人的成功之路》

# Qwen-1.8B-Chat can do role playing, language style transfer, task setting, and behavior setting through the system prompt.
# Here the system prompt asks the model to reply in a cute anime-style tone.
response, _ = model.chat(tokenizer, "你好呀", history=None, system="请用二次元可爱语气和我说话")
print(response)
# 你好啊！我是一只可爱的二次元猫咪哦，不知道你有什么问题需要我帮忙解答吗？

response, _ = model.chat(tokenizer, "My colleague works diligently", history=None, system="You will write beautiful compliments according to needs")
print(response)
# Your colleague is an outstanding worker! Their dedication and hard work are truly inspiring. They always go above and beyond to ensure that 
# their tasks are completed on time and to the highest standard. I am lucky to have them as a colleague, and I know I can count on them to handle any challenge that comes their way.

#### Loading a model through openai

In [None]:
# The openai messages format differs from langchain's
from openai import OpenAI
from dotenv import load_dotenv, find_dotenv

# Load the .env file to obtain the API key
load_dotenv(find_dotenv())

client = OpenAI()


completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "你是一个心理咨询师"},  # "You are a psychological counselor"
    {"role": "user", "content": "你好"}  # "Hello"
  ]
)

print(completion.choices[0].message)

#### Loading openai through langchain

In [5]:
# Create the LLM with langchain
import os
from langchain.chat_models import ChatOpenAI
from dotenv import load_dotenv,find_dotenv

load_dotenv(find_dotenv())  # find_dotenv() locates the .env file; load_dotenv() loads its variables into the environment

# Print the environment variable loaded by dotenv
# print(f"OPENAI_API_KEY: {os.environ['OPENAI_API_KEY']}")

chat = ChatOpenAI(openai_api_key = os.environ['OPENAI_API_KEY'],
                  model='gpt-3.5-turbo'
                  )

In [11]:
# langchain's message format is as follows
from langchain.schema import SystemMessage,HumanMessage,AIMessage

# The messages list can be thought of as the conversation memory
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
    AIMessage(content="Paris is the capital of France."),
    HumanMessage(content="What is the capital of Germany?"),
]

res = chat(messages)
print(res)

content='The capital of Germany is Berlin.' response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 46, 'total_tokens': 53}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-32393f06-abb0-488f-ad6d-3e0a2469215c-0'
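
The openai and langchain message formats carry the same information: openai uses plain dicts with a `role` key, while langchain uses message classes. Below is a minimal sketch of the mapping between them. `to_openai_messages` and `Msg` are hypothetical names; `Msg` is only a stand-in so the example runs without langchain installed, relying on the `type` and `content` attributes that langchain's `SystemMessage`, `HumanMessage`, and `AIMessage` expose.

```python
from collections import namedtuple

# Stand-in for langchain message objects (assumption: only `type` and `content` are needed).
Msg = namedtuple("Msg", ["type", "content"])

# langchain message type -> openai role
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


def to_openai_messages(messages):
    """Convert langchain-style messages into openai-style role/content dicts."""
    return [{"role": ROLE_BY_TYPE[m.type], "content": m.content} for m in messages]


history = [
    Msg("system", "You are a helpful assistant."),
    Msg("human", "What is the capital of France?"),
    Msg("ai", "Paris is the capital of France."),
]
openai_messages = to_openai_messages(history)
for item in openai_messages:
    print(item)
```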


In [21]:
# res is itself an AIMessage, so it can be appended to messages directly and serve as context for the next response
messages.append(res)
res = chat(messages)

res.content

'Is there anything else you would like to know?'
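
Appending each reply to `messages` is what gives the model memory of earlier turns. Here is a minimal sketch of one complete turn: append the user message, call the model on the full history, append the reply. It uses plain dicts and a stand-in `fake_llm`; both, along with `chat_turn`, are hypothetical and not part of langchain.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(history: List[Message], user_text: str, llm: Callable[[List[Message]], str]) -> str:
    """Append the user message, call the model on the full history, append the reply."""
    history.append({"role": "user", "content": user_text})
    reply = llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_llm(messages: List[Message]) -> str:
    """Stand-in model: reports how many user turns it can see."""
    turns = sum(1 for m in messages if m["role"] == "user")
    return f"I have seen {turns} user message(s)."


history: List[Message] = [{"role": "system", "content": "You are a helpful assistant."}]
first = chat_turn(history, "What is the capital of France?", fake_llm)
second = chat_turn(history, "What is the capital of Germany?", fake_llm)
print(first)
print(second)
print(len(history))
```

Because the whole history is resent on every call, the prompt grows with each turn, which is worth keeping in mind for long conversations.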

In [26]:
# Ask "Who is Sun Xingzhe?" with no extra context; the model answers from its own knowledge
messages.append(HumanMessage(content="孙行者是谁？"))
chat(messages).content

'孙行者是《西游记》中的主要人物之一，也被称为孙悟空。他是一位有着超凡能力的猴子，为了保护唐僧师徒取经而展开了一系列冒险旅程。'

In [27]:
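# Ask the same question, this time with context placed in the prompt.
# The template says "answer the question based on the following content"; the content states
# that Sun Xingzhe is from Sichuan, works as an algorithm engineer, and graduated from Sichuan University.
query = '孙行者是谁？'

prompt_template = f'''基于以下内容回答问题:

内容：
孙行者来自四川，职业是算法工程师。毕业于四川大学。

Query:{query}
'''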

prompt = HumanMessage(content=prompt_template)
messages.append(prompt)
chat(messages)

AIMessage(content='根据提供的内容，孙行者是一位来自四川的算法工程师，毕业于四川大学。', response_metadata={'token_usage': {'completion_tokens': 38, 'prompt_tokens': 137, 'total_tokens': 175}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f7b169f0-97d1-4cd7-a4d8-37c4a64b70e8-0')

## Building a RAG chat model

### 1. Loading the data

In [5]:
# Supported file formats include PDF, Word, and Excel
from langchain.document_loaders import PyPDFLoader, Docx2txtLoader, UnstructuredFileLoader, TextLoader

# Load the file; both remote and local files can be read
loader = PyPDFLoader("./data/Generative Agents- Interactive Simulacra of Human Behavior.pdf") # the paper behind the Stanford Town (Generative Agents) project

pages = loader.load_and_split()
pages[0]

[Document(metadata={'source': './data/Generative Agents- Interactive Simulacra of Human Behavior.pdf', 'page': 0}, page_content='Generative Agents: Interactive Simulacra of Human Behavior\nJoon Sung Park\nStanford University\nStanford, USA\njoonspk@stanford.eduJoseph C. O’Brien\nStanford University\nStanford, USA\njobrien3@stanford.eduCarrie J. Cai\nGoogle Research\nMountain View, CA, USA\ncjcai@google.com\nMeredith Ringel Morris\nGoogle DeepMind\nSeattle, WA, USA\nmerrie@google.comPercy Liang\nStanford University\nStanford, USA\npliang@cs.stanford.eduMichael S. Bernstein\nStanford University\nStanford, USA\nmsb@cs.stanford.edu\nFigure 1: Generative agents are believable simulacra of human behavior for interactive applications. In this work, we demonstrate\ngenerative agents by populating a sandbox environment, reminiscent of The Sims, with twenty-five agents. Users can observe\nand intervene as agents plan their days, share news, form relationships, and coordinate group activities.\nA

### 2. Chunking: split the documents into evenly sized chunks, each a span of the original text

In [32]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap  = 50,
)

docs = text_splitter.split_documents(pages)

len(docs)

306
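
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch in plain Python. It is a simplification and not how `RecursiveCharacterTextSplitter` is implemented: the langchain splitter first tries to break on separators such as paragraphs, lines, and spaces, so its chunks are not exactly uniform. `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `chunk_overlap` characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample, chunk_size=500, chunk_overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated text.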

### 3. Embed each chunk with an embedding model and store the vectors in a vector database

In [35]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embed_model = OpenAIEmbeddings()
vector_store = Chroma.from_documents(documents=docs, embedding=embed_model,collection_name="my_collection")
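
Conceptually, this step turns every chunk into a fixed-length vector and keeps the (text, vector) pairs for later lookup. The sketch below illustrates that idea only. `toy_embed` and `ToyVectorStore` are hypothetical stand-ins, not the OpenAI or Chroma APIs, and the hashing-based embedding captures word overlap rather than meaning.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 64  # toy dimensionality, chosen arbitrarily for this sketch


def toy_embed(text: str) -> List[float]:
    """Hash each word into one of DIM buckets and L2-normalise the counts."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class ToyVectorStore:
    """Keeps (text, vector) pairs in a list; a real store would persist and index them."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, List[float]]] = []

    def add_texts(self, texts: List[str]) -> None:
        for text in texts:
            self.items.append((text, toy_embed(text)))


store = ToyVectorStore()
store.add_texts(["generative agents simulate human behavior", "agents plan their days"])
print(len(store.items), len(store.items[0][1]))
```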

### 4. Retrieve the k documents most relevant to the question by vector similarity

In [43]:
query = 'what is agents'
result = vector_store.similarity_search(query,k=2)
result

[Document(metadata={'page': 0, 'source': './data/Generative Agents- Interactive Simulacra of Human Behavior.pdf'}, page_content='applications ranging from immersive environments to rehearsal\nspaces for interpersonal communication to prototyping tools. In\nthis paper, we introduce generative agents: computational software\nagents that simulate believable human behavior. Generative agents\nwake up, cook breakfast, and head to work; artists paint, while\nPermission to make digital or hard copies of part or all of this work for personal or\nclassroom use is granted without fee provided that copies are not made or distributed'),
 Document(metadata={'page': 1, 'source': './data/Generative Agents- Interactive Simulacra of Human Behavior.pdf'}, page_content='other agents to the party, attendees must remember the invitation,\nthose who remember must decide to actually show up, and more—\nour agents succeed. They spread the word about the party and then\n1When referring to generative agents eng
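
Under the hood, a similarity search scores the query vector against every stored vector and returns the best k. Below is a minimal sketch assuming cosine similarity as the metric; the metric Chroma actually applies depends on how the collection is configured. `cosine_similarity` and `top_k` are hypothetical helpers, and the two-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 2) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


doc_vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
hits = top_k([1.0, 0.1], doc_vectors, k=2)
print(hits)
```

This brute-force scan is linear in the number of stored vectors; vector databases use approximate indexes to avoid scanning everything.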

### 5. Combine the original query with the retrieved text and feed it to the language model to get the final answer

In [44]:
def augment_prompt(query:str):
    # Retrieve the top-3 most similar chunks
    result = vector_store.similarity_search(query, k=3)
    # Join the chunks with newlines
    source_knowledge = '\n'.join([x.page_content for x in result])
    # Build the prompt
    augmented_prompt = f'''using the context below, answer the question at the end.
    context:
    {source_knowledge}
    question:
    {query}
    '''
    return augmented_prompt

In [45]:
print(augment_prompt(query))

using the context below, answer the question at the end.
    context:
    applications ranging from immersive environments to rehearsal
spaces for interpersonal communication to prototyping tools. In
this paper, we introduce generative agents: computational software
agents that simulate believable human behavior. Generative agents
wake up, cook breakfast, and head to work; artists paint, while
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
other agents to the party, attendees must remember the invitation,
those who remember must decide to actually show up, and more—
our agents succeed. They spread the word about the party and then
1When referring to generative agents engaging in actions or going to places, this is a
shorthand for readability and not a suggestion that they are engaging in human-like
agency. The behaviors of our agents, akin to animated Disney cha
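
The prompt-building step can also be tried without a vector store. The following sketch is a variant that takes the retrieved chunks as plain strings; `build_prompt` is a hypothetical name. It assembles the prompt line by line, so the indentation of the source code does not leak into the prompt text, and it separates chunks with a blank line so the model can tell where one ends and the next begins.

```python
from typing import List


def build_prompt(query: str, chunks: List[str]) -> str:
    """Join retrieved chunks with blank lines and wrap them in a fixed prompt layout."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Using the context below, answer the question at the end.\n"
        "Context:\n"
        f"{context}\n"
        "Question:\n"
        f"{query}\n"
    )


prompt_text = build_prompt("what is agents", ["first chunk ", "second chunk"])
print(prompt_text)
```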

In [46]:
# Create the prompt message

prompt = HumanMessage(
    content = augment_prompt(query)
)

messages.append(prompt)

res = chat(messages)

print(res.content)

Based on the provided context, "agents" refer to computational software agents that simulate believable human behavior. These generative agents engage in various actions and behaviors, such as waking up, cooking breakfast, going to work, painting, and interacting with others in a simulated environment. They are designed to exhibit individual and emergent group behavior, drawing inferences, creating daily plans, reacting to changes in their environment or user commands, and re-planning as needed.


### Several ways to use an LLM

In [None]:
modelname = ''

