## Applications of LangChain (1)
Contents:
1. LangChain overview
2. Prompt templates
3. Models and output parsers

### 1. What is LangChain, and why do we need it?
- Question: what would happen without LangChain?
- Answer:

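A single project may involve:
- Calls to several different large models (GPT-4, video generation, ...)
- A vector database
- Data handling (loading, splitting into chunks, ...)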

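- LangChain is a framework for building applications on top of large language models.
- LangChain evolves quickly. This course was prepared with version 0.1.7; the syntax and interfaces may change at any time, so check the official documentation.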

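#### Core components of LangChain
- ```Model I/O```: wrappers for large language models (LLMs), chat models, prompt templates, output parsers, and so on
- ```Retrieval```: document loaders, embedding models, text splitters, vector stores, retrievers, and so on
- ```Chain```: implements one function, or a series of functions run sequentially
- ```Agent```: given the user's input and a set of available tools, automatically plans the execution steps (for example, which tool to call at each step) and carries out the user's instruction
- ```Memory```: management of the model's memory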

#### Plan for the LangChain sessions
1. LangChain (1) - LangChain overview, Model I/O
2. LangChain (2) - Retrieval, Chain, Agent, and Memory components
3. LangChain (3) - Advanced RAG + LangChain
4. LangChain (4) - Agents
5. LangChain (5) - Walkthrough of classic open-source Agent projects
6. LangChain (6) - Classic Agent case studies

In [49]:
# Install the required libraries; by default we install the latest versions
#!pip install langchain
#!pip install openai
#!pip install langchain-openai

# After installing, you can check the version number
# import openai
# print(openai.__version__)
# !python -m pip install python-dotenv

In [89]:
# Load the OpenAI API key
import os
from dotenv import load_dotenv, find_dotenv

# The API key is stored in a .env file
load_dotenv()

True
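
`ChatOpenAI` reads the key from the `OPENAI_API_KEY` environment variable by default, so it can help to confirm the variable is set before going further. A minimal sketch; the helper name `has_api_key` is hypothetical and not part of any library:

```python
import os

def has_api_key(name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the environment variable is set to a non-empty value."""
    return bool(os.environ.get(name, "").strip())

if not has_api_key():
    print("OPENAI_API_KEY is not set; check your .env file")
```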

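### 2. A quick overview of LangChain
Here we take a quick tour of LangChain's components. Make sure the required libraries are installed.

There are two kinds of models:
- ```non-chat model```: used for text completion; given the start of a text, it completes the rest
- ```chat model```: used for chat; a conversational model that can carry on a fluent dialogue

We focus mainly on chat models.

```A. Calling a model```

LangChain already wraps many models, both open-source and closed-source.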

In [90]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
#llm = ChatOpenAI(model_name="gpt-4")

In [91]:
llm.model_name

'gpt-3.5-turbo'

In [53]:
# Pass a question directly and call the llm
llm.invoke("What is the Sora model?")

AIMessage(content='The Sora model is a framework developed by Richard Culatta that focuses on personalized learning and the integration of technology in education. It stands for Social learning, Ownership of learning, Reflection, and Authentic learning experiences. This model emphasizes the importance of students taking ownership of their learning, engaging in social interactions with peers and experts, reflecting on their learning experiences, and participating in authentic, real-world tasks. The Sora model aims to create a more student-centered and engaging learning environment.')

```B. Using a prompt template```

A prompt can contain variables, which makes prompt construction more flexible.

In [92]:
# We can also create a prompt template and put variables into it, which makes it more flexible to reuse

from langchain_core.prompts import ChatPromptTemplate

# Note that each message needs an explicit role; here the roles are system and user
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are the technical writer"),
    ("user", "{input}")  # {input} is a variable
])

In [93]:
# We can combine the prompt and the LLM call through a chain (a chain can be understood as a sequence of calls)
chain = prompt | llm 
chain.invoke({"input": "What is the Sora model?"})

AIMessage(content='The SORA model is a structured approach used in the risk assessment of cybersecurity threats and vulnerabilities. SORA stands for "Secure-ly Orchestrated, Reliable, and Adaptive." It is a methodology developed by the European Union Aviation Safety Agency (EASA) for assessing the safety risks associated with drone operations in the context of U-space (unmanned aircraft systems airspace management). \n\nThe SORA model consists of several steps, including defining the operational context, identifying potential hazards, assessing risks, and defining mitigation measures. It aims to ensure the safe integration of drones into airspace by evaluating the risks and implementing appropriate safety measures.\n\nOverall, the SORA model provides a systematic framework for assessing cybersecurity risks in drone operations and developing strategies to mitigate these risks effectively.')

In [94]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()  # returns the output as a plain string
chain = prompt | llm | output_parser
chain.invoke({"input": "What is the Sora model?"})

'The SORA model is a structured approach used in the risk assessment of cybersecurity threats and vulnerabilities. SORA stands for "Secure-ly Orchestrated, Reliable, and Adaptive." It is a methodology developed by the European Union Aviation Safety Agency (EASA) for assessing the safety risks associated with drone operations in the context of U-space (unmanned aircraft systems airspace management). \n\nThe SORA model consists of several steps, including defining the operational context, identifying potential hazards, assessing risks, and defining mitigation measures. It aims to ensure the safe integration of drones into airspace by evaluating the risks and implementing appropriate safety measures.\n\nOverall, the SORA model provides a systematic framework for assessing cybersecurity risks in drone operations and developing strategies to mitigate these risks effectively.'
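
To build intuition for the `|` syntax, here is a plain-Python sketch of piping steps together. It is not LangChain's implementation; the `Step` class, `fake_llm`, and the other names are hypothetical stand-ins that only mimic the flow from prompt to model to parser.

```python
from typing import Any, Callable

class Step:
    """Wraps a function so that steps can be chained with the | operator."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # The combined step feeds this step's output into the next step.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

# Hypothetical stand-ins for a prompt template, a model, and an output parser.
fill_prompt = Step(lambda d: f"You are the technical writer.\nUser: {d['input']}")
fake_llm = Step(lambda text: {"content": f"(model reply to: {text.splitlines()[-1]})"})
to_string = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_llm | to_string
print(toy_chain.invoke({"input": "What is the Sora model?"}))
```

Each `|` returns a new step, so a chain of any length is still a single object with one `invoke` method.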

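```Question```: The model does not really understand Sora. Why, and how can we fix it?

The most likely reason is that Sora was announced after the model's training data was collected, so the model guesses. One fix is RAG: fetch up-to-date content about Sora from the web and supply it to the model.

```C. RAG + LangChain```

Augment the model's answer with external knowledge.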

In [57]:
# !pip install beautifulsoup4

In [58]:
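#  Use the Sora technical report to generate a better answer, in three steps:
#  Step 1: find source material about Sora and fetch its content
#  Step 2: split the material into chunks and store them in a vector database
#  Step 3: for a new question, first retrieve chunks from the vector store, then merge them into the LLM prompt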

from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://openai.com/research/video-generation-models-as-world-simulators")
docs = loader.load()

In [59]:
#  Use OpenAI embeddings
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [60]:
#!pip install faiss-cpu

In [61]:
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Use RecursiveCharacterTextSplitter; its algorithm was covered in the session before the Spring Festival
text_splitter = RecursiveCharacterTextSplitter()

# Split docs into chunks; there is only one doc here because we fetched a single page
documents = text_splitter.split_documents(docs)

# Store the chunks in the vector database; OpenAIEmbeddings converts each chunk into a vector
vector = FAISS.from_documents(documents, embeddings)

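1. Given an input, search the vector database for similar documents (chunks)
2. Add the documents to the prompt (via a prompt template variable such as {context})
3. Call the LLM with the prompt; the LLM returns a response (the answer)
4. Pass the response through the output parser to get the formatted result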
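
The retrieval flow can be sketched without any external service. This toy version replaces the embedding model with word counts and the vector store with a list, so it only illustrates the data flow; `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers, not LangChain APIs.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (a real system uses a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Step 1: return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Step 2: place the retrieved chunks into the context slot of the prompt."""
    context = "\n".join(context_chunks)
    return (
        "Answer the following question based only on the provided context:\n\n"
        f"<context>\n{context}\n</context>\n\nQuestion: {question}"
    )

chunks = [
    "Sora is a video generation model.",
    "FAISS is a library for vector similarity search.",
]
question = "What is the Sora model?"
top = retrieve(question, chunks)
# Steps 3 and 4 would send this prompt to the LLM and parse the reply.
print(build_prompt(question, top))
```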

In [62]:
# This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM.
from langchain.chains.combine_documents import create_stuff_documents_chain

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

In [63]:
from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [95]:
response = retrieval_chain.invoke({"input": "What is the Sora model?"})
print(response["answer"])

The Sora model is a video generation model that can simulate actions, interact with the world, and generate images and videos based on various prompts and inputs. It is capable of maintaining both short- and long-range dependencies, simulating artificial processes like video games, and generating high-fidelity videos and images of variable durations, resolutions, and aspect ratios.


```D. Agent```

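Non-agent: for a given task, we lay out steps 1, 2, 3, 4 explicitly. Every step is clear and fixed in advance, including which model to call and how to call it.

Agent: for more complex tasks.

An analogy, developing an app:

The project lead breaks the work into tasks and assigns each task to a person in a different role.

Suppose some tools are available in advance:
- A video editing tool
- A tool that converts video into animation
- An image generation tool
- An animated video generation tool
- A TTS tool
- A GPT tool (input, output)
- A calculator tool (input, output): used for arithmetic such as addition, subtraction, multiplication, and division
- A programming tool (input, output)
- A storyboard tool (input, output)
- A tool that turns a list of images into a video


Task: automatically produce a short animated video
1. GPT tool: generate the script (input, output)
2. Storyboard tool: split the long script into separate shots
3. Generate an image for each shot: image generation tool
4. Convert the images into a video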
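
A rough sketch of the idea in plain Python. The tools here are trivial stand-ins and the planner is a hard-coded function; in a real agent the LLM would choose the next tool from the tool descriptions. All names (`TOOLS`, `plan`, `run_agent`) are hypothetical.

```python
from typing import Callable

# Tool registry: name -> (description, function). Each tool maps input to output.
TOOLS: dict[str, tuple[str, Callable[[str], str]]] = {
    "script": ("Write a script for a topic", lambda topic: f"script about {topic}"),
    "storyboard": ("Split a script into shots", lambda s: f"shots from [{s}]"),
    "image": ("Generate one image per shot", lambda s: f"images for [{s}]"),
    "video": ("Turn a list of images into a video", lambda s: f"video from [{s}]"),
}

def plan(task: str) -> list[str]:
    """Stand-in for the LLM planner: returns the tool names to call, in order."""
    if "video" in task:
        return ["script", "storyboard", "image", "video"]
    return ["script"]

def run_agent(task: str, topic: str) -> str:
    """Run the planned tools in sequence, feeding each output into the next tool."""
    result = topic
    for name in plan(task):
        _description, tool = TOOLS[name]
        result = tool(result)
    return result

print(run_agent("make a short animated video", "a cat astronaut"))
```

The non-agent version would hard-code the four calls; the agent version keeps the tools in a registry and lets a planner decide which ones to run.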



In [65]:
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "Sora",
    "Search for information about Sora. For any questions about Sora, you must use this tool!",
)

In [66]:
tools = [retriever_tool]

In [67]:
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor


prompt = hub.pull("hwchase17/openai-functions-agent")
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

In [96]:
agent_executor.invoke({"input": "What is the Sora model?"})



> Entering new AgentExecutor chain...

Invoking: `Sora` with `{'query': 'What is the Sora model?'}`


consistently through three-dimensional space.Long-range coherence and object permanence. A significant challenge for video generation systems has been maintaining temporal consistency when sampling long videos. We find that Sora is often, though not always, able to effectively model both short- and long-range dependencies. For example, our model can persist people, animals and objects even when they are occluded or leave the frame. Likewise, it can generate multiple shots of the same character in a single sample, maintaining their appearance throughout the video.Interacting with the world. Sora can sometimes simulate actions that affect the state of the world in simple ways. For example, a painter can leave new strokes along a canvas that persist over time, or a man can eat a burger and leave bite marks.Simulating digital worlds. Sora is also abl

}Prompting with images and videosAll of the results above and in our landing page show text-to-video samples. But Sora can also be prompted with other inputs, such as pre-existing images or video. This capability enables Sora to perform a wide range of image and video editing tasks—creating perfectly looping video, animating static images, extending videos forwards or backwards in time, etc.Animating DALL·E imagesSora is capable of generating videos provided an image and prompt as input. Below we show example videos generated based on DALL·E 2[^31] and DALL·E 3[^30] images.A Shiba Inu dog wearing a beret and black turtleneck.Monster Illustration in flat design style of a diverse family of monsters. The group includes a furry brown monster, a sleek black monster with antennas, a spotted green monster, and a tiny polka-dotted monster, all interacting in a playful environment.An image of a realistic cloud that spells “SORA”.In an ornate, historical hall, a massive tidal wave peaks and beg

{'input': 'What is the Sora model?',
 'output': "The Sora model is a large-scale generative model trained on video data. It's a text-conditional diffusion model that operates on spacetime patches of video and image latent codes. Sora can generate videos and images of variable durations, resolutions, and aspect ratios, up to a full minute of high-definition video.\n\nKey features of the Sora model include:\n\n1. **3D Consistency**: Sora can generate videos with dynamic camera motion. As the camera shifts and rotates, people and scene elements move consistently through three-dimensional space.\n\n2. **Long-range Coherence and Object Permanence**: Sora can effectively model both short- and long-range dependencies. For example, it can persist people, animals, and objects even when they are occluded or leave the frame.\n\n3. **Interacting with the World**: Sora can simulate actions that affect the state of the world in simple ways. For example, a painter can leave new strokes along a canvas

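### 3. PromptTemplate and ChatPromptTemplate

```Question```: what is the difference between the two?

In short, `PromptTemplate` formats a single string for completion-style LLMs, while `ChatPromptTemplate` formats a list of role-tagged messages for chat models.

```A. for LLM (base)```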

In [97]:
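from langchain.prompts import PromptTemplate

# Template text: "Write a Xiaohongshu promotional post about {主题}, in a {风格} tone"
# (主题 = topic, 风格 = style)
prompt_template = PromptTemplate.from_template(
    "编写一段关于{主题}的小红书宣传文案，需要采用{风格}语气"
)
prompt_template.format(主题="美国留学", 风格="幽默")  # topic: "studying in the US", style: "humorous"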

'编写一段关于美国留学的小红书宣传文案，需要采用幽默语气'

In [98]:
from langchain.prompts import PromptTemplate

# A template with no variables: "Write a Xiaohongshu promotional post about studying in the US"
prompt_template = PromptTemplate.from_template("编写一段关于美国留学的小红书宣传文案")
prompt_template.format()

'编写一段关于美国留学的小红书宣传文案'
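
`PromptTemplate.from_template` infers the input variables from the `{...}` placeholders. The standard library can do something similar, which may help explain what the template does. This sketch uses only `string.Formatter`; it is not how LangChain is implemented, and `template_variables` is a hypothetical helper.

```python
from string import Formatter

def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a str.format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]

template = "Write a promotional post about {topic} in a {style} tone"
print(template_variables(template))  # ['topic', 'style']
print(template.format(topic="studying in the US", style="humorous"))
```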

```B. for Chat Model```

In [100]:
from langchain_core.prompts import ChatPromptTemplate

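chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "你是AI助教，你的名字是{name}."),  # "You are an AI teaching assistant; your name is {name}."
        ("human", "你好"),  # "Hello"
        ("ai", "你好，有什么可以帮到您？"),  # "Hello, how can I help you?"
        ("human", "{user_input}"),
    ]
)

# name: "Zhang San"; user_input: "What is your name?"
messages = chat_template.format_messages(name="张三", user_input="你的名字是什么？")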

In [72]:
llm.invoke(messages)

AIMessage(content='你好，我的名字是张三，我是你的AI助教。有什么可以帮助你的吗？')

In [101]:
chain = chat_template | llm
chain.invoke({"name":"张三", "user_input":"你的名字是什么？"})

AIMessage(content='我的名字是张三。有什么问题我可以帮您解答呢？')

In [102]:
# query --> LLM --> response
# With streaming, the response arrives piece by piece: w w w w w w w ...
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)

我的名字是张三。有什么问题我可以帮您解答？

In [75]:
# llm.invoke(messages)
# chain = messages | llm
# chain.invoke({"name":"张三", "user_input":"你的名字是什么？"})
# for chunk in llm.stream(messages):
#    print(chunk.content, end="", flush=True)

```C. Few-shot prompt templates for given examples```

In [76]:
from langchain.prompts import (
    ChatPromptTemplate,
    FewShotChatMessagePromptTemplate,
)

In [104]:
examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
]

In [105]:
# This is a prompt template used to format each individual example.
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

print(few_shot_prompt.format())

Human: 2+2
AI: 4
Human: 2+3
AI: 5


In [106]:
final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a wondrous wizard of math."), # instructions
        few_shot_prompt,  # few shot examples 
        ("human", "{input}"),  # input
    ]
)

In [107]:
chain = final_prompt | llm

chain.invoke({"input": "4+6"})

AIMessage(content='10')
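
Conceptually, the few-shot template expands each example into a human/AI message pair and places those pairs between the system message and the new question. A plain-Python sketch of that expansion, using `(role, content)` tuples; `build_few_shot_messages` is a hypothetical helper, not a LangChain function.

```python
def build_few_shot_messages(system: str, examples: list[dict], user_input: str) -> list[tuple[str, str]]:
    """Expand examples into alternating human/ai messages around the real input."""
    messages = [("system", system)]
    for example in examples:
        messages.append(("human", example["input"]))
        messages.append(("ai", example["output"]))
    messages.append(("human", user_input))
    return messages

examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
]
for role, content in build_few_shot_messages("You are a wondrous wizard of math.", examples, "4+6"):
    print(f"{role}: {content}")
```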

```D. Few shot prompt template for dynamic examples```

Question: why?
See the documentation for details.
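
One common reason for dynamic examples is that a large example pool will not fit in the prompt, so only the examples most relevant to the current input are included. LangChain provides example selectors for this; the sketch below is only a toy illustration that ranks examples by shared words, and `select_examples` is a hypothetical helper.

```python
def select_examples(examples: list[dict], user_input: str, k: int = 2) -> list[dict]:
    """Pick the k examples whose input shares the most words with user_input."""
    query_words = set(user_input.lower().split())

    def overlap(example: dict) -> int:
        return len(query_words & set(example["input"].lower().split()))

    return sorted(examples, key=overlap, reverse=True)[:k]

pool = [
    {"input": "translate cat to French", "output": "chat"},
    {"input": "add 2 and 3", "output": "5"},
    {"input": "translate dog to French", "output": "chien"},
    {"input": "add 10 and 7", "output": "17"},
]
print(select_examples(pool, "add 4 and 6"))
```

A production selector would typically compare embeddings rather than raw words, but the shape is the same: score every example against the input and keep the top few.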

```E. Cache```

Answers to previously asked questions are returned directly from the cache, which reduces cost and improves efficiency.

In [108]:
%%time
from langchain.cache import InMemoryCache
from langchain.globals import set_llm_cache

set_llm_cache(InMemoryCache())

# First call: the model is actually invoked, which takes time
# Prompt: "Tell me a corny joke"
llm.invoke("讲一个冷笑话")

CPU times: user 19.9 ms, sys: 4.42 ms, total: 24.3 ms
Wall time: 3.19 s


AIMessage(content='为什么冰箱总是笑得开心？\n\n因为它有很多冷笑话！')

In [109]:
%%time
# Second call: the result is fetched directly from the cache
llm.invoke("讲一个冷笑话")

CPU times: user 1.71 ms, sys: 174 µs, total: 1.88 ms
Wall time: 2.64 ms


AIMessage(content='为什么冰箱总是笑得开心？\n\n因为它有很多冷笑话！')

```F. Tracking token usage```

In [83]:
from langchain.callbacks import get_openai_callback
from langchain_openai import ChatOpenAI

In [110]:
llm = ChatOpenAI(model_name="gpt-4")

with get_openai_callback() as cb:
    result = llm.invoke("最近心情怎么样？")  # "How have you been feeling lately?"
    print(cb)



Tokens Used: 57
	Prompt Tokens: 16
	Completion Tokens: 41
Successful Requests: 1
Total Cost (USD): $0.00294


In [111]:
with get_openai_callback() as cb:
    result = llm.invoke("Tell me three jokes")
    result2 = llm.invoke("Tell me a joke")
    print(cb)

Tokens Used: 89
	Prompt Tokens: 22
	Completion Tokens: 67
Successful Requests: 2
Total Cost (USD): $0.00468
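
The reported cost is the token counts multiplied by per-token prices. The prices below ($0.03 per 1K prompt tokens and $0.06 per 1K completion tokens) are an assumption about GPT-4 pricing at the time of the run; they are not stated in the notebook, but they reproduce both totals above.

```python
# Assumed prices in USD per 1,000 tokens (not taken from the notebook).
PROMPT_PRICE_PER_1K = 0.03
COMPLETION_PRICE_PER_1K = 0.06

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from token counts and the assumed prices."""
    cost = (prompt_tokens * PROMPT_PRICE_PER_1K
            + completion_tokens * COMPLETION_PRICE_PER_1K) / 1000
    return round(cost, 5)

print(estimate_cost(16, 41))  # first call above
print(estimate_cost(22, 67))  # the two calls in the second block combined
```

Completion tokens are priced higher than prompt tokens under this assumption, which is why the 41 completion tokens dominate the first total.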


```G. Output Parsing```