# langchain quickstart

- Chatbot and prompt creation

- Conversation history (memory) module

- Similarity search

- Building a runnable chain

https://python.langchain.com/docs/use_cases/chatbots/quickstart/

In [1]:
%pip install --upgrade --quiet langchain langchain-openai langchain-chroma

Note: you may need to restart the kernel to use updated packages.


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
transformers 4.24.0 requires tokenizers!=0.11.3,<0.14,>=0.11.1, but you have tokenizers 0.19.1 which is incompatible.


In [2]:
import os
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    model="gpt-4-1106-preview", 
    temperature=0.2,
    base_url = os.environ.get('OPEN_AI_BASE_URL'),
    api_key = os.environ.get('OPEN_AI_GPT4_API_KEY')
)

In [3]:
from langchain_core.messages import HumanMessage

chat.invoke(
    [
        HumanMessage(
            content="Translate this sentence from English to French: I love programming."
        )
    ]
)

AIMessage(content="J'adore programmer.", response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 19, 'total_tokens': 25}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-63bc568e-1610-463c-a7b2-f6cbeefb8019-0')

In [4]:
chat.invoke([HumanMessage(content="What did you just say?")])

AIMessage(content="As an AI language model, I don't have a continuous conversation history, so I can't recall what I said previously. However, if you're referring to the last message I provided in our current interaction, it would be the response to your previous query or statement. If you need me to repeat or clarify anything, please let me know what specific information you're looking for, and I'll be happy to help!", response_metadata={'token_usage': {'completion_tokens': 84, 'prompt_tokens': 13, 'total_tokens': 97}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-ff1c6f7b-0ff4-427c-8765-21dcc1b94d0f-0')

In [5]:
from langchain_core.messages import AIMessage

chat.invoke(
    [
        HumanMessage(
            content="Translate this sentence from English to French: I love programming."
        ),
        AIMessage(content="J'adore la programmation."),
        HumanMessage(content="What did you just say?"),
    ]
)

AIMessage(content='I translated the sentence "I love programming" into French, which is "J\'adore la programmation."', response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 41, 'total_tokens': 64}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-9677f8be-6d8e-4c0d-910c-58ba1466014b-0')

## prompt templates

In [6]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

In [7]:
chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)

AIMessage(content='I translated the sentence "I love programming" into French, which is "J\'adore la programmation."', response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 61, 'total_tokens': 84}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-928eca9b-0288-43d8-a1a9-479fe751cec9-0')

## message history

In [8]:
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()  # create a message history instance

demo_ephemeral_chat_history.add_user_message("hi!")  # append a user message

demo_ephemeral_chat_history.add_ai_message("whats up?")  # append an AI message

demo_ephemeral_chat_history.messages  # inspect the stored messages

[HumanMessage(content='hi!'), AIMessage(content='whats up?')]

In [9]:
demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)

response = chain.invoke({"messages": demo_ephemeral_chat_history.messages})

response

AIMessage(content="J'aime la programmation.", response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 53, 'total_tokens': 60}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-318c4b2e-cc95-49dd-9cb9-33950bdd2779-0')

In [10]:
demo_ephemeral_chat_history.add_ai_message(response)

demo_ephemeral_chat_history.add_user_message("What did you just say?")

chain.invoke({"messages": demo_ephemeral_chat_history.messages})

AIMessage(content='I translated "I love programming" into French, which is "J\'aime la programmation."', response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 74, 'total_tokens': 94}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_2f57f81c11', 'finish_reason': 'stop', 'logprobs': None}, id='run-4d92e621-a4b5-43a1-892e-67ecc064e5be-0')

## retrievers

In [None]:
%pip install --upgrade --quiet langchain-chroma beautifulsoup4

In [16]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
data = loader.load()

In [17]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)  # chunks of at most 500 characters, no overlap
all_splits = text_splitter.split_documents(data)

In [21]:
# Store the chunks in a vector database
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents=all_splits,  # the split Document objects to embed and store
    embedding=OpenAIEmbeddings(
        base_url = os.environ.get('OPEN_AI_BASE_URL'),
        api_key = os.environ.get('OPEN_AI_GPT4_API_KEY')
    )
)

In [23]:
# k is the number of chunks to retrieve; it is passed through search_kwargs
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

docs = retriever.invoke("how can langsmith help with testing?")

docs

[Document(page_content='Getting started with LangSmith | 🦜️🛠️ LangSmith', metadata={'description': 'Introduction', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Getting started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroductionLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!Install', metadata={'description': 'Introduction', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Getting started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_conten

## handling documents

In [24]:
from langchain.chains.combine_documents import create_stuff_documents_chain

chat = ChatOpenAI(
    model="gpt-4-1106-preview", 
    temperature=0.2,
    base_url = os.environ.get('OPEN_AI_BASE_URL'),
    api_key = os.environ.get('OPEN_AI_GPT4_API_KEY')
)

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

In [25]:
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("how can langsmith help with testing?")

document_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
        "context": docs,
    }
)

"LangSmith can help with testing your LLM (Large Language Model) applications in several ways:\n\n1. **Evaluation**: LangSmith provides evaluation capabilities that allow you to assess the performance of your LLM applications. You can test the model's responses to various prompts to ensure that it is generating the expected outputs.\n\n2. **Tracing**: With tracing features, you can track the requests and responses in your application. This is useful for debugging and understanding how your LLM is performing in different scenarios. It can help you identify patterns or issues that may not be apparent during initial development.\n\n3. **Production Monitoring & Automations**: Once your application is live, LangSmith offers monitoring tools to ensure that it continues to perform as expected. You can set up automations to handle specific events or conditions, which can be critical for maintaining the quality and reliability of your service.\n\n4. **Prompt Hub**: The Prompt Hub is a managemen

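## creating a retrieval chain

The "chain" is the core concept of LangChain.

It captures the pattern `input -> processing unit -> output -> input to the next unit -> ...`.

Input is usually passed in through the `.invoke` method, and the value that `invoke` returns is the output.

A chain is built with the `|` operator, which feeds the output of the component on its left into the component on its right as input. The resulting chain can itself be treated as a single function.

A processing unit does not have to be a built-in LangChain component. You can define your own: it is just a function that takes an input `(params)` and produces an output with `return ...`.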
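
To make the `|` idea concrete, here is a toy sketch in plain Python. The `Step` class and the `last_message` helper are hypothetical names invented for illustration; this is not how LangChain implements runnables, only a minimal model of "the output of one unit becomes the input of the next".

```python
from typing import Any, Callable, Union


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Run this unit on the given input and return its output.
        return self.func(value)

    def __or__(self, other: Union["Step", Callable[[Any], Any]]) -> "Step":
        # `self | other`: feed self's output into other.
        nxt = other if isinstance(other, Step) else Step(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))

    def __ror__(self, other: Callable[[Any], Any]) -> "Step":
        # `plain_function | step`: wrap the function first, then compose.
        return Step(other) | self


def last_message(params: dict) -> str:
    # A user-defined processing unit: take params in, return a value.
    return params["messages"][-1]


toy_chain = Step(last_message) | str.upper | Step(lambda text: f"query: {text}")
result = toy_chain.invoke({"messages": ["hi", "how can langsmith help?"]})
print(result)  # query: HOW CAN LANGSMITH HELP?
```

Because `__ror__` is defined, a plain function can also sit on the left of `|`, which mirrors the `parse_retriever_input | retriever` expression used in this notebook.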

https://python.langchain.com/docs/expression_language/primitives/assign/

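`RunnableParallel` and `RunnablePassthrough` are two commonly used building blocks for chains.

A chain built with `RunnableParallel` returns a dict whose keys are the names you give to each step and whose values are the results of those steps, e.g. `{'modified': 2}`.

`RunnablePassthrough` differs in that it passes its input through unchanged instead of computing a new value, so the value under its key can be a dict holding the original input together with any added outputs, e.g. `{'extra': {'num': 1, 'mult': 3}}`.

It is often used together with `.assign`, which defines the keys under which new output values are added to the input dict.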



In [26]:
from typing import Dict

from langchain_core.runnables import RunnablePassthrough 


def parse_retriever_input(params: Dict):
    return params["messages"][-1].content  # a custom processing unit: extract the latest message's text


retrieval_chain = RunnablePassthrough.assign( # https://python.langchain.com/docs/expression_language/primitives/assign/
    context=parse_retriever_input | retriever,
).assign(
    answer=document_chain,
)

In [27]:
response = retrieval_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

response

{'messages': [HumanMessage(content='how can langsmith help with testing?')],
 'context': [Document(page_content='Getting started with LangSmith | 🦜️🛠️ LangSmith', metadata={'description': 'Introduction', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Getting started with LangSmith | 🦜️🛠️ LangSmith'}),
  Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroductionLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!Install', metadata={'description': 'Introduction', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title

In [28]:
demo_ephemeral_chat_history.add_ai_message(response["answer"])

demo_ephemeral_chat_history.add_user_message("tell me more about that!")

retrieval_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    },
)

{'messages': [HumanMessage(content='how can langsmith help with testing?'),
  AIMessage(content="LangSmith can help with testing your LLM (Large Language Model) applications in several ways:\n\n1. **Evaluation**: LangSmith provides evaluation capabilities that allow you to assess the performance of your LLM applications. This can include testing for accuracy, relevance, and coherence of the responses generated by your model.\n\n2. **Tracing**: With tracing capabilities, you can track the behavior of your LLM applications. This feature enables you to understand how your application processes inputs and generates outputs, which is crucial for debugging and improving the model's performance.\n\n3. **Production Monitoring & Automations**: LangSmith offers tools for monitoring your LLM applications in production. This includes the ability to set up automated alerts and actions based on specific criteria or performance metrics, helping you to quickly identify and address issues.\n\n4. **Prom

In [29]:
retrieval_chain_with_only_answer = (
    RunnablePassthrough.assign(
        context=parse_retriever_input | retriever,
    )
    | document_chain
)

retrieval_chain_with_only_answer.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    },
)

"Certainly! Let's delve deeper into how LangSmith can assist with testing your LLM applications:\n\n1. **Evaluation Capabilities**:\n   - LangSmith's evaluation tools allow you to measure the performance of your LLM against specific benchmarks or datasets. This can help you understand how well your model is performing in terms of accuracy, fluency, and adherence to the task it's designed for.\n   - You can create evaluations that mimic real-world scenarios to see how your LLM would perform in actual use cases.\n   - The evaluation results can guide you in fine-tuning your model or making adjustments to improve its performance.\n\n2. **Tracing Capabilities**:\n   - Tracing is a powerful feature for understanding the decision-making process of your LLM. It provides insights into the intermediate steps the model takes when generating responses.\n   - By analyzing traces, you can identify patterns or errors in the model's processing, which can be invaluable for debugging and improving the 

## query transformation



In [30]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch

# We need a prompt that we can pass into an LLM to generate a transformed search query

chat = ChatOpenAI(
    model="gpt-4-1106-preview", 
    temperature=0.2,
    base_url = os.environ.get('OPEN_AI_BASE_URL'),
    api_key = os.environ.get('OPEN_AI_GPT4_API_KEY')
)

query_transform_prompt = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder(variable_name="messages"),
        (
            "user",
            "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.",
        ),
    ]
)

query_transforming_retriever_chain = RunnableBranch(
    (
        lambda x: len(x.get("messages", [])) == 1,
        # If only one message, then we just pass that message's content to retriever
        (lambda x: x["messages"][-1].content) | retriever,
    ),
    # If messages, then we pass inputs to LLM chain to transform the query, then pass to retriever
    query_transform_prompt | chat | StrOutputParser() | retriever,
).with_config(run_name="chat_retriever_chain")

In [31]:
document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

conversational_retrieval_chain = RunnablePassthrough.assign(
    context=query_transforming_retriever_chain,
).assign(
    answer=document_chain,
)

demo_ephemeral_chat_history = ChatMessageHistory()

In [32]:
demo_ephemeral_chat_history.add_user_message("how can langsmith help with testing?")

response = conversational_retrieval_chain.invoke(
    {"messages": demo_ephemeral_chat_history.messages},
)

demo_ephemeral_chat_history.add_ai_message(response["answer"])

response

{'messages': [HumanMessage(content='how can langsmith help with testing?'),
  AIMessage(content="LangSmith can help with testing your LLM (Large Language Model) applications in several ways:\n\n1. **Evaluation**: LangSmith provides evaluation capabilities that allow you to assess the performance of your LLM applications. This can include testing the model's responses for accuracy, relevance, and coherence. You can set up various tests to ensure that the model is performing as expected under different scenarios.\n\n2. **Tracing**: With tracing features, you can track and analyze how your LLM applications are being used. This can help you identify any issues or areas for improvement by providing insights into the model's decision-making process. Tracing can be particularly useful for debugging and understanding the model's outputs.\n\n3. **Production Monitoring & Automations**: LangSmith offers tools for monitoring your LLM applications in production. This includes tracking metrics, sett

In [33]:
demo_ephemeral_chat_history.add_user_message("tell me more about that!")

conversational_retrieval_chain.invoke(
    {"messages": demo_ephemeral_chat_history.messages}
)

{'messages': [HumanMessage(content='how can langsmith help with testing?'),
  AIMessage(content="LangSmith can help with testing your LLM (Large Language Model) applications in several ways:\n\n1. **Evaluation**: LangSmith provides evaluation capabilities that allow you to assess the performance of your LLM applications. This can include testing the model's responses for accuracy, relevance, and coherence. You can set up various tests to ensure that the model is performing as expected under different scenarios.\n\n2. **Tracing**: With tracing features, you can track and analyze how your LLM applications are being used. This can help you identify any issues or areas for improvement by providing insights into the model's decision-making process. Tracing can be particularly useful for debugging and understanding the model's outputs.\n\n3. **Production Monitoring & Automations**: LangSmith offers tools for monitoring your LLM applications in production. This includes tracking metrics, sett