### Core Components of LangChain

1. **Model I/O Encapsulation**
   - **LLMs**: Large Language Models
   - **Chat Models**: Generally based on LLMs but restructured for conversational purposes
   - **PromptTemplate**: Templates for prompt creation
   - **OutputParser**: Parses the output from models
     
2. **Data Connection Encapsulation**
   - **Document Loaders**: Loaders for various file formats
   - **Document Transformers**: Common operations on documents such as splitting, filtering, translating, and extracting metadata
   - **Text Embedding Models**: Convert text into vector representations, useful for tasks like retrieval
   - **Vectorstores**: Stores for vectors (used in retrieval tasks)
   - **Retrievers**: Tools for retrieving the most relevant documents from storage, typically a vector store

3. **Memory Encapsulation**
   - **Memory**: Not hardware memory; it manages conversational "context" or "history" as text

4. **Architecture Encapsulation**
   - **Chain**: Implements a single function or a series of sequential functions
   - **Agent**: Automatically plans and executes steps based on user input, selecting the necessary tools for each step to achieve the desired task
     - **Tools**: Functions for calling external functionalities, such as Google search, file I/O, Linux shell, etc.
     - **Toolkits**: A set of tools designed to operate specific software, such as a toolkit for managing databases or Gmail

5. **Callbacks**

![LangChain Components](data/langchain.png)


### Documentation (Python Version)
- Functional Modules: https://python.langchain.com/docs/get_started/introduction
- API Documentation: https://api.python.langchain.com/en/latest/langchain_api_reference.html
- Third-Party Component Integration: https://python.langchain.com/docs/integrations/platforms/
- Official Use Cases: https://python.langchain.com/docs/use_cases
- Debugging, Deployment, and Other Guidance: https://python.langchain.com/docs/guides/debugging

## 1. Model I/O Encapsulation
Different models are encapsulated into a unified interface, making it easier to switch models without restructuring the code.

### 1.1 Model API: LLM vs. ChatModel

In [19]:
# !pip install langchain==0.1.20
# !pip install --upgrade langchain-openai==0.1.6
# !pip install langchain-community==0.0.38
# pip install --upgrade langchain-openai

In [20]:
# import os
# os.environ["LANGCHAIN_PROJECT"] = ""
# os.environ["LANGCHAIN_API_KEY"] = ""
# os.environ["LANGCHAIN_ENDPOINT"] = ""
# os.environ["LANGCHAIN_TRACING_V2"] = ""

# OPENAI_API_KEY = "sk-xxxxx"

### 1.1.1 OpenAI Model Wrapping

In [56]:
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

_ = load_dotenv()

llm = ChatOpenAI(model="gpt-3.5-turbo")  # default gpt-3.5-turbo
response = llm.invoke("who are you?")
print(response.content)

I am a language model AI created by OpenAI. How can I assist you today?


### 1.1.2 Multi-turn Conversation Session Wrapping

In [27]:
from langchain.schema import (
    AIMessage,  # Equivalent to the assistant role in the OpenAI API
    HumanMessage,  # Equivalent to the user role in the OpenAI API
    SystemMessage  # Equivalent to the system role in the OpenAI API
)

messages = [
    SystemMessage(content="You are the intelligent regulatory system of the future."),
    HumanMessage(content="I am a space ranger, my name is Eric."),
    AIMessage(content="I am an NPC, the Earth caretaker, welcome back to Earth!"),
    HumanMessage(content="Who am I?")
]

ret = llm.invoke(messages)

print(ret)


content='You are Eric, a space ranger dedicated to exploring and protecting the galaxy.' response_metadata={'token_usage': {'completion_tokens': 15, 'prompt_tokens': 59, 'total_tokens': 74, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-5a52b595-75b3-47ff-9bba-b4fc6a452066-0'


<div class="alert alert-success"> <b>Key Point:</b> Achieve a unified interface for different models through encapsulation. </div>
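
The idea behind this key point can be sketched without any LangChain code. The example below is only an illustration: `ChatModel`, `FakeOpenAIModel`, `FakeErnieModel`, and `ask` are hypothetical names, and the two backends are stand-ins that make no network calls. It shows why calling code does not change when the backend is swapped, as long as every wrapper exposes the same `invoke` method.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface: every backend exposes the same invoke()."""

    def invoke(self, prompt: str) -> str: ...


class FakeOpenAIModel:
    """Stand-in for an OpenAI-backed wrapper (no network call is made)."""

    def invoke(self, prompt: str) -> str:
        return f"[openai] echo: {prompt}"


class FakeErnieModel:
    """Stand-in for an ERNIE-backed wrapper (no network call is made)."""

    def invoke(self, prompt: str) -> str:
        return f"[ernie] echo: {prompt}"


def ask(model: ChatModel, question: str) -> str:
    # Calling code depends only on the shared interface, not on a vendor SDK.
    return model.invoke(question)


for backend in (FakeOpenAIModel(), FakeErnieModel()):
    print(ask(backend, "Who are you?"))
```

Swapping the backend is a one-line change at construction time; `ask` itself stays untouched.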


### 1.1.3 Switching to Other Models

In [28]:
#!pip install dashscope

In [29]:
# Other models are encapsulated in the langchain_community package.
from langchain_community.chat_models import QianfanChatEndpoint
from langchain_core.messages import HumanMessage
import os

llm = QianfanChatEndpoint(
    qianfan_ak=os.getenv('ERNIE_CLIENT_ID'),
    qianfan_sk=os.getenv('ERNIE_CLIENT_SECRET')
)

messages = [
    HumanMessage(content="Who are you?")
]

ret = llm.invoke(messages)

print(ret.content)

# from langchain_community.chat_models.tongyi import ChatTongyi
# from langchain_core.messages import HumanMessage
# from langchain_core.messages import HumanMessage, SystemMessage
# import os

# llm = ChatTongyi(model_name="qwen-vl-chat-v1")

# messages = [
#     SystemMessage(
#         content="You are a helpful assistant. When people ask questions, always start with a greeting, 'Hello!', then answer the question."
#     ),
#     HumanMessage(
#         content="Hello, please introduce yourself."
#     ),
# ]

# ret = llm.invoke(messages)

# print(ret.content)





### 1.2 Model Input and Output
<img src="data/model_io.jpg" style="margin-left: 0px" width=500px>  

### 1.2.1 Prompt Template Encapsulation
1. The PromptTemplate allows for custom variables within the template.

In [12]:
from langchain.prompts import PromptTemplate

template = PromptTemplate.from_template("Tell me a joke about {subject}.")
print("===Template===")
print(template)
print("===Prompt===")
print(template.format(subject='ChatGPT'))


===Template===
input_variables=['subject'] template='Tell me a joke about {subject}.'
===Prompt===
Tell me a joke about ChatGPT.


2. ChatPromptTemplate: Representing Conversation Context with a Template

In [13]:
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI

template = ChatPromptTemplate.from_messages(
    [
        SystemMessagePromptTemplate.from_template(
           "You are the customer service assistant for {product}. Your name is {name}."),
        HumanMessagePromptTemplate.from_template("{query}"),
    ]
)

llm = ChatOpenAI()
prompt = template.format_messages(
    product="Crazy Zoo",
    name="Baymax",
    query="Who are you?"
)

ret = llm.invoke(prompt)

print(ret.content)

Hello, I am Baymax, the customer service assistant for Crazy Zoo. How can I assist you today?


3. MessagesPlaceholder: Turning Multi-turn Conversations into Templates

In [14]:
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)

human_prompt = "Translate your answer to {language}."
human_message_template = HumanMessagePromptTemplate.from_template(human_prompt)

chat_prompt = ChatPromptTemplate.from_messages(
    # variable_name is the message placeholder variable name in the template
    # used for assigning values.
    [MessagesPlaceholder(variable_name="conversation"), human_message_template]
)

In [15]:
from langchain_core.messages import AIMessage, HumanMessage

human_message = HumanMessage(content="Who is Elon Musk?")
ai_message = AIMessage(
    content="Elon Musk is a billionaire entrepreneur, inventor, and industrial designer"
)

messages = chat_prompt.format_prompt(
    # Used to assign values to "conversation" and "language."
    conversation=[human_message, ai_message], language="中文"
)

print(messages.to_messages())

[HumanMessage(content='Who is Elon Musk?'), AIMessage(content='Elon Musk is a billionaire entrepreneur, inventor, and industrial designer'), HumanMessage(content='Translate your answer to 中文.')]


In [16]:
result = llm.invoke(messages)
print(result.content)

埃隆·马斯克是一位亿万富翁企业家、发明家和工业设计师。


<div class="alert alert-success"> <b>Key Point:</b> Think of a prompt template as a function with parameters, analogous to a Semantic Function in Semantic Kernel (SK). </div>
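
To make the "template as a function" analogy concrete, here is a minimal standard-library sketch. It is not how LangChain implements `PromptTemplate`; `make_prompt_function` and `render` are hypothetical names used only for illustration. The template's placeholders become the keyword parameters of the returned function.

```python
from string import Formatter


def make_prompt_function(template: str):
    """Turn a template string into a function whose parameters are its variables."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    variables = sorted({name for _, name, _, _ in Formatter().parse(template) if name})

    def render(**kwargs: str) -> str:
        missing = [v for v in variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return template.format(**kwargs)

    # Expose the discovered variables, similar in spirit to input_variables.
    render.input_variables = variables
    return render


joke_prompt = make_prompt_function("Tell me a joke about {subject}.")
print(joke_prompt.input_variables)
print(joke_prompt(subject="ChatGPT"))
```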

### 1.2.2 Loading Prompt Templates from Files

In [18]:
from langchain.prompts import PromptTemplate

template = PromptTemplate.from_file("data/example_prompt_template.txt")
print("===Template===")
print(template)
print("===Prompt===")
print(template.format(topic='Dark Humor'))

===Template===
input_variables=['topic'] template='举一个关于{topic}的例子'
===Prompt===
举一个关于Dark Humor的例子


### 1.3 Output Encapsulation: OutputParser
Automatically parses the string output by the LLM into the specified format.

Built-in OutputParsers in LangChain include:

- CommaSeparatedListOutputParser
- DatetimeOutputParser
- EnumOutputParser
- JsonOutputParser
- PydanticOutputParser
- XMLOutputParser

And more.

### 1.3.1 Pydantic (JSON) Parser
Automatically generates output format specifications based on the definitions in Pydantic classes.

In [19]:
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from typing import List, Dict

# Define your output object
class Date(BaseModel):
    year: int = Field(description="Year")
    month: int = Field(description="Month")
    day: int = Field(description="Day")
    era: str = Field(description="BC or AD")

    # ----- Optional Mechanism --------
    # You can add custom validation mechanisms
    @validator('month')
    def valid_month(cls, field):
        if field <= 0 or field > 12:
            raise ValueError("Month must be between 1 and 12")
        return field

    @validator('day')
    def valid_day(cls, field):
        if field <= 0 or field > 31:
            raise ValueError("Day must be between 1 and 31")
        return field

    @validator('day', pre=True, always=True)
    def valid_date(cls, day, values):
        year = values.get('year')
        month = values.get('month')

        # Ensure both year and month are provided
        if year is None or month is None:
            return day  # Cannot validate the date without year and month

        # Check if the date is valid
        if month == 2:
            if cls.is_leap_year(year) and day > 29:
                raise ValueError("February has a maximum of 29 days in a leap year")
            elif not cls.is_leap_year(year) and day > 28:
                raise ValueError("February has a maximum of 28 days in a non-leap year")
        elif month in [4, 6, 9, 11] and day > 30:
            raise ValueError(f"Month {month} has a maximum of 30 days")

        return day

    @staticmethod
    def is_leap_year(year):
        if year % 400 == 0 or (year % 4 == 0 and year % 100 != 0):
            return True
        return False


In [20]:
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import PydanticOutputParser
from dotenv import load_dotenv

_ = load_dotenv()

model_name = 'gpt-3.5-turbo'
temperature = 0
model = ChatOpenAI(model_name=model_name, temperature=temperature)

# Construct an OutputParser based on the Pydantic object definition
parser = PydanticOutputParser(pydantic_object=Date)

template = """Extract the date from user input.
{format_instructions}
User input:
{query}"""

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
    # Directly retrieve output description from the OutputParser and pre-assign values to the template variables
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

print("====Format Instruction=====")
print(parser.get_format_instructions())

query = "The weather was clear on April 6, 2023..."
model_input = prompt.format_prompt(query=query)

print("====Prompt=====")
print(model_input.to_string())

output = model.invoke(model_input.to_messages())
print("====Raw Output from Model=====")
print(output.content)
print("====Parsed Output=====")
date = parser.parse(output.content)
print(date)



====Format Instruction=====
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"year": {"title": "Year", "description": "Year", "type": "integer"}, "month": {"title": "Month", "description": "Month", "type": "integer"}, "day": {"title": "Day", "description": "Day", "type": "integer"}, "era": {"title": "Era", "description": "BC or AD", "type": "string"}}, "required": ["year", "month", "day", "era"]}
```
====Prompt=====
Extract the date from user input.
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"prope

### 1.3.2 Auto-Fixing Parser

When parsing raises an exception, this parser uses the LLM to repair the output and then parses it again.


In [21]:
from langchain.output_parsers import OutputFixingParser

new_parser = OutputFixingParser.from_llm(
    parser=parser, llm=ChatOpenAI(model="gpt-3.5-turbo"))

# Deliberately corrupt the previous output so that parsing fails
output = output.content.replace("4", "四月")
print("===Formatted Error Output===")
print(output)
try:
    date = parser.parse(output)
except Exception as e:
    print("===An Exception Occurred===")
    print(e)

# Automatically fix and parse using OutputFixingParser
date = new_parser.parse(output)
print("===Re-parsed Result===")
print(date.json())


===Formatted Error Output===
{
  "year": 2023,
  "month": 四月,
  "day": 6,
  "era": "AD"
}
===An Exception Occurred===
Invalid json output: {
  "year": 2023,
  "month": 四月,
  "day": 6,
  "era": "AD"
}
===Re-parsed Result===
{"year": 2023, "month": 4, "day": 6, "era": "AD"}


<div class="alert alert-warning"> <b>Thought:</b> Guess how OutputFixingParser achieves this. </div>
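
One plausible answer, offered as a guess rather than a description of LangChain's actual code: catch the parsing exception, send the bad output together with the error message back to a model, and parse the model's corrected text. In the sketch below every name (`parse_date`, `fake_fixing_llm`, `parse_with_fix`) is hypothetical, and the "LLM" is a hard-coded stand-in so the example runs offline.

```python
import json
from typing import Callable


def parse_date(text: str) -> dict:
    """Strict parser: json.loads raises a ValueError subclass on malformed JSON."""
    return json.loads(text)


def fake_fixing_llm(prompt: str) -> str:
    """Stand-in for an LLM call; a real model would rewrite the bad output itself."""
    return '{"year": 2023, "month": 4, "day": 6, "era": "AD"}'


def parse_with_fix(text: str, parser: Callable[[str], dict],
                   llm: Callable[[str], str], max_retries: int = 1) -> dict:
    for attempt in range(max_retries + 1):
        try:
            return parser(text)
        except ValueError as error:
            if attempt == max_retries:
                raise
            # Feed the bad output and the error message back to the model.
            text = llm(
                f"The following output failed to parse:\n{text}\n"
                f"Error: {error}\nReturn only the corrected output."
            )
    raise RuntimeError("unreachable")


bad_output = '{"year": 2023, "month": 四月, "day": 6, "era": "AD"}'
print(parse_with_fix(bad_output, parse_date, fake_fixing_llm))
```

The trade-off of this design is one extra model call per failure, so a retry limit such as `max_retries` keeps cost bounded.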

### 1.4 Summary
1. LangChain provides a unified interface for calling various models, including both completion and conversational models.
2. LangChain offers the PromptTemplate class, allowing custom templates with variables.
3. LangChain provides a series of output parsers for converting the outputs of large models into structured objects, with built-in auto-fix functionality.
4. The modules above are among the stronger parts of LangChain; one drawback is that the OutputParser prompts are maintained inside the library code, so changing them is coupled with the code.

## 2. Data Connection Encapsulation
<img src="data/data_connection.jpg" style="margin-left: 0px" width=500px>

### 2.1 Document Loaders

In [22]:
#!pip install pypdf

In [25]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("data/llama2.pdf")
pages = loader.load_and_split()

print(pages[0].page_content)

Llama 2 : Open Foundation and Fine-Tuned Chat Models
Hugo Touvron∗Louis Martin†Kevin Stone†
Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra
Prajjwal Bhargava Shruti Bhosale Dan Bikel Lukas Blecher Cristian Canton Ferrer Moya Chen
Guillem Cucurull David Esiobu Jude Fernandes Jeremy Fu Wenyin Fu Brian Fuller
Cynthia Gao Vedanuj Goswami Naman Goyal Anthony Hartshorn Saghar Hosseini Rui Hou
Hakan Inan Marcin Kardas Viktor Kerkez Madian Khabsa Isabel Kloumann Artem Korenev
Punit Singh Koura Marie-Anne Lachaux Thibaut Lavril Jenya Lee Diana Liskovich
Yinghai Lu Yuning Mao Xavier Martinet Todor Mihaylov Pushkar Mishra
Igor Molybog Yixin Nie Andrew Poulton Jeremy Reizenstein Rashi Rungta Kalyan Saladi
Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang
Ross Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang
Angela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic
Sergey Edunov 

<div class="alert alert-danger"> The implementations of PDFLoader and TextSplitter in LangChain are relatively rough and are not recommended for use in production. </div>

### 2.3 Vector Databases and Vector Retrieval

In [26]:
# #!pip install chromadb
# __import__('pysqlite3')
# import sys

# sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [28]:
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_community.vectorstores import Chroma
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI
from langchain_community.document_loaders import PyPDFLoader

from dotenv import load_dotenv
_ = load_dotenv()

# Loading Documents
loader = PyPDFLoader("data/llama2.pdf")
pages = loader.load_and_split()

# Document Splitting
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=100,
    length_function=len,
    add_start_index=True,
)

texts = text_splitter.create_documents(
     # [pages[2].page_content, pages[3].page_content]
     [page.page_content for page in pages[:4]]
)
   

# Data Ingestion
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
# db = Chroma.from_documents(texts, embeddings)
db = FAISS.from_documents(texts, embeddings)

# Retrieving Top-5 Results
retriever = db.as_retriever(search_kwargs={"k": 5})

docs = retriever.invoke("How many parameters does Llama 2 have?")

# print(docs[0].page_content)
for doc in docs:
    print(doc.page_content)
    print("------")



7B, 13B, and 70B parameters. We have also trained 34B variants, which we report on in this paper
but are not releasing.§
2.Llama 2-Chat , a ﬁne-tuned version of Llama 2 that is optimized for dialogue use cases. We release
variants of this model with 7B, 13B, and 70B parameters as well.
------
Sergey Edunov Thomas Scialom∗
GenAI, Meta
Abstract
In this work, we develop and release Llama 2, a collection of pretrained and ﬁne-tuned
large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
------
Llama 2-Chat , at scales up to 70B parameters. On the series of helpfulness and safety benchmarks we tested,
Llama 2-Chat models generally perform better than existing open-source models. They also appear to
------
large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
Our ﬁne-tuned LLMs, called Llama 2-Chat , are optimized for dialogue use cases. Our
models outperform open-source chat models on most benchmarks we tested, and based on
---

For more links to third-party retrieval components, refer to: https://python.langchain.com/docs/integrations/vectorstores
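
Conceptually, the retrieval step in this section is a nearest-neighbor search over embedding vectors. The sketch below illustrates that idea with brute-force cosine similarity; it is not how FAISS or Chroma are implemented, the function names are hypothetical, and the 3-dimensional vectors are made up for illustration rather than produced by an embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy 3-dimensional "embeddings", made up for illustration only.
store = [
    ("Llama 2 comes in 7B, 13B, and 70B sizes", [0.9, 0.1, 0.0]),
    ("The authors work at Meta", [0.1, 0.9, 0.0]),
    ("Llama 2-Chat is tuned for dialogue", [0.6, 0.2, 0.6]),
]
query = [1.0, 0.0, 0.0]  # pretend embedding of "How many parameters?"
print(top_k(query, store, k=2))
```

A real vector database replaces the linear scan with an index so that search stays fast as the collection grows.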

### 2.4 Summary
1. The document processing part of LangChain is implemented rather roughly and is not recommended for production use.
2. The connection to vector databases is essentially an interface encapsulation; you need to choose your own vector database.

## 3. Memory Encapsulation: Memory
### 3.1 Conversation Context: ConversationBufferMemory

In [29]:
from langchain.memory import ConversationBufferMemory, ConversationBufferWindowMemory

history = ConversationBufferMemory()
history.save_context({"input": "How are you?"}, {"output": "I'm OK"})

print(history.load_memory_variables({}))

history.save_context({"input": "What are you up to today?"}, {"output": "Everything is going well."})

print(history.load_memory_variables({}))

{'history': "Human: How are you?\nAI: I'm OK"}
{'history': "Human: How are you?\nAI: I'm OK\nHuman: What are you up to today?\nAI: Everything is going well."}


### 3.2 Keep Only the Most Recent Window of Context: ConversationBufferWindowMemory

In [30]:
from langchain.memory import ConversationBufferWindowMemory

window = ConversationBufferWindowMemory(k=2)
window.save_context({"input": "First round question"}, {"output": "First round answer"})
window.save_context({"input": "Second round question"}, {"output": "Second round answer"})
window.save_context({"input": "Third round question"}, {"output": "Third round answer"})
print(window.load_memory_variables({}))

{'history': 'Human: Second round question\nAI: Second round answer\nHuman: Third round question\nAI: Third round answer'}
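
The windowing behavior shown above can be approximated in a few lines of plain Python. This is a sketch of the idea only, not LangChain's implementation; `WindowMemory` and its methods are hypothetical names. A `deque` with `maxlen=k` drops the oldest exchange automatically once more than `k` have been saved.

```python
from collections import deque


class WindowMemory:
    """Hypothetical sketch: keep only the last k (human, ai) exchanges."""

    def __init__(self, k: int) -> None:
        self.turns = deque(maxlen=k)  # older turns fall off automatically

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load_history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemory(k=2)
memory.save_context("First round question", "First round answer")
memory.save_context("Second round question", "Second round answer")
memory.save_context("Third round question", "Third round answer")
print(memory.load_history())
```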


### 3.3 Control Context Length by Token Count: ConversationTokenBufferMemory

In [31]:
from langchain.memory import ConversationTokenBufferMemory
from langchain_openai import ChatOpenAI

memory = ConversationTokenBufferMemory(
    llm=ChatOpenAI(),
    max_token_limit=40
)
memory.save_context(
    {"input": "Hello"}, {"output": "Hi, I am your AI assistant."}
)
memory.save_context(
    {"input": "What can you do?"}, {"output": "I can do anything."}
)

print(memory.load_memory_variables({}))


{'history': 'Human: Hello\nAI: Hi, I am your AI assistant.\nHuman: What can you do?\nAI: I can do anything.'}
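
Trimming by token count follows the same pattern as trimming by rounds, except that the budget is measured in tokens. The sketch below is an assumption-laden illustration: `count_tokens` and `trim_to_budget` are hypothetical helpers, and whitespace-separated words stand in for real tokenizer tokens, so the counts will differ from what an actual model tokenizer reports.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def trim_to_budget(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the total token count fits the budget."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)
    return kept


history = [
    "Human: Hello",
    "AI: Hi, I am your AI assistant.",
    "Human: What can you do?",
    "AI: I can do anything.",
]
print(trim_to_budget(history, max_tokens=12))
```

With a budget large enough for the whole history nothing is dropped, which matches the behavior seen in the cell above.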


### 3.4 More Types
- ConversationSummaryMemory: Summarizes the context
  - https://python.langchain.com/docs/modules/memory/types/summary
- ConversationSummaryBufferMemory: Saves context within token limits and summarizes older ones
  - https://python.langchain.com/docs/modules/memory/types/summary_buffer
- VectorStoreRetrieverMemory: Stores memory in a vector database and retrieves the most relevant parts based on user input
  - https://python.langchain.com/docs/modules/memory/types/vectorstore_retriever_memory

### 3.5 Summary
1. LangChain's memory management is usable as-is for simple cases, such as trimming history by number of rounds or by token count.
2. For complex cases it may not be the best implementation. With vector-database retrieval of memories, for instance, evaluate it against your actual data and results before adopting it.
3. Even so, the different memory-maintenance strategies are worth borrowing as reference designs in production.

## 4. Chain and LangChain Expression Language (LCEL)

LangChain Expression Language (LCEL) is a declarative language that allows for easy composition of different call sequences to form a Chain. From its inception, LCEL has been designed to support deploying prototypes into production environments **without code changes**, ranging from the simplest "prompt + LLM" chains to the most complex chains (users have successfully run LCEL Chains containing hundreds of steps in production environments).

Some highlights of LCEL include:

1. **Stream Support**: When building Chains with LCEL, you get the best possible time-to-first-token (i.e., the time elapsed until the first chunk of output appears). For some Chains, this means you can stream tokens directly from the LLM to a streaming output parser, obtaining parsed, incremental output at the same rate as the LLM provider emits raw tokens.

2. **Asynchronous Support**: Any chain built with LCEL can be called through synchronous APIs (e.g., for prototyping in Jupyter notebooks) and asynchronous APIs (e.g., in LangServe servers). This allows the same code to be used for prototyping and production environments, delivering excellent performance and enabling multiple concurrent requests on the same server.

3. **Optimized Parallel Execution**: When your LCEL chain has steps that can be executed in parallel (e.g., retrieving documents from multiple retrievers), LCEL runs them in parallel automatically, in both the synchronous and asynchronous interfaces, to minimize latency.

4. **Retry and Fallback**: You can configure retries and fallbacks for any part of an LCEL chain, which is an effective way to make the chain more reliable at scale. According to the LangChain documentation, support for retry/fallback in streaming was still being added at the time of writing, so that reliability can be gained without extra latency.

5. **Access to Intermediate Results**: For more complex chains, accessing the results of intermediate steps before the final output is generated can be very useful. This can be used to inform the end user that something is happening, or simply for debugging the chain. You can stream intermediate results, and they are available on each LangServe server.

6. **Input and Output Schemas**: Input and output schemas provide Pydantic and JSONSchema schemas inferred from the structure of each LCEL chain. This can be used for input and output validation and is a component of LangServe.

7. **Seamless Integration with LangSmith Tracing**: As chains become more complex, understanding what happens at each step becomes increasingly important. With LCEL, all steps are automatically logged to LangSmith for maximum observability and debuggability.

8. **Seamless Integration with LangServe Deployment**: Any chain created with LCEL can be easily deployed using LangServe.

Original text: https://python.langchain.com/docs/expression_language/
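
The pipe syntax that LCEL uses relies on Python operator overloading. The sketch below shows the general mechanism only and is not LangChain's `Runnable` class: `Step` is a hypothetical name, and the "model" is a plain function standing in for an LLM. Overloading `__or__` lets `a | b` build a new step that feeds the output of `a` into `b`.

```python
class Step:
    """Hypothetical minimal 'runnable': wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt = Step(lambda topic: f"Tell me a joke about {topic}.")
fake_model = Step(lambda text: f"MODEL OUTPUT for: {text}")  # stands in for an LLM
parser = Step(str.strip)

chain = prompt | fake_model | parser
print(chain.invoke("ChatGPT"))
```

Because every step shares the same `invoke` interface, features such as streaming or retries can in principle be added once at the interface level instead of per component.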

### 4.1 Pipeline-style Calls: PromptTemplate, LLM, and OutputParser

In [45]:
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser, PydanticOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from typing import List, Dict, Optional
from enum import Enum
import json

In [46]:
# Output structure
class SortEnum(str, Enum):
    data = 'data'
    price = 'price'


class OrderingEnum(str, Enum):
    ascend = 'ascend'
    descend = 'descend'


class Semantics(BaseModel):
    name: Optional[str] = Field(description="Data package name", default=None)
    price_lower: Optional[int] = Field(description="Lower price limit", default=None)
    price_upper: Optional[int] = Field(description="Upper price limit", default=None)
    data_lower: Optional[int] = Field(description="Lower data limit", default=None)
    data_upper: Optional[int] = Field(description="Upper data limit", default=None)
    sort_by: Optional[SortEnum] = Field(description="Sort by price or data", default=None)
    ordering: Optional[OrderingEnum] = Field(
        description="Sort in ascending or descending order", default=None)


# OutputParser
parser = PydanticOutputParser(pydantic_object=Semantics)

# Prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Parse the user's input into JSON representation. Output format is as follows:\n{format_instructions}\nDo not output fields not mentioned.",
        ),
        ("human", "{text}"),
    ]
).partial(format_instructions=parser.get_format_instructions())

# Model
# model = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
model = ChatOpenAI(model="gpt-4o", temperature=0)

# LCEL expression
runnable = (
    {"text": RunnablePassthrough()} | prompt | model | parser
)

# Run
ret = runnable.invoke("What are the large data packages under 100 yuan?")

# print(ret.json())

print(
  json.dumps(
      ret.dict(),
      indent = 4,
      ensure_ascii=False
  )
)


{
    "name": null,
    "price_lower": null,
    "price_upper": 100,
    "data_lower": null,
    "data_upper": null,
    "sort_by": "data",
    "ordering": "descend"
}


#### Streaming output

In [47]:
runnable = (
    {"text": RunnablePassthrough()} | prompt | model |  StrOutputParser()
)

# stream output 

for s in runnable.stream("What are the large data packages under 100 yuan?"):
    print(s,end="")

```json
{
  "price_upper": 100,
  "sort_by": "data",
  "ordering": "descend"
}
```

<div class="alert alert-warning"> <b>Note:</b> In the current documentation, objects generated by LCEL are referred to as runnable or chain, with both terms often used interchangeably. Essentially, it's a custom call flow. </div>

<div class="alert alert-success"> <b>The value of using LCEL is also the core value of LangChain.</b> <br /> The official documentation provides examples from different perspectives: https://python.langchain.com/docs/expression_language/why </div>

### 4.2 Implementing RAG with LCEL

In [53]:
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_community.vectorstores import Chroma
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import PyPDFLoader

# Loading Documents
loader = PyPDFLoader("data/llama2.pdf")
pages = loader.load_and_split()

# Document Splitting
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=100,
    length_function=len,
    add_start_index=True,
)

texts = text_splitter.create_documents(
    
    # [pages[2].page_content, pages[3].page_content]
    [page.page_content for page in pages[:4]]

)

# Data Ingestion
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
# db = Chroma.from_documents(texts, embeddings)
db = FAISS.from_documents(texts, embeddings)

# Retrieving Top-1 Results
retriever = db.as_retriever(search_kwargs={"k": 1})

In [54]:
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough

# Prompt template
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# Chain
rag_chain = (
    {"question": RunnablePassthrough(), "context": retriever}
    | prompt
    | model
    | StrOutputParser()
)

rag_chain.invoke("How many parameters does Llama 2 have?")

'Llama 2 has variants with 7B, 13B, and 70B parameters.'

### 4.3 Implementing Function Calling with LCEL 

In [55]:
llm = ChatOpenAI()

In [57]:
from langchain_core.tools import tool


@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers."""
    return first_int * second_int


@tool
def add(first_int: int, second_int: int) -> int:
    "Add two integers."
    return first_int + second_int


@tool
def exponentiate(base: int, exponent: int) -> int:
    "Exponentiate the base to the exponent power."
    return base**exponent

In [58]:
from langchain_core.output_parsers import StrOutputParser
from langchain.output_parsers import JsonOutputToolsParser

tools = [multiply, add, exponentiate]
# LCEL with branching
llm_with_tools = llm.bind_tools(tools) | {
    "functions": JsonOutputToolsParser(),
    "text": StrOutputParser()
}

In [59]:
result = llm_with_tools.invoke("What's 16 times 1024?")

print(result)

{'functions': [{'args': {'first_int': 16, 'second_int': 1024}, 'type': 'multiply'}], 'text': ''}


In [60]:
result = llm_with_tools.invoke("Who are you?")

print(result)

{'functions': [], 'text': 'I am a language model assistant here to help you with any questions or tasks you have. How can I assist you today?'}


#### Directly selecting tools and running

In [63]:
from typing import Union
from operator import itemgetter
from langchain_core.runnables import (
    Runnable,
    RunnableLambda,
    RunnableMap,
    RunnablePassthrough,
)

# Mapping from names to functions
tool_map = {tool.name: tool for tool in tools}


def call_tool(tool_invocation: dict) -> Union[str, Runnable]:
    """Function for dynamically constructing the end of the chain based on the model-selected tool."""
    tool = tool_map[tool_invocation["type"]]
    return RunnablePassthrough.assign(
        output=itemgetter("args") | tool
    )


# .map() allows us to apply a function to a list of inputs.
call_tool_list = RunnableLambda(call_tool).map()

In [64]:
import json


def route(response):
    if len(response["functions"]) > 0:
        return response["functions"]
    else:
        return response["text"]


llm_with_tools = llm.bind_tools(tools) | {
    "functions": JsonOutputToolsParser() | call_tool_list,
    "text": StrOutputParser()
} | RunnableLambda(route)

result = llm_with_tools.invoke("What is the square of 1024?")
print(result)

result = llm_with_tools.invoke("how are you?")
print(result)

[{'args': {'base': 1024, 'exponent': 2}, 'type': 'exponentiate', 'output': 1048576}]
I'm here and ready to help! How can I assist you today?


<div class="alert alert-warning"> This approach has poor readability. I personally do not recommend using overly complex LCEL structures! </div>
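
For comparison, the same select-and-run logic can be written as ordinary Python once the model's tool calls have been parsed. This is a sketch under assumptions: `TOOLS` and `run_tool_calls` are hypothetical names, the tool functions are plain functions rather than `@tool` objects, and the tool-call dictionary is hard-coded in the shape shown in the output above instead of coming from a real model response.

```python
def multiply(first_int: int, second_int: int) -> int:
    return first_int * second_int


def add(first_int: int, second_int: int) -> int:
    return first_int + second_int


def exponentiate(base: int, exponent: int) -> int:
    return base**exponent


# Mapping from tool names to plain Python functions.
TOOLS = {func.__name__: func for func in (multiply, add, exponentiate)}


def run_tool_calls(tool_calls: list[dict], text: str):
    """Execute every model-selected tool; fall back to plain text if there are none."""
    if not tool_calls:
        return text
    return [
        {**call, "output": TOOLS[call["type"]](**call["args"])}
        for call in tool_calls
    ]


# Hard-coded stand-in for a parsed model response.
calls = [{"args": {"base": 1024, "exponent": 2}, "type": "exponentiate"}]
print(run_tool_calls(calls, text=""))
print(run_tool_calls([], text="I'm here and ready to help!"))
```

The explicit `if` branch and dictionary lookup are easier to read and debug than the nested LCEL expression, at the cost of losing LCEL features such as built-in streaming and tracing.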

### 4.4 Implementing Factory Pattern with LCEL

In [74]:
from langchain_core.runnables.utils import ConfigurableField
from langchain_openai import ChatOpenAI
from langchain_community.chat_models import QianfanChatEndpoint
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate
)
from langchain.schema import HumanMessage
import os

# model 1
ernie_model = QianfanChatEndpoint(
    qianfan_ak=os.getenv('ERNIE_CLIENT_ID'),
    qianfan_sk=os.getenv('ERNIE_CLIENT_SECRET')
)

# model 2
gpt_model = ChatOpenAI(model="gpt-4o", temperature=0)

# Use configurable_alternatives to make the model selectable at runtime through the "llm" field
model = gpt_model.configurable_alternatives(
    ConfigurableField(id="llm"),
    default_key="gpt",  # key that selects gpt_model itself
    ernie=ernie_model,
    gpt4o=gpt_model,
    # claude=claude_model
)

# Prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        # The placeholder needs braces; a bare "query" would be sent to the model literally
        HumanMessagePromptTemplate.from_template("{query}")
    ]
)

# LCEL
chain = (
    {"query": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

# Specify the model at runtime: "gpt", "ernie", or "gpt4o" ("claude" only if it is enabled above)
ret = chain.with_config(configurable={"llm": "gpt"}).invoke("Introduce yourself, and tell me how do you help me?")

print(ret)


Hello! How can I assist you today? If you have a specific question or need information on a particular topic, feel free to ask!


Further reading: What is the Factory Pattern or the Builder Pattern; Overview of Design Patterns.
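
As a rough illustration of the Factory Pattern itself, independent of LangChain, the sketch below registers builders under string keys and creates the requested object at call time. Everything in it (`ModelFactory`, `fake_gpt`, `fake_ernie`) is hypothetical and stands in for real chat models; it is not how `configurable_alternatives` is implemented.

```python
from typing import Callable, Dict, Optional

# In this sketch a "model" is simply a function from prompt text to reply text.
Model = Callable[[str], str]


def fake_gpt(prompt: str) -> str:
    return f"[gpt] {prompt}"


def fake_ernie(prompt: str) -> str:
    return f"[ernie] {prompt}"


class ModelFactory:
    """Registry that maps a key to a builder and falls back to a default key."""

    def __init__(self, default_key: str) -> None:
        self._builders: Dict[str, Callable[[], Model]] = {}
        self._default_key = default_key

    def register(self, key: str, builder: Callable[[], Model]) -> None:
        self._builders[key] = builder

    def create(self, key: Optional[str] = None) -> Model:
        # Callers depend only on the key, never on a concrete model class
        return self._builders[key or self._default_key]()


factory = ModelFactory(default_key="gpt")
factory.register("gpt", lambda: fake_gpt)
factory.register("ernie", lambda: fake_ernie)

print(factory.create()("Introduce yourself"))
print(factory.create("ernie")("Introduce yourself"))
```

The calling code never names a concrete model, which is the same decoupling that the runtime `configurable` key provides in the chain.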

<div class="alert alert-warning">
<b>Thought:</b> What is the significance of LCEL from the perspective of decoupling dependencies between modules?
</div>


### What else you can do with LCEL
1. Configuring runtime variables: https://python.langchain.com/docs/expression_language/how_to/configure
2. Fallback mechanisms: https://python.langchain.com/docs/expression_language/how_to/fallbacks
3. Parallel calls: https://python.langchain.com/docs/expression_language/how_to/map
4. Logic branches: https://python.langchain.com/docs/expression_language/how_to/routing
5. Calling custom streaming functions: https://python.langchain.com/docs/expression_language/how_to/generators
6. Connecting external Memory: https://python.langchain.com/docs/expression_language/how_to/message_history

More examples: https://python.langchain.com/docs/expression_language/cookbook/

## 5. Intelligent Agent Architecture: Agent

### 5.1 Recap: What is an Agent?
An agent treats a large language model as a reasoning engine. Given a task, the agent automatically generates the steps needed to complete it and executes the corresponding actions (such as selecting and calling tools), repeating this cycle until the task is completed.

<img src="data/agent-overview.png" style="margin-left: 0px" width=500px>
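
The loop described above can be sketched without any LLM at all. In the minimal example below, `scripted_policy` is a hypothetical stand-in for the model's reasoning: it picks one tool call and then finishes. The names `TOOLS`, `scripted_policy`, and `run_agent` are assumptions made for illustration and are not part of LangChain.

```python
import calendar
from datetime import date
from typing import Callable, Dict, List, Tuple


def weekday(date_str: str) -> str:
    """Tool: convert an ISO date (YYYY-MM-DD) to its weekday name."""
    return calendar.day_name[date.fromisoformat(date_str).weekday()]


TOOLS: Dict[str, Callable[[str], str]] = {"weekday": weekday}

# One history entry per executed step: (action, action_input, observation)
Step = Tuple[str, str, str]


def scripted_policy(task: str, history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for the LLM: decide the next (action, action_input)."""
    if not history:
        return "weekday", task  # first step: call the tool on the task text
    return "finish", history[-1][2]  # afterwards: answer with the last observation


def run_agent(task: str, max_steps: int = 5) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(task, history)
        if action == "finish":
            return action_input
        observation = TOOLS[action](action_input)  # execute the selected tool
        history.append((action, action_input, observation))
    raise RuntimeError("agent did not finish within max_steps")


print(run_agent("1979-01-18"))
```

A real agent replaces `scripted_policy` with a model call whose prompt includes the task, the tool descriptions, and the history so far.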

### 5.2 First, define some tools: Tools
- They can be a function or a third-party API
- A Chain or the run() method of an Agent can also be treated as a Tool

In [57]:
# !pip install google-search-results

In [58]:
# !pip install --upgrade langchainhub

In [67]:
from langchain_community.utilities import SerpAPIWrapper
from langchain.tools import Tool, tool
from dotenv import load_dotenv

_ = load_dotenv()

search = SerpAPIWrapper()
tools = [
    Tool.from_function(
        func=search.run,
        name="Search",
        description="useful for when you need to answer questions about current events"
    ),
]

In [68]:
import calendar
import dateutil.parser as parser
from datetime import date

# Custom tool


@tool("weekday")
def weekday(date_str: str) -> str:
    """Convert date to weekday name"""
    d = parser.parse(date_str)
    return calendar.day_name[d.weekday()]


tools += [weekday]

### 5.3 Types of Agents: ReAct

<img src="data/ReAct.png" style="margin-left: 0px" width=500px>

In [69]:
from langchain import hub
import json

# Download an existing prompt template
prompt = hub.pull("hwchase17/react")

print(prompt.template)

Please use the `langsmith sdk` instead:
  pip install langsmith
Use the `pull_prompt` method.
  res_dict = client.pull_repo(owner_repo_commit)


Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
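
To make the format in this template concrete, the sketch below shows one way a single model reply could be split into either a tool call or a final answer. It is a simplified illustration under the assumption that the model follows the `Action:` / `Action Input:` / `Final Answer:` layout exactly; `parse_react_step` is a hypothetical helper, not LangChain's own output parser.

```python
import re
from typing import Optional, Tuple

# Matches "Action: <tool>" followed on the next line by "Action Input: <input>"
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_react_step(text: str) -> Tuple[str, Optional[str], str]:
    """Return ("final", None, answer) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in text:
        return "final", None, text.split("Final Answer:", 1)[1].strip()
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"could not parse model output: {text!r}")
    return "action", match.group(1).strip(), match.group(2).strip()


step = "I need the weekday for that date.\nAction: weekday\nAction Input: 1979-01-18"
print(parse_react_step(step))

done = "I now know the final answer\nFinal Answer: Thursday"
print(parse_react_step(done))
```

After an `"action"` result, the executor runs the named tool, appends `Observation: <result>` to the scratchpad, and asks the model again.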


In [70]:
# from langchain_openai import ChatOpenAI
# from langchain.agents import AgentExecutor, create_react_agent


# llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)

# # Define an agent: requires the LLM, the tool set, and the prompt template
# agent = create_react_agent(llm, tools, prompt)
# # Define an executor: requires the agent object and the tool set
# agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# # Run
# agent_executor.invoke({"input": "What day of the week was Jay Chou born?"})

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import tool
from dateutil import parser
import calendar


# Date-parsing tool
@tool("weekday")
def weekday(date_str: str) -> str:
    """Convert date to weekday name"""
    try:
        d = parser.parse(date_str)
        return calendar.day_name[d.weekday()]
    except Exception as e:
        return f"Error parsing date: {e}"


# Stand-in for the search step: return Jay Chou's birth date directly
def get_jay_chou_birthday():
    return "1979-01-18"  # Jay Chou's actual date of birth


# Define the LLM
llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)

# The @tool decorator already returns a tool object, so it can be listed directly
tools = [weekday]

# Create the agent (requires the LLM, the tools, and the ReAct prompt pulled above) and the executor
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Look up the birth date, then ask the agent which weekday it fell on
birthday = get_jay_chou_birthday()
result = agent_executor.invoke(
    {"input": f"Jay Chou was born on {birthday}. What day of the week was that?"}
)
print(result)
