# Lec3. LangChain


## Introduction


**LangChain** is a framework for developing applications powered by language models. It provides abstractions for language models and for sources of context (prompt instructions, few-shot examples, content to ground the response in, etc.), and lets you **chain** these components together to build applications.

In this lab, we will learn several key abstractions in LangChain and build an AI-powered web-search application with customized input and output.

### Reference 
1. [LangChain documentation](https://python.langchain.com/docs/get_started/quickstart)


## 0. First things first

### 0.1 Dependencies and Keys
  
You will need at least two keys for this lab. Put them in the `.env` file.
- OpenAI api key:
    ```
    OPENAI_API_KEY="sk-YOURKEY"
    ```
- SerpApi key:
    ```
    SERPAPI_API_KEY="YOURKEY"
    ```
    The `SERPAPI_API_KEY` is used to call the search engine. Register on this [web site](https://serpapi.com/) first to obtain it.

    After getting these two keys, set them as environment variables (the `load_dotenv()` call below reads them from `.env`).
- LangChain API key (for tracing):
    ```
    LANGCHAIN_TRACING_V2="true"
    LANGCHAIN_API_KEY=ls_xxxxxxxx
    ```
    

In [63]:
# We have installed these dependencies in your image
#%pip install -r requirements.txt

In [64]:
from dotenv import load_dotenv  
import os  

load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') 
SERPAPI_API_KEY = os.getenv('SERPAPI_API_KEY')
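Before calling any API, it can help to confirm that the keys were actually loaded. The helper below is a minimal sketch using only the standard library; `missing_keys` is a hypothetical name introduced here for illustration and is not part of LangChain or python-dotenv.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report whichever keys are still missing.
required = ["OPENAI_API_KEY", "SERPAPI_API_KEY"]
print("missing keys:", missing_keys(required))
```

An empty list means both keys are visible to the current process.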

In [65]:
import os
os.environ['HTTP_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['HTTPS_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['ALL_PROXY']="socks5://Clash:QOAF8Rmd@10.1.0.213:7893"

In [66]:
MODEL = "gpt-3.5-turbo-instruct"
CHAT_MODEL="gpt-3.5-turbo"

## 1. Key abstractions in LangChain

| Abstracted Components | Input Type                                | Output Type           |
|-----------------------|-------------------------------------------|-----------------------|
| Prompt                | Dictionary                                | PromptValue           |
| LLM                   | string, list of messages or a PromptValue | string, message       |
| ChatModel             | string, list of messages or a PromptValue | string, ChatMessage   |
| OutputParser          | The output of an LLM or ChatModel         | Depends on the parser |

### 1.1 LLM and ChatModel

The language model is the core of LangChain. There are two types: 

- `LLM`: a language model that takes a string as input and returns a string.
- `ChatModel`: a language model that takes a list of messages or a string as input and returns a message or a string.

Both `LLM` and `ChatModel` provide two methods to interact with the user:

- `predict`: takes in a string, returns a string.
- `predict_messages`: takes in a list of messages, returns a message.

The cells below use the more general `invoke` method, which accepts any of the input types listed in the table above.

The most significant difference between the two is that a ChatModel is fine-tuned for conversation, while a plain LLM simply continues (completes) the text you give it.


In [67]:
# some output utilities 
def print_with_type(res):
    print(f"{type(res)} : {res}")


In [68]:
from langchain_openai import OpenAI

# LLM model

llm = OpenAI(temperature=0, model=MODEL)
qtext = "hello! my name is xu wei, nice to meet you! could you tell me something about large language models"
res = llm.invoke(qtext)
print_with_type(res) # the LLM simply continues (completes) the text in qtext.

In [None]:
# ChatModel
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage

chat_model = ChatOpenAI(temperature=0, model=CHAT_MODEL)
qtext = "hello! my name is xuwei, nice to meet you! could you tell me something about langchain"

messages = []
messages.append(HumanMessage(content=qtext))
res = chat_model.invoke(messages)

print_with_type(res)

messages.append(res)

<class 'langchain_core.messages.ai.AIMessage'> : content='Hello Xuwei! Nice to meet you too. Langchain is a blockchain platform that focuses on language learning and education. It aims to provide a decentralized and secure environment for language learners to connect with teachers, access learning materials, and track their progress. The platform uses blockchain technology to ensure transparency, immutability, and security of user data and transactions. Langchain also offers features such as smart contracts for lesson scheduling and payment, as well as a community-driven marketplace for language services. Overall, Langchain is designed to revolutionize the way people learn languages by leveraging the power of blockchain technology.'


Building the message list by hand is tedious; `ChatMessageHistory` offers a friendlier API. 

In [None]:
# a simpler way to manage messages
from langchain.memory import ChatMessageHistory
history = ChatMessageHistory()

history.add_user_message("hi!")
history.add_ai_message("whats up?")
history.add_user_message("nothing much, you?")

res = chat_model.invoke(history.messages)
print_with_type(res)


<class 'langchain_core.messages.ai.AIMessage'> : content='Just here to chat and help with anything you need!'


In [None]:
# remembering the chat history and context

qtext = "what is its application?"
messages.append(HumanMessage(content=qtext))  ## provide the chat history as context
res = chat_model.invoke(messages)
print_with_type(res)
messages.append(res)  ## remember the history

<class 'langchain_core.messages.ai.AIMessage'> : content="The application of Langchain is primarily in the field of language learning and education. Users can utilize the platform to connect with language teachers and tutors, access learning materials and resources, schedule lessons, track their progress, and make payments securely using blockchain technology. \n\nSome key applications of Langchain include:\n\n1. Language Learning: Users can access a wide range of language courses, lessons, and resources to improve their language skills in a decentralized and secure environment.\n\n2. Teacher-Student Matching: Langchain allows users to find and connect with language teachers and tutors based on their specific needs and preferences.\n\n3. Lesson Scheduling: Users can use smart contracts on the platform to schedule and manage their language lessons with teachers, ensuring transparency and efficiency in the process.\n\n4. Payment and Transactions: Langchain enables secure and transparent 

### 1.2 Prompt templates

LangChain provides `PromptTemplate` to help format prompts.

The plainest prompt is a `string`. More often, a prompt consists of several `Messages` of different types, each carrying a `role` and the prompt text as its `content`.

LangChain has four built-in roles, and you can also define your own custom roles.

- `HumanMessage`: A ChatMessage coming from a human/user.
- `AIMessage`: A ChatMessage coming from an AI/assistant.
- `SystemMessage`: A ChatMessage coming from the system.
- `FunctionMessage`: A ChatMessage coming from a function call.

#### Simple template

In [None]:
# Prompt Template
from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template("What is a good name for a company that makes {product}?")
input_prompt = prompt.format(product="candies")

print_with_type(input_prompt)


<class 'str'> : What is a good name for a company that makes candies?
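To make the idea of a template concrete, here is a minimal sketch using only Python's standard library. It is not how `PromptTemplate` is implemented; it only illustrates the two things a format-style template needs to do: find its placeholders and fill them in. The helper name `template_variables` is hypothetical.

```python
from string import Formatter


def template_variables(template):
    """Return the placeholder names that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


template = "What is a good name for a company that makes {product}?"
print(template_variables(template))
print(template.format(product="candies"))
```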


#### Chat prompt template

In [None]:
# Chat Template (a list of templates in a chat prompt template)

from langchain.prompts.chat import ChatPromptTemplate

# format chat message prompt
sys_template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", sys_template),
    ("human", human_template),
])
chat_input = chat_prompt.format_messages(input_language="English", output_language="Chinese", text="I love programming.")

print_with_type(chat_input)

<class 'list'> : [SystemMessage(content='You are a helpful assistant that translates English to Chinese.'), HumanMessage(content='I love programming.')]


#### Using template in the chat model

In [None]:
# format messages with PromptTemplate with translator as an example

chat_input = chat_prompt.format_messages(input_language="English", output_language="Chinese", text=qtext)
print_with_type(chat_input)
print_with_type(chat_model.invoke(chat_input))

messages = chat_input + messages  ## the system message must be at the beginning
print_with_type(messages)

res = chat_model.invoke(messages)
print_with_type(res)


<class 'list'> : [SystemMessage(content='You are a helpful assistant that translates English to Chinese.'), HumanMessage(content='what is its application?')]
<class 'langchain_core.messages.ai.AIMessage'> : content='它的应用是什么？'
<class 'list'> : [SystemMessage(content='You are a helpful assistant that translates English to Chinese.'), HumanMessage(content='what is its application?'), HumanMessage(content='hello! my name is xuwei, nice to meet you! could you tell me something about langchain'), AIMessage(content='Hello Xuwei! Nice to meet you too. Langchain is a blockchain platform that focuses on language learning and education. It aims to provide a decentralized and secure environment for language learners to connect with teachers, access learning materials, and track their progress. The platform uses blockchain technology to ensure transparency, immutability, and security of user data and transactions. Langchain also offers features such as smart contracts for lesson scheduling and paym

### 1.3 Chaining Components together

Using an LLM in isolation is fine for simple applications, but more complex applications require chaining LLMs, either with each other or with other components. 
In LangChain, most of the key abstractions above are `Runnable` objects, and we can **chain** them together to build applications. 

LangChain makes chaining powerful through the **LangChain Expression Language (LCEL)**, which supports:

- Async, batch, and streaming support: any chain constructed in LCEL automatically has full sync, async, batch, and streaming support. 
- Fallbacks: network failures and the non-deterministic nature of LLMs mean your application needs to handle errors gracefully. With LCEL, you can easily attach fallbacks to any chain.
- Parallelism: since LLM applications involve (sometimes long) API calls, it often becomes important to run things in parallel. With LCEL syntax, any components that can be run in parallel automatically are.
- LangSmith tracing integration: (for debugging, see below).

In this lab, we only demonstrate the simplest form of chaining.
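The `|` syntax can look magical at first. The toy class below is a sketch of the general idea, written with plain Python operator overloading; it is not LangChain's `Runnable` implementation, and the names `Step`, `toy_prompt`, `toy_model`, and `toy_parser` are made up for this illustration.

```python
class Step:
    """A toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def batch(self, values):
        # Apply the same step (or whole chain) to every input in turn.
        return [self.invoke(v) for v in values]

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three toy stages standing in for prompt, model, and output parser.
toy_prompt = Step(lambda d: f"Translate to {d['language']}: {d['text']}")
toy_model = Step(lambda text: {"content": text.upper()})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_model | toy_parser
print(toy_chain.invoke({"language": "Chinese", "text": "hello"}))
```

Because the composed object is itself a `Step`, methods such as `batch` come for free on the whole chain, which is the same design idea that lets an LCEL chain expose one uniform interface.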

In [None]:
# More abstractions: bundling prompt and the chat_model into a chain

translate_chain = chat_prompt | chat_model
qtext = "this is input to a chain of chat model and chat prompt."
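translate_chain.invoke({
    "input_language": "English", 
    "output_language": "Chinese", 
    # Pass the string itself. Writing {qtext} would build a one-element set,
    # which is what produced the {'...'} braces in the recorded output below.
    "text": qtext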
    })

AIMessage(content="{'这是输入到一系列聊天模型和聊天提示中的内容。'}")
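The braces and quotes inside the recorded translation come from how the input was passed: in Python, `{qtext}` as a dict value is a set literal containing the string, not the string itself. The short sketch below reproduces the effect with plain string formatting (no LangChain involved), so the difference is easy to see.

```python
qtext = "I love programming."

as_set = {qtext}       # a one-element set, not a string
as_string = qtext      # the string itself

template = "Translate: {text}"
print(template.format(text=as_set))     # Translate: {'I love programming.'}
print(template.format(text=as_string))  # Translate: I love programming.
```

Passing `qtext` directly keeps the extra braces out of the prompt.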

### 1.4 Output parser

Language models output text. But many times you may want to get more structured information than just text back. This is where output parsers come in.
LangChain provides several commonly used output parsers, such as the [list parser](https://python.langchain.com/docs/modules/model_io/output_parsers/comma_separated), [datetime parser](https://python.langchain.com/docs/modules/model_io/output_parsers/datetime) and [enum parser](https://python.langchain.com/docs/modules/model_io/output_parsers/enum).
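As a rough picture of what a list parser does, the function below turns comma-separated model output into a Python list. It is a hand-written sketch under simple assumptions (items never contain commas), not the library's parser; `parse_comma_separated` is a hypothetical name.

```python
def parse_comma_separated(text):
    """Split model output such as 'red, green, blue' into a list of strings."""
    return [item.strip() for item in text.split(",") if item.strip()]


print(parse_comma_separated("red, green ,blue"))
```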

In [None]:
# a simple parser
# StrOutputParser converts the chat message to a string.

from langchain_core.output_parsers import StrOutputParser
output_parser = StrOutputParser()

stdoutchain = chat_prompt | chat_model | output_parser

qtext = "this is input to a chain of chat model and chat prompt."
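stdoutchain.invoke({
    "input_language": "English", 
    "output_language": "Chinese", 
    # Pass the string itself. Writing {qtext} would build a one-element set,
    # which is what produced the {'...'} braces in the recorded output below.
    "text": qtext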
    })

"{'这是输入到一系列聊天模型和聊天提示中的内容。'}"

#### From Results to a Python Object
Here we demonstrate a more powerful [pydantic parser](https://python.langchain.com/docs/modules/model_io/output_parsers/pydantic) as an example.

In [None]:
from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain.pydantic_v1 import BaseModel, Field

class Professor(BaseModel):
    name: str = Field(description="name of the Professor")
    publication_list: List[str] = Field(description="the list of the professor's publications.")

parser = PydanticOutputParser(pydantic_object=Professor)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

professor_chain = prompt | llm | parser
query = "tell me about professor Wei Xu."
output = professor_chain.invoke({
    "query": query  # pass the string itself; {query} would wrap it in a set
    })
print_with_type(output)


<class '__main__.Professor'> : name='Wei Xu' publication_list=['Xu, W., & Li, S. (2019). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.', 'Xu, W., & Li, S. (2018). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.', 'Xu, W., & Li, S. (2017). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.']
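Conceptually, a structured parser takes the model's JSON text and checks it against a schema before building an object. The sketch below does this with the standard library only; it is not the `PydanticOutputParser` implementation, and `ProfessorRecord` and `parse_professor` are hypothetical names used for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ProfessorRecord:
    name: str
    publication_list: List[str]


def parse_professor(raw_text):
    """Parse JSON text into a ProfessorRecord, rejecting malformed output."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    publications = data.get("publication_list")
    if not isinstance(name, str):
        raise ValueError("'name' must be a string")
    if not isinstance(publications, list) or not all(
        isinstance(p, str) for p in publications
    ):
        raise ValueError("'publication_list' must be a list of strings")
    return ProfessorRecord(name=name, publication_list=publications)


raw = '{"name": "Wei Xu", "publication_list": []}'
print(parse_professor(raw))
```

Note that such a check only validates the shape of the output. A well-formed list of invented publications passes just as easily as a correct one.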


In [None]:
# Using the chat model

professor_chat_chain = prompt | chat_model | parser
output = professor_chat_chain.invoke({
    "query": query  # pass the string itself; {query} would wrap it in a set
    })
print_with_type(output)

<class '__main__.Professor'> : name='Wei Xu' publication_list=[]


In [None]:
#### YOUR TASK ####
# see how langchain organizes the input to construct the result.
print(prompt)

input_variables=['agent_scratchpad', 'input', 'tool_names', 'tools'] template='Answer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}'


You will see that the publication list contains little real information and a lot of hallucination.  Next, we show how to address these problems.

## 2. Adding more context

### 2.1 Retrievers

Many LLM applications require user-specific data that is not part of the model's training set, like the example above : )
The primary way of accomplishing this is through **Retrieval Augmented Generation (RAG)**. In this process, external data is retrieved and then passed to the LLM during the generation step. A `Retriever` is an interface that returns documents given an unstructured query; it is used to provide related content to the LLM.

LangChain provides all the building blocks for RAG applications, from simple to complex, including document loaders, text embedding models and web searches.  We will introduce these in Lab 4.  Here, we only use two very basic retrievers that do web search and local file access.  

- web search: https://python.langchain.com/docs/modules/data_connection/retrievers/web_research
- local file: https://python.langchain.com/docs/modules/data_connection/document_loaders/ 
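The retrieve-then-generate flow can be sketched without any external service. The toy below ranks documents by simple word overlap and places the best match in front of the question; real retrievers use search engines or embeddings instead. All names and the two sample documents are made up for illustration.

```python
def score(query, document):
    """Count how many distinct query words also occur in the document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Place the retrieved text in front of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer based on the following context:\n{context}\nQuery: {query}"


# Made-up documents, used only for illustration.
docs = [
    "the cafeteria opens at seven in the morning",
    "the library closes at ten in the evening",
]
print(build_prompt("when does the library close", docs))
```

The resulting string is what would be sent to the model in the generation step.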

In [None]:
# Using the search API

from langchain.utilities import SerpAPIWrapper

search = SerpAPIWrapper()
results = search.run("Nvidia")
print_with_type(results)

<class 'list'> : [{'title': "Nvidia's latest AI chip will cost more than $30,000, CEO says", 'link': 'https://www.cnbc.com/2024/03/19/nvidias-blackwell-ai-chip-will-cost-more-than-30000-ceo-says.html', 'source': 'CNBC', 'date': '20 hours ago', 'thumbnail': 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcR6LxSbKEwRyXV9-pE8ICViYyc1H64oyVHC21G1EKY-EztxuvCyHSMv2X6iRg&usqp=CAI&s=10'}, {'title': 'Nvidia CEO Says Partnership with Vertiv Will Help with Power Issue', 'link': 'https://www.barrons.com/livecoverage/nvidia-gtc-ai-conference/card/nvidia-ceo-says-partnership-with-vertiv-will-help-with-power-issue-tNc5KZv6G2M4E4m4RVDW', 'source': "Barron's", 'date': '18 hours ago', 'thumbnail': 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQPFfV0YE_8i1A7TuATTnmpmPOR7-DeMYtUANS-yfY&usqp=CAI&s'}, {'title': 'Nvidia Is Now Competing Mostly With Itself—and AI Fatigue', 'link': 'https://www.wsj.com/tech/ai/nvidia-is-now-competing-mostly-with-itselfand-ai-fatigue-326a7f54', 'source': 'WSJ', '

Let's put the search and LLM together.

In [None]:
from langchain.schema.runnable import RunnablePassthrough

class News(BaseModel):
    title: List[str] = Field(description="title list of the news")
    brief_desc: List[str] = Field(description="brief description of the corresponding news")

parser = PydanticOutputParser(pydantic_object=News)

prompt = PromptTemplate(
    template="Answer the user query based on the following context: \n{context}\n{format_instructions}\nQuery: {query}",
    input_variables=["context", "query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

llm.temperature = 0

search = SerpAPIWrapper()
setup_and_retrieval = {
        "context": search.run,  # passing a retriever
        "query": RunnablePassthrough()
}
websearch_chain = setup_and_retrieval | prompt | llm | parser

res = websearch_chain.invoke("tell me about the following companies: nvidia, AMD, google and microsoft, and write a brief summary for each")

print_with_type(res)

<class '__main__.News'> : title=['Nvidia', 'AMD', 'Google', 'Microsoft'] brief_desc=['Nvidia Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware.', 'AMD is a leading supplier of graphics processors.', 'Google is a multinational technology company that specializes in internet-related services and products.', 'Microsoft is a multinational technology company that develops, manufactures, licenses, supports, and sells computer software, consumer electronics, and personal computers.']
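In the cell above, a plain dict sits at the start of the chain. LCEL treats each value in that dict as a step that receives the same input and collects the results under the same keys, so the prompt receives both `context` and `query`. The sketch below imitates that behaviour in plain Python; it runs sequentially, and `run_mapping`, `fake_search`, and `passthrough` are hypothetical stand-ins, not LangChain functions.

```python
def run_mapping(mapping, value):
    """Call every function in `mapping` with the same input; collect by key."""
    return {key: func(value) for key, func in mapping.items()}


def fake_search(query):
    # Stand-in for a real search call; returns a canned string.
    return f"search results for: {query}"


def passthrough(value):
    # Returns the input unchanged.
    return value


setup = {"context": fake_search, "query": passthrough}
print(run_mapping(setup, "nvidia"))
```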


### 2.2 Debugging and Logging

In [None]:
# Debugging and logging: verbose mode
from langchain.globals import set_verbose
set_verbose(True)

# Try rerunning the previous example to see the verbose output.

class Professor(BaseModel):
    name: str = Field(description="name of the Professor")
    publication_list: List[str] = Field(description="the list of the professor's publications.")

parser = PydanticOutputParser(pydantic_object=Professor)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

professor_chain = prompt | llm | parser
query = "tell me about professor Wei Xu."
output = professor_chain.invoke({
    "query": query  # pass the string itself; {query} would wrap it in a set
    })
print_with_type(output)


<class '__main__.Professor'> : name='Wei Xu' publication_list=['Xu, W., & Li, S. (2019). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.', 'Xu, W., & Li, S. (2018). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.', 'Xu, W., & Li, S. (2017). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.']


In [None]:
set_verbose(False)

In [None]:
# Debugging and logging: debug mode
from langchain.globals import set_debug
set_debug(True)

# Try rerunning the previous example to see the debug output.
from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain.pydantic_v1 import BaseModel, Field

class Professor(BaseModel):
    name: str = Field(description="name of the Professor")
    publication_list: List[str] = Field(description="the list of the professor's publications.")

parser = PydanticOutputParser(pydantic_object=Professor)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

professor_chain = prompt | llm | parser
query = "tell me about professor Wei Xu."
output = professor_chain.invoke({
    "query": query  # pass the string itself; {query} would wrap it in a set
    })
print_with_type(output)

[32;1m[1;3m[chain/start][0m [1m[1:chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[1:chain:RunnableSequence > 2:prompt:PromptTemplate] Entering Prompt run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[1:chain:RunnableSequence > 2:prompt:PromptTemplate] [1ms] Exiting Prompt run with output:
[0m{
  "lc": 1,
  "type": "constructor",
  "id": [
    "langchain",
    "prompts",
    "base",
    "StringPromptValue"
  ],
  "kwargs": {
    "text": "Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}\nthe object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.

In [None]:
set_debug(False)

In [None]:
# Debugging and logging: tracing 
# Add LANGCHAIN_TRACING_V2="true" in your environment (.env)
# Also make sure that you have LANGCHAIN_API_KEY set in your environment

# Try rerunning the previous example and go to https://smith.langchain.com/ to see the traces. 

In [None]:
#### YOUR TASK ####
# retrieve the information and fix the query results about Prof. Xu, generating the correct Professor object.
# Note that you do not have to get a perfect answer from the LLM in this lab.  (if the answer is not perfect, please analyze and debug it in the next cell.)
from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain.pydantic_v1 import BaseModel, Field

class Professor(BaseModel):
    name: str = Field(description="name of the Professor")
    publication_list: List[str] = Field(description="the list of the professor's publications.")

parser = PydanticOutputParser(pydantic_object=Professor)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

professor_chain = prompt | llm | parser
query = "tell me about professor Wei Xu."
output = professor_chain.invoke({
    "query": query  # pass the string itself; {query} would wrap it in a set
    })
print_with_type(output)

<class '__main__.Professor'> : name='Wei Xu' publication_list=['Xu, W., & Li, S. (2019). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.', 'Xu, W., & Li, S. (2018). A survey on deep learning based natural language processing. Neurocomputing, 396, 354-377.']


In [None]:
#### YOUR TASK ####
# Analyze the answer. If it is not correct, note the point at which the answers start to go wrong.
# It failed to list the publications of Professor Wei Xu; all the publications it lists are wrong.

## 3. Smarter workflow: Agents

In ``Chains``, the sequence of actions is hardcoded. In ``Agents``, a language model is used as a reasoning engine to determine which actions to take and in which order.

The key components of an ``Agent`` include:

1. Tools: descriptions of the tools available to the agent. Each tool has two key parts: 

    - callable function: the function that gives the agent access to a capability, and 
    - description: the clue that tells the agent which tool to use.


2. User input: the high-level objective.

3. Intermediate steps: any (action, tool output) pairs previously executed in order to achieve the user input.

LangChain provides several [different types of agents](https://python.langchain.com/docs/modules/agents/agent_types/). In this class, we show the simplest and most common one, the [ReAct Agent](https://arxiv.org/pdf/2210.03629.pdf).
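The three components above fit together in a simple loop. The sketch below is a toy version of that loop with a scripted function standing in for the language model, so it runs without any API key. It is not LangChain's agent executor; `run_agent`, `scripted_model`, and `count_letters` are hypothetical names, and the text format it parses is a simplified assumption.

```python
import re


def run_agent(model, tools, question, max_steps=5):
    """Loop: ask the model, run the tool it names, append the observation."""
    scratchpad = ""
    for _ in range(max_steps):
        reply = model(question, scratchpad)
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)", reply)
        action_input = re.search(r"Action Input:\s*(.*)", reply)
        if not action or not action_input:
            return "Agent stopped: could not parse the model reply."
        name = action.group(1).strip()
        if name in tools:
            observation = tools[name](action_input.group(1).strip())
        else:
            observation = f"{name} is not a valid tool, try one of {list(tools)}."
        # Record the step so the next model call can see it.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    return "Agent stopped due to iteration limit."


def count_letters(sentence):
    """Count alphabetic characters, ignoring spaces and punctuation."""
    return sum(c.isalpha() for c in sentence)


def scripted_model(question, scratchpad):
    """Stand-in for an LLM: first requests a tool, then reads the observation."""
    if "Observation:" not in scratchpad:
        return (
            "Thought: I should count.\n"
            "Action: count_letters\n"
            "Action Input: i love yao class"
        )
    observation = scratchpad.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I now know the final answer\nFinal Answer: {observation}"


print(run_agent(scripted_model, {"count_letters": count_letters}, "how many letters?"))
```

The `max_steps` bound matters: a model that never produces a final answer would otherwise loop forever.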






### 3.1 Letter counting example

Try the following very simple example, and see if the LLM can get it right.

In [None]:
llm.invoke("how many letters in sentence ‘i love yao class? without counting space")

'\n\n17'
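The recorded answer can be checked directly in Python. Counting only alphabetic characters gives 1 + 4 + 3 + 5 letters for the four words:

```python
sentence = "i love yao class"
letter_count = sum(1 for c in sentence if c.isalpha())
print(letter_count)  # 13
```

So the model's answer of 17 is wrong.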

Now let's fix the above problem using an Agent.  Agents can use tools, so let's first create a very simple tool.

*Note that the docstrings of tools are very important when developing AI tools: LangChain uses them as the tool description. They are NOT optional!*

In [None]:
from langchain.agents import tool

@tool
def get_sentence_length(sentence: str) -> int:
    """Returns the length of the input."""
    return sum(c.isalpha() for c in sentence)

tools = [ get_sentence_length ]

print(tools)

[StructuredTool(name='get_sentence_length', description='get_sentence_length(sentence: str) -> int - Returns the length of the input.', args_schema=<class 'pydantic.v1.main.get_sentence_lengthSchema'>, func=<function get_sentence_length at 0x7f2ca5feea70>)]
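The printout shows that the tool's description is built from the function signature and its docstring. The toy decorator below sketches how such metadata can be captured in plain Python; it is not the implementation of LangChain's `@tool`, and `SimpleTool` is a hypothetical class written for this illustration.

```python
import inspect


class SimpleTool:
    """Toy container holding what an agent needs to know about a tool."""

    def __init__(self, func):
        if not func.__doc__:
            raise ValueError("a tool needs a docstring to describe itself")
        self.func = func
        self.name = func.__name__
        # Signature plus docstring, similar in spirit to the printout above.
        self.description = (
            f"{self.name}{inspect.signature(func)} - {func.__doc__.strip()}"
        )

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


@SimpleTool
def count_letters(sentence: str) -> int:
    """Returns the number of alphabetic characters in the sentence."""
    return sum(c.isalpha() for c in sentence)


print(count_letters.name)
print(count_letters.description)
print(count_letters("i love yao class"))
```

Since the description is all the model sees when choosing a tool, a docstring that states precisely what the function computes makes the right choice more likely.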


In [None]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ( "system", "You are very powerful assistant who can use tools, but bad at calculating lengths of sentences.", 
         ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"), # used to store the previous agent tool invocations and the corresponding tool outputs. 
    ]
)

In [None]:
from langchain.agents import initialize_agent

agent_chain = initialize_agent(tools, 
                               llm, 
                               agent="zero-shot-react-description", 
                               prompt_template=prompt, 
                               verbose=False
                               )

agent_chain.invoke({"input": "how many letters in sentence ‘i love yao class'? without counting space"})


{'input': "how many letters in sentence ‘i love yao class'? without counting space",
 'output': '13'}

### Your Task: Create an auto-web-search AI Agent

In this exercise, you are required to implement a web-search AI agent that can search for anything you ask and return a summary of fewer than 100 words.

In [None]:
from langchain.agents import load_tools #, create_react_agent, AgentExecutor

parser = PydanticOutputParser(pydantic_object=News)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, helping the users search the web and write summary for the user's interested topic: {keyword}",
        ),
        ("user", "{keyword}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"), # used to store the previous agent tool invocations and the corresponding tool outputs. 
    ]
)

@tool
def summary_length_checker(summary: str) -> bool:
    """check whether the summary satisfies the length requirement, which should be less than 100 words, if it is false, please write a shorter summary.
    """
    words = summary.split()  
    word_count = len(words) 
    return word_count < 100

tools = [load_tools(["serpapi"], llm)[0]]

agent_chain = initialize_agent(tools, llm, agent="zero-shot-react-description", prompt_template=prompt, verbose=True)

In [None]:
agent_chain.invoke("tell me the news from tsinghua university within last week?")



> Entering new AgentExecutor chain...
I should search for recent news articles about Tsinghua University.
Action: Search
Action Input: "Tsinghua University news last week"
Observation: ['LATEST NEWS · Feb28. \u200bFour Tsinghua alumni win 2024 Sloan Research Fellowships · Feb28. Beijing Tsinghua Changgung Hospital President Dong Jiahong elected Vice- ...', 'Professor Li Luming appointed Tsinghua University President. Professor Li Luming was appointed President of Tsinghua University and Deputy Secretary of the ...', 'Tsinghua University, one of the most prestigious universities in China, announced in Beijing Sunday it has appointed 28 well-known overseas academics. Sep18.', 'LATEST NEWS · \u200bFour Tsinghua alumni win 2024 Sloan Research Fellowships · Beijing Tsinghua Changgung Hospital President Dong Jiahong elected Vice-President of ...', 'LATEST NEWS · Terrence Curry: Design as learning-by-making · \u200b"French Night" Cultural Salon: Buildi

{'input': 'tell me the news from tsinghua university within last week?',
 'output': 'Agent stopped due to iteration limit or time limit.'}

In [None]:
#### YOUR TASK ####
# Use the agent to find out about Prof. Wei Xu and his publication list, and compare the results with the previous results. Better or worse?
from langchain.agents import load_tools #, create_react_agent, AgentExecutor

parser = PydanticOutputParser(pydantic_object=News)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, helping the users search the web and write summary for the user's interested topic: {keyword}",
        ),
        ("user", "{keyword}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"), # used to store the previous agent tool invocations and the corresponding tool outputs. 
    ]
)

@tool
def summary_length_checker(summary: str) -> bool:
    """check whether the summary satisfies the length requirement, which should be less than 100 words, if it is false, please write a shorter summary.
    """
    words = summary.split()  
    word_count = len(words) 
    return word_count < 100

tools = [load_tools(["serpapi"], llm)[0]]

agent_chain = initialize_agent(tools, llm, agent="zero-shot-react-description", prompt_template=prompt, verbose=True)

agent_chain.invoke("tell me about professor Wei Xu in IIIS, Tsinghua University. Also, list IIIS professor Wei Xu's publications.")



> Entering new AgentExecutor chain...
I should use the search engine to find information about professor Wei Xu and his publications.
Action: Search
Action Input: "professor Wei Xu IIIS Tsinghua University"
Observation: ['I am an assoicate professor at the Institute for Interdisciplinary Information Sciences of Tsinghua University in Beijing. I have a broad research interest ...', 'Wei Xu. Associate Professor, IIIS, Tsinghua University. Verified email at tsinghua.edu.cn - Homepage · Computer Science. ArticlesCited byPublic accessCo ...', 'Associate Professor Institute for Interdisciplinary Information Sciences, Tsinghua University. Office: FIT-4-6005, Tsinghua University, Beijing, China', 'Tsinghua University. Assistant Professor at Institute for Interdisciplinary Information Sciences IIIS. Research Area: Distributed Systems + Machine Learning.', 'It is the third year for his assistant professor career in Tsinghua University. In this three years

{'input': "tell me about professor Wei Xu in IIIS, Tsinghua University. Also, list IIIS professor Wei Xu's publications.",
 'output': 'Professor Wei Xu is an associate professor at the Institute for Interdisciplinary Information Sciences of Tsinghua University in Beijing. He has a broad research interest and has published numerous articles in the field of computer science, distributed systems, machine learning, and human-computer interaction. Some of his notable publications include "Large-scale system problem detection by mining console logs" and "Interaction design and metrics for robotic systems".'}

### 3.2 Using the LangChain Hub

In [None]:
# AI-powered web search application

from langchain_openai import OpenAI
from langchain import hub
from langchain.agents import load_tools, create_react_agent, AgentExecutor

search_query = "What is the whether of today's Beijing?  give the temperature in celcius."

# Completion model; temperature=0 keeps the output as deterministic as possible.
llm = OpenAI(temperature=0, verbose=True, model=MODEL)
# SerpAPI-backed search tool (reads its API key from the environment).
tools = load_tools(["serpapi"], llm)

# Pull the ReAct prompt from the LangChain hub and build the agent from it.
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": search_query})



> Entering new AgentExecutor chain...
I should use a search engine to find the current weather in Beijing.
Action: Search
Action Input: "Beijing weather"
{'type': 'weather_result', 'temperature': '45', 'unit': 'Fahrenheit', 'precipitation': '0%', 'humidity': '33%', 'wind': '8 mph', 'location': 'Beijing, China', 'date': 'Wednesday 11:00 PM', 'weather': 'Clear'}
45 degrees Fahrenheit is not the temperature in Celsius, I should convert it.
Action: Convert
Action Input: 45 Fahrenheit to Celsius
Convert is not a valid tool, try one of [Search].
I should use a search engine to find a conversion tool.
Action: Search
Action Input: "45 Fahrenheit to Celsius"
{'type': 'unit_converter', 'unit_type': 'Temperature', 'formula': '(45°F − 32) × 5/9 = 7.222°C'}
7.222 degrees Celsius is the temperature in Beijing right now.
Final Answer: 7.222 degrees Celsius

> Finished chain.


{'input': "What is the whether of today's Beijing?  give the temperature in celcius.",
 'output': '7.222 degrees Celsius'}
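
In the trace above, the agent first tried a `Convert` tool that does not exist and had to fall back to a second search. A minimal sketch of a function that could back such a tool follows. The name `fahrenheit_to_celsius` and its string-in, string-out interface are assumptions made for illustration and are not part of the lab code; registering it with the agent is only hinted at in a comment because it is not exercised here.

```python
import re


def fahrenheit_to_celsius(text: str) -> str:
    """Convert the first number found in `text` from Fahrenheit to Celsius.

    Hypothetical helper: an agent passes tool input as a string such as
    "45 Fahrenheit to Celsius", so the first number is extracted from it.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    if match is None:
        return "Could not find a temperature in the input."
    fahrenheit = float(match.group())
    celsius = (fahrenheit - 32) * 5 / 9
    return f"{celsius:.3f} degrees Celsius"


# Assumption: the function could be exposed to the agent through a wrapper
# such as Tool(name="Convert", func=fahrenheit_to_celsius, description="...")
# and appended to the tool list before the agent is built.
print(fahrenheit_to_celsius("45 Fahrenheit to Celsius"))
```

With a tool like this available, the agent would not need a second web search just to do arithmetic.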

### 3.3 Explore built-in tools

LangChain provides a collection of built-in tools. For example, we can use the Wikipedia tool to find out what Prof. Yao's most significant scientific contribution to computer science is.

You can read more in the tools documentation at https://python.langchain.com/docs/modules/agents/tools/. The key attributes of a tool are:

- tool.name
- tool.description
- tool.args

You can find a list of useful tools at https://python.langchain.com/docs/integrations/tools/.
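
The `name` and `description` of a tool matter because a ReAct-style agent only sees its tools through text placed in the prompt. The sketch below imitates that rendering with a hypothetical `SimpleTool` dataclass (a stand-in for illustration, not a LangChain class); the exact format LangChain produces may differ between versions.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleTool:
    """Hypothetical stand-in exposing the three attributes listed above."""
    name: str
    description: str
    args: dict = field(default_factory=dict)


def render_tools(tools: list[SimpleTool]) -> tuple[str, str]:
    """Return (one 'name: description' line per tool, comma-separated names)."""
    listing = "\n".join(f"{t.name}: {t.description}" for t in tools)
    names = ", ".join(t.name for t in tools)
    return listing, names


tools_demo = [
    SimpleTool("Search", "Look up current information on the web.", {"query": "string"}),
    SimpleTool("wikipedia", "Look up a topic on Wikipedia.", {"query": "string"}),
]
listing, names = render_tools(tools_demo)
print(listing)
print(f"Action must be one of [{names}]")
```

A vague description gives the model little basis for choosing between tools, so it is worth reading `tool.description` before adding a tool to an agent.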

In [None]:
%pip install  wikipedia

Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Note: you may need to restart the kernel to use updated packages.


In [None]:
from langchain.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
print_with_type(wikipedia.run("andrew yao"))

print(wikipedia.name)  # the tool name



<class 'str'> : Page: Andrew Yao
Summary: Andrew Chi-Chih Yao (Chinese: 姚期智; pinyin: Yáo Qīzhì; born December 24, 1946) is a Chinese computer scientist and computational theorist. He is currently a professor and the dean of Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University. Yao used the minimax theorem to prove what is now known as Yao's Principle.
Yao was a naturalized U.S. citizen, and worked for many years in the U.S. In 2015, together with Yang Chen-Ning, he renounced his U.S. citizenship and became an academician of the Chinese Academy of Sciences.

Page: Dolev–Yao model
Summary: The Dolev–Yao model, named after its authors Danny Dolev and Andrew Yao, is a formal model used to prove properties of interactive  cryptographic protocols.



Page: Yao's Millionaires' problem
Summary: Yao's Millionaires' problem is a secure multi-party computation problem introduced in 1982 by computer scientist and computational theorist Andrew Yao. The problem discusse

In [None]:
#### YOUR TASK ####
# Use the Wikipedia tool to write a summary of the main scientific contributions of Andrew Yao, the computer scientist.
from langchain.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

tools = [load_tools(["wikipedia"], llm)[0]]

agent_chain = initialize_agent(tools, llm, agent="zero-shot-react-description", prompt_template=prompt, verbose=True)

agent_chain.invoke("Summarize the main scientific contribution of Andrew Yao.")




> Entering new AgentExecutor chain...
Andrew Yao is a well-known computer scientist and mathematician, so I should use Wikipedia to find information about his scientific contributions.
Action: wikipedia
Action Input: Andrew Yao
Observation: Page: Andrew Yao
Summary: Andrew Chi-Chih Yao (Chinese: 姚期智; pinyin: Yáo Qīzhì; born December 24, 1946) is a Chinese computer scientist and computational theorist. He is currently a professor and the dean of Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University. Yao used the minimax theorem to prove what is now known as Yao's Principle.
Yao was a naturalized U.S. citizen, and worked for many years in the U.S. In 2015, together with Yang Chen-Ning, he renounced his U.S. citizenship and became an academician of the Chinese Academy of Sciences.

Page: Dolev–Yao model
Summary: The Dolev–Yao model, named after its authors Danny Dolev and Andrew Yao, is a formal model used to prove prop

{'input': 'Summarize the main scientific contribution of Andrew Yao.',
 'output': "The main scientific contribution of Andrew Yao is his work on the Dolev-Yao model and Yao's Millionaires' problem, which have been important in the field of cryptography and have practical applications in e-commerce and data mining."}

In [None]:
#### YOUR TASK ####
# Write a summary of Tsinghua High School. You can use any tool from the built-in tools page or one found on the Internet.
# Then examine the answer: what could be wrong with it?
tools = [load_tools(["wikipedia"], llm)[0]]

agent_chain = initialize_agent(tools, llm, agent="zero-shot-react-description", prompt_template=prompt, verbose=True)

# The query is in Chinese: "Give me information about the High School Affiliated to Tsinghua University."
agent_chain.invoke("给我关于清华大学附属中学的信息")




> Entering new AgentExecutor chain...
I should always think about what to do
Action: the action to take, should be one of [wikipedia]
Action Input: 清华大学附属中学
Observation: the action to take, should be one of [wikipedia] is not a valid tool, try one of [wikipedia].
Thought: I should always think about what to do
Action: the action to take, should be one of [wikipedia]
Action Input: 清华大学附属中学
Observation: the action to take, should be one of [wikipedia] is not a valid tool, try one of [wikipedia].
Thought: I should always think about what to do
Action: the action to take, should be one of [wikipedia]
Action Input: 清华大学附属中学
Observation: the action to take, should be one of [wikipedia] is not a valid tool, try one of [wikipedia].
Thought: I should always think about what to do
Action: the action to take, should be one of [wikipedia]
Action Input: 清华大学附属中学
Observation: the action to take, should be one of [wikipedi

{'input': '给我关于清华大学附属中学的信息',
 'output': '清华大学附属中学是一所位于北京市海淀区的高级中学，是清华大学的附属中学。它是中国最早的高级中学之一，也是中国最具影响力的中学之一。'}
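
One way to read the trace above: the model copied the format instruction ("the action to take, should be one of [wikipedia]") into the `Action:` field instead of writing a tool name, so each step was rejected as an invalid tool. The visible part of the trace contains no successful Wikipedia observation, which suggests that the final answer (roughly: "The High School Affiliated to Tsinghua University is a senior high school in Haidian District, Beijing. It is one of the earliest senior high schools in China and one of the most influential.") may come from the model's own knowledge rather than from the tool. The sketch below reproduces the validation step with a simplified regular expression; `parse_action` and its pattern are assumptions for illustration, not LangChain's actual parser.

```python
import re

# Simplified pattern for one ReAct step (an assumption for illustration).
ACTION_PATTERN = re.compile(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", re.DOTALL)


def parse_action(llm_output: str, valid_tools: list[str]) -> str:
    """Return the message an executor might feed back for this step."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return "Could not parse the model output."
    tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
    if tool_name not in valid_tools:
        return f"{tool_name} is not a valid tool, try one of [{', '.join(valid_tools)}]."
    return f"OK: call {tool_name} with input {tool_input!r}"


# The step the model produced in the failing run, copied from the trace.
bad_step = (
    " I should always think about what to do\n"
    "Action: the action to take, should be one of [wikipedia]\n"
    "Action Input: 清华大学附属中学"
)
# A well-formed step, as in the Andrew Yao run.
good_step = "I should look this up.\nAction: wikipedia\nAction Input: Andrew Yao"

print(parse_action(bad_step, ["wikipedia"]))
print(parse_action(good_step, ["wikipedia"]))
```

When checking an agent's answer, it helps to confirm that at least one observation in the trace actually came from a tool.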