In [1]:
%%capture
!pip install langchain==0.1.4 openai==1.10.0 langchain-openai langchainhub duckduckgo_search langchain_experimental

In [2]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter Your OpenAI API Key:")

Enter Your OpenAI API Key:··········


# Plan-and-Solve (PS) Prompting

Plan-and-Solve prompting was introduced in a May 2023 paper titled ["Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models"](https://arxiv.org/abs/2305.04091), by Lei Wang et al.

### 🚧 **Why PS Prompting Matters:**

- Zero-shot CoT prompting elicits multi-step reasoning from LLMs without any worked examples. Simply add "Let's think step by step" to the prompt.

- Zero-shot-CoT works well but still suffers from three kinds of failure: calculation errors, missing reasoning steps, and semantic misunderstandings.

- PS prompting is designed to reduce these errors.

### 🔍 **Breaking Down PS Prompting:**

<figure>
  <img src="https://d3i71xaburhd42.cloudfront.net/62176de125738e3b95850d1227bac81fd646b78e/3-Figure2-1.png" alt="Image Description" style="width:80%">
  <figcaption>Example inputs and outputs of GPT-3 with (a) Zero-shot-CoT prompting, (b) Plan-and-Solve (PS) prompting, and (c) answer extraction prompting. While Zero-shot-CoT encourages LLMs to generate multi-step reasoning with “Let’s think step by step”, it may still generate wrong reasoning steps when the problem is complex. Unlike Zero-shot-CoT, PS prompting first asks LLMs to devise a plan to solve the problem by generating a step-by-step plan and carrying out the plan to find the answer.</figcaption>
  <a href="https://ar5iv.labs.arxiv.org/html/2305.04091">Image Source</a>
</figure>

1. 🗺️ **Planning**: The model first breaks down the task into smaller subtasks.

2. 🧩 **Solving**: Then, it works through these subtasks according to the plan.
   - Example: Instead of a vague "Let's think step by step," it uses a more structured approach like, "First, understand the problem and plan. Then, solve it step by step."

### 🚀 **PS+ Prompting - The Enhanced Version:**

- PS+ Prompting is a step up, adding detailed instructions.

- Focuses on avoiding calculation errors and refining reasoning steps.

- Guides the model to "identify relevant variables and numbers" and "compute intermediate results" for better accuracy and detail.
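
The difference between the three prompting styles comes down to the trigger sentence placed after the question. The sketch below builds such prompts in plain Python. The PS and PS+ wordings are recalled from the paper and should be treated as approximate, and `TRIGGERS` and `build_prompt` are hypothetical names used only for illustration.

```python
# Trigger sentences placed after "A:" in a zero-shot prompt.
# Assumption: the PS and PS+ wordings are approximate, not verified quotes.
TRIGGERS = {
    "zero_shot_cot": "Let's think step by step.",
    "ps": (
        "Let's first understand the problem and devise a plan to solve the problem. "
        "Then, let's carry out the plan and solve the problem step by step."
    ),
    "ps_plus": (
        "Let's first understand the problem, extract relevant variables and their "
        "corresponding numerals, and devise a plan. Then, let's carry out the plan, "
        "calculate intermediate results (pay attention to calculation and common sense), "
        "solve the problem step by step, and show the answer."
    ),
}


def build_prompt(question: str, style: str = "ps_plus") -> str:
    """Return a zero-shot prompt: the question on a 'Q:' line, the trigger on an 'A:' line."""
    if style not in TRIGGERS:
        raise ValueError(f"Unknown style {style!r}; expected one of {sorted(TRIGGERS)}")
    return f"Q: {question.strip()}\nA: {TRIGGERS[style]}"


if __name__ == "__main__":
    print(build_prompt("A shop sells pens at 3 for $2. How much do 12 pens cost?", "ps"))
```

Only the trigger changes between styles, so the same question can be sent with each one to compare the reasoning chains a model produces.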

### 💡 **Why PS Prompting is Promising**

- Ensures step-by-step thinking and planning.

- Produces more coherent and precise reasoning chains.

### 📊 **Results Speak for Themselves:**

- Consistently outperforms Zero-shot-CoT across the datasets evaluated in the paper.

- Is comparable to, and sometimes better than, Zero-shot-Program-of-Thought prompting.

- Performs comparably to 8-shot CoT prompting on math reasoning problems.

### 🌟 **Final Takeaway:**

- Offers a structured approach to eliciting detailed and accurate reasoning chains from large language models, making it a promising technique for multi-step reasoning tasks.


# Plan-and-Execute Agents in LangChain 🤖

### 🌟 **Inspiration & Approach:**

- **BabyAGI & "Plan-and-Solve" Roots:** Inspired by the BabyAGI project and the Plan-and-Solve prompting paper.

- First, plan the steps; then, execute them in order.

### 🛠️ **Key Components Explained:**

1. **The Planner:**

   - Crafts a step-by-step strategy.

   - Navigates uncertainties in planning.

2. **The Executor:**
   - Transforms plans into tangible actions.

   - Decides how best to achieve goals.


# Understanding Plan-and-Execute Agents 🤖

### 📋 **How They Work:**

1. **Planning Stage:**

   - Crafts a step-by-step strategy.

   - Handles uncertainties and edge cases.

2. **Execution Stage:**

   - Transforms plans into actions.

   - Chooses how best to reach goals.
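
The two stages above can be sketched as a small control loop. This is a minimal illustration in plain Python, not LangChain's implementation: `plan_and_execute`, `toy_planner`, and `toy_executor` are hypothetical names, and the toy functions stand in for LLM calls so the sketch runs without network access.

```python
from typing import Callable, List, Tuple

# A planner maps a task to an ordered list of steps.
Planner = Callable[[str], List[str]]
# An executor handles one step, given the (step, response) pairs completed so far.
Executor = Callable[[str, List[Tuple[str, str]]], str]


def plan_and_execute(task: str, planner: Planner, executor: Executor) -> str:
    """Plan once, then execute each step in order, passing earlier results along."""
    steps = planner(task)
    history: List[Tuple[str, str]] = []
    for step in steps:
        response = executor(step, history)
        history.append((step, response))
    # The response to the last step is treated as the overall answer.
    return history[-1][1] if history else ""


# Hypothetical stand-ins for the LLM-backed planner and executor.
def toy_planner(task: str) -> List[str]:
    return ["Look up the temperature.", "Take its absolute value.", "Answer the question."]


def toy_executor(step: str, history: List[Tuple[str, str]]) -> str:
    if step.startswith("Look up"):
        return "-12.5"  # pretend search result
    if step.startswith("Take"):
        return str(abs(float(history[-1][1])))
    return f"The absolute value is {history[-1][1]}."


if __name__ == "__main__":
    print(plan_and_execute("What is the absolute value of the temperature?", toy_planner, toy_executor))
```

Keeping the planner and executor behind plain callables makes it easy to swap either one out, for example to test the loop without calling a model.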

### 🎯 **Why They're Special:**

- Better at handling intricate tasks.

- Designed for dependable performance.

- Can manage growing complexity and demands.

### ✅ **Benefits Over Traditional Agents:**

- Manages multi-step tasks and dependencies.

- Handles tasks with several intertwined steps.

### 🤔 **When to Use Them:**

- If your task needs detailed planning and coordination.

- For simpler tasks, other agents might be more efficient.

In [3]:
from langchain_openai import ChatOpenAI
from langchain_experimental.plan_and_execute import PlanAndExecute, load_agent_executor, load_chat_planner
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langchain_core.tools import Tool
from langchain.chains import LLMMathChain

# ⚒️ Define the Tools


In [4]:
llm = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)

llm_math_chain = LLMMathChain.from_llm(llm=llm)

search = DuckDuckGoSearchAPIWrapper()

search_tool = Tool(
    name="Search",
    func=search.run,
    description="Useful for answering questions about current information",
)

calculator_tool = Tool(
    name="Calculator",
    func=llm_math_chain.run,
    description="Useful for when you need to do math",
)

agent_tools = [search_tool, calculator_tool]

# 🕵🏽 Create the Planner and Executor Agents

### 👩‍💼 Implement the planner


In [5]:
model = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)

planner = load_chat_planner(model)

In [9]:
planner.llm_chain.prompt.messages[0].content

"Let's first understand the problem and devise a plan to solve the problem. Please output the plan starting with the header 'Plan:' and then followed by a numbered list of steps. Please make the plan the minimum number of steps required to accurately complete the task. If the task is a question, the final step should almost always be 'Given the above steps taken, please respond to the users original question'. At the end of your plan, say '<END_OF_PLAN>'"

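The planner prompt above asks for a `Plan:` header, a numbered list, and an `<END_OF_PLAN>` marker, which makes the reply easy to split into steps. The sketch below shows one way to do that with a regular expression. It is an illustration only, not the parser LangChain uses, and `parse_plan` is a hypothetical helper name.

```python
import re
from typing import List

END_MARKER = "<END_OF_PLAN>"


def parse_plan(text: str) -> List[str]:
    """Extract the numbered steps from a 'Plan:' style reply."""
    # Drop everything from the end marker onward.
    body = text.split(END_MARKER)[0]
    # Keep only what follows the 'Plan:' header, if one is present.
    _, _, after_header = body.partition("Plan:")
    if after_header:
        body = after_header
    # Split at line-initial markers such as "1. " or "2. ".
    parts = re.split(r"^\s*\d+\.\s+", body, flags=re.MULTILINE)
    # parts[0] is whatever precedes the first marker; the rest are steps.
    return [" ".join(p.split()) for p in parts[1:] if p.strip()]


if __name__ == "__main__":
    sample = (
        "Plan:\n"
        "1. Look up the current temperature in Winnipeg.\n"
        "2. Calculate the absolute value.\n"
        "3. Given the above steps taken, please respond to the users original question.\n"
        "<END_OF_PLAN>"
    )
    for step in parse_plan(sample):
        print(step)
```

Anchoring the pattern to the start of a line keeps numbers inside a step, such as `4.6`, from being mistaken for step markers.
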
### 👨🏽‍💼 Instantiate the AgentExecutor


In [10]:
executor_model = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)

executor = load_agent_executor(executor_model, agent_tools, verbose=True)

agent = PlanAndExecute(planner=planner, executor=executor, verbose=True)

In [11]:
print(executor.chain.agent.llm_chain.prompt.messages[0].prompt.template)

Respond to the human as helpfully and accurately as possible. You have access to the following tools:

Search: Useful for answering questions about current information, args: {{'tool_input': {{'type': 'string'}}}}
Calculator: Useful for when you need to do math, args: {{'tool_input': {{'type': 'string'}}}}

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or Search, Calculator

Provide only ONE action per $JSON_BLOB, as shown:

```
{{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}}
```

Follow this format:

Question: input question to answer
Thought: consider previous and subsequent steps
Action:
```
$JSON_BLOB
```
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
{{
  "action": "Final Answer",
  "action_input": "Final response to human"
}}
```

Begin! Reminder to ALWAYS respond with a valid json blob of a single a
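
The executor prompt above asks the model to reply with a single JSON blob holding an `action` and an `action_input`. The sketch below shows how such a reply could be turned into a Python dict. It is a simplified illustration, not LangChain's own output parser, and `parse_action` is a hypothetical helper name.

```python
import json
import re
from typing import Any, Dict

# Three backticks, built this way so the example does not nest code fences.
FENCE = "`" * 3

# Matches a fenced block (optionally tagged "json") that contains one JSON object.
_BLOB_RE = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE),
    re.DOTALL,
)


def parse_action(llm_output: str) -> Dict[str, Any]:
    """Return the dict with 'action' and 'action_input' from an agent reply."""
    match = _BLOB_RE.search(llm_output)
    # Fall back to treating the whole reply as JSON when no fence is present.
    raw = match.group(1) if match else llm_output
    blob = json.loads(raw)
    if not isinstance(blob, dict) or "action" not in blob or "action_input" not in blob:
        raise ValueError("Expected a JSON object with 'action' and 'action_input' keys")
    return blob


if __name__ == "__main__":
    reply = (
        "Action:\n" + FENCE + "json\n"
        '{\n  "action": "Calculator",\n  "action_input": "abs(4.6)"\n}\n' + FENCE
    )
    print(parse_action(reply))
```

A caller would then dispatch on `blob["action"]`: run the named tool with `blob["action_input"]`, or stop when the action is `"Final Answer"`.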

In [12]:
agent.invoke("What is the absolute value of the current temperature in Winnipeg, MB?")



> Entering new PlanAndExecute chain...
steps=[Step(value='Look up the current temperature in Winnipeg, Manitoba.'), Step(value='Calculate the absolute value of the temperature.'), Step(value="Given the above steps taken, please respond to the user's original question.\n\n")]

> Entering new AgentExecutor chain...
Action:```{"action": "Search", "action_input": "current temperature in Winnipeg, Manitoba"}```




Thought:[32;1m[1;3mAction:
```
{
  "action": "Final Answer",
  "action_input": "The current temperature in Winnipeg, Manitoba is 4.6°C with mainly sunny conditions as observed at Winnipeg Richardson International Airport at 1:00 PM CST on Wednesday, 31 January 2024."
}
```

> Finished chain.
*****

Step: Look up the current temperature in Winnipeg, Manitoba.

Response: The current temperature in Winnipeg, Manitoba is 4.6°C with mainly sunny conditions as observed at Winnipeg Richardson International Airport at 1:00 PM CST on Wednesday, 31 January 2024.

> Entering new AgentExecutor chain...
Action:
```json
{
  "action": "Calculator",
  "action_input": "abs(4.6)"
}
```
Observation: Answer: 4.6
Thought:Action:
```json
{
  "action": "Final Answer",
  "action_input": "The absolute value of the temperature in Winnipeg, Manitoba is 4.6°C."
}
```

> Finished chain.
*****

Step: Calculate the absolute value of th

{'input': 'What is the absolute value of the current temperature in Winnipeg, MB?',
 'output': '{\n  "action": "Final Answer",\n  "action_input": "The current temperature in Winnipeg, Manitoba is 4.6°C with mainly sunny conditions as observed at Winnipeg Richardson International Airport at 1:00 PM CST on Wednesday, 31 January 2024. The absolute value of this temperature is also 4.6°C."\n}'}
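
The `output` field above is itself a JSON string in the executor's action format, so one more decoding step is needed to get the plain-text answer. The sketch below shows one way to do that. `final_answer` is a hypothetical helper name, and the sample dict is made up to mirror the shape of the result above.

```python
import json
from typing import Any, Dict


def final_answer(result: Dict[str, Any]) -> str:
    """Return the plain-text answer from a result dict shaped like the one above."""
    output = result["output"]
    try:
        blob = json.loads(output)
    except json.JSONDecodeError:
        return output  # already plain text
    # Unwrap only the expected "Final Answer" action format.
    if isinstance(blob, dict) and blob.get("action") == "Final Answer" and "action_input" in blob:
        return str(blob["action_input"])
    return output


if __name__ == "__main__":
    sample = {
        "input": "What is the absolute value of the temperature?",
        "output": '{\n  "action": "Final Answer",\n  "action_input": "The absolute value is 4.6°C."\n}',
    }
    print(final_answer(sample))
```

Falling back to the raw string keeps the helper safe to call when the agent returns plain text instead of a JSON blob.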

In [14]:
agent.invoke("When is the next webinar on Direct Policy Optimization from DeepLearning.ai?")



> Entering new PlanAndExecute chain...
steps=[Step(value='Visit the official DeepLearning.ai website or their official social media channels.'), Step(value='Look for any announcements or schedules related to webinars, specifically focusing on "Direct Policy Optimization."'), Step(value="If the information is not directly available, check any related community forums or contact DeepLearning.ai's support or inquiry channels for details."), Step(value="Given the above steps taken, please respond to the user's original question.\n\n")]

> Entering new AgentExecutor chain...
{
  "action": "Search",
  "action_input": "DeepLearning.ai official website"
}

> Finished chain.
*****

Step: Visit the official DeepLearning.ai website or their official social media channels.

Response: {
  "action": "Search",
  "action_input": "DeepLearning.ai official website"
}

> Entering new AgentExecutor chain...
Action:
```
{
  "action": "Search",




Observation: [36;1m[1;3mThe DPO pipeline can be broken down into two main stages: Supervised Fine-tuning (SFT): This is the initial step where the model is fine-tuned on the dataset (s) of interest. Preference Learning: After SFT, the model undergoes preference learning using preference data, ideally from the same distribution as the SFT examples. DeepLearning.AI was founded in 2017 by machine learning and education pioneer Andrew Ng with the mission to grow and connect the global AI community by delivering world-class AI education. DeepLearning.AI teamed up with generative AI specialists from AWS including Chris Fregly, Shelbee Eigenbrode, Mike Chambers, and me to develop and deliver ... Terms and Services Code of Conduct The DeepLearning.AI Community is a Civilized Place for Public Discussion. Zero Tolerance Policy. Topic Replies Views Activity; Community & Events Zero Tolerance Policy. Zero Tolerance Policy. zero-tolerance-policy. 2: 207: January 2, 2024 ... May I post a private 

{'input': 'When is the next webinar on Direct Policy Optimization from DeepLearning.ai?',
 'output': '{\n  "action": "Final Answer",\n  "action_input": "After conducting searches and reviewing the steps taken, it appears that specific information on upcoming webinars focusing on \'Direct Policy Optimization\' hosted by DeepLearning.ai or through other academic, research, and professional sources was not directly found. While DeepLearning.AI, founded by Andrew Ng, is known for its comprehensive AI education offerings and has announced a new course titled \'Generative AI with Large Language Models\' on Coursera, details on webinars specifically about Direct Policy Optimization were not provided. For the most accurate and up-to-date information on such webinars, including dates, times, and registration details, it\'s recommended to regularly check the official DeepLearning.ai website, their official social media channels, or related community forums. Additionally, contacting DeepLearning.