References:
- https://blog.langchain.dev/planning-agents/
- https://blog.langchain.dev/plan-and-execute-agents/
- https://arxiv.org/pdf/2305.04091.pdf
- https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting
- https://promptengineering.org/plan-and-solve-plus-ps-a-prompting-framework-for-enhanced-llm-reasoning/

Source: https://arxiv.org/pdf/2305.04091.pdf

0. Elicit multi-step reasoning by LLMs in a zero-shot approach. We ask LLMs to write a plan to decompose a complex reasoning task into multiple reasoning steps. Furthermore, we introduce detailed instructions to the prompt to avoid obvious errors in the reasoning steps. 
1. template = "Q: [X]. A: [T]"
    - X is input slot
    - T is instruction trigger
2. Zero-shot CoT
    - T = "Let's think step by step"
    - suffers from three pitfalls: calculation errors, missing-step errors, semantic misunderstanding errors
3. Zero-shot PS
    - T = "Let's first understand the problem and devise a plan to solve the problem. Then let's carry out the plan and solve the problem step by step."
4. Zero-shot PS+
    - T = "Let's first understand the problem, extract relevant variables and their corresponding numerals, and devise a plan. Then let's carry out the plan, calculate intermediate results (pay attention to calculation and common sense), solve the problem step by step, and show the answer."
5. Trigger Sentences:
    1. Let's think step by step.
    2. ```python
        import math
        import numpy as np
        # Question: example["question"]
        # Answer this question by implementing a solver() function.
        def solver():
            # Let's write a Python program step by step, and then return the answer
            # Firstly, we need define the following variable:
        ```
    3. `Extract variables` and assign their `corresponding numerals` to these variables first and then solve the problem step by step.
    4. Firstly, `extract variables` and their `corresponding numerals`. Then, `calculate intermediate variables`. Finally, solve the problem step by step.
    5. Let's first understand the problem and `devise a plan` to solve the problem. Then let's `carry out the plan` and solve the problem step by step.
    6. Let's first understand the problem, `extract relevant variables` and their `corresponding numerals`, and `make a plan`. Then, let's `carry out the plan`, `calculate intermediate variables (pay attention to correct numerical calculation and commonsense)`, solve the problem step by step, and show the answer.
    7. Let's first prepare relevant information and make a plan. Then let's answer the question step by step (pay attention to commonsense and logical coherence).
5. Comparison:
    - Zero-shot CoT
    - Zero-shot PoT
    - 8-shot CoT
    - Zero-shot PS (paper)
    - Zero-shot PS+ (paper)
6. Exploring Presence of plans in PS predictions 
    - Objective: check whether each prediction made by PS contains a plan.
    - Experiment: randomly sample 100 data examples and examine the corresponding predictions.
    - Analysis: 90/100 predictions incorporated a plan. This observation indicates the `emergence of strong planning abilities in recent LLMs` such as GPT-3.5 and GPT-4.
7. Improve CoT:
    - prompt format
    - prompt selection
    - prompt ensemble
    - problem decomposition
    - planning
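
The `Q: [X]. A: [T]` template from items 1 to 4 can be sketched in a few lines of Python. This is only an illustration: the trigger strings are copied from the notes above, while the dictionary keys and the `build_prompt` helper are names made up here, not something from the paper or its repository.

```python
# Trigger sentences (T) copied from items 2-4 above.
# The dictionary keys are hypothetical labels chosen for this sketch.
TRIGGERS = {
    "zero_shot_cot": "Let's think step by step",
    "zero_shot_ps": (
        "Let's first understand the problem and devise a plan to solve the "
        "problem. Then let's carry out the plan and solve the problem step "
        "by step."
    ),
    "zero_shot_ps_plus": (
        "Let's first understand the problem, extract relevant variables and "
        "their corresponding numerals, and devise a plan. Then let's carry "
        "out the plan, calculate intermediate results (pay attention to "
        "calculation and common sense), solve the problem step by step, and "
        "show the answer."
    ),
}


def build_prompt(question: str, method: str) -> str:
    """Fill the 'Q: [X]. A: [T]' template: X is the question, T the trigger."""
    trigger = TRIGGERS[method]  # raises KeyError for an unknown method
    return f"Q: {question}. A: {trigger}"


if __name__ == "__main__":
    for name in TRIGGERS:
        print(build_prompt("A shop sells 3 pens for 2 dollars", name))
```

Switching between Zero-shot CoT, PS, and PS+ only changes the trigger sentence; the question slot and the template stay the same.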

Source: https://blog.langchain.dev/plan-and-execute-agents/
1. ReAct "Action Agents". pseudo-code:
    - some user input is received.
    - the agent decides which tool - if any - to use, and what the input to that tool should be.
    - the tool is called with that tool input, and an observation is recorded (this is just the output of calling that tool with that tool input)
    - that history of tool, tool input, and observation is passed back into the agent, and it decides what step to take next.
    - this is repeated until the agent decides it no longer needs to use a tool, and then it responds directly to the user.

2. challenges with above paradigm:
    - user objectives are becoming more complex.
    - developers and organizations are starting to rely on agents in production

3. longer prompt sizes, reasons:
    - as objectives are more complex, more and more past history is being included to keep the agent focussed on the final objective while also allowing it to remember and reason about previous steps
    - as developers try to increase reliability they are including more instructions around how to use tools

4. summary: agents are expected to deliver increasingly complex abilities and increased reliability, and both push prompt sizes up. Enter plan-and-execute to save the day...

5. Plan and execute. pseudo-code:
    - plan steps to take.
    - for step in steps: determine the proper tools or other best course of action to accomplish that step.

6. core agent framework: planner, executor

7. planner: 
    - llm: focussed on returning a plan
    - output parser: parse the raw llm output into a list of strings (each string being a step).

8. executor:
    - Action agent
    - take a high level objective (a single step) and figure out which tools to use to accomplish that (could be done in one step or two).

9. benefits:
    - approach separates out planning from execution - this allows one LLM to focus exclusively on planning, and another to focus on execution
    - easier in the future to swap out these components for smaller fine tuned models
    - because of the separation of concerns, the hope is that these calls can be made to smaller (and therefore faster and cheaper) models.

10. the major downside of this approach is that it takes a lot more calls. 

11. future directions:
    - Better support for long sequences of steps. Right now the previous steps are passed around as a list; as plans get longer and longer, we’ll want to store them in a vectorstore and retrieve intermediate steps.
    - Revisiting plans. Right now there is one planning step at the start, but then that is never revisited. It is likely we will want to have some mechanism for revisiting and adjusting the plan, either every step or as needed.
    - Evaluation. Right now, a lot of these improvements are largely theoretical, or at the very least not benchmarked. We hope to have more rigorous ways of evaluating agent frameworks.
    - Selection of execution chain. Right now, there is a single execution chain. It could easily be the case that you would want multiple execution chains, and the planner can specify which one to use. For example, if you have one execution chain optimized for web research, one for analysis, etc.
    
12. `Note: It should be noted that for many of these future directions we can draw inspiration from existing work. For example, BabyAGI already uses a vectorstore to store intermediate steps and also revisits planning each iteration. The Plan-and-Solve paper does more rigorous evaluation of outputs with benchmarks.`
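
The planner, output parser, and executor from items 5 to 8 can be sketched as below. This is a minimal sketch under stated assumptions, not LangChain's implementation: `parse_plan`, `plan_and_execute`, and the two stub functions are hypothetical names, the planner is assumed to return a numbered list as plain text, and the stubs stand in for real LLM and tool calls.

```python
import re
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (step, result) pairs


def parse_plan(raw: str) -> List[str]:
    """Output parser: turn raw planner text into a list of step strings.

    Assumes steps are written as '1. ...' or '1) ...', one per line.
    """
    steps = []
    for line in raw.splitlines():
        match = re.match(r"^\s*\d+[.)]\s+(.*\S)\s*$", line)
        if match:
            steps.append(match.group(1))
    return steps


def plan_and_execute(
    objective: str,
    planner: Callable[[str], str],
    executor: Callable[[str, History], str],
) -> History:
    """Plan once up front, then hand each step to the executor in order."""
    steps = parse_plan(planner(objective))
    history: History = []
    for step in steps:
        # Previous steps are passed around as a plain list, as in the notes.
        result = executor(step, history)
        history.append((step, result))
    return history


# Stubs standing in for the planner LLM and the action-agent executor.
def fake_planner(objective: str) -> str:
    return "Plan:\n1. Look up the population of France\n2. Divide it by two\n"


def fake_executor(step: str, history: History) -> str:
    return f"done: {step}"


if __name__ == "__main__":
    for step, result in plan_and_execute("demo", fake_planner, fake_executor):
        print(step, "->", result)
```

Because the planner and executor are plain callables here, either one could be swapped for a smaller model without touching the loop, which is the separation-of-concerns benefit listed in item 9.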

Source: https://blog.langchain.dev/planning-agents/


----------
**Plan and execute**

![Image](https://blog.langchain.dev/content/images/size/w1600/2024/02/plan-and-execute.png)

- It consists of two basic components:
    - A planner, which prompts an LLM to generate a multi-step plan to complete a large task.
    - Executor(s), which accept the user query and a step in the plan and invoke 1 or more tools to complete that task.
- Once execution is completed, the agent is called again with a re-planning prompt, letting it decide whether to finish with a response or whether to generate a follow-up plan (if the first plan didn’t have the desired effect).

- This agent design lets us avoid having to call the large planner LLM for each tool invocation. It still is restricted by serial tool calling and uses an LLM for each task since it doesn't support variable assignment.
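
The re-planning step described above can be sketched as a loop around plan execution. This is an assumption-laden illustration rather than the blog's code: `run_with_replanning`, the `(answer, follow_up_steps)` return convention for `replan`, the `max_rounds` cap, and all stub functions are invented for this example.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (step, result) pairs


def run_with_replanning(
    query: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str, str], str],
    replan: Callable[[str, History], Tuple[Optional[str], List[str]]],
    max_rounds: int = 3,
) -> str:
    """Execute a plan, then ask the re-planner to finish or plan again."""
    history: History = []
    steps = plan(query)
    for _ in range(max_rounds):
        for step in steps:
            # The executor sees the user query and one step of the plan.
            history.append((step, execute(query, step)))
        # Re-planning prompt: either a final answer or a follow-up plan.
        answer, steps = replan(query, history)
        if answer is not None:
            return answer
    raise RuntimeError("no final answer within max_rounds")


# Deterministic stubs standing in for LLM and tool calls.
def stub_plan(query: str) -> List[str]:
    return ["search"]


def stub_execute(query: str, step: str) -> str:
    return "found 42" if step == "search again" else "nothing"


def stub_replan(query: str, history: History) -> Tuple[Optional[str], List[str]]:
    if any("42" in result for _, result in history):
        return "42", []
    return None, ["search again"]


if __name__ == "__main__":
    print(run_with_replanning("what is the answer?", stub_plan, stub_execute, stub_replan))
```

Note that the planner is only called between rounds, not for every tool invocation, while the steps inside a round still run one after another.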


----------
**Reasoning without observations**

![Image](https://blog.langchain.dev/content/images/size/w1600/2024/02/rewoo.png)

- This approach removes the need to always use an LLM for each task while still allowing tasks to depend on previous task results.
- It does so by permitting variable assignment in the planner's output.
- Its planner generates a plan list consisting of interleaved "Plan" (reasoning) and "E#" lines.
- The worker node loops through each task and assigns the task output to the corresponding variable.
- The solver integrates all these outputs into a final answer.
- This agent design can be more effective than a naive plan-and-execute agent since each task can have only the required context (its input and variable values).

It still relies on sequential task execution, however, which can create a longer runtime.
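
Variable assignment and the worker loop can be illustrated with a small sketch. The exact line format is an assumption made for this example (`#E1 = ToolName[input]`), and `parse_tasks`, `run_worker`, and the toy tools are hypothetical names; a real planner's output and tool set would differ.

```python
import re
from typing import Callable, Dict, List, Tuple

# Assumed task-line format: "#E1 = ToolName[tool input]".
TASK_RE = re.compile(r"^(#E\d+)\s*=\s*(\w+)\[(.*)\]$")
VAR_RE = re.compile(r"#E\d+")


def parse_tasks(plan_text: str) -> List[Tuple[str, str, str]]:
    """Keep only the "E#" lines, as (variable, tool name, tool input)."""
    tasks = []
    for line in plan_text.splitlines():
        match = TASK_RE.match(line.strip())
        if match:
            tasks.append((match.group(1), match.group(2), match.group(3)))
    return tasks


def run_worker(
    tasks: List[Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Run tasks in order, substituting earlier results into later inputs."""
    results: Dict[str, str] = {}
    for var, tool_name, tool_input in tasks:
        # Replace each known variable; leave unknown ones untouched.
        resolved = VAR_RE.sub(
            lambda m: results.get(m.group(0), m.group(0)), tool_input
        )
        results[var] = tools[tool_name](resolved)
    return results


PLAN = """\
Plan: Find the length of the word.
#E1 = Length[planning]
Plan: Double that length.
#E2 = Double[#E1]
"""

# Toy tools; no LLM is needed to execute either task.
TOOLS = {
    "Length": lambda text: str(len(text)),
    "Double": lambda text: str(2 * int(text)),
}

if __name__ == "__main__":
    print(run_worker(parse_tasks(PLAN), TOOLS))  # {'#E1': '8', '#E2': '16'}
```

A solver step would then receive the plan together with the `results` mapping and write the final answer; the tasks themselves still run sequentially.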


----------
**LLMCompiler**

![Image](https://blog.langchain.dev/content/images/size/w1600/2024/02/llm-compiler-1.png)