In [1]:
#| default_exp LazyEvaluationFramework

# A Framework for Lazy Evaluation of Language Models


## Introduction

In the rapidly evolving landscape of artificial intelligence, large language models (LLMs) have emerged as powerful tools for a wide range of applications, from content creation to complex problem-solving. They have been widely adopted as "assistants" in domains such as health care, education, and content creation. In educational contexts, however, these models often fall short of providing an optimal learning experience. They tend to generate complete solutions upfront, robbing students of the opportunity to engage in the step-by-step reasoning that is crucial for deep understanding. The question that arises is: can we get these systems to evaluate in a "Socratic" way? That is, can they encourage the student to think step-by-step rather than generating a complete solution upfront?

There is a lot of work on having LLMs reason step-by-step (add source), but to my knowledge there is no existing framework for building systems that let the user think through a problem step-by-step while the LLM assists in a pedagogical way. This is where the concept of "lazy evaluation" in language models comes into play, offering a more Socratic approach to AI-assisted tasks.

This blog post demonstrates an approach to building a framework for forcing models to evaluate in a lazy way.

## Motivation

The motivation behind implementing lazy evaluation in language models stems from several key observations:

- *Pedagogical Effectiveness:* Traditional tutoring methods often involve guiding students through problems step-by-step, allowing them to think critically and make connections on their own. AI tutors should aim to replicate this approach rather than simply providing answers.

- *Resource Efficiency:* Generating complete solutions upfront is computationally expensive, especially for complex problems. A lazy evaluation approach can significantly reduce resource usage by generating only the necessary information on demand.

- *Adaptability:* Students have varying levels of understanding and may require different amounts of guidance. A lazy evaluation system can adapt to each student's needs, providing more or less detail as required.

- *Engagement:* By revealing information gradually, we can maintain student engagement and encourage active participation in the problem-solving process.

- *Real-world Problem Solving:* In many real-world scenarios, solutions are not immediately apparent and must be approached incrementally. Training students to think in this way prepares them for challenges beyond the classroom.

## A note on lazy evaluation

The concept of lazy evaluation is well-established in programming languages, where it refers to the practice of delaying the evaluation of an expression until its value is actually needed. By applying this principle to language models in an educational context, we can create AI tutors that guide students through problems more naturally and effectively.
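
To make the programming-language sense of the term concrete, here is a minimal sketch using a Python generator. The names `solution_steps` and `computed` are illustrative only and are not part of the framework built below; the point is simply that a step is not evaluated until someone asks for it.

```python
from typing import Iterator, List

computed: List[str] = []  # records which steps have actually been evaluated


def solution_steps() -> Iterator[str]:
    """Yield one step at a time; the loop body runs only when a step is requested."""
    for step in ["f(5) = 3(5) + 2", "f(5) = 15 + 2", "f(5) = 17"]:
        computed.append(step)  # side effect so we can observe when evaluation happens
        yield step


steps = solution_steps()         # creating the generator evaluates nothing
nothing_yet = list(computed)     # snapshot: still empty at this point
first = next(steps)              # only the first step is evaluated here
after_first = list(computed)     # snapshot: exactly one step evaluated
```

The rest of this post applies the same idea to a language model: each call produces one step, and the remaining steps are deferred until the student asks for them.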

In [46]:
#| hide
#| export
from dotenv import load_dotenv
import os
from anthropic import AnthropicVertex
from anthropic.types import (
    MessageParam,
    Message
)
from fastcore.basics import *
from fastcore.test import *
from fastcore.foundation import *
from dataclasses import (
    dataclass,
    field
)
from typing import (
    List,
    Optional
)
from nbdev.showdoc import show_doc


In [10]:
#| hide
load_dotenv()

True

In [11]:
#| hide
project_id = os.getenv("CLAUDE_PROJECT_ID")
location = os.getenv("PROJECT_LOCATION")
model = os.getenv("CLAUDE_MODEL")

(project_id, location, model)

('ce-demo-space', 'us-east5', 'claude-3-5-sonnet@20240620')

We start by creating a `LazyState` class, which keeps track of the problem, the steps generated so far, and the current step.

In [125]:
#|export

@dataclass
class LazyState:
    problem: str
    steps: List[str] = field(default_factory=list)
    current_step: int = 0

    def __post_init__(self):
        self.steps.append(self.problem)

    def add_step(self, step: str):
        self.steps.append(step)
        self.current_step += 1

    def get_context(self) -> str:
        return f"Problem: {self.problem} \n Steps so far: {self.steps}"

    def refresh(self) -> None:
        self.current_step = 0
        self.steps = [self.problem]

In [126]:
state = LazyState(problem="What is the result of f(x) = 3x + 2 when x = 5?")
state.add_step("First, we need to substitute x in the function with 5")
state

LazyState(problem='What is the result of f(x) = 3x + 2 when x = 5?', steps=['What is the result of f(x) = 3x + 2 when x = 5?', 'First, we need to substitute x in the function with 5'], current_step=1)

In [102]:
state.get_context()

"Problem: What is the result of f(x) = 3x + 2 when x = 5? \n Steps so far: ['What is the result of f(x) = 3x + 2 when x = 5?', 'First, we need to substitute x in the function with 5']"

With this simple state manager, we have a way to track each step of the problem-solving process and get the current context to be used for a call to a language model.
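
One thing to notice in the context string above: the steps are interpolated as a Python list, so the model sees brackets and quotes, and the problem appears twice because it is also stored as the first step. A minimal sketch of an alternative formatter follows. `format_context` is a hypothetical helper, not part of the exported framework, and whether this format actually improves model output is an assumption that has not been tested here.

```python
from typing import List


def format_context(problem: str, steps: List[str]) -> str:
    """Hypothetical helper: render the problem once and number each step."""
    # Skip the first entry when it merely repeats the problem statement.
    shown = steps[1:] if steps and steps[0] == problem else steps
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(shown, start=1))
    body = numbered or "(none)"  # placeholder when no steps exist yet
    return f"Problem: {problem}\nSteps so far:\n{body}"


problem = "What is the result of f(x) = 3x + 2 when x = 5?"
context = format_context(problem, [problem, "Substitute x = 5 into f(x)"])
```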

Now, let's set up a small `LLM` container for the client and model name, followed by a `LazyEvaluationClient` class that will do the heavy lifting of managing the state and calling the language model.

In [61]:
#| export
@dataclass
class LLM:
    client: AnthropicVertex
    model: str

In [175]:
#| export
system = """
        You are a helpful assistant that can help with math problems.
        You will be given a problem and a list of steps as context, the format will be:
                
        PROBLEM: <problem>
        STEPS: <steps>

        Your job is to complete the next step and only the next step in the problem-solving process. You should never give more than one step.
        If you evaluate that the problem is done, respond with "PROBLEM DONE"
        """

In [158]:
#| export
class LazyEvaluationClient:
    """The Lazy Evaluation Client"""
    def __init__(self, 
                 llm: LLM, # the language model to use, see `LLM` class
                 max_tokens: int = 100, # the maximum number of tokens to generate
                 state: Optional[LazyState] = None
                ):
        self.model = llm.model
        self.client = llm.client
        self.max_tokens = max_tokens
        self.state = state
        self.system = system
        self.question_history = []

    def initalize_problem(self, problem: str) -> None:
        self.state = LazyState(problem=problem)
    
    def get_current_step(self) -> str:
        return self.state.steps[self.state.current_step]
    
    def get_next_step(self) -> str:
        if self.state is None:
            raise ValueError("Problem is not initialized, call initalize_problem first")

        messages: List[MessageParam] = [
            {
                "role": "user",
                "content": self.state.get_context()
            }
        ]

        response: Message = self.client.messages.create(
            system=self.system,  # use the instance's prompt rather than the module-level global
            model=self.model,
            messages=messages,
            max_tokens=self.max_tokens
        )
        next_step = response.content[0].text
        if next_step is not None:
            self.state.add_step(next_step.strip())
            return next_step.strip()
        else:
            raise ValueError("No next step found") 

In [142]:
show_doc(LazyEvaluationClient)

---

### LazyEvaluationClient

>      LazyEvaluationClient (llm:__main__.LLM, max_tokens:int=100,
>                            state:Optional[__main__.LazyState]=None)

*The Lazy Evaluation Client*

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| llm | LLM |  | the language model to use, see `LLM` class |
| max_tokens | int | 100 | the maximum number of tokens to generate |
| state | Optional | None |  |

Now that we have this basic machinery in place, let's see the system in action with our previous problem.

As a recap, here are the steps currently recorded in our state:


In [143]:
for step in state.steps:
    print(step)

What is the result of f(x) = 3x + 2 when x = 5?
The next step in solving this problem is to substitute the given value of x into the function f(x). Let's do that:

Step: Substitute x = 5 into the function f(x) = 3x + 2

f(5) = 3(5) + 2

This step replaces all instances of x in the function with the given value of 5. This sets us up to perform the calculations in the next step.
The next step in solving this problem is to perform the multiplication inside the parentheses:

Step: Calculate 3(5)

f(5) = 3(5) + 2
f(5) = 15 + 2

In this step, we multiply 3 by 5, which gives us 15. This simplifies our equation, leaving us with a simple addition to complete in the next step.
The next step in solving this problem is to perform the final addition:

Step: Calculate 15 + 2

f(5) = 15 + 2
f(5) = 17

In this step, we add 15 and 2, which gives us the final result of 17. This completes the calculation of f(5).
The next step in this problem-solving process is to state the final answer:

Step: State the

Now, let's set up our client

In [144]:
client = AnthropicVertex(project_id=project_id, region=location)
llm = LLM(client=client, model=model)
lazy_lm = LazyEvaluationClient(llm=llm, state=state)

We can see which step our model is currently on:

In [145]:
lazy_lm.get_current_step()

'PROBLEM DONE'

Now, let's have our model generate the next step

In [146]:
lazy_lm.get_next_step()

'PROBLEM DONE'

Finally, let's continue calling the model until we reach the end of our problem

In [108]:
while True:
    next_step = lazy_lm.get_next_step()
    if next_step == "PROBLEM DONE":
        print("Problem solved!")
        break
    print(next_step)

The next step in solving this problem would be to evaluate the expression after substitution. So, the next step is:

Evaluate f(5) = 3(5) + 2:
f(5) = 15 + 2
The next step in solving this problem would be to perform the final calculation. So, the next step is:

Calculate the final result:
f(5) = 15 + 2 = 17
The next step in this problem-solving process would be to state the final answer. So, the next step is:

Therefore, the result of f(x) = 3x + 2 when x = 5 is 17.
Problem solved!
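
The loop above stops only when the model returns exactly `"PROBLEM DONE"`, so a reply with different casing or trailing punctuation would keep it running. Below is a minimal sketch of a more defensive loop. `is_done` and `run_until_done` are hypothetical helpers, not part of the framework, and the canned replies stand in for real model calls so the example runs offline.

```python
from typing import Callable, List

DONE_MARKER = "PROBLEM DONE"


def is_done(step: str) -> bool:
    """Treat any reply containing the marker, in any casing, as the end of the problem."""
    return DONE_MARKER in step.upper()


def run_until_done(get_next_step: Callable[[], str], max_steps: int = 10) -> List[str]:
    """Collect steps until the marker appears or max_steps replies have been requested."""
    steps: List[str] = []
    for _ in range(max_steps):
        step = get_next_step()
        if is_done(step):
            break
        steps.append(step)
    return steps


# Canned replies standing in for lazy_lm.get_next_step()
canned = iter(["f(5) = 3(5) + 2", "f(5) = 15 + 2", "f(5) = 17", "Problem done."])
collected = run_until_done(lambda: next(canned))
```

With the real client, the call would look like `run_until_done(lazy_lm.get_next_step)`. The `max_steps` cap is a design choice: it bounds the number of API calls if the model never emits the marker.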


In [159]:
#| export
@patch
def ask_question(self:LazyEvaluationClient, question:str) -> str:
    """
    Allows the user to ask a question about the current step without affecting the model's ability to generate the next step.
    
    Args:
    question (str): The question the user wants to ask about the current step.
    
    Returns:
    str: The model's response to the question.
    """

    current_state = f"""
        System: {self.system}
        Problem: {self.state.problem}\n
        Context: {self.state.get_context()}
        Current step: {self.state.steps[self.state.current_step]}
    """
    prompt = f"""
        Question History: {self.question_history}
        Question: {question}\n
        Please answer the question without advancing to the next step.
        If you are asked to provide an example for a specific step, please provide an example that is not in the current context.
    """

    messages: List[MessageParam] = [
        {
            "role": "user",
            "content": prompt
        }
    ]

    response: Message = self.client.messages.create(
            system=current_state,
            model=self.model,
            messages=messages,
            max_tokens=self.max_tokens
        )
    self.question_history.append(question)
    self.question_history.append(response.content[0].text.strip())
    
    return response.content[0].text.strip()


We now have a way to take the current reasoning step and query it without having the model advance to the next step in the problem-solving process.

In [160]:
state.refresh()
state

LazyState(problem='What is the result of f(x) = 3x + 2 when x = 5?', steps=['What is the result of f(x) = 3x + 2 when x = 5?'], current_step=0)

In [161]:
lazy_lm = LazyEvaluationClient(llm=llm, state=state)
lazy_lm.get_current_step()

'What is the result of f(x) = 3x + 2 when x = 5?'

In [162]:
lazy_lm.get_next_step()

'The next step in solving this problem is to substitute the given value of x into the function.\n\nStep: Substitute x = 5 into the function f(x) = 3x + 2'

In [163]:
lazy_lm.ask_question("what is substitution?")

"Substitution is a mathematical technique where we replace a variable in an equation or expression with a specific value or another expression. It's a fundamental concept in algebra and is used to solve equations, evaluate functions, and simplify expressions.\n\nIn the context of functions, substitution involves replacing the variable (usually x) with a given value to calculate the function's output for that specific input.\n\nFor example (not related to the current problem):\nIf we have a function g(x)"

In [164]:
lazy_lm.ask_question("give me an example")

"Certainly! I'll provide an example of substitution that's not related to the current problem.\n\nLet's consider a different function: h(x) = x² - 4x + 7\n\nIf we want to find the value of h(3), we would substitute x with 3:\n\nh(3) = 3² - 4(3) + 7\n\nNow we can calculate:\nh(3) = 9 - 12"

In [165]:
lazy_lm.get_next_step()

'The next step is to perform the calculation using the substituted value:\n\nStep: Calculate f(5) = 3(5) + 2'

Now let's put this all together in a simple loop:

In [157]:
while True:
    user_input = input("Enter a question or command: ")
    if user_input in ["next", "n"]:
        print("-------------------------------------------------")
        print("User asked for next step")
        next_step = lazy_lm.get_next_step()
        if next_step == "PROBLEM DONE":
            print("Problem solved!")
            break
        print("-------------------------------------------------")
        print(f"Next step: {next_step}")
    elif user_input in ["question", "q"]:
        user_question = input("Enter your question: ")
        r = lazy_lm.ask_question(user_question)
        print("-------------------------------------------------")
        print(f"User Question: {user_question}")
        print("-------------------------------------------------")
        print(f"Model Answer: {r}")

-------------------------------------------------
User asked for next step
-------------------------------------------------
Next step: The next step in solving this problem is to substitute the given value of x into the function. Here's the step:

Substitute x = 5 into the function f(x) = 3x + 2

This step sets up the equation for us to solve in the following steps.
-------------------------------------------------
User asked for next step
-------------------------------------------------
Next step: The next step in solving this problem is to perform the substitution and write out the resulting equation. Here's the step:

Replace x with 5 in the equation:
f(5) = 3(5) + 2

This step shows us the function with the specific value of x we're working with, preparing us for the final calculations.
-------------------------------------------------
User Question: what is substitution?
-------------------------------------------------
Model Answer: Substitution is a mathematical technique wher

## The Lazy Evaluation Flow 

This simple framework shows how we can wrap a language model capable of step-by-step reasoning to create a lazy evaluator.

This approach follows these system design steps:

- Problem Initialization: A state manager is initialized with a problem.

- Prompting Strategy: Prompt the language model to generate the next step given the context in the state manager.

- State Update: The state manager records the newly generated step and advances the current step index.

- User Interaction: User questions and the model's answers are kept in a separate store, `question_history`, which does not affect the overall state of the current problem.

- Adaptive Response: Based on the current input, the Lazy Evaluator decides to either 1) Generate the next step or 2) Provide a response to the user's question given the current state of the problem and the question history.
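
The steps above can be sketched end to end without a network call. This is an illustration under stated assumptions, not the exported implementation: `SketchEvaluator`, `handle`, and `fake_model` are hypothetical names, and `fake_model` is a stub that stands in for the language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchEvaluator:
    problem: str
    model: Callable[[str], str]  # stand-in for the LLM call
    steps: List[str] = field(default_factory=list)
    question_history: List[str] = field(default_factory=list)

    def next_step(self) -> str:
        # Prompting strategy + state update: ask for one step, then record it.
        context = f"Problem: {self.problem}\nSteps so far: {self.steps}"
        step = self.model(context)
        self.steps.append(step)
        return step

    def ask(self, question: str) -> str:
        # User interaction: stored separately, so the problem state is untouched.
        answer = self.model(f"Question: {question}")
        self.question_history += [question, answer]
        return answer

    def handle(self, user_input: str) -> str:
        # Adaptive response: advance on "next"/"n", otherwise treat input as a question.
        if user_input.strip().lower() in {"next", "n"}:
            return self.next_step()
        return self.ask(user_input)


def fake_model(prompt: str) -> str:
    """Stub model: returns a fixed string depending on the kind of prompt."""
    return "stub answer" if prompt.startswith("Question:") else "stub step"


evaluator = SketchEvaluator(problem="What is f(5) for f(x) = 3x + 2?", model=fake_model)
first = evaluator.handle("next")
reply = evaluator.handle("what is substitution?")
```

After these two calls, `evaluator.steps` holds one step while the question and its answer live only in `evaluator.question_history`, which is the separation the list describes.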


To tie everything together, let's now add a `patch` to the `AnthropicVertex` client to allow users of the framework to have a single entry point into what we can call "lazy mode".

In [170]:
#| export
@patch
def lazy(self: AnthropicVertex, problem: str) -> LazyEvaluationClient:
    """
    Initialize a lazy evaluation client with a problem
    """
    state = LazyState(problem=problem)
    llm = LLM(client=self, model=model)
    return LazyEvaluationClient(llm=llm, state=state)

In [172]:
client = AnthropicVertex(project_id=project_id, region=location)
lazy_lm = client.lazy("What is the result of f(x) = 3x + 2 when x = 5?")

In [173]:
lazy_lm.get_current_step()

'What is the result of f(x) = 3x + 2 when x = 5?'

In [174]:
lazy_lm.get_next_step()

"To solve this problem, we need to substitute x with 5 in the given function. Let's do that in the next step:\n\nReplace x with 5 in the function f(x) = 3x + 2\n\nThis step sets up the equation for us to solve in the following steps."