# Self-Improving Agents with TextGrad and TogetherAI

Author: [Federico Bianchi](https://federicobianchi.io/)

## Short Summary

In this cookbook, we'll explore how to create self-improving AI systems using TextGrad and TogetherAI models. TextGrad is a framework that enables automatic "differentiation" via text, allowing language models to improve themselves through textual feedback. Thanks to TogetherAI, we can use TextGrad to optimize prompts for open-source models.

A self-optimization framework for agents looks like this:


<div align="center"><img src="https://www.agentrecipes.com/_next/image?url=%2F_next%2Fstatic%2Fmedia%2Fevaluator-optimizer.32896758.png&w=3840&q=75&dpl=dpl_Ek3R4Pxv85QsXYQja11CAeaqGhMf" width = 1000></div>

<div align="center"><i>Image from <a href="https://www.agentrecipes.com">Agent Recipes</a></i></div>


The first LLM generates some content, and then another LLM criticizes that content and provides feedback. The first LLM uses the feedback to improve its answer. This repeats until the "evaluator" is satisfied with the answer or a maximum number of steps is reached.
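
The loop described above can be sketched in a few lines of plain Python. This is a minimal sketch rather than TextGrad's implementation: `optimize`, `generate`, `evaluate`, and `max_steps` are hypothetical names, and the two toy callables stand in for real LLM calls so the example runs without an API key.

```python
from typing import Callable, Optional, Tuple


def optimize(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Tuple[bool, str]],
    max_steps: int = 5,
) -> str:
    """Evaluator-optimizer loop built from two hypothetical callables."""
    answer = generate(task, None)  # first attempt, no feedback yet
    for _ in range(max_steps):
        satisfied, feedback = evaluate(task, answer)
        if satisfied:  # the evaluator accepts the answer
            break
        answer = generate(task, feedback)  # retry using the feedback
    return answer


# Toy stand-ins for the two LLMs (assumptions for illustration only).
def toy_generate(task: str, feedback: Optional[str]) -> str:
    return "7" if feedback else "8"


def toy_evaluate(task: str, answer: str) -> Tuple[bool, str]:
    return (answer == "7", "Re-check the addition.")


result = optimize("What is 3+4?", toy_generate, toy_evaluate)
print(result)
```

The `max_steps` cap matters in practice: an evaluator that is never satisfied would otherwise keep the loop (and the API bill) running.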


### Goal

In this tutorial, we'll explore several practical applications of TextGrad with TogetherAI models. We'll progress through three key examples of increasing complexity:

1. Test-Time Optimization - A simple warm-up to introduce the core concepts. We will take an erroneous solution to a problem and ask an LLM to optimize it.

2. Prompt Optimization - A more sophisticated approach to improve model performance. We will take a small model, ask it to solve a problem (it will get it wrong), and then ask a larger LLM to improve the small model's prompt so that its answer improves.

3. Self-Improving Coding Agent - An advanced implementation with built-in optimization capabilities. We will define a more complex test-time loss that can be used to optimize Coding Agents.

All these examples should be self-contained. Each section builds on the abstractions of the previous one, but you should be able to skip around without losing too much context.

By the end of this tutorial, you'll be able to:
* Understand the versatile applications of TextGrad for LLM optimization
* Leverage TogetherAI's powerful models for test-time optimization
* Implement your own optimization pipelines for various use cases
* Apply these techniques to create more capable and efficient AI systems


### Other Resources

There is another tutorial on looping agents for self-optimization from [Zain Hasan](https://x.com/ZainHasan6) that might be an [interesting read](https://github.com/togethercomputer/together-cookbook/blob/main/Agents/Looping_Agent_Workflow.ipynb); it shows how to build some of the components you use in TextGrad from scratch.




# What is TextGrad? 🤔

[TextGrad](https://github.com/zou-group/textgrad) is a recent framework for the end-to-end optimization of language model prompts through text feedback. In practice, this means TextGrad lets you optimize language models' prompts and solutions automatically.

Key features of TextGrad:
- Provides a PyTorch-inspired API for defining and optimizing text-based variables
- Enables backpropagation of textual feedback to improve individual components
- Works with a variety of tasks without requiring framework modifications
- Supports optimization of diverse elements from prompts to code snippets (anything that is text)
- Supports feedback for multimodal models

A few different use-cases can be implemented in TextGrad, but the main two are:

1) Prompt Optimization

2) Test-Time Optimization

* TextGrad will give us easy-to-use abstractions to build an optimization layer on top of agents or LLM calls.
* TogetherAI will give us the models to do this in an effective way!

## Which Kind of "Optimization" is TextGrad doing? 

TextGrad optimizes language models using textual feedback. 

* At the most basic level, a language model provides feedback on LLM components (solutions, prompts) and identifies improvement opportunities. 

* At a more sophisticated level, TextGrad offers primitives for end-to-end optimization of complex language model pipelines composed of multiple steps. By backpropagating feedback across the different levels of an LLM chain, TextGrad can optimize the whole pipeline rather than a single call.


## If you are curious

TextGrad implements an [autograd](https://github.com/zou-group/textgrad/tree/main/textgrad/autograd) module that also includes some "textual" [algebra](https://github.com/zou-group/textgrad/tree/main/textgrad/autograd) functions. Everything in TextGrad is a variable. This means you can `tg.sum()` your variables (which runs a very specific LLM operation) and backpropagate through that!

Fun fact: The optimization loop in TextGrad is inspired by [Karpathy's micrograd](https://github.com/karpathy/micrograd). We literally follow the same signatures to implement an AutoGrad system for Text.

Here you can find an image comparing TextGrad to torch.

<img src="https://github.com/zou-group/textgrad/raw/main/assets/analogy.png"/>
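
To make the analogy concrete, here is a toy sketch of "backpropagating" text. It is an illustration under stated assumptions, not TextGrad's code: `TextVariable`, `text_loss`, and `step` are hypothetical names, and the lambdas stand in for LLM calls.

```python
from typing import Callable, List, Optional


class TextVariable:
    """Toy text variable (hypothetical; not tg.Variable)."""

    def __init__(self, value: str, role: str, requires_grad: bool = True,
                 parents: Optional[List["TextVariable"]] = None):
        self.value = value
        self.role = role
        self.requires_grad = requires_grad
        self.parents = parents or []
        self.gradients: List[str] = []  # textual feedback, not numbers

    def backward(self) -> None:
        """Pass this variable's text to every parent that requires grad."""
        for parent in self.parents:
            if parent.requires_grad:
                parent.gradients.append(f"Feedback on {parent.role}: {self.value}")


def text_loss(critic: Callable[[str], str], variable: TextVariable) -> TextVariable:
    """Forward pass: the critic's text becomes the 'loss' variable."""
    return TextVariable(critic(variable.value), "loss",
                        requires_grad=False, parents=[variable])


def step(variable: TextVariable, rewrite: Callable[[str, List[str]], str]) -> None:
    """Update the value from the collected feedback, then clear it."""
    variable.value = rewrite(variable.value, variable.gradients)
    variable.gradients = []


draft = TextVariable("2 + 2 = 5", role="solution")
loss = text_loss(lambda text: "The sum is wrong; 2 + 2 is 4.", draft)
loss.backward()
feedback_seen = list(draft.gradients)
step(draft, lambda value, grads: "2 + 2 = 4")
print(feedback_seen)
print(draft.value)
```

The shape mirrors the forward / `backward()` / `step()` cycle used with tensors, except that the "gradient" is a piece of feedback text.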





## Let's Go

In [None]:
!pip install textgrad
!pip install python-dotenv

In [1]:
import random
import time
from textgrad.engine_experimental.litellm import LiteLLMEngine
import textgrad as tg
from textgrad.tasks import load_task
from dotenv import load_dotenv
from textgrad.loss import MultiFieldTokenParsedEvaluation


load_dotenv() # or make sure you have the TOGETHER_API_KEY env variable set, this is the only key we need for this tutorial

True

### How to use TogetherAI with TextGrad

Thanks to the LiteLLM integration, we can use TogetherAI models through their API. LiteLLM handles all the authentication and API calls behind the scenes (assuming you have loaded the env variables), making it very easy to switch between models. In this case, we're using Meta's Llama 3.1 8B Instruct model hosted on TogetherAI's platform.

In [2]:
# engine is one of the most basic components of TextGrad. It is a wrapper on top of llm calls that also handles caching.

response = LiteLLMEngine("together_ai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", cache=True).generate(content="hello, what's 3+4", system_prompt="you are an assistant")
print(response)

Hello. The answer to 3+4 is 7.
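
The comment above mentions that the engine also handles caching. As a rough illustration of what response caching buys you, here is a minimal in-memory sketch. `CachedEngine` is a hypothetical class and the lambda stands in for a real API call; how TextGrad actually stores its cache may differ.

```python
from typing import Callable, Dict, Tuple


class CachedEngine:
    """Hypothetical wrapper that memoizes responses by (system_prompt, content)."""

    def __init__(self, call_llm: Callable[[str, str], str]):
        self._call_llm = call_llm
        self._cache: Dict[Tuple[str, str], str] = {}
        self.calls = 0  # number of real (non-cached) calls

    def generate(self, content: str, system_prompt: str = "") -> str:
        key = (system_prompt, content)
        if key not in self._cache:  # only hit the model on a cache miss
            self.calls += 1
            self._cache[key] = self._call_llm(system_prompt, content)
        return self._cache[key]


# Toy "model" that just echoes the input (assumption for illustration).
toy_engine = CachedEngine(lambda system_prompt, content: f"echo: {content}")
first = toy_engine.generate("hello, what's 3+4", system_prompt="you are an assistant")
second = toy_engine.generate("hello, what's 3+4", system_prompt="you are an assistant")
print(first == second, toy_engine.calls)
```

One consequence worth keeping in mind: with caching on, re-running a cell with the same prompt returns the same text instead of a fresh sample.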


# Warming Up Example: Test-Time Improvement

So what can TextGrad do for us in practice? TextGrad can provide a layer of optimization on top of your agents, by checking their prompts and outputs and correcting them in case there is the need to do so. Let's start with an easy example.

Let's assume our LLM has just generated the following solution to a math problem.

In [3]:
initial_solution = """To solve the equation 3x^2 - 7x + 2 = 0, we use the quadratic formula:
x = (-b ± √(b^2 - 4ac)) / 2a
a = 3, b = -7, c = 2
x = (7 ± √((-7)^2 + 4(3)(2))) / 6
x = (7 ± √73) / 6
The solutions are:
x1 = (7 + √73)
x2 = (7 - √73)"""

The solution to this math problem is wrong. You can check it yourself, or give it as input to a large LLM, which should tell you where the problem is (this is basically what TextGrad does behind the scenes!).
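
If you want to check the arithmetic yourself, a few lines of plain Python are enough. This sketch is independent of TextGrad; it recomputes the discriminant and the two roots of 3x^2 - 7x + 2 = 0 exactly.

```python
from fractions import Fraction
from math import isqrt

a, b, c = 3, -7, 2

# The quadratic formula uses b^2 - 4ac; the draft solution added 4ac instead.
discriminant = b * b - 4 * a * c
root = isqrt(discriminant)
assert root * root == discriminant  # perfect square, so the roots are rational

# The draft also dropped the division by 2a.
x1 = Fraction(-b + root, 2 * a)
x2 = Fraction(-b - root, 2 * a)
print(discriminant, x1, x2)
```

So the draft contains two separate mistakes: the sign inside the square root and the missing denominator in the final answers.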

Now, let's use TextGrad to improve this solution.

In [4]:
engine = LiteLLMEngine("together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo", cache=True)

# here we set the engine that will be used to compute the gradients and update the solution.
# think of this as the llm that is going to do the optimization.
# this is an important step in textgrad.
tg.set_backward_engine(engine, override=True) 


In [7]:
# TextGrad defines variables that will be optimized, this is similar to Torch Tensors.
# each variable has a role description that will be used to guide the optimization.

# we start by defining the variable that will be optimized.
solution = tg.Variable(initial_solution,
                       requires_grad=True,
                       role_description="solution to the math question")

# This is the system prompt that will guide the loss function (we don't want to optimize this)
loss_system_prompt = tg.Variable("""You will evaluate a solution to a math problem. Do not give a solution to the problem, only identify errors and describe the errors.""",
                                 requires_grad=False,
                                 role_description="system prompt")

Time to define a loss function and an optimizer.

In [8]:
# This is the loss function, it will be used to compute the loss and the gradients.
loss_fn = tg.TextLoss(loss_system_prompt)

# This is the optimizer, it will be used to update the solution. TGD stands for Textual Gradient Descent, which is an analogy to the classic gradient descent but entirely based on text.
optimizer = tg.TGD([solution])

loss = loss_fn(solution)
print(loss.value)


The error in the solution is in the calculation of the discriminant (b^2 - 4ac) under the square root. 

The correct calculation should be:
(-7)^2 = 49
4(3)(2) = 24
So, the correct expression under the square root should be:
√(49 - 24) = √25

The correct equation should be:
x = (7 ± √25) / 6
x = (7 ± 5) / 6

Additionally, the solutions x1 and x2 are missing the division by 6 and the correct calculation of the square root. The correct solutions should be:
x1 = (7 + 5) / 6
x2 = (7 - 5) / 6


In [9]:
loss.backward()
optimizer.step()

print(solution.value)

To solve the equation 3x^2 - 7x + 2 = 0, we use the quadratic formula: 
x = (-b ± √(b^2 - 4ac)) / 2a
where a = 3, b = -7, and c = 2.
First, calculate the discriminant: 
(-7)^2 = 49
4(3)(2) = 24
So, the correct expression under the square root is:
√(49 - 24) = √25
Then, apply the quadratic formula:
x = (7 ± √25) / 6
Since √25 = 5, the solutions can be simplified to:
x = (7 ± 5) / 6
Thus, the solutions are:
x1 = (7 + 5) / 6 = 12 / 6 = 2
x2 = (7 - 5) / 6 = 2 / 6 = 1/3
The solutions are x1 = 2 and x2 = 1/3.


That was the correct solution! You can replace the solution with anything you want to optimize. 

And you can optimize it multiple times in a for loop (same as you would do with Torch):

```python
for i in range(5):
    # Zero out the gradients at the start of each iteration
    optimizer.zero_grad()
    
    # Compute loss
    loss = loss_fn(solution)
    print(f"Iteration {i}, Loss: {loss.value}")
    
    # Compute gradients and update
    loss.backward()
    optimizer.step()
    
    print(f"Updated solution:\n{solution.value}\n")
```

There are quite a few things that TextGrad can do, emulating Torch. 

For example, we can add textual constraints to the optimization process.

In [10]:
solution = tg.Variable(initial_solution,
                       requires_grad=True,
                       role_description="solution to the math question")

loss_system_prompt = tg.Variable("""You will evaluate a solution to a math problem. There is no reason to solve it yourself, do not give a solution to the problem, only identify errors.""",
                                 requires_grad=False,
                                 role_description="system prompt")
                              
loss_fn = tg.TextLoss(loss_system_prompt)
optimizer = tg.TGD([solution], constraints=["Make sure that you use latex"])

# recompute the loss on the new variable so that the feedback flows to it
loss = loss_fn(solution)
loss.backward()
optimizer.step()

print(solution.value)

To solve the equation $3x^2 - 7x + 2 = 0$, we can use the quadratic formula: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. In this case, $a = 3$, $b = -7$, and $c = 2$. Plugging these values into the formula, we get: $x = \frac{-(-7) \pm \sqrt{(-7)^2 - 4(3)(2)}}{2(3)} = \frac{7 \pm \sqrt{49 - 24}}{6} = \frac{7 \pm \sqrt{25}}{6}$. Therefore, the solutions are: $x_1 = \frac{7 + 5}{6} = \frac{12}{6} = 2$ and $x_2 = \frac{7 - 5}{6} = \frac{2}{6} = \frac{1}{3}$. However, the original solution provided was $x_1 = (7 + \sqrt{73})$ and $x_2 = (7 - \sqrt{73})$, which seems to be incorrect based on the standard quadratic formula application. The correct calculation yields $x_1 = 2$ and $x_2 = \frac{1}{3}$, not matching the provided solutions. Let's correct the calculation: $x = \frac{-(-7) \pm \sqrt{(-7)^2 - 4 \cdot 3 \cdot 2}}{2 \cdot 3} = \frac{7 \pm \sqrt{49 - 24}}{6} = \frac{7 \pm \sqrt{25}}{6}$, which was incorrectly calculated as $\sqrt{73}$ instead of $\sqrt{25}$. The correct solutions, follo

## System Prompt Optimization

In this section, we'll explore how TextGrad enables prompt optimization - a powerful technique to enhance model performance without changing the model itself.


We'll demonstrate how to optimize the system prompt for a smaller model (Llama-3.2-3B) using feedback from a much larger model (70B parameters). This approach leverages the capabilities of the larger model to guide and improve the smaller one, making it more efficient and cost-effective.

For our example, we'll tackle a challenge from the Big Bench Hard (BBH) dataset - specifically the object counting task. This task presents the model with a list of various objects and tests its ability to accurately count specific items, a seemingly simple task that smaller models often struggle with.


In [11]:
# we define our target small model
llm_engine_small = LiteLLMEngine(model_string="together_ai/meta-llama/Llama-3.2-3B-Instruct-Turbo")

_, val_set, _, eval_fn = load_task("BBH_object_counting", llm_engine_small)
question_str, answer_str = val_set[2]

question = tg.Variable(question_str, role_description="question asked to the LLM", requires_grad=False)
answer = tg.Variable(str(answer_str), role_description="correct answer to the question", requires_grad=False)


In [12]:
print(f"Question: {question}")
print(f"Answer: {answer}")

Question: I have a stalk of celery, two drums, two onions, a carrot, an accordion, a yam, a cabbage, a lettuce head, a potato, and a head of broccoli. How many vegetables do I have?
Answer: 9


In [33]:
# this is the system prompt we want to optimize, we will start with a simple CoT prompt.
system_prompt = tg.Variable("You are a concise LLM. Think step by step.",
                            requires_grad=True,
                            role_description="system prompt to guide the LLM's reasoning strategy for accurate responses")

# let's wrap our engine in textgrad model which is going to keep track of calls and prepare the model for backpropagation.
model = tg.BlackboxLLM(llm_engine_small, system_prompt=system_prompt)
optimizer = tg.TGD(parameters=list(model.parameters()))

prediction = model(question)

In [29]:
print(prediction)

To find the number of vegetables, let's identify the vegetables in the list:

1. Celery
2. Onion
3. Carrot
4. Cabbage
5. Lettuce
6. Potato
7. Broccoli

There are 7 vegetables in the list.


Ok, that was wrong 😢 let's see if we can improve this small model!

### Loss

For our prompt optimization, we'll use one of TextGrad's built-in loss functions.

This specialized function acts like a smart evaluator, comparing our variables (like questions and answers) and providing structured feedback. Each variable gets assigned a specific "role" so the system understands what it's looking at - think of it as giving clear job descriptions to each piece of data.

One important tip: pay attention to the order of *these roles*! The sequence matters significantly for how the relationships between variables are interpreted. Get this right, and TextGrad will generate precisely the feedback needed to refine your prompts.

In [34]:
evaluation_instruction = """Below is a question from a question-answering task, the ground truth answer, and reasoning with the final prediction.
Is the final prediction correct, i.e. the same as the ground truth answer? Say only 1 (yes) or 0 (no). 
Return your response within <ACCURACY> </ACCURACY> tags. e.g.<ACCURACY> 0 </ACCURACY> or <ACCURACY> 1 </ACCURACY>"""

eval_instruction = tg.Variable(evaluation_instruction, requires_grad=False, role_description="evaluation instruction for the task")

role_descriptions = [
    "Question for the task",
    "Ground truth answer",
    "Reasoning and prediction from the language model"
]

loss_fn = MultiFieldTokenParsedEvaluation(
    eval_instruction,
    role_descriptions=role_descriptions,
    parse_tags=["<ACCURACY>", "</ACCURACY>"]
)

In [35]:
for i in range(5):

    # evaluate the current prediction first, so we stop before changing a prompt that already works
    loss = loss_fn([question, answer, prediction])

    if "<ACCURACY> 1 </ACCURACY>" in loss.value:
        print("We got it!")
        break

    print(f"We are not there yet... {loss.value}")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    prediction = model(question)


We are not there yet... <ACCURACY> 0 </ACCURACY>
We are not there yet... <ACCURACY> 0 </ACCURACY>
We are not there yet... <ACCURACY> 0 </ACCURACY>
We got it!
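
The loop above looks for the exact string `<ACCURACY> 1 </ACCURACY>`, which breaks if the evaluator changes the whitespace inside the tags. A slightly more tolerant check can be sketched with a regular expression. `parse_accuracy` is a hypothetical helper, not part of TextGrad.

```python
import re
from typing import Optional


def parse_accuracy(text: str) -> Optional[int]:
    """Return 0 or 1 from an <ACCURACY> tag, or None if no valid tag is found."""
    match = re.search(r"<ACCURACY>\s*([01])\s*</ACCURACY>", text)
    return int(match.group(1)) if match else None


print(parse_accuracy("<ACCURACY> 1 </ACCURACY>"))
print(parse_accuracy("Verdict: <ACCURACY>0</ACCURACY>"))
print(parse_accuracy("no tag here"))
```

Returning `None` for a missing tag lets the caller tell "the evaluator said no" apart from "the evaluator did not follow the format".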


What is the system prompt that helped us get the correct result?

In [21]:
print(model.system_prompt.value)

You are a concise LLM. Think step by step. Ensure to consider all types of vegetables, refer to a broad definition of vegetables, and take into account the quantity of each item. Provide a clear and concise explanation for your counting, including any assumptions or definitions used. Prioritize accuracy and completeness in your response, while maintaining conciseness where possible. Reflect on your counting process to identify potential biases or oversights and consider how your approach could be improved for future similar tasks.


This is a very specific use case. In general, you would optimize over multiple examples to avoid overfitting to a single one. There is a full tutorial about this here.
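
One common way to guard against overfitting is to score every candidate prompt on a small validation set and keep it only if it does not do worse than the current one. The sketch below shows that idea in plain Python. `accuracy`, `accept_if_not_worse`, and `toy_answer` are hypothetical names, and the toy model is a stand-in for a real LLM call.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)


def accuracy(prompt: str, answer_fn: Callable[[str, str], str],
             examples: List[Example]) -> float:
    """Fraction of examples answered correctly under the given prompt."""
    correct = sum(answer_fn(prompt, q) == expected for q, expected in examples)
    return correct / len(examples)


def accept_if_not_worse(old_prompt: str, new_prompt: str,
                        answer_fn: Callable[[str, str], str],
                        val_set: List[Example]) -> str:
    """Keep the new prompt only if validation accuracy does not drop."""
    old_score = accuracy(old_prompt, answer_fn, val_set)
    new_score = accuracy(new_prompt, answer_fn, val_set)
    return new_prompt if new_score >= old_score else old_prompt


# Toy "model": it only adds up quantities when the prompt mentions them.
def toy_answer(prompt: str, question: str) -> str:
    numbers = [int(tok) for tok in question.split() if tok.isdigit()]
    return str(sum(numbers)) if "quantity" in prompt else str(len(numbers))


val_set = [("2 onions and 1 yam", "3"), ("1 carrot and 1 potato", "2")]
base = "Think step by step."
better = "Think step by step. Consider the quantity of each item."

kept = accept_if_not_worse(base, better, toy_answer, val_set)
reverted = accept_if_not_worse(better, "Be brief.", toy_answer, val_set)
print(kept == better, reverted == better)
```

In a real pipeline the candidate prompt would come from `optimizer.step()` and the check would run after each update.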

# Agents Writing Code

Let's now look at the full extent of TextGrad's flexibility. We will write a custom loss function for test-time optimization. This is one of the most advanced uses of TextGrad, so a few pieces are involved. We'll see how we can optimize code generation directly.

One of the most popular use cases nowadays is having agents write code. However, these agents often do not give you optimized outputs. 

We will simplify part of the interaction with the language model to make the example easier to follow.


### SideNote
Note that what we will be doing happens in a completely unsupervised way. TextGrad also supports supervised optimization: you get an LLM output, evaluate it with an external function (e.g., accuracy), and pass that result back so it can be used for optimization. You can find an example of this in another tutorial.


### Some Evaluation Code (Please open, read carefully)

The code below executes model-generated code. It includes a guard, a commented-out `raise Exception(...)` line, that you can re-enable to prevent accidental execution. Review the code before running it; we recommend running it in a sandboxed environment like Google Colab.

In [46]:
# We'll use below utilities to run a python function.
from IPython.core.interactiveshell import InteractiveShell

def run_function_in_interpreter(func_code):
    #raise Exception("This function will run the code returned by the LLM. Remove this if you'd like to run the code!")
    interpreter = InteractiveShell.instance()
    
    interpreter.run_cell(func_code, store_history=False, silent=True)
    
    func_name = func_code.split("def ")[1].split("(")[0].strip()
    func = interpreter.user_ns[func_name]
    
    return func


def test_longest_increasing_subsequence(fn):
    nums = [10, 22, 9, 33, 21, 50, 41, 60]
    assert fn(nums) == 5

    nums = [7, 2, 1, 3, 8, 4, 9, 6, 5]
    assert fn(nums) == 4

    nums = [5, 4, 3, 2, 1]
    assert fn(nums) == 1

    nums = [1, 2, 3, 4, 5]
    assert fn(nums) == 5

    nums = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
    assert fn(nums) == 4

    nums = [10, 9, 2, 5, 3, 7, 101, 18]
    assert fn(nums) == 4

    nums = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
    assert fn(nums) == 6

    nums = [7, 7, 7, 7, 7, 7, 7]
    assert fn(nums) == 1

    nums = [20, 25, 47, 35, 56, 68, 98, 101, 212, 301, 415, 500]
    assert fn(nums) == 11

    nums = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
    assert fn(nums) == 1

    print("✅ All test cases passed!")

# Generate a random test case
def generate_random_test_case(size, min_value, max_value):
    return [random.randint(min_value, max_value) for _ in range(size)]

# Test the function with a random test case
size = 10000  # Adjust the size as needed
min_value = 1
max_value = 10000

nums = generate_random_test_case(size, min_value, max_value)

# When evaluating the code, we will run it through this function
# this will print the results and the runtime
def test_lis_implementation(code):
    longest_increasing_subsequence = run_function_in_interpreter(code)

    start_time = time.time()
    lis = longest_increasing_subsequence(nums)
    end_time = time.time()

    runtime = end_time - start_time

    print(f"Test Case Size: {size}")
    print(f"Longest Increasing Subsequence Length: {lis}")
    print(f"Runtime: {runtime:.5f} seconds")

    # Test for all test cases
    test_longest_increasing_subsequence(longest_increasing_subsequence)

    return runtime
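
`run_function_in_interpreter` finds the function name by splitting the text on `"def "`, which assumes the first `def ` in the generated code belongs to the target function. If the model adds a comment or helper before it, that assumption fails. A sketch of a more robust lookup using the standard `ast` module follows; `first_function_name` is a hypothetical helper and is not part of the notebook's utilities.

```python
import ast


def first_function_name(source: str) -> str:
    """Return the name of the first top-level function defined in `source`."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            return node.name
    raise ValueError("no top-level function definition found")


sample = '''
# def helper(x): an idea that was dropped
def longest_increasing_subsequence(nums):
    return 0
'''

# String splitting picks up the commented-out name; parsing does not.
naive_name = sample.split("def ")[1].split("(")[0].strip()
parsed_name = first_function_name(sample)
print(naive_name, parsed_name)
```

Parsing also fails loudly with a `SyntaxError` when the model returns code that is not valid Python, which is easier to debug than a `KeyError` from the interpreter namespace.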

## Coding Problems

In [37]:
problem_text = """Longest Increasing Subsequence (LIS)

Problem Statement:
Given a sequence of integers, find the length of the longest subsequence that is strictly increasing. 
A subsequence is a sequence that can be derived 
from another sequence by deleting some or no elements without changing the order of the remaining elements.

Input:
The input consists of a list of integers representing the sequence.

Output:
The output should be an integer representing the length of the longest increasing subsequence.

Only return the function implementation using the following format:

```python
def longest_increasing_subsequence(nums):
[... your code ...]
```
"""

problem = tg.Variable(problem_text, requires_grad=False, role_description="problem statement")

In [48]:
# We are using a pretty big model here, so one would expect the code to be good!
# we are also overriding the backward engine
engine = LiteLLMEngine("together_ai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo", cache=True)
tg.set_backward_engine(engine, override=True)

model = tg.BlackboxLLM(engine)

answer = model(problem)

first_answer = answer.value.split("```python")[1].split("```")[0]

print(first_answer)



def longest_increasing_subsequence(nums):
    """
    This function calculates the length of the longest increasing subsequence in a given list of integers.

    Args:
    nums (list): A list of integers.

    Returns:
    int: The length of the longest increasing subsequence.
    """
    
    # If the input list is empty, return 0
    if not nums:
        return 0
    
    # Initialize a list to store the lengths of the longest increasing subsequences ending at each position
    dp = [1] * len(nums)
    
    # Iterate over the input list
    for i in range(1, len(nums)):
        # For each element, compare it with all previous elements
        for j in range(i):
            # If the current element is greater than the previous element, update the length of the subsequence
            if nums[i] > nums[j]:
                dp[i] = max(dp[i], dp[j] + 1)
    
    # Return the maximum length found
    return max(dp)



Is this answer good?

In [39]:
test_lis_implementation(first_answer)

Test Case Size: 10000
Longest Increasing Subsequence Length: 192
Runtime: 2.41174 seconds
✅ All test cases passed!


It is! But maybe we can do better: this looks rather slow. Let's implement an entire TextGrad pipeline this time.

In [50]:
# Code is the variable of interest we want to optimize -- so requires_grad=True
code = tg.Variable(value=first_answer,
                 requires_grad=True,
                 role_description="code instance to optimize")

# We are not interested in optimizing the problem -- so requires_grad=False
problem = tg.Variable(problem_text, 
                    requires_grad=False, 
                    role_description="the coding problem statement")

# Let TGD know to update code!
optimizer = tg.TGD(parameters=[code])

### Custom Losses In TextGrad

This is how you write a completely new loss in TextGrad. It requires putting together a few pieces.

In [51]:
# The system prompt that will guide the behavior of the loss function.
loss_system_prompt = "You are a smart language model that evaluates code snippets. You do not solve problems or propose new code snippets, only evaluate existing solutions critically and give strong feedback."
loss_system_prompt = tg.Variable(loss_system_prompt, requires_grad=False, role_description="system prompt to the loss function")

# The instruction that will be the prefix
instruction = """Think about the problem and the code snippet. What is the runtime complexity?"""

# The format string and setting up the call
format_string = "{instruction}\nProblem: {{problem}}\nCurrent Code: {{code}}"
format_string = format_string.format(instruction=instruction)

fields = {"problem": None, "code": None}
formatted_llm_call = tg.autograd.FormattedLLMCall(engine=engine,
                                                  format_string=format_string,
                                                  fields=fields,
                                                  system_prompt=loss_system_prompt)

# The loss function
def loss_fn(problem: tg.Variable, code: tg.Variable) -> tg.Variable:
    inputs = {"problem": problem, "code": code}
    
    return formatted_llm_call(inputs=inputs,
                              response_role_description=f"evaluation of the {code.get_role_description()}")
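
The doubled braces in `format_string` are easy to misread, so here is what the first `.format` call does, shown with plain strings. The second stage is filled in by hand with toy values; how `FormattedLLMCall` substitutes the fields internally is an assumption and may differ.

```python
instruction = "What is the runtime complexity?"

# Stage 1: fill {instruction}; each doubled brace collapses to a single brace.
template = "{instruction}\nProblem: {{problem}}\nCurrent Code: {{code}}"
stage_one = template.format(instruction=instruction)
print(stage_one)

# Stage 2: the remaining placeholders are filled later (toy values here).
stage_two = stage_one.format(problem="Sum a list.", code="def f(xs): return sum(xs)")
print(stage_two)
```

After stage one, the string still contains `{problem}` and `{code}`, which is why the `fields` dictionary uses exactly those two keys.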



We are ready to see some magic!

In [52]:
for i in range(5):

    runtime = test_lis_implementation(code.value)
    if runtime < 1.5:
        print("We got it!")
        break
    else:
        print(f"We are not there yet... {runtime}")

    optimizer.zero_grad()

    loss = loss_fn(problem, code)
    loss.backward()
    optimizer.step()
    

Test Case Size: 10000
Longest Increasing Subsequence Length: 194
Runtime: 2.37487 seconds
✅ All test cases passed!
We are not there yet... 2.3748748302459717
Test Case Size: 10000
Longest Increasing Subsequence Length: 194
Runtime: 2.40329 seconds
✅ All test cases passed!
We are not there yet... 2.4032862186431885
Test Case Size: 10000
Longest Increasing Subsequence Length: 194
Runtime: 1.12705 seconds
✅ All test cases passed!
We got it!


We got faster code! Let's see it!

In [55]:
print(code.value)

def longest_increasing_subsequence(nums):
    """
    This function calculates the length of the longest increasing subsequence in a given list of integers.

    Args:
    nums (list): A list of integers.

    Returns:
    int: The length of the longest increasing subsequence.

    Raises:
    ValueError: If the input list contains non-integer values.
    """

    # Check if the input list contains only integers
    if not all(isinstance(num, int) for num in nums):
        raise ValueError("Input list must contain only integers, but found non-integer value")

    # If the input list is empty, return 0
    if not nums:
        return 0

    # Initialize a list to store the lengths of the longest increasing subsequences ending at each position
    lengths = [1] * len(nums)

    # Initialize a list to store the previous element in the longest increasing subsequence ending at each position
    prev_elements = [None] * len(nums)

    # Iterate over the input list
    for i in range(1, len(n
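
The first answer uses two nested loops over the input, so its running time grows quadratically with the list length. As a point of reference (this is not the code TextGrad produced), a well-known O(n log n) approach keeps, for every subsequence length, the smallest possible tail value and uses binary search to update it. `lis_length` is a name chosen for this sketch.

```python
from bisect import bisect_left
from typing import List


def lis_length(nums: List[int]) -> int:
    """Length of the longest strictly increasing subsequence in O(n log n)."""
    # tails[k] is the smallest tail of any increasing subsequence of length k + 1
    tails: List[int] = []
    for value in nums:
        # bisect_left keeps the subsequence strict: an equal value replaces, never extends
        position = bisect_left(tails, value)
        if position == len(tails):
            tails.append(value)
        else:
            tails[position] = value
    return len(tails)


print(lis_length([10, 22, 9, 33, 21, 50, 41, 60]))  # prints 5
```

Note that `tails` is not itself a valid subsequence of the input; only its length is meaningful.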

# Other Possibilities

TextGrad can be applied to many different use-cases. Some examples are described in the paper:

* Long Chain Optimization
* System Prompt Optimization
* Molecule Optimization
* Radiotherapy Optimization