In [1]:
import json
import os
from dynamic_cheatsheet.language_model import LanguageModel
from dotenv import load_dotenv

load_dotenv("config.env")

True

In [2]:
# Initialize the language model
model = LanguageModel(model_name="gemini/gemini-2.0-flash")

In [3]:
# Load the generator prompt
with open("prompts/generator_prompt.txt", "r") as f:
    generator_prompt = f.read()

# Load the curator prompt
with open("prompts/curator_prompt_for_dc_cumulative.txt", "r") as f:
    curator_prompt = f.read()

In [4]:
# Let's start with a problem from Game of 24
input_txt = R"""Let's play a game called 24. You'll be given four integers, and your objective is to use each number only once, combined with any of the four arithmetic operations (addition, subtraction, multiplication, and division) and parentheses, to achieve a total of 24. For example, if the input is 4, 7, 8, and 8, the output could be (7 - (8 / 8)) * 4 = 24. Please present a single expression that evaluates to 24.

Question #1:
5 6 6 8
""".strip()
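
Because the rounds below judge answers by whether an expression reaches 24, it can help to check a candidate mechanically. The following is a minimal sketch and not part of dynamic_cheatsheet; `is_valid_24` and `_evaluate` are hypothetical helper names. It assumes answers are plain infix expressions over integer literals, optionally followed by "= 24".

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Map AST operator nodes to exact arithmetic on Fractions.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate an expression tree, recording each integer literal in `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported syntax in expression")


def is_valid_24(expression, numbers, target=24):
    """Return True if `expression` uses exactly `numbers` and equals `target`."""
    lhs = expression.split("=")[0].strip()  # tolerate a trailing "= 24"
    used = []
    try:
        value = _evaluate(ast.parse(lhs, mode="eval").body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False
    return Counter(used) == Counter(numbers) and value == target
```

Exact `Fraction` arithmetic avoids floating-point surprises with division, and parsing with `ast` rather than calling `eval` keeps arbitrary code in a model response from being executed.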

In [5]:
# Round 1: Let's see how the model performs with an empty cheatsheet
results_dict_round1 = output_dict = model.advanced_generate(
    approach_name="DynamicCheatsheet_Cumulative",
    input_txt=input_txt,
    cheatsheet="(empty)",
    generator_template=generator_prompt,
    cheatsheet_template=curator_prompt
)

In [6]:
# Let's see the model's output along with the final cheatsheet
results_dict_round1

{'input_txt': "Let's play a game called 24. You'll be given four integers, and your objective is to use each number only once, combined with any of the four arithmetic operations (addition, subtraction, multiplication, and division) and parentheses, to achieve a total of 24. For example, if the input is 4, 7, 8, and 8, the output could be (7 - (8 / 8)) * 4 = 24. Please present a single expression that evaluates to 24.\n\nQuestion #1:\n5 6 6 8",
 'steps': [{'round': 0,
   'generator_prompt': '# GENERATOR (PROBLEM SOLVER)\n\nInstruction: You are an expert problem-solving assistant tasked with analyzing and solving various questions using a combination of your expertise and provided reference materials. Each task will include:\n1. A specific question or problem to solve\n2. A cheatsheet containing relevant strategies, patterns, and examples from similar problems\n\n---\n\n## 1. ANALYSIS & STRATEGY\n\n- Carefully analyze both the question and cheatsheet before starting\n- Search for and id

In [7]:
# Extract the cheatsheet that the curator produced in Round 1
new_cheatsheet_round1 = results_dict_round1['final_cheatsheet']

In [8]:
# Here is the new cheatsheet
print(new_cheatsheet_round1)

Version: 1.1

SOLUTIONS, IMPLEMENTATION PATTERNS, AND CODE SNIPPETS
<memory_item>
<description>
Solving the 24 game: Given four numbers, use each number exactly once with the operations +, -, *, and / to obtain 24. (Q1)
</description>
<example>
Numbers: 5, 6, 6, 8. Solution: (8 - 5) * 6 + 6 = 24
</example>
</memory_item>

GENERAL META-REASONING STRATEGIES
<memory_item>
<description>
Strategy for solving the 24 game: Try different combinations of operations and parentheses to arrive at 24. Start by combining numbers to get intermediate values close to 24 or that can be easily combined to reach 24. (Q1)
</description>
<example>
1.  Try simple combinations first: (a + b) * (c / d), (a + b) * (c - d), etc.
2.  If simple combinations don't work, try more complex arrangements with nested parentheses.
3.  Look for opportunities to create intermediate values that are close to 24.
4.  Consider using division to reduce larger numbers or multiplication to increase smaller numbers.
</example>
</me

In [9]:
# Round 2: Let's see how the model performs with the new cheatsheet
# Note that we are using the same input text as in Round 1
results_dict_round2 = output_dict = model.advanced_generate(
    approach_name="DynamicCheatsheet_Cumulative",
    input_txt=input_txt,
    cheatsheet=new_cheatsheet_round1,
    generator_template=generator_prompt,
    cheatsheet_template=curator_prompt
)

In [10]:
# Inspect the Round 2 result: the model arrives at a correct answer
results_dict_round2

{'input_txt': "Let's play a game called 24. You'll be given four integers, and your objective is to use each number only once, combined with any of the four arithmetic operations (addition, subtraction, multiplication, and division) and parentheses, to achieve a total of 24. For example, if the input is 4, 7, 8, and 8, the output could be (7 - (8 / 8)) * 4 = 24. Please present a single expression that evaluates to 24.\n\nQuestion #1:\n5 6 6 8",
 'steps': [{'round': 0,
   'generator_prompt': '# GENERATOR (PROBLEM SOLVER)\n\nInstruction: You are an expert problem-solving assistant tasked with analyzing and solving various questions using a combination of your expertise and provided reference materials. Each task will include:\n1. A specific question or problem to solve\n2. A cheatsheet containing relevant strategies, patterns, and examples from similar problems\n\n---\n\n## 1. ANALYSIS & STRATEGY\n\n- Carefully analyze both the question and cheatsheet before starting\n- Search for and id

In [11]:
# Extract the updated cheatsheet from Round 2
new_cheatsheet_round2 = results_dict_round2['final_cheatsheet']

In [12]:
print(new_cheatsheet_round2)

Version: 1.2

SOLUTIONS, IMPLEMENTATION PATTERNS, AND CODE SNIPPETS
<memory_item>
<description>
Solving the 24 game: Given four numbers, use each number exactly once with the operations +, -, *, and / to obtain 24. (Q1)
</description>
<example>
Numbers: 5, 6, 6, 8. Solution: (8 - 5) * 6 + 6 = 24
</example>
</memory_item>
** Count: 2

<memory_item>
<description>
Solving the 24 game: Given four numbers, use each number exactly once with the operations +, -, *, and / to obtain 24. (Q1)
</description>
<example>
Numbers: 4, 7, 8, 8. Solution: (7 - (8 / 8)) * 4 = 24
</example>
</memory_item>
** Count: 1

GENERAL META-REASONING STRATEGIES
<memory_item>
<description>
Strategy for solving the 24 game: Try different combinations of operations and parentheses to arrive at 24. Start by combining numbers to get intermediate values close to 24 or that can be easily combined to reach 24. (Q1)
</description>
<example>
1.  Try simple combinations first: (a + b) * (c / d), (a + b) * (c - d), etc.
2.  If
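
The printed cheatsheet is plain text with `<memory_item>` blocks, each optionally followed by a `** Count: N` line. To inspect it programmatically, a small parser can be sketched as follows. This assumes the layout shown in the output above; `parse_memory_items` is a hypothetical helper, not a dynamic_cheatsheet function, and the `sample` string is copied from the Round 2 cheatsheet.

```python
import re

# One <memory_item> block, optionally followed by a "** Count: N" line.
_ITEM = re.compile(
    r"<memory_item>\s*"
    r"<description>\s*(?P<description>.*?)\s*</description>\s*"
    r"<example>\s*(?P<example>.*?)\s*</example>\s*"
    r"</memory_item>"
    r"(?:\s*\*\*\s*Count:\s*(?P<count>\d+))?",
    re.DOTALL,
)


def parse_memory_items(cheatsheet):
    """Return a list of dicts with description, example, and count (or None)."""
    items = []
    for match in _ITEM.finditer(cheatsheet):
        count = match.group("count")
        items.append({
            "description": match.group("description"),
            "example": match.group("example"),
            "count": int(count) if count else None,
        })
    return items


sample = """<memory_item>
<description>
Solving the 24 game: Given four numbers, use each number exactly once with the operations +, -, *, and / to obtain 24. (Q1)
</description>
<example>
Numbers: 5, 6, 6, 8. Solution: (8 - 5) * 6 + 6 = 24
</example>
</memory_item>
** Count: 2

<memory_item>
<description>
Solving the 24 game: Given four numbers, use each number exactly once with the operations +, -, *, and / to obtain 24. (Q1)
</description>
<example>
Numbers: 4, 7, 8, 8. Solution: (7 - (8 / 8)) * 4 = 24
</example>
</memory_item>
** Count: 1
"""

parsed = parse_memory_items(sample)
```

Items without a count line (as in the Version 1.1 cheatsheet) come back with `count` set to `None`, and a truncated block with no closing tag is skipped.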

In [13]:
# Round 3: run the same input once more, this time with the Round 2 cheatsheet
# Hopefully, the model will make use of the new cheatsheet and generate a more efficient solution
results_dict_round3 = output_dict = model.advanced_generate(
    approach_name="DynamicCheatsheet_Cumulative",
    input_txt=input_txt,
    cheatsheet=new_cheatsheet_round2,
    generator_template=generator_prompt,
    cheatsheet_template=curator_prompt
)
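
Rounds 1 to 3 repeat the same call and only swap in the previous round's `final_cheatsheet`, so the pattern can be written as a loop. This is a sketch under assumptions: `run_rounds` is a hypothetical helper, and `_EchoModel` is a hypothetical stand-in that mimics only the keyword arguments and result keys used in the cells above, so the example runs without an API key. With the real `model`, pass it in place of the stub.

```python
def run_rounds(model, input_txt, generator_prompt, curator_prompt, n_rounds=3):
    """Re-run the same input, feeding each round's cheatsheet into the next."""
    cheatsheet = "(empty)"  # same starting value as Round 1
    history = []
    for _ in range(n_rounds):
        result = model.advanced_generate(
            approach_name="DynamicCheatsheet_Cumulative",
            input_txt=input_txt,
            cheatsheet=cheatsheet,
            generator_template=generator_prompt,
            cheatsheet_template=curator_prompt,
        )
        history.append(result)
        # The curated cheatsheet becomes the input to the next round.
        cheatsheet = result["final_cheatsheet"]
    return history


class _EchoModel:
    """Hypothetical stand-in so the sketch runs without an API key."""

    def advanced_generate(self, approach_name, input_txt, cheatsheet,
                          generator_template, cheatsheet_template):
        return {
            "final_output": f"answer given cheatsheet {cheatsheet!r}",
            "final_cheatsheet": cheatsheet + " | note",
        }


history = run_rounds(_EchoModel(), "5 6 6 8", "gen", "cur", n_rounds=3)
```

Keeping every round's result in `history` makes it easy to compare how `final_output` and `final_cheatsheet` change from one round to the next.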

In [14]:
# Correct answer again!
results_dict_round3

{'input_txt': "Let's play a game called 24. You'll be given four integers, and your objective is to use each number only once, combined with any of the four arithmetic operations (addition, subtraction, multiplication, and division) and parentheses, to achieve a total of 24. For example, if the input is 4, 7, 8, and 8, the output could be (7 - (8 / 8)) * 4 = 24. Please present a single expression that evaluates to 24.\n\nQuestion #1:\n5 6 6 8",
 'steps': [{'round': 0,
   'generator_prompt': '# GENERATOR (PROBLEM SOLVER)\n\nInstruction: You are an expert problem-solving assistant tasked with analyzing and solving various questions using a combination of your expertise and provided reference materials. Each task will include:\n1. A specific question or problem to solve\n2. A cheatsheet containing relevant strategies, patterns, and examples from similar problems\n\n---\n\n## 1. ANALYSIS & STRATEGY\n\n- Carefully analyze both the question and cheatsheet before starting\n- Search for and id

In [15]:
print(results_dict_round3['final_output'])

Okay, I understand the game. I need to use the numbers 5, 6, 6, and 8 exactly once, with the operations +, -, *, and /, and parentheses to obtain 24.

From the cheatsheet, I see examples of solving the 24 game and a general strategy. I'll start by trying simple combinations and then move to more complex ones if needed.

Here's my attempt:

1.  Try simple combinations:
    *   (5 + 6) * (6 / 8) = 11 * 0.75 = 8.25 (Not 24)
    *   (5 * 6) - (6 + 8) = 30 - 14 = 16 (Not 24)
    *   (8 - 6) * (5 + 6) = 2 * 11 = 22 (Close!)
    *   (8 + 6) + (5 + 6) = 14 + 11 = 25 (Close!)
    *   (8 * 6) / (5 - 6) = 48 / -1 = -48 (Not 24)
    *   (8 * 5) - (6 + 6) = 40 - 12 = 28 (Close!)
    *   (6 * 6) - (8 + 5) = 36 - 13 = 23 (Very close!)

2.  Let's try to manipulate the last one: (6 * 6) - (8 + 5) = 23. We need to add 1. How can we get 1?
    *   We can't easily get 1.

3.  Let's go back to (8 - 6) * (5 + 6) = 22. We need to add 2.
    *   We can't easily get 2.

4.  Let's try another approach. We want 
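
The trial-and-error in the model's output above is effectively a manual search over ways to combine the four numbers. For comparison, an exhaustive search can be sketched as below. This is an illustration only, not how the model or dynamic_cheatsheet works; `solve_24` and `_search` are hypothetical names. It repeatedly picks two values, replaces them with the result of one operation, and recurses until a single value remains.

```python
from fractions import Fraction
from itertools import combinations


def solve_24(numbers, target=24):
    """Return one expression string reaching `target`, or None if none exists."""
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


def _search(items, target):
    """Combine two entries at a time until one value is left."""
    if len(items) == 1:
        value, text = items[0]
        return text if value == target else None
    for i, j in combinations(range(len(items)), 2):
        (a, sa), (b, sb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        candidates = [
            (a + b, f"({sa} + {sb})"),
            (a * b, f"({sa} * {sb})"),
            (a - b, f"({sa} - {sb})"),
            (b - a, f"({sb} - {sa})"),
        ]
        # Division is only tried when the divisor is non-zero.
        if b != 0:
            candidates.append((a / b, f"({sa} / {sb})"))
        if a != 0:
            candidates.append((b / a, f"({sb} / {sa})"))
        for value, text in candidates:
            found = _search(rest + [(value, text)], target)
            if found is not None:
                return found
    return None


solution = solve_24([5, 6, 6, 8])
```

The returned string is fully parenthesized, so it may look different from the cheatsheet's `(8 - 5) * 6 + 6` while still evaluating to 24.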

In [16]:
# QED.