In [16]:
prompt_template = """You need to generate python code for a synthetic procedural dataset. The dataset is similar to OpenAI's GSM8K which contains grade-school level math questions in natural language.

Here is the SOURCE item which you should translate into a python generator:

```json
{0}
```

As you can see, we already have `question_annotated` and `answer_annotated`, but they are not native python functions. Variable assignments are in the `question_annotated` field just after `#init:`.
Some variables have a `$` prefix, some do not. These variables are assigned random values through `sample()` and `range()`. In addition, since these are mathematical questions, there are also some conditions in the `question_annotated` field marked by `#conditions:`. Your job is to do the calculations while making sure that these conditions are adhered to.

Could you generate python code which would generate synthetic questions and answers? Essentially, you need to make sense of the question and adhere to the variable assignments and conditions in the `question_annotated` field. Besides the question and answer I also need some metadata, e.g. in a dict, about the variables used (some of the variables don't have intelligible names, e.g. x, g, y etc.; try to give them intelligible names).

I would like to use the generator function later to generate many different variants of questions and answers based on the same template while ensuring mathematical accuracy and consistency, in line with the conditions specified in the `question_annotated` field.
To control the difficulty I want to provide a floating point `difficulty` factor which could be used to scale the numeric ranges, but please ensure the values are integers (e.g. cast back to int). If there are variables for which no values are provided, like male_names, objects, names, fraction_alnum etc., please generate a list of values to sample from that fits the context of the question.

1. To make it modular and testable let's split the generator into two functions: `generate_from_variables()`, which gets the input variables, generates the question and answer texts and calculates the answer value from the inputs, and the main randomized generator `generate_example()` (see below).

2. The generator function should have a signature like `def generate_example(rng: Random, difficulty: float = 1.0) -> dict`.

The output dict should contain:
{{
  'question': '<the generated question>',
  'answer': '<the_final_answer>',  # here only the final answer, e.g. the number
  'metadata': {{
    'difficulty': difficulty,
    'answer_value': <numeric_answer_value>,
    'answer_cot': '<full_long_form_answer>',  # chain of thought, similar to 'answer' in the SOURCE
    'variables': {{
        ...  # the variables used
    }}
  }}
}}

3. Write a simple `original_example()` function which calls `generate_from_variables()` and passes the original input values used in the SOURCE json example above (in order to compare the output).

Your task:

- Generate reasonable random values for all the variables in line with the conditions in the `question_annotated` field
- Ensure mathematical consistency (results of divisions need to be integers)
- Create natural language question and answer texts
- Include metadata about the variables and solution


Here are useful examples of json input and python output:

{1}

Just generate the three python functions for the SOURCE dataset item - no additional explanation.
"""
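As a rough illustration of the three functions the prompt asks for, here is a hand-written sketch for the walking-to-school question that appears in the output at the end of this notebook. It is not actual model output: the name list, the variable names, the numeric ranges and the `####` marker before the final answer (the GSM8K answer convention) are all assumptions made for this example.

```python
from random import Random


def generate_from_variables(name1: str, name2: str, total_time: int, first_leg: int, second_leg: int) -> dict:
    """Build question and answer texts and compute the answer from explicit inputs."""
    question = (
        f"{name1} and {name2} have {total_time} minutes to walk to school together. "
        f"It takes them {first_leg} minutes to get to the corner where the library is. "
        f"It takes them another {second_leg} minutes to get to the fire station. "
        "How much longer do they have to get to school without being late?"
    )
    spent = first_leg + second_leg
    remaining = total_time - spent
    answer_cot = (
        f"{name1} and {name2} have already spent {first_leg} + {second_leg} = {spent} minutes. "
        f"They have {total_time} - {spent} = {remaining} minutes left.\n#### {remaining}"
    )
    return {
        "question": question,
        "answer": str(remaining),
        "metadata": {
            "answer_value": remaining,
            "answer_cot": answer_cot,
            "variables": {
                "name1": name1,
                "name2": name2,
                "total_time": total_time,
                "first_leg": first_leg,
                "second_leg": second_leg,
            },
        },
    }


def generate_example(rng: Random, difficulty: float = 1.0) -> dict:
    """Sample variables (ranges are assumptions) and delegate to generate_from_variables()."""
    names = ["John", "Jack", "Ava", "Liam", "Mia", "Noah"]  # hypothetical name pool
    name1, name2 = rng.sample(names, 2)
    # Scale the ranges by difficulty, cast back to int, and keep every value >= 1.
    first_leg = max(1, int(rng.randint(2, 10) * difficulty))
    second_leg = max(1, int(rng.randint(2, 15) * difficulty))
    slack = max(1, int(rng.randint(1, 15) * difficulty))
    # Condition: total time must exceed the time already spent, so the answer is positive.
    total_time = first_leg + second_leg + slack
    result = generate_from_variables(name1, name2, total_time, first_leg, second_leg)
    result["metadata"]["difficulty"] = difficulty
    return result


def original_example() -> dict:
    """Reproduce the source item with its original values."""
    return generate_from_variables("John", "Jack", 30, 6, 13)


print(original_example()["answer"])  # 11
```

Building the total from its parts plus a positive slack is one simple way to guarantee the condition by construction instead of resampling until it holds.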

In [17]:
from pathlib import Path
from typing import Optional

def write_python_code_with_prefix_and_number(code_string: str, num: int) -> Optional[Path]:
    """Writes Python code to generator_<num>.py and returns the path (None on failure)"""
    
    file_path = Path(f"../reasoning_gym/arithmetic/gsm_symbolic/generator_{num}.py")
    
    try:
        with open(file_path, 'w', encoding='utf-8') as f:
            f.write(code_string)
        print(f"Python code written to {file_path.absolute()}")
        return file_path
    except Exception as e:
        print(f"Error writing to file: {e}")
        return None

In [18]:
def read_file_to_string(file_path: str) -> str:
    """
    Reads file content and returns as string
    
    Args:
        file_path: Path to the file to read
        
    Returns:
        str: File contents as string
        
    Raises:
        FileNotFoundError: If file doesn't exist
        IOError: If reading fails
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            return file.read()
    except FileNotFoundError as e:
        raise FileNotFoundError(f"File not found: {file_path}") from e
    except IOError as e:
        raise IOError(f"Error reading file {file_path}: {e}") from e

In [19]:
# create open-router client, place your OPENROUTER_API_KEY in .env file
# .env contents:
# OPENROUTER_API_KEY=sk-or-v1- ...

%load_ext dotenv
%dotenv
import os
import re
from pathlib import Path
from typing import Any, Iterable, Optional
import json
from openai import OpenAI
from openai.types.chat import ChatCompletion, ChatCompletionMessageParam
import time

def llm_generate(
    client: OpenAI,
    messages: Iterable[ChatCompletionMessageParam],
    sampling_params: dict[str, Any],
) -> ChatCompletion:
    max_retry = 3
    for trial in range(max_retry):
        try:
            return client.chat.completions.create(
                messages=messages,
                **sampling_params,
            )
        except Exception as e:
            print("failure response:", e)
            if trial == max_retry - 1:
                raise  # out of retries, propagate the last error without sleeping first
            time.sleep((trial + 1) ** 2)  # quadratic backoff: 1s, then 4s

open_router_client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.getenv("OPENROUTER_API_KEY"),
    timeout=90.0,
)

sampling_params = {
    "model": "anthropic/claude-3.5-sonnet",
    "max_tokens": 4096,
}

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv
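The retry logic in `llm_generate` can be exercised without hitting the API. The sketch below is a generic variant, not the notebook's function: `retry_with_backoff`, its injectable `sleep` parameter and the `flaky` stand-in are hypothetical names introduced here so the delays can be recorded instead of actually waited for.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_retry: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on any exception with quadratic backoff (1s, 4s, 9s, ...)."""
    for trial in range(max_retry):
        try:
            return fn()
        except Exception as err:
            if trial == max_retry - 1:
                raise  # last attempt failed: re-raise without sleeping
            print(f"attempt {trial + 1} failed: {err}")
            sleep((trial + 1) ** 2)
    raise RuntimeError("max_retry must be at least 1")


# Stand-in for an API call that fails twice and then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays: list = []
result = retry_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)  # ok [1, 4]
```

Passing the sleep function as a parameter keeps the default behaviour (`time.sleep`) while making the backoff schedule easy to check in a test.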


In [20]:
def generate_simple_request(user_prompt: str, developer_prompt: Optional[str] = None) -> list[dict]:
    prompt = []
    if developer_prompt is not None:
        prompt.append( { "role": "system", "content": developer_prompt } )
    
    prompt.append( { "role": "user", "content": user_prompt })
    return prompt
    

def eval_prompt_template(input: str, cot: str):
    
    user_request = prompt_template.format(input, cot)
    input_messages = generate_simple_request(user_prompt=user_request)
    output = llm_generate(open_router_client, input_messages, sampling_params)

    response = output.choices[0].message.content

    return response
    

# clone gsm-symbolic from apple somewhere and set the path here: `git clone https://github.com/apple/ml-gsm-symbolic.git`
path_to_gsmsym = Path("../reasoning_gym/data/gsm_data/symbolic/")
print("Reading templates from path: ", path_to_gsmsym.absolute())

template_files = list(path_to_gsmsym.glob("*.json"))
print("Number of files: ", len(template_files))

gsm_symbolic_cot_file_path = Path("./gsm-symbolic-cot.txt")
gsm_symbolic_cot = read_file_to_string(gsm_symbolic_cot_file_path)

# Time to generate python code for all the gsm-symbolic templates
print("Generating python code for GSM symbolic templates\n")
for i, file in enumerate(template_files):
    response_text = eval_prompt_template(file.read_text(), gsm_symbolic_cot)
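    # extract python source section (the contents of the fenced code block in the response)
    result_match = re.search(r"^```.*\n((.*\n)+)```", response_text, flags=re.MULTILINE)
    if result_match is None:
        # the model did not return a fenced code block; skip instead of crashing on None
        print(f"No fenced code block found in response for {file.name}, skipping")
        continue
    python_source = result_match.group(1)
    write_python_code_with_prefix_and_number(python_source, i)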

print("Python code generation for templates completed!\n")

Reading templates from path:  /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/data/gsm_data/symbolic
Number of files:  100
Generating python code for GSM symbolic templates

Python code written to /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/arithmetic/gsm_symbolic/generator_0.py
Python code written to /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/arithmetic/gsm_symbolic/generator_1.py
Python code written to /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/arithmetic/gsm_symbolic/generator_2.py
Python code written to /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/arithmetic/gsm_symbolic/generator_3.py
Python code written to /Users/abdulhakeemadefioye/Desktop/deep-learning/reasoning-gym/notebooks/../reasoning_gym/arithmetic/gsm_symbolic/generator_4.py
Python code written to /
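The extraction step in the loop above can be tried in isolation. The following is a minimal sketch of a variant, not the notebook's exact regex: it uses a non-greedy match so that only the first fenced block is returned when a response contains several, and it returns `None` when there is no fence. The helper name `extract_first_code_block` is hypothetical, and the fence marker is built from single backticks only to keep this snippet free of literal fences.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built programmatically

# Opening fence (with optional language tag), then lazily everything up to
# the next line that starts with a closing fence.
CODE_BLOCK_RE = re.compile(
    r"^" + FENCE + r"[^\n]*\n(.*?)^" + FENCE,
    flags=re.MULTILINE | re.DOTALL,
)


def extract_first_code_block(response_text: str) -> Optional[str]:
    """Return the body of the first fenced code block, or None if there is none."""
    match = CODE_BLOCK_RE.search(response_text)
    return match.group(1) if match else None


response = (
    "Here is the code:\n"
    + FENCE + "python\n"
    + "def original_example():\n"
    + "    return 42\n"
    + FENCE + "\n"
    + "Let me know if you need changes.\n"
)
print(extract_first_code_block(response))
print(extract_first_code_block("no code here"))  # None
```

A greedy pattern would instead run to the last closing fence in the response, which merges multiple blocks together with any prose between them.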

In [11]:
# WARNING: We are now executing the llm response without sandbox environment!

scope = {}  # eval generated python code here

exec(python_source, scope, scope)


exec("output = original_example()", scope, scope)
generated_data = scope["output"]
print(generated_data['question'])


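# assumes python_source was generated from template_files[54]; otherwise the two questions will not match
original_data = json.loads(template_files[54].read_text())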
print(original_data['question'])


John and Jack have 30 minutes to walk to school together. It takes them 6 minutes to get to the corner where the library is. It takes them another 13 minutes to get to the fire station. How much longer do they have to get to school without being late?
John and Jack have 30 minutes to walk to school together. It takes them 6 minutes to get to the corner where the library is. It takes them another 13 minutes to get to the fire station. How much longer do they have to get to school without being late?
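Comparing one `original_example()` output by eye does not scale to 100 generated files. A possible next step, sketched here under assumptions, is a small structural check that executes a generated source string, confirms the three requested functions exist, and samples a few examples. `check_generated_source` and `demo_source` are hypothetical names for this illustration, and like the cell above it runs the code without a sandbox.

```python
from random import Random

REQUIRED_FUNCTIONS = ("generate_from_variables", "generate_example", "original_example")
REQUIRED_KEYS = ("question", "answer", "metadata")


def check_generated_source(source: str, num_samples: int = 5) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    scope: dict = {}
    try:
        exec(source, scope, scope)  # WARNING: executes untrusted code without a sandbox
    except Exception as err:
        return [f"exec failed: {err!r}"]

    problems = [f"missing function: {name}" for name in REQUIRED_FUNCTIONS if not callable(scope.get(name))]
    if problems:
        return problems

    for seed in range(num_samples):
        try:
            example = scope["generate_example"](Random(seed), 1.0)
        except Exception as err:
            problems.append(f"generate_example failed for seed {seed}: {err!r}")
            continue
        for key in REQUIRED_KEYS:
            if key not in example:
                problems.append(f"seed {seed}: missing key '{key}'")
    return problems


# Tiny stand-in for a generated module, used only to demonstrate the check.
demo_source = '''
def generate_from_variables(a, b):
    return {"question": f"What is {a} + {b}?", "answer": str(a + b), "metadata": {"answer_value": a + b}}

def generate_example(rng, difficulty=1.0):
    return generate_from_variables(rng.randint(1, 9), rng.randint(1, 9))

def original_example():
    return generate_from_variables(2, 3)
'''

print(check_generated_source(demo_source))  # []
print(check_generated_source("x = 1"))
```

This only checks structure. Verifying that `original_example()` reproduces the source question and answer still needs a comparison against the template JSON, as done in the cell above.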
