In [10]:
# Imports
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

In [31]:
# Constants and variables
MODEL_PATH = "data/Qwen2.5-GRPO-3B"
CHECKPOINT = "checkpoint-10"
DATASET = "Maxwell-Jia/AIME_2024"
CUDA_IDX = 1

In [8]:
# Load a model and its tokenizer
def load_model_and_tokenizer(model_path, checkpoint=None):
    """Load a causal LM and its tokenizer from model_path, or from a checkpoint subdirectory."""
    # CUDA_IDX is incremented on every call so that each loaded model lands on its own GPU.
    global CUDA_IDX
    if checkpoint:
        model_path = f"{model_path}/{checkpoint}"
    print(f"Loading model from {model_path}...")
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype="auto",
        device_map=f"cuda:{CUDA_IDX}"
    )
    CUDA_IDX += 1
    model.eval()  # inference only: disables dropout
    return model, tokenizer


In [32]:
model_10, tokenizer_10 = load_model_and_tokenizer(MODEL_PATH, CHECKPOINT)
final_model, final_tokenizer = load_model_and_tokenizer(MODEL_PATH)
dataset = load_dataset(DATASET, split="train")

Generating train split: 30 examples [00:00, 4195.98 examples/s]


In [41]:
def make_conversation(example):
    """Build the chat messages (system instruction plus problem) for one dataset example."""
    system_prompt = "You are a helpful AI Assistant, designed to provided well-reasoned and detailed responses. You FIRST think about the reasoning process step by step and then provide the user with the answer. Please enclose your final answer in the box: \\boxed{Your Answer}. Please stop generation immediately after outputing the box."
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": example["Problem"]},
    ]
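
To make it easier to see what `apply_chat_template` does with these messages, the sketch below renders a message list by hand. It assumes the ChatML-style layout (`<|im_start|>role ... <|im_end|>`) that Qwen2.5 tokenizers appear to use. `render_chatml` and `demo_messages` are hypothetical names for illustration, and the tokenizer's own template remains the source of truth.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate a ChatML-style chat template (assumed layout, not the tokenizer's code)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in start/end markers and tagged with its role.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Illustrative messages only; the notebook builds its own with make_conversation.
demo_messages = [
    {"role": "system", "content": "You are a helpful AI Assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
]
rendered = render_chatml(demo_messages)
print(rendered)
```

Passing `tokenize=False` to the real `apply_chat_template` returns a string of this kind rather than token IDs, which is convenient for inspecting the prompt before generation.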

In [42]:
sample_idx = 15
sample = dataset[sample_idx]
print("Question:", sample['Problem'])
print("\nGold Answer:", sample['Solution'])
print("\nChat template applied: ", tokenizer_10.apply_chat_template(make_conversation(sample), tokenize=False, add_generation_prompt=True))

Question: Let $p$ be the least prime number for which there exists a positive integer $n$ such that $n^{4}+1$ is divisible by $p^{2}$. Find the least positive integer $m$ such that $m^{4}+1$ is divisible by $p^{2}$.

Gold Answer: Note that $n^4 + 1 \equiv 0 \pmod{p}$ means $\text{ord}_{p}(n) = 8 \mid p-1.$ The smallest prime that does this is $17$ and $2^4 + 1 = 17$ for example. Now let $g$ be a primitive root of $17^2.$ The satisfying $n$ are of the form, $g^{\frac{p(p-1)}{8}}, g^{3\frac{p(p-1)}{8}}, g^{5\frac{p(p-1)}{8}}, g^{7\frac{p(p-1)}{8}}.$ So if we find one such $n$, then all $n$ are $n, n^3, n^5, n^7.$ Consider the $2$ from before. Note $17^2 \mid 2^{4 \cdot 17} + 1$ by LTE. Hence the possible $n$ are, $2^{17}, 2^{51}, 2^{85}, 2^{119}.$ Some modular arithmetic yields that $2^{51} \equiv \boxed{110}$ is the least value.

Chat template applied:  <|im_start|>system
You are a helpful AI Assistant, designed to provided well-reasoned and detailed responses. You FIRST think about the
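
The gold answer for this sample can be cross-checked by brute force. The sketch below is not part of the dataset or the model pipeline. `is_prime` and `least_prime_and_m` are hypothetical helpers that try primes in increasing order and, for each, scan the residues modulo $p^2$ for a solution of $m^4 + 1 \equiv 0 \pmod{p^2}$.

```python
from itertools import count


def is_prime(k):
    """Trial-division primality test, adequate for the small values searched here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def least_prime_and_m():
    """Return (p, m): the least prime p with a solution, and the least m for that p."""
    for p in count(2):
        if not is_prime(p):
            continue
        modulus = p * p
        # Scanning residues in increasing order means the first hit is the least m.
        for m in range(1, modulus):
            if (pow(m, 4, modulus) + 1) % modulus == 0:
                return p, m


print(least_prime_and_m())  # (17, 110)
```

The result agrees with the boxed value in the gold solution, so it can serve as a reference when reading the two model completions.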

In [None]:
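def completion(model, tokenizer, sample):
    """Generate a response for one dataset sample and return only the newly generated text."""
    # Render the chat messages into a single prompt string.
    problem = tokenizer.apply_chat_template(
        make_conversation(sample),
        tokenize=False,
        add_generation_prompt=True
    )
    inputs = tokenizer(problem, return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=2048
    )
    # Slice off the prompt tokens so that only the completion is decoded.
    # The result gets its own name so it does not shadow the function.
    prompt_length = inputs.input_ids.shape[-1]
    generated_text = tokenizer.decode(generated_ids[0][prompt_length:])
    return generated_text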
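
Because the system prompt asks for the final answer inside `\boxed{...}`, a completion can be compared with a reference answer by pulling that span out of the generated text. The following is a minimal sketch, assuming the last `\boxed{...}` in the text is the final answer. `extract_boxed` is a hypothetical helper, not part of the notebook or of `transformers`.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if there is none."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    content_start = start + len(marker)
    depth = 1  # the marker itself opens one brace
    for i in range(content_start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[content_start:i].strip()
    # Braces never balanced, e.g. the completion was cut off by max_new_tokens.
    return None


print(extract_boxed("Some modular arithmetic yields \\boxed{110}."))
```

Returning `None` for a missing or unterminated box is a deliberate choice here: a completion that hits the token limit before producing its answer is then counted as unanswered rather than raising an error.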

In [36]:
print(f"Checkpoint 10 completion:\n{completion(model_10, tokenizer_10, sample)}")

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Checkpoint 10 completion:
 To solve this problem, we need to find the smallest prime number \( p \) such that there exists a positive integer \( n \) for which \( n^4 + 1 \) is divisible by \( p^2 \). Once we find this prime \( p \), we need to find the smallest positive integer \( m \) such that \( m^4 + 1 \) is divisible by \( p^2 \).

Let's break down the problem into manageable steps:

1. **Find the smallest prime \( p \) such that \( n^4 + 1 \equiv 0 \pmod{p^2} \) for some \( n \).**
2. **Once we find \( p \), find the smallest \( m \) such that \( m^4 + 1 \equiv 0 \pmod{p^2} \).**

We can use Python and the sympy library to help us with this. Let's start by writing the code to find the smallest prime \( p \).

```python
import sympy as sp

# Function to find the smallest prime p such that n^4 + 1 is divisible by p^2
def find_smallest_prime():
    p = 2
    while True:
        # Check if there exists an n such that n^4 + 1 is divisible by p^2
        for n in range(1, p**2):
     

In [37]:
print(f"Final Model completion:\n{completion(final_model, final_tokenizer, sample)}")

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


Final Model completion:
 To solve this problem, we need to find the smallest prime number \( p \) such that there exists a positive integer \( n \) for which \( n^4 + 1 \) is divisible by \( p^2 \). Once we find this prime \( p \), we need to find the smallest positive integer \( m \) such that \( m^4 + 1 \) is divisible by \( p^2 \).

First, let's consider the expression \( n^4 + 1 \). We can factorize it as follows:
\[ n^4 + 1 = (n^2 + \sqrt{-1})(n^2 - \sqrt{-1}) = (n^2 + i)(n^2 - i). \]
For \( n^4 + 1 \) to be divisible by \( p^2 \), both \( n^2 + i \) and \( n^2 - i \) must be divisible by \( p \). This means that \( n^2 \equiv -i \pmod{p} \) and \( n^2 \equiv i \pmod{p} \). Therefore, \( -i \equiv i \pmod{p} \), which implies \( 2i \equiv 0 \pmod{p} \). This means that \( p \) must be a divisor of 2, but since \( p \) is a prime number, the only possible value for \( p \) is 2. However, we need to check if there exists an \( n \) such that \( n^4 + 1 \) is divisible by \( 2^2 = 4 