In [7]:
from datasets import load_dataset
from prompting import gen_prompt

ds = load_dataset("TIGER-Lab/MMLU-Pro")

In [1]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-14B")


In [9]:
example_question = gen_prompt(ds['test'][0])

example_text = r"""Please answer the following multiple choice question. After thinking, respond directly with the letter of your answer.

Question:
The symmetric group $S_n$ has $
\factorial{n}$ elements, hence it is not true that $S_{10}$ has 10 elements.
Find the characteristic of the ring 2Z.

Options:
A:  0
B: 30
C: 3
D: 10
E: 12
F: 50
G: 2
H: 100
I: 20
J: 5

Okay, so I need to find the characteristic of the ring 2Z. Let me recall what the characteristic of a ring is. The characteristic is the smallest positive integer n such that n times the multiplicative identity equals zero. If no such n exists, the characteristic is zero. But wait, 2Z is the ring of even integers under addition and multiplication. Hmm, but 2Z doesn't have a multiplicative identity element, right? Because in the integers Z, the multiplicative identity is 1, but 1 isn't in 2Z. So if the ring doesn't have a multiplicative identity, how does the characteristic work here?

Wait, the definition of characteristic for a ring might not require the existence of a multiplicative identity. Let me check. Oh right, for a ring without unity, the characteristic is defined as the smallest positive integer n such that n*a = 0 for all elements a in the ring. If no such n exists, then the characteristic is 0. So even if there's no 1 in the ring, we can still determine the characteristic by seeing if there's a common n that annihilates all elements when multiplied by n.

So for 2Z, let's see. Take any element a in 2Z. So a is an even integer, say a = 2k for some integer k. Then n*a would be n*(2k) = 2k*n. For this to be zero for all k, we need 2kn = 0 for all k. But since we're dealing with integers (not modulo anything), the only way this can happen is if n is zero. Wait, but in the integers, there's no positive integer n such that multiplying by n gives zero. So even for 2Z, which is a subset of Z, there's no positive integer n where n*a =0 for all a in 2Z. Because, for example, take a=2. Then n*2=0 would imply n=0, but n has to be positive. So there's no such n, which means the characteristic is zero. Therefore, the answer should be option A: 0.

But wait, I should double-check. Maybe I'm confusing additive order with the ring's characteristic. The characteristic is the smallest positive integer n such that n*1=0 in the ring. But since 2Z doesn't have 1, does that matter? So in rings without unity, the definition uses the additive order. Alternatively, if all elements of the ring have additive order dividing n, then the characteristic is the least such n. If there's no such n, then it's 0. Since 2Z is a subset of Z, and Z has characteristic 0, maybe 2Z also has characteristic 0. Because adding 2 to itself n times would be 2n. For 2n=0, n must be 0, which isn't allowed. So yes, the characteristic is 0. So the answer is A.

The characteristic of a ring is the smallest positive integer nn such that n⋅a=0n⋅a=0 for all elements aa in the ring. If no such nn exists, the characteristic is 00. The ring 2Z2Z, consisting of all even integers, does not have a multiplicative identity. However, the characteristic is determined by the additive structure. For any element a∈2Za∈2Z, a=2ka=2k where k∈Zk∈Z. Multiplying aa by a positive integer nn gives n⋅a=2knn⋅a=2kn, which equals 00 only if n=0n=0. Since nn must be positive and no such nn annihilates all elements in 2Z2Z, the characteristic is 00.

Answer: A: 0"""

Answer the following multiple choice question, selecting from the answer A through to J. After thinking, reply directly with your answer. 

The last character of your response must be the letter you have chosen.

Question:

Typical advertising regulatory bodies suggest, for example that adverts must not: encourage _________, cause unnecessary ________ or _____, and must not cause _______ offence.

Options:

A: Safe practices, Fear, Jealousy, Trivial
B: Unsafe practices, Distress, Joy, Trivial
C: Safe practices, Wants, Jealousy, Trivial
D: Safe practices, Distress, Fear, Trivial
E: Unsafe practices, Wants, Jealousy, Serious
F: Safe practices, Distress, Jealousy, Serious
G: Safe practices, Wants, Fear, Serious
H: Unsafe practices, Wants, Fear, Trivial
I: Unsafe practices, Distress, Fear, Serious

Answer:
    <think>
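The prompt above asks for the chosen letter as the last character of the response. A minimal sketch of how that convention could be parsed follows. `extract_answer` and `VALID_LETTERS` are hypothetical names, not part of the notebook, and the sketch assumes the model closes its reasoning with a `</think>` tag before giving the final answer.

```python
from typing import Optional

# Hypothetical helper, not from the notebook: MMLU-Pro options run from A to J.
VALID_LETTERS = "ABCDEFGHIJ"

def extract_answer(response_text: str) -> Optional[str]:
    """Return the final answer letter, or None if the last character is not A-J."""
    # Assumption: reasoning ends with "</think>"; keep only the text after the last one.
    final_part = response_text.rsplit("</think>", 1)[-1].strip()
    if not final_part:
        return None
    last_char = final_part[-1].upper()
    return last_char if last_char in VALID_LETTERS else None

print(extract_answer("...reasoning...</think>\n\nThe answer is I"))  # I
print(extract_answer("Answer: A: 0"))  # None, because the last character is a digit
```

Returning `None` instead of guessing makes it easy to count responses that ignored the format.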



In [10]:
tokens = tokenizer(example_text)
len(tokens['input_ids'])

925

Try async requests with exponential backoff

In [51]:
import asyncio
import random
import uuid
from pathlib import Path
from typing import List, Any
from together import AsyncTogether

async_client = AsyncTogether()
messages = [
    "What are the top things to do in San Francisco?",
    "What country is Paris in?",
] * 10

# Configuration for exponential backoff
MAX_RETRIES = 6
BASE_DELAY = 1  # starting delay in seconds
MAX_DELAY = 60 * 5  # maximum delay in seconds
JITTER_FACTOR = 0.1  # adds randomness to avoid thundering herd problem
RATE_LIMIT = 90.0 / 60 # queries per second

async def with_exponential_backoff(task_func, *args, **kwargs):
    """Execute a task, retrying with exponential backoff on 429 and 503 errors."""
    retries = 0

    while True:
        try:
            print("Querying API")
            return await task_func(*args, **kwargs)
        except Exception as e:
            # Retry only on rate-limit (429) or service-overloaded (503) errors
            status_code = getattr(e, 'status_code', None)
            is_rate_limit = status_code == 429
            is_overloaded = status_code == 503

            if not (is_rate_limit or is_overloaded) or retries >= MAX_RETRIES:
                # Not a retryable error, or max retries reached: re-raise
                raise

            # Calculate delay with exponential backoff and jitter
            delay = min(MAX_DELAY, BASE_DELAY * (2 ** retries))
            # Add jitter (± 10% randomness)
            jitter = delay * JITTER_FACTOR
            delay = delay + random.uniform(-jitter, jitter)

            reason = "Rate limited" if is_rate_limit else "Service overloaded"
            print(f"{reason}. Retrying in {delay:.2f} seconds (retry {retries + 1}/{MAX_RETRIES})...")
            await asyncio.sleep(delay)
            retries += 1

async def make_chat_completion(message: str) -> Any:
    """Make a single chat completion with retry logic."""
    response = await with_exponential_backoff(
        async_client.chat.completions.create,
        model="deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
        temperature=0.6, # Default temperature seems to lead to loops
        messages=[{"role": "user", "content": message}],
    )

    # Concatenate the initial prompt and the response
    text = message + response.choices[0].message.content

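    # Save the prompt and response to a file, creating the directory if needed
    filepath = Path.cwd() / "responses" / f"{uuid.uuid4()}.txt"
    filepath.parent.mkdir(parents=True, exist_ok=True)
    filepath.write_text(text)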

    # return the response
    return response

async def async_chat_completion(messages: List[str], stagger_delay: float = 1.0):
    """Process multiple chat completions concurrently, staggering their start times."""
    # Create tasks with retries
    tasks = []

    # Create tasks one at a time with staggered starts
    for i, message in enumerate(messages):
        print(f"Starting request {i+1}/{len(messages)}...")
        
        # Create the task and add it to our list
        task = asyncio.create_task(make_chat_completion(message))
        tasks.append(task)
        
        # Wait before starting the next one to achieve true staggering
        if i < len(messages) - 1:  # No need to wait after the last message
            await asyncio.sleep(stagger_delay)
    
    # Use gather to run concurrently
    responses = await asyncio.gather(*tasks, return_exceptions=True)
    
    # Process responses
    for i, response in enumerate(responses):
        if isinstance(response, Exception):
            print(f"Request {i+1} failed after all retries: {str(response)}")
        else:
            print(f"Request {i+1} succeeded:")
            print(response.choices[0].message.content)
            print("-" * 50)

    return responses

from time import time

# start = time()
# await async_chat_completion(messages, stagger_delay=1 / RATE_LIMIT)
# end = time()

# print(f"Took {end - start} seconds")
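To see what the backoff settings above imply, here is a small sketch that lists the nominal delay before each retry, ignoring jitter. `backoff_schedule` is a hypothetical helper added for illustration; the constants are copied from the cell above.

```python
# Constants copied from the configuration above
MAX_RETRIES = 6
BASE_DELAY = 1       # seconds
MAX_DELAY = 60 * 5   # seconds

def backoff_schedule(max_retries=MAX_RETRIES, base_delay=BASE_DELAY, max_delay=MAX_DELAY):
    """Return the jitter-free delay (in seconds) slept before each retry."""
    return [min(max_delay, base_delay * (2 ** retry)) for retry in range(max_retries)]

schedule = backoff_schedule()
print(schedule)       # [1, 2, 4, 8, 16, 32]
print(sum(schedule))  # 63
```

With these values a request waits at most about a minute in total (plus or minus jitter) before the error is re-raised, and the 5-minute `MAX_DELAY` cap is never reached.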


In [55]:
start = time()
res = await async_chat_completion(
    [
        gen_prompt(ds['test'][i])
        for i in range(len(ds['test']))
    ], 
    stagger_delay=1/RATE_LIMIT,
)
end = time()

print(f"Took {end - start} seconds.")
res

Starting request 1/12032...
Querying API
Starting request 2/12032...
Querying API
Starting request 3/12032...
Querying API
Starting request 4/12032...
Querying API
Starting request 5/12032...
Querying API
Starting request 6/12032...
Querying API
Starting request 7/12032...
Querying API
Starting request 8/12032...
Querying API
Starting request 9/12032...
Querying API
Starting request 10/12032...
Querying API
Starting request 11/12032...
Querying API
Starting request 12/12032...
Querying API
Starting request 13/12032...
Querying API
Starting request 14/12032...
Querying API
Starting request 15/12032...
Querying API
Starting request 16/12032...
Querying API
Starting request 17/12032...
Querying API
Starting request 18/12032...
Querying API
Starting request 19/12032...
Querying API
Starting request 20/12032...
Querying API
Starting request 21/12032...
Querying API
Starting request 22/12032...
Querying API
Starting request 23/12032...
Querying API
Starting request 24/12032...
Querying API
S

Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a039afc0>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a0d731d0>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a0283b30>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a0283b60>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a0281d30>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7e83a0281760>


Starting request 1703/12032...
Querying API
Starting request 1704/12032...
Querying API
Starting request 1705/12032...
Querying API
Starting request 1706/12032...
Querying API
Starting request 1707/12032...
Querying API
Starting request 1708/12032...
Querying API
Starting request 1709/12032...
Querying API
Starting request 1710/12032...
Querying API
Starting request 1711/12032...
Querying API
Starting request 1712/12032...
Querying API
Starting request 1713/12032...
Querying API
Starting request 1714/12032...
Querying API
Starting request 1715/12032...
Querying API
Starting request 1716/12032...
Querying API
Starting request 1717/12032...
Querying API
Starting request 1718/12032...
Querying API
Starting request 1719/12032...
Querying API
Starting request 1720/12032...
Querying API
Starting request 1721/12032...
Querying API
Starting request 1722/12032...
Querying API
Starting request 1723/12032...
Querying API
Starting request 1724/12032...
Querying API
Starting request 1725/12032...
Q

IOPub data rate exceeded.
The Jupyter server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--ServerApp.iopub_data_rate_limit`.

Current values:
ServerApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
ServerApp.rate_limit_window=3.0 (secs)



TypeError: Object of type CIMultiDictProxy is not JSON serializable
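The run above prints two lines per request, which is likely what triggered the IOPub data rate limit, and the number of in-flight requests is bounded only by the stagger delay. One possible alternative, not what the notebook does, is to cap concurrency with `asyncio.Semaphore` and report progress only occasionally. `run_bounded`, `max_in_flight`, `report_every`, and `fake_worker` are hypothetical names; the fake worker stands in for `make_chat_completion` so the sketch runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List

async def run_bounded(
    prompts: List[str],
    worker: Callable[[str], Awaitable[str]],
    max_in_flight: int = 8,     # hypothetical cap on simultaneous requests
    report_every: int = 100,    # print progress once per this many finished requests
) -> list:
    """Run worker over prompts with at most max_in_flight running at once."""
    semaphore = asyncio.Semaphore(max_in_flight)
    finished = 0

    async def guarded(prompt: str):
        nonlocal finished
        async with semaphore:
            try:
                return await worker(prompt)
            finally:
                # Count successes and failures alike, and print sparingly
                finished += 1
                if finished % report_every == 0 or finished == len(prompts):
                    print(f"Finished {finished}/{len(prompts)} requests")

    # Exceptions are returned in place, as in the original gather call
    return await asyncio.gather(*(guarded(p) for p in prompts), return_exceptions=True)

async def fake_worker(prompt: str) -> str:
    """Stand-in for a real API call."""
    await asyncio.sleep(0.01)
    return prompt.upper()

results = asyncio.run(run_bounded([f"q{i}" for i in range(250)], fake_worker))
print(len(results))
```

Inside a notebook cell, where an event loop is already running, the last call would be `results = await run_bounded(...)` instead of `asyncio.run(...)`. Assigning the result to a variable and not echoing it as the cell's last expression also keeps the output small.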

In [38]:
type(res[0].choices[0].message.content)

str

In [43]:
from pathlib import Path
Path.cwd()

PosixPath('/home/will/src/ai/mats')

In [48]:
import uuid


4

In [54]:
len(ds['test'])

12032
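A back-of-the-envelope estimate of what a full run costs, using only numbers that appear above. Treating the 925-token example as typical of every question is an assumption; real prompts and responses will vary in length.

```python
NUM_QUESTIONS = 12032        # len(ds['test'])
RATE_LIMIT = 90.0 / 60       # queries per second, as configured earlier
TOKENS_PER_EXAMPLE = 925     # token count of the single worked example (assumed typical)

stagger_delay = 1 / RATE_LIMIT
# The last request starts after (N - 1) stagger intervals
launch_seconds = (NUM_QUESTIONS - 1) * stagger_delay
total_tokens = NUM_QUESTIONS * TOKENS_PER_EXAMPLE

print(f"Stagger delay: {stagger_delay:.3f} s")                          # 0.667 s
print(f"Time to launch all requests: {launch_seconds / 3600:.2f} h")    # 2.23 h
print(f"Rough token volume: {total_tokens:,}")                          # 11,129,600
```

So staggering alone takes a little over two hours before the final request is even sent, regardless of how quickly the API responds.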