# Notebook for loading and submitting questions
***

## Question JSON Format

Each question is a Python dict with the following keys:

- **uuid**: unique identifier
- **ChemIQ**: Boolean indicating whether the question is part of the main ChemIQ benchmark
- **question_category**, **sub_category**  
- **meta_data**: e.g. `smiles`, `smiles_random`, `carbon_count`  
- **prompt**: the question text shown to users  
- **answer**: the expected answer  
- **answer_format**, **answer_range**, **verification_method**  

To submit a question, send its `prompt` and keep track of the `uuid` so that the response can be matched back to the question.  
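
The key list above can be checked mechanically before anything is submitted. The following is a minimal sketch, assuming every record is expected to carry all of the listed keys (the format description does not say which, if any, are optional). `REQUIRED_KEYS` and `missing_keys` are hypothetical names introduced for illustration, and the example record is made up rather than taken from the benchmark.

```python
# Keys named in the format description; treating all of them as required is an assumption.
REQUIRED_KEYS = {
    "uuid", "ChemIQ", "question_category", "sub_category", "meta_data",
    "prompt", "answer", "answer_format", "answer_range", "verification_method",
}


def missing_keys(question):
    """Return the expected keys that are absent from a question dict, sorted by name."""
    return sorted(REQUIRED_KEYS - set(question))


# Made-up record for illustration only; it is deliberately incomplete.
example = {
    "uuid": "example-0001",
    "prompt": "How many carbon atoms are in CCO?",
    "answer": "2",
}
print(missing_keys(example))
```

Running a check like this over the loaded records before building the submission file catches malformed lines early, when they are cheap to fix.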


In [1]:
import json
from collections import Counter
from pathlib import Path

# Load all questions
lines = Path('questions/chemiq.jsonl').read_text(encoding='utf-8').splitlines()
data = [json.loads(line) for line in lines]

# Summarise totals
total = len(data)
counts = Counter()
for q in data:
    counts[(q['question_category'], q['sub_category'])] += 1

print(f"Total questions (n={total}):")
for (category, sub_category), count in sorted(counts.items()):
    print(f" - {category!r}, {sub_category!r}: {count}")

# Keep only the questions that belong to the main ChemIQ benchmark
chemiq_questions = [q for q in data if q.get('ChemIQ', False)]

Total questions (n=816):
 - 'atom_mapping', 'random': 92
 - 'atom_mapping', 'semi-canonical': 92
 - 'counting_carbon', 'counting': 50
 - 'counting_ring', 'counting': 48
 - 'nmr_elucidation', 'small': 46
 - 'nmr_elucidation', 'zinc_2d': 50
 - 'reaction', 'synthetic_canonical': 45
 - 'reaction', 'synthetic_random': 45
 - 'sar', 'integer': 20
 - 'sar', 'noise': 20
 - 'shortest_path', 'canonical': 54
 - 'shortest_path', 'random': 54
 - 'smiles_to_iupac', 'zinc_canonical': 100
 - 'smiles_to_iupac', 'zinc_random': 100


## Example question

In [2]:
print(f"{'='*20} PROMPT {'='*20}")
print(chemiq_questions[0]["prompt"])
print(f"{'='*20} ANSWER {'='*20}")
print(chemiq_questions[0]["answer"])

==================== PROMPT ====================
Write the SMILES string of the molecule consistent with this data.

Formula: C6H4BrNO

1H NMR: δ 8.02 (ddd, J = 8.53, 2.05, 0.46 Hz, 2H), 7.5 (ddd, J = 8.53, 1.26, 0.46 Hz, 2H).

13C NMR: δ 163.95 (1C, s), 132.71 (2C, s), 122.19 (2C, s), 120 (1C, s).

COSY (δH, δH): (7.5, 8.02).

HSQC (δH, δC): (7.5, 132.71), (8.02, 122.19).

HMBC (δH, δC): (8.02, 163.95), (7.5, 163.95), (8.02, 132.71), (7.5, 132.71), (8.02, 122.19), (7.5, 122.19), (8.02, 120), (7.5, 120).

Only write the SMILES string. Do not write stereochemistry. Do not write any comments.
==================== ANSWER ====================
O=Nc1ccc(Br)cc1


# Running the benchmark using the OpenAI API
***

## Create batch submission file

In [None]:
import os

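# API batch file
batch_submission_file = 'gpt-4o-2024-11-20-submission.jsonl'

# Create the parent directory only if the path contains one;
# os.makedirs('') raises FileNotFoundError for a bare filename.
output_dir = os.path.dirname(batch_submission_file)
if output_dir:
    os.makedirs(output_dir, exist_ok=True)
with open(batch_submission_file, 'w', encoding='utf-8') as f: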
    for question in chemiq_questions:
        question_id = question["uuid"]
        prompt = question["prompt"]
        record = {
            "custom_id": question_id,
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-2024-11-20",
                "messages": [{"role": "user", "content": prompt}],
            }
        }
        f.write(json.dumps(record) + "\n")

print(f"Successfully wrote batch requests to {batch_submission_file}. Number of questions = {len(chemiq_questions)}")
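
Before uploading, it can be worth reading the submission file back and checking that every line is a well-formed request. The sketch below assumes the four fields written by the cell above (`custom_id`, `method`, `url`, `body`) are the ones to check and that each `custom_id` should be unique. `validate_batch_lines` is a hypothetical helper, not part of the OpenAI client.

```python
import json


def validate_batch_lines(lines):
    """Check batch request lines and return the number of valid records.

    Raises ValueError on a missing field or a duplicated custom_id.
    """
    seen = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        for field in ("custom_id", "method", "url", "body"):
            if field not in record:
                raise ValueError(f"line {number}: missing field {field!r}")
        custom_id = record["custom_id"]
        if custom_id in seen:
            raise ValueError(f"line {number}: duplicate custom_id {custom_id!r}")
        seen.add(custom_id)
    return len(seen)
```

In the notebook this could be called on the lines of `batch_submission_file`, and the returned count compared with `len(chemiq_questions)`.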

## Submit batch to OpenAI API

In [None]:
import os
from openai import OpenAI

# Read the OpenAI API key from the OPENAI_API_KEY environment variable.
# This raises a KeyError if the variable is not set. Either set it in your environment or assign the key below.
api_key = os.environ["OPENAI_API_KEY"]
client = OpenAI(api_key=api_key)

In [None]:
# Submit batch to OpenAI API
# Uncomment if you want to submit questions. Currently commented out for safety.

# # Open the file in a context manager so the handle is closed after the upload
# with open(batch_submission_file, "rb") as f:
#     batch_input_file = client.files.create(
#         file=f,
#         purpose="batch"
#     )

# batch_input_file_id = batch_input_file.id
# created_batch = client.batches.create(
#     input_file_id=batch_input_file_id,
#     endpoint="/v1/chat/completions",
#     completion_window="24h",
#     metadata={
#         "description": batch_submission_file,
#     }
# )
# print(created_batch)
# # Keep track of the Batch ID if submitting multiple different models
# print(f"Batch ID: {created_batch.id}")
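
A submitted batch has to reach a terminal state before its results can be downloaded. The sketch below polls `client.batches.retrieve` until that happens. `wait_for_batch`, `poll_seconds` and `max_polls` are hypothetical names introduced here, and the set of terminal statuses is an assumption that should be checked against the current Batch API documentation.

```python
import time

# Statuses assumed to mean the batch will not change any further.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(client, batch_id, poll_seconds=60.0, max_polls=None):
    """Poll a batch until it reaches a terminal status and return the batch object.

    Raises TimeoutError if max_polls is given and the batch is still running
    after that many polls.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        polls += 1
        time.sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} not finished after {polls} polls")
```

With a real client this would be called as `wait_for_batch(client, created_batch.id)`. The returned object's `status` still needs to be checked, since `failed` and `expired` are terminal too.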

## Download results

In [None]:
# batch_results_file = "gpt-4o-2024-11-20-results.jsonl"

# # Get results
# # Uncomment to retrieve the results of the submitted batch and write them to file (the batch must have completed first).

# batch_result = client.batches.retrieve(created_batch.id)
# print(batch_result)

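# if batch_result.error_file_id:
#     # Show any per-request errors reported by the API
#     error_file_response = client.files.content(batch_result.error_file_id)
#     print(error_file_response.content.decode('utf-8'))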

# if batch_result.output_file_id:
#     output_file_response = client.files.content(batch_result.output_file_id)

#     # Decode the binary content to a UTF-8 string
#     data_str = output_file_response.content.decode('utf-8')

#     # Split the decoded string by lines and parse each non-empty line as JSON
#     results = [json.loads(line) for line in data_str.splitlines() if line.strip()]

#     # Write results (inside the `if`, so `results` is always defined here)
#     results_dir = os.path.dirname(batch_results_file)
#     if results_dir:  # os.makedirs('') raises FileNotFoundError for a bare filename
#         os.makedirs(results_dir, exist_ok=True)
#     with open(batch_results_file, "w", encoding="utf-8") as f:
#         for record in results:
#             # dump each dict as a JSON string, followed by newline
#             f.write(json.dumps(record, ensure_ascii=False) + "\n")

#     print(f"Wrote {len(results)} records to {batch_results_file}")
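
Once the results have been parsed, each record has to be matched back to its question through `custom_id`, which was set to the question's `uuid` when the batch was created. The sketch below assumes the documented Batch API output layout, in which each record holds the chat completion under `response.body`. `extract_responses` is a hypothetical helper, and the sample record is made up.

```python
def extract_responses(results):
    """Map each custom_id to the model's reply text, or None if the request failed."""
    responses = {}
    for record in results:
        response = record.get("response") or {}
        body = response.get("body") or {}
        choices = body.get("choices") or []
        if response.get("status_code") == 200 and choices:
            responses[record["custom_id"]] = choices[0]["message"]["content"]
        else:
            responses[record["custom_id"]] = None
    return responses


# Made-up records for illustration only.
sample_results = [
    {
        "custom_id": "example-0001",
        "response": {
            "status_code": 200,
            "body": {"choices": [{"message": {"role": "assistant", "content": "CCO"}}]},
        },
        "error": None,
    },
    {"custom_id": "example-0002", "response": None, "error": {"code": "example_error"}},
]
print(extract_responses(sample_results))
```

The resulting dict can then be compared against each question's `answer` according to its `verification_method`; that comparison is not shown here.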
