
# Code-to-Prompt Semantic Matching Experiment (Manim CodeGen)

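This notebook demonstrates:
- Loading a Manim code-generation dataset with Hugging Face Datasets.
- Using an OpenAI chat model (`gpt-4o-mini` by default) to pick, from a pool of candidate prompts, the one most semantically similar to a given code snippet.
- Extracting the chosen query number from the model's response and displaying the matched prompt with its position in the full dataset.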

**Instructions:**  
Set your OpenAI API key file path and other parameters in the configuration section below.

## 1. Imports & Configuration

In [None]:
import pprint
import openai
import torch
import re

from typing import List, Optional
from datasets import load_dataset


### Configuration Variables


In [5]:
# Set these variables as needed
DATASET_NAME = "generaleoley/manim-codegen"
DATASET_SPLIT = "train"
REFERENCE_INDEX = 1          # Index of code example to test
SUBSET_START = 900           # Start index for candidate prompt subset
SUBSET_END = 1000            # End index (exclusive)
OPENAI_API_KEY_PATH = "open_ai_API.txt"
OPENAI_MODEL = "gpt-4o-mini"
OPENAI_MAX_TOKENS = 150
OPENAI_TEMPERATURE = 0.2

### 2. Environment & Resource Check

In [7]:
def check_cuda_and_device():
    """Check and display CUDA/GPU availability."""
    print("CUDA Available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        device_id = torch.cuda.current_device()
        print(f"Current device ID: {device_id}")
        print(f"Current device name: {torch.cuda.get_device_name(device_id)}")

check_cuda_and_device()

CUDA Available: True
Current device ID: 0
Current device name: NVIDIA TITAN Xp


### 3. API Key Handling

In [None]:
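def load_api_key(path: str) -> str:
    """Read the OpenAI API key from a text file and strip surrounding whitespace."""
    try:
        with open(path, "r") as f:
            key = f.read().strip()
    except FileNotFoundError as e:
        raise RuntimeError(f"API key file '{path}' not found.") from e
    except OSError as e:
        raise RuntimeError(f"Error loading API key: {e}") from e
    if not key:
        # An empty file would otherwise only fail later, at the first API call.
        raise RuntimeError(f"API key file '{path}' is empty.")
    return key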

openai.api_key = load_api_key(OPENAI_API_KEY_PATH)
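
As an alternative to a key file, the key can be read from an environment variable first, falling back to the file. The following is a minimal sketch, assuming the conventional `OPENAI_API_KEY` variable name; `resolve_api_key` is a hypothetical helper and not part of the notebook.

```python
import os
from typing import Optional


def resolve_api_key(path: Optional[str] = None, env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment if set, otherwise from a file.

    Hypothetical helper: `env_var` is an assumed variable name.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    if path is None:
        raise RuntimeError(f"{env_var} is not set and no key file path was given.")
    with open(path, "r") as f:
        key = f.read().strip()
    if not key:
        raise RuntimeError(f"API key file '{path}' is empty.")
    return key
```

It could be used as `openai.api_key = resolve_api_key(OPENAI_API_KEY_PATH)`, which keeps the key out of the working directory when the variable is set.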

### 4. Load Dataset

In [8]:
print("Loading dataset... (may take a moment)")
data = load_dataset(DATASET_NAME, split=DATASET_SPLIT)
print(f"Dataset loaded: {DATASET_NAME}, split: {DATASET_SPLIT}")
print(f"Total samples in split: {len(data)}")

Loading dataset... (may take a moment)
Dataset loaded: generaleoley/manim-codegen, split: train
Total samples in split: 1622


### 5. Select Reference Code and Candidate Prompts


In [None]:
reference = REFERENCE_INDEX
code_string = data[reference]['answer']

print("\n--- Reference Code Snippet ---\n")
print(code_string)

subset = data.select(range(SUBSET_START, SUBSET_END))
print(f"\nSubset selected: [{SUBSET_START}, {SUBSET_END}) ({len(subset)} prompts)")
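
Before selecting the subset, it may help to check the configured bounds against the split size and to see whether the reference example's own prompt is in the candidate pool. This is a small sketch; both helper names are hypothetical.

```python
def validate_subset_bounds(start: int, end: int, total: int) -> range:
    """Return range(start, end) after checking that 0 <= start < end <= total."""
    if not (0 <= start < end <= total):
        raise ValueError(f"Invalid subset [{start}, {end}) for a split of {total} samples.")
    return range(start, end)


def reference_in_pool(reference_index: int, start: int, end: int) -> bool:
    """True if the reference example's own prompt is among the candidates."""
    return start <= reference_index < end
```

For instance, `data.select(validate_subset_bounds(SUBSET_START, SUBSET_END, len(data)))` would fail early on bad bounds. With the defaults above, `reference_in_pool(1, 900, 1000)` is `False`, so the model is choosing a nearest neighbor rather than recovering the original prompt.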

### 6. Prepare LLM Prompt for Semantic Matching

In [None]:
def build_comparison_prompt(code: str, subset_examples) -> str:
    """
    Builds a prompt asking the LLM to select the most similar query to the code from the subset.
    """
    content = f"The following code is provided:\n\n{code}\n\n"
    content += (
        "Please identify which query in the list below is the most similar to the code in terms of purpose. The options are:\n\n"
    )
    for i, example in enumerate(subset_examples):
        content += f"Query {i+1}: {example['query']}\n"
    return content

messages = [
    {
        "role": "user",
        "content": build_comparison_prompt(code_string, subset),
    }
]
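
The prompt above leaves the answer format open, so the reply has to be parsed from free text. One possible variant ends the prompt with an explicit format instruction. This is a sketch only: `build_strict_prompt` is a hypothetical helper, and the instruction wording is an assumption that has not been evaluated against the model.

```python
from typing import Iterable


def build_strict_prompt(code: str, queries: Iterable[str]) -> str:
    """Build a comparison prompt that ends with an explicit answer-format instruction."""
    lines = [
        "The following code is provided:",
        "",
        code,
        "",
        "Identify which query below is most similar to the code in terms of purpose.",
        "",
    ]
    # Number candidates from 1 so labels match the 'Query N' convention used above.
    lines += [f"Query {i}: {q}" for i, q in enumerate(queries, start=1)]
    lines += ["", "Answer on the first line with exactly 'Query N', where N is the query number."]
    return "\n".join(lines)
```

It takes plain query strings, so it could be called as `build_strict_prompt(code_string, subset['query'])`.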

### 7. Call OpenAI API to Find Most Similar Prompt

In [None]:
def call_openai_model(messages: List[dict], model: str, max_tokens: int, temperature: float) -> str:
    """
    Calls the OpenAI API and returns the model's response.
    """
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=max_tokens,
        temperature=temperature,
    )
    return response.choices[0].message.content

print("\nQuerying OpenAI model for semantic match...")
open_gen = call_openai_model(messages, OPENAI_MODEL, OPENAI_MAX_TOKENS, OPENAI_TEMPERATURE)

print("\n--- LLM Response ---\n")
print(open_gen)
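
Because `max_tokens` caps the reply length, a long explanation can be cut off mid-sentence. The Chat Completions API reports this through `finish_reason == "length"` on the returned choice. The sketch below checks that field; `was_truncated` is a hypothetical helper, and using it would require keeping the response object, which `call_openai_model` currently discards in favor of the content string.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in object mimicking the shape of a chat completion response (illustration only).
fake_response = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
print(was_truncated(fake_response))
```

When this returns `True`, raising `OPENAI_MAX_TOKENS` or asking for a shorter answer are the usual remedies.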

### 8. Extract Query Index from LLM Response


In [None]:
def extract_query_number(response_text: str) -> Optional[int]:
    """
    Extracts the query index number from LLM response text.
    Returns None if not found.
    """
    match = re.search(r'Query (\d+)', response_text)
    if match:
        return int(match.group(1))
    return None

query_number = extract_query_number(open_gen)
if query_number is not None:
    print(f"\nThe closest query (in subset) is: Query {query_number}")
else:
    print("\nNo query number found in the response.")
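
The extraction above accepts any number that follows "Query", including one outside the candidate range. A slightly stricter parser could validate the number against the pool size. This is a sketch; `extract_valid_query_number` is a hypothetical name, and the tolerated variants (case-insensitive, optional `#`) are assumptions about how the model might phrase its answer.

```python
import re
from typing import Optional


def extract_valid_query_number(response_text: str, n_candidates: int) -> Optional[int]:
    """Return the first 'Query N' with 1 <= N <= n_candidates, else None."""
    for match in re.finditer(r"Query\s*#?(\d+)", response_text, flags=re.IGNORECASE):
        number = int(match.group(1))
        if 1 <= number <= n_candidates:
            return number
    return None
```

A call such as `extract_valid_query_number(open_gen, len(subset))` would then never yield an index that cannot be looked up in the subset.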


### 9. Display Matched Query and Find Its Position in Full Dataset

In [None]:
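if query_number is not None and 1 <= query_number <= len(subset):
    # Query labels in the prompt are 1-based; subset rows are 0-based.
    matched_query_text = subset[query_number - 1]['query']
    print("\n--- Matched Query from Subset ---\n")
    print(matched_query_text)

    # Find its original index in the full dataset.
    # list.index returns the first exact match, so duplicated query texts resolve to the earliest row.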
    def find_paragraph_index(paragraphs: List[str], reference_paragraph: str) -> Optional[int]:
        """Finds the index of a reference paragraph in a list."""
        try:
            return paragraphs.index(reference_paragraph)
        except ValueError:
            print("Reference paragraph not found in the list.")
            return None

    all_queries = data['query']
    index_position = find_paragraph_index(all_queries, matched_query_text)
    if index_position is not None:
        print(f"\nMatched query found at dataset index {index_position} (reference code index: {reference})")
        print("\n--- Full Prompt from Dataset ---\n")
        pprint.pprint(data[index_position]['query'])
else:
    print("Could not match query; please review LLM output.")
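
Since the subset is built with `data.select(range(SUBSET_START, SUBSET_END))`, its rows are contiguous in the full split, so the dataset index can also be computed directly instead of searched by text. This is a minimal sketch with a hypothetical helper name; it also sidesteps any ambiguity if the same query text were to appear more than once.

```python
def subset_to_dataset_index(query_number: int, subset_start: int) -> int:
    """Map a 1-based 'Query N' label to a 0-based index in the full split."""
    if query_number < 1:
        raise ValueError("Query numbers start at 1.")
    return subset_start + query_number - 1
```

For example, Query 91 in a subset starting at 900 maps to `900 + 91 - 1 = 990`.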



--- Reference Code Snippet ---

\n class Main(Scene):
    def construct(self):
            self.play(Transform(text,Text("animate.flip").shift(UP*2.5)), run_time=0.5)
            triangle = Triangle()
            self.play(triangle.animate.flip())
            self.remove(triangle)

Subset selected: [900, 1000) (100 prompts)

Querying OpenAI model for semantic match...

--- LLM Response ---

The code you provided is focused on creating a simple animation using the Manim library, specifically demonstrating a transformation effect (flipping a triangle) and a text transformation. The most similar query in terms of purpose would be one that also involves creating an animation with a transformation effect.

Among the provided queries, **Query 91** is the most similar to your code. It involv

### 10. Conclusion & Next Steps

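- The notebook demonstrated code-to-prompt semantic matching with an LLM.
- Change `REFERENCE_INDEX`, `SUBSET_START`, or `SUBSET_END` in the configuration cell to experiment with different examples or candidate pools. With the defaults, the reference example (index 1) lies outside the candidate pool [900, 1000), so the model can only return the nearest candidate, not the code's original prompt.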

**For further exploration:**  
- Batch-process multiple code snippets to compute evaluation metrics.
- Use sentence embeddings (e.g., with `SentenceTransformer`) as an additional semantic-similarity score.
- Visualize results across many examples to benchmark LLM performance.
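
For the batch-evaluation idea, one simple metric is top-1 accuracy: the fraction of code snippets for which the model picks the snippet's own prompt. This is a sketch under the assumption that each candidate pool contains the reference example's index, otherwise a correct answer is impossible; `top1_accuracy` is a hypothetical helper.

```python
from typing import Optional, Sequence


def top1_accuracy(predicted: Sequence[Optional[int]], expected: Sequence[int]) -> float:
    """Fraction of examples whose predicted dataset index equals the expected one.

    A prediction of None (no parsable answer) counts as a miss.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length.")
    if not expected:
        return 0.0
    hits = sum(1 for p, e in zip(predicted, expected) if p is not None and p == e)
    return hits / len(expected)
```

Here `predicted` would hold the dataset index recovered from each LLM response, and `expected` the index of the code snippet that was shown to the model.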