# Running the prompts against several models for code quality comparison

Models:
- GPT 4.1
- GPT 4o
- GPT o3
- GPT o1
- GPT temperature defaults to 1 (range 0 to 2), which favors more creative output
- Claude Sonnet 3.7
- Claude Sonnet 4.0
- Claude temperature defaults to 1, which favors more creative output
- DeepSeek Chat (DeepSeek-V3.1 at time of writing), default temperature 1 (https://api-docs.deepseek.com/api/create-chat-completion)

The default temperature is kept to simulate consumer usage in chat apps such as ChatGPT or Anthropic's interface.
The instruction "Only output one alternative to solve this." is added to the prompt to prevent the model from outputting multiple suggestions, which are harder to evaluate.
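
As a minimal sketch of the suffix step, the helper below appends the instruction to a prompt. The name `with_single_solution_suffix` and the punctuation handling are assumptions for illustration; the notebook itself always joins prompt and instruction with ". " inside an f-string.

```python
SINGLE_SOLUTION_SUFFIX = "Only output one alternative to solve this."


def with_single_solution_suffix(prompt: str) -> str:
    """Append the single-solution instruction to a user prompt.

    Hypothetical helper: it avoids a doubled period when the prompt
    already ends with sentence punctuation.
    """
    text = prompt.rstrip()
    separator = " " if text.endswith((".", "?", "!")) else ". "
    return f"{text}{separator}{SINGLE_SOLUTION_SUFFIX}"


example = with_single_solution_suffix("please write method to unzip file in python")
print(example)
```

Keeping the instruction in one constant makes it easier to apply the same wording to every model.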

Annotating the prompts by characteristics:
- Personification score
- Request type (question vs command)
- Length?
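
Two of the characteristics above could be annotated automatically with a rough heuristic, sketched below. Everything here is an assumption for illustration: the function name `annotate_prompt`, the list of question starters, and the choice of word count as the length measure. The personification score is left out because the notebook does not define it.

```python
# Hypothetical list of words that typically open a question.
QUESTION_STARTERS = (
    "how", "what", "whats", "why", "can", "could", "is", "are",
    "which", "where", "when", "does", "do",
)


def annotate_prompt(text: str) -> dict:
    """Label a prompt as question or command and count its words."""
    stripped = text.strip()
    words = stripped.split()
    first_word = words[0].lower() if words else ""
    # Treat the prompt as a question if it ends with "?" or opens with a question word.
    is_question = stripped.endswith("?") or first_word in QUESTION_STARTERS
    return {
        "request_type": "question" if is_question else "command",
        "length_words": len(words),
    }


print(annotate_prompt("please write method to unzip file in python"))
print(annotate_prompt("how can I combine two grib files?"))
```

A heuristic like this will mislabel polite commands phrased as questions ("can you write..."), so manual review of the labels would still be needed.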

In [8]:
import sqlite3
import pandas as pd
import os
from openai import OpenAI
from anthropic import Anthropic
import dotenv

dotenv.load_dotenv()

db_path = "../../giicg.db"
if not os.path.exists(db_path):
    raise FileNotFoundError(f"Database file does not exist: {db_path}")

conn = sqlite3.connect(db_path)
prompts = pd.read_sql("SELECT * FROM translated_scratch_prompts", conn)
prompts = prompts.drop(columns=['conversational', 'code', 'other'])
prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,programming_language,language
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,python,en
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,javascript,en
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,python,en
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,python,en
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,python,en
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,python,en
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,python,en
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,python,de
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,python,de
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,python,en


## Models and Clients

In [None]:
gpt_client = OpenAI()
#"o3-2025-04-16", "gpt-5-chat-latest", "gpt-4.1-2025-04-14", "gpt-4o-2024-08-06"
gpt_models = ["chatgpt-4o-latest", "o3-2025-04-16", "gpt-5-chat-latest", "gpt-4.1-2025-04-14"]

deepseek_client = OpenAI(api_key=os.environ.get("DEEPSEEK_API_KEY"), base_url="https://api.deepseek.com")
deepseek_models = ["deepseek-chat"]

anthropic_models = ["claude-3-7-sonnet-20250219", "claude-sonnet-4-20250514"]
claude_client = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)


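def prompt_gpt(client, model, prompt):
    # Note: unlike the DeepSeek and Claude helpers below, this sends the prompt as-is,
    # without the "Only output one alternative to solve this." suffix.
    response = client.responses.create(
        model=model,
        input=prompt,
    )
    return response.output_text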


def prompt_deepseek(client, model, prompt):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": f"{prompt}. Only output one alternative to solve this."},
        ],
        stream=False,
        # max_tokens=1000,
        # temperature=0.0,
    )

    return response.choices[0].message.content


def prompt_claude(client, model, prompt):
    response = client.messages.create(
        model=model,
        max_tokens=1024,  # caps the reply length; longer answers are cut off at this limit
        messages=[
            {
                "role": "user",
                "content": f"{prompt}. Only output one alternative to solve this.",
            }
        ],
        # temperature=0.0  # Note: even with a temperature of 0.0, the results will not be fully deterministic.
    )
    # Concatenate the text blocks of the reply, skipping any non-text blocks
    return "".join(block.text for block in response.content if hasattr(block, "text"))

In [14]:
# for model in gpt_models:
#     prompts[model] = prompts["message_text"].apply(lambda x: prompt_gpt(gpt_client, model, x))

for model in deepseek_models:
    prompts[model] = prompts["message_text"].apply(lambda x: prompt_deepseek(deepseek_client, model, x))

for model in anthropic_models:
    prompts[model] = prompts["message_text"].apply(lambda x: prompt_claude(claude_client, model, x))

## Parse Code Snippets

In [25]:
import re

def parse_code_snippets(text):
    """
    Extract code snippets fenced with triple backticks and tagged with one of these
    languages (case-insensitive): python, javascript, js, typescript, ts, html.
    Untagged fences are ignored.
    Joins all found code into a single string separated by newlines.
    """
    pattern = r"```(?:python|javascript|js|typescript|ts|html)[ \t]*\n(.*?)(?=```)"
    matches = re.findall(pattern, text, re.DOTALL | re.IGNORECASE)
    code_snippets = [m.strip() for m in matches]
    return "\n".join(code_snippets)


columns = ["chatgpt-4o-latest", "o3-2025-04-16", "gpt-5-chat-latest", "gpt-4.1-2025-04-14", "claude-3-7-sonnet-20250219", "claude-sonnet-4-20250514", "deepseek-chat"]

for column in columns:
    prompts[f"code_{column}"] = prompts[column].apply(parse_code_snippets)

prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,programming_language,language,chatgpt-4o-latest,o3-2025-04-16,...,deepseek-chat,claude-3-7-sonnet-20250219,claude-sonnet-4-20250514,code_chatgpt-4o-latest,code_o3-2025-04-16,code_gpt-5-chat-latest,code_gpt-4.1-2025-04-14,code_claude-3-7-sonnet-20250219,code_claude-sonnet-4-20250514,code_deepseek-chat
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,python,en,To use **Dummy Hot Encoding** (one-hot encodin...,sk-learn’s OneHotEncoder is the tool that prod...,...,You can use `sklearn.preprocessing.OneHotEncod...,# One-Hot Encoding for 'embarked' column using...,You can use `OneHotEncoder` from sklearn:\n\n`...,import pandas as pd\nfrom sklearn.preprocessin...,,import pandas as pd\nfrom sklearn.preprocessin...,import pandas as pd\n\n# Example data\ndf = pd...,from sklearn.preprocessing import OneHotEncode...,from sklearn.preprocessing import OneHotEncode...,from sklearn.preprocessing import OneHotEncode...
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,javascript,en,To encode and compress a JavaScript object int...,Below is a pragmatic “recipe” that most teams ...,...,The best way is to use `JSON.stringify` combin...,# Best Way to Encode and Compress a JavaScript...,Use JSON.stringify + btoa + encodeURIComponent...,"import { compressToEncodedURIComponent, decomp...",import pako from 'pako';\nimport base64url...,"const obj = { name: ""Alice"", score: 123, activ...","const obj = { foo: ""bar"", nums: [1,2,3], valid...",function encodeObjectForUrl(obj) {\n // 1. Co...,// Encode\nconst encoded = encodeURIComponent(...,function encodeObjectToUrlSafe(obj) {\n const...
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,python,en,Absolutely! Here's how you can **split the `pe...,Below is one straightforward way to turn the t...,...,You can use pandas' `str.extractall()` with a ...,I'll help you extract the persona column into ...,You can use this approach to split the persona...,import pandas as pd\n\n# Assuming your datafra...,,import pandas as pd\n\n# Example dataframe\nda...,"import pandas as pd\n\n# Example, use your rea...",import pandas as pd\n\n# Function to parse per...,# Split the persona column and create a dictio...,# Extract all key-value pairs using regex\ndf_...
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,python,en,Your idea is **excellent and very practical**—...,Your idea is both useful and perfectly feasibl...,...,Your idea is excellent and highly feasible for...,# Evaluation of PDF Auto-Renaming Idea\n\nYour...,Your idea is excellent and very practical for ...,for pdf_file in folder:\n doc = fitz.open(p...,,import os\nimport re\nfrom PyPDF2 import PdfRe...,import os\nfrom pdfminer.high_level import ext...,import os\nimport re\nimport pdfplumber\nfrom ...,import os\nimport re\nfrom pathlib import Path...,import os\nimport re\nimport PyPDF2\nimport sp...
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,python,en,You can automate this in **Blender** using **P...,Below is a self-contained script that you can ...,...,Here's a Python script that creates cube insta...,# Cube Instance Replication Script for Blender...,Here's a Blender Python script that creates cu...,import bpy\nimport mathutils\n\n# === Configur...,import bpy\nimport mathutils\n\n# ------------...,import bpy\n\n# Name of the source collection\...,import bpy\n\n# --- SETTINGS ---\nSOURCE_COLLE...,import bpy\nimport mathutils\nfrom math import...,import bpy\nimport bmesh\nfrom mathutils impor...,import bpy\nimport bmesh\nfrom mathutils impor...
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,python,en,"In Python's `asyncio`, running a `Future` or c...",A coroutine (what you get back when you call a...,...,One way to run a Python future without blockin...,# Running a Python Future Without Blocking\n\n...,You can use `asyncio.create_task()` to run a c...,import asyncio\nimport inspect\n\nasync def my...,import asyncio\n\nasync def job():\n print(...,import asyncio\nfrom types import coroutine\n\...,"async def my_coro():\n print(""Start"")\n ...",import asyncio\n\nasync def main():\n # Cre...,import asyncio\n\nasync def my_coroutine():\n ...,import asyncio\n\nasync def run_future_with_ti...
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,python,en,Sure! Here's a short Python script you can use...,Here’s a compact cell you can drop straight in...,...,Here's a Python script that reads the CSV file...,Here's a short Python script for your Jupyter ...,```python\nimport pandas as pd\n\n# Read the C...,import pandas as pd\n\n# Read the CSV file\ndf...,import pandas as pd\n\n# 1. Read the data\ndf ...,import pandas as pd\n\n# Read in the CSV\ndf =...,import pandas as pd\n\n# Read the CSV file\ndf...,import pandas as pd\nimport nltk\nfrom nltk.to...,import pandas as pd\n\n# Read the CSV file\ndf...,import pandas as pd\nimport re\n\n# Read the C...
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,python,de,Combining two GRIB files in a Jupyter Notebook...,Below is a “cook-book” that works in a Jupyter...,...,To combine two GRIB files in Jupyter Notebook ...,# Combining Two GRIB Files in Jupyter Notebook...,You can use the `xarray` library to combine tw...,import xarray as xr\n\n# Load the two GRIB fil...,,import xarray as xr\n\n# Open two GRIB files\n...,!pip install xarray cfgrib\nimport xarray as x...,import pygrib\n\n# Define input and output fil...,import xarray as xr\n\n# Open the two GRIB fil...,import xarray as xr\nimport cfgrib\n\n# Load b...
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,python,de,Reading an NMEA file (like `Task3_NMEA.txt`) u...,Below is a short “cook-book” for reading an NM...,...,Read the NMEA file using Python's built-in `op...,# Reading an NMEA File in Python\n\nHere's a s...,Here's one way to read an NMEA file in Python:...,"# Open the file\nwith open('Task3_NMEA.txt', '...","import pynmea2\n\nfile_name = ""Task3_NMEA.txt""...",# Read NMEA file and parse lines\ndef read_nme...,filename = 'Task3_NMEA.txt'\n\nwith open(filen...,"def read_nmea_file(file_path):\n """"""\n R...",def read_nmea_file(filename):\n nmea_data =...,"with open('Task3_NMEA.txt', 'r') as file:\n ..."
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,python,en,Sure! Here's a simple Python method to unzip a...,"Below is a standalone, reusable helper that un...",...,Here's a Python method to unzip files using th...,```python\nimport zipfile\n\ndef unzip_file(zi...,```python\nimport zipfile\nimport os\n\ndef un...,import zipfile\nimport os\n\ndef unzip_file(zi...,import os\nimport zipfile\nfrom pathlib import...,import zipfile\nimport os\n\ndef unzip_file(zi...,import zipfile\nimport os\n\ndef unzip_file(zi...,"import zipfile\n\ndef unzip_file(zip_path, ext...",import zipfile\nimport os\n\ndef unzip_file(zi...,import zipfile\nimport os\n\ndef unzip_file(zi...
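
To make the behavior of `parse_code_snippets` concrete, the sketch below runs the same regular expression on a made-up reply. The sample reply and the `FENCE` constant are assumptions for illustration; `FENCE` is only there so the example can contain fences without breaking its own code block.

```python
import re

FENCE = "`" * 3  # three backticks, built programmatically


def parse_code_snippets(text):
    """Same pattern as in the notebook cell above: only language-tagged fences match."""
    pattern = r"```(?:python|javascript|js|typescript|ts|html)[ \t]*\n(.*?)(?=```)"
    matches = re.findall(pattern, text, re.DOTALL | re.IGNORECASE)
    return "\n".join(m.strip() for m in matches)


# Made-up model reply with one tagged and one untagged code block.
reply = (
    "Here is one way:\n"
    f"{FENCE}python\nprint('tagged')\n{FENCE}\n"
    "And an untagged block:\n"
    f"{FENCE}\nprint('untagged')\n{FENCE}\n"
)

print(parse_code_snippets(reply))
```

Only the tagged block is returned. A reply that uses untagged fences, or a language tag outside the list, yields an empty string.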


## Melt model columns into long format for saving

In [26]:


melted = prompts.melt(
    id_vars=["conversation_id", "message_id", "message_text", "gender", "user_id", "programming_language"],
    value_vars=columns,
    var_name="model_version",
    value_name="model_reply"
)

code_columns = [f"code_{col}" for col in columns if f"code_{col}" in prompts.columns]

if code_columns:
    code_melted = prompts.melt(
        id_vars=["conversation_id", "message_id", "message_text", "gender", "user_id", "programming_language"],
        value_vars=code_columns,
        var_name="model_version_code",
        value_name="code"
    )

    # Strip only the leading "code_" prefix so the names match the reply columns
    code_melted["model_version"] = code_melted["model_version_code"].str.replace("^code_", "", regex=True)

    melted = melted.merge(
        code_melted[["conversation_id", "message_id", "model_version", "code"]],
        on=["conversation_id", "message_id", "model_version"],
        how="left"
    )
else:
    melted["code"] = None


melted = melted.reset_index(drop=True)
melted["code_reply_id"] = melted.index + 1


final_df = melted[
    ["code_reply_id", "conversation_id", "message_id", "message_text", "gender", "user_id",
     "model_reply", "code", "model_version", "programming_language"]
]


final_df

Unnamed: 0,code_reply_id,conversation_id,message_id,message_text,gender,user_id,model_reply,code,model_version,programming_language
0,1,6,5,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,To use **Dummy Hot Encoding** (one-hot encodin...,import pandas as pd\nfrom sklearn.preprocessin...,chatgpt-4o-latest,python
1,2,7,43,whats the best way to encode and compress a ja...,Man (cisgender),25,To encode and compress a JavaScript object int...,"import { compressToEncodedURIComponent, decomp...",chatgpt-4o-latest,javascript
2,3,8,47,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,Absolutely! Here's how you can **split the `pe...,import pandas as pd\n\n# Assuming your datafra...,chatgpt-4o-latest,python
3,4,10,57,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,Your idea is **excellent and very practical**—...,for pdf_file in folder:\n doc = fitz.open(p...,chatgpt-4o-latest,python
4,5,12,65,Blender and Python. I have a collection of hun...,Man (cisgender),34,You can automate this in **Blender** using **P...,import bpy\nimport mathutils\n\n# === Configur...,chatgpt-4o-latest,python
...,...,...,...,...,...,...,...,...,...,...
233,234,70,1524,can you write a test that tests whether parse ...,Woman (cisgender),90,Here's a test that verifies `parse_obj` works ...,import pytest\nfrom pydantic import Validation...,deepseek-chat,python
234,235,72,1534,can you write me some python code to count the...,Woman (cisgender),11,Here's a Python function to count the number o...,import json\n\ndef count_json_objects(json_dat...,deepseek-chat,python
235,236,74,1538,I want to merge these two dataframes: \ndf_tra...,Woman (cisgender),28,To merge these two dataframes based on the 'cl...,"merged_df = pd.merge(df_train_claim, df_train_...",deepseek-chat,python
236,237,78,1598,"0,tcp,http,SF,215,45076,0,0,0,0,0,1,0,0,0,0,0,...",Woman (cisgender),73,To determine the best k for k-means clustering...,from sklearn.cluster import KMeans\n from sk...,deepseek-chat,python
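
The reshaping step can be illustrated on a toy table. This is a minimal sketch with made-up data: the column names `model_a` and `model_b` and all values are assumptions, standing in for the real model columns.

```python
import pandas as pd

# Toy wide table: one row per prompt, one reply column and one code column per model.
wide = pd.DataFrame({
    "message_id": [1, 2],
    "model_a": ["reply a1", "reply a2"],
    "model_b": ["reply b1", "reply b2"],
    "code_model_a": ["code a1", "code a2"],
    "code_model_b": ["code b1", ""],
})
models = ["model_a", "model_b"]

# Melt the reply columns: one row per (prompt, model).
replies = wide.melt(
    id_vars=["message_id"], value_vars=models,
    var_name="model_version", value_name="model_reply",
)

# Melt the code columns the same way, then strip the prefix to get matching keys.
code = wide.melt(
    id_vars=["message_id"], value_vars=[f"code_{m}" for m in models],
    var_name="model_version", value_name="code",
)
code["model_version"] = code["model_version"].str.replace("^code_", "", regex=True)

# Join replies and code on the shared keys.
long = replies.merge(code, on=["message_id", "model_version"], how="left")
print(long)
```

Two prompts and two models give four rows, which mirrors how the real table grows to one row per prompt and model.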


## Save to database

In [27]:
# Overwrites the table if it already exists; to_sql returns the number of rows written
final_df.to_sql("translated_scratch_prompts_with_code", conn, if_exists="replace", index=False)

238
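
One way to check the save step is to read the table back and compare it with what was written. The sketch below does this against an in-memory SQLite database with made-up rows; the table name `demo_replies` and the data are assumptions, not the notebook's real table.

```python
import sqlite3

import pandas as pd

# In-memory database so the example leaves no file behind.
demo_conn = sqlite3.connect(":memory:")

demo_df = pd.DataFrame({
    "code_reply_id": [1, 2],
    "model_version": ["model_a", "model_b"],
    "code": ["print(1)", ""],
})

# Same call shape as in the notebook: replace the table, do not store the index.
demo_df.to_sql("demo_replies", demo_conn, if_exists="replace", index=False)

# Read the rows back and compare with what was written.
restored = pd.read_sql("SELECT * FROM demo_replies", demo_conn)
demo_conn.close()

print(len(restored), list(restored.columns))
```

Because `if_exists="replace"` drops and recreates the table, rerunning the cell does not append duplicate rows.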