# Running the prompts against several models for code quality comparison

Models:
- GPT-4.1
- GPT-4o
- OpenAI o3
- OpenAI o1
- Claude 3.7 Sonnet
- Claude Sonnet 4
- DeepSeek

Annotating the prompts by characteristics:
- Personification score
- Request type (question vs. command)
- Length?
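
A rough sketch of how two of these annotations could be computed. The word lists and the names `request_type` and `annotate` are assumptions made for illustration, not part of the study design. The personification score is left out because its definition is not given here.

```python
import re

# Hypothetical keyword lists; a real annotation scheme would need validation.
QUESTION_WORDS = {"how", "what", "whats", "what's", "why", "which", "when",
                  "where", "who", "can", "could", "would", "is", "are", "do", "does"}
COMMAND_WORDS = {"please", "write", "create", "make", "give", "show",
                 "explain", "fix", "generate"}

def request_type(text):
    """Classify a prompt as 'question', 'command', or 'other' by its first word or a trailing '?'."""
    words = re.findall(r"[a-z']+", text.lower())
    if text.rstrip().endswith("?") or (words and words[0] in QUESTION_WORDS):
        return "question"
    if words and words[0] in COMMAND_WORDS:
        return "command"
    return "other"

def annotate(text):
    """Return the heuristic annotations for one prompt."""
    return {"request_type": request_type(text), "n_words": len(text.split())}

# Possible use on the DataFrame loaded below (assumes a `message_text` column):
# prompts["request_type"] = prompts["message_text"].apply(request_type)
```

A first-word heuristic misses prompts that open with a greeting or with context, so the "other" bucket would likely need manual review.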

In [23]:
import sqlite3
import pandas as pd
import os

db_path = "../../giicg.db"
if not os.path.exists(db_path):
    raise FileNotFoundError(f"Database file does not exist: {db_path}")

conn = sqlite3.connect(db_path)
prompts = pd.read_sql("SELECT * FROM translated_scratch_prompts", conn)
prompts = prompts.drop(columns=['conversational', 'code', 'other'])
prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,language
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,en
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,en
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,en
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,en
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,en
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,en
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,en
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,de
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,de
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,en


## OpenAI & DeepSeek

In [25]:
from openai import OpenAI

gpt_client = OpenAI()
#"o3-2025-04-16", "gpt-5-chat-latest", "gpt-4.1-2025-04-14", "gpt-4o-2024-08-06"
gpt_models = ["chatgpt-4o-latest", "o3-2025-04-16"]

deepseek_client = OpenAI(api_key=os.environ.get("DEEPSEEK_API_KEY"), base_url="https://api.deepseek.com")
deepseek_models = ["deepseek-chat"]

def prompt_gpt(client, model, prompt):
    response = client.responses.create(
        model=model,
        input=prompt
    )
    return response.output_text

for model in gpt_models:
    prompts[model] = prompts["message_text"].apply(lambda x: prompt_gpt(gpt_client, model, x))

def prompt_deepseek(client, model, prompt):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": prompt}
        ],
        # max_tokens=1000,
        # temperature=0.0
    )
    return response.choices[0].message.content


for model in deepseek_models:
    prompts[model] = prompts["message_text"].apply(lambda x: prompt_deepseek(deepseek_client, model, x))

prompts



KeyboardInterrupt: 
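
The cell above was interrupted, and `Series.apply` only assigns a column once every row has finished, so an interrupted model leaves no column behind. One way to keep partial results is sketched below. `run_prompts` and its `retries` parameter are hypothetical names, and the retry policy is an assumption (no backoff, any exception retried).

```python
def run_prompts(texts, ask, retries=2):
    """Call ask(text) for each prompt and keep whatever finished.

    Rows that fail after all retries, or that were never reached because
    of a KeyboardInterrupt, are left as None.
    """
    results = [None] * len(texts)
    try:
        for i, text in enumerate(texts):
            for attempt in range(retries + 1):
                try:
                    results[i] = ask(text)
                    break  # success, move on to the next prompt
                except Exception:
                    if attempt == retries:
                        results[i] = None  # give up on this prompt
    except KeyboardInterrupt:
        pass  # stop early but return the answers collected so far
    return results

# Possible use with the helpers defined above:
# prompts[model] = run_prompts(
#     prompts["message_text"].tolist(),
#     lambda x: prompt_deepseek(deepseek_client, model, x),
# )
```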

In [26]:
prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,language,chatgpt-4o-latest,o3-2025-04-16
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,en,To perform **dummy hot encoding** (a variation...,With scikit-learn you use the class sklearn.pr...
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,en,To encode and compress a **JavaScript object**...,Below is a proven pattern that balances simpli...
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,en,Certainly! You can extract the information fro...,Yes. \nTreat every line break as the separato...
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,en,That's a **great and practical idea**—and you'...,Your idea is perfectly feasible and—judging fr...
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,en,To replicate a collection of objects in **Blen...,Below is a self-contained script you can paste...
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,en,To run a Python `Future` or coroutine *without...,In `asyncio` you don’t need any special API to...
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,en,Sure! Below is a short Python script you can u...,"Here’s a minimal, self-contained cell you can ..."
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,de,To combine two GRIB files in a Jupyter Noteboo...,Below are three practical ways you can “combin...
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,de,Great! NMEA (National Marine Electronics Assoc...,Sure – could you upload or paste a short porti...
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,en,Sure! Here's a simple method in Python to unzi...,"Below is a compact, production-ready helper th..."


## Anthropic

In [29]:
from anthropic import Anthropic
import dotenv
dotenv.load_dotenv()
#"claude-sonnet-4-20250514",
models = [ "claude-3-7-sonnet-20250219"]
claude_client = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)

def prompt_claude(client, model, prompt):
    response = client.messages.create(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        model=model,
    )
    # response.content is a list of content blocks (e.g. TextBlock), not a plain string
    return response.content

for model in models:
    prompts[model] = prompts["message_text"].apply(lambda x: prompt_claude(claude_client, model, x))

In [30]:
prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,language,chatgpt-4o-latest,o3-2025-04-16,claude-3-7-sonnet-20250219
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,en,To perform **dummy hot encoding** (a variation...,With scikit-learn you use the class sklearn.pr...,"[TextBlock(citations=None, text=""# Using One-H..."
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,en,To encode and compress a **JavaScript object**...,Below is a proven pattern that balances simpli...,"[TextBlock(citations=None, text=""# Encoding an..."
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,en,Certainly! You can extract the information fro...,Yes. \nTreat every line break as the separato...,"[TextBlock(citations=None, text=""I can help yo..."
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,en,That's a **great and practical idea**—and you'...,Your idea is perfectly feasible and—judging fr...,"[TextBlock(citations=None, text='# Evaluation ..."
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,en,To replicate a collection of objects in **Blen...,Below is a self-contained script you can paste...,"[TextBlock(citations=None, text='# Blender Scr..."
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,en,To run a Python `Future` or coroutine *without...,In `asyncio` you don’t need any special API to...,"[TextBlock(citations=None, text='# Running a P..."
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,en,Sure! Below is a short Python script you can u...,"Here’s a minimal, self-contained cell you can ...","[TextBlock(citations=None, text='# Python Scri..."
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,de,To combine two GRIB files in a Jupyter Noteboo...,Below are three practical ways you can “combin...,"[TextBlock(citations=None, text=""# Combining G..."
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,de,Great! NMEA (National Marine Electronics Assoc...,Sure – could you upload or paste a short porti...,"[TextBlock(citations=None, text='# Reading an ..."
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,en,Sure! Here's a simple method in Python to unzi...,"Below is a compact, production-ready helper th...","[TextBlock(citations=None, text='# Python Meth..."
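
The Claude column holds lists of `TextBlock` objects rather than strings, so it cannot go straight into a text parser. A minimal sketch for flattening it, assuming each text block exposes a `.text` attribute as the output above suggests. `blocks_to_text` is a hypothetical helper, and `SimpleNamespace` stands in for the SDK objects so the example runs without an API call.

```python
from types import SimpleNamespace

def blocks_to_text(content):
    """Join the text of every block that has a `text` attribute; skip the rest."""
    return "".join(getattr(block, "text", "") or "" for block in content)

# Stand-ins for SDK content blocks, for illustration only
example = [
    SimpleNamespace(type="text", text="# Using One-Hot Encoding\n"),
    SimpleNamespace(type="text", text="Some explanation."),
]
flattened = blocks_to_text(example)

# Possible use on the DataFrame:
# col = "claude-3-7-sonnet-20250219"
# prompts[col] = prompts[col].apply(blocks_to_text)
```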


## Parse Code Snippets

In [32]:
import re

def parse_python_code_snippets(text):
    """
    Parse Python code snippets from text that are surrounded by three backticks
    and prefixed with 'python'.
    """
    pattern = r'```python\s*\n(.*?)\n```'
    matches = re.findall(pattern, text, re.DOTALL) # Find all matches using DOTALL flag to match across multiple lines
    code_snippets = [match.strip() for match in matches] # Clean up the matches by stripping leading/trailing whitespace

    return code_snippets

columns = ["chatgpt-4o-latest", "o3-2025-04-16"]

for column in columns:
    prompts[f"code_{column}"] = prompts[column].apply(parse_python_code_snippets)

prompts

Unnamed: 0,conversation_id,message_id,role,message_text,gender,user_id,language,chatgpt-4o-latest,o3-2025-04-16,claude-3-7-sonnet-20250219,code_chatgpt-4o-latest,code_o3-2025-04-16
0,6,5,user,I want to use Dummy Hot encoding to replace th...,Woman (cisgender),16,en,To perform **dummy hot encoding** (a variation...,With scikit-learn you use the class sklearn.pr...,"[TextBlock(citations=None, text=""# Using One-H...",[import pandas as pd\nfrom sklearn.preprocessi...,[import pandas as pd\nfrom sklearn.preprocessi...
1,7,43,user,whats the best way to encode and compress a ja...,Man (cisgender),25,en,To encode and compress a **JavaScript object**...,Below is a proven pattern that balances simpli...,"[TextBlock(citations=None, text=""# Encoding an...",[],[]
2,8,47,user,I have a pandas dataframe like this:\ndata\tpe...,Woman (cisgender),28,en,Certainly! You can extract the information fro...,Yes. \nTreat every line break as the separato...,"[TextBlock(citations=None, text=""I can help yo...",[import pandas as pd\n\n# Your existing datafr...,[import pandas as pd\n\n# df is your original ...
3,10,57,user,"as a NLP and LLM researcher, I am recently dow...",Non-binary,30,en,That's a **great and practical idea**—and you'...,Your idea is perfectly feasible and—judging fr...,"[TextBlock(citations=None, text='# Evaluation ...","[from pathlib import Path\n\npdf_dir = Path(""/...","[#!/usr/bin/env python3\nimport os, re, json, ..."
4,12,65,user,Blender and Python. I have a collection of hun...,Man (cisgender),34,en,To replicate a collection of objects in **Blen...,Below is a self-contained script you can paste...,"[TextBlock(citations=None, text='# Blender Scr...",[import bpy\nimport mathutils\n\n# === CONFIGU...,[import bpy\nfrom mathutils import Vector\n\n#...
5,13,126,user,"how to run a Python future without blocking, i...",Man (cisgender),46,en,To run a Python `Future` or coroutine *without...,In `asyncio` you don’t need any special API to...,"[TextBlock(citations=None, text='# Running a P...",[import asyncio\n\nasync def my_coro():\n p...,"[import asyncio, time\n\nasync def job():\n ..."
6,15,242,user,hey can you write me a short python script for...,Woman (cisgender),48,en,Sure! Below is a short Python script you can u...,"Here’s a minimal, self-contained cell you can ...","[TextBlock(citations=None, text='# Python Scri...",[import pandas as pd\n\n# Read the CSV file\nd...,[import pandas as pd\n\n# 1. Load the CSV\ndf ...
7,18,266,user,how can I combine two grib files in jupyter no...,Woman (cisgender),60,de,To combine two GRIB files in a Jupyter Noteboo...,Below are three practical ways you can “combin...,"[TextBlock(citations=None, text=""# Combining G...",[import xarray as xr\n\n# Open both GRIB files...,[import pygrib\n\ndef append_gribs(input_files...
8,20,268,user,I work with Python and have to read an NMEA fi...,Woman (cisgender),65,de,Great! NMEA (National Marine Electronics Assoc...,Sure – could you upload or paste a short porti...,"[TextBlock(citations=None, text='# Reading an ...","[# Open and read the NMEA file\nfilename = ""Ta...",[]
9,21,290,user,please write method to unzip file in python,Woman (cisgender),73,en,Sure! Here's a simple method in Python to unzi...,"Below is a compact, production-ready helper th...","[TextBlock(citations=None, text='# Python Meth...",[import zipfile\nimport os\n\ndef unzip_file(z...,[import os\nimport shutil\nimport zipfile\nfro...
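
Some rows come back empty because the parser only accepts fences tagged exactly `python`. Row 1, for instance, asks about JavaScript. A more tolerant variant is sketched below: it returns every fenced block together with its language tag. `parse_fenced_blocks` and the `"plain"` label for untagged fences are assumptions made for illustration, and the pattern assumes balanced fences.

```python
import re

# Opening fence, optional language tag, then everything up to the next fence
FENCE = re.compile(r"```([\w+-]*)[ \t]*\n(.*?)```", re.DOTALL)

def parse_fenced_blocks(text):
    """Return (language, code) pairs for every fenced block in `text`."""
    return [(lang.lower() or "plain", code.strip())
            for lang, code in FENCE.findall(text)]

sample = (
    "Intro\n```javascript\nconst a = 1;\n```\n"
    "then\n```Python\nprint(1)\n```\n"
    "```\nraw\n```"
)
blocks = parse_fenced_blocks(sample)

# Keeping only Python blocks reproduces the original behaviour, case-insensitively:
python_only = [code for lang, code in blocks if lang == "python"]
```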
