# Code Generator

The requirement: use a frontier model to generate high-performance C++ (or Rust) code from Python code.


<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder: executing the C++ or Rust code is OPTIONAL</h2>
            <span style="color:#f71;">As an alternative, you can run it on the website given yesterday.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h1 style="color:#900;">Important Note</h1>
            <span style="color:#900;">
            In this lab, I use high-end models (GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro, Grok 4), which are slightly higher priced. The costs are still low, but if you'd prefer to keep costs ultra low, please pick a lower-cost model like gpt-5-nano.
            </span>
        </td>
    </tr>
</table>

In [1]:
# imports

import os
import io
import sys
from dotenv import load_dotenv
from openai import OpenAI
import gradio as gr
import subprocess
from IPython.display import Markdown, display


In [2]:
load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
grok_api_key = os.getenv('GROK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')
openrouter_api_key = os.getenv('OPENROUTER_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if grok_api_key:
    print(f"Grok API Key exists and begins {grok_api_key[:4]}")
else:
    print("Grok API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

if openrouter_api_key:
    print(f"OpenRouter API Key exists and begins {openrouter_api_key[:6]}")
else:
    print("OpenRouter API Key not set (and this is optional)")
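The six near-identical checks above could be folded into one helper. This is a minimal sketch: `describe_key`, `describe_all_keys`, and the `KEY_PREFIX_LENGTHS` table are hypothetical names that are not part of the original notebook, and the prefix lengths are simply copied from the cell above.

```python
import os

# Hypothetical table: provider label -> (environment variable, prefix length to show)
KEY_PREFIX_LENGTHS = {
    "OpenAI": ("OPENAI_API_KEY", 8),
    "Anthropic": ("ANTHROPIC_API_KEY", 7),
    "Google": ("GOOGLE_API_KEY", 2),
    "Grok": ("GROK_API_KEY", 4),
    "Groq": ("GROQ_API_KEY", 4),
    "OpenRouter": ("OPENROUTER_API_KEY", 6),
}

def describe_key(label, key, prefix_length, optional=True):
    """Return the status line for one API key without revealing the full key."""
    if key:
        return f"{label} API Key exists and begins {key[:prefix_length]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"

def describe_all_keys(environ=os.environ):
    """Build one status line per provider from the given environment mapping."""
    return [
        describe_key(label, environ.get(var), length, optional=(label != "OpenAI"))
        for label, (var, length) in KEY_PREFIX_LENGTHS.items()
    ]

for line in describe_all_keys():
    print(line)
```

Passing the environment in as a parameter keeps the helper easy to try with a plain dictionary instead of real keys.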



In [3]:
# Connect to client libraries

openai = OpenAI()

anthropic_url = "https://api.anthropic.com/v1/"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
grok_url = "https://api.x.ai/v1"
groq_url = "https://api.groq.com/openai/v1"
ollama_url = "http://localhost:11434/v1"
openrouter_url = "https://openrouter.ai/api/v1"

anthropic = OpenAI(api_key=anthropic_api_key, base_url=anthropic_url)
gemini = OpenAI(api_key=google_api_key, base_url=gemini_url)
grok = OpenAI(api_key=grok_api_key, base_url=grok_url)
groq = OpenAI(api_key=groq_api_key, base_url=groq_url)
ollama = OpenAI(api_key="ollama", base_url=ollama_url)
openrouter = OpenAI(api_key=openrouter_api_key, base_url=openrouter_url)



In [7]:
models = ["gpt-5", "claude-sonnet-4-5-20250929", "gemini-2.5-pro", "glm-4.7:cloud", "deepseek-v3.2:cloud"]

clients = {"gpt-5": openai, "claude-sonnet-4-5-20250929": anthropic, "gemini-2.5-pro": gemini, "glm-4.7:cloud": ollama, "deepseek-v3.2:cloud": ollama}

# Want to keep costs ultra-low? Replace this with models of your choice, using the examples from yesterday

# Test the DeepSeek V3.2 Cloud connection (optional - comment this out if you are not using Ollama cloud models)
test_response = ollama.chat.completions.create(
    model="deepseek-v3.2:cloud",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=50
)
print("✅ DeepSeek V3.2 Cloud is working!")
print(test_response.choices[0].message.content)

In [None]:
from system_info import retrieve_system_info, rust_toolchain_info
import os

# Add Rust to PATH so rust_toolchain_info() can find it
# os.pathsep is ":" on macOS/Linux and ";" on Windows
cargo_bin = os.path.expanduser("~/.cargo/bin")
if cargo_bin not in os.environ.get("PATH", "").split(os.pathsep):
    os.environ["PATH"] = f"{cargo_bin}{os.pathsep}{os.environ.get('PATH', '')}"

system_info = retrieve_system_info()
rust_info = rust_toolchain_info()
rust_info

In [11]:
message = f"""
Here is a report of the system information for my computer.
I want to run a Rust compiler to compile a single rust file called main.rs and then execute it in the simplest way possible.
Please reply with whether I need to install a Rust toolchain to do this. If so, please provide the simplest step by step instructions to do so.

If I'm already set up to compile Rust code, then I'd like to run something like this in Python to compile and execute the code:
```python
compile_command = # something here - to achieve the fastest possible runtime performance
compile_result = subprocess.run(compile_command, check=True, text=True, capture_output=True)
run_command = # something here
run_result = subprocess.run(run_command, check=True, text=True, capture_output=True)
return run_result.stdout
```
Please tell me exactly what I should use for the compile_command and run_command.
Have the maximum possible runtime performance in mind; compile time can be slow. Fastest possible runtime performance for this platform is key.
Reply with the commands in markdown.

System information:
{system_info}

Rust toolchain information:
{rust_info}
"""

response = openai.chat.completions.create(model=models[0], messages=[{"role": "user", "content": message}])
display(Markdown(response.choices[0].message.content))

## For C++, overwrite this with the commands from yesterday, or for Rust, use the new commands

Or just use the website like yesterday:

 https://www.programiz.com/cpp-programming/online-compiler/

In [24]:
# Rust compile command for maximum runtime performance
# Note: Ensure Rust is in PATH by running: source "$HOME/.cargo/env" in your shell
# Or use the full path: os.path.expanduser("~/.cargo/bin/rustc")
compile_command = [
    "rustc",
    "-C", "opt-level=3",
    "-C", "lto=fat",
    "-C", "codegen-units=1",
    "-C", "target-cpu=native",
    "-C", "panic=abort",
    "-C", "strip=symbols",
    "-o", "main",
    "main.rs"
]

run_command = ["./main"]
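Since `language` can be switched between Rust and C++, one option is to keep both command sets in a small lookup. This is only a sketch: `commands_for` is a hypothetical helper, the Rust flags are the ones from the cell above, and the C++ line assumes `g++` is on the PATH and uses generic optimization flags rather than the commands from yesterday's lab, which are not reproduced here.

```python
def commands_for(language):
    """Return (compile_command, run_command) for the chosen target language."""
    if language == "Rust":
        compile_command = [
            "rustc",
            "-C", "opt-level=3",
            "-C", "lto=fat",
            "-C", "codegen-units=1",
            "-C", "target-cpu=native",
            "-C", "panic=abort",
            "-C", "strip=symbols",
            "-o", "main",
            "main.rs",
        ]
    elif language == "C++":
        # Assumption: g++ is installed; adjust the compiler and flags for your platform
        compile_command = ["g++", "-O3", "-march=native", "-std=c++17", "-o", "main", "main.cpp"]
    else:
        raise ValueError(f"Unsupported language: {language}")
    return compile_command, ["./main"]

compile_command, run_command = commands_for("Rust")
```

Raising on an unknown language avoids silently compiling with the wrong toolchain.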


## And now, on with the main task

In [25]:
language = "Rust" # or "C++"
extension = "rs" if language == "Rust" else "cpp"

system_prompt = f"""
Your task is to convert Python code into high performance {language} code.
Respond only with {language} code. Do not provide any explanation other than occasional comments.
The {language} response needs to produce an identical output in the fastest possible time.
"""

def user_prompt_for(python):
    return f"""
Port this Python code to {language} with the fastest possible implementation that produces identical output in the least time.
The system information is:
{system_info}
Your response will be written to a file called main.{extension} and then compiled and executed; the compilation command is:
{compile_command}
Respond only with {language} code.
Python code to port:

```python
{python}
```
"""

In [26]:
def messages_for(python):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(python)}
    ]
 

In [27]:
def write_output(code):
    with open(f"main.{extension}", "w") as f:
        f.write(code)

In [28]:
def port(model, python):
    client = clients[model]
    reasoning_effort = "high" if 'gpt' in model else None
    response = client.chat.completions.create(model=model, messages=messages_for(python), reasoning_effort=reasoning_effort)
    reply = response.choices[0].message.content
    reply = reply.replace('```cpp','').replace('```rust','').replace('```','')
    return reply
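The `replace` chain in `port` removes the fence markers but keeps any prose a model adds around the code, and that prose would then fail to compile. A possible alternative, sketched here under the assumption that the reply contains at most one fenced block of interest, is to extract the fenced body with a regular expression. `extract_code` and `FENCE_PATTERN` are hypothetical names.

```python
import re

# Matches an opening fence with an optional language tag, then captures up to the closing fence
FENCE_PATTERN = re.compile(r"```[\w+#-]*[ \t]*\n(.*?)```", re.DOTALL)

def extract_code(reply):
    """Return the body of the first fenced block, or the whole reply if there is none."""
    match = FENCE_PATTERN.search(reply)
    if match:
        return match.group(1).strip() + "\n"
    return reply.strip() + "\n"

print(extract_code("Here you go:\n```rust\nfn main() {}\n```\nHope it helps"))
```

If the model replies with bare code and no fences, the reply is passed through unchanged apart from surrounding whitespace.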

In [29]:
def run_python(code):
    globals_dict = {"__builtins__": __builtins__}

    buffer = io.StringIO()
    old_stdout = sys.stdout
    sys.stdout = buffer

    try:
        exec(code, globals_dict)
        output = buffer.getvalue()
    except Exception as e:
        output = f"Error: {e}"
    finally:
        sys.stdout = old_stdout

    return output
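The manual save-and-restore of `sys.stdout` can also be written with the standard library's `contextlib.redirect_stdout`, which restores the stream automatically even when the code raises. A minimal sketch; `run_python_redirected` is a hypothetical name, and like `run_python` it executes whatever code it is given, so it should only be used with code you trust.

```python
import io
from contextlib import redirect_stdout

def run_python_redirected(code):
    """Execute code in a fresh namespace and return whatever it printed."""
    buffer = io.StringIO()
    try:
        # stdout is restored when the with-block exits, including on exceptions
        with redirect_stdout(buffer):
            exec(code, {"__builtins__": __builtins__})
        return buffer.getvalue()
    except Exception as e:
        return f"Error: {e}"

print(run_python_redirected("print(1 + 1)"))
```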

In [30]:
# Use the commands from GPT 5

def compile_and_run(code):
    write_output(code)
    
    # Ensure Rust is in PATH (needed for notebook environments)
    # os is already imported at the top of the notebook
    env = os.environ.copy()
    cargo_bin = os.path.expanduser("~/.cargo/bin")
    if cargo_bin not in env.get("PATH", "").split(os.pathsep):
        env["PATH"] = f"{cargo_bin}{os.pathsep}{env.get('PATH', '')}"
    
    try:
        # Compile the Rust code
        compile_result = subprocess.run(
            compile_command, 
            check=True, 
            text=True, 
            capture_output=True, 
            env=env
        )
        
        # Run the compiled binary
        run_result = subprocess.run(
            run_command, 
            check=True, 
            text=True, 
            capture_output=True, 
            env=env
        )
        return run_result.stdout
    except subprocess.CalledProcessError as e:
        # Return both stdout and stderr for better error messages
        error_msg = "An error occurred:\n"
        if e.stderr:
            error_msg += f"STDERR: {e.stderr}\n"
        if e.stdout:
            error_msg += f"STDOUT: {e.stdout}\n"
        return error_msg
    except FileNotFoundError as e:
        return f"Error: Rust compiler not found. Please ensure Rust is installed.\n{e}"
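Generated code can occasionally loop forever, and `compile_and_run` as written would then block the UI. One way to guard against that, sketched here, is the `timeout` argument of `subprocess.run`, which raises `subprocess.TimeoutExpired`. `run_with_timeout` is a hypothetical helper and the 60 second default is an arbitrary assumption.

```python
import subprocess
import sys

def run_with_timeout(command, timeout_seconds=60):
    """Run a command and return its stdout, or a readable message on failure."""
    try:
        result = subprocess.run(
            command, check=True, text=True, capture_output=True, timeout=timeout_seconds
        )
        return result.stdout
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout_seconds} seconds"
    except subprocess.CalledProcessError as e:
        return f"An error occurred:\nSTDERR: {e.stderr}\nSTDOUT: {e.stdout}\n"
    except FileNotFoundError as e:
        return f"Error: command not found.\n{e}"

# Usage, with the current Python interpreter standing in for the compiled binary
print(run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_seconds=10))
```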

In [31]:
python_hard = """# Be careful to support large numbers

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    value = seed
    while True:
        value = (a * value + c) % m
        yield value
        
def max_subarray_sum(n, seed, min_val, max_val):
    lcg_gen = lcg(seed)
    random_numbers = [next(lcg_gen) % (max_val - min_val + 1) + min_val for _ in range(n)]
    max_sum = float('-inf')
    for i in range(n):
        current_sum = 0
        for j in range(i, n):
            current_sum += random_numbers[j]
            if current_sum > max_sum:
                max_sum = current_sum
    return max_sum

def total_max_subarray_sum(n, initial_seed, min_val, max_val):
    total_sum = 0
    lcg_gen = lcg(initial_seed)
    for _ in range(20):
        seed = next(lcg_gen)
        total_sum += max_subarray_sum(n, seed, min_val, max_val)
    return total_sum

# Parameters
n = 10000         # Number of random numbers
initial_seed = 42 # Initial seed for the LCG
min_val = -10     # Minimum value of random numbers
max_val = 10      # Maximum value of random numbers

# Timing the function
import time
start_time = time.time()
result = total_max_subarray_sum(n, initial_seed, min_val, max_val)
end_time = time.time()

print("Total Maximum Subarray Sum (20 runs):", result)
print("Execution Time: {:.6f} seconds".format(end_time - start_time))
"""

In [32]:
from styles import CSS

with gr.Blocks(css=CSS, theme=gr.themes.Monochrome(), title=f"Port from Python to {language}") as ui:
    with gr.Row(equal_height=True):
        with gr.Column(scale=6):
            python = gr.Code(
                label="Python (original)",
                value=python_hard,
                language="python",
                lines=26
            )
        with gr.Column(scale=6):
            cpp = gr.Code(
                label=f"{language} (generated)",
                value="",
                language="cpp",
                lines=26
            )

    with gr.Row(elem_classes=["controls"]):
        python_run = gr.Button("Run Python", elem_classes=["run-btn", "py"])
        model = gr.Dropdown(models, value=models[0], show_label=False)
        convert = gr.Button(f"Port to {language}", elem_classes=["convert-btn"])
        cpp_run = gr.Button(f"Run {language}", elem_classes=["run-btn", "cpp"])

    with gr.Row(equal_height=True):
        with gr.Column(scale=6):
            python_out = gr.TextArea(label="Python result", lines=8, elem_classes=["py-out"])
        with gr.Column(scale=6):
            cpp_out = gr.TextArea(label=f"{language} result", lines=8, elem_classes=["cpp-out"])

    convert.click(fn=port, inputs=[model, python], outputs=[cpp])
    python_run.click(fn=run_python, inputs=[python], outputs=[python_out])
    cpp_run.click(fn=compile_and_run, inputs=[cpp], outputs=[cpp_out])

ui.launch(inbrowser=True)
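The system prompt asks for identical output, but the "Execution Time" line will always differ between the Python run and the ported run. A small check like the following sketch compares the two result panes while ignoring that line. `result_lines` and `same_result` are hypothetical helpers, and the value 123 below is a made-up placeholder, not a real result.

```python
def result_lines(output):
    """Drop blank lines and the timing line, which legitimately differs between runs."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip() and not line.startswith("Execution Time")
    ]

def same_result(python_output, ported_output):
    """True when both outputs match on everything except the timing line."""
    return result_lines(python_output) == result_lines(ported_output)

# Placeholder outputs for illustration only
python_output = "Total Maximum Subarray Sum (20 runs): 123\nExecution Time: 30.000000 seconds\n"
ported_output = "Total Maximum Subarray Sum (20 runs): 123\nExecution Time: 0.000500 seconds\n"
print(same_result(python_output, ported_output))
```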


## RESULTS!

Qwen 2.5 Coder: FAIL  
Gemini 2.5 Pro: FAIL  
DeepSeek Coder v2: FAIL  
Qwen3 Coder 30B: FAIL  
Claude Sonnet 4.5: FAIL    
GPT-5: FAIL    

3rd place: GPT-OSS 20B: 0.000341 seconds  
2nd place: Grok 4: 0.000317 seconds  
**1st place: OpenAI GPT-OSS 120B: 0.000304 seconds**  

In [None]:
print(f"In Ed's experiment, the GPT-OSS 120B result is {33.755209/0.000304:,.0f} times faster than the Python code.")

### Quick Check: What's Available in Ollama?

### Check Your Current Models:


### Test Your Current Setup

Your notebook is already configured to use `deepseek-v3.2:cloud`, so you can test it right now with your existing code. The cloud version is sufficient for most use cases and runs on Ollama's servers, so it does not depend on your local hardware.

**If you want to try a smaller LOCAL DeepSeek model**, you could use:
- `deepseek-coder-v2:latest` (8.9 GB - already installed!)
- `deepseek-r1:latest` (4.7 GB - already installed!)

These run locally but are smaller models with different capabilities.


## ✅ Pricing & Configuration Status

### Is the Cloud Version FREE?

**YES!** The `deepseek-v3.2:cloud` model is **FREE** to use through Ollama's cloud service:

- **Free Tier:** Access to cloud models, subject to usage limits
- **No Credit Card Needed:** Completely free to use
- **Pro/Max Plans:** Available at $20/month and $100/month for higher limits, but free tier is sufficient for most use cases

### ⚠️ Authentication Required for Cloud Models

**IMPORTANT:** Cloud models require you to sign in to Ollama (it's free; you just need an account).

**To fix authentication errors:**

1. **Run this command in your terminal:**
   ```bash
   ollama signin
   ```

2. **Visit the URL it provides** (it will look like `https://ollama.com/connect?name=...&key=...`)

3. **Sign in or create a free account** on Ollama's website

4. **Verify it worked** - try using the model again in your notebook

**Alternative:** If you don't want to sign in, you can use local models instead:
- `deepseek-coder-v2:latest` (8.9 GB - already installed, runs locally)
- `deepseek-r1:latest` (4.7 GB - already installed, runs locally)

### Do You Have the Right Configuration?

**YES!** Your configuration is **perfectly set up**:

✅ **Ollama is installed** (version 0.13.5)  
✅ **Ollama is running** (confirmed via API check)  
✅ **DeepSeek V3.2 Cloud is installed** (`deepseek-v3.2:cloud`)  
✅ **Client is configured correctly:**
   - Base URL: `http://localhost:11434/v1`
   - Model name: `deepseek-v3.2:cloud`
   - Client mapping: `{"deepseek-v3.2:cloud": ollama}`

**Once you sign in, you're all set!** 🎉


In [6]:
# Get your Ollama sign-in link
import subprocess

result = subprocess.run(["ollama", "signin"], capture_output=True, text=True)
print(result.stdout)
print("\n" + "="*60)
print("📝 INSTRUCTIONS:")
print("="*60)
print("1. Copy the URL above (starts with https://ollama.com/connect)")
print("2. Open it in your browser")
print("3. Sign in or create a free Ollama account")
print("4. Come back and try using deepseek-v3.2:cloud again!")
print("="*60)


## RESULTS - Rust Code Execution Times

Ranked from fastest to slowest:

**🥇 1st place: Gemini 2.5 Pro** - 0.000684 seconds  
**🥈 2nd place: GLM 4.7** - 0.000701 seconds  
**🥉 3rd place: Claude Sonnet 4.5** - 0.000910 seconds  
**4th place: GPT-5** - 0.001007 seconds  
**5th place: DeepSeek V3.2** - 0.525210 seconds
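To put the gaps between these timings in perspective, the ranking can be recomputed with each time expressed relative to the fastest one. A minimal sketch using only the five numbers listed above; `rank_by_time` is a hypothetical helper.

```python
# Rust execution times in seconds, copied from the list above
rust_times = {
    "Gemini 2.5 Pro": 0.000684,
    "GLM 4.7": 0.000701,
    "Claude Sonnet 4.5": 0.000910,
    "GPT-5": 0.001007,
    "DeepSeek V3.2": 0.525210,
}

def rank_by_time(times):
    """Return (model, seconds, slowdown vs fastest) tuples, fastest first."""
    ordered = sorted(times.items(), key=lambda item: item[1])
    fastest = ordered[0][1]
    return [(model, seconds, seconds / fastest) for model, seconds in ordered]

for place, (model, seconds, slowdown) in enumerate(rank_by_time(rust_times), start=1):
    print(f"{place}. {model}: {seconds:.6f} s ({slowdown:,.1f}x the fastest time)")
```

Timings this small are sensitive to measurement noise, so the order of the top entries may change from run to run.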
