# Python C extension generator

Use a frontier model to generate high-performance Python C extension code from Python code.

Python C extension modules let you integrate compiled C code into Python applications.

* [Python C Extensions](https://docs.python.org/3.13/extending/index.html)
* [Python's C API](https://docs.python.org/3.13/c-api/index.html)

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../../../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h1 style="color:#910;">Important Note</h1>
            <span style="color:#910;">
            In this lab, I use GPT-4o or GPT-5, which are slightly higher-priced models.
            </span>
        </td>
    </tr>
</table>

In [1]:
# Imports.

import io
import os
import subprocess
import sys
from time import perf_counter
from timeit import timeit

import gradio as gr
from dotenv import load_dotenv
from openai import OpenAI
from pydantic import BaseModel

In [2]:
# Load environment variables from '.env' file.

load_dotenv(override=True)

True

In [3]:
# Initialize client and set the default LLM model to use.

# OPENAI_MODEL = "gpt-4o"
OPENAI_MODEL = "gpt-5"

openai = OpenAI()

In [4]:
# Define Pydantic model class for GPT response parsing.

class Extension_codes(BaseModel):
    """Pydantic model of a response containing the generated C code, the 'setup.py' code and a usage example."""
    c_code: str
    setup: str
    usage: str

In [5]:
# Define a function to print the optimization codes.

def print_optimization(optimization):
    """Print the optimization codes."""
    print(f"C CODE:\n{optimization.c_code}")
    print("---------------------------")
    print(f"setup.py:\n{optimization.setup}")
    print("---------------------------")
    print(f"USAGE:\n{optimization.usage}")

In [6]:
# Define a function to write outputs to a file with a given filename.

def write_file(data, filename):
    """Write data to a file with the specified filename."""
    with open(filename, "w") as file:
        file.write(data)

In [7]:
# Define a function to write the optimization codes to files.

def write_optimization(optimization, module_name):
    """Write the optimization codes to files."""
    write_file(optimization.c_code, f"{module_name}.c")
    write_file(optimization.setup, "setup.py")
    write_file(optimization.usage, "usage_example.py")

In [8]:
# Define system message for the LLM with instructions for generating the C extension code.

system_message = """
You are an assistant that reimplements Python code in high performance C extensions for Python.
Your responses must always be a JSON with the following structure:

{
    "c_code": "Optimized C extension for Python code",
    "setup": "The 'setup.py' code to compile the C extension for Python",
    "usage": "An example of usage of the C extension for Python code with time measurement and comparing with the original Python code"
}

Use comments sparingly and do not provide any explanation other than occasional comments.
The C extension for Python needs to produce an identical output in the fastest possible time.
Make sure the C extension for Python code is correct and can be compiled with 'python setup.py build' and used in Python.
The usage example must include a time measurement and a comparison with the original Python code.
Do not include any additional text or explanation outside the JSON structure.
Make sure the JSON is correctly formatted.
"""
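
To make the reply shape requested above concrete, here is a minimal sketch that checks a reply using only the standard library. The helper name `check_reply` and the sample reply are hypothetical; the notebook itself relies on the Pydantic model for this validation.

```python
import json

EXPECTED_KEYS = {"c_code", "setup", "usage"}

def check_reply(raw_reply):
    """Return the parsed reply if it has exactly the three expected string fields."""
    reply = json.loads(raw_reply)
    if not isinstance(reply, dict) or set(reply) != EXPECTED_KEYS:
        raise ValueError("Reply must be a JSON object with keys c_code, setup and usage")
    for key, value in reply.items():
        if not isinstance(value, str):
            raise ValueError(f"Field '{key}' must be a string")
    return reply

# Hypothetical reply, shortened for illustration.
sample_reply = json.dumps({
    "c_code": "#include <Python.h>",
    "setup": "from setuptools import setup",
    "usage": "import my_module",
})
print(sorted(check_reply(sample_reply)))
```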

In [9]:
# Define user prompt template and function to fill it.

def user_prompt_for(python_code, module_name):
    user_prompt = f"""
    Reimplement this Python code as a C extension for Python with the fastest possible implementation that produces identical output in the least time.
    Respond only with C extension for Python code, do not explain your work other than a few code comments.
    The module name, used to import, must be "{module_name}", the generated C file will be named "{module_name}.c".
    Pay attention to number types to ensure no int overflows.
    Remember to #include all necessary C headers such as <Python.h> or <math.h>.

    The target platform is {sys.platform}; keep that in mind while generating the C code, especially
    when choosing types to use, and use the appropriate compiler flags.
    Make sure to use the Python C API correctly and manage memory properly to avoid leaks or crashes.

    Here is the Python code to reimplement:

    {python_code}"""
    return user_prompt
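
`sys.platform` identifies the operating system family rather than the CPU. If the prompt should also carry the CPU architecture, something like the following sketch could supply it. `describe_target` is a hypothetical helper, not part of the notebook, and whether the extra detail improves the generated code is untested here.

```python
import platform
import sys

def describe_target():
    """Collect OS and CPU details that could be interpolated into the prompt."""
    return {
        "os": sys.platform,             # e.g. 'linux', 'darwin' or 'win32'
        "machine": platform.machine(),  # e.g. 'x86_64' or 'arm64'
        "pointer_bits": 64 if sys.maxsize > 2**32 else 32,
    }

print(describe_target())
```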

In [10]:
# Define function to create the messages for the LLM.

def messages_for(python_code, module_name):
    """Create the messages for the LLM given the Python code and the desired module name."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt_for(python_code, module_name)}]

In [11]:
# Test the messages function and print the messages.

for message in messages_for("print('Hello World')", "say_hello"):
    print(f"{message['role'].upper()}: {message['content']}")
    print("--------------------------------")

SYSTEM: 
You are an assistant that reimplements Python code in high performance C extensions for Python.
Your responses must always be a JSON with the following structure:

{
    "c_code": "Optimized C extension for Python code",
    "setup": "The 'setup.py' code to compile the C extension for Python",
    "usage": "An example of usage of the C extension for Python code with time measurement and comparing with the original Python code"
}

Use comments sparingly and do not provide any explanation other than occasional comments.
The C extension for Python needs to produce an identical output in the fastest possible time.
Make sure the C extension for Python code is correct and can be compiled with 'python setup.py build' and used in Python.
The usage example must include a time measurement and a comparison with the original Python code.
Do not include any additional text or explanation outside the JSON structure.
Make sure the JSON is correctly formatted.

-------------------------------

In [12]:
# Define optimization function using OpenAI's GPT model.

def optimize_gpt(python_code, module_name, model=OPENAI_MODEL):
    """Optimize the given Python code by generating a C extension for Python with the specified module name using the specified LLM model."""
    response = openai.chat.completions.parse(
        model=model,
        messages=messages_for(python_code, module_name),
        response_format=Extension_codes).choices[0].message.parsed
    return response

# Start with a math function that calculates ***π*** using the Leibniz formula.

This formula approximates *π* iteratively with an alternating series:
more iterations give more precision, at the cost of more computation.
* [Leibniz formula for π](https://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80)

This is a good candidate to get a noticeable improvement by coding and compiling it into a Python C extension. 
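
As a rough illustration of that trade-off, the sketch below evaluates the series for a few small iteration counts and compares the error with the usual alternating-series bound (four times the first omitted term). The function name `leibniz_pi_sketch` is chosen here only to avoid clashing with the notebook's own function.

```python
import math

def leibniz_pi_sketch(iterations):
    """Sum the series in pairs: -1/(4i-1) + 1/(4i+1) for i = 1..iterations."""
    result = 1.0
    for i in range(1, iterations + 1):
        result -= 1 / (4 * i - 1)
        result += 1 / (4 * i + 1)
    return result * 4

for n in (10, 1_000, 100_000):
    error = abs(leibniz_pi_sketch(n) - math.pi)
    bound = 4 / (4 * n + 3)  # Four times the first omitted term.
    print(f"n={n:>7}  error={error:.2e}  bound={bound:.2e}")
```

The error shrinks only in proportion to 1/n, which is why many iterations are needed for a handful of correct digits.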

> NOTE:
>
> We are creating an importable module, not an executable program, so the code to be optimized must contain only declarations such as `def` or `class`.
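
One way to check this requirement before sending code to the model is to inspect its top-level statements with the `ast` module. This is a minimal sketch: `only_declarations` is a hypothetical helper, and treating imports and docstrings as acceptable is an assumption made here, not something the note states.

```python
import ast

# Assumption: imports are tolerated alongside def and class statements.
ALLOWED_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef,
                 ast.Import, ast.ImportFrom)

def only_declarations(source):
    """Return True if every top-level statement is a def, class or import."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ALLOWED_NODES):
            continue
        # Allow a module docstring or another bare string constant.
        if (isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            continue
        return False
    return True

print(only_declarations("def f(x):\n    return x + 1\n"))  # True
print(only_declarations("print('Hello World')\n"))         # False
```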

In [13]:
# Define the Python function to be converted to a C extension and its module name.

module_name = "calculate_pi"

calculate_pi_code = """
def leibniz_pi(iterations):
    result = 1.0
    for i in range(1, iterations+1):
        j = i * 4 - 1
        result -= (1/j)
        j = i * 4 + 1
        result += (1/j)
    return result * 4
"""

# Define a function to test the performance of the calculus function.

def test_pi_calculation(calculus_function, iterations=100_000_000):
    """Test the performance of the given calculus function."""
    start_time = perf_counter()
    result = calculus_function(iterations)
    end_time = perf_counter()
    print(f"Result: {result:.12f}")
    print(f"Execution Time: {(end_time - start_time):.6f} seconds")

# Execute function declaration.
exec(calculate_pi_code)

In [20]:
# Run original python code and time it.

test_pi_calculation(leibniz_pi, 100_000_000)

Result: 3.141592658589
Execution Time: 20.556854 seconds


In [21]:
# Average the timing of the original Python code over several runs.
# (Increase 'iterations' for a more stable average)

print("Timing...")
iterations = 5
average = timeit(lambda: leibniz_pi(100_000_000), number=iterations) / iterations
print(f"Python average execution time: {average:.6f} seconds")

Timing...
Python average execution time: 21.158541 seconds


In [14]:
# Request code optimization using GPT.

optimization = optimize_gpt(calculate_pi_code, module_name)

In [15]:
# Print generated extension code.

print_optimization(optimization)

C CODE:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <math.h>
#include <float.h>
#include <limits.h>
#include <stdint.h>

static PyObject* leibniz_pi(PyObject* self, PyObject* args) {
    PyObject* iterations_obj;
    if (!PyArg_ParseTuple(args, "O", &iterations_obj)) {
        return NULL;
    }

    long long n_signed;
    int overflow = 0;
    n_signed = PyLong_AsLongLongAndOverflow(iterations_obj, &overflow);
    if (n_signed == -1 && PyErr_Occurred() && overflow == 0) {
        return NULL;
    }

    unsigned long long n = 0ULL;
    if (overflow < 0) {
        n = 0ULL;
    } else if (overflow > 0) {
        unsigned long long tmp = PyLong_AsUnsignedLongLong(iterations_obj);
        if (tmp == (unsigned long long)-1 && PyErr_Occurred()) {
            return NULL;
        }
        n = tmp;
    } else {
        if (n_signed <= 0) {
            n = 0ULL;
        } else {
            n = (unsigned long long)n_signed;
        }
    }

    double result = 1.0;
    if (n == 0U

In [16]:
# Write the generated code to files.
# (Will overwrite existing files)

write_optimization(optimization, module_name)

# Compiling C Extension and executing

The `python setup.py` command may fail inside JupyterLab; if it does, run it directly on the command line.

Two cells are marked WINDOWS ONLY. They deal with the fact that Windows ships with two command-line shells:
the older CMD (MS-DOS style) and the newer PowerShell (which offers Unix-like command aliases).

The shell in use is controlled by the `COMSPEC` environment variable.\
*(Setting this variable is harmless on Unix systems, which simply ignore it.)*

Most of the command lines used here are Unix style, but the build command requires CMD, so
we switch to CMD before compiling and restore the previous value afterwards.
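
The set-then-restore pattern can also be written as a context manager, which restores the variable even if the build raises. This is a minimal sketch; `temporary_env` is a hypothetical helper that the notebook does not define. It also handles the case where the variable did not exist beforehand, since assigning `None` to an `os.environ` entry raises `TypeError`.

```python
import os
from contextlib import contextmanager

@contextmanager
def temporary_env(name, value):
    """Set an environment variable for the duration of a 'with' block."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before (typical for COMSPEC on Unix).
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous

with temporary_env("COMSPEC", "C:\\Windows\\System32\\cmd.exe"):
    print("Inside:", os.environ["COMSPEC"])
print("Restored:", os.environ.get("COMSPEC"))
```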

In [21]:
# Clean previous builds.
# (Run this cell only before running the compile cell a second time)
# (May raise an error if no previous build exists)

!rm -r build/

In [17]:
# [WINDOWS ONLY]
# Set COMSPEC to cmd.exe to avoid issues with some C compilers on Windows.
# (Remember to restore original COMSPEC after compilation and testing)
preset_comspec = os.environ.get("COMSPEC")
os.environ["COMSPEC"] = "C:\\Windows\\System32\\cmd.exe"

In [None]:
# Compile the C extension.
# (Will fail if no C compiler is installed)
# (In case of errors, try directly on the command line)

!python setup.py build_ext --inplace
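
With `--inplace`, the compiled module is written next to `setup.py` with a platform-specific suffix. The sketch below lists the suffixes the running interpreter accepts and looks for a matching file; `find_built_extension` is a hypothetical helper, and an empty list simply means no build has been found in the given directory.

```python
import importlib.machinery
from pathlib import Path

def find_built_extension(module_name, directory="."):
    """Return paths in 'directory' that look like the compiled module."""
    matches = []
    for suffix in importlib.machinery.EXTENSION_SUFFIXES:
        candidate = Path(directory) / f"{module_name}{suffix}"
        if candidate.exists():
            matches.append(candidate)
    return matches

print(importlib.machinery.EXTENSION_SUFFIXES)
print(find_built_extension("calculate_pi"))
```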

In [19]:
# [WINDOWS ONLY]
# Restore original COMSPEC.

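# 'COMSPEC' is normally unset outside Windows; assigning None to 'os.environ' raises TypeError.
if preset_comspec is None:
    os.environ.pop("COMSPEC", None)
else:
    os.environ["COMSPEC"] = preset_comspec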

In [20]:
# Run the usage example to test the compiled C extension.
exec(optimization.usage)

Iterations: 5000000
C extension result: 3.1415927535897814
Python result:      3.1415927535897814
Absolute difference: 0.0
C extension time: 0.037515 s
Python time:      1.046732 s
Speedup: 27.90x


In [21]:
# Import newly created C extension and compare performance with original Python code.

from calculate_pi import leibniz_pi as c_leibniz_pi

print("Testing original Python code:")
test_pi_calculation(leibniz_pi, 100_000_000)
print("Testing C extension code:")
test_pi_calculation(c_leibniz_pi, 100_000_000)


Testing original Python code:
Result: 3.141592658589
Execution Time: 20.350486 seconds
Testing C extension code:
Result: 3.141592658589
Execution Time: 0.759571 seconds


# Let's try more complex code

Now we define three functions that together implement the calculation of the "total maximum subarray sum"
by finding the largest sum of a contiguous subarray within a given array of numbers.

* [Maximum subarray problem](https://en.wikipedia.org/wiki/Maximum_subarray_problem)

This algorithm requires much more computation and many more steps than the previous one, so we may expect a large
improvement from coding and compiling it as a Python C extension.

> NOTE:
>
> We are creating an importable module, not an executable program, so the code to be optimized must contain only declarations such as `def` or `class`.

In [22]:
# Define the Python function to be converted to a C extension and its module name.

module_name = "python_hard"

python_hard_code = """
# Be careful to support large number sizes

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    value = seed
    while True:
        value = (a * value + c) % m
        yield value

def max_subarray_sum(n, seed, min_val, max_val):
    lcg_gen = lcg(seed)
    random_numbers = [next(lcg_gen) % (max_val - min_val + 1) + min_val for _ in range(n)]
    max_sum = float('-inf')
    for i in range(n):
        current_sum = 0
        for j in range(i, n):
            current_sum += random_numbers[j]
            if current_sum > max_sum:
                max_sum = current_sum
    return max_sum

def total_max_subarray_sum(n, initial_seed, min_val, max_val):
    total_sum = 0
    lcg_gen = lcg(initial_seed)
    for _ in range(20):
        seed = next(lcg_gen)
        total_sum += max_subarray_sum(n, seed, min_val, max_val)
    return total_sum
"""

# Define a function to test the performance of the calculus function.

def test_subarray_sum(calculus_function, n=1000, initial_seed=42, min_val=-10, max_val=10):
    """Test the performance of the given calculus function."""
    start_time = perf_counter()
    result = calculus_function(n, initial_seed, min_val, max_val)
    end_time = perf_counter()
    print("Total Maximum Subarray Sum (20 runs):", result)
    print("Execution Time: {:.6f} seconds".format(end_time - start_time))


# Execute function declarations.
exec(python_hard_code)

In [None]:
# Run original python code and time it.

test_subarray_sum(total_max_subarray_sum, 10000, 42, -10, 10)

Total Maximum Subarray Sum (20 runs): 10980
Execution Time: 61.362418 seconds


In [23]:
# Request code optimization using GPT.

optimization = optimize_gpt(python_hard_code, module_name)

In [24]:
# Print generated extension code.

print_optimization(optimization)

C CODE:
#include <Python.h>
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>
#include <math.h>

// LCG step with 32-bit wrap-around
static inline uint32_t lcg_next(uint32_t *state) {
    *state = (uint32_t)(1664525u * (*state) + 1013904223u);
    return *state;
}

static inline int add_overflow_int64(int64_t a, int64_t b, int64_t *res) {
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b)) return 1;
    *res = a + b;
    return 0;
}

// Kadane for int64 array with overflow detection; returns PyLong or NULL (on overflow -> signal via *overflowed)
static PyObject* kadane_int64(const int64_t *arr, Py_ssize_t n, int *overflowed) {
    if (n <= 0) {
        return PyFloat_FromDouble(-INFINITY);
    }
    int64_t meh = arr[0];
    int64_t msf = arr[0];
    for (Py_ssize_t i = 1; i < n; ++i) {
        int64_t x = arr[i];
        if (meh > 0) {
            int64_t tmp;
            if (add_overflow_int64(meh, x, &tmp)) { *overflowed = 1; return NULL; }
            me

In [25]:
# Write the generated extension code to files.
# (Will overwrite existing files)

write_optimization(optimization, module_name)

In [26]:
# Clean previous builds.
# (Run this cell only before running the compile cell a second time)
# (May raise an error if no previous build exists)

!rm -r build/

In [27]:
# [WINDOWS ONLY]
# Set COMSPEC to cmd.exe to avoid issues with some C compilers on Windows.
# (Remember to restore original COMSPEC after compilation and testing)
preset_comspec = os.environ.get("COMSPEC")
os.environ["COMSPEC"] = "C:\\Windows\\System32\\cmd.exe"

In [None]:
# Compile the C extension.
# (Will fail if no C compiler is installed)
# (In case of errors, try directly on the command line)

!python setup.py build_ext --inplace

In [29]:
# [WINDOWS ONLY]
# Restore original COMSPEC.

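# 'COMSPEC' is normally unset outside Windows; assigning None to 'os.environ' raises TypeError.
if preset_comspec is None:
    os.environ.pop("COMSPEC", None)
else:
    os.environ["COMSPEC"] = preset_comspec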

In [30]:
# Run the usage example to test the compiled C extension.
exec(optimization.usage)

max_subarray_sum equality: True
Python time: 0.010335999992094003
C ext time: 1.4399993233382702e-05
total_max_subarray_sum equality: True
Python total time: 0.21065390000876505
C ext total time: 0.00012310000602155924


In [31]:
# Import newly created C extension and compare performance with original Python code.

from python_hard import total_max_subarray_sum as c_total_max_subarray_sum

print("Testing original Python code:")
test_subarray_sum(total_max_subarray_sum, 10000, 42, -10, 10)
print("Testing C extension code:")
test_subarray_sum(c_total_max_subarray_sum, 10000, 42, -10, 10)

Testing original Python code:
Total Maximum Subarray Sum (20 runs): 10980
Execution Time: 57.275276 seconds
Testing C extension code:
Total Maximum Subarray Sum (20 runs): 10980
Execution Time: 0.002317 seconds
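
The truncated C listing above mentions "Kadane for int64 array", which suggests the model replaced the quadratic scan with Kadane's linear-time algorithm. If so, part of the speedup comes from the change of algorithm rather than from compilation alone. The sketch below, written in Python for comparison only, shows both approaches giving the same answer; the function names are hypothetical.

```python
def brute_force_max_subarray(numbers):
    """Quadratic scan, as in the notebook's 'max_subarray_sum'."""
    max_sum = float("-inf")
    for i in range(len(numbers)):
        current_sum = 0
        for j in range(i, len(numbers)):
            current_sum += numbers[j]
            if current_sum > max_sum:
                max_sum = current_sum
    return max_sum

def kadane_max_subarray(numbers):
    """Linear scan: track the best sum of a subarray ending at each position."""
    if not numbers:
        return float("-inf")
    best_ending_here = best_overall = numbers[0]
    for x in numbers[1:]:
        best_ending_here = max(x, best_ending_here + x)
        best_overall = max(best_overall, best_ending_here)
    return best_overall

sample = [3, -7, 4, -1, 5, -9, 2]
print(brute_force_max_subarray(sample), kadane_max_subarray(sample))  # 8 8
```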


# Let's build a Gradio service

In [32]:
# Define a function to call the optimization process and return the generated codes.

def optimize(python_code, module_name, model):
    """Call the optimization process and return the generated codes."""
    optimization = optimize_gpt(python_code, module_name, model)
    return optimization.c_code, optimization.setup, optimization.usage

In [33]:
# Define a function to execute Python code and capture its output.

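def execute_python(code):
    """Execute the given Python code and capture its output."""
    output = io.StringIO()
    previous_stdout = sys.stdout  # In Jupyter this is not 'sys.__stdout__'.
    # Run the code as a script would run, so the functions it defines can see each other.
    namespace = {"__name__": "__main__"}
    try:
        sys.stdout = output
        exec(code, namespace)
    finally:
        sys.stdout = previous_stdout
    return output.getvalue()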
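
The standard library's `contextlib.redirect_stdout` provides the same capture-and-restore behaviour in a `with` block. The following is a minimal sketch of that alternative; `run_and_capture` is a hypothetical name, and running the code in a fresh namespace is a choice made for this example.

```python
import contextlib
import io

def run_and_capture(code):
    """Execute code in a fresh namespace and return whatever it printed."""
    buffer = io.StringIO()
    namespace = {"__name__": "__main__"}
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    return buffer.getvalue()

print(run_and_capture("def greet():\n    print('Hello')\ngreet()"))
```

Either approach only captures text written through Python's `sys.stdout`. Output produced by a C-level `printf` inside an extension bypasses it and will not appear in the captured string.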

In [34]:
# Extension compilation function.

def build_extension():
    """Compile the C extension using 'setup.py' and return the compilation output."""
    # Set default COMSPEC to cmd.exe on Windows to avoid issues with some C compilers.
    preset_comspec = os.environ.get("COMSPEC")
    os.environ["COMSPEC"] = "C:\\Windows\\System32\\cmd.exe"
    try:
        compile_cmd = ["python", "setup.py", "build_ext", "--inplace"]
        compile_result = subprocess.run(compile_cmd, env=os.environ,
                                        check=True, text=True, capture_output=True)
    except subprocess.CalledProcessError as ex:
        raise RuntimeError(f"An error occurred while building:\n{ex.stdout}\n{ex.stderr}") from ex
    finally:  # The 'finally' clause always executes, whether or not there was an exception.
        # Restore original COMSPEC; it is unset outside Windows, where assigning None would fail.
        if preset_comspec is None:
            os.environ.pop("COMSPEC", None)
        else:
            os.environ["COMSPEC"] = preset_comspec
    return compile_result.stdout
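
The error-handling pattern used here can be tried in isolation with a harmless command. This sketch uses `sys.executable` so the child process runs under the same interpreter as the notebook kernel; `run_python` is a hypothetical helper and the command it runs is only a stand-in for the real build step.

```python
import subprocess
import sys

def run_python(args):
    """Run the current interpreter with 'args' and return its standard output."""
    command = [sys.executable, *args]
    try:
        result = subprocess.run(command, check=True, text=True, capture_output=True)
    except subprocess.CalledProcessError as ex:
        # 'check=True' raises on a non-zero exit code; the captured streams travel with the exception.
        raise RuntimeError(f"Command failed:\n{ex.stdout}\n{ex.stderr}") from ex
    return result.stdout

print(run_python(["-c", "print('build step placeholder')"]))
```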

In [35]:
# Extension generation function: write the source files and build the extension.

def generate_extension(c_code, setup_code, usage_code, module_name):
    """Write the C and 'setup.py' codes to files and build the C extension in place."""
    try:  # Write the provided codes to their respective files.
        write_file(c_code, f"{module_name}.c")
        write_file(setup_code, "setup.py")
    except Exception as ex:
        return f"An error occurred while writing files:\n{ex}"
    # Build the extension and capture the output.
    try:
        build_output = build_extension()
    except Exception as ex:  # If the build fails, return the error message.
        return str(ex)
    # Return the output of the build process.
    return build_output

In [36]:
# Extension testing function.

def test_extension(usage_code):
    """Test the installed C extension by executing the provided usage code and capturing its output."""
    try:  # Write the usage code to its file.
        write_file(usage_code, "usage_example.py")
    except Exception as ex:
        return f"An error occurred while writing test file:\n{ex}"
    try:
        output = execute_python(usage_code)
    except Exception as ex:
        return f"An error occurred while testing the extension:\n{ex}"
    return output

In [37]:
# Define custom CSS for Gradio interface.

css = """
.python {background-color: #306998;}
.c_ext {background-color: #050;}
"""

In [38]:
# Define default codes for the interface.

default_p_code = """
def hello_world():
    print("Hello, World!")
"""
# default_p_code = python_hard_code  # Run the declaration cell before use.
# default_p_code = calculate_pi_code  # Run the declaration cell before use.

default_c_code = r"""
#include <Python.h>

// Function to be called from Python
static PyObject* zz_hello_world(PyObject* self, PyObject* args) {
    printf("Hello, World!\n");
    Py_RETURN_NONE;
}

// Method definition structure
static PyMethodDef zz_my_methods[] = {
    {"hello_world", zz_hello_world, METH_VARARGS, "Print 'Hello, World!'"},
    {NULL, NULL, 0, NULL}  // Sentinel
};

// Module definition
static struct PyModuleDef zz_my_module = {
    PyModuleDef_HEAD_INIT,
    "zz_my_module",
    "Extension module that prints Hello, World!",
    -1,
    zz_my_methods
};

// Module initialization function
PyMODINIT_FUNC PyInit_zz_my_module(void) {
    return PyModule_Create(&zz_my_module);
}
"""

default_setup = """
from setuptools import setup, Extension

module = Extension(
    'zz_my_module',
    sources=['zz_my_module.c'],
)

setup(
    name='zz_my_module',
    version='1.0',
    description='This is a custom C extension module.',
    ext_modules=[module]
)
"""

default_test = """
import time
import zz_my_module

def python_hello_world():
    print("Hello, World!")

start = time.time()
python_hello_world()
end = time.time()
print(f"Python function execution time: {end - start:.6f} seconds")

start = time.time()
zz_my_module.hello_world()
end = time.time()
print(f"C extension execution time: {end - start:.6f} seconds")
"""

In [39]:
# We will use Gradio's auto-reload feature, so we do not need to restart the app to see changes in the code.
# * https://www.gradio.app/guides/developing-faster-with-reload-mode

%load_ext gradio

# This requires naming the 'gr.Blocks' interface 'demo'.
# Now, each time we edit the code, we just need to re-run the Gradio interface cell to see the changes in the app.
# The '.launch()' method is no longer needed.

In [None]:
%%blocks

with gr.Blocks(css=css) as demo:
    gr.Markdown("## Convert Python code to a Python C extension")
    with gr.Row():
        module_name = gr.Textbox(label="Module name:", lines=1, value="zz_my_module")
        model = gr.Dropdown(["gpt-4o", "gpt-5"], label="Select model", value="gpt-4o")
    with gr.Row():
        python = gr.Textbox(label="Python code:", lines=30, value=default_p_code, elem_classes=["python"])
        c_code = gr.Textbox(label="C Extension code:", lines=30, value=default_c_code, elem_classes=["c_ext"])
    with gr.Row():
        get_extension = gr.Button("Generate extension code")
    with gr.Row():
        setup_code = gr.Textbox(label="Compilation code:", lines=10, value=default_setup, elem_classes=["python"])
        usage_code = gr.Textbox(label="Test compare code:", lines=10, value=default_test, elem_classes=["python"])
    with gr.Row():
        compile_ext = gr.Button("Compile extension")
    with gr.Row():
        c_ext_out = gr.TextArea(label="C Extension result:", elem_classes=["c_ext"])
    with gr.Row():
        test_run = gr.Button("Test code")
    with gr.Row():
        test_out = gr.TextArea(label="Test result:", elem_classes=["python"])

    get_extension.click(optimize, inputs=[python, module_name, model], outputs=[c_code, setup_code, usage_code])
    compile_ext.click(generate_extension, inputs=[c_code, setup_code, usage_code, module_name], outputs=[c_ext_out])
    test_run.click(test_extension, inputs=[usage_code], outputs=[test_out])
