TurboPython

A compiled Python dialect that eliminates core sources of CPython overhead — without sacrificing Python's syntax or readability.

This is a work in progress. The syntax may still change as we find more things to improve.

Differences from existing projects

Many projects try to make Python faster. This one is unique in two ways: it extends Python with many new features that improve speed, and it is a drop-in replacement for Python. You can replace python with tython and start optimizing only the places where you know speed is an issue, while keeping Python syntax. This makes migration a breeze.

How It Works

TurboPython addresses a number of independent axes of CPython overhead. Each has its own opt-in syntax. See docs/turbopython_syntax.md for the full language reference.

1. Strict Static Typing

Type annotations are enforced at compile time, not ignored like Python hints. The compiler emits unboxed native arithmetic — no object headers, no dynamic dispatch.

@native
def distance(x: float, y: float) -> float:
    return (x**2 + y**2) ** 0.5
# Emits: double distance(double x, double y) { return sqrt(x*x + y*y); }

Unannotated functions fall back to normal CPython — typing is opt-in.
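
For contrast, here is a small plain-Python sketch of the baseline behaviour that strict typing replaces: CPython stores annotations as metadata and never checks them at call time. It reuses `distance` from the example above without the `@native` decorator; `describe` is a hypothetical helper added only for illustration.

```python
def distance(x: float, y: float) -> float:
    return (x**2 + y**2) ** 0.5


def describe(value: int) -> str:
    # Annotated as int -> str, but CPython checks neither side.
    return value * 2


# Annotations are only metadata attached to the function object.
print(distance.__annotations__)

# ints are accepted where float is annotated; the result is a float.
print(distance(3, 4))  # 5.0

# A str slips through the `int` parameter and comes back repeated.
print(describe("ab"))  # abab
```

Under compile-time enforcement as described above, the last call is the kind of mismatch that would be rejected before the program runs.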

2. Value Types — `struct`

A new struct keyword creates stack-allocated, contiguous-memory types with no heap allocation, no refcount, no GC overhead.

struct Vec3:
    x: float
    y: float
    z: float

points: array[Vec3, 1000]  # 24,000 bytes, one contiguous block
                           # vs. Python list: 1000 pointers + 1000 heap objects

Memory comparison: a Vec3 is 24 bytes (three 8-byte doubles). An equivalent Python object is roughly 200 bytes or more.
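
The 24-byte figure can be checked from plain Python with `ctypes`, which lays out fields the way a C compiler would. This is only a sketch for comparison: the `Vec3` class below is a ctypes stand-in, not TurboPython's `struct`, and `PyVec3` and `python_footprint` are hypothetical names. The Python-side total depends on the interpreter version, so no figure is given for it.

```python
import ctypes
import sys


class Vec3(ctypes.Structure):
    """C-layout stand-in for the TurboPython struct: three doubles."""
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double),
                ("z", ctypes.c_double)]


class PyVec3:
    """Ordinary Python class holding the same three values."""
    def __init__(self, x: float, y: float, z: float):
        self.x, self.y, self.z = x, y, z


def python_footprint(obj: PyVec3) -> int:
    """Rough byte count: the object, its attribute dict, and three boxed floats."""
    return (sys.getsizeof(obj)
            + sys.getsizeof(vars(obj))
            + sum(sys.getsizeof(v) for v in vars(obj).values()))


print(ctypes.sizeof(Vec3))          # 24
print(ctypes.sizeof(Vec3 * 1000))   # 24000
print(python_footprint(PyVec3(1.0, 2.0, 3.0)))  # interpreter-dependent
```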

3. Ownership and Borrowing

Rust-inspired ownership eliminates refcount overhead on the hot path. Three modes, all opt-in:

def consume(data: owned list[int]) -> int:   # caller transfers ownership — source invalidated
    return sum(data)

def analyze(data: ref list[int]) -> float:   # immutable borrow — zero-cost, no refcount
    return sum(data) / len(data)

def normalize(data: mut ref list[float]):    # exclusive mutable borrow — no data races
    total = sum(data)
    for i in range(len(data)):
        data[i] /= total

The compiler enforces: multiple ref borrows are fine; mut ref is exclusive — any overlap is a compile error.
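
The borrow rule can be modelled in a few lines of plain Python. This is a runtime toy, not the TurboPython checker, which works at compile time: `Owner` and `BorrowError` are hypothetical names, and the rule that a value cannot be moved while borrowed is an assumption carried over from Rust.

```python
class BorrowError(Exception):
    """Raised where a compiler would report a borrow violation."""


class Owner:
    """Tracks the borrow state of one value."""

    def __init__(self) -> None:
        self.shared = 0         # number of live `ref` borrows
        self.exclusive = False  # True while a `mut ref` borrow is live
        self.moved = False      # True after ownership was transferred

    def borrow_ref(self) -> None:
        if self.moved:
            raise BorrowError("use after move")
        if self.exclusive:
            raise BorrowError("ref overlaps a live mut ref")
        self.shared += 1

    def borrow_mut(self) -> None:
        if self.moved:
            raise BorrowError("use after move")
        if self.exclusive or self.shared:
            raise BorrowError("mut ref must be exclusive")
        self.exclusive = True

    def release_ref(self) -> None:
        self.shared -= 1

    def release_mut(self) -> None:
        self.exclusive = False

    def move(self) -> None:
        if self.moved:
            raise BorrowError("use after move")
        if self.exclusive or self.shared:
            raise BorrowError("cannot move while borrowed")  # assumed rule
        self.moved = True


data = Owner()
data.borrow_ref()
data.borrow_ref()          # multiple ref borrows are fine
try:
    data.borrow_mut()      # overlaps the two live refs
except BorrowError as err:
    print(err)             # mut ref must be exclusive
```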

4. Native Compilation Directives

Explicit control over what compiles to native code:

@native   # AOT compile — all types resolved at compile time
def fib(n: int) -> int: ...

@inline   # static inline in C — zero-overhead small helpers
def clamp(val: float, lo: float, hi: float) -> float:
    return min(max(val, lo), hi)

@jit      # JIT compile on first call, specialize on observed types
def flexible_sum(items): ...

const MAX_ITER: int = 256   # compile-time constant, embedded in binary

5. True Parallelism — No GIL

parallel for partitions loop iterations across all cores via OpenMP. The ownership checker statically guarantees no data races — no locks needed.

parallel for y in range(height):
    for x in range(width):
        cx: float = (x - width / 2.0) / (width / 4.0)
        cy: float = (y - height / 2.0) / (height / 4.0)
        pixels[y * width + x] = mandelbrot(cx, cy, 256)
# Emits: #pragma omp parallel for
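
To make "partitions loop iterations" concrete, the sketch below splits an iteration range into contiguous, near-equal chunks, one per worker. It assumes a static schedule; the OpenMP runtime decides the actual schedule, and `partition` is a hypothetical helper, not part of TurboPython.

```python
def partition(n_iterations: int, n_workers: int) -> list[range]:
    """Split range(n_iterations) into contiguous, near-equal chunks."""
    base, extra = divmod(n_iterations, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional iteration.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


print(partition(10, 3))  # [range(0, 4), range(4, 7), range(7, 10)]
```

Each chunk corresponds to a band of rows `y` that one core would process independently.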

spawn launches concurrent tasks with typed channels as the only communication mechanism:

spawn filter_stage(data, chan_filtered)
spawn transform_stage(chan_filtered, chan_results)
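
The same pipeline shape can be sketched in standard Python with threads and `queue.Queue` standing in for channels. This shows only the structure, since CPython threads remain GIL-bound; the stage bodies (keep even values, then square them) and the `_DONE` sentinel are assumptions made for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of a stream


def filter_stage(data, out: queue.Queue) -> None:
    """Forward even values only, then signal completion."""
    for value in data:
        if value % 2 == 0:
            out.put(value)
    out.put(_DONE)


def transform_stage(inp: queue.Queue, out: queue.Queue) -> None:
    """Square each incoming value until the sentinel arrives."""
    while (value := inp.get()) is not _DONE:
        out.put(value * value)
    out.put(_DONE)


def run_pipeline(data) -> list[int]:
    chan_filtered: queue.Queue = queue.Queue()
    chan_results: queue.Queue = queue.Queue()
    stages = [
        threading.Thread(target=filter_stage, args=(data, chan_filtered)),
        threading.Thread(target=transform_stage, args=(chan_filtered, chan_results)),
    ]
    for stage in stages:
        stage.start()
    results = []
    while (value := chan_results.get()) is not _DONE:
        results.append(value)
    for stage in stages:
        stage.join()
    return results


print(run_pipeline(range(6)))  # [0, 4, 16]
```

The stages share no variables; everything they exchange passes through a queue, which is the property the channel-only design relies on.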

6. Extended Integer Types

int128 maps to GCC's __int128, giving a range of ±1.7 × 10³⁸ — suitable for large combinatorics and cryptographic primitives without external dependencies.

@native
def fib(n: int) -> int128:
    a: int128 = 0
    b: int128 = 1
    for i in range(n):
        tmp: int128 = a + b
        a = b
        b = tmp
    return a
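
How much headroom the wider type buys can be checked with Python's arbitrary-precision integers. The sketch below finds the largest Fibonacci index whose value still fits in a signed 64-bit and a signed 128-bit integer; `largest_fib_index` is a hypothetical helper.

```python
INT64_MAX = 2**63 - 1
INT128_MAX = 2**127 - 1


def largest_fib_index(limit: int) -> int:
    """Return the largest n such that F(n) <= limit, with F(0) = 0, F(1) = 1."""
    a, b, n = 0, 1, 0  # invariant: a == F(n), b == F(n + 1)
    while b <= limit:
        a, b = b, a + b
        n += 1
    return n


print(largest_fib_index(INT64_MAX))   # 92
print(largest_fib_index(INT128_MAX))  # 184
```

Note that the loop in `fib` above also computes the next term in `tmp`, so if overflow is not checked, the largest argument that avoids any out-of-range intermediate is one lower than these indices.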

Summary

| Concept | Standard Python | TurboPython |
|---|---|---|
| Type enforcement | Advisory hints | Compile-time enforced |
| Memory layout | `class` (heap, dict-backed) | `struct` (stack, packed, no GC) |
| Integer range | Arbitrary precision (slow) | `int` (64-bit) or `int128` (128-bit) |
| Ownership | Refcounted, implicit sharing | `owned`, `ref`, `mut ref` |
| Compilation | Interpreted bytecode | `@native`, `@jit`, `@inline` |
| Constants | Convention (`UPPER_CASE`) | `const` (compile-time evaluated) |
| Parallelism | `threading` (GIL-bound) | `parallel for`, `spawn`, `Channel` |
| Typed arrays | `list` (boxed, pointer array) | `array[T, N]` (contiguous, unboxed) |

Installation

Requirements: Python 3.10+, GCC with OpenMP support.

git clone https://github.com/ribalba/TurboPython.git
cd TurboPython
python -m turbopython.cli --help

No pip install needed — just run from the repo root.

Quick Start

1. Write a `.tpy` file

# benchmarks/hello.tpy
@native
def fib(n: int) -> int:
    a: int = 0
    b: int = 1
    for i in range(n):
        tmp: int = a + b
        a = b
        b = tmp
    return a

@native
def main() -> int:
    return fib(40)

2. Compile to an executable

python -m turbopython.cli compile benchmarks/hello.tpy --exe

Output:

✓ Compilation successful
  C source:   benchmarks/hello.c
  Executable: benchmarks/hello

Run it like any native binary:

./benchmarks/hello
time ./benchmarks/hello

3. Or compile to a shared library and call from Python

python -m turbopython.cli compile benchmarks/hello.tpy

import ctypes

lib = ctypes.CDLL("./benchmarks/hello.so")
lib.fib.argtypes = [ctypes.c_int64]
lib.fib.restype  = ctypes.c_int64

print(lib.fib(40))  # 102334155 — computed in native code

4. Or use the `tython` runner

./tython benchmarks/hello.tpy        # compiles + runs main()
./tython myscript.py                 # runs .py with import hook active

The Name

The name has two references. Turbo comes from Turbo Pascal — Borland's legendary 1980s compiler that made Pascal fast enough to write real software on a home computer, in part by making compilation itself instant. The parallel is intentional: TurboPython aims to make Python fast enough for systems-level work without leaving the language behind. Python is there because it stays Python — same syntax, same stdlib, same feel.

The short name is tython, which is also the name of the drop-in runner and, incidentally, of a planet (#lightsaber).

Benchmarks

python benchmarks/bench_mandelbrot.py
python benchmarks/bench_hello.py

Expected output (numbers vary by machine):

Mandelbrot (400×300, 256 iterations):

Pure Python  : 0.263s   (checksum: 3303274)
Compiling mandelbrot.tpy... done
TurboPython  : 0.007s   (checksum: 3303274)

Speedup      : 37.1x faster

Fibonacci (fib_sum(150) × 5000 reps, int128):

Pure Python  : 1.842s   (result: ...)
Compiling hello.tpy... done
TurboPython  : 0.031s   (result: ...)

Speedup      : 59.4x faster

Language Features

`@native` — AOT compiled functions

All types must be fully resolved at compile time. Emitted as a C symbol with unboxed arithmetic.

@native
def mandelbrot(cx: float, cy: float, max_iter: int) -> int:
    zx: float = 0.0
    zy: float = 0.0
    for i in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return i
        tx: float = zx * zx - zy * zy + cx
        zy = 2.0 * zx * zy + cy
        zx = tx
    return max_iter

`@inline` — inlined at call sites

Emitted as static inline in C. Best for small math helpers.

@inline
def vec3_dot(a: Vec3, b: Vec3) -> float:
    return a.x * b.x + a.y * b.y + a.z * b.z

`struct` — value types with no heap allocation

Stack-allocated, copied on assignment, no GC overhead. A Vec3 is exactly 24 bytes — vs 200+ bytes for an equivalent Python object.

struct Vec3:
    x: float
    y: float
    z: float

@native
def vec3_length(v: Vec3) -> float:
    return (v.x * v.x + v.y * v.y + v.z * v.z) ** 0.5

`parallel for` — multi-core loops via OpenMP

Emits #pragma omp parallel for. The ownership checker enforces that loop bodies do not share mutable state.

parallel for y in range(height):
    for x in range(width):
        cx: float = (x - width / 2.0) / (width / 4.0)
        cy: float = (y - height / 2.0) / (height / 4.0)
        total = total + mandelbrot(cx, cy, max_iter)

Ownership annotations

Rust-inspired, opt-in. Eliminates refcount overhead on the hot path.

def consume(data: owned list[int]) -> int:   # caller's variable is invalidated
    return sum(data)

def analyze(data: ref list[int]) -> float:   # immutable borrow, zero-cost
    return sum(data) / len(data)

def normalize(data: mut ref list[float]):    # exclusive mutable borrow
    total = sum(data)
    for i in range(len(data)):
        data[i] /= total

`const` — compile-time constants

const MAX_ITER: int = 1000
const PI: float = 3.14159265358979

CLI Reference

# Compile to a .so shared library
python -m turbopython.cli compile examples/mandelbrot.tpy

# Compile with verbose output (shows generated C)
python -m turbopython.cli compile examples/mandelbrot.tpy --verbose

# Compile to a specific output directory
python -m turbopython.cli compile examples/mandelbrot.tpy -o build/

# Produce a standalone executable (requires @native def main() -> int)
python -m turbopython.cli compile benchmarks/hello.tpy --exe

# Specify a custom entry-point function name
python -m turbopython.cli compile benchmarks/hello.tpy --exe --entry run

# Inspect all compilation stages without producing output
python -m turbopython.cli inspect examples/vectors.tpy

inspect prints: original source, preprocessed Python, struct layouts, function signatures with inferred C types, and the full type environment. Useful for debugging codegen.

When --exe is passed, the compiler:

- Validates that the entry-point function (default: main) exists
- Renames it to __tp_main in the generated C to avoid clashing with C's main
- Appends an int main(int argc, char** argv) wrapper
- Compiles without -shared -fPIC, producing a native executable
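
The first three steps amount to a small text transformation on the generated C. The sketch below is a guess at its shape, not the code in codegen.py: `wrap_entry_point` is a hypothetical name, and the wrapper body (ignoring the arguments and returning the entry point's result as the exit code) is an assumption.

```python
import re


def wrap_entry_point(c_source: str, entry: str = "main") -> str:
    """Rename `entry` to __tp_main and append a C main() wrapper."""
    name = re.escape(entry)
    # Step 1: the entry-point function must exist.
    if not re.search(rf"\b{name}\s*\(", c_source):
        raise ValueError(f"entry point {entry!r} not found")
    # Step 2: rename it so it cannot clash with C's main.
    renamed = re.sub(rf"\b{name}\b", "__tp_main", c_source)
    # Step 3: append the wrapper; its body is an assumption of this sketch.
    wrapper = (
        "\nint main(int argc, char** argv) {\n"
        "    (void)argc; (void)argv;\n"
        "    return (int)__tp_main();\n"
        "}\n"
    )
    return renamed + wrapper


generated = "int64_t main(void) { return 0; }\n"
print(wrap_entry_point(generated))
```

With `--entry run`, the same transformation would be applied with `entry="run"`.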

`tython` — Drop-in Runner

tython is an executable at the repo root that acts as a Python-aware interpreter for both .py and .tpy files, with the import hook pre-installed.

# Compile and run a .tpy file — calls main() and uses its return as exit code
./tython examples/vectors.tpy
./tython benchmarks/hello.tpy

# Run a .py script — .tpy files on sys.path are importable by name
./tython myscript.py

# Inline command
./tython -c "import vectors; print(vectors.compute_total_distance(100))"

# Interactive REPL with import hook active
./tython

Inside a .py script run via tython, any .tpy file on sys.path imports transparently:

# myscript.py — no special setup needed when run via tython
import vectors
print(vectors.compute_total_distance(1000))

Import Hook

The import hook can also be used in any regular Python script without the tython runner:

from turbopython.importer import install
install()

import vectors   # finds vectors.tpy on sys.path, compiles to vectors.so
print(vectors.compute_total_distance(1000))

install() inserts a sys.meta_path finder that:

- Searches sys.path for <module>.tpy when an import cannot find a .py/.pyc
- Compiles the .tpy with the full TurboPython pipeline
- Wraps the resulting .so in a module object with argtypes/restype set automatically from the compiled type signatures
- Returns the module, which the caller uses like any normal Python module

The .so is written next to the .tpy file and reused on subsequent runs.
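
The finder mechanism itself is standard `importlib`. Below is a minimal sketch of a `sys.meta_path` finder that locates `<module>.tpy` on sys.path; it is not the code in turbopython/importer.py. The names `TpyFinder`, `TpyLoader`, and `demo` are hypothetical, and the compile step is replaced by a placeholder that only records the source path.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import tempfile
from pathlib import Path


class TpyLoader(importlib.abc.Loader):
    """Placeholder loader: records the source path instead of compiling it."""

    def __init__(self, source: Path) -> None:
        self.source = source

    def create_module(self, spec):
        return None  # fall back to Python's default module creation

    def exec_module(self, module) -> None:
        # A real hook would compile self.source and bind native symbols here.
        module.tpy_source = str(self.source)


class TpyFinder(importlib.abc.MetaPathFinder):
    """Look for <module>.tpy in every sys.path entry."""

    def find_spec(self, fullname, path=None, target=None):
        for entry in sys.path:
            candidate = Path(entry or ".") / f"{fullname}.tpy"
            if candidate.is_file():
                return importlib.machinery.ModuleSpec(fullname, TpyLoader(candidate))
        return None  # not ours; the import fails as usual


def demo() -> str:
    """Import a throwaway .tpy module through the sketch finder."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "tpy_sketch_demo.tpy").write_text("# placeholder source\n")
        finder = TpyFinder()
        sys.path.insert(0, tmp)
        sys.meta_path.append(finder)  # appended, so .py/.pyc lookups run first
        try:
            module = importlib.import_module("tpy_sketch_demo")
            return module.tpy_source
        finally:
            sys.meta_path.remove(finder)
            sys.path.remove(tmp)
            sys.modules.pop("tpy_sketch_demo", None)


print(demo())
```

Appending the finder to the end of `sys.meta_path` is what gives the "only when no .py/.pyc is found" behaviour: the built-in finders are consulted first.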

Examples

| File | Demonstrates |
|---|---|
| examples/mandelbrot.tpy | `@native`, typed arithmetic, `struct` |
| examples/vectors.tpy | `struct` value types, `@inline`, `@native`, `main` |
| examples/nbody.tpy | `parallel for`, struct arrays, `main` |
| benchmarks/hello.tpy | `int128`, fibonacci, `main` |
| benchmarks/hello.py | Pure Python equivalent of hello.tpy |
| benchmarks/bench_hello.py | Fibonacci benchmark vs pure Python |
| benchmarks/bench_mandelbrot.py | Mandelbrot benchmark vs pure Python |

Project Layout

TurboPython/
├── README.md
├── tython                     # Drop-in runner / interpreter
├── turbopython/
│   ├── __init__.py
│   ├── cli.py                 # Command-line interface
│   ├── compiler.py            # Pipeline driver
│   ├── preprocessor.py        # Stage 1: syntax → valid Python + metadata
│   ├── type_checker.py        # Stage 2: type resolution and validation
│   ├── ownership.py           # Stage 3: move/borrow checking
│   ├── codegen.py             # Stage 4: C code generation + GCC invocation
│   ├── importer.py            # sys.meta_path hook for transparent .tpy imports
│   └── test_compiler.py       # Test suite
├── examples/
│   ├── mandelbrot.tpy         # Mandelbrot fractal
│   ├── vectors.tpy            # 3D vector math
│   └── nbody.tpy              # N-body gravitational simulation
├── benchmarks/
│   ├── hello.tpy              # int128 fibonacci (compile to .so or executable)
│   ├── hello.py               # Pure Python equivalent
│   ├── bench_hello.py         # Side-by-side benchmark
│   └── bench_mandelbrot.py    # Mandelbrot benchmark
└── docs/
    ├── ARCHITECTURE.md        # Detailed pipeline design
    └── turbopython_syntax.md  # Full language reference

Type System

| TurboPython | C type | Range |
|---|---|---|
| `int` | `int64_t` | ±9.2 × 10¹⁸ |
| `int128` | `__int128` | ±1.7 × 10³⁸ |
| `float` | `double` | 64-bit IEEE 754 |
| `bool` | `int` | 0 / 1 |
| `str` | `const char*` | read-only C string |
| `array[float, N]` | `double*` | contiguous heap/stack |
| `struct Foo` | `Foo` (typedef'd struct) | stack-allocated value type |
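
When calling a compiled library by hand through `ctypes`, the scalar rows of this table translate directly into `argtypes` and `restype`. The sketch below is one way to write that mapping down; `CTYPES_BY_TPY_TYPE` and `ctypes_signature` are hypothetical names, not part of the TurboPython package. `int128` is left out because `ctypes` has no built-in 128-bit integer type.

```python
import ctypes

# TurboPython type name -> ctypes type, following the table above.
CTYPES_BY_TPY_TYPE = {
    "int": ctypes.c_int64,     # int64_t
    "float": ctypes.c_double,  # double
    "bool": ctypes.c_int,      # int, 0 / 1
    "str": ctypes.c_char_p,    # const char*
}


def ctypes_signature(params: list[str], returns: str):
    """Translate TurboPython type names into (argtypes, restype)."""
    def lookup(name: str):
        try:
            return CTYPES_BY_TPY_TYPE[name]
        except KeyError:
            # e.g. int128, which has no ctypes counterpart
            raise TypeError(f"no ctypes mapping for {name!r}") from None
    return [lookup(p) for p in params], lookup(returns)


# Signature of fib(n: int) -> int from the Quick Start.
argtypes, restype = ctypes_signature(["int"], "int")
print(argtypes, restype)
```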

What Stays Standard Python

- Indentation-based blocks
- def, class, for, if, while, with, return, yield
- List/dict/set comprehensions
- Standard library imports
- Unannotated functions run as normal CPython

The philosophy: opt in to performance where it matters, keep everything else as dynamic and expressive as Python.
