Compiling C and C++ shims alongside Pony #5390

SeanTAllen · 2026-05-29T23:27:43Z

SeanTAllen
May 29, 2026
Maintainer

What I want to make easy

A lot of the time, calling a C library from Pony needs no C at all. You write the FFI declarations, you call straight into the library, and you're done. But not always. Sometimes the library doesn't map cleanly onto Pony's FFI. Sometimes the function you need is really a macro. Sometimes you have to build up a struct the way the library expects before you can hand it over. In those spots a little C smooths it out. A shim. A handful of functions that sit between Pony and the library and make the Pony side pleasant.
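
To make the shim case concrete, here is a minimal sketch of what one can look like. The library pieces (`DEMO_MAKE_VERSION`, `demo_options`) are hypothetical stand-ins defined inline so the example compiles on its own; a real shim would include the library's header instead. The shim function names are also made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for what a real library header would provide. */
#define DEMO_MAKE_VERSION(major, minor) (((major) << 16) | ((minor) & 0xFFFF))

typedef struct demo_options {
  uint32_t version;
  int32_t flags;
  char name[32];
} demo_options;

/* A macro has no symbol to call through FFI, so the shim wraps it in a
 * real function. On the Pony side this could be declared roughly as:
 *   use @shim_make_version[U32](major: U32, minor: U32)
 */
uint32_t shim_make_version(uint32_t major, uint32_t minor)
{
  return DEMO_MAKE_VERSION(major, minor);
}

/* Build up the struct the way the (hypothetical) library expects, so the
 * Pony side only has to pass a pointer and a name. */
void shim_options_init(demo_options* opts, const char* name)
{
  assert(opts != NULL);
  assert(name != NULL);

  memset(opts, 0, sizeof(*opts));
  opts->version = shim_make_version(1, 2);
  /* The memset above guarantees the final byte stays NUL. */
  strncpy(opts->name, name, sizeof(opts->name) - 1);
}
```

Two small functions like these are the whole job of a shim: everything awkward stays in C, and the Pony side sees plain functions.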

This is about that case. When a shim would make things better, I want it to be easy.

Today it isn't. You compile the C yourself, with your own compiler and your own flags, into a library, and then you point Pony at the result with use "lib:..." and use "path:...". Pony never touches the C. It only links the object that fell out the other end. So every project that reaches for a shim grows a second build step that lives outside Pony, and every contributor who checks out that project has to know about it.

I want that glue to just compile. You drop a .c next to your .pony, and when ponyc builds the package, it builds the C too and links it in. No second build system. No instructions in the README about running make first.

This is a small thing, and it should stay small. It's for shims and the C that travels with a package. It is not "use Pony as a C compiler." If you have a real C project, with its own build graph and code generation and a dozen translation units leaning on each other, you should build that with the tool made for it and link the result the way you already can. More on where that ambition lives further down.

How it would work

ponyc already vendors all of LLVM. The whole monorepo is sitting right there in the submodule, clang included. We don't build clang today; we only turn on LLD, which ponyc embeds and calls directly to link every program it produces. Turning clang on is a one-line change to which LLVM projects we build.

Once clang is in the build, ponyc can compile C in process. It builds a CompilerInvocation from an argument list, runs an emit-object action, and gets back an object file produced against the same LLVM that just compiled the Pony. Same target, same ABI, by construction. Those objects join the pile of objects ponyc already hands to the linker. The link step barely changes; it already takes a pile of objects and libraries and turns them into a binary.

Discovery is a convention. Any .c, .cpp, or .cc sitting in a package directory next to the .pony files gets compiled as part of that package. The extension sets the language. Nothing to enumerate, nothing to register. The C travels with the package the same way the Pony does, so when corral pulls a dependency that ships a shim, the shim comes along and builds with no extra work.
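
The extension rule is small enough to sketch directly. This is an illustration of the convention only, not ponyc code; `shim_lang` and `shim_lang_for_file` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

typedef enum shim_lang {
  SHIM_LANG_NONE, /* not a file this convention picks up */
  SHIM_LANG_C,
  SHIM_LANG_CXX
} shim_lang;

/* Decide the source language from the file extension alone. */
shim_lang shim_lang_for_file(const char* filename)
{
  assert(filename != NULL);

  const char* dot = strrchr(filename, '.');
  if(dot == NULL)
    return SHIM_LANG_NONE;

  if(strcmp(dot, ".c") == 0)
    return SHIM_LANG_C;

  if((strcmp(dot, ".cpp") == 0) || (strcmp(dot, ".cc") == 0))
    return SHIM_LANG_CXX;

  return SHIM_LANG_NONE;
}
```

A directory scan would call this on each entry in the package directory and compile whatever does not come back as `SHIM_LANG_NONE`.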

Where the flags come from

A C file almost never compiles on convention alone. It needs to find its headers. It needs a define or two. It needs a particular C standard. So we need a way to say those things, and the natural place to say them is the same place Pony already says "link this library" and "look for it here": a use directive.

There's a real decision hiding under that, and it's most of the design. Pony's use "lib:..." and use "path:..." collect into one global pile for the whole program. That's right for linking. Every library gets handed to the linker once, at the end, and it doesn't matter which package asked for it. Compiling C is the opposite. The headers package A needs aren't the headers package B needs. If A defines a macro to one value and B defines the same macro to another, those two facts can't share a bag. So the flags for compiling a package's C have to belong to that package, not to the program as a whole. That's the one real departure from how the existing use directives behave.
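
One way to picture the scoping difference is as a data model: each package owns its own list of defines, and a lookup always goes through a package. The sketch below is only that picture, with hypothetical type and function names; it says nothing about how ponyc would actually store these.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct cdefine {
  const char* name;
  const char* value;
} cdefine;

/* Compile flags belong to one package, not to the whole program. */
typedef struct package_cflags {
  const char* package;
  const cdefine* defines;
  size_t define_count;
} package_cflags;

/* Look up a define as seen by one package; NULL if that package never set it. */
const char* package_define(const package_cflags* pkg, const char* name)
{
  assert(pkg != NULL);
  assert(name != NULL);

  for(size_t i = 0; i < pkg->define_count; i++)
  {
    if(strcmp(pkg->defines[i].name, name) == 0)
      return pkg->defines[i].value;
  }

  return NULL;
}
```

With this shape, package A can hold `FOO=1` while package B holds `FOO=2`, and neither sees the other's value. A single global list could not represent both.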

For the directives themselves, I'd keep one idea per line, the way lib: and path: already split the work:

use "cinclude:./vendor/include" for a header search path
use "cdefine:FOO=1" for a preprocessor define
use "cstd:c11" for the language standard
use "cflag:..." as an escape hatch for the odd flag with no home of its own
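
As a rough sketch of how the four schemes could map onto compiler arguments: `-I`, `-D`, and `-std=` are the usual clang spellings, while the function name and the table are assumptions made for illustration. Path resolution for `cinclude:` is deliberately left out here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Translate one directive string into one compiler argument.
 * Returns 0 on success, -1 if the scheme is not a C-compile scheme
 * or the output buffer is too small. */
int cdirective_to_flag(const char* spec, char* out, size_t out_len)
{
  static const struct {
    const char* scheme;
    const char* prefix;
  } table[] = {
    { "cinclude:", "-I" },
    { "cdefine:", "-D" },
    { "cstd:", "-std=" },
    { "cflag:", "" }, /* escape hatch: passed through unchanged */
  };

  assert(spec != NULL);
  assert(out != NULL);

  for(size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
  {
    size_t n = strlen(table[i].scheme);

    if(strncmp(spec, table[i].scheme, n) == 0)
    {
      int written = snprintf(out, out_len, "%s%s", table[i].prefix, spec + n);
      return ((written >= 0) && ((size_t)written < out_len)) ? 0 : -1;
    }
  }

  return -1;
}
```

Anything that is not one of the four schemes, such as `lib:` or `path:`, is rejected here and would keep going through the existing link-time handling.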

Naming the schemes instead of taking a raw flag string buys something concrete. use "path:..." already resolves a relative path against the package's own directory, so the library gets found no matter where you run ponyc from. cinclude: should do the same. use "cinclude:./include" then means "the include directory next to this source," and it stays true regardless of the directory clang runs in. A raw -I./include can't promise that.
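
The resolution step could look something like the following. It is a simplified sketch that assumes POSIX-style paths and does no normalization beyond dropping a leading `./`; `resolve_cinclude` is a hypothetical name, not an existing ponyc function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Resolve a cinclude: path against the directory of the package that
 * declared it. Absolute paths are kept as they are.
 * Returns 0 on success, -1 if the result does not fit in out. */
int resolve_cinclude(const char* package_dir, const char* path,
  char* out, size_t out_len)
{
  assert(package_dir != NULL);
  assert(path != NULL);
  assert(out != NULL);

  int written;

  if(path[0] == '/')
  {
    written = snprintf(out, out_len, "%s", path);
  } else {
    /* Drop a leading "./" so the joined path stays tidy. */
    if(strncmp(path, "./", 2) == 0)
      path += 2;

    written = snprintf(out, out_len, "%s/%s", package_dir, path);
  }

  return ((written >= 0) && ((size_t)written < out_len)) ? 0 : -1;
}
```

Because the result is anchored to the package directory, it does not change with the directory ponyc happens to be run from.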

And because a use can carry an if guard today, platform conditioning comes for free. use "cinclude:/opt/homebrew/include" if macosx already parses and already evaluates. We let the new schemes take a guard and we're done. Cross-platform C glue, where the include paths and defines differ by operating system, is the case where that matters most, and we don't build anything to get it.

The flags that have to match the Pony side aren't negotiable and don't belong to the user at all. The target triple, the CPU, the features, position-independent code, the optimization level, debug info: ponyc already has all of these, because it just used them to compile the Pony. It feeds the same values to clang. A user who could override them could produce an object that won't link, or worse, one that links and then misbehaves. Those come from the compiler, not the directive.

C and C++ share one set of flags to start. The extension still sets the language, so a .cpp compiles as C++ and a .c as C, but they draw from the same cstd: and cinclude: and the rest. If it turns out people genuinely need different standards for C and C++ in the same package, we split the family then. Easier to add the split later than to walk back two parallel families nobody needed.

The alternative: a Pony build system

The other way to solve this is bigger. Give Pony its own build system, a build file you write, the way Zig has build.zig. A real description of how to build the native pieces of your project, with whatever logic that takes. That's also where "use Pony as a C compiler" would live, because once you have arbitrary build logic, building a real C project from inside Pony is on the table.

It's overkill for what I'm after. It cuts against how Pony works today, where this kind of information lives in the source through use directives rather than off in a separate build file, and it's far more than the shim problem needs. If we ever want a full build system, that's its own conversation, and these directives don't stand in its way.

Open questions

A few things I don't have answers for yet.

Caching. ponyc would compile these C files on every build. An external build step, the kind this proposal replaces, normally rebuilds only what changed. Do we cache the compiled objects and skip the ones whose source and flags haven't moved, and if so, where do those objects live and how do we key the cache? Get it wrong and you get either stale objects or slow builds.
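
If caching does turn out to be worth having, one possible key is a hash over the source text plus every flag that feeds the compile. The sketch below uses 64-bit FNV-1a purely as an illustration; the function names are hypothetical, and a real key would presumably also need to cover the compiler-owned settings (target, optimization level, and so on) and the contents of any included headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit FNV-1a, continuing from an existing hash state. */
static uint64_t fnv1a(uint64_t hash, const void* data, size_t len)
{
  const unsigned char* p = (const unsigned char*)data;

  for(size_t i = 0; i < len; i++)
  {
    hash ^= p[i];
    hash *= 1099511628211ULL; /* FNV prime */
  }

  return hash;
}

/* Key a compiled object on its source text and its flags, in order.
 * Each string is hashed with its terminating NUL so that adjacent
 * strings cannot run together ("ab","c" vs "a","bc"). */
uint64_t shim_cache_key(const char* source, const char* const* flags,
  size_t flag_count)
{
  assert(source != NULL);
  assert((flags != NULL) || (flag_count == 0));

  uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
  h = fnv1a(h, source, strlen(source) + 1);

  for(size_t i = 0; i < flag_count; i++)
    h = fnv1a(h, flags[i], strlen(flags[i]) + 1);

  return h;
}
```

An object would be reused only when the key matches. This sketch does not answer where the objects live, and it can still go stale when a header changes without the source changing.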

Errors. When clang can't compile the C, the person staring at the failure is sitting in front of a Pony compiler. The error has to come out in a way that fits the rest of ponyc's diagnostics rather than dumping raw clang output and leaving them to work out which file it came from. How much of clang's diagnostics we route through Pony's, and how much we pass straight through, is an open call.

Compile-error tests. ponyc has a whole category of tests that assert a given program fails to compile with a given error. C that fails to compile is a new kind of compile failure. Whether it fits that existing machinery, and how, needs working out.

SeanTAllen · 2026-05-29T23:29:01Z

SeanTAllen
May 29, 2026
Maintainer Author

My initial thought on caching is: "that's a build system concern" and "given the stated goal it shouldn't be too expensive to recompile the shims every time".


SeanTAllen · 2026-06-12T00:37:51Z

SeanTAllen
Jun 12, 2026
Maintainer Author

A few corrections after reading the actual code, mostly to the "one-line change" and the header story.

Turning clang on in the LLVM build really is one line: another entry in LLVM_ENABLE_PROJECTS next to lld. Using it is not. Today ponyc links only LLD's static libraries. Calling clang in process means linking clang's frontend, codegen, driver, sema, and the rest into ponyc too, plus paying for clang in make libs build time and in the size of the ponyc binary. The one line is where the build work starts, not where it ends.

The header story is better than I first thought. I worried that compiling C would force us to reconstruct the compiler driver's include-path logic, since building a bare CompilerInvocation skips the driver. It would, but that is the same job we already took on for the linker. When we embedded LLD we stopped shelling out to cc and took over what the driver used to do: locating the dynamic linker, the libc startup objects, and the system library directories, per platform, across sysroots. That code lives in src/libponyc/codegen/genexe.cc today (find_libc_crt_dir and its neighbors). System include paths are the mirror image, and they hang off the same sysroot detection. Bounded, known work.

One piece is genuinely new. Clang's builtin headers (stddef.h, stdarg.h, stdbool.h, and friends) ship in clang's resource directory, not in the system. Nothing on the link side needs them, so there's no existing code to lean on. We'd ship that directory with ponyc and point clang at it. Without it, a shim that includes <stddef.h> won't compile, and almost every shim includes something from there.
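
To see how quickly a shim reaches for those headers, consider a small made-up example. Three of its four includes are compiler builtin headers; only `assert.h` comes from the system libc. The function itself is hypothetical and exists only to use a type from each header.

```c
#include <assert.h>  /* system libc header */
#include <stdarg.h>  /* compiler builtin header: va_list */
#include <stdbool.h> /* compiler builtin header: bool */
#include <stddef.h>  /* compiler builtin header: size_t */

/* Returns true when every one of the `count` int arguments is nonzero. */
bool shim_all_nonzero(size_t count, ...)
{
  assert(count > 0);

  va_list args;
  bool all = true;

  va_start(args, count);

  for(size_t i = 0; i < count; i++)
  {
    if(va_arg(args, int) == 0)
      all = false; /* keep consuming arguments so va_end is always reached */
  }

  va_end(args);
  return all;
}
```

Without the resource directory on the include path, none of `size_t`, `bool`, or `va_list` would be available to this file.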

None of this moves the shape of the proposal. It moves "turn clang on" from trivial to a contained piece of build work, and it confirms the header search is reachable by reusing what the embedded linker already does.


SeanTAllen · 2026-06-12T03:00:53Z

SeanTAllen
Jun 12, 2026
Maintainer Author

The v1 implementation plan is written up in #5468 — C-only to start, grounded in the actual code: the use directive schemes (cinclude:/cdefine:), .c discovery, where the compile pass slots into the pass pipeline (before reach), the clang invocation reusing the same target settings as the Pony, and the two-stage rollout (build clang first, then the feature).

