Spike plan: split builtin Pointer into Pointer + UnsafePointer (#4925) #5427
Replies: 2 comments
-
Spike results so far
The compiler + standard-library half of the spike works.
Compiler (small, as predicted): a new
Standard library migrated and passing (501 stdlib + 42 files + 13 net tests green, plus a new direct
The model is closed: provenance is permanent, copy is the only bridge
There is no conversion between the two pointer types in either direction, and that is the point:
Because there is no
Migrating libponyc-wrapping tools (and ssl)
ssl carries the same two cases: OpenSSL handles (
Status / next
-
Status snapshot
Checkpoints pushed for both repos so nothing is lost while we decide how to proceed.
Everything compiles and all tests pass
Zero failures.
What's migrated
API surface review (done)
Outstanding / polish
Settled (for the record)
Provenance is permanent; the only bridge is
Future work (separate from this spike)
A runtime FFI sanitizer mode, opt-in the way a sanitizer build is. It checks any FFI call that passes a
-
This is the plan for a spike to test the Pointer/UnsafePointer split proposed in #4925. It is a proof-of-feasibility, not a merge-ready change: the goal is to show the split works end-to-end (compiler + standard library + ponylang/ssl all compile and pass), then open a draft PR for review and refinement. Not merging short-term.
Goal
Introduce a new builtin type UnsafePointer[A] alongside Pointer[A]. The two have identical runtime/codegen behavior but different static semantics, enforced by the type system (which signatures accept which). This gives FFI authors a way to represent two genuinely different pointer provenances instead of overloading one Pointer for both.
The semantic model
- Pointer[A] (safe): backed by Pony-managed memory. Source: .cpointer()/.cstring() on a Pony Array/String, Pointer[A]._alloc, addressof, or an FFI function that returns/fills pony_alloc'd memory. It is the only kind that may be used or adopted in place without copying (an Array/String backing store, or a network read buffer C fills and Pony then reads directly). Passed into FFI with the capability matching the real C semantics (val/box if C only reads, ref if C writes).
- UnsafePointer[A]: NOT backed by Pony's guarantees. Two sub-cases: (a) trust-asserted raw foreign memory; (b) an opaque foreign handle that came from FFI and is later passed back into FFI (FILE*, DIR*, addrinfo*, SSL*), often wrapped in a class with a finalizer. The ONLY way to land its data in a safe Pony object is to copy (e.g. String.copy_cstring, which allocates fresh memory and copies). Never adopted in place; should never be exposed in a public interface.
Distinguishing test for from-FFI pointers: does the code adopt the pointer as a backing store without a copy (→ Pointer, source must be pony_alloc'd) or copy the bytes into freshly Pony-allocated memory (→ source may be UnsafePointer)? Example: String.from_cstring does _ptr = str (adopt, no copy) → takes Pointer. String.copy_cpointer copies into a fresh allocation → takes UnsafePointer.
Subtyping: distinct, non-interchangeable. No implicit coercion either direction. This is what makes the eventual migration a breaking change.
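
The adopt-versus-copy test above can be illustrated on the C side of the boundary. This is only a sketch under stated assumptions: toy_string, toy_adopt and toy_copy are hypothetical names invented here, not Pony runtime or stdlib APIs, and plain malloc stands in for a Pony-managed allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Pony String: a backing pointer plus a length. */
typedef struct {
  char *data;
  size_t len;
} toy_string;

/* Adopt: the string aliases the caller's buffer and no copy is made.
 * This is only sound when the buffer already lives under the same
 * allocator and lifetime rules as the string (the Pointer case). */
static toy_string toy_adopt(char *buf, size_t len) {
  assert(buf != NULL);
  toy_string s = { buf, len };
  return s;
}

/* Copy: allocate fresh memory and copy the bytes in. The source can be
 * any readable foreign memory (the UnsafePointer case), because the
 * result never aliases it. */
static toy_string toy_copy(const char *src, size_t len) {
  assert(src != NULL);
  toy_string s = { malloc(len + 1), len };
  assert(s.data != NULL);
  memcpy(s.data, src, len);
  s.data[len] = '\0';
  return s;
}
```

An adopted string observes every later write to the original buffer, while a copied one does not. That difference is why only the copying path can safely accept memory whose provenance is unknown.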
Key decision: void* mapping
A C void* parameter (declared Pointer[None] in Pony today) carries no provenance information of its own, so we map it to whichever raw pointer the Pony signature actually declares; the two kinds do not interchange across a void* boundary. Concretely, void_star_param enforces a kind match: a Pointer[None] param accepts Pointer[A]/NullablePointer[A]/USize; an UnsafePointer[None] param accepts UnsafePointer[A]/USize. This keeps the distinction honest at the FFI boundary and makes FFI declarations self-documenting about provenance.
Compiler approach (small)
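
The void* kind match, and the single combined predicate used everywhere else, can be modeled in a few lines of C. This is a toy model, not ponyc source: toy_kind, toy_is_raw_pointer and toy_void_star_accepts are hypothetical names, and the real checks operate on the compiler's AST rather than an enum. The acceptance table follows the lists given in this post.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy classification of a parameter or argument type (not ponyc's AST). */
typedef enum {
  KIND_POINTER,          /* Pointer[A] */
  KIND_NULLABLE_POINTER, /* NullablePointer[A] */
  KIND_UNSAFE_POINTER,   /* UnsafePointer[A] */
  KIND_USIZE,            /* USize */
  KIND_OTHER
} toy_kind;

/* Combined predicate: "this is a raw pointer, treat it as opaque".
 * Most sites only need this and never distinguish the two kinds. */
static bool toy_is_raw_pointer(toy_kind k) {
  return (k == KIND_POINTER) || (k == KIND_UNSAFE_POINTER);
}

/* Kind match for a void*-style parameter. `param` is the declared
 * parameter kind (Pointer[None] or UnsafePointer[None]); `arg` is the
 * kind of the value passed. This is the one check that keeps the two
 * pointer kinds apart. */
static bool toy_void_star_accepts(toy_kind param, toy_kind arg) {
  assert(toy_is_raw_pointer(param));
  if (arg == KIND_USIZE)
    return true; /* USize is accepted by both kinds */
  if (param == KIND_POINTER)
    return (arg == KIND_POINTER) || (arg == KIND_NULLABLE_POINTER);
  return arg == KIND_UNSAFE_POINTER;
}
```

In this model a Pointer argument passed to an UnsafePointer[None] parameter is rejected, and so is the reverse, which is the "no interchange across a void* boundary" rule.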
Because the split does not change codegen (identical caps, identical lowering), almost no compiler site needs to tell the two types apart:
- genprim_pointer_methods resolves each intrinsic by method name on the reach type it is handed, so UnsafePointer reuses the same generator verbatim.
- gentype.c: the use/mem-type assignment, and the intrinsic-method dispatch); both gain an || str_UnsafePointer.
- is_pointer() gates all mean "this is a raw pointer, treat it as opaque." They move to a combined is_raw_pointer() helper (= is_pointer || is_unsafe_pointer).
- void_star_param, which keeps is_pointer/is_unsafe_pointer separate to enforce the kind match above.
- addressof always yields Pointer; no mirror needed there.
Phases
- UnsafePointer as a recognized raw-pointer type (packages/builtin/unsafe_pointer.pony, the three predicates, the two codegen spots, the void* kind-match). Non-breaking: UnsafePointer exists but is unused. Committed test exercises its intrinsics.
- builtin, files, net, process) and retype the UnsafePointer-bucket sites (opaque foreign handles + copy sources, plus the void* FFI-declaration params in lockstep). This is where the breaking change lands.
The UnsafePointer near-clone of Pointer has one deliberate exception: _copy_to's destination stays typed Pointer, because copying is precisely how unsafe data is brought into Pony-managed memory.
Divergences / breaking changes
- Pointer to a stdlib API we retype to UnsafePointer (notably String.copy_cpointer/copy_cstring), or that stores/pattern-matches the retyped FFI return values, will stop compiling. Intended (it is the safety win), but it is an ecosystem migration. The stdlib + ssl pass measures that cost.
Out of scope for the spike (follow-up once the design is settled)
- UnsafePointer in public interfaces (it must unfold type aliases, mirroring is_literal, or type Foo is UnsafePointer[X] evades it).
Open questions to resolve during review
- env.argv, the pony_ctx handle, the Windows find-data allocation, hash-block Pointer[None] params).
- Pointer and UnsafePointer should later diverge in API surface.
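
On the out-of-scope item about UnsafePointer in public interfaces: the reason such a check has to unfold type aliases can be shown with a toy model. This is a sketch only. toy_type, toy_unfold and toy_is_unsafe_pointer are hypothetical names, the alias chain is assumed to be acyclic, and type arguments and nested types are ignored. It is not how ponyc represents types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy type node: a nominal name, plus the aliased type when the node
 * stands for a declaration like `type Foo is UnsafePointer[X]`. */
typedef struct toy_type {
  const char *name;
  const struct toy_type *alias_of; /* NULL for a non-alias type */
} toy_type;

/* Follow the alias chain to the underlying nominal type.
 * Assumes the chain is acyclic. */
static const toy_type *toy_unfold(const toy_type *t) {
  assert(t != NULL);
  while (t->alias_of != NULL)
    t = t->alias_of;
  return t;
}

/* Check the unfolded type, not the spelled name, so an alias cannot
 * hide an UnsafePointer from the check. */
static bool toy_is_unsafe_pointer(const toy_type *t) {
  return strcmp(toy_unfold(t)->name, "UnsafePointer") == 0;
}
```

A check that compared only the spelled name would see "Foo" and let it through; unfolding first makes the alias and the type it names indistinguishable to the check.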