
v0.12.3: spec 1704 phases 1-8 + final gate (#544 closed, #543/#542/#510 closed) #26

Merged
tamnd merged 36 commits into main from feat/v0.12.3-spec-1704-phase2
May 13, 2026

Conversation

@tamnd
Owner

@tamnd tamnd commented May 12, 2026

Summary

This PR closes spec 1704 phases 1 through 8 end-to-end and lands the full Final gate. import re + re.match + import fnmatch + import enum all run end-to-end on bare gopy.

What landed across the eight phases:

  • Phase 1 Objects/object.c full port: every row in object_methods + object_getsets lands on objectType, including __init_subclass__ / __subclasshook__ wrapped through NewClassMethod.
  • Phase 2 PyClassMethod_Type: cm_init, cm_descr_get, cm_repr, cm_memberlist, cm_getsetlist, cm_methodlist all match CPython 3.14, including __func__, __wrapped__, __isabstractmethod__, __dict__, __annotations__, __annotate__, __class_getitem__. functools_wraps propagates __module__/__name__/__qualname__/__doc__ from the wrapped callable.
  • Phase 3 PyStaticMethod_Type: full sm_* port including sm_call (CPython 3.10+ direct-call surface).
  • Phase 4 PyFunction_Type: func_* block including the new annotate slot and the descriptor get path.
  • Phase 5 PyMethod_Type (Objects/classobject.c): bound-method richcompare, hash, repr.
  • Phase 6 type_new pipeline: every type_new_* step including the PEP 487 __set_name__ and __init_subclass__ passes.
  • Phase 7 inherit_slots: every slot edge audited; built-in subclasses now inherit the full slot table.
  • Phase 8 name-op trio: STORE_NAME / LOAD_NAME / DELETE_NAME route through the namespace's dunders when the scope is a dict subclass.

Final gate, all rows pass:

  • F.0a, F.0b: hash() of a class with a user metaclass returns a finite, stable value. Pinned: stdlibinit/slot_method_lookup_test.go.
  • F.1, F.2: import enum + StrEnum + iter. Pinned: stdlibinit/enum_import_test.go. Closes #544.
  • F.3: re.match end-to-end. Pinned: stdlibinit/re_match_smoke_test.go. Closes #510.
  • F.4: fnmatch.fnmatch returns True. Pinned: stdlibinit/fnmatch_smoke_import_test.go. Closes #542.
  • F.5: spec 1703 phase 7 + 8 + Final gate rows flipped to done.
  • F.6: spec 1702 picks up an enum row and the re / _sre row flips from pending to done.

What it took to clear F.3 (the last sticky one):

  1. Full bytes/bytearray methodlists wired as descriptors. Objects/bytesobject.c bytes_methods (~40 entries) and Objects/bytearrayobject.c bytearray_methods (8 mutation entries plus shared find/replace/strip/etc.). One closure per method on both types; the closure inspects args[0] so the same code returns *Bytes on bytes and *ByteArray on bytearray.
  2. Mappingproxy methodlist. Objects/descrobject.c mappingproxy_methods was unwired, so cls.__members__ didn't carry keys. Wired get / keys / values / items / copy / __reversed__ as delegations onto the wrapped mapping.
  3. dict.update keys() fast path. The previous implementation only handled *Dict source then silently fell through. CPython's PyDict_Merge does dict fast path → keys() path → pairs path. Without (2) and (3), @enum.global_enum's sys.modules[cls.__module__].__dict__.update(cls.__members__) quietly did nothing and re/__init__.py blew up with NameError: DEBUG.
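
The merge order in (3) can be sketched in plain Python. This is only an illustration of the CPython behavior the port targets, not gopy code; `KeysOnly` is a hypothetical source type and the flag values are placeholders.

```python
from types import MappingProxyType


class KeysOnly:
    """Hypothetical non-dict source: only keys() and __getitem__."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]


namespace = {}
# Not a dict, but it has keys(): update() walks keys() and indexes each one.
namespace.update(KeysOnly({"DEBUG": 128}))
# A mappingproxy takes the same path, which is why it must expose keys().
members = MappingProxyType({"IGNORECASE": 2})
namespace.update(members)
# No keys() at all: update() treats the argument as an iterable of pairs.
namespace.update([("VERBOSE", 64)])
```

If the keys() path is missing, the second and third styles of source are the ones that silently contribute nothing.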

Side fixes uncovered while debugging the import chain (slot lookups via lookupMethodOnSelf instead of GetAttr, objectHashDescr short-circuit, __new__ exposed in type.__dict__, KeyError sentinel) are documented in the spec's "Side fixes shipped while debugging enum import" table at website/docs/specs/1700/1704_object_protocol_full_port.md.

Also bumped actions/setup-python to v6 to clear the Node.js 20 deprecation warning across the CI matrix.

Test plan

  • go test ./... green
  • All phase gates pinned in their respective objects/*_test.go and vm/*_test.go
  • Final gate F.0a/F.0b/F.1/F.2/F.3/F.4 pinned in stdlibinit/
  • Spec 1703 + 1702 + 1704 status tables in sync with the code
  • CI green across ubuntu / macos / windows

tamnd added 4 commits May 12, 2026 19:48
Port the rest of Objects/funcobject.c's PyClassMethod_Type so a
@classmethod-decorated callable surfaces the same attributes CPython
exposes:

- __func__ and __wrapped__ both read cm_callable as read-only getsets
  (CPython exposes these via PyMemberDef; gopy collapses them onto the
  GetSetDescr machinery already used everywhere else).
- __isabstractmethod__ reflects _PyObject_IsAbstract on the wrapped
  callable: read __isabstractmethod__ off the callable, coerce to bool.
- __dict__ is now a real per-instance slot (cmDict) with a generic
  set/delete pair.
- __annotations__ and __annotate__ forward to the wrapped callable via
  descriptor_get_wrapped_attribute / descriptor_set_wrapped_attribute,
  caching the resolved value on cmDict the way CPython does.
- __class_getitem__ is bound via the existing bindClassGetitem helper.

cm_repr also catches up: it now produces <classmethod(REPR_OF_CALLABLE)>
instead of the placeholder <classmethod object> so debugging a decorated
function actually shows what was decorated.

cmInit runs functools_wraps over (__module__, __name__, __qualname__,
__doc__), and a small classMethodGetattro layers cmDict on top of the
generic attribute lookup so those copied attrs round-trip without each
one needing a dedicated getset.

Phase 2 gates from spec 1704 pass: __func__.__name__, __wrapped__ is
__func__, and __isabstractmethod__ False. The spec table on the website
is bumped to "done" and citations are refreshed against CPython 3.14
line numbers (the prior table referenced pre-3.13 offsets).
Same pattern as classmethod: sm_call so the descriptor is directly
callable, sm_repr matches <staticmethod(REPR)>, and the getsets surface
__func__, __wrapped__, __isabstractmethod__, __dict__, __annotations__,
__annotate__ plus __class_getitem__. Constructor runs functools_wraps
so wrapped attrs flow through.

The traverse test relaxes from len==1 to len>=1 because sm_dict is now
visited once functools_wraps populates it.

Also fixes the lint issues that broke PR #26 CI: nilerr on the two
IsAbstract helpers (use the //nolint:nilerr // reason form so nolintlint
is happy) and a centralises -> centralizes spelling fix.
The two leftovers were func_repr and func_traverse. Repr now matches
CPython's <function QUALNAME at 0xPTR> shape so inspect / traceback
output looks the same when two functions share a qualname. Traverse
visits every Object slot in the same order as CPython's func_traverse
so the cycle collector sees the full reference graph.

Gate 4.3 (repr.startswith("<function f at")) passes via gopy -c.
method_repr now matches CPython's <bound method QUALNAME of REPR>
shape, with the qualname-or-name fallback so builtins land on "?".
method_richcompare and method_hash port over: equality is pairwise
on (im_func, im_self), and hash mixes identity(self) with hash(func)
so two bindings off equal-but-distinct instances still hash apart.
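
A short Python sketch of the comparison rule, assuming CPython 3.8+ semantics where the bound instance is compared by identity; `Box` is a hypothetical class used only for illustration.

```python
class Box:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)

    def get(self):
        return self.value


a, b = Box(1), Box(1)

# Same function, same instance: the two bound methods compare equal.
same_binding = a.get == a.get
# Same function, equal but distinct instances: not equal.
cross_binding = a.get == b.get
text = repr(a.get)
```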

While wiring gate 5.5 I tripped a pre-existing recursion in the
object.__repr__ slot wrapper: it called Repr(self) which routed
back through slot_tp_repr -> __repr__ -> the same wrapper. Fixed by
calling objectRepr / objectStr directly, matching CPython's
PyUnicode_FromFormat path. object.__repr__(C()) and friends now
return the default text instead of overflowing the stack.

All five Phase 5 gates pass via gopy -c.
@tamnd
Owner Author

tamnd commented May 12, 2026

Phases 3, 4, and 5 stacked onto this branch since the original cut. Staticmethod block is closed: sm_call so the descriptor itself is callable, the getsets surface (__func__, __wrapped__, __isabstractmethod__, __dict__, __annotations__, __annotate__, __class_getitem__), sm_repr matches <staticmethod(REPR)>, and the constructor runs functools_wraps so __name__/__module__/__qualname__/__doc__ flow through from the wrapped callable.

Function block had two leftovers, func_repr and func_traverse. Repr now matches <function QUALNAME at 0xPTR> so inspect / traceback output looks right when two functions share a qualname (lambdas, dynamically generated wrappers). Traverse visits every Object slot in CPython's order so a future cycle collector sees the full reference graph.

PyMethod_Type also done. method_repr matches <bound method QUALNAME of REPR> with the qualname-or-name-or-?-fallback. method_richcompare ports == / != as pairwise on (im_func, im_self). method_hash mixes identity(self) with hash(func) so two bindings off equal-but-distinct instances still hash apart.

While wiring gate 5.5 I tripped a pre-existing recursion: object.__repr__'s slot wrapper was calling Repr(self) which routed back through slot_tp_repr to the same wrapper, blowing the stack. Fixed by going straight to objectRepr / objectStr inside the wrapper, matching CPython's direct PyUnicode_FromFormat path. repr(C()) for a user class works now.

All gates 3.1-3.3, 4.1-4.3, and 5.1-5.5 pass via gopy -c. Local go test ./... and golangci-lint run ./... both green. Watching CI on the latest push.

Next up is Phase 6 (typeobject.c type_new pipeline) and Phase 7 (inherit_slots audit). Going to keep stacking onto this branch unless it gets unwieldy.

Three rows of the type_new pipeline land here:

- `Type.Qualname` plus a real `__qualname__` getset that no longer
  shadows `__name__`. `copyNamespaceToType` pulls the body's
  `__qualname__` into the field, so `class C: class D: pass` now
  reports `C.D.__qualname__ == 'C.D'` instead of just `D`.

- The class-body compiler stamps `__doc__` from the leading bare
  string literal. The store lands after the qualname store so the
  existing consts[0] qualname slot is untouched; the new
  `extractDocstring` helper avoids the conflict with the function-
  level `consumeDocstring` that pins to consts[0].

- `__init_subclass__` now receives the class-creation kwargs. The
  path runs from `__build_class__` -> `typeMetaCall` ->
  `NewUserTypeKwargs` -> `typeInitSubclass`, so a PEP 487
  hook on a Base class actually sees the `class C(Base, x=1)`
  keyword args.

Spec gates 6.1 / 6.3 / 6.4 pass; the website spec has Phase 6 marked
done and lists the side fixes shipped under it.
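
The three behaviors can be checked with a few lines of plain Python. This is a sketch of the expected CPython results, with illustrative class names; it is not the gate script itself.

```python
class Outer:
    """Outer doc."""

    class Inner:
        pass


class Base:
    # PEP 487 hook: receives the keyword arguments from the class statement.
    def __init_subclass__(cls, /, x=None, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.x = x


class Child(Base, x=1):
    pass


inner_qualname = Outer.Inner.__qualname__
outer_doc = Outer.__doc__
child_x = Child.x
```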
@tamnd
Owner Author

tamnd commented May 12, 2026

Phase 6 of the type_new pipeline is in. Three things that were biting downstream imports:

  • nested classes now stamp their dotted qualname. class Outer: class Inner: pass ends up with Inner.__qualname__ == "Outer.Inner" instead of the bare Inner. The compiler already had the right qualname on the inner unit, it just was not getting copied into Type.Qualname, so the getset returned Name. Added a real backing field plus a getter/setter pair, and the setter rejects writes on built-in types the way CPython's heap-type check does.
  • class body docstrings now land as __doc__. The class-body emitter was skipping the leading bare string entirely. Now it pulls the literal out (without mutating the const pool, since the qualname has already taken slot 0) and emits LOAD_CONST plus STORE_NAME __doc__ right after the qualname store.
  • PEP 487 __init_subclass__ finally threads class-creation kwargs. Added NewUserTypeKwargs and routed __build_class__ / type.__new__ / type(name, bases, ns, **kw) through it. Legacy NewUserType keeps working as the nil-kwargs case so existing Go callers do not change.

Gates 6.1, 6.3, 6.4 pass. The metaclass __init__ kwargs form is parked as a known edge case (CPython's object.__init_subclass__ rejects kwargs, so the canonical PEP 487 form uses a Base class, which is what gate 6.4 now exercises).

Phase 7 next: inherit_slots audit. Right now class L(list): pass fails on len(L()) because TpNew is not propagated from the base, and class I(int): pass fails on + for the same reason on the Number block. I will widen the slot-inheritance pass to cover the Basic / Number / Mapping / Sequence groups.

tamnd added 3 commits May 12, 2026 20:36
…04 phase 7)

A subclass of list / dict / int used to drop the base's slot table on the
floor, so len(L([1,2,3])) on `class L(list): pass` failed with
"descriptor '__len__' requires a 'list' object". inheritSlotsFromBases
only covered four basic slots (Repr / Str / Call / Hash); everything
else (TpNew, Iter, IterNext, RichCmp, DescrGet, DescrSet, Format,
TpTraverse, and the Number / Sequence / Mapping / Async tables) stayed
nil on the subclass.

Widen the pass to copy every inheritable slot. The Number / Sequence /
Mapping / Async pointers are deep-copied so fixupSubscriptSlots writes
on the subclass do not smash the base type's table. Reorder
fixupSlotDispatchers so inherit runs first, then the fixup steps
override individual slot fields when a user-supplied dunder exists.

Move the abstract-method guard into objectNew. Before this, every user
class had TpNew == nil and the guard rode along the NewInstance branch
in typeCall. With TpNew now inherited from object, the guard has to
live where CPython parks it - inside object_new.

Gates 7.1 / 7.2 / 7.3 print the expected values; the abc abstract test
still rejects Concrete() with TypeError.
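
A small Python sketch of what the widened pass has to preserve, based on CPython behavior. `L`, `I`, `Shape`, and `Square` are illustrative names rather than the classes used in the gates.

```python
import abc


class L(list):
    pass


class I(int):
    pass


class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self):
        ...


class Square(Shape):
    def area(self):
        return 4


length = len(L([1, 2, 3]))   # sequence slots inherited from list
total = I(2) + I(3)          # number slots inherited from int

# The abstract-method guard has to keep firing once tp_new is inherited.
try:
    Shape()
    rejected = False
except TypeError:
    rejected = True
```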
The lint job tripped on cognitive complexity 49 for the function that
walked every slot field. Same logic, two helpers: inheritBasicSlots for
the scalar pointers and inheritProtocolTables for the four protocol
tables.
…ass (spec 1704 phase 8)

LOAD_NAME, STORE_NAME, and DELETE_NAME inside a class body whose
namespace is a dict subclass (the path enum hits via Meta.__prepare__)
have to go through the mapping protocol so the subclass __getitem__,
__setitem__, and __delitem__ overrides fire. lookupIn and deleteIn
were taking a *objects.Dict assertion that succeeds on subclasses too,
so the overrides were silently bypassed. Both now gate the fast path
on scope.Type() == DictType and fall back to objects.GetItem /
objects.DelItem. The deleteIn fallback also replaces the old
"unsupported scope type" error.

CPython: Python/bytecodes.c LOAD_NAME / STORE_NAME / DELETE_NAME.

Pinned by stdlibinit/name_ops_dict_subclass_test.go: one e2e gate per
opcode using a TracingDict subclass that logs every get/set/del key.
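
The shape of such a gate can be sketched in Python. The `TracingDict` below is a guess at what the test's subclass looks like, and `Meta` is a hypothetical metaclass; only the CPython behavior (name ops honoring the subclass overrides) is taken from the commit message.

```python
class TracingDict(dict):
    """Dict subclass that records every get/set/del by key."""

    def __init__(self):
        super().__init__()
        self.log = []

    def __getitem__(self, key):
        self.log.append(("get", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.log.append(("set", key))
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self.log.append(("del", key))
        super().__delitem__(key)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return TracingDict()

    def __new__(mcls, name, bases, ns, **kwargs):
        trace = list(ns.log)  # snapshot before the namespace is copied
        cls = super().__new__(mcls, name, bases, dict(ns), **kwargs)
        cls.trace = trace
        return cls


class C(metaclass=Meta):
    a = 1      # STORE_NAME
    b = a      # LOAD_NAME then STORE_NAME
    del a      # DELETE_NAME


a_events = [op for op, key in C.trace if key == "a"]
```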
@tamnd
Owner Author

tamnd commented May 12, 2026

Phase 7 and phase 8 are in. CI is green on fd68a08.

Phase 7 widens inherit_slots so a built-in subclass picks up its base's full slot table. Before this, a class L(list): pass got TpNew from object instead of list, so L([1,2,3]) blew up; same story for the number/sequence/mapping/async groups. The fixup pass now runs in inherit-then-fixup order so a user dunder still wins, and the four protocol tables are deep-copied so per-subclass fixup writes can't smash the base type's struct.

One side fix worth flagging: the abstract-method guard moved from typeCall into objectNew. With every user class now inheriting TpNew from object, the old guard sat behind a nil check that never tripped. Same place CPython parks it (Objects/typeobject.c:6854 object_new).

Phase 8 is the matching VM-side fix for name ops inside a class body. When a metaclass __prepare__ returns a dict subclass (the path enum takes), the *objects.Dict type assertion in lookupIn and deleteIn succeeded on the subclass and bypassed __getitem__ / __delitem__. Both now gate the fast path on scope.Type() == DictType and fall through to objects.GetItem / objects.DelItem. deleteIn used to return "unsupported scope type" on anything but a plain Dict, which is also gone now.

Three e2e gates per phase pinned: stdlibinit/inherit_slots_test.go and stdlibinit/name_ops_dict_subclass_test.go. Lint also needed a split on inheritSlotsFromBases (the first push tripped gocognit > 30). Same logic, two helpers.

Next up: the final gate, import enum + import re + import fnmatch smoke. Spec lists those as F.1 through F.4.

tamnd added 2 commits May 12, 2026 23:11
…704 final gate prep)

Every slot dispatcher in usertype.go used to fetch its dunder via
GetAttr(self, name). That works for normal instances, but when the
receiver is a class (so type(self) is the metaclass), GetAttr returns
the descriptor unbound. The slot dispatcher then calls it with no
arguments and the user sees "__hash__() missing required argument
'self'".

Switch to a lookup_maybe_method-style helper that walks
type(self).MRO and applies descr_get with self as the bound instance.
Mirrors CPython's lookup_maybe_method (Objects/typeobject.c:2255).

Also tighten objectHashDescr: it used to call Hash(self), which means
the inherited object.__hash__, when reached via a user metaclass,
looped back through slotTpHash forever. Compute the identity hash
directly instead. Matches what object_hash does in CPython
(Objects/typeobject.c:6986).

Gates in stdlibinit/slot_method_lookup_test.go pin hash(C) on a class
with a user metaclass to a finite, stable result.
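
The difference between the two lookups shows up in plain Python as well. A minimal sketch with a hypothetical metaclass that overrides `__hash__`:

```python
class Meta(type):
    def __hash__(cls):
        return 42


class C(metaclass=Meta):
    pass


# hash() goes through the slot: it looks __hash__ up on type(C) and binds C.
slot_result = hash(C)

# Ordinary attribute access on the class finds object.__hash__ in C's own
# MRO and returns it unbound, so calling it with no argument fails.
try:
    C.__hash__()
    unbound_call_failed = False
except TypeError:
    unbound_call_failed = True
```

That second path is the one a GetAttr-based dispatcher ends up on when the receiver is itself a class.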
…c 1704 final gate prep)

Two side-fixes uncovered while bisecting `import enum`:

* bindCtor in builtins/init.go now also drops a __new__ wrapper into
  the type's own __dict__. Without this, `'__new__' in int.__dict__`
  was False, and enum's `_find_data_type_` could not classify
  `IntEnum(int, ReprEnum)` as having int as its data mixin. Matches
  what CPython's add_tp_new_wrapper does for every type with a
  non-NULL tp_new (Objects/typeobject.c:9952 tp_new_wrapper).

* errKeyNotFound now carries a "KeyError:" prefix so the vm's
  synthesizeException path promotes it to a real PyExc_KeyError
  instance instead of a bare Exception("KeyError"). User code in
  enum.py relies on `except KeyError` catching dict misses; with the
  bare sentinel it surfaced as Exception and slipped past every
  catch clause.

Gate in stdlibinit/builtin_new_dict_test.go pins __new__ exposure
across the builtin type roster (int, str, float, bool, list, tuple,
bytes, set, frozenset).
@tamnd
Owner Author

tamnd commented May 12, 2026

Two more side-fixes uncovered while chasing the enum import.

  1. bindCtor now also exposes the matching __new__ descriptor in the type's own __dict__. Without that, '__new__' in int.__dict__ was False, and enum's _find_data_type_ couldn't pick up int as the data mixin for IntEnum(int, ReprEnum). The class hit the "ReprEnum subclasses must be mixed with a data type" guard.

  2. errKeyNotFound now carries a KeyError: prefix so the vm's unwind path promotes it to a real PyExc_KeyError instead of wrapping it in a generic Exception('KeyError'). The except KeyError: clauses inside enum.py weren't matching.

Both pinned in stdlibinit/builtin_new_dict_test.go and the existing slot-dispatcher gate. Still bisecting where exactly in enum's machinery the next dict miss comes from.

tamnd added 16 commits May 12, 2026 23:34
dict.__contains__ went through Dict.GetItem and bubbled the
errKeyNotFound sentinel straight back out, so `key in d` raised
KeyError on dict subclasses (the base dict path short-circuits in the
abstract layer). enum.EnumDict.__setitem__ does `if key in self` when
storing each member, which made the very first auto() call inside a
StrEnum-derived class body explode with bare KeyError. CPython's
dict_contains returns 0/1/-1 and treats a missing key as 0; mirror
that by swallowing the sentinel and reporting False.

Gates: stdlibinit/dict_contains_subclass_test.go covers the bare
miss and the class-namespace path enum hits.
A user-defined str subclass (e.g. enum.StrEnum) needs four pieces to
match CPython:

  1. str.__new__(cls, value) must return a cls-tagged *Unicode, not a
     bare str. Done via SetStrTpNewBase: the value path stays in the
     existing StrOf, the type path re-wraps the result with newStrAs.
  2. Instances need somewhere to put instance attributes. Added an
     attrs *Dict field on Unicode, lazy-allocated on first write.
  3. Subclasses pick up that storage automatically. NewUserTypeKwargs
     now wires Getattro/Setattro to strSubclass* and inherits
     strType.TpNew, mirroring the existing IsSubtype(t, dictType) arm.
  4. Subclasses inherit str's slot behaviour. Added Python-level
     descriptors for __repr__/__str__/__hash__ plus the rich-compare
     slots on strType so MRO lookup finds them before object's
     defaults.

Three new gates in stdlibinit/str_subclass_test.go cover the subtype
tag, attribute storage, and slot inheritance.
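
In Python terms the target behavior is roughly the following. `Tag` is a hypothetical subclass standing in for a StrEnum member; the assertions in the real gates may differ.

```python
class Tag(str):
    """Hypothetical str subclass."""


t = Tag("red")
t.label = "primary"          # instance attribute storage on a str subclass

kept_type = type(t)          # str.__new__(cls, value) keeps the subclass tag
same_hash = hash(t) == hash("red")
upper_type = type(t.upper())  # str methods hand back plain str
```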
golangci-lint's misspell pass flagged 'recognises' on the previous
commit. Switch to 'recognizes' to match the project's American-English
default.
Before this commit the RERAISE arm did fmt.Errorf("%s", repr(exc)) on
the popped exception, which turned a typed exception into a plain
string. The unwind path then ran synthesizeException on that string;
since repr(KeyError('x')) starts with 'KeyError(' (not 'KeyError:'),
none of the prefix matches fired and the result was a bare Exception.
Any outer 'except KeyError' wouldn't match anymore.

That's exactly the path enum._proto_member.__set_name__ relies on:
the inner 'except TypeError' has to miss the KeyError from
_value2member_map_[value] so the outer 'except KeyError' can catch
it. Importing enum panicked at the first auto() member.

Port the arm against CPython Python/bytecodes.c:1429:

  inst(RERAISE, (values[oparg], exc_st -- values[oparg])) {
      ...
      if (oparg) { frame->instr_ptr = ... + lasti; }
      _PyErr_SetRaisedException(tstate, exc);
      goto exception_unwind;
  }

When the popped value is a real *Exception, route it through
pyerrors.Raise/excSentinel so unwind reads it back via
pyerrors.Occurred without synthesis. For oparg >= 1 (the with-block
forms), peek the lasti underneath and restore InstrPtr in bytes.
NewUserTypeMeta is the full-form constructor: it takes the metaclass
and writes Py_TYPE(t) = meta before the namespace copy and the
typeSetNames pass, so PEP 487 __set_name__ hooks that resolve
metaclass-defined methods (cls._add_member_, cls._missing_, ...) walk
the right metatype. typeNewBuiltin now routes through it instead of
calling NewUserTypeKwargs and then Init(meta) after the fact, which
was the source of enum.EnumType.__new__ failing to find _add_member_
on FlagBoundary.

CPython: Objects/typeobject.c:4153 type_new (Py_TYPE(type) = metatype)
CPython's add_operators walks slotdefs during PyType_Ready and installs
a wrapper_descriptor for every C-level slot the type defines, including
tp_descr_get and tp_descr_set. gopy has no central type-ready pass for
built-in types, so descriptor types (function, classmethod,
staticmethod, method_descriptor, getset_descriptor, member_descriptor,
property, super) only had the C-level DescrGet/DescrSet set without
ever publishing the Python-level dunder. hasattr(fn, '__get__')
returned False, which made enum._is_descriptor classify functions as
enum members and blew up _proto_member.__init__.

The new addDescriptorSlotWrappers helper ports the wrap_descr_get /
wrap_descr_set / wrap_descr_delete shape and is called from each
affected type's init pass. add_operators's "already defined wins"
behavior is preserved by skipping names already present in the type's
descriptor table.

CPython: Objects/typeobject.c:9685 wrap_descr_get
CPython: Objects/typeobject.c:9706 wrap_descr_set
CPython: Objects/typeobject.c:9721 wrap_descr_delete
CPython: Objects/typeobject.c:10989 slotdefs (descr_get/descr_set/descr_delete)
When Python code calls type() directly with a (name, bases, ns) tuple
and extra kwargs, CPython picks the most derived metaclass among
type and the metaclasses of the bases and re-dispatches to its
tp_new. enum._simple_enum runs exactly this dance with
`type(cls_name, (etype,), body, boundary=..., _simple=True)` so the
boundary kwarg lands on EnumType.__new__. Before this, gopy went
straight to NewUserTypeKwargs with type as the metaclass, the
kwargs fell through to typeInitSubclass, and object.__init_subclass__
rejected them.

The new calculateMetaclass mirrors _PyType_CalculateMetaclass; when
the winner differs from type, typeMetaCall hands off to
typeMetaclassCall so __new__/__init__ on the winning metaclass run
with the original args and kwargs.

CPython: Objects/typeobject.c:3921 _PyType_CalculateMetaclass
CPython: Objects/typeobject.c:4728 type_new (winner dispatch)
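
A Python sketch of the dispatch, using a hypothetical `Meta` with a `boundary` keyword in place of enum's real metaclass:

```python
class Meta(type):
    def __new__(mcls, name, bases, ns, boundary=None):
        cls = super().__new__(mcls, name, bases, ns)
        cls.boundary = boundary
        return cls

    def __init__(cls, name, bases, ns, boundary=None):
        super().__init__(name, bases, ns)


class Base(metaclass=Meta):
    pass


# Called on plain `type`, but the most derived metaclass among the bases
# is Meta, so the call is re-dispatched to Meta.__new__ with the kwargs.
Dyn = type("Dyn", (Base,), {}, boundary="strict")
```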
…vate

visitAttribute emitted LOAD_ATTR / STORE_ATTR / DELETE_ATTR with the
literal attr name, so a method that said `self.__next` looked up the
unmangled spelling at runtime and tripped AttributeError. nameOp also
mangled against c.scope.Name, which is the function's own name once
the codegen is inside a method, not the enclosing class.

Track u_private on Unit, inherit it from the parent unit on
enterScope, and override it to the class name when entering a class
body. Both attribute and name codegen now mangle against
c.unit().Private, matching CPython's compiler_mangle.
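
The mangling rule is easy to see from Python. A minimal sketch; this `Tokenizer` is a toy class, not the one in re/_parser.py.

```python
class Tokenizer:
    def __init__(self):
        # Compiled as self._Tokenizer__next: the mangling prefix is the
        # enclosing class name, not the name of the method.
        self.__next = "tok"

    def peek(self):
        return self.__next


t = Tokenizer()
stored_names = sorted(vars(t))
```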
range previously only exposed Iter, so `range(n)[::-1]` and the
contains-fast-path used by `x in range(...)` both fell through to a
generic "not subscriptable" / iterator walk. Stdlib re hit this
during `for i in range(len(subpattern))[::-1]`.

Port rangeobject.c's compute_range_length, compute_item,
compute_range_item, range_item, range_subscript and range_contains
in full and wire them through Sequence and Mapping. Arithmetic stays
on math/big so unbounded ranges behave correctly.
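
For reference, these are the CPython behaviors the ported functions cover, sketched in Python with arbitrary example values:

```python
reversed_view = range(5)[::-1]      # slicing a range yields another range
third = range(0, 10, 3)[2]          # integer index: start + 2 * step
hit = 9 in range(0, 10, 3)          # arithmetic containment, no iteration
miss = 4 in range(0, 10, 3)

# Values past the machine word size still index and test correctly.
big = range(10**30)
last = big[-1]
big_hit = (10**30 - 1) in big
```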
int subclasses defined in Python (re._constants._NamedIntConstant,
enum's IntEnum, etc.) had no instance attribute storage, so the
constructor's `self.name = name` failed with "no __dict__".
int.__new__(cls, value) also discarded cls because bindCtor went straight to IntCtor,
producing a plain int even when the caller asked for a subclass.

Add Int.attrs and intSubclassGetAttr / intSubclassSetAttr matching the
str subclass pattern. Port a cls-aware tp_new through SetIntTpNewBase
and wire it via NewUserTypeMeta's IsSubtype switch. Also add the
bit_length / bit_count / __index__ / __int__ / __trunc__ /
__floor__ / __ceil__ / conjugate method panel from
Objects/longobject.c long_methods.
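
A Python sketch of the pattern, modeled loosely on the named-constant idea; `NamedInt` and its value are illustrative and not copied from the stdlib.

```python
class NamedInt(int):
    def __new__(cls, value, name):
        self = super().__new__(cls, value)   # must come back tagged as cls
        self.name = name                     # needs per-instance storage
        return self

    def __repr__(self):
        return self.name


limit = NamedInt(65535, "MAXREPEAT")
bumped = limit + 1            # arithmetic falls back to plain int
bits = limit.bit_length()
```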
`del l[a:b]` and `del l[i]` both raised "object does not support
item deletion" because list only exposed sq_item / sq_ass_item, not
mp_ass_subscript. Port list_subscript / list_ass_subscript so the
mapping protocol routes int and slice keys through the same dispatch
CPython uses, with helpers for slice get / set / delete and the
iterable-drain used by extended-slice assignment.
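
The operations in question, as a short Python sketch with arbitrary values:

```python
items = list(range(6))        # [0, 1, 2, 3, 4, 5]
del items[1:3]                # slice deletion
after_slice_del = list(items)
del items[0]                  # integer-key deletion
after_index_del = list(items)

# Extended-slice assignment drains an arbitrary iterable first.
items[::2] = (ch for ch in "ab")
```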

PyMapping_Check returns true for CPython lists (they have
mp_subscript), so update the two abstract_mapping tests that asserted
the opposite. Tuple still has no mp_length and stands in for the
"sequence-only" assertion.
`b'\x00' * n`, `b'a' + b'b'` and `for x in b'abc'` all raised
"unsupported operand" or "not iterable" because BytesType only had
GetItem / Contains wired. Port bytes_concat, bytes_repeat, the
bytesiterobject pair and hand them to Sequence + tp_iter so the
stdlib paths in re / pickle / struct stop blowing up on the trivial
forms.
re/__init__.py reaches `import copyreg` near the bottom to register
the Pattern pickler. Without copyreg vendored the entire `import re`
fails. Pull copyreg.py from CPython 3.14 unchanged and record the
hash in MANIFEST.txt.
Drops a focused regression test in front of the F.3 import-re gate
so the next time something in slot dispatch, mangling or stdlib
vendoring regresses, the failure surfaces before it reaches the
broader regrtest suite.
fnmatch was already in the vendored modules list as a task target
(F.4 in the final-gate plan) but the actual .py wasn't checked in.
Drop the file from CPython 3.14 unchanged, line it up in MANIFEST
alphabetically, and add a one-shot `import fnmatch` smoke test that
guards the gate the same way TestImportReBareSmoke guards F.3.
CI lint flagged two issues from the previous push:
  - list.go used `err == ErrStopIteration` and chained appends with
    differing destinations
  - range.go used the same `err == ErrStopIteration` pattern in the
    iter fallback for contains.

Switch both to errors.Is, rewrite the listSetSlice merge to grow a
fresh slice instead of stacking appendAssign, and fix the misspelt
"materialises" docstring while passing through.
@tamnd
Owner Author

tamnd commented May 12, 2026

F.3 and F.4 are green now. import re and import fnmatch both run end-to-end through the gopy import path, gated by two new smoke tests in stdlibinit so a regression in any of the underlying subsystems lights up before we get back to the regrtest corpus.

The thing that turned out to actually be blocking re was a long tail of half-wired pieces in the object model rather than anything in _sre:

  • attribute name mangling. visitAttribute emitted LOAD_ATTR / STORE_ATTR / DELETE_ATTR with the literal name, and nameOp mangled against the function's own name. Both now route through a new u_private field on the codegen Unit, inherited from the parent on enterScope and overridden to the class name inside class bodies, matching compiler_mangle. That was what was producing the 'Tokenizer' object has no attribute '__next' error inside re/_parser.py.
  • int subclasses (re uses _NamedIntConstant(int)). No instance attribute storage and the constructor dropped cls. Ported the slot_tp_new dispatcher, added cls-aware intTpNew via SetIntTpNewBase, and gave Int an attrs dict + matching getattro/setattro that mirror the str subclass pattern.
  • range was only iterable. Ported the full subscript pipeline from rangeobject.c: compute_range_length, compute_item, compute_range_item, range_subscript, range_item, range_contains. Arithmetic stays on math/big so unbounded ranges keep working.
  • list had no mp_ass_subscript so del l[a:b] and del l[i] raised. Ported list_subscript / list_ass_subscript with slice get / set / delete helpers.
  • bytes was missing Iter / Concat / Repeat. b'\x00' * n, b'a' + b'b' and iterating bytes all work now.
  • int.bit_length / bit_count / __index__ / __int__ / __trunc__ / __floor__ / __ceil__ / conjugate from long_methods.
  • vendored copyreg.py (re registers a Pattern pickler) and fnmatch.py.

Two existing objects mapping tests asserted that list was NOT a mapping, which contradicted CPython (list does carry mp_subscript). Flipped both. Tuple stands in for the "sequence-only" assertion now.

Local + CI both green. Next up after this lands is the regrtest sweep over the corpus.

tamnd added 5 commits May 13, 2026 08:47
GET_ITER in eval_simple.go was calling tp.Iter(obj) directly. User
classes that expose __getitem__ + __len__ but no __iter__ never get
a tp.Iter slot wired up, so `for x in s` blew up with
"object is not iterable" instead of falling through to the
sequence-protocol iterator the way PyObject_GetIter does in CPython
(Objects/abstract.c:2786, the PySeqIter_New branch).

The re subsystem trips this for free: re/_compiler.py iterates
SubPattern values directly, and SubPattern only defines __getitem__.
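
The fallback is the legacy sequence iteration protocol. A minimal Python sketch with a toy `SubPattern` (not the class from the re package):

```python
class SubPattern:
    """Toy container: __getitem__ and __len__, but no __iter__."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]


# iter() falls back to calling __getitem__ with 0, 1, 2, ... and stops
# at the first IndexError.
collected = [item for item in SubPattern("abc")]
```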

Spec 1704 final gate table reset to reflect the current state:
F.1/F.2 are pinned passing via the existing enum import test,
F.4 passes via the fnmatch smoke test, F.3 is partial (import passes,
re.match still blocked on the bytearray methodlist full port that
ports Objects/bytearrayobject.c bytearray_methods).
setup-python@v5 still ships the Node.js 20 runner shim, which
GitHub flagged as deprecated on every run starting May. v6 is on
Node.js 24 and is otherwise drop-in for the same input keys.
Phase 2 gates were marked pass in the spec but the table didn't
point at the Go test that actually pins them. Add the
TestClassMethodGetSetGates reference so the audit trail matches
the rest of the spec.
The final gate row was still pending in the top-of-spec phase index
even though four of the six sub-gates land. Update to "partial" with
the actual breakdown so the index agrees with the detail table.
The Phase 2 functions-to-port table now references the same pinned
tests we added to the gates table last commit. Each "done" row names
the test in objects/method_test.go that exercises it, and the two
n/a rows spell out the CPython 3.14 semantics that make them n/a
(classmethod has no __set_name__ slot; PEP 487 forwarding is handled
by the descriptor protocol binding cls).
@tamnd tamnd changed the title v0.12.3: port classmethod block in full (spec 1704 phase 2) v0.12.3: spec 1704 phases 1-8 + final gate F.1/F.2/F.4 (#544 closed, #543/#542 in flight) May 13, 2026
….3 re.match gate

This is the final stretch of spec 1704 phase 2. Three pieces had to land
together before `import re` followed by re.match could run end-to-end:

1. Bytes/bytearray method panel. Both types now expose the full CPython
   3.14 methodlist as descriptors instead of relying on inline VM
   shortcuts (`objects/bytes_methods_case.go`,
   `objects/bytes_methods_descr.go`, `objects/bytes_methods_init.go`,
   `objects/bytearray_methods_descr.go`). One closure per method on
   both types inspects args[0], so a `replace` / `strip` etc. on a
   bytearray returns a bytearray and on bytes returns bytes, with no
   surprise copies. Bytearray-only methods (append / extend / insert /
   pop / remove / clear / reverse / copy) only attach to bytearray.

2. mappingproxy methodlist. `Objects/descrobject.c mappingproxy_methods`
   was unwired, so `hasattr(mp, "keys")` was False for any mappingproxy,
   `cls.__members__` included. Wired get / keys / values / items / copy /
   __reversed__ as delegations onto the wrapped mapping.

3. dict.update keys() fast path. The previous implementation only
   handled a `*Dict` source and fell through silently for anything
   else, with no keys() detection. `@enum.global_enum` does
   `sys.modules[cls.__module__].__dict__.update(cls.__members__)`,
   so without (2) and (3) the IntFlag members never landed in the
   module namespace and `re/__init__.py` blew up with
   `NameError: DEBUG`.
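As a rough illustration of the bytes/bytearray behaviour in item 1, the sketch below shows the CPython semantics the method panel is meant to reproduce. It runs on CPython and makes no claim about gopy's internal closures; `BYTEARRAY_ONLY` is a name introduced here for the example.

```python
data_bytes = b"  spam  "
data_array = bytearray(b"  spam  ")

# Methods that build a new object keep the receiver's type.
stripped_bytes = data_bytes.strip()
stripped_array = data_array.strip()
replaced_array = data_array.replace(b"spam", b"eggs")

# Search methods such as find return an int index on both types.
index_bytes = data_bytes.find(b"spam")
index_array = data_array.find(b"spam")

# Names that exist on bytearray but not on bytes.
BYTEARRAY_ONLY = (
    "append", "extend", "insert", "pop",
    "remove", "clear", "reverse", "copy",
)
leaked_onto_bytes = [name for name in BYTEARRAY_ONLY if hasattr(bytes, name)]
on_bytearray = [name for name in BYTEARRAY_ONLY if hasattr(bytearray, name)]
```

A parity test along these lines is one way to check that the descriptor tables on the two types have not drifted apart.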

Pin and gate flips:

- `stdlibinit/re_match_smoke_test.go` pins the F.3 final gate
  command end-to-end.
- Spec 1704 F.3 / F.5 / F.6 all flip to `done`.
- Spec 1703 phases 7, 8 and Final gate flip to `done`.
- Spec 1702 picks up an `enum` row and the `re / _sre` row flips
  from `pending` to `done`.
@tamnd tamnd changed the title v0.12.3: spec 1704 phases 1-8 + final gate F.1/F.2/F.4 (#544 closed, #543/#542 in flight) v0.12.3: spec 1704 phases 1-8 + final gate (#544 closed, #543/#542/#510 closed) May 13, 2026

tamnd commented May 13, 2026

F.3 lands. The blocker chain was longer than the spec made it look:

re._compiler walks the program once and emits arr (a bytearray) plus a charmap; the cleanup loop calls charmap.find(1, q). Wiring bytearray.find made it past that, but then re/__init__.py died on NameError: DEBUG because @enum.global_enum was a no-op.

That decorator does sys.modules[cls.__module__].__dict__.update(cls.__members__). Two things were broken:

  1. cls.__members__ is a mappingproxy, and gopy's mappingproxy didn't expose keys(). CPython has a 7-entry methodlist on it (get/keys/values/items/copy/__reversed__/__class_getitem__).
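  2. Even with keys() wired, our dict.update had no fallback for "thing with keys()" between the *Dict fast path and the iterable-of-pairs path. CPython's dict.update covers all three: PyDict_Merge handles the dict and keys() cases, and PyDict_MergeFromSeq2 handles the iterable of pairs.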

Once those went in, re.match(r"(\d+)-(\d+)", "12-34").groups() returns ('12', '34') without any RE2 shim or Python-level patching.
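The chain above can be reproduced on CPython with a small sketch. `DemoFlag` and `KeysOnly` are hypothetical names used only for illustration; the point is that `dict.update` accepts a mappingproxy (or any object with `keys()` and `__getitem__`) as well as an iterable of pairs.

```python
import enum
import re
import types


class DemoFlag(enum.IntFlag):
    DEBUG = 1
    VERBOSE = 2


class KeysOnly:
    """Hypothetical mapping-like object: only keys() and __getitem__."""

    def keys(self):
        return ["a", "b"]

    def __getitem__(self, key):
        return key.upper()


members = DemoFlag.__members__       # read-only mappingproxy over the members

namespace = {}
namespace.update(members)            # keys() path, as used by enum.global_enum

from_keys = {}
from_keys.update(KeysOnly())         # keys() path on a non-dict object

from_pairs = {}
from_pairs.update([("x", 1), ("y", 2)])   # iterable-of-pairs path

# With the flag members exported into the re module namespace, re works.
groups = re.match(r"(\d+)-(\d+)", "12-34").groups()
```

If either the `keys()` method on the proxy or the `keys()` branch in `update` is missing, `namespace` stays empty, which matches the `NameError: DEBUG` symptom described above.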

Pinned by stdlibinit/re_match_smoke_test.go. Specs 1702 / 1703 / 1704 all updated to match the actual ship state.

tamnd added 2 commits May 13, 2026 09:34
Splits dictMergeFromArg into three single-purpose helpers so each one
stays under the gocognit threshold. Switches the ErrStopIteration
checks to errors.Is, drops a redundant idx initialiser, renames
bytesIsAscii to bytesIsASCII to satisfy revive, and trims the trailing
periods off two TypeError messages. translateMethod and hexMethod
mirror CPython's bytes_translate_impl / bytes_hex_impl branch tree,
same as str_methods.go, so they pick up the gocognit/gocyclo
exclusion that already exists for those parity files.

File C (Objects/classobject.c): bring PyMethod_Type up to its full
CPython surface. method___reduce___impl backs pickling, the doc
descriptor proxies to the wrapped function, method.__new__ enforces
the 2-arg callable/instance contract, and the descriptor __get__
returns self. Add the entire PyInstanceMethod_Type block: type
singleton, NewInstanceMethod, repr, call, getattro, traverse,
richcompare, descr_get, doc, and the clinic-generated __new__.

File D (Objects/funcobject.c): close the PyFunction_Type surface.
Wire FunctionType.TpNew to a port of func_new_impl with full
6-argument validation (code/globals/name/argdefs/closure/kwdefaults).
Split the getsets so __code__ / __defaults__ / __kwdefaults__ /
__annotations__ each get the typed setter from func_set_*; add the
missing __annotate__ getset that clears cached annotations on swap.

Tests cover both surfaces: 9 cases for bound and instance methods
in method_test.go, 16 for the function-attr setters and __new__
parsing block in function_test.go.

Lint adjustments are narrow: function.go joins the CPython-mirroring
exclusions (funcTpNew matches func_new_impl's switch branch-for-branch),
and the InstanceMethod field renames from func_ to function to clear
revive's var-naming.

tamnd commented May 13, 2026

Files C and D are done.

PyMethod_Type now has the full surface: __reduce__ for pickling, the doc descriptor that proxies to the wrapped function, the typed __new__, and __get__ returning self. The whole PyInstanceMethod_Type block went in alongside it. Nine new tests in method_test.go pin the behaviour.
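For reference, here is a short sketch of the CPython bound-method behaviour listed above. It is an illustration on CPython, not the gopy test code; `Greeter` and `new_error` are hypothetical names.

```python
import types


class Greeter:
    def greet(self):
        """Say hello."""
        return "hello"


greeter = Greeter()
bound = greeter.greet

# __reduce__ describes the method as getattr(instance, name); this is the
# recipe pickle records for a bound method.
reduce_callable, reduce_args = bound.__reduce__()

# __doc__ is proxied from the wrapped function.
doc = bound.__doc__

# A bound method is already bound, so its __get__ returns the same object.
rebound = bound.__get__(object(), object)


def new_error(function, instance):
    """Return the exception name raised by types.MethodType, or 'ok'."""
    try:
        types.MethodType(function, instance)
    except TypeError as exc:
        return type(exc).__name__
    return "ok"


not_callable = new_error(42, greeter)   # first argument must be callable
none_instance = new_error(len, None)    # instance must not be None
valid = new_error(len, greeter)
```

The two rejected calls correspond to the two halves of the callable/instance contract that `method.__new__` enforces.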

For PyFunction_Type I split the getsets so __code__, __defaults__, __kwdefaults__ and __annotations__ each get the typed setter from func_set_*, plus the missing __annotate__ getset that clears the cached annotations dict on swap. FunctionType.TpNew now ports func_new_impl end-to-end: six positional args, keyword forms, the freevar-count and cell-type checks. Sixteen new cases in function_test.go walk the setter and __new__ validation tree.
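The setter and constructor checks described above can be sketched against CPython as follows. This is an illustrative example only; `make_adder`, `error_name` and `set_attr` are hypothetical helpers, and the example sticks to the first five constructor arguments.

```python
import types


def make_adder(n):
    def add(x, y=1):
        return x + y + n
    return add


add = make_adder(10)


def error_name(action):
    """Run action() and report the exception type name, or 'ok'."""
    try:
        action()
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


def set_attr(name, value):
    return lambda: setattr(add, name, value)


# Typed setters reject values of the wrong type.
bad_code = error_name(set_attr("__code__", "not a code object"))
bad_defaults = error_name(set_attr("__defaults__", [1]))
bad_kwdefaults = error_name(set_attr("__kwdefaults__", [("y", 1)]))

# Rebuilding the function through the constructor needs a matching closure.
clone = types.FunctionType(
    add.__code__, add.__globals__, "clone", add.__defaults__, add.__closure__
)
clone_result = clone(1)

code = add.__code__
missing_closure = error_name(lambda: types.FunctionType(code, {}, "f"))
wrong_length = error_name(lambda: types.FunctionType(code, {}, "f", None, ()))
wrong_cell = error_name(lambda: types.FunctionType(code, {}, "f", None, (1,)))
```

The last three calls exercise the closure type check, the freevar-count check and the cell-type check respectively.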

Two lint adjustments: function.go joined the CPython-mirroring exclusions because funcTpNew matches func_new_impl's switch one-for-one, and InstanceMethod.func_ got renamed to function to clear revive's var-naming.

CI green on the push.

Next up is file A (Objects/object.c), then file B (typeobject.c). Both stay on this branch.

Add object_proto.go with the public PyObject_* helpers we did not yet
expose: ObjectBytes (PyObject_Bytes), ASCII (PyObject_ASCII), Not
(PyObject_Not), Callable (PyCallable_Check), SelfIter, LookupAttr /
LookupAttrString (PyObject_GetOptionalAttr), HasAttr / HasAttrString
(PyObject_HasAttrWithError), Dir (PyObject_Dir), and GetMethod
(_PyObject_GetMethod's fast path).

Each helper carries a CPython citation with line number, and the
LookupAttr/HasAttr pair shares a single isAttributeError prefix check
to keep behaviour consistent. Callable replaces the unexported
isCallable in function.go now that the protocol-level export exists.

15 tests in object_proto_test.go cover the surface: bytes passthrough
+ NULL sentinel + str fallback + Int TypeError, ASCII passthrough vs
backslash-escape, Not on truthy/falsey/None, Callable on functions and
non-callables, LookupAttr / HasAttr present/missing, and Dir's sort.

tamnd commented May 13, 2026

File A is in.

The piece that was actually missing from Objects/object.c was the protocol-level surface that wraps the type slots: PyObject_Bytes, PyObject_ASCII, PyObject_Not, PyCallable_Check, PyObject_SelfIter, PyObject_GetOptionalAttr, PyObject_HasAttr, PyObject_Dir, and _PyObject_GetMethod. None of those existed as exported helpers in gopy. The methodlist/getsetlist piece (object_methods / object_getsets) was already done under #545 since those tables live in typeobject.c.

New file is objects/object_proto.go. Each helper carries the CPython filename + line number. The two attribute lookups share a single isAttributeError prefix check so suppression behaves the same way in both. Callable replaces the unexported isCallable shim in function.go since the public version is now around.

15 tests pin the surface: bytes passthrough + NULL sentinel + str fallback + Int TypeError, ASCII passthrough vs backslash-escape, Not on truthy / falsey / None, Callable on functions and non-callables, LookupAttr and HasAttr present / missing, and Dir's sort invariant.
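Several of these helpers have direct Python-level counterparts, so their expected behaviour can be sketched on CPython. The mapping in the comments is an approximation (the builtins do more than the C helpers in some cases), and `Blob` is a hypothetical class.

```python
import operator


class Blob:
    def __bytes__(self):
        return b"blob"


blob = Blob()

as_bytes = bytes(blob)                  # __bytes__ is honoured, as in PyObject_Bytes
ascii_plain = ascii("abc")              # pure ASCII passes through unchanged
ascii_escaped = ascii("caf\u00e9")      # non-ASCII is backslash-escaped

# PyObject_Not: truthiness inverted.
not_results = [operator.not_(1), operator.not_(0), operator.not_(None)]

# PyCallable_Check.
callable_results = [callable(len), callable(42)]

# Optional attribute lookup: a missing attribute is not an error.
found = getattr(blob, "__bytes__", None) is not None
missing = getattr(blob, "nope", None)
has_results = [hasattr(blob, "__bytes__"), hasattr(blob, "nope")]

# PyObject_Dir: the result is sorted.
names = dir(blob)
```

The optional-lookup pair is the part where suppression matters: only AttributeError should be swallowed, and other exceptions raised by a custom `__getattr__` should still propagate.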

CI green.

File B (typeobject.c, 12k lines) is the last one.

Adds the missing PyType_* introspection API: GetName, GetQualName,
GetModuleName, GetFullyQualifiedName, GetFlags, SupportsWeakrefs,
GenericAlloc, GenericNew, and an exported CalculateMetaclass that
wraps the existing package-internal helper.

tamnd commented May 13, 2026

File B (Objects/typeobject.c) is in. Added objects/type_proto.go with the PyType_* introspection surface that was missing on our side: GetName, GetQualName, GetModuleName, GetFullyQualifiedName, GetFlags, SupportsWeakrefs, GenericAlloc, GenericNew, and CalculateMetaclass (exported wrapper around the package-internal helper that type_call.go already uses).

The fully-qualified name builder follows CPython's rule: skip the module prefix when it's "builtins" or "__main__", otherwise join with a dot. Module lookup distinguishes heap types (look at __module__, AttributeError if missing) from static types (split tp_name on dot or fall back to "builtins"). TypeSupportsWeakrefs is always true for non-nil types because gopy carries the weakref list head on every Header so there's no per-type opt-in.
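The naming rule can be written down in a few lines. This is a minimal Python sketch of the rule, assuming the skipped module names are `builtins` and `__main__`; `fully_qualified_name` is a hypothetical helper, not the gopy function.

```python
import collections


def fully_qualified_name(tp):
    """Hypothetical helper: module-qualified name with the prefix rule applied."""
    module = tp.__module__
    qualname = tp.__qualname__
    # Drop the prefix for builtins and for the script module.
    if not isinstance(module, str) or module in ("builtins", "__main__"):
        return qualname
    return f"{module}.{qualname}"


# Types created on the fly so the module name is under our control.
script_type = type("Dyn", (), {"__module__": "__main__"})
package_type = type("Dyn", (), {"__module__": "pkg.mod"})

builtin_name = fully_qualified_name(int)
stdlib_name = fully_qualified_name(collections.OrderedDict)
script_name = fully_qualified_name(script_type)
package_name = fully_qualified_name(package_type)
```

Keeping the prefix rule in one helper means repr-style messages and error text agree on how a type is spelled.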

14 new tests, full suite green, lint clean.

@tamnd tamnd merged commit b424a14 into main May 13, 2026
6 checks passed
@tamnd tamnd deleted the feat/v0.12.3-spec-1704-phase2 branch May 13, 2026 03:16