Replies: 9 comments 15 replies
-
Area 1 — raw provider responsesTodayEach backend writes a different bag of keys to
External readers: ProposalAdd a typed @dataclass
class RawProviderResponse:
"""Backend-native response payload. Provider-coupling escape hatch.
Anyone reading these fields is opting into provider-specific shape. Portable
consumers should read `mot.value`, `mot.parsed_repr`, `mot.tool_calls`, or
`mot.generation` instead.
"""
provider: str | None = None # mirrors generation.provider for self-description
response: Any | None = None # post-merge final response object
streamed_chunks: list[Any] | None = None # populated only for streaming
MigrationShip Coordination notes
Open questions on Area 1OQ1: Does I lean toward Counter-argument worth airing: a OQ2: Should With Cleaner; adds one |
Beta Was this translation helpful? Give feedback.
-
Area 3 — computation logisticsTodayThe seventeen private attrs (post-#1181) sit flat on the MOT and split naturally into three groups, one of which is already enforced by ProposalTwo internal sub-objects matching the deepcopy semantics, plus @dataclass
class _CallInfo:
"""Originating-call data. Preserved across copies because retries and
sampling routinely need it."""
action: Component | CBlock | None = None
context: list[Component | CBlock] | None = None
model_options: dict[str, Any] | None = None
generation_id: str | None = None # added by #1181
@dataclass
class _GenerationState:
"""In-flight computation machinery. Reset to a fresh empty instance on
__copy__ / __deepcopy__ — a copied MOT is a distinct (non-generating)
object and must not share queues, tasks, or thread signals."""
queue: asyncio.Queue = field(default_factory=lambda: asyncio.Queue(maxsize=20))
chunk_size: int = 3
first_chunk_received: bool = False
generate: asyncio.Task[None] | None = None
generate_type: GenerateType = GenerateType.NONE
generate_extra: asyncio.Task[Any] | None = None
cancel_hook: Callable[[], None] | None = None
process: Callable[[ModelOutputThunk, Any], Coroutine] | None = None
post_process: Callable[[ModelOutputThunk], Coroutine] | None = None
on_computed: Callable[[ModelOutputThunk], Coroutine] | None = None
start: datetime.datetime | None = NoneNote
MigrationAll For private attrs read by external callers, Tradeoffs
Interaction with
|
Beta Was this translation helpful? Give feedback.
-
Area 4 —
|
Beta Was this translation helpful? Give feedback.
-
Area 5 (new) — fold in #1191: public error / generate_log surfaceWhy include this#1191 is TodayWhen
ProposalTwo changes:
The migration path for #1191's specific case: after both changes ship, the caller code is: results = await asyncio.gather(*calls)
for r in results:
if r.error is not None:
# soft-failure
...
elif r.cancelled:
# explicit cancellation
...
else:
# ok
use(r.value)…which is what #1191 asks for. Closes #1191 as part of this work. Coordination
Open questions on Area 5OQ5: #1191 floats merging them ("mark the soft-failed empty thunk cancelled, carrying the error"). I lean toward keeping them separate:
Conflating loses information that callers may want — was this a reader cancellation or a backend failure? Different remediation. But there's a real case for converging too: from a "did this thunk produce a usable result" perspective, the answer is the same. Worth resolving here before Area 5 implementation lands. OQ6: Should Internal code mutates Probably expose the attr by reference (mutation possible but discouraged); sampling stays on the underlying private name in Alternative: expose a frozen view (e.g. dataclass |
Beta Was this translation helpful? Give feedback.
-
Appendix A — bugs, stale code, and inconsistencies found during researchThese surfaced during the audit but are not part of #909 itself. Listed here so they can either ride along in the relevant area's PR or be split out as separate fix PRs. None are blockers.
|
Beta Was this translation helpful? Give feedback.
-
|
Overall shape and sequencing look good — #1181 first, then Area 5, then 1/4, then 3 makes sense. A few per-area replies below. One thing being filed separately: promoting |
Beta Was this translation helpful? Give feedback.
-
|
Do you have a stubbed out version of what the updated MOT class would look like? |
Beta Was this translation helpful? Give feedback.
-
|
I think this proposal looks good! Happy with the direction and think that it's thought out; I don't see any blockers to implementing. Thank you! |
Beta Was this translation helpful? Give feedback.
-
|
Thank you for all the feedback, the issues are now all open and the epic updated:
Further design questions on individual areas should go on the relevant sub-issue. Closing this discussion as resolved. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
This Discussion is the design review for #909 (
ModelOutputThunkstructural cleanup). The issue itself was filed in April as a follow-up to #793 and has aged through the streaming-validation epic (#891, now closed) and the in-flight tracing refactor (#1181, approved). The shape of the cleanup has shifted with the codebase, and one related issue (#1191) is closely-enough scoped that it should be folded in.Posting here rather than on the issue so each area and the appendix can thread independently. Per-area open questions live at the bottom of each area's comment.
How to comment
TL;DR
_meta[...]keys onto a typedmot.raw: RawProviderResponsenamespace. Migratechat.py:_parseto switch onmot.raw.provider.mot._call(originating call, preserved across copies) andmot._gen(in-flight machinery, reset on copy). Make the preserve-vs-reset semantics that__deepcopy__already implies explicit. Keepmot._generate_logexposed (see Area 5) — do not bury it._thinking) — promote to publicmot.thinkingwith a property alias for_thinkingthrough one deprecation cycle.mot.generate_logas a public property, and add a publicmot.errorchannel so callers can detect a soft-failed thunk without reading private state.mot.generationper ModelOutputThunk field refactor: partition fields into sub-structures #793 precedent, not inmot.raw),ModelOutputThunkshould not be aCBlock#269 (MOT-not-CBlock — orthogonal, out of scope, called out so reviewers know).State of the world today
Reading
mellea/core/base.pydirectly. Contrasted against #909's April field list because the class has shifted since then._metastill carrieschat_response,oai_chat_response[_choice/_streamed],oai_streaming_usage,litellm_full_response,litellm_chat_response[_streamed],litellm_streaming_usage,hf_output, etc.chat.py:_parsestill branches on these keys.telemetry_spanfeat/enhanced-tracing,_meta["_telemetry_span"]is gone frombase.pyand all five backends; spans live on the OTel context owned byBackendTracingPlugin, andcancel_generation/astreamerror path fireGENERATION_ERRORhooks instead of poking_meta. Drop from #909 once #1181 merges.mot.generation(precedent for sub-structures), #942 addedcancel_generation()+_cancelled+ publiccancelledproperty (locked in the single-consumer queue contract), and #1181 will add_generation_idas another flat private attr._thinkingThe audit also surfaced things #909 didn't enumerate:
_generation_id(post-refactor(telemetry)!: move backend tracing onto plugin/hook pattern #1181) — Mellea-side hook correlation ID. Joins the area-3 field set.__deepcopy__already groups fields implicitly. It preserves_action,_context,_model_options,_generate_log,generation(originating-call data) and resets the in-flight machinery. Any area-3 grouping should mirror this distinction, not collapse it._generate_logis heavily relied on as private surface. 55 external accesses across 19 files (every backend, every sampling strategy,functional.py,session.py, multiple test files). It is not movable without a property alias, and Soft-fail errors on ModelOutputThunk are only reachable via a private attribute #1191 explicitly wants it more visible, not less.mot.cancelledexists butstreaming.py:437still readsmot._cancelled. See appendix._meta["usage"]reads inbudget_forcing_alg.pypredate themot.generation.usagemove. See appendix.Cross-cutting constraints
These shape every area below.
_metaisCBlocksurface, not just MOT surface. This proposal does not changeCBlock._meta; it only removes MOT's own writes into the dict. ReshapingCBlockis out of scope, andModelOutputThunkshould not be aCBlock#269 (which would moot the question by removing the inheritance) is also out of scope.mot.value,mot.parsed_repr,mot.tool_calls,mot.generation,mot.cancelled,mot.is_computed(), andmot._thinking,mot._meta["chat_response" | "oai_chat_response" | "usage"],mot._generate_logare all read by code outsidebase.pytoday. Any move needs property aliases through at least one minor release.cancel_generation(),mot._async_queuesingle-consumer semantics, andmot.cancelledare load-bearing forstream_with_chunking()(feat(stdlib): add stream_with_chunking() with per-chunk validation (#901) #942, merged May 2026). Area 3 cannot rename in a way that breaks those names without coordinated migration._genshape should leave room for parsed-repr stream buffers but should not block on feat(core): Phase 2 — MOT-owned streaming chunking via stream_parsed_repr (placeholder) #1013 — that issue could remain open indefinitely._generation_id— Area 3's grouping needs to know about it.DeprecationWarningare the right migration tool, but a long deprecation horizon isn't required.Related-issues coordination summary
_generation_idfrom this PR._genshape in Area 3 should accommodate Phase 2 streaming buffers. Land Area 3 first; #1013 benefits.mot.generation(e.g.generation.logprobs), notmot.raw. Per #793 precedent.mot.rawis unaffected (already a sibling of_meta, not nested under it); the rest of #909 is also unaffected.Sequencing
If this shape is approved, I'll convert #909 into an epic with sub-issues per area, fold in #1191 as Area 5, and land in this order:
_generation_idis in the Area 3 field set.generate_log+errorproperties). Smallest. Closes Soft-fail errors on ModelOutputThunk are only reachable via a private attribute #1191.mot.raw). No dependency on 3/4/5. Can ship after refactor(telemetry)!: move backend tracing onto plugin/hook pattern #1181 merges.mot.thinking). Independent. Same release as Area 1 is fine._call+_gengrouping) lands last. Largest internal rename. Original April rationale ("hold for feat: streaming validation — per-chunk requirement checking with early exit #891") is satisfied — Phase 1 is done. Recommended to land now, not wait for feat(core): Phase 2 — MOT-owned streaming chunking via stream_parsed_repr (placeholder) #1013 placeholder to elaborate; the_genshape is independently motivated by the existing copy-method maintenance pain.Per-area details (each with its open questions) and the bug appendix are in the comments below.
Beta Was this translation helpful? Give feedback.
All reactions