You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the v2 stat framework, @stat(default=...) is a binary switch: on exception you get either a StatError (no default) or a silent Ok(default) (default set). There is no "use the fallback for display and surface the error" path.
This means any @stat that wants a display-safe fallback has to choose between (a) failing the cell entirely or (b) hiding real failures from the user.
Concrete case: the ibis histogram (#686) declares default=[] so a backend that rejects the GROUP BY still renders a blank histogram cell instead of breaking the row. With today's framework, that GROUP BY failure is invisible — no error in errs, no log, nothing.
resolve_accumulator builds the errors list by scanning the accumulator for Err entries, so an Ok(default) write there is unrecoverable downstream.
Note: this is independent of #686's batch path — IbisStatPipeline Phase 1 always emits Err and ignores default, so the same @stat with default=[] behaves differently depending on whether it landed in the batch phase or the per-column phase. Worth fixing the inconsistency at the same time.
Desired behavior
accumulator[name] = Ok(default) so dependents and the rendered table see the fallback.
The exception is also surfaced in the errors list returned from process_df / process_table.
errs (v1-compat dict) gets the entry too, so existing consumers see it.
Sketch of fix
Two options:
A. Side-channel error list (small). Pass a mutable error list into _execute_stat_func; on default-fallback, append a StatError there while still writing Ok(default) to the accumulator. process_column / process_table merge it with errors derived from accumulator Err entries.
B. New StatResult variant: OkWithWarning(value, error) (cleaner, bigger). Every isinstance(x, Ok) site becomes isinstance(x, (Ok, OkWithWarning)). resolve_accumulator lifts the warning into the errors list while the value flows to dependents. Better long-term model; touches more code.
A is ~10 lines and contained. B is the right shape if we expect more "non-fatal" stat outcomes.
Problem
In the v2 stat framework,
@stat(default=...)is a binary switch: on exception you get either aStatError(no default) or a silentOk(default)(default set). There is no "use the fallback for display and surface the error" path.This means any
@statthat wants a display-safe fallback has to choose between (a) failing the cell entirely or (b) hiding real failures from the user.Concrete case: the ibis histogram (#686) declares
default=[]so a backend that rejects the GROUP BY still renders a blank histogram cell instead of breaking the row. With today's framework, that GROUP BY failure is invisible — no error inerrs, no log, nothing.Where
buckaroo/pluggable_analysis_framework/stat_pipeline.py:221-233:resolve_accumulatorbuilds the errors list by scanning the accumulator forErrentries, so anOk(default)write there is unrecoverable downstream.Note: this is independent of #686's batch path —
IbisStatPipelinePhase 1 always emitsErrand ignoresdefault, so the same@statwithdefault=[]behaves differently depending on whether it landed in the batch phase or the per-column phase. Worth fixing the inconsistency at the same time.Desired behavior
accumulator[name] = Ok(default)so dependents and the rendered table see the fallback.errorslist returned fromprocess_df/process_table.errs(v1-compat dict) gets the entry too, so existing consumers see it.Sketch of fix
Two options:
A. Side-channel error list (small). Pass a mutable error list into
_execute_stat_func; on default-fallback, append aStatErrorthere while still writingOk(default)to the accumulator.process_column/process_tablemerge it with errors derived from accumulatorErrentries.B. New
StatResultvariant:OkWithWarning(value, error)(cleaner, bigger). Everyisinstance(x, Ok)site becomesisinstance(x, (Ok, OkWithWarning)).resolve_accumulatorlifts the warning into the errors list while the value flows to dependents. Better long-term model; touches more code.A is ~10 lines and contained. B is the right shape if we expect more "non-fatal" stat outcomes.
Out of scope for this issue
defaultshould be allowed at all on stats that have downstream dependents (separate question about how upstream defaults propagate).