Add a DataFrame front door to populace-fit with explicit-weights contract#290
Open
MaxGhenis wants to merge 1 commit into
…ract
RegimeGatedQRF.fit (and the ConditionalModel protocol) now accept a plain pandas DataFrame as well as a Frame, for standalone use outside a populace stack. A bare DataFrame has no typed weights, so the operator's no-silent-unweighted-fit rule inverts from "safe default" to "no default": the caller must state weights explicitly (a weight column name, a 1-D weight vector, or weights="none"), and omitting them raises instead of silently fitting unweighted. Past weight resolution the two paths are the same model, pinned by a bit-for-bit parity test against the Frame path.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
What
RegimeGatedQRF.fit (and the ConditionalModel protocol, plus the populace.fit.fit convenience) now accept a plain pandas DataFrame as well as a Frame, mirroring the union predict already takes. This is the standalone front door for using the canonical imputer outside a populace stack.

The weight contract
A bare DataFrame has no typed weights, so the operator's defining rule (no silent unweighted fit) cannot ride on the "design" default. On the DataFrame path the rule inverts from safe default to no default:
- weights=<column name>: a numeric weight column of the DataFrame (refused if it is also a predictor or target),
- weights=<1-D vector>: an array/Series read positionally and validated (length, finiteness, non-negativity, positive mass),
- weights="none": the only unweighted path, stated deliberately.
Omitting weights (or passing a typed-kind spelling like "design") raises with an actionable message. This makes the unweighted-training failure mode that produced the eCPS point-mass "landmines" unrepresentable at the API: you cannot forget weights, you can only decline them in writing.
resolve_dataframe_fit_weights / dataframe_fit_columns live in model.py beside resolve_fit_weights / predictors_targets_entity, keeping the weight rule enforced in one module for both front doors.

Behavior
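To make the DataFrame-path behavior concrete, here is a minimal sketch of the weight contract described above. It is not the code in model.py: the function name resolve_weights_sketch, its signature, the error messages, and the choice to treat None as "omitted" are all assumptions made for illustration, and the real resolve_dataframe_fit_weights may differ in every one of them.

```python
import numpy as np
import pandas as pd


def resolve_weights_sketch(df, weights=None, predictors=(), targets=()):
    """Hypothetical stand-in for the DataFrame-path weight rule.

    Returns a 1-D float array, or None for a deliberate unweighted fit.
    """
    # Assumption: None is treated the same as omitting the argument.
    if weights is None:
        raise ValueError(
            "weights is required for a DataFrame: pass a column name, "
            'a 1-D vector, or weights="none" to fit unweighted on purpose'
        )
    if isinstance(weights, str):
        if weights == "none":  # reserved spelling: the only unweighted path
            return None
        if weights not in df.columns:
            # Typed-kind spellings such as "design" land here unless the
            # DataFrame happens to have a column with that name.
            raise ValueError(f"{weights!r} is not a column of the DataFrame")
        if weights in predictors or weights in targets:
            raise ValueError(f"{weights!r} is also a predictor or target")
        if not pd.api.types.is_numeric_dtype(df[weights]):
            raise ValueError(f"weight column {weights!r} is not numeric")
        values = df[weights].to_numpy(dtype=float)
    else:
        # Arrays and Series are read positionally; a Series index is ignored.
        values = np.asarray(weights, dtype=float)
        if values.ndim != 1 or len(values) != len(df):
            raise ValueError("weight vector must be 1-D with one entry per row")
    if not np.isfinite(values).all():
        raise ValueError("weights must be finite")
    if (values < 0).any():
        raise ValueError("weights must be non-negative")
    if values.sum() <= 0:
        raise ValueError("weights must have positive total mass")
    return values


# Illustrative data; the column names are made up for this example.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [0.0, 1.0, 0.0], "w": [1.0, 2.0, 0.5]})
by_column = resolve_weights_sketch(df, "w", predictors=["x"], targets=["y"])
unweighted = resolve_weights_sketch(df, "none")  # declined in writing
```

The point of the sketch is the shape of the rule: there is no branch that quietly returns uniform weights, so the only way to an unweighted fit is the explicit "none" spelling.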
"none".entity=None;predict(Frame)on it raises with guidance (predicting for DataFrames works as before, index preserved).fit's first parameter is renamedframe→frame_or_df; the repo has no keyword callers).Tests
18 new tests in test_dataframe_fit.py: the explicit-weights requirement, typed-kind refusal, column/vector/Series equivalence, Frame↔DataFrame parity, "none" reservedness, weight and column validation matrix, entity-less predict behavior, and the convenience wrapper. Full workspace suite passes locally (3.14).

Why now
First of three steps making populace-fit reusable outside populace (per the imputation-paper plan): DataFrame front door → PyPI publication of populace-frame / populace-fit → standalone quickstart. The paper's software section will document this API as the external-use path.

🤖 Generated with Claude Code