FIX: sklearn clone() returns empty terms list (#340) #474
Open
VanshKharb wants to merge 4 commits into
- Save original initialization params before `GAM.fit()` mutates them
- Override `get_params(deep=False)` to restore original params for `sklearn.clone()`
- Add `__sklearn_clone__` to `Term` and `TermList` to prevent empty list duck-typing
- Remove redundant `gam.set_params` calls in internal bootstrap/gridsearch
- Add regression tests for cloning fitted GAMs with custom terms
pankajbaid567 added a commit to pankajbaid567/pyGAM that referenced this pull request Mar 12, 2026
Fix `sklearn.clone()` returning empty terms list on fitted GAMs (#340)

Description
This PR resolves an issue where calling `sklearn.base.clone(gam)` on a fitted GAM estimator returns an otherwise identical estimator whose `terms` are missing (`gam.terms == ""`). This breaks scikit-learn workflows such as grid search and cross-validation, which rely on `clone()` passing the initialization arguments through unchanged.

Root cause: `GAM.fit()` mutates string configurations like `terms='auto'` into `TermList` objects in place. Because `sklearn.clone()` queries `get_params(deep=False)`, it received the mutated state rather than the original initialization parameters. It then also tried to clone the `TermList` as a nested estimator, because `TermList` inherits `Core.get_params()`, and that reconstruction destroyed the parameters.

Changes:
- `GAM.fit()` now stores `_init_distribution`, `_init_link`, `_init_callbacks`, and `_init_terms` before applying data-dependent validation.
- `get_params()` override: `GAM.get_params(deep=False)` now explicitly restores the original instantiation parameters so that `clone()` receives correct input. This keeps scikit-learn's requirements separate from pyGAM's internal `keep_best` `deep=True` routines.
- `__sklearn_clone__` is implemented on `Term` and `TermList` to explicitly force `copy.deepcopy` and prevent `sklearn.clone` from attempting to reconstruct them as empty nested estimators.
- Removed redundant `set_params()` calls in `_bootstrap_samples_of_smoothing` and `gridsearch` that broke in the same way.
- `test_sklearn_clone_preserves_terms` checks that cloning preserves terms across LinearGAM/LogisticGAM and customized parameters.
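
For reviewers who want to see the mechanism in isolation, here is a minimal pure-Python sketch of the failure and the two fixes. It does not import pyGAM or scikit-learn: `toy_clone`, `NaiveTermList`, `ToyTermList`, and `ToyGAM` are hypothetical stand-ins written for illustration, and `toy_clone` only approximates how `clone` rebuilds an object from `get_params(deep=False)` and defers to `__sklearn_clone__` when an object defines it. The actual diff may differ in detail.

```python
import copy


def toy_clone(obj):
    """Simplified stand-in for sklearn.base.clone (an assumption, not the real code)."""
    # An object may take over its own cloning through this hook.
    if hasattr(obj, "__sklearn_clone__"):
        return obj.__sklearn_clone__()
    # Anything without get_params is treated as plain data and deep-copied.
    if not hasattr(obj, "get_params"):
        return copy.deepcopy(obj)
    # Otherwise rebuild from the reported constructor parameters.
    params = obj.get_params(deep=False)
    return type(obj)(**{name: toy_clone(value) for name, value in params.items()})


class NaiveTermList:
    """Hypothetical term container that looks like an estimator to toy_clone."""

    def __init__(self, *terms):
        self.terms = list(terms)

    def get_params(self, deep=False):
        # In this toy the terms arrive as *args, so no keyword carries them.
        return {}


class ToyTermList(NaiveTermList):
    def __sklearn_clone__(self):
        # Force a deep copy instead of a rebuild from empty params.
        return copy.deepcopy(self)


class ToyGAM:
    def __init__(self, terms="auto"):
        self.terms = terms

    def fit(self, n_features):
        # Remember what the user passed before fit() replaces it.
        if not hasattr(self, "_init_terms"):
            self._init_terms = self.terms
        if self.terms == "auto":
            self.terms = ToyTermList(*[f"s({i})" for i in range(n_features)])
        return self

    def get_params(self, deep=False):
        # Report the constructor argument, not the fitted state.
        return {"terms": getattr(self, "_init_terms", self.terms)}


# Without the hook, the container is rebuilt from {} and loses its terms.
lost = toy_clone(NaiveTermList("s(0)", "f(1)"))
print(lost.terms)

# With saved init params, a fitted 'auto' model clones back to 'auto'.
fitted = ToyGAM(terms="auto").fit(n_features=2)
cloned = toy_clone(fitted)
print(cloned.terms)

# With the hook, custom terms survive cloning as an independent copy.
custom = ToyGAM(terms=ToyTermList("s(0)", "f(1)")).fit(n_features=2)
custom_clone = toy_clone(custom)
print(custom_clone.terms.terms)
```

The first call shows the bug class in miniature: an object that exposes `get_params()` but cannot describe its contents through keyword parameters comes back empty. The other two calls correspond to the `_init_terms` bookkeeping and the `__sklearn_clone__` hook described in the change list.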