Describe the issue
The internal helper function _random_ss_ix (which draws a uniform random sample from a list of indices) is duplicated across three files in the regression module:
skpro/regression/bootstrap.py (lines 208–215)
skpro/regression/enbpi.py (lines 343–350)
skpro/regression/ensemble/_bagging.py (lines 253–257)
The bootstrap.py and enbpi.py versions are identical and accept a random_state parameter, while the _bagging.py version is a simplified variant: it has no random_state parameter and calls np.random.choice directly, so it always draws from NumPy's global random state.
This duplication increases maintenance burden — any bug fix or enhancement must be applied in all three places.
Suggested fix
Extract a single canonical _random_ss_ix function into skpro/utils/sampling.py (or another appropriate shared utility module), and update all three files to import from there.
The consolidated version should support the random_state parameter (as in bootstrap.py / enbpi.py), so that the sampling in _bagging.py can also use a local random state instead of the global one.
Files affected
skpro/regression/bootstrap.py — delete local _random_ss_ix, import from utils
skpro/regression/enbpi.py — delete local _random_ss_ix, import from utils
skpro/regression/ensemble/_bagging.py — delete local _random_ss_ix, import from utils
skpro/utils/sampling.py — [NEW] shared utility module
Expected impact
- Reduced code duplication
- Easier maintenance — single point of truth for the sampling logic
- Consistent random state handling across all three estimators
I am happy to submit a PR for this.