sklearn
In Development
The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.
- decomposition.SparseCoder, decomposition.DictionaryLearning, and decomposition.MiniBatchDictionaryLearning
- decomposition.SparseCoder with algorithm='lasso_lars'
- decomposition.SparsePCA where normalize_components has no effect due to deprecation.
- linear_model.Ridge when X is sparse.
Details are listed in the changelog below.
(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)
- Fixed a bug that made calibration.CalibratedClassifierCV fail when given a sample_weight parameter of type list (in the case where sample_weights are not supported by the wrapped estimator). 13575 by William de Vazelhes <wdevazelhes>.
- datasets.fetch_openml now supports heterogeneous data using pandas by setting as_frame=True. 13902 by Thomas Fan.
- The parameter return_X_y was added to datasets.fetch_20newsgroups and datasets.fetch_olivetti_faces. 14259 by Sourav Singh <souravsingh>.
- decomposition.sparse_encode() now passes the max_iter to the underlying LassoLars when algorithm='lasso_lars'. 12650 by Adrin Jalali.
- decomposition.dict_learning() and decomposition.dict_learning_online() now accept method_max_iter and pass it to sparse_encode. 12650 by Adrin Jalali.
- decomposition.SparseCoder, decomposition.DictionaryLearning, and decomposition.MiniBatchDictionaryLearning now take a transform_max_iter parameter and pass it to either decomposition.dict_learning() or decomposition.sparse_encode(). 12650 by Adrin Jalali.
- decomposition.IncrementalPCA now accepts sparse matrices as input, converting them to dense in batches, thereby avoiding the need to store the entire dense matrix at once. 13960 by Scott Gigante <scottgigante>.
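As an illustration of the IncrementalPCA entry above, the following minimal sketch fits on a SciPy sparse matrix. The data shape, density, batch_size, and n_components are arbitrary choices made for the example, not values from the changelog.

```python
from scipy import sparse
from sklearn.decomposition import IncrementalPCA

# Synthetic sparse input: 200 samples, 20 features, roughly 10% non-zero entries.
X_sparse = sparse.random(200, 20, density=0.1, format="csr", random_state=0)

# The estimator processes the rows in chunks of batch_size, so only one
# chunk at a time needs to be held in memory as a dense array.
ipca = IncrementalPCA(n_components=5, batch_size=50)
ipca.fit(X_sparse)

# transform also accepts the sparse matrix and returns a dense array.
X_reduced = ipca.transform(X_sparse)
print(X_reduced.shape)
```

A smaller batch_size lowers peak memory use at the cost of more partial fits.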
- ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor have an additional parameter called warm_start that enables warm starting. 14012 by Johann Faouzi <johannfaouzi>.
- ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor now bin the training and validation data separately to avoid any data leak. 13933 by Nicolas Hug.
- ensemble.VotingClassifier.predict_proba will no longer be present when voting='hard'. 14287 by Thomas Fan.
- In ensemble.HistGradientBoostingClassifier, the training loss or score is now monitored on a class-wise stratified subsample to preserve the class balance of the original training set. 14194 by Johann Faouzi <johannfaouzi>.
- ensemble.AdaBoostClassifier computes probabilities based on the decision function as in the literature. Thus, predict and predict_proba give consistent results. 14114 by Guillaume Lemaitre <glemaitre>.
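The VotingClassifier change above can be checked with hasattr. This is a small sketch on synthetic data; the choice of base estimators is arbitrary and only serves the illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, n_features=5, random_state=0)
estimators = [
    ("lr", LogisticRegression()),
    ("tree", DecisionTreeClassifier(random_state=0)),
]

hard = VotingClassifier(estimators, voting="hard").fit(X, y)
soft = VotingClassifier(estimators, voting="soft").fit(X, y)

# With hard voting there are no averaged probabilities to expose,
# so the attribute lookup itself fails and hasattr returns False.
has_proba_hard = hasattr(hard, "predict_proba")
has_proba_soft = hasattr(soft, "predict_proba")
print(has_proba_hard, has_proba_soft)
```

Code that needs probabilities from a voting ensemble should therefore use voting='soft' or guard the call with a hasattr check.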
- linear_model.BayesianRidge now accepts hyperparameters alpha_init and lambda_init which can be used to set the initial value of the maximization procedure in fit. 13618 by Yoshihiro Uchida <c56pony>.
- linear_model.Ridge now correctly fits an intercept when X is sparse, solver="auto" and fit_intercept=True, because the default solver in this configuration has changed to sparse_cg, which can fit an intercept with sparse data. 13995 by Jérôme Dockès <jeromedockes>.
- The 'liblinear' logistic regression solver is now faster and requires less memory. 14108, 14170 by Alex Henrie <alexhenrie>.
- linear_model.Ridge with solver='sag' now accepts F-ordered arrays and makes a conversion instead of failing. 14458 by Guillaume Lemaitre <glemaitre>.
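To illustrate the Ridge intercept fix above, the sketch below fits the same noise-free data once as a sparse matrix and once as a dense array and compares the results. The data, the true coefficients, and the alpha and tol values are assumptions chosen for the example.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X_dense = rng.rand(200, 5)
X_dense[X_dense < 0.5] = 0.0  # zero out about half of the entries

true_coef = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
true_intercept = 4.0
y = X_dense @ true_coef + true_intercept  # noise-free target

X_sparse = sparse.csr_matrix(X_dense)

# solver="auto" with sparse X and fit_intercept=True; a tight tol is passed
# so that the iterative solver result is directly comparable to the dense fit.
ridge_sparse = Ridge(alpha=1e-3, fit_intercept=True, solver="auto", tol=1e-8)
ridge_sparse.fit(X_sparse, y)

ridge_dense = Ridge(alpha=1e-3, fit_intercept=True, solver="auto")
ridge_dense.fit(X_dense, y)

print(ridge_sparse.intercept_, ridge_dense.intercept_)
```

Both fits should recover an intercept close to the value used to generate y, which is the behavior the entry describes for sparse input.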
- Added multiclass support to metrics.roc_auc_score. 12789 by Kathy Chen <kathyxchen>, Mohamed Maskani <maskani-moh>, and Thomas Fan <thomasjpfan>.
- Add metrics.mean_tweedie_deviance measuring the Tweedie deviance for a power parameter p. Also add mean Poisson deviance metrics.mean_poisson_deviance and mean Gamma deviance metrics.mean_gamma_deviance that are special cases of the Tweedie deviance for p=1 and p=2 respectively. 13938 by Christian Lorentzen <lorentzenchr> and Roman Yurchak.
- The parameter beta in metrics.fbeta_score is updated to accept the values zero and float('+inf'). 13231 by Dong-hee Na <corona10>.
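The special-case relationship between the deviance metrics above can be verified numerically. In this sketch the keyword name power is an assumption taken from the released scikit-learn API (the entry above calls the parameter p), and the sample values are made up.

```python
import numpy as np
from sklearn.metrics import (
    mean_gamma_deviance,
    mean_poisson_deviance,
    mean_tweedie_deviance,
)

# Strictly positive targets and predictions, as required by the Gamma deviance.
y_true = np.array([2.0, 0.5, 1.0, 4.0])
y_pred = np.array([0.5, 0.5, 2.0, 2.0])

poisson = mean_poisson_deviance(y_true, y_pred)
tweedie_p1 = mean_tweedie_deviance(y_true, y_pred, power=1)

gamma = mean_gamma_deviance(y_true, y_pred)
tweedie_p2 = mean_tweedie_deviance(y_true, y_pred, power=2)

print(np.isclose(poisson, tweedie_p1), np.isclose(gamma, tweedie_p2))
```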
sklearn.model_selection
..................
- model_selection.learning_curve now accepts parameter return_times which can be used to retrieve computation times in order to plot model scalability (see learning_curve example). 13938 by Hadrien Reboul <H4dr1en>.
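A minimal sketch of return_times, assuming a synthetic classification dataset and an arbitrary estimator: with return_times=True, learning_curve returns two extra arrays holding the fit and score times per training size and CV fold.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

train_sizes, train_scores, test_scores, fit_times, score_times = learning_curve(
    LogisticRegression(max_iter=1000),
    X,
    y,
    train_sizes=[0.2, 0.5, 1.0],
    cv=3,
    return_times=True,
)

# One row per training size, one column per CV fold.
mean_fit_times = fit_times.mean(axis=1)
print(train_sizes, mean_fit_times)
```

Plotting mean_fit_times against train_sizes gives a rough view of how fitting cost grows with the number of samples.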
- pipeline.Pipeline now supports score_samples if the final estimator does. 13806 by Anaël Beaugnon <ab-anssi>.
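As a sketch of the Pipeline entry above, the example below ends a pipeline with IsolationForest, an estimator that provides score_samples. The choice of final estimator and the random data are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))

pipe = make_pipeline(StandardScaler(), IsolationForest(random_state=0))
pipe.fit(X)

# The pipeline applies the scaler, then delegates to the final step's score_samples.
scores = pipe.score_samples(X)
print(scores.shape)
```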
- svm.SVC and svm.NuSVC now accept a break_ties parameter. This parameter results in predict breaking the ties according to the confidence values of decision_function, if decision_function_shape='ovr', and the number of target classes > 2. 12557 by Adrin Jalali.
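The effect of break_ties can be sketched as follows, assuming a three-class synthetic dataset and a linear kernel (both arbitrary choices): with break_ties=True and decision_function_shape='ovr', predictions follow the largest decision_function value.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=150, centers=3, random_state=0)

clf = SVC(kernel="linear", decision_function_shape="ovr", break_ties=True)
clf.fit(X, y)

decision = clf.decision_function(X)  # shape (n_samples, n_classes)
pred = clf.predict(X)

# With break_ties=True, the predicted class is the one with the highest confidence.
expected = clf.classes_[np.argmax(decision, axis=1)]
print(np.array_equal(pred, expected))
```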
- Avoid unnecessary data copy when fitting preprocessors preprocessing.StandardScaler, preprocessing.MinMaxScaler, preprocessing.MaxAbsScaler, preprocessing.RobustScaler and preprocessing.QuantileTransformer, which results in a slight performance improvement. 13987 by Roman Yurchak.
- cluster.SpectralClustering now accepts a n_components parameter. This parameter extends SpectralClustering class functionality to match spectral_clustering. 13726 by Shuzhe Xiao <fdas3213>.
- Fixed a bug where VarianceThreshold with threshold=0 did not remove constant features due to numerical instability, by using range rather than variance in this case. 13704 by Roddy MacSween <rlms>.
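The VarianceThreshold fix above can be illustrated with constant columns whose values are not exactly representable in floating point. The column values below are arbitrary and chosen only for the sketch.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.RandomState(0)
X = np.column_stack([
    np.full(1000, 0.1),        # constant column
    rng.rand(1000),            # varying column
    np.full(1000, 1e5 + 0.3),  # constant column with a large magnitude
])

selector = VarianceThreshold(threshold=0)
X_reduced = selector.fit_transform(X)
support = selector.get_support()
print(X_reduced.shape, support)
```

Because the range (maximum minus minimum) of a constant column is exactly zero, it is a more reliable test for constancy than a computed variance that may carry rounding error.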
- utils.safe_indexing accepts an axis parameter to index array-like across rows and columns. The column indexing can be done on NumPy array, SciPy sparse matrix, and Pandas DataFrame. 14035 by Guillaume Lemaitre <glemaitre>.
- Add max_fun parameter in neural_network.BaseMultilayerPerceptron, neural_network.MLPRegressor, and neural_network.MLPClassifier to give control over the maximum number of function evaluations performed when the tol improvement is not met. 9274 by Daniel Perry <daniel-perry>.
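A minimal sketch of max_fun, assuming the 'lbfgs' solver (the solver for which max_fun is used) and arbitrary values for the network size and limits:

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=100, n_features=4, random_state=0)

# max_iter caps the number of solver iterations, while max_fun caps the number
# of loss function evaluations; a single iteration may need several evaluations.
mlp = MLPRegressor(
    hidden_layer_sizes=(8,),
    solver="lbfgs",
    max_iter=200,
    max_fun=500,
    random_state=0,
)
mlp.fit(X, y)
predictions = mlp.predict(X)
print(predictions.shape)
```

If either limit is reached before the tol criterion is satisfied, scikit-learn may emit a convergence warning; the fitted model is still usable.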
- Replace manual checks with check_is_fitted. Errors thrown when using non-fitted estimators are now more uniform. 13013 by Agamemnon Krasoulis <agamemnonc>.
- Port lobpcg from SciPy, which implements some bug fixes that are only available in SciPy 1.3+. 14195 by Guillaume Lemaitre <glemaitre>.
These changes mostly affect library developers.
- Estimators are now expected to raise a NotFittedError if predict or transform is called before fit; previously an AttributeError or ValueError was acceptable. 13013 by Agamemnon Krasoulis <agamemnonc>.
- Binary-only classifiers are now supported in estimator checks. Such classifiers need to have the binary_only=True estimator tag. 13875 by Trevor Stephens.
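For library developers, the NotFittedError expectation above can be met by calling check_is_fitted at the start of predict or transform. The MeanPredictor class below is a hypothetical toy estimator written for this sketch, not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


class MeanPredictor(BaseEstimator):
    """Hypothetical toy estimator that always predicts the training mean."""

    def fit(self, X, y):
        # Attributes with a trailing underscore mark the estimator as fitted.
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        # Raises NotFittedError when no fitted attribute is present.
        check_is_fitted(self)
        return np.full(len(X), self.mean_)


try:
    MeanPredictor().predict([[0.0]])
    raised = False
except NotFittedError:
    raised = True

fitted_pred = MeanPredictor().fit([[0.0], [1.0]], [1.0, 3.0]).predict([[5.0]])
print(raised, fitted_pred)
```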