[MRG] Added n_iter_ parameters across all iterative solvers #3360

Closed
MechCoder wants to merge 5 commits into scikit-learn:master from MechCoder:return_n_iter_

MechCoder
Member

Added an n_iter_ attribute to all iterative solvers that have a max_iter parameter.

Models that have the max_iter parameter:

  • AffinityPropagation
  • CCA
  • DictionaryLearning
  • ElasticNet
  • ElasticNetCV
  • FactorAnalysis
  • FastICA
  • GraphLasso
  • GraphLassoCV
  • KMeans
  • LabelPropagation
  • LabelSpreading
  • LarsCV
  • Lasso
  • LassoCV
  • LassoLars
  • LassoLarsCV
  • LassoLarsIC
  • MDS
  • MiniBatchKMeans
  • MultiTaskElasticNet
  • MultiTaskElasticNetCV
  • MultiTaskLasso
  • MultiTaskLassoCV
  • NMF
  • OrthogonalMatchingPursuitCV (Not exactly max_iter but number of active features)
  • PLSCanonical
  • PLSRegression
  • ProjectedGradientNMF
  • RandomizedLasso
  • SparsePCA

Dependent on external solvers such as ARPACK, so the iteration count cannot be accessed:

  • Isomap
  • KernelPCA
  • LocallyLinearEmbedding
  • Ridge
  • RidgeClassifier

Dependent on LibSVM

  • NuSVC
  • NuSVR
  • OneClassSVM
  • SVC
  • SVR

@@ -224,6 +224,9 @@ def k_means(X, n_clusters, init='k-means++', precompute_distances=True,
The final value of the inertia criterion (sum of squared distances to
the closest centroid for all observations in the training set).

iters: int

I would rather call it n_iter to be consistent with the naming convention.

@ogrisel
Member

ogrisel commented Jul 11, 2014

Please add a check somewhere appropriate in test_common.py: after a call to fit, if hasattr(model, 'n_iter_'), then assert_greater(model.n_iter_, 1).
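A rough sketch of what such a common check might look like is given below. It deliberately avoids real scikit-learn estimators: `ToyIterativeEstimator`, `ToyClosedFormEstimator`, and `check_n_iter` are hypothetical names invented for illustration, not the helpers that ended up in test_common.py. The threshold follows the comment above literally (strictly greater than 1).

```python
class ToyIterativeEstimator:
    """Hypothetical estimator: approaches the mean of X by damped updates."""

    def __init__(self, max_iter=100, tol=1e-8):
        self.max_iter = max_iter
        self.tol = tol

    def fit(self, X, y=None):
        target = sum(X) / len(X)
        estimate = 0.0
        # Assumes max_iter >= 1 so that the loop index i is always defined.
        for i in range(self.max_iter):
            update = 0.5 * (target - estimate)
            estimate += update
            if abs(update) < self.tol:
                break
        self.mean_ = estimate
        self.n_iter_ = i + 1  # number of iterations actually run
        return self


class ToyClosedFormEstimator:
    """Hypothetical estimator with no iterative solver, so no n_iter_."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self


def check_n_iter(estimator, X):
    """Fit the estimator; if it exposes n_iter_, require it to exceed 1.

    Returns True when the attribute was present and checked, False otherwise.
    """
    estimator.fit(X)
    if hasattr(estimator, "n_iter_"):
        assert estimator.n_iter_ > 1, "n_iter_ should be greater than 1 after fit"
        return True
    return False
```

Estimators without an iterative solver simply pass through unchecked, which matches the `hasattr` guard requested in the comment.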

@MechCoder
Member Author

@ogrisel I'm working on it, but it would be better to complete the previous PR first, so that I can rebase on top of it. WDYT?

@ogrisel
Member

ogrisel commented Jul 11, 2014

Yes. Feel free to do whatever is easiest for you.

@ogrisel changed the title from "ENH: Added n_iter_ parameters across all iterative solvers" to "[WIP] Added n_iter_ parameters across all iterative solvers" on Jul 12, 2014
@@ -60,6 +60,9 @@ def affinity_propagation(S, preference=None, convergence_iter=15, max_iter=200,
labels : array, shape (n_samples,)
cluster labels for each point

iters: int
number of iterations run.

We cannot change the output of public functions like this :(

@fabianp
Member

fabianp commented Jul 14, 2014

I agree with Alex that maintaining API compatibility is extremely important. To maintain this compatibility and still return this parameter, you would need to add a keyword to public functions, such as return_n_iter={True, False} (similar to the return_rank keyword in scipy.linalg.pinv).

For the private functions my opinion is that it's OK to break compatibility and just return the number of iterations as you are currently doing.
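As a minimal sketch of the keyword approach suggested above: the function below keeps its original return value by default and only appends the iteration count when the caller opts in. `toy_solver` and its update rule are hypothetical stand-ins, not code from this PR; only the `return_n_iter` keyword name comes from the comment.

```python
def toy_solver(values, max_iter=200, tol=1e-6, return_n_iter=False):
    """Hypothetical public function: iteratively approaches the mean of values.

    With return_n_iter=False (the default) the return value is unchanged from
    the "old" API, so existing callers keep working.
    """
    target = sum(values) / len(values)
    estimate = 0.0
    n_iter = 0
    for i in range(max_iter):
        step = 0.5 * (target - estimate)
        estimate += step
        n_iter = i + 1  # iterations completed so far
        if abs(step) < tol:
            break

    if return_n_iter:
        # Opt-in: return the extra value as an additional tuple element.
        return estimate, n_iter
    return estimate


# Existing call sites are unaffected:
estimate = toy_solver([1.0, 3.0])

# New call sites can ask for the iteration count explicitly:
estimate, n_iter = toy_solver([1.0, 3.0], return_n_iter=True)
```

The default of `False` is what preserves backward compatibility; flipping it would change the output of every existing call.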

@@ -351,6 +357,9 @@ def _kmeans_single(X, n_clusters, x_squared_norms, max_iter=300,
inertia: float
The final value of the inertia criterion (sum of squared distances to
the closest centroid for all observations in the training set).

iters: int

space before : in docstrings.

@vene
Member

vene commented Jul 16, 2014

This needs careful checking because sometimes it has to be n_iter + 1 and sometimes not.

@GaelVaroquaux
Member

What's the status of this PR? I have the feeling that it has been superseded by another, but I am lost with everything that's happening.

@MechCoder
Member Author

I shall complete work on this as soon as the Log Reg CV PR is over.

@MechCoder
Member Author

@vene @ogrisel I have updated the list of models accepting the max_iter param in the PR description. I think I have finished most. I shall tick them off as soon as I verify.

@coveralls

Coverage Status

Coverage decreased (-0.02%) when pulling 820b018 on MechCoder:return_n_iter_ into 5517bad on scikit-learn:master.

@coveralls

Coverage Status

Coverage decreased (-0.02%) when pulling 6a3922c on MechCoder:return_n_iter_ into 5517bad on scikit-learn:master.

@vene
Member

vene commented Jul 20, 2014

OrthogonalMatchingPursuitCV (Not exactly max_iter but number of active features)

Technically that's also the number of iterations in the solver. True, it's upper-bounded by n_features, but the same holds for Lars.

@@ -60,6 +64,9 @@ def affinity_propagation(S, preference=None, convergence_iter=15, max_iter=200,
labels : array, shape (n_samples,)
cluster labels for each point

n_iter : int

This is only conditionally returned, which should be explicitly specified.

@coveralls

Coverage Status

Coverage decreased (-0.01%) when pulling f9323b2 on MechCoder:return_n_iter_ into 5517bad on scikit-learn:master.

@coveralls

Coverage Status

Coverage decreased (-0.01%) when pulling 7b730c2 on MechCoder:return_n_iter_ into 5517bad on scikit-learn:master.

@MechCoder
Member Author

@vene @ogrisel @agramfort I believe the PR is no longer a WIP. Please review :)

@MechCoder changed the title from "[WIP] Added n_iter_ parameters across all iterative solvers" to "[MRG] Added n_iter_ parameters across all iterative solvers" on Jul 20, 2014
@coveralls

Coverage Status

Coverage decreased (-0.01%) when pulling 23d1f4d on MechCoder:return_n_iter_ into 5517bad on scikit-learn:master.

@MechCoder
Member Author

@arjoly Could you please verify that the new test in the test_common.py file is right?

@@ -368,7 +385,7 @@ def _kmeans_single(X, n_clusters, x_squared_norms, max_iter=300,
distances = np.zeros(shape=(X.shape[0],), dtype=np.float64)

# iterations
- for i in range(max_iter):
+ for n_iter in range(max_iter):

I would rather leave i as the iteration index variable and return i + 1 instead. The docstring can keep the n_iter name for the return value.
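To illustrate the suggestion, here is a small sketch with a hypothetical `halve_until_small` function (not the k-means code under review). The zero-based loop index stays named `i`, and the function returns `i + 1` as the number of iterations run, whether the loop exits early or exhausts `max_iter`. It assumes `max_iter >= 1`, since otherwise `i` would never be bound.

```python
def halve_until_small(value, tol=1e-3, max_iter=300):
    """Hypothetical iterative routine: halve value until it drops below tol.

    Returns
    -------
    value : float
        The final value.
    n_iter : int
        Number of iterations run.
    """
    for i in range(max_iter):
        value *= 0.5
        if value < tol:
            break
    # i is a zero-based index, so the iteration count is i + 1 both when
    # the loop breaks early and when it runs through all max_iter steps.
    return value, i + 1


print(halve_until_small(1.0))              # (0.0009765625, 10)
print(halve_until_small(1.0, max_iter=5))  # (0.03125, 5)
```

Counters that are incremented inside a `while` loop already hold the count itself, which is one way the "sometimes n_iter + 1, sometimes not" distinction raised earlier in the thread can arise.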

@MechCoder
Member Author

@ogrisel thanks for the reviews, do you have any more comments?

    Estimator = estimator(alpha=0.)
else:
    Estimator = estimator()
if hasattr(Estimator, "max_iter"):

Estimator is an instance and should not start with a capital E. estimator is the class and should be the one called Estimator.

if name == 'AffinityPropagation':
    Estimator.fit(X)
else:
    Estimator.fit(X, y_)

Estimator -> estimator

@agramfort
Member

Besides that, LGTM, as Travis is happy.

if name == 'LassoLars':
    Estimator = estimator(alpha=0.)
else:
    Estimator = estimator()

Estimator()?

if name in (['Ridge', 'SVR', 'NuSVR', 'NuSVC',
             'RidgeClassifier', 'SVC', 'RandomizedLasso']):
    continue

Member Author

@arjoly I looked into Ridge, but only one of its three solvers returns an iteration count, so I left it as is for consistency.

@MechCoder
Member Author

@arjoly @agramfort @ogrisel Done. Any more comments?

@coveralls

Coverage Status

Coverage decreased (-0.01%) when pulling d487873 on MechCoder:return_n_iter_ into 7a7fca8 on scikit-learn:master.

@agramfort
Member

merged by rebase. Thanks @MechCoder !

@agramfort closed this on Jul 22, 2014
@MechCoder deleted the return_n_iter_ branch on July 22, 2014 13:14
@MechCoder
Member Author

Do you think this needs a whats_new entry, to make users know that this is available?

@ogrisel
Member

ogrisel commented Dec 5, 2014

Do you think this needs a whats_new entry, to make users know that this is available?

Yes. It does not hurt.

@MechCoder
Member Author

Done in a894b2e
