[MRG+2] #7908 Addressed issue in first iteration of RANSAC regression #7914
What does this implement/fix? Explain your changes.
On the first iteration of RANSAC regression, if no inliers are found, an error is raised and fitting stops. Ideally the procedure would skip that iteration and move on to the next one, where a different random sample could produce valid inliers.
Generally this error is produced when
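To make the proposed behaviour concrete, here is a minimal sketch (not the code in this patch) of a RANSAC-style loop that skips a trial with no inliers instead of raising. The toy model (the mean of two sampled values) and the names `ransac_constant` and `n_skipped` are made up for illustration; only the skip-and-continue idea comes from the description above.

```python
import random


def ransac_constant(ys, threshold, max_trials=50, seed=0):
    """Toy RANSAC on 1-D data: the 'model' is the mean of two sampled values."""
    rng = random.Random(seed)
    best_model, best_inliers, n_skipped = None, [], 0
    for _ in range(max_trials):
        sample = rng.sample(ys, 2)
        model = sum(sample) / 2
        inliers = [y for y in ys if abs(y - model) < threshold]
        if not inliers:
            # Proposed behaviour: skip this trial and draw a new sample
            # instead of raising immediately.
            n_skipped += 1
            continue
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    if best_model is None:
        # Only fail once every trial has come back empty.
        raise ValueError("no trial produced any inliers; consider a larger threshold")
    return best_model, best_inliers, n_skipped
```

With data such as `[0.0] * 8 + [100.0] * 2` and `threshold=1.0`, a sample that mixes one value from each group gives a model of 50 and no inliers, so that trial is skipped rather than aborting the whole fit.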
Any other comments?
I am not certain this is the right fix. This still fails if a 0-inlier sample is drawn at any point after the first >0-inlier sample is drawn.
We could introduce a parameter to control how many 0-inlier samples are allowed through, or just get rid of this control (the main risk being lots of time wasted).
A couple of more cosmetic things we can do to improve this behaviour:
- If we continue to raise an error when no inliers are found, we should report the minimum residual found (and perhaps other summary statistics of the residual distribution) as well as the current threshold, so that the user has a means to tune the parameter.
- Introduce a verbose option that reports things like the median residual, number of inliers, etc., at each iteration.
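As a rough illustration of the diagnostics suggested in the two points above, the sketch below summarises the absolute residuals of one trial next to the current threshold. The helper names `residual_summary` and `describe` are hypothetical and not part of scikit-learn; the same dictionary could feed either an error message or per-iteration verbose output.

```python
import statistics


def residual_summary(residuals, threshold):
    """Summarise absolute residuals so the user can judge the threshold."""
    abs_res = [abs(r) for r in residuals]
    return {
        "threshold": threshold,
        # Count residuals strictly below the threshold as inliers.
        "n_inliers": sum(r < threshold for r in abs_res),
        "min": min(abs_res),
        "median": statistics.median(abs_res),
        "max": max(abs_res),
    }


def describe(summary):
    """Format the summary as a single line for a message or a log."""
    return (
        "n_inliers={n_inliers}, threshold={threshold:g}, "
        "residual min/median/max={min:g}/{median:g}/{max:g}".format(**summary)
    )
```

If the reported minimum residual is well above the threshold, that is a direct hint that the threshold is too strict for the data.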
Ping @CSchoel for your opinion.
Sorry, silly me. This patch does indeed fix the issue because the check for
If this is the right patch, you need to write a test. But we may want a more configurable parameter to control the number of 0-inlier samples. Certainly, we should be giving the user more information on how to set the threshold appropriately.
If this is the fix, a test already exists that generates an appropriate error. Unless I'm missing something, there would be nothing else to do here except potentially edit the error message--again assuming we go with this fix.
However, I do think it makes sense to limit the number of iterations that end in skips. Also,
I'll submit a commit that allows a user to set a
It would be useful to set these diagnostics as attributes of the model, i.e.
We may also then choose to remove them from the error message (and certainly from the warning, to help the warnings module ignore duplicates), pointing the user to these attributes instead.
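A small sketch of how a skip limit and a diagnostic attribute could fit together, under stated assumptions: `ToyRansac` is a made-up stand-in, not `RANSACRegressor`, and its model is just the mean of two sampled values. The names `max_skips` and `n_skips_no_inliers` are borrowed from the patch discussed later in this thread; the trailing underscore and the exact stopping rule are my guesses.

```python
import random
import warnings


class ToyRansac:
    """Toy 1-D estimator illustrating a skip limit; not scikit-learn code."""

    def __init__(self, threshold, max_trials=100, max_skips=10, seed=0):
        self.threshold = threshold
        self.max_trials = max_trials
        self.max_skips = max_skips
        self.seed = seed

    def fit(self, ys):
        rng = random.Random(self.seed)
        # Diagnostics live on the estimator so they survive an exception.
        self.n_skips_no_inliers_ = 0
        self.n_trials_ = 0
        self.model_ = None
        best_inliers = []
        for _ in range(self.max_trials):
            if self.n_skips_no_inliers_ > self.max_skips:
                break  # too many wasted trials: stop early
            self.n_trials_ += 1
            sample = rng.sample(ys, 2)
            model = sum(sample) / 2
            inliers = [y for y in ys if abs(y - model) < self.threshold]
            if not inliers:
                self.n_skips_no_inliers_ += 1
                continue
            if len(inliers) > len(best_inliers):
                self.model_, best_inliers = model, inliers
        if self.model_ is None:
            # Short message; the details are in the attributes.
            raise ValueError(
                "no valid consensus set found; see n_skips_no_inliers_"
            )
        if self.n_skips_no_inliers_ > self.max_skips:
            warnings.warn("stopped early after exceeding max_skips")
        self.inliers_ = best_inliers
        return self
```

Because the counters are set on the instance before the loop runs, a caller who catches the `ValueError` can still read `est.n_skips_no_inliers_`, and the error and warning text stay constant, which helps the warnings module deduplicate.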
Okay. Large? Not so much. Unnecessarily long in the tooth? I agree. Let's leave scope as it is currently.…
On 29 December 2016 at 05:20, mthorrell wrote: I added the n_skips_no_inliers etc. attributes and adjusted the error messages and tests. I did not add anything indicating the stringency of the threshold. Should that be a different PR? This one is getting rather large imo.
On 3 January 2017 at 05:14, mthorrell wrote:
In sklearn/linear_model/ransac.py:

> @@ -111,6 +111,11 @@ class RANSACRegressor(BaseEstimator, MetaEstimatorMixin, RegressorMixin):
>      max_trials : int, optional
>          Maximum number of iterations for random sample selection.
>
> +    max_skips : int, optional
> +        Maximum number of iterations that can be skipped due to finding zero
> +        inliers or invalid data defined by ``is_data_valid`` or invalid models
> +        defined by ``is_model_valid``.
> +

Is it accurate to say .. versionadded:: 0.19 or should it still be 0.18?