Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potentially running too many iterations in Ransac #8251

Closed
aivision2020 opened this issue Jan 31, 2017 · 11 comments
Closed

potentially running too many iterations in Ransac #8251

aivision2020 opened this issue Jan 31, 2017 · 11 comments

Comments

@aivision2020
Copy link
Contributor

@aivision2020 aivision2020 commented Jan 31, 2017

in skimage/measure/fit.py
managing the number of iterations is done with max_trials. This is compared to _dynamic_max_trials which is computed using the largest inlier group found so far. However, this comparison is done only in cases when sample_inlier_num is larger than the biggest group. In cases when a large group is found early (but not large enough to end) the best_inlier_num is updated but the max_trials is not updated and will only be checked next time a larger group is found.

the lines (926) :

            if (
                best_inlier_num >= stop_sample_num
                or best_inlier_residuals_sum <= stop_residuals_sum
                or num_trials
                    >= _dynamic_max_trials(best_inlier_num, num_samples,
                                           min_samples, stop_probability)
            ):
                break

should be

             max_trials = min(max_trials, _dynamic_max_trials(best_inlier_num, num_samples,
                                           min_samples, stop_probability))
            if (
                best_inlier_num >= stop_sample_num
                or best_inlier_residuals_sum <= stop_residuals_sum
                or num_trials > max_trials
            ):
                break 
@lesteve
Copy link
Member

@lesteve lesteve commented Feb 1, 2017

in skimage/measure/fit.py

I am confused, why are you mentioning skimage?

@lesteve
Copy link
Member

@lesteve lesteve commented Feb 1, 2017

It seems like the code you mention has some similarity with the scikit-learn code though, so maybe your issue is still relevant.

@aivision2020
Copy link
Contributor Author

@aivision2020 aivision2020 commented Feb 1, 2017

@lesteve
Copy link
Member

@lesteve lesteve commented Feb 1, 2017

I'm looking at the file:
https://github.com/scikit-image/scikit-image/blob/master/skimage/measure/fit.py

If you are using the scikit-image RANSAC code, you should open an issue in scikit-image. This is the scikit-learn tracker. They are two separate projects.

@lesteve
Copy link
Member

@lesteve lesteve commented Feb 1, 2017

As I tried to say though, the RANSAC code in scikit-learn and scikit-image seems to have some similarity so it could well be that some of your comment applies to the scikit-learn code as well.

@aivision2020
Copy link
Contributor Author

@aivision2020 aivision2020 commented Feb 1, 2017

Ah I see. Sorry.
So looking at
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/ransac.py
line 418

        # break if sufficient number of inliers or score is reached
        if (n_inliers_best >= self.stop_n_inliers
                or score_best >= self.stop_score
                or self.n_trials_
                   >= _dynamic_max_trials(n_inliers_best, n_samples,
                                          min_samples,
                                          self.stop_probability)):
            break

should be something like:

        # break if sufficient number of inliers or score is reached
        self.max_trials = min(self.max_trials, _dynamic_max_trials(n_inliers_best, n_samples,
                                          min_samples, self.stop_probability))
        if (n_inliers_best >= self.stop_n_inliers
                or score_best >= self.stop_score
                or self.n_trials_>=self.max_trials):
            break
@lesteve
Copy link
Member

@lesteve lesteve commented Feb 1, 2017

I am sorry but I have to ask, are you using the scikit-learn RANSAC code or the scikit-image RANSAC code?

@aivision2020
Copy link
Contributor Author

@aivision2020 aivision2020 commented Feb 1, 2017

I was using scikit-image. But I looked at scikit-learn code and it has the same issue (I didn't make a unit test for it though)

@aivision2020
Copy link
Contributor Author

@aivision2020 aivision2020 commented Feb 1, 2017

a unit test:

from sklearn import linear_model
from sklearn.linear_model.ransac import _dynamic_max_trials
import numpy as np

X =np.linspace(0,100,100)[:,None]
print X.shape
Y = 2*X
Y[80:] = np.random.rand(20,1)
Y[:80] += np.random.rand(80,1)

for i in range(10):
    clt_ransac = linear_model.RANSACRegressor(linear_model.LinearRegression(),max_trials=100)
    clt_ransac.fit(X, Y)
    yhat_ransac = clt_ransac.predict(X)
    inlier_mask = clt_ransac.inlier_mask_
    print '*************************'
    print 'actual     trials:' , clt_ransac.n_trials_
    print 'should be dynamic:', _dynamic_max_trials(np.sum(inlier_mask), 100,
                        2, clt_ransac.stop_probability)
output:
*************************
actual     trials: 5
should be dynamic: 5.0
*************************
actual     trials: 11
should be dynamic: 5.0
*************************
actual     trials: 11
should be dynamic: 5.0
*************************
actual     trials: 10
should be dynamic: 5.0
*************************
actual     trials: 16
should be dynamic: 5.0
*************************
actual     trials: 7
should be dynamic: 5.0
*************************
actual     trials: 100
should be dynamic: 5.0
*************************
actual     trials: 8
should be dynamic: 5.0
*************************
actual     trials: 9
should be dynamic: 5.0
*************************
actual     trials: 27
should be dynamic: 5.0

Notice there are more trials then needed. Once he was so unlucky he did the max number of iters

After fix 
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80
*************************
actual     trials: 6
should be dynamic: 5.0
80

The changes
changed the loop type to while

           self.n_trials_ = 0
           while (self.n_trials_ <= self.max_trials):
                      self.n_trials_ += 1

and the break criteria

             self.max_trials = min(self.max_trials,_dynamic_max_trials(n_inliers_best, n_samples,
                                min_samples,self.stop_probability))

            # break if sufficient number of inliers or score is reached
            if (n_inliers_best >= self.stop_n_inliers
                    or score_best >= self.stop_score):
                break
@jnothman
Copy link
Member

@jnothman jnothman commented Feb 1, 2017

I'm sorry that I do not know this algorithm well from a theoretical standpoint.

The description of this feature in the stop_probability docstring says "requires to generate at least N samples". Your unit test would have us generate exactly N samples, would you not? Your _dynamic_max_trials calculations are made after the best inliers are found. The previous iterations might have yielded fewer inliers and thus a higher _dynamic_max_trials hence requiring further iteration.

However, I think you may be right that because of the various ways a trial can fail, we are surpassing the required number of trials calculated on the basis of the previous best inlier set. Could you please submit a pull request with your changes?

Also, I have added some formatting to your last comment. Please do this yourself in the future. (Click edit to see what I did.)

@aivision2020
Copy link
Contributor Author

@aivision2020 aivision2020 commented Feb 2, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.