In [2]:
import numpy as np 
import matplotlib.pyplot as plt


In [3]:
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

In [1]:
from sklearn.linear_model import RANSACRegressor, LinearRegression

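# min_samples may not exceed the 4 rows of X; 'absolute_error' is the newer name for the old 'absolute_loss' option.
ransac = RANSACRegressor(LinearRegression(), max_trials=100, min_samples=3, loss='absolute_error', residual_threshold=5.0, random_state=0)
ransac.fit(X, y)  # fitting creates the inlier_mask_ and estimator_ attributes used in later cells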

# RANSAC
RANSAC stands for Random Sample Consensus. It is an iterative algorithm used for robust estimation of parameters from a set of data points that may contain outliers. 

The main idea behind RANSAC is to randomly select a small subset of data points, treat them as hypothetical inliers, and fit a model to them. The model is then used to classify the remaining data points as inliers or outliers, depending on whether their residual falls within a predefined threshold. This process is repeated multiple times, and the model with the highest number of inliers is considered the best fit.

RANSAC is commonly used in computer vision and image processing tasks, such as line fitting, image stitching, and object recognition. It is particularly useful when dealing with data that contains a significant amount of noise or outliers.
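To make the effect of outliers concrete, the sketch below compares ordinary least squares with `RANSACRegressor` on synthetic data. The data (`x_demo`, `y_demo`), the true line `y = 2x + 1`, the size of the corruption, and `residual_threshold=1.0` are all assumptions chosen for illustration; they are not part of this notebook's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

rng = np.random.default_rng(0)

# Hypothetical data: y = 2x + 1 plus small Gaussian noise.
x_demo = np.linspace(0, 10, 50)
y_demo = 2.0 * x_demo + 1.0 + rng.normal(scale=0.2, size=x_demo.size)

# Corrupt the last 10 points with a large offset so they act as outliers.
y_demo[-10:] -= 30.0

X_demo = x_demo.reshape(-1, 1)  # scikit-learn expects a 2D feature matrix

# Ordinary least squares uses every point, including the corrupted ones.
ols = LinearRegression().fit(X_demo, y_demo)

# RANSAC fits on random subsets and keeps the model with the largest consensus set.
robust = RANSACRegressor(
    LinearRegression(), residual_threshold=1.0, random_state=0
).fit(X_demo, y_demo)

print("OLS slope:    %.3f" % ols.coef_[0])
print("RANSAC slope: %.3f" % robust.estimator_.coef_[0])
print("Points flagged as outliers:", int(np.sum(~robust.inlier_mask_)))
```

In this setup the least-squares slope is pulled well away from 2 by the corrupted points, while the RANSAC estimate should stay close to it.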

Here's a high-level overview of the RANSAC algorithm:

1. Randomly select a subset of data points.
2. Fit a model to the selected points.
3. Classify the remaining data points as inliers or outliers based on the model and a predefined threshold.
4. Repeat steps 1-3 for a specified number of iterations.
5. Select the model with the highest number of inliers as the best fit.
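The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration for fitting a straight line, not how scikit-learn implements `RANSACRegressor`; the function name `ransac_line`, its parameters, and the toy data are hypothetical. The refit on the full consensus set at the end is a common addition to the basic loop.

```python
import numpy as np

def ransac_line(x, y, n_trials=100, sample_size=2, threshold=1.0, seed=0):
    """Fit y = slope * x + intercept with a basic RANSAC loop (illustrative only)."""
    rng = np.random.default_rng(seed)
    best_mask = None
    best_count = -1
    for _ in range(n_trials):  # step 4: repeat for a fixed number of iterations
        # Step 1: randomly select a subset of data points.
        idx = rng.choice(x.size, size=sample_size, replace=False)
        # Step 2: fit a model (here a straight line) to the selected points.
        slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
        # Step 3: classify all points by their absolute residual.
        mask = np.abs(y - (slope * x + intercept)) < threshold
        # Step 5: remember the consensus set with the most inliers.
        if mask.sum() > best_count:
            best_count = mask.sum()
            best_mask = mask
    # Refit on the whole consensus set for the final estimate.
    slope, intercept = np.polyfit(x[best_mask], y[best_mask], deg=1)
    return slope, intercept, best_mask

# Hypothetical toy data: y = 3x - 2 with small noise; every 8th point is an outlier.
rng = np.random.default_rng(1)
x_toy = np.linspace(0, 10, 40)
y_toy = 3.0 * x_toy - 2.0 + rng.normal(scale=0.1, size=x_toy.size)
y_toy[::8] += 20.0

slope, intercept, mask = ransac_line(x_toy, y_toy)
print("slope=%.3f intercept=%.3f inliers=%d" % (slope, intercept, mask.sum()))
```

Sampling only two points per trial keeps the chance of drawing an outlier-free subset high, which is why minimal sample sizes are usually preferred.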

RANSAC is a powerful algorithm for robust parameter estimation because the final model is fitted only to the points it considers inliers. However, it has some limitations: the number of iterations and the inlier threshold must be chosen by the user, and a poor choice of either can produce a bad fit. Because the samples are drawn at random, results can vary between runs unless the random seed is fixed (as with `random_state=0` above), and the outcome also depends on the choice of the model being fitted.
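A common rule of thumb for choosing the number of iterations is `N = log(1 - p) / log(1 - w^n)`, where `w` is the fraction of inliers, `n` is the sample size, and `p` is the desired probability of drawing at least one outlier-free sample. The helper below is a small sketch of that formula; the name `required_trials` is hypothetical, and it assumes samples are drawn independently and that the inlier ratio is known, which is rarely true in practice.

```python
import math

def required_trials(inlier_ratio, sample_size, confidence=0.99):
    """Trials needed so that at least one sample is outlier-free with the given confidence."""
    # Probability that a single random sample contains only inliers.
    p_clean = inlier_ratio ** sample_size
    if p_clean <= 0.0:
        return math.inf  # no clean sample is possible
    if p_clean >= 1.0:
        return 1  # every sample is clean
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))

# The required number of trials grows quickly as the inlier ratio drops
# or the sample size increases.
for ratio in (0.9, 0.8, 0.5):
    print("inlier ratio %.1f -> %s trials" % (ratio, required_trials(ratio, sample_size=2)))
```

As far as I know, `RANSACRegressor` exposes a related `stop_probability` parameter that lets it stop before `max_trials` is reached, so `max_trials` acts as an upper bound rather than a fixed count.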

In [4]:
# Requires a fitted estimator, i.e. ransac.fit(X, y) must already have been called.
inlier_mask = ransac.inlier_mask_
outlier_mask = np.logical_not(inlier_mask)
# X has two features, so plot the observed y against the model's predictions instead of a single feature.
y_pred = ransac.predict(X)
line = np.array([y.min(), y.max()])
plt.scatter(y_pred[inlier_mask], y[inlier_mask], c='yellowgreen', edgecolor='white', marker='o', label='Inliers')
plt.scatter(y_pred[outlier_mask], y[outlier_mask], c='limegreen', edgecolor='white', marker='s', label='Outliers')
plt.plot(line, line, color='cornflowerblue', linewidth=2, label='Perfect prediction')
plt.xlabel("Predicted y (RANSAC)")
plt.ylabel("Observed y")
plt.legend(loc='upper left')
plt.show()

In [5]:
# coef_[0] is the coefficient of the first feature; use %-formatting consistently (str.format ignores '%.3f').
print("Slope: %.3f" % ransac.estimator_.coef_[0])

In [None]:
print("Intercept: %.3f" % ransac.estimator_.intercept_)