# RANSAC Algorithm

RANSAC (RANdom SAmple Consensus) is an iterative method for estimating the parameters of a mathematical model from observed data that contains outliers. It is widely used in computer vision and remains useful even when a significant percentage of the data points are outliers.

### Key Concepts

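1. Inliers and Outliers:
   - Inliers are data points whose error with respect to a given model is below a chosen threshold.
   - Outliers are data points whose error with respect to the model exceeds that threshold.
2. Iterations: The algorithm repeatedly selects a small random subset of the data (typically the minimum number of points needed to define the model) and estimates the model parameters from it.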
3. Consensus: For each iteration, the model is tested against the entire dataset to determine how many data points are inliers.
4. Best Model: The model with the highest number of inliers is chosen as the best model.
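
The inlier test and the consensus count can be illustrated with a minimal sketch. The toy data, the function name `consensus_mask`, and the choice of vertical distance as the error measure are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical toy data: four points near y = 2.5x and one gross outlier.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 2.4, 5.2, 7.4, -3.0])

def consensus_mask(x, y, slope, intercept, threshold):
    """Return a boolean mask that is True for points counted as inliers.

    The error is the vertical distance between each point and the line
    y = slope * x + intercept (an assumption; other distances are possible).
    """
    residuals = np.abs(y - (slope * x + intercept))
    return residuals < threshold

mask = consensus_mask(x, y, slope=2.5, intercept=0.0, threshold=1.0)
print(mask)        # which points support the candidate line
print(mask.sum())  # size of the consensus set
```

Here the first four points fall within the threshold and the last one does not, so this candidate line has a consensus of 4.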

### Steps of the RANSAC Algorithm

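1. Randomly select a small subset of the original data (for a line, two points are enough).
2. Fit a model to the selected subset.
3. Count the inliers for the fitted model, that is, the points in the whole dataset whose error is below the threshold.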
4. Repeat the above steps for a specified number of iterations.
5. Select the model with the highest number of inliers.
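
How many iterations to specify in step 4 is often estimated with a standard rule of thumb: if a fraction `w` of the data are inliers and each sample contains `s` points, then roughly `log(1 - p) / log(1 - w**s)` iterations are needed to draw at least one all-inlier sample with probability `p`. The sketch below assumes samples are drawn independently; the function name `required_iterations` is hypothetical.

```python
import math

def required_iterations(success_prob, inlier_ratio, sample_size):
    """Estimate how many RANSAC iterations are needed.

    success_prob: desired probability of drawing at least one all-inlier sample.
    inlier_ratio: assumed fraction of inliers in the data (usually a guess).
    sample_size:  number of points drawn per iteration (2 for a line).
    """
    if not 0.0 < success_prob < 1.0:
        raise ValueError("success_prob must lie strictly between 0 and 1")
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must lie in (0, 1]")

    # Probability that a single random sample contains only inliers.
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample >= 1.0:
        return 1  # no outliers at all: one sample suffices

    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - p_good_sample))

# 90% inliers, 2-point samples, 99% confidence
print(required_iterations(0.99, 0.9, 2))  # → 3
# 50% inliers needs noticeably more iterations
print(required_iterations(0.99, 0.5, 2))  # → 17
```

The estimate grows quickly as the inlier ratio drops or the sample size increases, which is one reason to sample the minimum number of points the model requires.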

### Example: Line Fitting with RANSAC

Imagine you have a set of 2D points, and you want to fit a line to these points, but some of the points are outliers.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data with outliers
np.random.seed(0)
n_samples = 500
n_outliers = 50

# Generate inlier data
X = np.linspace(-10, 10, n_samples)
Y = 2.5 * X + np.random.normal(size=n_samples)

# Replace the first n_outliers points with uniformly distributed outliers
X[:n_outliers] = np.random.uniform(-10, 10, n_outliers)
Y[:n_outliers] = np.random.uniform(-10, 10, n_outliers)

# Visualize the data
plt.scatter(X, Y, color='b', marker='o', label='Data points')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Data points with outliers')
plt.legend()
plt.show()

# RANSAC algorithm
def ransac_line_fit(X, Y, n_iterations, threshold):
    best_model = None
    best_inliers_count = 0

    for _ in range(n_iterations):
        # Randomly sample 2 points to define a line
        indices = np.random.choice(len(X), 2, replace=False)
        x1, y1 = X[indices[0]], Y[indices[0]]
        x2, y2 = X[indices[1]], Y[indices[1]]

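        # Skip degenerate samples: identical x values would define a
        # vertical line and cause a division by zero below
        if x2 == x1:
            continue

        # Calculate the line parameters (slope a, intercept b)
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1

        # Count the inliers: points whose vertical distance to the line
        # is below the threshold (vectorized over the whole dataset)
        inliers_count = np.sum(np.abs(Y - (a * X + b)) < threshold)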

        # Update the best model if the current one has more inliers
        if inliers_count > best_inliers_count:
            best_inliers_count = inliers_count
            best_model = (a, b)

    return best_model

# Set RANSAC parameters
n_iterations = 1000
threshold = 1.0

# Fit the line using RANSAC
a, b = ransac_line_fit(X, Y, n_iterations, threshold)
print(f"Best fit line: Y = {a:.2f}X + {b:.2f}")

# Plot the results
plt.scatter(X, Y, color='b', marker='o', label='Data points')
plt.plot(X, a * X + b, color='r', label=f'RANSAC fit: Y = {a:.2f}X + {b:.2f}')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('RANSAC Line Fitting')
plt.legend()
plt.show()
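
The line returned above is defined by only two sampled points, so it inherits their noise. A common follow-up, sketched here as an optional refinement rather than part of the code above, is to refit the line by least squares on all inliers of the best model. The data, the candidate parameters `a0` and `b0`, the threshold, and the helper name `refit_on_inliers` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: points near y = 2.5x, with the first 20 y-values
# replaced by uniformly distributed outliers.
x = np.linspace(-10, 10, 200)
y = 2.5 * x + rng.normal(size=x.size)
y[:20] = rng.uniform(-10, 10, 20)

def refit_on_inliers(x, y, a, b, threshold):
    """Least-squares refit of a line using only the inliers of (a, b)."""
    inlier_mask = np.abs(y - (a * x + b)) < threshold
    if inlier_mask.sum() < 2:
        raise ValueError("need at least two inliers to fit a line")
    # np.polyfit with deg=1 returns [slope, intercept]
    a_refined, b_refined = np.polyfit(x[inlier_mask], y[inlier_mask], deg=1)
    return a_refined, b_refined, inlier_mask

# Stand-in for a model returned by RANSAC (assumed values, not real output)
a0, b0 = 2.4, 0.3
a_ref, b_ref, mask = refit_on_inliers(x, y, a0, b0, threshold=3.0)

# Ordinary least squares on all points, for comparison
a_ols, b_ols = np.polyfit(x, y, deg=1)

print(f"Refit on inliers: Y = {a_ref:.2f}X + {b_ref:.2f} ({mask.sum()} inliers)")
print(f"Plain least squares: Y = {a_ols:.2f}X + {b_ols:.2f}")
```

Because the plain least-squares fit uses every point, the outliers pull its slope away from the true value, while the refit on inliers is not affected by them.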