### Linear Regression

Linear regression models are a good starting point for regression tasks. They are popular because they can be fit quickly and are straightforward to interpret. You are probably already familiar with the simplest form of linear regression model (a straight line fit to two-dimensional data), but such models can be extended to capture more complicated data behavior.

We begin with the standard imports:

In [1]:
# Importing necessary libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

### Simple Linear Regression

We will start with the most familiar linear regression, a straight-line fit to data. A straight-line fit is a model of the form y = ax + b, where a is commonly known as the slope and b is commonly known as the intercept.
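To make the roles of a and b concrete, here is a minimal sketch that estimates them with the textbook closed-form least-squares formulas, using NumPy only. The data points are hypothetical and chosen to lie exactly on y = 2x + 1. This is for illustration and is not necessarily how Scikit-Learn computes the fit internally.

```python
import numpy as np

# Hypothetical sample points lying exactly on y = 2x + 1 (illustration only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Closed-form least-squares estimates for a straight line:
#   a = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))**2)
#   b = mean(y) - a * mean(x)
x_mean, y_mean = x.mean(), y.mean()
a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)  # slope
b = y_mean - a * x_mean                                              # intercept

print("slope a:", a)
print("intercept b:", b)
```

Because the points lie exactly on a line, the estimates should recover a slope of 2 and an intercept of 1 up to floating-point rounding.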

In [2]:
# Sample data: ten points lying exactly on the line y = 2x
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20])

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)

We can use Scikit-Learn's LinearRegression estimator to fit the training data and construct the best-fit line, then evaluate its predictions on the held-out test set:

In [3]:
# Creating and fitting the model
model = LinearRegression()
model.fit(X_train, y_train)

# Making predictions
y_pred = model.predict(X_test)

# Model evaluation
mse = mean_squared_error(y_test, y_pred)

print("Mean Squared Error:", mse)

Mean Squared Error: 1.5777218104420236e-30
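The reported error is on the order of 1e-30, which is zero up to floating-point rounding. This is expected, because the sample data lie exactly on a straight line. The mean squared error is simply the average of the squared differences between the true and predicted values. The sketch below computes it by hand with NumPy; the arrays y_true and y_hat are hypothetical values chosen so that the error is visibly non-zero.

```python
import numpy as np

# Hypothetical true and predicted values (illustration only)
y_true = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.0, 5.0, 9.0])

# Mean squared error: average of the squared residuals
residuals = y_true - y_hat              # [ 1.  0. -2.]
mse_manual = np.mean(residuals ** 2)    # (1 + 0 + 4) / 3

print("Mean Squared Error:", mse_manual)
```

Here the squared residuals are 1, 0, and 4, so the result is 5/3, roughly 1.667.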


The slope and intercept of the fitted line are stored as attributes of the fitted model. In Scikit-Learn, attributes estimated from the data are always marked by a trailing underscore. Here the relevant attributes are coef_ and intercept_:

In [9]:
slope = model.coef_[0]
y_intercept = model.intercept_

print("Slope:", slope)
print("Y-intercept:", y_intercept)


Slope: 1.9999999999999991
Y-intercept: 5.329070518200751e-15


In [5]:
print(y_pred)

[12. 10.]
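These predictions are nothing more than the fitted line evaluated at the test inputs. As a rough check, the sketch below applies y = ax + b by hand, using the slope and intercept reported above rounded to 2 and 0. The inputs 6 and 5 are an assumption, inferred from the predictions 12 and 10 rather than read from X_test.

```python
import numpy as np

slope = 2.0        # rounded value of model.coef_[0] reported above
y_intercept = 0.0  # rounded value of model.intercept_ reported above

# Assumed test inputs, inferred from the predictions [12, 10]
X_new = np.array([[6], [5]])

# Evaluate the line y = a*x + b on the single feature column
y_manual = slope * X_new[:, 0] + y_intercept

print(y_manual)  # [12. 10.]
```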
