# Linear Regression 
Linear regression is a supervised learning algorithm that predicts continuous values by fitting a linear function to the data.
In this notebook, we'll cover:
- The concept of linear regression
- Implementing linear regression using scikit-learn
- Visualizing the results

## 1. What is Linear Regression?
Linear regression models the relationship between a dependent variable (Y) and one or more independent variables (X) as a linear function. With a single independent variable, this is a straight line:

$$ Y = mX + b $$

where:
- $m$ is the slope  
- $b$ is the intercept  

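We aim to find the values of $m$ and $b$ that minimize the sum of squared differences between the observed values and the values predicted by the line (the least-squares criterion).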
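
For a single feature, the least-squares slope and intercept have a closed form: the slope is the covariance of X and Y divided by the variance of X, and the intercept places the line through the point of means. The sketch below is only an illustration of that formula, not how scikit-learn computes its fit internally; the helper name `fit_line` and the demo arrays are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope and intercept for one feature (illustrative helper)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through (x_mean, y_mean)
    b = y_mean - m * x_mean
    return m, b

# Noise-free points on y = 3x + 4, so the fit recovers the line exactly
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 3 * x_demo + 4
m_demo, b_demo = fit_line(x_demo, y_demo)
print(f"slope={m_demo:.2f}, intercept={b_demo:.2f}")  # slope=3.00, intercept=4.00
```

With noisy data the recovered values would only approximate the true slope and intercept.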


In [1]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

## 2. Generating Sample Data
Let's generate 100 points with a known linear relationship (intercept 4, slope 3) plus standard normal noise, with X drawn uniformly from [0, 2).

In [None]:
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)  # Linear equation with some noise
plt.scatter(X, y, color='blue', alpha=0.5)
plt.xlabel('X')
plt.ylabel('y')
plt.title('Generated Data')
plt.show()

## 3. Splitting Data into Training and Testing Sets
To evaluate our model effectively, we split our dataset into a training set (80%) and a testing set (20%).

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(f'Training set size: {len(X_train)}')
print(f'Testing set size: {len(X_test)}')
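
Conceptually, a random split shuffles the row indices and sets aside a fraction of them for testing. The sketch below shows that idea with NumPy alone; it is an assumption-laden illustration, not scikit-learn's implementation, and it will not reproduce the same rows as `train_test_split` even with the same seed. The helper name `manual_split` and the demo arrays are hypothetical.

```python
import numpy as np

def manual_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle row indices and hold out a fraction for testing (illustrative)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    indices = rng.permutation(n)              # random ordering of 0..n-1
    n_test = int(round(n * test_fraction))    # number of held-out rows
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Hypothetical demo data: 100 rows, one feature
X_demo = np.arange(100).reshape(-1, 1)
y_demo = 2 * X_demo
X_tr, X_te, y_tr, y_te = manual_split(X_demo, y_demo)
print(len(X_tr), len(X_te))  # 80 20
```

Because every index appears exactly once in the permutation, no row ends up in both sets.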

## 4. Training the Linear Regression Model
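We use `LinearRegression` from scikit-learn to fit the model on our training data. Since the data was generated with intercept 4 and slope 3, the fitted values should land close to those, with some deviation due to noise.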

In [None]:
model = LinearRegression()
model.fit(X_train, y_train)
print(f'Intercept: {model.intercept_[0]:.2f}')
print(f'Slope: {model.coef_[0][0]:.2f}')

## 5. Evaluating the Model
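Now we predict on the test set and calculate the Mean Squared Error (MSE), the average of the squared differences between predicted and actual values. Lower is better; because the noise we added has unit variance, a value near 1 is roughly the best we can expect here.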

In [None]:
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse:.2f}')
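
To make the metric concrete, the sketch below computes the MSE directly from its definition, along with its square root (RMSE), which is expressed in the same units as y. The helper name `mse_by_hand` and the small demo arrays are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def mse_by_hand(y_true, y_hat):
    """Mean of squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_hat = np.asarray(y_hat, dtype=float).ravel()
    return np.mean((y_true - y_hat) ** 2)

# Errors are 1, 0, and -2, so the squared errors are 1, 0, and 4
y_true_demo = np.array([3.0, 5.0, 7.0])
y_hat_demo = np.array([2.0, 5.0, 9.0])
mse_demo = mse_by_hand(y_true_demo, y_hat_demo)
rmse_demo = np.sqrt(mse_demo)
print(f"MSE: {mse_demo:.2f}, RMSE: {rmse_demo:.2f}")  # MSE: 1.67, RMSE: 1.29
```

Applied to `y_test` and `y_pred` from the cell above, this should agree with `mean_squared_error` up to floating-point rounding.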

## 6. Visualizing the Regression Line
We plot the test-set points along with the fitted regression line.

In [None]:
plt.scatter(X_test, y_test, color='blue', alpha=0.5, label='Actual Data')
plt.plot(X_test, y_pred, color='red', linewidth=2, label='Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression Model')
plt.legend()
plt.show()