# Linear Regression in Machine Learning
#### [Khundrakpam Veeshel Singh]

Linear regression is a fundamental technique in machine learning and statistics. It is used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data.

## Table of Contents
1. [Introduction](#Introduction)
2. [Mathematical Formulation](#Mathematical-Formulation)
3. [Implementation](#Implementation)
    1. [Generating Data with Specific Parameters](#Generating-Data-with-Specific-Parameters)
    2. [Model Training](#Model-Training)
    3. [Evaluation](#Evaluation)
4. [Comparison of Parameters](#Comparison-of-Parameters)
5. [Conclusion](#Conclusion)

***

## Introduction

Linear regression aims to find the best-fitting line through the data points, most commonly the line that minimizes the sum of squared differences between the observed and predicted values. This line can then be used to make predictions about new data. With a single independent variable, the method is known as **simple linear regression**. With several independent variables, it is known as **multiple linear regression**.

## Mathematical Formulation

The linear regression model assumes that the relationship between the dependent variable $ y $ and the independent variable $ x $ can be described by the linear equation:

$ y = \beta_0 + \beta_1 x + \epsilon $

where:
- $ y $ is the dependent variable
- $ x $ is the independent variable
- $ \beta_0 $ is the y-intercept
- $ \beta_1 $ is the slope of the line
- $ \epsilon $ is the error term
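
To make the roles of $ \beta_0 $ and $ \beta_1 $ concrete, here is a minimal sketch of how they can be estimated from data. It assumes the ordinary least squares criterion (minimizing the sum of squared errors), which the text above does not name explicitly; the function name `fit_simple_ols` and the sample points are hypothetical and used only for illustration.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Closed-form least-squares estimates for y = beta_0 + beta_1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    beta_1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1

# Hypothetical noise-free points lying exactly on y = 1 + 2x
beta_0, beta_1 = fit_simple_ols([0, 1, 2, 3], [1, 3, 5, 7])
print(beta_0, beta_1)
```

Because the sample points contain no noise, the estimates should match the intercept 1 and slope 2 used to create them, up to floating-point rounding.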

In multiple linear regression, with independent variables $ x_1, x_2, \ldots, x_n $, the equation extends to:

$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n + \epsilon $
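
The multiple-variable equation can be written as a matrix product once a column of ones is added for the intercept $ \beta_0 $. The sketch below illustrates this under the assumption that the coefficients are chosen by least squares, solved here with NumPy's `np.linalg.lstsq`; the helper `fit_multiple_ols` and the demo data are hypothetical.

```python
import numpy as np

def fit_multiple_ols(X, y):
    """Least-squares estimates [beta_0, beta_1, ..., beta_n] for a feature matrix X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that beta_0 acts as the intercept
    design = np.column_stack([np.ones(X.shape[0]), X])
    # lstsq returns (solution, residuals, rank, singular values); keep the solution
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2
X_demo = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 3]])
y_demo = 1 + 2 * X_demo[:, 0] - 3 * X_demo[:, 1]
beta_demo = fit_multiple_ols(X_demo, y_demo)
print(np.round(beta_demo, 6))
```

The returned vector is ordered as intercept first, then one coefficient per column of `X_demo`.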

## Implementation
### Generating Data with Specific Parameters

We generate a dataset of 100 points from the linear equation $ y = 2x - 3 $, drawing $ x $ uniformly from $ [0, 2) $ and adding Gaussian noise with standard deviation 0.2.

```python
import numpy as np
import matplotlib.pyplot as plt

# True parameters of the underlying line y = m*x + b
m = 2   # slope
b = -3  # y-intercept
np.random.seed(0)  # fix the seed so the data is reproducible

# Generate synthetic data: 100 x-values drawn uniformly from [0, 2)
X = 2 * np.random.rand(100, 1)
# Add Gaussian noise with standard deviation 0.2
y = m * X + b + 0.2 * np.random.randn(100, 1)

plt.scatter(X, y)
plt.xlabel('X')
plt.ylabel('y')
plt.title('Synthetic Data with Noise')
plt.show()
```

### Model Training

```python
from sklearn.linear_model import LinearRegression

# Create the model and fit it to the synthetic data
model = LinearRegression()
model.fit(X, y)
```


### Evaluation

```python
# Predict at the endpoints of the x-range to draw the fitted line
X_new = np.array([[0], [2]])
y_pred = model.predict(X_new)

# Plot the data together with the fitted line
plt.scatter(X, y)
plt.plot(X_new, y_pred, 'r-', label='Linear Model')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression Fit')
plt.legend()
plt.show()
```
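
The plot gives a visual check of the fit. A numerical summary is a common complement, so here is a minimal sketch using `mean_squared_error` and `r2_score` from `sklearn.metrics`. These metrics are an addition, not part of the original notebook. The block recreates the same synthetic data so that it runs on its own, and the names `X_eval`, `y_eval`, and `eval_model` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Recreate the synthetic data from this notebook (same seed and parameters)
np.random.seed(0)
X_eval = 2 * np.random.rand(100, 1)
y_eval = 2 * X_eval - 3 + 0.2 * np.random.randn(100, 1)

# Fit on the data and predict on the same points
eval_model = LinearRegression().fit(X_eval, y_eval)
y_fit = eval_model.predict(X_eval)

mse = mean_squared_error(y_eval, y_fit)  # average squared residual
r2 = r2_score(y_eval, y_fit)             # fraction of variance explained
print(f"MSE: {mse:.4f}")
print(f"R^2: {r2:.4f}")
```

Since the noise has standard deviation 0.2, an MSE near 0.04 would be expected. Both metrics are computed on the training data here, so they describe goodness of fit rather than performance on unseen data.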


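## Comparison of Parameters

```python
# Extract the learned parameters from the fitted model.
# y has shape (100, 1), so coef_ has shape (1, 1) and intercept_ has shape (1,).
slope = model.coef_[0][0]
intercept = model.intercept_[0]

print(f"True Slope (m)       : {m}")
print(f"True Intercept (b)   : {b}")
print(f"Learned Slope (m)    : {slope:.3f}")
print(f"Learned Intercept (b): {intercept:.3f}")
```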


## Conclusion
Linear regression is a powerful and simple method for predicting a continuous outcome variable based on one or more predictors. In this notebook, we covered:

- The mathematical formulation of linear regression
- How to generate synthetic data with a known linear relationship and noise
- How to implement linear regression in Python and compare learned parameters with true values
- How to visualize the fitted line

By understanding and implementing linear regression, you can develop predictive models and gain insights into the relationships between variables.