# Linear Regression

Linear regression is a widely used statistical technique for modeling the relationship between a dependent variable and one or more independent variables. It assumes a linear relationship between the variables and aims to find the best-fit line that minimizes the difference between the predicted and actual values. The goal of linear regression is to make predictions or understand the impact of independent variables on the dependent variable.

In simple linear regression, we consider a single independent variable and a single dependent variable. The relationship between the variables can be represented by the equation:

$y = mx + c$

Where:

- `y` is the dependent variable
- `x` is the independent variable
- `c` is the y-intercept (the value of `y` when `x` is 0)
- `m` is the slope (the change in `y` for a unit change in `x`)

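In higher dimensions, where $x$ is a vector of several independent variables and $w$ is the matching vector of weights, this equation becomes:

$y = w^\top x + b$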

The goal is to estimate the values of $w$ and $b$ that best fit the data.
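
As a minimal sketch of what the higher-dimensional form computes, the snippet below evaluates the weighted sum of the inputs plus the intercept with NumPy. The function name `predict` and all numeric values are illustrative assumptions, not parameters estimated from any data.

```python
import numpy as np

def predict(x, w, b):
    """Return the dot product of x and w plus b, for one sample or for each row of a batch."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return x @ w + b

# Hypothetical parameters for a model with three features.
w = np.array([2.0, -1.0, 0.5])
b = 4.0

# One sample: 2*1 - 1*2 + 0.5*3 + 4
single = predict([1.0, 2.0, 3.0], w, b)
print(single)  # 5.5

# A batch of two samples gives one prediction per row.
batch = predict([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]], w, b)
print(batch)   # values 5.5 and 4.0
```

With all inputs at zero the prediction equals `b`, which matches the reading of the intercept as the value of `y` when `x` is 0.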

**Data**

Input: $x$ (i.e., measurements, covariates, features, independent variables)

Output: $y$ (i.e., response, dependent variable)



**Goal**

You need to find a regression function $y\approx f(x, \beta)$, where $\beta$ is the parameter to be estimated from observations.

For simple linear regression: $y = \beta_0 + \beta_1x$

For multiple linear regression: $y = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2} + \cdots + \beta_{d}x_{d}$, where $d$ is the number of features.
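
When there are several observations, the multiple regression equation is commonly stacked into matrix form. The symbols $n$ (number of observations), $x_{ij}$ (feature $j$ of observation $i$), and the layout below are notation assumed here for illustration:

$$
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
\approx
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1d} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & \cdots & x_{nd}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_d \end{bmatrix}
$$

The leading column of ones multiplies the intercept $\beta_0$, so each row reproduces $\beta_{0} + \beta_{1}x_{i1} + \cdots + \beta_{d}x_{id}$ for one observation.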


A regression method is linear if the prediction $f$ is a linear function of the unknown parameters $\beta$.
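
To illustrate, a model such as $y = \beta_0 + \beta_1 x^2$ is nonlinear in $x$ but linear in $\beta$, so it can still be fit by linear least squares once the input is transformed. The following is a sketch under assumptions: the data are synthetic and noise-free, the "true" coefficients are made up, and NumPy's `np.linalg.lstsq` stands in for whatever solver you would actually use.

```python
import numpy as np

# Synthetic data from y = 1 + 3 * x^2 (coefficients chosen for illustration).
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 3.0 * x**2

# Design matrix: a column of ones for beta_0 and the transformed input x^2 for beta_1.
X = np.column_stack([np.ones_like(x), x**2])

# Linear least squares recovers the parameters because y is linear in beta.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [1. 3.]
```

The nonlinearity lives entirely in the feature column `x**2`; the solver only ever sees a linear problem in `beta`.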

## Assumptions of Linear Regression

https://www.geeksforgeeks.org/assumptions-of-linear-regression/

- __Linear regression should be linear in parameters__

  The response variable $y$ is a function of the input variables $x$ and the parameters $\beta$. The linearity condition applies only to the parameters: the output must be linear in the parameters, not necessarily in the inputs.

  For example: 
  
  The equation below is linear in both the inputs and the parameters, so it satisfies the assumption.

  $$y = \beta_0 + \beta_1x$$  

  Similarly, the equation below is not linear in the inputs but is linear in the parameters, so it also satisfies the assumption.

  $$y = \beta_0 +\beta_1x^2$$

  Lastly, the equation below is linear in terms of input but is not linear in terms of parameters, so it violates the assumption and is not a linear regression model.

  $$y = \beta_0 +\beta_1^2x$$

- __There shouldn't be multicollinearity.__

  Multicollinearity here means perfect collinearity. This assumption concerns the input variables. In simple linear regression, which has a single input variable, it plays no role, but in multiple linear regression it needs care: no input variable should be an exact linear combination of the others. Perfect collinearity makes the predictor matrix rank-deficient, so the parameters cannot be estimated uniquely.

  For example, when predicting a house price, you can have many input variables: _length_, _breadth_, _area_, _location_, and more. If you include _area_ along with _length_ and _breadth_, you might violate the assumption because:

  $$\text{area} = \text{length}\times\text{breadth}$$

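  Strictly speaking, a product is not a linear dependence, so on the raw scale these features are only highly correlated; the dependence becomes exactly linear after a log transform, where $\log(\text{area}) = \log(\text{length}) + \log(\text{breadth})$. In either case, it is better to drop one of the three input variables from the linear regression model.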

- __There should be a random sampling of observations.__

  The observations should be randomly sampled from the population of interest. Suppose you are building a regression model to learn which factors affect house prices: you should then select houses at random from a locality rather than use a convenience sample. Also, the number of observations should be greater than the number of parameters to be estimated.
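
Returning to the multicollinearity assumption, the rank of the predictor matrix can be checked numerically. The sketch below assumes NumPy and synthetic house dimensions (all values are made up). It builds the matrix twice: once with the raw features, where area is a product of the other two, and once on the log scale, where the log of the area is an exact sum of the other two columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Synthetic house dimensions (illustrative values only).
length = rng.uniform(5.0, 20.0, size=n)
breadth = rng.uniform(5.0, 20.0, size=n)
area = length * breadth

ones = np.ones(n)

# Raw scale: area is a product, not a linear combination, of the other columns.
X_raw = np.column_stack([ones, length, breadth, area])

# Log scale: log(area) = log(length) + log(breadth), an exact linear dependence.
log_length = np.log(length)
log_breadth = np.log(breadth)
X_log = np.column_stack([ones, log_length, log_breadth, log_length + log_breadth])

rank_raw = np.linalg.matrix_rank(X_raw)
rank_log = np.linalg.matrix_rank(X_log)
print(rank_raw, rank_log)
```

A rank below the number of columns signals perfect collinearity: here the log-scale matrix has four columns but only three independent ones, so one of the three related features would have to be dropped before the parameters could be estimated uniquely.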

## Parameter Estimation Techniques

Now that you know the assumptions of linear regression, it's time to look at techniques for finding the parameters, i.e., the intercept and the regression coefficients. Two of them are as follows:

1. Least Squares Estimation

   In practice, observed input-output pairs are given and the $\beta$'s are unknown. We use the given pairs to estimate the coefficients, and the estimated intercept and regression coefficients then let us predict the output for new input values. Because the estimated regression line cannot pass through every data point, each observation has an error, or residual: the difference between the actual output value $y_i$ and the estimated output value $\hat{y}_i$. Residuals can be positive or negative, so summing them directly lets them cancel. To avoid this cancellation, we square each residual and add them up, which gives the sum of squared errors, also known as the residual sum of squares (RSS): $\text{RSS} = \sum_i (y_i - \hat{y}_i)^2$.

   This parameter estimation technique finds the parameters by minimizing the RSS. There are three main types of least squares:

   - Ordinary Least Squares

   - Weighted Least Squares

   - Generalized Least Squares


2. Maximum Likelihood Estimation

   Maximum likelihood estimation is a well-known probabilistic framework for finding the parameters that best describe the observed data. You find the parameters by maximizing the likelihood, i.e., the conditional probability of the observed outputs $y$ given the inputs $x$, under an assumed probability distribution with parameters $\beta$. A detailed discussion of this technique is out of scope here.
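
The least squares idea from item 1 can be sketched in a few lines. This is an illustration under assumptions rather than a reference implementation: the data are synthetic, the "true" coefficients and noise level are made up, the helper `rss` is a hypothetical name, and NumPy's `np.linalg.lstsq` is used as the ordinary least squares solver.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: y = 2 + 3*x1 - 1*x2 + noise (coefficients chosen for illustration).
n = 200
features = rng.normal(size=(n, 2))
true_beta = np.array([2.0, 3.0, -1.0])
X = np.column_stack([np.ones(n), features])  # prepend the intercept column
y = X @ true_beta + rng.normal(scale=0.1, size=n)

def rss(beta):
    """Residual sum of squares for a candidate parameter vector."""
    residuals = y - X @ beta
    return float(residuals @ residuals)

# Ordinary least squares estimate: the beta that minimizes rss(beta).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Same estimate from the normal equations X^T X beta = X^T y
# (acceptable for a small, well-conditioned X like this one).
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_hat)
print(rss(beta_hat), rss(true_beta))
```

Because `beta_hat` minimizes the RSS on this sample, its RSS is no larger than that of the coefficients used to generate the data. If the errors are additionally assumed to be independent and Gaussian with constant variance, maximizing the likelihood leads to this same estimate.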