# Linear Regression

Linear Regression is a supervised learning algorithm used to predict a continuous target variable from one or more input features. It models the relationship between the dependent variable (Y) and the independent variable(s) (X) by fitting a straight line (or, with several features, a hyperplane) to the data points.

- Type of Algorithm: Supervised Learning (Regression)
- Target Variable: Continuous
- Goal: Minimize the difference (error) between predicted and actual values.

Hypothesis Function: A linear equation represents the relationship between the independent variable (x) and the dependent variable (y), where m is the slope and c is the intercept.
$$y = mx + c$$
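
As a minimal sketch of the hypothesis function for a single feature, the equation above can be written as a small function. The name `predict` and the parameter values are illustrative assumptions, not fitted to any data.

```python
def predict(x, m, c):
    """Return the predicted value y = m*x + c for a single input x."""
    return m * x + c


# Illustrative (hypothetical) parameters: slope 2.0, intercept 1.0
m, c = 2.0, 1.0
predictions = [predict(x, m, c) for x in [0.0, 1.0, 2.5]]
print(predictions)  # [1.0, 3.0, 6.0]
```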

Cost Function: Measures the error between the predicted and actual values over all n observations, most commonly as the Mean Squared Error (MSE).
$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_{actual, i} - y_{predicted, i})^2 $$
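
The MSE can be computed directly from two equal-length lists of values. This is a minimal sketch in plain Python; the function name `mean_squared_error` and the sample numbers are made up for illustration.

```python
def mean_squared_error(y_actual, y_predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(y_actual) != len(y_predicted):
        raise ValueError("inputs must have the same length")
    if not y_actual:
        raise ValueError("inputs must not be empty")
    n = len(y_actual)
    return sum((a - p) ** 2 for a, p in zip(y_actual, y_predicted)) / n


# Hypothetical values: squared errors are 1, 0 and 4
y_actual = [3.0, 5.0, 7.0]
y_predicted = [2.0, 5.0, 9.0]
mse = mean_squared_error(y_actual, y_predicted)
print(mse)  # (1 + 0 + 4) / 3, about 1.667
```

Because the errors are squared, the single prediction that is off by 2 contributes four times as much as the one that is off by 1, which is one reason the model is sensitive to outliers.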

Optimization: Gradient Descent, or a closed-form solution such as the normal equation, is used to minimize the cost function and find the best-fit line.
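
A rough sketch of batch gradient descent for the single-feature case follows. It repeatedly moves `m` and `c` against the gradient of the MSE. The function name, learning rate, epoch count, and toy data are all assumptions chosen for illustration; real data usually needs tuning and often feature scaling.

```python
def fit_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = m*x + c by minimizing MSE with batch gradient descent."""
    n = len(xs)
    m, c = 0.0, 0.0  # start from a flat line through the origin
    for _ in range(epochs):
        # Residuals for the current parameters: actual minus predicted
        errors = [y - (m * x + c) for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to m and c
        grad_m = (-2.0 / n) * sum(x * e for x, e in zip(xs, errors))
        grad_c = (-2.0 / n) * sum(errors)
        # Step in the direction that reduces the cost
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c


# Toy data generated from y = 2x + 1 with no noise (hypothetical)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, c = fit_gradient_descent(xs, ys)
print(round(m, 4), round(c, 4))
```

On this noise-free data the fitted parameters should approach a slope of 2 and an intercept of 1. A learning rate that is too large makes the updates diverge instead of converge.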

#### Assumptions of Linear Regression

**Linearity**: The relationship between the dependent variable and the independent variables is linear.<br>
**Independence**: Observations are independent of each other.<br>
**Homoscedasticity**: The errors have constant variance across all values of the independent variables.<br>
**Normality**: The errors are normally distributed.<br>
**No Multicollinearity**: The independent variables are not highly correlated with each other.<br>
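
One simple way to screen for multicollinearity is to compute the Pearson correlation between pairs of features. The sketch below uses plain Python. The feature values are invented for illustration, and the 0.9 cutoff is an assumed rule of thumb rather than a fixed standard.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical house features; size_sqm is nearly a rescaled copy of size_sqft
size_sqft = [800.0, 1000.0, 1200.0, 1500.0, 2000.0]
size_sqm = [74.3, 92.9, 111.5, 139.4, 185.8]
num_rooms = [3.0, 2.0, 4.0, 3.0, 5.0]

THRESHOLD = 0.9  # assumed cutoff for "highly correlated"
r_duplicate = pearson_correlation(size_sqft, size_sqm)
r_rooms = pearson_correlation(size_sqft, num_rooms)
print(abs(r_duplicate) > THRESHOLD)  # True: keep only one of the two size features
print(abs(r_rooms) > THRESHOLD)      # False: these two features can coexist
```

Pairwise correlation only catches relationships between two features at a time; a feature that is a combination of several others needs a different check.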

**Applications**:
- House Price Prediction: Predicting house prices based on features like size, location, and number of rooms.
- Sales Forecasting: Estimating future sales based on past performance.
- Medical Research: Predicting patient outcomes based on health metrics.

| **Advantages**                                                                 | **Disadvantages**                                                                  |
|-------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| Simple and easy to implement.                                                 | Assumes a linear relationship between variables, which may not always exist.      |
| Interpretable: Coefficients indicate the contribution of each feature.         | Sensitive to outliers, which can distort the model.                               |
| Computationally efficient, especially for small to medium-sized datasets.      | Struggles with multicollinearity (high correlation between independent variables). |
| Works well when the relationship between features and target is approximately linear. | Limited to predicting continuous variables only.                                  |
| Supports confidence intervals for coefficients and predictions when its assumptions hold. | Assumes homoscedasticity (constant variance of errors), which may not hold true.  |
| Can be regularized (e.g., Ridge, Lasso) to handle overfitting.                 | Not suitable for complex relationships (requires feature engineering or other models). |
| Useful as a baseline model for comparison with more complex algorithms.        | Requires careful feature selection and scaling for optimal performance.           |
| The mathematical foundation is well understood and widely studied.             | Not robust to missing data and noisy datasets.                                    |
| Easy to integrate into larger systems and pipelines.                           | Assumes no or little multicollinearity among independent variables.               |
| Works with both simple (one feature) and multiple (many features) scenarios.   | Cannot handle non-linear relationships unless features are transformed.           |
