# Simple Linear Regression

y = b0 + b1·x + ε, or equivalently \
y = β0 + β1·x + ε \
y -> dependent variable \
x -> independent variable \
b0 -> intercept coefficient \
b1 -> slope coefficient \
ε -> error

### Ordinary Least Squares:

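We have our data points on a graph. The question is which of the candidate lines fits them best. The Ordinary Least Squares (OLS) method answers this by choosing the line that minimizes the sum of the squared vertical distances (residuals) between the observed points and the line.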

<!-- ![OLS](../../Data%20Images/OLS.png) -->

<img src="../Data%20Images/OLS.png" alt="Ordinary Least Squares" width="1000">
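
As a minimal sketch of what OLS computes for a single predictor, the slope and intercept have a closed-form solution: `b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²` and `b0 = ȳ - b1·x̄`. The data values below are made up purely for illustration.

```python
import numpy as np

# Made-up sample data (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 4.1, 6.2, 7.9, 10.1])

# Closed-form OLS estimates for y = b0 + b1*x
x_mean, y_mean = x.mean(), y.mean()
b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b0 = y_mean - b1 * x_mean

# Sum of squared residuals: the quantity that OLS minimizes
residuals = y - (b0 + b1 * x)
ssr = np.sum(residuals ** 2)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # b0 = 0.22, b1 = 1.96
```

Any other choice of `b0` and `b1` would give a larger `ssr` on this data, which is what makes this line the "best" one in the least-squares sense.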


# Multiple Linear Regression

### Assumptions:
<img src="../Data%20Images/Linear Assumption.png" alt="Assumptions of linear regression" width="1000">

### Dummy Variables and Avoiding the Dummy Variable Trap

When dealing with categorical data in machine learning, we often need to convert these categories into numerical values that algorithms can understand. One common method is to use **dummy variables**.

### What are Dummy Variables?

Dummy variables, also known as indicator variables, are used to represent categorical data as binary (0 or 1) variables. Each category becomes a separate variable.

### Example: Converting Categorical Data

Suppose you have a column with the names of cities: `New York`, `Los Angeles`, and `Chicago`.

```plaintext
City
New York
Los Angeles
Chicago
New York
Chicago
Los Angeles
```

### Creating Dummy Variables

For this column, you would create three new binary variables:

1. `New York`
2. `Los Angeles`
3. `Chicago`

Each row would have a 1 in the column corresponding to its city and 0s in the others.

```plaintext
City         New York  Los Angeles  Chicago
New York     1         0            0
Los Angeles  0         1            0
Chicago      0         0            1
New York     1         0            0
Chicago      0         0            1
Los Angeles  0         1            0
```

### Avoiding the Dummy Variable Trap

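However, including all of these dummy variables together with an intercept causes perfect multicollinearity, known as the **dummy variable trap**. Because every row has exactly one dummy set to 1, the dummies always sum to 1, so any one of them can be computed exactly from the others (for example, `Chicago = 1 - New York - Los Angeles`). The redundant column adds no information and prevents the coefficients from being estimated uniquely.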

To avoid the dummy variable trap, you should use \( k-1 \) dummy variables for \( k \) categories. In our example with three categories, you would use only two dummy variables.
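
The redundancy can be checked numerically. The sketch below, which assumes a model with an intercept column, builds the design matrix with all three dummies and with only two, then compares each matrix's rank to its number of columns.

```python
import numpy as np
import pandas as pd

cities = pd.Series(
    ["New York", "Los Angeles", "Chicago", "New York", "Chicago", "Los Angeles"],
    name="City",
)
intercept = np.ones(len(cities))

# Intercept + all k = 3 dummies: 4 columns
X_full = np.column_stack([intercept, pd.get_dummies(cities, dtype=float).to_numpy()])

# Intercept + k - 1 = 2 dummies: 3 columns
X_reduced = np.column_stack(
    [intercept, pd.get_dummies(cities, drop_first=True, dtype=float).to_numpy()]
)

rank_full = np.linalg.matrix_rank(X_full)
rank_reduced = np.linalg.matrix_rank(X_reduced)

print(X_full.shape[1], rank_full)        # 4 columns, rank 3: one column is redundant
print(X_reduced.shape[1], rank_reduced)  # 3 columns, rank 3: no redundancy
```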

### Example: Using \( k-1 \) Dummy Variables

Let's choose to drop the dummy variable for `Chicago`. This means we will use only `New York` and `Los Angeles`:

```plaintext
City         New York  Los Angeles
New York     1         0
Los Angeles  0         1
Chicago      0         0
New York     1         0
Chicago      0         0
Los Angeles  0         1
```

In this setup:
- `New York` is 1 if the city is New York, 0 otherwise.
- `Los Angeles` is 1 if the city is Los Angeles, 0 otherwise.
- If both `New York` and `Los Angeles` are 0, the city must be Chicago.

### Implementation in Python

Here's how you can create dummy variables in Python using pandas:

```python
import pandas as pd

# Sample data
data = {
    'City': ['New York', 'Los Angeles', 'Chicago', 'New York', 'Chicago', 'Los Angeles']
}
df = pd.DataFrame(data)

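# Create dummy variables; drop_first=True drops the first category (Chicago).
# dtype=int gives 0/1 columns (recent pandas versions default to True/False).
dummies = pd.get_dummies(df['City'], drop_first=True, dtype=int)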

# Concatenate the original DataFrame and the dummy variables
df = pd.concat([df, dummies], axis=1)

print(df)
```

### Output:

```plaintext
          City  Los Angeles  New York
0     New York            0         1
1  Los Angeles            1         0
2      Chicago            0         0
3     New York            0         1
4      Chicago            0         0
5  Los Angeles            1         0
```

In this example:
- The `City` column is transformed into dummy variables.
- Setting `drop_first=True` in `pd.get_dummies()` avoids the dummy variable trap by dropping the first category in sorted order (`Chicago` in this case), which then acts as the baseline.

## Building A Model
In Multiple Linear Regression, we have to decide which independent variables to keep and which to discard. Here are some methods:
- All in
- Backward Elimination
- Forward Selection
- Bidirectional Elimination
- Score Comparison (All Possible Models)

### Backward Elimination
Backward Elimination is a statistical method used to select the most significant features in a regression model. This process involves iteratively removing the least significant feature based on a chosen significance level (p-value) until only statistically significant features remain.

Here's a step-by-step explanation of the Backward Elimination process:

1. **Select a significance level** (e.g., SL = 0.05).
2. **Fit the full model** including all possible predictors.
3. **Consider the predictor with the highest p-value**. If its p-value is greater than the significance level, remove this predictor; otherwise, stop and keep the current model.
4. **Refit the model** without the removed predictor.
5. **Repeat steps 3-4** until all remaining predictors have p-values below the significance level.


### Forward Selection
Forward Selection is a stepwise approach to model selection where we start with an empty model and iteratively add the feature that improves the model the most. This process continues until no remaining feature yields a significant improvement.

The procedure works as follows:

### Steps for Forward Selection

1. **Start with an empty model**.
2. **Add features one by one** based on a chosen significance level (p-value).
3. **At each step, add the feature that improves the model the most** (i.e., the one with the lowest p-value that is below the significance level).
4. **Stop when no further improvement is possible** (i.e., all remaining features have p-values above the significance level).


### Bidirectional Elimination
Bidirectional Elimination is a combination of Forward Selection and Backward Elimination. In this method, we add features step by step (as in Forward Selection), but after adding each new feature, we also check whether any of the already included features have become insignificant and should be removed (as in Backward Elimination). This ensures that the model contains only significant features at each step.

The procedure works as follows:

### Steps for Bidirectional Elimination

1. **Start with an empty model**.
2. **Add features one by one** based on a chosen significance level (p-value).
3. **At each step of adding a feature, check all features in the model** and remove those that are not significant.
4. **Repeat the process** until no more features can be added or removed.


**Bidirectional Elimination in summary**:
- **Forward Selection Step**: Add the candidate feature that improves the model the most (the lowest p-value below the significance level).
- **Backward Elimination Step**: After adding a feature, check all features already in the model and remove those that are no longer significant.
- Repeat both steps until no feature can be added or removed.

### All Possible Models
"All Possible Models" (also known as "Exhaustive Search") is a method where we evaluate every possible combination of features and pick the best model according to a chosen criterion (e.g., Adjusted R-squared, AIC, BIC). With `k` candidate features there are `2^k - 1` non-empty subsets, so the method is computationally expensive, but it guarantees finding the best subset under the chosen criterion.
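
A small sketch of an exhaustive search using Adjusted R-squared as the criterion. The helper names `adjusted_r2` and `all_possible_models`, the feature names, and the synthetic data are hypothetical; only NumPy and the standard library are used.

```python
from itertools import combinations

import numpy as np


def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit with an intercept."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def all_possible_models(X, y, names):
    """Score every non-empty feature subset; return the best one and the count."""
    results = []
    for size in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), size):
            score = adjusted_r2(X[:, list(subset)], y)
            results.append((score, [names[i] for i in subset]))
    best_score, best_features = max(results, key=lambda r: r[0])
    return best_features, best_score, len(results)


# Synthetic data (illustrative): y depends on x1 and x2 but not on "noise"
rng = np.random.default_rng(0)
n = 200
names = ["x1", "x2", "noise"]
X = rng.normal(size=(n, 3))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

best_features, best_score, n_models = all_possible_models(X, y, names)
print(best_features, round(best_score, 3), n_models)
```

With 3 features the search fits 7 models; with 20 features it would already need more than a million fits, which is why this method is only practical for small feature sets.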

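> Multiple Linear Regression does not require feature scaling. The coefficients compensate for differences in scale: a feature with large values gets a proportionally smaller coefficient, and vice versa.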
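
This can be checked with a short experiment. In the sketch below, the feature names (`salary`, `years`) and all values are made up; the same model is fitted once with salary in raw units and once with salary in thousands.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
salary = rng.uniform(30_000, 120_000, size=n)  # large-valued feature
years = rng.uniform(0, 20, size=n)             # small-valued feature
y = 5.0 + 0.0002 * salary + 1.5 * years + rng.normal(scale=0.1, size=n)


def fit(X, y):
    """Least-squares fit with an intercept; returns coefficients and predictions."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta, design @ beta


X_raw = np.column_stack([salary, years])
X_rescaled = np.column_stack([salary / 1000.0, years])  # salary in thousands

beta_raw, pred_raw = fit(X_raw, y)
beta_rescaled, pred_rescaled = fit(X_rescaled, y)

# Predictions are identical; only the salary coefficient changes, by a factor of 1000
print(np.allclose(pred_raw, pred_rescaled))
print(beta_raw[1], beta_rescaled[1])
```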

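##### Multiple Linear Regression with `sklearn.linear_model.LinearRegression`
`LinearRegression` tolerates the dummy variable trap: it still fits and predicts correctly when a redundant dummy column is present, so we do not have to drop one ourselves to get predictions. It does not perform Backward Elimination or any other feature selection, however. It fits every column it is given, so feature selection remains a separate step when it is needed.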
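
A quick check of how `LinearRegression` behaves with redundant dummy columns, assuming scikit-learn is installed. The `Price` values are made up for illustration. The model keeps one coefficient per input column in both cases, so no feature is removed automatically.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "City": ["New York", "Los Angeles", "Chicago", "New York", "Chicago", "Los Angeles"],
    "Price": [10.0, 8.0, 6.0, 11.0, 7.0, 9.0],  # made-up target values
})
y = df["Price"]

X_all = pd.get_dummies(df["City"], dtype=float)                       # k dummies
X_dropped = pd.get_dummies(df["City"], drop_first=True, dtype=float)  # k - 1 dummies

model_all = LinearRegression().fit(X_all, y)
model_dropped = LinearRegression().fit(X_dropped, y)

pred_all = model_all.predict(X_all)
pred_dropped = model_dropped.predict(X_dropped)

print(np.allclose(pred_all, pred_dropped))               # True
print(len(model_all.coef_), len(model_dropped.coef_))    # 3 2
```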