This tutorial covers **multiple linear regression** using Python, specifically for predicting home prices based on features like **area, number of bedrooms, and age**. Below is a detailed breakdown of the key concepts and steps involved.

---

## **1. Introduction to Multiple Linear Regression**
Multiple Linear Regression is an extension of Simple Linear Regression. It allows us to predict a **dependent variable (target variable)** using **multiple independent variables (features).**  

### **1.1 Linear Regression Equation**
For multiple variables, the equation is:
\[
\text{Price} = m_1 \times \text{Area} + m_2 \times \text{Bedrooms} + m_3 \times \text{Age} + b
\]
Where:
- \( m_1, m_2, m_3 \) are **coefficients** (weights assigned to each variable).
- \( b \) is the **intercept**.
- **Area, Bedrooms, and Age** are independent variables (features).
- **Price** is the dependent variable (target).

The equation can be generalized for \( n \) variables as:
\[
Y = m_1 X_1 + m_2 X_2 + \dots + m_n X_n + b
\]
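
The generalized equation maps directly onto a matrix-vector product. Here is a minimal sketch with NumPy; the coefficients, intercept, and feature rows are made-up numbers chosen only for illustration, not values learned from the tutorial's dataset:

```python
import numpy as np

# Hypothetical coefficients m_1..m_n and intercept b (illustration only)
m = np.array([100.0, 20000.0, -3000.0])
b = 200000.0

# Each row is one observation with features X_1..X_n (area, bedrooms, age)
X = np.array([
    [3000, 3, 40],
    [2500, 4, 5],
])

# Y = m_1*X_1 + m_2*X_2 + ... + m_n*X_n + b, evaluated for every row at once
y = X @ m + b
print(y)
```

Writing the sum as `X @ m + b` gives one prediction per row without an explicit loop.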

---

## **2. Preprocessing the Data**
Before applying a machine learning model, we must **clean and preprocess** the data.

### **2.1 Handling Missing Data**
- In the dataset, **some bedroom values are missing**.
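- To handle this, we replace the missing values with the column's **median** (middle value), rounded down so the bedroom count stays a whole number.  
  ```python
  import math  # for math.floor
  # Round the median down, then assign the filled column back (avoids chained inplace fillna)
  median_bedrooms = math.floor(df.bedrooms.median())
  df["bedrooms"] = df["bedrooms"].fillna(median_bedrooms)
  ```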
- **Why use the median?**  
  The median is **less affected by outliers** compared to the mean.
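
To see the effect of an outlier on each statistic, consider this small sketch using Python's standard `statistics` module. The bedroom counts are invented for illustration:

```python
import statistics

# Made-up bedroom counts, then the same list with one extreme value added
bedrooms = [2, 3, 3, 4, 4]
with_outlier = bedrooms + [20]

mean_before = statistics.mean(bedrooms)
median_before = statistics.median(bedrooms)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

# The mean shifts much more than the median when the outlier is added
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

In this toy list the single extreme value nearly doubles the mean, while the median moves by only half a bedroom.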

### **2.2 Checking Linear Relationship**
- The relationship between each **feature** and **price** should be approximately **linear**.
- From the dataset, we observe:
  - As **area** and **bedrooms** increase → **Price increases**.
  - As **age** increases → **Price decreases**.

Since these relationships look roughly **linear**, **linear regression** is a reasonable model to apply.
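
Scatter plots are the usual way to inspect these relationships. As a quick numeric complement, one option is to look at the correlation of each feature with price. The sketch below uses an invented DataFrame, not the tutorial's CSV, with column names assumed to match those used later:

```python
import pandas as pd

# Made-up sample data, for illustration only
sample = pd.DataFrame({
    "area": [1000, 1500, 2000, 2500, 3000],
    "bedrooms": [2, 2, 3, 3, 4],
    "age": [30, 25, 20, 10, 5],
    "price": [200000, 260000, 330000, 420000, 500000],
})

# Pearson correlation of every column with price:
# values near +1 or -1 suggest a strong linear relationship
corr_with_price = sample.corr()["price"]
print(corr_with_price)
```

A positive value for `area` and `bedrooms` and a negative value for `age` would match the pattern described above. Correlation only measures linear association, so it is a rough check rather than proof that the model is appropriate.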

---

## **3. Training the Model**
### **3.1 Import Required Libraries**
```python
import pandas as pd
import numpy as np
import math
from sklearn.linear_model import LinearRegression
```

### **3.2 Load Data**
```python
df = pd.read_csv("home_prices.csv")
```
- The dataset contains **home prices** along with **area, bedrooms, and age**.

### **3.3 Create and Train the Model**
```python
reg = LinearRegression()
reg.fit(df[['area', 'bedrooms', 'age']], df.price)
```
- The `.fit()` method trains the model by finding the **coefficients** (slopes) and **intercept** that minimize the sum of squared differences between the predicted and actual prices.
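
To illustrate what "finding the best coefficients" means, the sketch below solves the same kind of least-squares problem directly with `numpy.linalg.lstsq`. It illustrates the idea and is not scikit-learn's internal code; the data and the "true" coefficients are invented so that the result can be checked:

```python
import numpy as np

# Made-up feature rows (area, bedrooms, age), for illustration only
X = np.array([
    [1000, 2, 30],
    [1500, 2, 25],
    [2000, 3, 20],
    [2500, 3, 10],
    [3000, 4, 5],
], dtype=float)

# Generate prices from known, hypothetical coefficients so the fit can be verified
true_m = np.array([100.0, 20000.0, -3000.0])
true_b = 200000.0
y = X @ true_m + true_b

# Append a column of ones so the intercept is estimated as one more coefficient
A = np.column_stack([X, np.ones(len(X))])

# Least-squares solution: minimizes the sum of squared residuals ||A w - y||^2
w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
coefficients, intercept = w[:3], w[3]
print(coefficients, intercept)
```

Because the prices here were generated without noise, the recovered coefficients match the ones used to build `y`. With real data the fit would only approximate the prices.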

### **3.4 Understanding Coefficients**
```python
print(reg.coef_)  # [m1, m2, m3]
print(reg.intercept_)  # b
```
- The **coefficients (m1, m2, m3)** represent how much the predicted price changes **per unit change** in each feature, with the other features held constant.
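
The "per unit change" reading can be checked by changing one feature at a time. In this small sketch, the coefficient and intercept values are hypothetical placeholders rather than the output of `reg.coef_`:

```python
# Hypothetical coefficients and intercept (illustration only)
M_AREA, M_BEDROOMS, M_AGE = 100.0, 20000.0, -3000.0
B = 200000.0

def predict_price(area, bedrooms, age):
    """Evaluate the regression equation for one home."""
    return M_AREA * area + M_BEDROOMS * bedrooms + M_AGE * age + B

base = predict_price(3000, 3, 40)
one_more_bedroom = predict_price(3000, 4, 40)  # only bedrooms changes
one_year_older = predict_price(3000, 3, 41)    # only age changes

# Each difference equals the coefficient of the feature that changed
print(one_more_bedroom - base, one_year_older - base)
```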

---

## **4. Making Predictions**
Once the model is trained, we can predict house prices for new data.

### **4.1 Predicting for a Home (3000 sq ft, 3 bedrooms, 40 years old)**
```python
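# Pass a DataFrame with the training column names to avoid a feature-name warning
reg.predict(pd.DataFrame([[3000, 3, 40]], columns=['area', 'bedrooms', 'age']))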
```
- Example output: **$498,408**  
  If we change the **age to 15** and keep the other inputs the same, the predicted price increases, which shows that the model has learned a negative coefficient for **age**.

### **4.2 Formula-Based Prediction**
The predicted price is calculated as:
\[
\text{Price} = m_1 \times 3000 + m_2 \times 3 + m_3 \times 40 + b
\]
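
One way to confirm that `predict` applies exactly this formula is to compute both and compare them. The sketch below fits a model on invented data (not the tutorial's CSV), so the printed numbers will differ from the example output above:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training data, for illustration only
train = pd.DataFrame({
    "area": [1000, 1500, 2000, 2500, 3000],
    "bedrooms": [2, 2, 3, 3, 4],
    "age": [30, 25, 20, 10, 5],
    "price": [200000, 260000, 330000, 420000, 500000],
})
features = ["area", "bedrooms", "age"]
reg = LinearRegression().fit(train[features], train["price"])

# Prediction through the model
home = pd.DataFrame([[3000, 3, 40]], columns=features)
predicted = reg.predict(home)[0]

# Same calculation by hand: m1*3000 + m2*3 + m3*40 + b
manual = float(np.dot(reg.coef_, [3000, 3, 40]) + reg.intercept_)
print(predicted, manual)
```

The two values should agree up to floating-point rounding.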

---

## **5. Exercise: Salary Prediction**
The exercise involves building a **multiple linear regression model** to predict a **candidate's salary** based on:
1. **Years of experience**
2. **Written test score**
3. **Personal interview score**

### **5.1 Challenges in Dataset**
- Missing experience values → Treat them as **zero**.
- Experience values written as words → Convert them to numbers using the `word2number` module.
- Missing test/interview scores → Fill them with the column **median**.
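
The three cleaning steps above could be sketched as follows. The column names and rows are invented for illustration, and a small lookup dictionary stands in for `word2number` so that the snippet runs without extra packages. With that package installed, `w2n.word_to_num` (from `from word2number import w2n`) would take the dictionary's place.

```python
import pandas as pd

# Hypothetical stand-in for word2number, covering only the words used below
WORD_TO_NUMBER = {"zero": 0, "two": 2, "five": 5, "seven": 7, "ten": 10}

# Made-up rows with the kinds of gaps described above
df = pd.DataFrame({
    "experience": [None, "five", "two", "seven", "ten"],
    "test_score": [8.0, None, 6.0, 10.0, 9.0],
    "interview_score": [9.0, 6.0, None, 10.0, 6.0],
})

# Step 1 and 2: missing experience becomes "zero", then words become numbers
df["experience"] = df["experience"].fillna("zero").map(WORD_TO_NUMBER)

# Step 3: fill missing scores with each column's median
for column in ["test_score", "interview_score"]:
    df[column] = df[column].fillna(df[column].median())

print(df)
```

After these steps every column is numeric, so the frame can be passed to `LinearRegression().fit` in the same way as the home-price data.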

### **5.2 Predict Salaries**
- Candidate 1: **2 years of experience, written test score 9, interview score 6**.
- Candidate 2: **12 years of experience, written test score 10, interview score 10**.

Train the model and predict their salaries.

---

## **6. Summary**
- **Multiple Linear Regression** is used for prediction with multiple input variables.
- **Data Preprocessing** (handling missing values) is crucial.
- **Model Training** finds the best coefficients.
- **Predictions** are made using the learned equation.
