Q1. In order to predict house price based on several characteristics, such as location, square footage,
number of bedrooms, etc., you are developing an SVM regression model. Which regression metric in this
situation would be the best to employ?

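For predicting house prices with an SVM regression model, **Mean Absolute Error (MAE)** and **Root Mean Squared Error (RMSE)** are the two most common choices. Both are expressed in the same units as the price, so either can be read directly as a typical error size: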

- **MAE (Mean Absolute Error):** The average magnitude of the prediction errors, ignoring their direction. Because the errors are not squared, it is less sensitive to outliers than RMSE.

  \[
  \text{MAE} = \frac{1}{N} \sum_{i=1}^{N} | y_i - \hat{y}_i |
  \]

- **RMSE (Root Mean Squared Error):** The square root of the average squared difference between actual and predicted values. Because errors are squared before averaging, it penalizes larger errors more heavily than MAE.

  \[
  \text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}
  \]

**Choosing Between MAE and RMSE:**

- **MAE** is preferred if you want a metric that is more robust to outliers.
- **RMSE** is useful if larger errors are particularly undesirable and you want to penalize them more heavily.
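
To make the difference concrete, here is a minimal sketch that computes both metrics by hand from the formulas above. The prices and predictions are hypothetical values (in thousands of dollars) chosen for illustration, not the output of a fitted SVM.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average of (y_i - y_hat_i)^2."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical house prices and predictions, in thousands of dollars.
y_true = [200.0, 250.0, 300.0, 350.0]
y_pred = [210.0, 240.0, 310.0, 310.0]  # absolute errors: 10, 10, 10, 40

mae_value = mae(y_true, y_pred)    # (10 + 10 + 10 + 40) / 4 = 17.5
rmse_value = rmse(y_true, y_pred)  # sqrt((100 + 100 + 100 + 1600) / 4) = sqrt(475)

print(f"MAE:  {mae_value:.2f}")
print(f"RMSE: {rmse_value:.2f}")
```

The single 40-unit miss pulls RMSE well above MAE, which is the behavior to weigh when deciding how much large errors should count.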

Q2. You have built an SVM regression model and are trying to decide between using MSE or R-squared as
your evaluation metric. Which metric would be more appropriate if your goal is to predict the actual price
of a house as accurately as possible?

If your goal is to predict the actual price of a house as accurately as possible, **Mean Squared Error (MSE)** would be more appropriate.

**Reason:**

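- **MSE (Mean Squared Error):** The average squared difference between actual and predicted values. It is an absolute measure of error: a lower MSE means the predictions are closer to the actual prices, with large errors weighted more heavily. Its units are the square of the target's units, so take the square root (RMSE) to read it on the price scale.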

  \[
  \text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2
  \]

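**R-squared** is a relative measure: it reports the proportion of variance explained compared with always predicting the mean price, and it says nothing about how large the errors are in price terms. The same error size can yield very different R-squared values depending on how spread out the prices are. Therefore, MSE is preferred for directly evaluating prediction accuracy.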
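
The following sketch illustrates why R-squared alone does not tell you how far off the predictions are. It uses two hypothetical datasets (prices in thousands of dollars) in which every prediction is off by exactly 10, so MSE is identical, while R-squared differs sharply because the spread of the true prices differs.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical data: every prediction is 10 too high in both cases.
narrow_true = [290.0, 300.0, 310.0]  # prices clustered together
narrow_pred = [300.0, 310.0, 320.0]
wide_true = [100.0, 300.0, 500.0]    # prices spread far apart
wide_pred = [110.0, 310.0, 510.0]

mse_narrow = mse(narrow_true, narrow_pred)       # 100.0
mse_wide = mse(wide_true, wide_pred)             # 100.0
r2_narrow = r_squared(narrow_true, narrow_pred)  # 1 - 300 / 200 = -0.5
r2_wide = r_squared(wide_true, wide_pred)        # 1 - 300 / 80000 = 0.99625

print(mse_narrow, mse_wide)
print(r2_narrow, r2_wide)
```

On a single fixed test set the two metrics rank models the same way, since R-squared is a rescaling of MSE by the variance of the target. The difference matters when you need to state how wrong the price predictions are.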

Q3. You have a dataset with a significant number of outliers and are trying to select an appropriate
regression metric to use with your SVM model. Which metric would be the most appropriate in this
scenario?

In the presence of significant outliers, **Mean Absolute Error (MAE)** is the most appropriate regression metric to use.

**Reason:**

- **MAE (Mean Absolute Error):** Measures the average magnitude of errors in predictions without squaring them, making it less sensitive to outliers compared to metrics like MSE or RMSE. This helps provide a more robust evaluation when outliers are present.

  \[
  \text{MAE} = \frac{1}{N} \sum_{i=1}^{N} | y_i - \hat{y}_i |
  \]
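
As a rough illustration of this robustness, the sketch below compares how much each metric grows when one residual in a small, hypothetical set is replaced by an outlier. The residual values are made up for the example.

```python
import math

def mae(residuals):
    """Mean absolute error computed directly from residuals y_i - y_hat_i."""
    return sum(abs(r) for r in residuals) / len(residuals)

def rmse(residuals):
    """Root mean squared error computed directly from residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical residuals, in thousands of dollars.
clean = [10.0, -10.0, 10.0, -10.0, 10.0]
with_outlier = [10.0, -10.0, 10.0, -10.0, 200.0]  # last residual is an outlier

mae_growth = mae(with_outlier) / mae(clean)     # 48 / 10 = 4.8
rmse_growth = rmse(with_outlier) / rmse(clean)  # sqrt(8080) / 10, about 9.0

print(f"MAE grew by a factor of  {mae_growth:.2f}")
print(f"RMSE grew by a factor of {rmse_growth:.2f}")
```

One outlier inflates RMSE almost twice as much as MAE here, so a few extreme houses would dominate an RMSE-based comparison far more than an MAE-based one.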

Q4. You have built an SVM regression model using a polynomial kernel and are trying to select the best
metric to evaluate its performance. You have calculated both MSE and RMSE and found that both values
are very close. Which metric should you choose to use in this case?

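RMSE is always the square root of MSE, so the two metrics rank models identically, and their values are close only when the error happens to be near 1 (or near 0) in the target's units. The closeness therefore carries no information about the model. **Root Mean Squared Error (RMSE)** is generally the better choice to report for your SVM regression model.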

**Reason:**

- **RMSE (Root Mean Squared Error):** Provides error measurements in the same units as the target variable (e.g., house prices), making it more interpretable and directly comparable to the actual values. Like MSE, it penalizes larger errors more heavily, which is useful if minimizing large errors is important.

  \[
  \text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}
  \]
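
The sketch below shows that MSE and RMSE being close is an artifact of the units rather than a property of the model. The residuals are hypothetical. Expressed in thousands of dollars they give nearly equal MSE and RMSE, but the same residuals expressed in dollars give values that differ by three orders of magnitude.

```python
import math

def mse(residuals):
    """Mean squared error from residuals; units are the target's units squared."""
    return sum(r * r for r in residuals) / len(residuals)

def rmse(residuals):
    """Root mean squared error from residuals; same units as the target."""
    return math.sqrt(mse(residuals))

# Hypothetical residuals in thousands of dollars.
residuals_k = [1.0, -1.0, 1.2, -0.8]
# The same residuals expressed in dollars.
residuals_usd = [r * 1000.0 for r in residuals_k]

mse_k, rmse_k = mse(residuals_k), rmse(residuals_k)        # 1.02 and about 1.01
mse_usd, rmse_usd = mse(residuals_usd), rmse(residuals_usd)

print(f"thousands: MSE={mse_k:.4f}  RMSE={rmse_k:.4f}")
print(f"dollars:   MSE={mse_usd:.1f}  RMSE={rmse_usd:.1f}")
```

Rescaling the target multiplies RMSE by the same factor (1000) but MSE by its square, which is why RMSE stays readable as a typical error in price terms.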

Q5. You are comparing the performance of different SVM regression models using different kernels (linear,
polynomial, and RBF) and are trying to select the best evaluation metric. Which metric would be most
appropriate if your goal is to measure how well the model explains the variance in the target variable?

If your goal is to measure how well the model explains the variance in the target variable, **R-squared (Coefficient of Determination)** would be the most appropriate evaluation metric.

**Reason:**

- **R-squared:** Indicates the proportion of variance in the target variable that is explained by the model. It is a relative measure of fit, compared against a baseline that always predicts the mean of the target variable: 1 is a perfect fit, 0 matches the baseline, and negative values are worse than the baseline.

  \[
  R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}
  \]

R-squared helps you understand how effectively each kernel captures the variability in the target variable, making it suitable for comparing models in terms of variance explained. For a fair comparison, compute it for every kernel on the same held-out data.
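
A minimal sketch of such a comparison follows. The prediction lists are hypothetical stand-ins for the outputs of three SVM models on one shared test set. They are invented for illustration and are not results from fitted models. A mean-only baseline is included to show that it scores exactly 0.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical test-set prices, in thousands of dollars (mean is 300).
y_true = [200.0, 250.0, 300.0, 350.0, 400.0]

# Hypothetical predictions standing in for each kernel's model output.
predictions = {
    "linear": [220.0, 240.0, 310.0, 330.0, 390.0],
    "polynomial": [190.0, 270.0, 290.0, 370.0, 380.0],
    "rbf": [205.0, 245.0, 300.0, 355.0, 395.0],
    "mean baseline": [300.0] * 5,
}

# Score every candidate on the same data so the values are comparable.
scores = {name: r_squared(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:>14}: R^2 = {score:.3f}")
print("highest R^2:", best)
```

With real models, the prediction lists would come from each fitted regressor's output on the held-out set, and the same loop would apply unchanged.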