
### Q1. Best Regression Metric for Predicting House Prices

For predicting house prices using an SVM regression model, the Mean Absolute Error (MAE) is often a suitable metric. MAE measures the average magnitude of errors in a set of predictions, without considering their direction. It is particularly useful in this context because:

- **Interpretability**: MAE is easy to interpret, as it represents the average difference between the predicted and actual prices in the same unit (currency) as the target variable.
- **Robustness to Outliers**: Unlike MSE (Mean Squared Error), MAE is less sensitive to outliers, making it a good choice when the dataset contains some outlier prices.
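
As a minimal sketch of what MAE measures, the snippet below computes it by hand with NumPy. The four prices and predictions are made-up illustrative values, not data from any real dataset.

```python
import numpy as np

# Hypothetical sale prices and model predictions, in the same currency unit.
actual = np.array([200_000, 350_000, 150_000, 500_000])
predicted = np.array([210_000, 340_000, 165_000, 480_000])

# MAE: the mean of the absolute errors, ignoring whether each one is over or under.
abs_errors = np.abs(actual - predicted)
mae = abs_errors.mean()

print(f"MAE: {mae:,.0f}")  # MAE: 13,750
```

The result reads directly as "predictions are off by about 13,750 currency units on average", which is the interpretability advantage described above.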

### Q2. Choosing Between MSE and R-squared for House Price Prediction

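If your goal is to predict the actual price of a house as accurately as possible, Mean Squared Error (MSE) is the more appropriate of the two. MSE is the average squared difference between the predicted and actual values, so it measures prediction error directly and penalizes larger errors more heavily than smaller ones. R-squared, by contrast, reports the proportion of the variance in prices that the model explains; it is a relative measure and does not tell you how far off a typical prediction is. Because MSE is expressed in squared currency units, take its square root (RMSE) when you need an error figure in the same unit as the price.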
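
The difference between the two metrics can be illustrated with a small hand computation. This is a sketch on made-up values; the names `ss_res` and `ss_tot` are just labels chosen here for the residual and total sums of squares.

```python
import numpy as np

# Hypothetical prices and predictions (illustrative values only).
actual = np.array([200_000, 350_000, 150_000, 500_000], dtype=float)
predicted = np.array([210_000, 340_000, 165_000, 480_000], dtype=float)

# Residual sum of squares: how far predictions are from the actual prices.
ss_res = np.sum((actual - predicted) ** 2)
# Total sum of squares: how far the actual prices are from their own mean.
ss_tot = np.sum((actual - actual.mean()) ** 2)

mse = ss_res / len(actual)   # absolute error measure, in squared currency units
r2 = 1 - ss_res / ss_tot     # relative measure, unitless

print(f"MSE: {mse:,.0f}")
print(f"R-squared: {r2:.3f}")
```

MSE depends only on the size of the errors, while R-squared also depends on how spread out the actual prices are, which is why it says little about the size of a typical error.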

### Q3. Regression Metric for a Dataset with Significant Outliers

For a dataset with a significant number of outliers, the Mean Absolute Error (MAE) is usually the most appropriate of the metrics discussed here. Each error contributes to MAE in proportion to its size, whereas MSE squares the error term, so a single large miss can dominate the MSE. This makes MAE more robust in scenarios where outliers could disproportionately influence the evaluation metric.
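
A small experiment makes the effect concrete. The sketch below uses made-up values (in thousands) and replaces one observation with an outlier to see how much each metric moves; the helper names `mae` and `mse` are defined here for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical prices in thousands, with predictions that are all close.
predicted = np.array([202.0, 208.0, 193.0, 204.0, 197.0])
clean = np.array([200.0, 210.0, 190.0, 205.0, 195.0])

# Same data, but the last house sold for far more than predicted.
with_outlier = clean.copy()
with_outlier[-1] = 297.0

mae_growth = mae(with_outlier, predicted) / mae(clean, predicted)
mse_growth = mse(with_outlier, predicted) / mse(clean, predicted)

print(f"MAE grows by a factor of {mae_growth:.1f}")
print(f"MSE grows by a factor of {mse_growth:.1f}")
```

With these numbers the single outlier inflates MSE far more than MAE, because its error of 100 enters MSE as 10,000.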

### Q4. Choosing Between MSE and RMSE When Values Are Close

MSE (Mean Squared Error) and RMSE (Root Mean Squared Error) carry the same information: RMSE is the square root of MSE, so the two always rank models in the same order. Their numeric values are close only when the MSE itself is near 1 (or 0). The choice between them therefore comes down to interpretability:

- **RMSE**: RMSE is in the same unit as the target variable, so it can be read as a typical error size. It is not the average absolute error (that is MAE); large errors still weigh more heavily.
- **MSE**: MSE is the mean of the squared errors, expressed in squared units of the target, which makes it harder to interpret directly.

RMSE is generally preferred for reporting because it is in the same unit as the target variable.
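
Because RMSE is the square root of MSE, ranking models by either metric gives the same order. The sketch below checks this on hypothetical MSE values for three models; the numbers are illustrative, not measured results.

```python
import math

# Hypothetical test-set MSE values for three models (illustrative only).
mse_by_model = {"linear": 4.0, "poly": 2.25, "rbf": 1.0}

# RMSE is a monotonic transform of MSE.
rmse_by_model = {name: math.sqrt(value) for name, value in mse_by_model.items()}

# Rank models from best (lowest error) to worst under each metric.
rank_by_mse = sorted(mse_by_model, key=mse_by_model.get)
rank_by_rmse = sorted(rmse_by_model, key=rmse_by_model.get)

print(rank_by_mse)   # ['rbf', 'poly', 'linear']
print(rank_by_rmse)  # ['rbf', 'poly', 'linear']
```

Note that the model with an MSE of exactly 1.0 also has an RMSE of 1.0, which is the situation in which the two values coincide.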

### Q5. Best Metric for Comparing SVM Models with Different Kernels

When comparing the performance of different SVM regression models with different kernels (linear, polynomial, and RBF), the R-squared (\( R^2 \)) metric would be most appropriate if your goal is to measure how well the model explains the variance in the target variable. \( R^2 \) indicates the proportion of the variance in the dependent variable that is predictable from the independent variables. It is useful for:

- **Comparative Analysis**: \( R^2 \) provides a clear comparison between models, showing which model explains the variance better.
- **Model Explanation**: Higher \( R^2 \) values indicate better explanatory power of the model.
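
One way such a comparison might look is sketched below, using cross-validated \( R^2 \) so that every kernel is scored on the same folds. The data comes from `make_regression` as a synthetic stand-in, and `C=10.0` is an arbitrary choice made for this illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (assumption: stands in for a real housing dataset).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf"]:
    # Scaling happens inside the pipeline, so it is refit on each training fold.
    model = make_pipeline(StandardScaler(), SVR(kernel=kernel, C=10.0))
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    scores[kernel] = fold_scores.mean()

# Report kernels from highest to lowest mean R^2.
for kernel, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kernel:>6}: mean R^2 = {score:.3f}")
```

The ranking depends on the data and on the hyperparameters chosen for each kernel, so the output here says nothing about which kernel suits house prices.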

### Example Workflow for SVM Regression Model

1. **Import Libraries and Load Dataset**:
   ```python
   import pandas as pd
   from sklearn.model_selection import train_test_split
   from sklearn.preprocessing import StandardScaler
   from sklearn.svm import SVR
   from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

   # Load dataset
   data = pd.read_csv('house_prices.csv')
   ```

2. **Split Dataset into Training and Testing Sets**:
   ```python
   X = data.drop('price', axis=1)
   y = data['price']
   X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
   ```

3. **Preprocess the Data**:
   ```python
   scaler = StandardScaler()
   X_train_scaled = scaler.fit_transform(X_train)
   X_test_scaled = scaler.transform(X_test)
   ```

4. **Create and Train the SVR Model**:
   ```python
   svr = SVR(kernel='poly', degree=3, C=1.0, epsilon=0.1)
   svr.fit(X_train_scaled, y_train)
   ```

5. **Predict and Evaluate the Model**:
   ```python
   y_pred = svr.predict(X_test_scaled)

   # Calculate Metrics
   mae = mean_absolute_error(y_test, y_pred)
   mse = mean_squared_error(y_test, y_pred)
   # RMSE is the square root of MSE; the squared=False argument is no longer
   # available in recent scikit-learn releases.
   rmse = mse ** 0.5
   r2 = r2_score(y_test, y_pred)

   print(f"MAE: {mae}")
   print(f"MSE: {mse}")
   print(f"RMSE: {rmse}")
   print(f"R-squared: {r2}")
   ```

6. **Tune Hyperparameters using GridSearchCV**:
   ```python
   from sklearn.model_selection import GridSearchCV

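   # One sub-grid per kernel, so that parameters a kernel ignores
   # (degree for linear/rbf, gamma for linear) are not searched redundantly.
   shared = {'C': [0.1, 1, 10, 100], 'epsilon': [0.1, 0.2, 0.3]}
   param_grid = [
       {'kernel': ['linear'], **shared},
       {'kernel': ['rbf'], 'gamma': ['scale', 'auto'], **shared},
       {'kernel': ['poly'], 'degree': [2, 3, 4], 'gamma': ['scale', 'auto'], **shared},
   ]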

   grid_search = GridSearchCV(SVR(), param_grid, refit=True, verbose=3)
   grid_search.fit(X_train_scaled, y_train)
   ```

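7. **Retrieve the Tuned Model**:
   ```python
   # With refit=True, GridSearchCV has already retrained the best configuration
   # on the full training set, so no further call to fit() is needed.
   best_svr = grid_search.best_estimator_
   print(grid_search.best_params_)
   ```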

8. **Save the Model**:
   ```python
   import pickle

   with open('svr_model.pkl', 'wb') as file:
       pickle.dump(best_svr, file)
   ```
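
The workflow above scales the data before the grid search and saves only the SVR, so the scaler has to be kept and reapplied separately. A possible alternative, sketched below, wraps both steps in a scikit-learn `Pipeline` so that scaling is refit inside each cross-validation fold and is saved together with the model. The synthetic data, the deliberately small grid, and the choice of MAE as the tuning score are assumptions made for this illustration.

```python
import pickle

import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the house price data (assumption, not the real dataset).
X, y = make_regression(n_samples=150, n_features=4, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is a pipeline step, so it is refit on each CV training fold.
pipeline = Pipeline([("scaler", StandardScaler()), ("svr", SVR())])

# Small grid for speed; parameter names are prefixed with the step name.
param_grid = {"svr__kernel": ["linear", "rbf"], "svr__C": [1, 10, 100]}

# scikit-learn maximizes scores, hence the negated MAE.
search = GridSearchCV(pipeline, param_grid, scoring="neg_mean_absolute_error", cv=3)
search.fit(X_train, y_train)

best_model = search.best_estimator_
test_mae = mean_absolute_error(y_test, best_model.predict(X_test))
print(search.best_params_, f"test MAE: {test_mae:.2f}")

# Round-trip through pickle in memory; the restored object includes the scaler.
restored = pickle.loads(pickle.dumps(best_model))
same = bool(np.allclose(restored.predict(X_test), best_model.predict(X_test)))
print("restored model reproduces predictions:", same)
```

The restored pipeline accepts unscaled features directly, which removes the need to store and reapply a separate `StandardScaler` at prediction time.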