## Support Vector Regression

Support vector regression (SVR) is not a perfect fit for this problem, but it generally copes with unevenly distributed spatial data better than linear regression does, so it may produce a more useful model. To fit an SVR, we first import the libraries and load the data.

In [4]:
### import libraries ###

import pandas as pd
from sklearn.svm import SVR
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.model_selection import GridSearchCV

In [2]:
### importing data ###

# features
features_train = pd.read_csv('data/features_train.csv', index_col = 0)
features_test = pd.read_csv('data/features_test.csv', index_col = 0)

# target
target_train = pd.read_csv('data/target_train.csv', index_col = 0)
target_test = pd.read_csv('data/target_test.csv', index_col = 0)

Next we fit an SVR, using a grid search over the kernel and the regularization parameter `C` to choose the hyperparameters.

In [12]:
### building grid search ###

# choosing SVR
model = SVR()

# parameters to search through
parameters = {'kernel' : ('linear', 'rbf'), 'C' : [0.01, 0.1, 1, 10, 100, 1000]}

# fitting grid search
clf = GridSearchCV(model, parameters)
clf.fit(features_train, target_train.values.ravel())

# showing best parameters
clf.best_params_

{'C': 1, 'kernel': 'rbf'}
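To give a sense of how much work this search does, the sketch below enumerates the same grid by hand. It is only an illustration: `candidates`, `n_folds`, and `n_fits` are names introduced here, and the fold count assumes scikit-learn's default of 5-fold cross-validation, since `cv` was not passed to `GridSearchCV` above.

```python
from itertools import product

# Same grid as the search above.
parameters = {'kernel': ('linear', 'rbf'), 'C': [0.01, 0.1, 1, 10, 100, 1000]}

# Every combination of the listed values is one candidate model.
names = list(parameters)
candidates = [dict(zip(names, values)) for values in product(*parameters.values())]

# Assumption: 5 folds, the scikit-learn default when cv is not specified.
n_folds = 5
n_fits = len(candidates) * n_folds

print(len(candidates), 'candidates,', n_fits, 'cross-validation fits')
```

Each candidate is scored on held-out folds of the training data only, so the selected parameters are the best within this grid and are not guaranteed to do well on the test set.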

The grid search selected `C = 1` with the `rbf` kernel, so we refit an SVR with those settings and evaluate it on the test set.

In [14]:
### checking metrics ###

# choosing best parameters
model = SVR(C = 1, kernel = 'rbf')

# fitting model
model.fit(features_train, target_train.values.ravel())

# predicted target
pred = model.predict(features_test)

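# RMSE as the square root of the MSE (the `squared` argument is deprecated in newer scikit-learn releases)
rmse = mean_squared_error(target_test, pred) ** 0.5
print('RMSE is:', rmse, 'and the r2 is:', r2_score(target_test, pred))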

RMSE is: 1.0633770643225113 and the r2 is: -0.13077078092716254


SVR is not a good model here: the negative r2 means it predicts the test set worse than a horizontal line at the mean of the test target.
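The comparison with a horizontal line follows from the definition of r2 as one minus the ratio of the residual sum of squares to the total sum of squares. The minimal sketch below uses made-up numbers (not the forest fire data) and a hand-written helper `r2`, both introduced here for illustration, to show that a constant prediction at the mean scores exactly 0 and anything with larger errors scores below 0.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up target values, for illustration only.
y_true = [0.0, 1.0, 2.0, 3.0]

# A horizontal line at the mean of the target.
mean_pred = [1.5, 1.5, 1.5, 1.5]

# A prediction whose errors are larger than those of the horizontal line.
poor_pred = [3.0, 2.0, 1.0, 0.0]

r2_mean = r2(y_true, mean_pred)
r2_poor = r2(y_true, poor_pred)

print('mean line r2:', r2_mean)
print('poor model r2:', r2_poor)
```

An r2 of -0.13 therefore means the SVR's squared error on the test set is about 13% larger than that of simply predicting the test mean.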

## Conclusion
SVR is not a good model for predicting forest fire damage. On the test set:
* r2 = -0.1308
* RMSE = 1.0634