## Regression with SVR (Support Vector Regression) on house_prices
##### SVR (Support Vector Regression) builds on the same concepts as SVM; its distinguishing feature is the epsilon parameter, which sets the margin around the prediction within which errors are not penalized.
##### The kernel tested in this notebook is RBF (according to Jones, the one that gives the best results).
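The role of epsilon can be illustrated with the epsilon-insensitive loss that SVR is built around: errors smaller than epsilon cost nothing, and larger errors are penalized linearly. The sketch below is only an illustration; the function name and the sample values are hypothetical and not part of the notebook. The default `epsilon=0.1` matches the scikit-learn default.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Hypothetical targets and predictions, for illustration only.
y_true_demo = [1.0, 1.0, 1.0]
y_pred_demo = [1.05, 1.5, 0.0]

losses = epsilon_insensitive_loss(y_true_demo, y_pred_demo, epsilon=0.1)
print(losses)  # the first error (0.05) falls inside the tube, so its loss is zero
```

A larger epsilon widens the tube, so more training points are ignored and the fitted function tends to be smoother.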

In [6]:
import pandas as pd
base = pd.read_csv(r'house_prices.csv')

In [7]:
X = base.iloc[:, 3:19].values
y = base.iloc[:, 2:3].values
print(X)
print(y)

[[ 3.00000e+00  1.00000e+00  1.18000e+03 ...  9.81780e+04  4.75112e+01
  -1.22257e+02]
 [ 3.00000e+00  2.25000e+00  2.57000e+03 ...  9.81250e+04  4.77210e+01
  -1.22319e+02]
 [ 2.00000e+00  1.00000e+00  7.70000e+02 ...  9.80280e+04  4.77379e+01
  -1.22233e+02]
 ...
 [ 2.00000e+00  7.50000e-01  1.02000e+03 ...  9.81440e+04  4.75944e+01
  -1.22299e+02]
 [ 3.00000e+00  2.50000e+00  1.60000e+03 ...  9.80270e+04  4.75345e+01
  -1.22069e+02]
 [ 2.00000e+00  7.50000e-01  1.02000e+03 ...  9.81440e+04  4.75941e+01
  -1.22299e+02]]
[[221900.]
 [538000.]
 [180000.]
 ...
 [402101.]
 [400000.]
 [325000.]]


##### Standardizing X and y, because the RBF kernel used here is sensitive to the scale of the data

In [8]:
from sklearn.preprocessing import StandardScaler
scaler_x = StandardScaler()
X = scaler_x.fit_transform(X)
scaler_y = StandardScaler()
y = scaler_y.fit_transform(y)
print(X)
print(y)

[[-0.39873715 -1.44746357 -0.97983502 ...  1.87015159 -0.35257175
  -0.30607896]
 [-0.39873715  0.1756067   0.53363434 ...  0.87956753  1.16156823
  -0.74634143]
 [-1.47395936 -1.44746357 -1.42625404 ... -0.9333882   1.28353661
  -0.13565477]
 ...
 [-1.47395936 -1.77207762 -1.15404732 ...  1.23468257  0.24788796
  -0.60432128]
 [-0.39873715  0.50022075 -0.52252773 ... -0.95207846 -0.18441416
   1.02891048]
 [-1.47395936 -1.77207762 -1.15404732 ...  1.23468257  0.24572284
  -0.60432128]]
[[-0.86671733]
 [-0.00568792]
 [-0.98084935]
 ...
 [-0.37586519]
 [-0.38158814]
 [-0.58588173]]


In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
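The cells above fit the scalers on the full dataset before splitting, so statistics from the test rows influence the scaling. One possible alternative, sketched here on synthetic data (the `*_demo` arrays and the other names are assumptions, not the notebook's variables), is to split first and let scikit-learn's `Pipeline` and `TransformedTargetRegressor` fit the scalers on the training rows only.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the house_prices features and target (assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 500_000 + 100_000 * X_demo[:, 0] + rng.normal(scale=10_000, size=200)

# Split first, so the test rows never influence the scaling statistics.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

# The pipeline scales X; TransformedTargetRegressor scales y for fitting
# and inverse-transforms predictions back to the original units.
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel='rbf')),
    transformer=StandardScaler(),
)
model.fit(X_tr, y_tr)

pred_demo = model.predict(X_te)   # already in the original units
print(model.score(X_te, y_te))    # R^2 on the held-out rows
```

With this arrangement there is no separate inverse-transform step to remember, since `predict` already returns values in the target's original units.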

In [10]:
# Train the model
from sklearn.svm import SVR
regressor = SVR(kernel='rbf')
# SVR expects a 1-D target; ravel() avoids the column-vector warning
regressor.fit(X_train, y_train.ravel())


SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='scale',
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

In [12]:
# SVR score (R^2) on the House_prices training set with RBF
score = regressor.score(X_train, y_train)
print(score)

0.8123465093269254


In [13]:
# SVR score (R^2) on the House_prices test set with RBF
print(regressor.score(X_test, y_test))

0.7375903031882551
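For a scikit-learn regressor, `score` returns the coefficient of determination R², that is, one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand on a few made-up numbers; the function name and the values are hypothetical.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets and predictions, for illustration only.
y_true_r2 = [1.0, 2.0, 3.0, 4.0]
y_pred_r2 = [1.1, 1.9, 3.2, 3.8]

r2 = r2_score_manual(y_true_r2, y_pred_r2)
print(r2)
```

A value of 1.0 means perfect predictions, while 0.0 means the model does no better than always predicting the mean of the targets. The gap between the training and test scores above gives a rough indication of overfitting.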


##### Getting predictions

In [14]:
previsoes = regressor.predict(X_test)
print(previsoes)

[-0.54140132  3.44956474  0.02550451 ... -0.46136154 -0.90562137
 -1.03010152]


In [16]:
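# Undo the scaling of y_test and of the predictions.
# Run this cell only once: it overwrites y_test, so re-running it rescales again.
y_test = scaler_y.inverse_transform(y_test)
# inverse_transform expects a 2-D array, so reshape the 1-D predictions first
previsoes = scaler_y.inverse_transform(previsoes.reshape(-1, 1)).ravel()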
print(previsoes)

[ 341329.58944883 1806487.87659637  549451.32286862 ...  370713.69062882
  207617.59706888  161918.60662813]


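##### Computing the MAE (Mean Absolute Error)
##### Note: the value printed below (about 1.98e11) is orders of magnitude larger than the prices shown above, so it is probably not a valid MAE; it is consistent with y_test having been inverse-transformed more than once, for example by re-running the previous cell.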

In [17]:
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_test, previsoes)
print(mae)

197832267794.83667
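One way an MAE can end up orders of magnitude above the price range is when the targets are inverse-transformed more than once while the predictions are restored only once. The sketch below reproduces that effect on a few hypothetical prices; all names and values are illustrative assumptions, not the notebook's data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import StandardScaler

# Hypothetical prices, used only for illustration.
prices = np.array([[200_000.0], [400_000.0], [600_000.0], [800_000.0]])
scaler_demo = StandardScaler()
scaled = scaler_demo.fit_transform(prices)

# Pretend the model predicts each scaled target with a small constant error.
scaled_pred = scaled.ravel() + 0.1

# Write the restored values to new names so re-running the code is harmless.
prices_restored = scaler_demo.inverse_transform(scaled)
pred_restored = scaler_demo.inverse_transform(scaled_pred.reshape(-1, 1))
mae_ok = mean_absolute_error(prices_restored, pred_restored)

# Applying inverse_transform a second time to the targets inflates the error.
prices_twice = scaler_demo.inverse_transform(prices_restored)
mae_bad = mean_absolute_error(prices_twice, pred_restored)

print(mae_ok, mae_bad)
```

Here `mae_ok` equals 0.1 times the standard deviation of the prices, because MAE scales linearly with the target, whereas `mae_bad` is dominated by the doubly rescaled targets. Storing restored values under new names, rather than overwriting `y_test`, keeps the cell safe to re-run.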
