# Section III b - Support Vector Regressions (SVR)

We will use a Support Vector regression to check for non-linear models. Before we can use our Support Vector Regression we have to make sure our features are all scaled. Some regression classes include a feature scaling algorithm, like the multiple regression above, but our Support Vector Regression does not. We had to fit transform both the feature and response variables before we produce our predictions.   For our Support Vector Regression, we chose a Gaussian(rbf) kernel. 
(image source:) https://en.wikipedia.org/wiki/Support_vector_machine#/media/File:Svr_epsilons_demo.svg
![](./images/a2.jpg "")
Pros: SVR works well on non-linear problems

Cons: You have to use feature scaling so the results are more difficult to understand 

* **Principal Component Analysis** — Before feeding into the SVR, we will do a PCA; varying the number of components from N=10 to N=50.

### Support Vector Regression with PCA (n=10)

In [1]:
import numpy as np   #Mathematics library
import matplotlib.pyplot as plt # for plotting
import pandas as pd  #manage datasets
import seaborn as sea


df = pd.read_csv('FinishMissing.csv')
df=df.drop('Unnamed: 0',axis=1)


In [2]:
#Drop outliers before splitting ex and y
avg = df['logerror'].mean()
std = df['logerror'].std()
upper_outlier = avg + 2*std
lower_outlier = avg - 2*std
#round up to drop outliers, til reasonable
df=df[ df.logerror > -0.32 ]
df=df[ df.logerror < 0.34 ]
df.to_csv('OutlierRemoved.csv')  #big file


In [3]:
###############Create Dummy variables for Categorical data
df=pd.get_dummies(df,columns=['taxdelinquencyflag','fireplaceflag','propertyzoningdesc','propertycountylandusecode','hashottuborspa','airconditioningtypeid','architecturalstyletypeid','buildingqualitytypeid','buildingclasstypeid','decktypeid','fips','heatingorsystemtypeid','pooltypeid10','pooltypeid2','pooltypeid7','propertylandusetypeid','regionidcounty','regionidcity','regionidzip','regionidneighborhood','storytypeid','typeconstructiontypeid','month','day'],drop_first=True)

dataset=df

In [4]:
#split into response and features, skip to next for testing
X = dataset.iloc[:, 2:].values
y = dataset.iloc[:, 1].values

In [5]:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)



In [6]:
###################### Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)


In [7]:
##### DIMENSIONALITY REDUCTION : PRINCIPAL COMPONENT ANALYSIS(PCA)
# Applying PCA * requires feature scaling
from sklearn.decomposition import PCA
pca = PCA(n_components = 10) # number of principal components explain variance, use '0' first
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
X = np.concatenate((X_train,X_test),axis=0)

In [8]:
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
regressor.fit(X_train, y_train)

SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
  kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

In [9]:
y_pred = regressor.predict(X_test)

In [10]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, y_pred, sample_weight=None, multioutput='uniform_average')
print(mse)
rmse = np.sqrt(mse)
print(rmse)

0.997378173431
0.998688226341


In [11]:

from sklearn.cross_validation import cross_val_score
scores = cross_val_score(regressor, X, y, cv=4, scoring='neg_mean_squared_error')
mse_scores = -scores
# calculate the average MSE
print(mse_scores.mean())
rmse_kfold = np.sqrt(mse_scores.mean())
print(rmse_kfold)




1.01186707294
1.00591603673


### Support Vector Regression with PCA (n=20)

In [12]:
dataset=df

#split into response and features, skip to next for testing
X = dataset.iloc[:, 2:].values
y = dataset.iloc[:, 1].values

In [13]:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)




In [14]:
###################### Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)


In [15]:
##### DIMENSIONALITY REDUCTION : PRINCIPAL COMPONENT ANALYSIS(PCA)
# Applying PCA * requires feature scaling
from sklearn.decomposition import PCA
pca = PCA(n_components = 20) # number of principal components explain variance, use '0' first
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
X = np.concatenate((X_train,X_test),axis=0)

In [16]:
regressor = SVR(kernel = 'rbf')
regressor.fit(X_train, y_train)

SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
  kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

In [17]:
y_pred = regressor.predict(X_test)

In [18]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, y_pred, sample_weight=None, multioutput='uniform_average')
print(mse)
rmse = np.sqrt(mse)
print(rmse)

0.99377121505
0.996880742642


In [19]:
from sklearn.cross_validation import cross_val_score
scores = cross_val_score(regressor, X, y, cv=4, scoring='neg_mean_squared_error')
mse_scores = -scores
# calculate the average MSE
print(mse_scores.mean())
rmse_kfold = np.sqrt(mse_scores.mean())
print(rmse_kfold)

1.01401227349
1.00698176423


### Support Vector Regression with PCA (n=50)

In [20]:
dataset=df

#split into response and features, skip to next for testing
X = dataset.iloc[:, 2:].values
y = dataset.iloc[:, 1].values

In [21]:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)



In [22]:
###################### Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)



In [23]:
##### DIMENSIONALITY REDUCTION : PRINCIPAL COMPONENT ANALYSIS(PCA)
# Applying PCA * requires feature scaling
from sklearn.decomposition import PCA
pca = PCA(n_components = 50) # number of principal components explain variance, use '0' first
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
X = np.concatenate((X_train,X_test),axis=0)

In [24]:

regressor = SVR(kernel = 'rbf')
regressor.fit(X_train, y_train)

SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
  kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

In [25]:
y_pred = regressor.predict(X_test)

In [26]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, y_pred, sample_weight=None, multioutput='uniform_average')
print(mse)
rmse = np.sqrt(mse)
print(rmse)

0.993735736236
0.996862947569


In [27]:
from sklearn.cross_validation import cross_val_score
scores = cross_val_score(regressor, X, y, cv=4, scoring='neg_mean_squared_error')
mse_scores = -scores
# calculate the average MSE
print(mse_scores.mean())
rmse_kfold = np.sqrt(mse_scores.mean())
print(rmse_kfold)

1.01259737139
1.00627897295
