# Support Vector Machine [ Regression ]
     - Support Vector Machine [ SVM ] is a supervised machine learning algorithm used for classification and regression tasks. It aims to find the optimal boundary [ hyperplane ] that best separates data points of different classes. In a 2D space, SVM finds a line that divides the data points into two classes with the maximum margin between them.
    
     - Support Vector Regression [ SVR ] is the regression counterpart of SVM. Instead of classifying data, it tries to fit a line [ or hyperplane ] within a margin of tolerance (ε) such that most points fall within this boundary. Minimize the prediction error while keeping the model as simple as possible.
    
     - Hyperplane => A hyperplane is the decision boundary that separates data points belonging to different classes in SVM.
    
     - The margin is the distance between the hyperplane and the closest data points from each class. SVM tries to maximize this margin, ensuring better generalization on unseen data. The data points that lie on the edges of this margin are called Support Vectors.
       Larger margin → better classifier.
    
     - Process 
      1. Input Data
      2. Plot Data Points
      3. Find the Best Hyperplane
      4. Use Support Vectors
      5. Classify New Data

### Import Libraries

In [14]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score,mean_squared_error
from sklearn.svm import SVR 
from sklearn.preprocessing import StandardScaler

### Load dataset

In [16]:
pd.set_option('display.max_columns', None)
data = pd.read_csv("bangalore house price prediction OHE-data.csv")
data.head(5)

Unnamed: 0,bath,balcony,price,total_sqft_int,bhk,price_per_sqft,area_typeSuper built-up Area,area_typeBuilt-up Area,area_typePlot Area,availability_Ready To Move,location_Whitefield,location_Sarjapur Road,location_Electronic City,location_Marathahalli,location_Raja Rajeshwari Nagar,location_Haralur Road,location_Hennur Road,location_Bannerghatta Road,location_Uttarahalli,location_Thanisandra,location_Electronic City Phase II,location_Hebbal,location_7th Phase JP Nagar,location_Yelahanka,location_Kanakpura Road,location_KR Puram,location_Sarjapur,location_Rajaji Nagar,location_Kasavanhalli,location_Bellandur,location_Begur Road,location_Banashankari,location_Kothanur,location_Hormavu,location_Harlur,location_Akshaya Nagar,location_Jakkur,location_Electronics City Phase 1,location_Varthur,location_Chandapura,location_HSR Layout,location_Hennur,location_Ramamurthy Nagar,location_Ramagondanahalli,location_Kaggadasapura,location_Kundalahalli,location_Koramangala,location_Hulimavu,location_Budigere,location_Hoodi,location_Malleshwaram,location_Hegde Nagar,location_8th Phase JP Nagar,location_Gottigere,location_JP Nagar,location_Yeshwanthpur,location_Channasandra,location_Bisuvanahalli,location_Vittasandra,location_Indira Nagar,location_Vijayanagar,location_Kengeri,location_Brookefield,location_Sahakara Nagar,location_Hosa Road,location_Old Airport Road,location_Bommasandra,location_Balagere,location_Green Glen Layout,location_Old Madras Road,location_Rachenahalli,location_Panathur,location_Kudlu Gate,location_Thigalarapalya,location_Ambedkar Nagar,location_Jigani,location_Yelahanka New Town,location_Talaghattapura,location_Mysore Road,location_Kadugodi,location_Frazer Town,location_Dodda Nekkundi,location_Devanahalli,location_Kanakapura,location_Attibele,location_Anekal,location_Lakshminarayana Pura,location_Nagarbhavi,location_Ananth Nagar,location_5th Phase JP Nagar,location_TC Palaya,location_CV Raman Nagar,location_Kengeri Satellite Town,location_Kudlu,location_Jalahalli,location_Subramanyapura,location_Bhoganhalli,location_Doddathoguru,location_Kalena Agrahara,location_Horamavu Agara,location_Vidyaranyapura,location_BTM 2nd Stage,location_Hebbal Kempapura,location_Hosur Road,location_Horamavu Banaswadi,location_Domlur,location_Mahadevpura,location_Tumkur Road
0,3.0,2.0,150.0,1672.0,3,8971.291866,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,3.0,3.0,149.0,1750.0,3,8514.285714,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,3.0,2.0,150.0,1750.0,3,8571.428571,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,2.0,2.0,40.0,1250.0,2,3200.0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,2.0,2.0,83.0,1200.0,2,6916.666667,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Check For Null Values 

In [18]:
data.isnull().sum()

bath                           0
balcony                        0
price                          0
total_sqft_int                 0
bhk                            0
                              ..
location_Hosur Road            0
location_Horamavu Banaswadi    0
location_Domlur                0
location_Mahadevpura           0
location_Tumkur Road           0
Length: 108, dtype: int64

### All column 

In [20]:
data.columns

Index(['bath', 'balcony', 'price', 'total_sqft_int', 'bhk', 'price_per_sqft',
       'area_typeSuper built-up  Area', 'area_typeBuilt-up  Area',
       'area_typePlot  Area', 'availability_Ready To Move',
       ...
       'location_Kalena Agrahara', 'location_Horamavu Agara',
       'location_Vidyaranyapura', 'location_BTM 2nd Stage',
       'location_Hebbal Kempapura', 'location_Hosur Road',
       'location_Horamavu Banaswadi', 'location_Domlur',
       'location_Mahadevpura', 'location_Tumkur Road'],
      dtype='object', length=108)

### Splitting X & y Into Train And Test Data 

In [22]:
X = data.drop(columns=['price'],axis = 1)
y = data['price']
X_train,X_test, y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)

### Use StandardScaler () for scaling data 

In [24]:
scale = StandardScaler()
scale.fit(X_train)

In [25]:
X_train = scale.fit_transform(X_train)
X_test = scale.fit_transform(X_test)

### Support Vector Machine Regression [ SVR ] With rbf Kernal 
    -> A kernel is a mathematical function that allows SVM or SVR to work in a higher-dimensional space without explicitly transforming the data
    -> The RBF kernel (also known as the Gaussian kernel) is the most popular and widely used kernel in SVR. It maps the data into an infinite-dimensional space. It assumes that points closer to each other in the feature space should have similar target values.
     

In [27]:
support_vector_regre_rbf = SVR(kernel='rbf')
support_vector_regre_rbf.fit(X_train,y_train)

In [28]:
y_pred = support_vector_regre_rbf.predict(X_test)
print("SVR With rbf Kernel Score ::",support_vector_regre_rbf.score(X_test,y_test))
print("Mean Square Error ::",mean_squared_error(y_test,y_pred))
print("Root Mean Squared Error ::",np.sqrt(mean_squared_error(y_test,y_pred)))

SVR With rbf Kernel Score :: 0.37888542829400473
Mean Square Error :: 7051.606683974654
Root Mean Squared Error :: 83.97384523751818


### SVR With Ploy Kernel 
    The Polynomial kernel represents the similarity of vectors in a feature space over polynomials of the original variables. It allows for curved regression lines by considering not only the given features but also their combinations.

In [30]:
support_vector_regre_poly = SVR(kernel='poly',degree=1)
support_vector_regre_poly.fit(X_train,y_train)

In [31]:
y_pred1 = support_vector_regre_poly.predict(X_test)
print("SVR With poly Kernel Score ::",support_vector_regre_poly.score(X_test,y_test))
print("Mean Square Error ::",mean_squared_error(y_test,y_pred1))
print("Root Mean Squared Error ::",np.sqrt(mean_squared_error(y_test,y_pred1)))

SVR With poly Kernel Score :: 0.5614388859959334
Mean Square Error :: 4979.049959089208
Root Mean Squared Error :: 70.56238345669176


### SVR With Linear Kernel 
    The Linear kernel is the simplest form of kernel. It is used when the relationship between the features and the target variable is already linear. It does not map the data to a higher dimension but instead works in the original feature space.

In [33]:
support_vector_regre_lin = SVR(kernel='linear')
support_vector_regre_lin.fit(X_train,y_train)

In [34]:
y_pred2 = support_vector_regre_lin.predict(X_test)
print("SVR With Linear Kernel Score ::",support_vector_regre_lin.score(X_test,y_test))
print("Mean Square Error ::",mean_squared_error(y_test,y_pred2))
print("Root Mean Squared Error ::",np.sqrt(mean_squared_error(y_test,y_pred2)))

SVR With Linear Kernel Score :: 0.8701179097686287
Mean Square Error :: 1474.5708075863142
Root Mean Squared Error :: 38.40014072352228


### Check The Sample Prediction 

In [36]:
y_test[0:5]   # Original Data 

960      47.0
132      60.0
2431     65.0
2229    325.0
4503     60.0
Name: price, dtype: float64

In [37]:
y_pred2[0:5]   # Pedicted Data

array([ 36.839895  ,  56.26864245,  62.09013935, 289.64805458,
        73.16531516])