# Polynomial Linear Regression 
Polynomial linear regression is a regression technique in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x.
It extends the simple linear regression model by adding polynomial terms (the feature raised to successive powers):
$$y = \beta_0 + \beta_1x + \beta_2x^2 + \beta_3x^3 + \dots + \beta_nx^n + \epsilon$$

    y :: Dependent variable (target)
    x :: Independent variable (feature)
    β₀, β₁, ..., βₙ :: Model parameters (coefficients)
    n :: The degree of the polynomial
    ε :: Random error
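
As a minimal sketch of the equation above (not part of the original notebook), the snippet below builds the columns 1, x, x², x³ explicitly and solves for the coefficients with ordinary least squares. The cubic coefficients in `true_beta` and the grid `x` are made up for illustration, and the data is noiseless (ε = 0).

```python
import numpy as np

# Hypothetical coefficients beta_0 .. beta_3, chosen only for illustration.
true_beta = np.array([1.0, -2.0, 0.5, 0.25])
x = np.linspace(-3, 3, 50)

# Design matrix with columns [1, x, x^2, x^3].
X_design = np.vander(x, N=4, increasing=True)

# Noiseless target: y = beta_0 + beta_1 x + beta_2 x^2 + beta_3 x^3.
y = X_design @ true_beta

# Ordinary least squares on the expanded columns recovers the coefficients.
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(np.round(beta_hat, 6))
```

With noise added to `y`, the estimates would only approximate `true_beta`.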

## Why it is linear
Although the model fits a curve to the data points, it is still classified as linear. Here "linear" refers to the unknown parameters (the β coefficients), not to the input variable: the equation is a linear combination of the parameters, so it can be fit with ordinary least squares just like a straight-line model.

-> Application: It is used when the data shows a curvilinear relationship that a straight-line model underfits. By expanding the original feature into polynomial features (e.g., x, x^2, x^3), the linear model can fit curves.

-> The Trade-off (Degree Selection): Degree 1 (n=1) is standard linear regression (a straight line). Higher degrees can follow more curvature but also fit noise more easily, so the degree is usually chosen by comparing error on held-out data.
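
The degree trade-off can be illustrated with a small sketch on synthetic data. Everything here is an assumption for illustration: the quadratic ground truth, the noise level, and the candidate degrees 1, 2 and 10. It compares training and held-out error for each degree.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic quadratic data with Gaussian noise (illustrative only).
rng = np.random.default_rng(0)
x_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = 0.5 * x_demo[:, 0] ** 2 + x_demo[:, 0] + rng.normal(0, 0.5, size=200)

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.25, random_state=0
)

# Map each degree to (train MSE, test MSE).
results = {}
for degree in (1, 2, 10):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(x_tr, y_tr)
    results[degree] = (
        mean_squared_error(y_tr, model.predict(x_tr)),
        mean_squared_error(y_te, model.predict(x_te)),
    )
    print(degree, results[degree])
```

Training error can only stay the same or fall as the degree grows, because each higher-degree model contains the lower-degree ones. Held-out error has no such guarantee, which is why it is the more useful number for choosing the degree.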

### Import libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error,mean_absolute_error

### Load dataset

In [4]:
pd.set_option("display.max_columns",None)
data_1 = pd.read_csv("bangalore house price prediction OHE-data.csv")
data_1.sample(5)

Unnamed: 0,bath,balcony,price,total_sqft_int,bhk,price_per_sqft,area_typeSuper built-up Area,area_typeBuilt-up Area,area_typePlot Area,availability_Ready To Move,location_Whitefield,location_Sarjapur Road,location_Electronic City,location_Marathahalli,location_Raja Rajeshwari Nagar,location_Haralur Road,location_Hennur Road,location_Bannerghatta Road,location_Uttarahalli,location_Thanisandra,location_Electronic City Phase II,location_Hebbal,location_7th Phase JP Nagar,location_Yelahanka,location_Kanakpura Road,location_KR Puram,location_Sarjapur,location_Rajaji Nagar,location_Kasavanhalli,location_Bellandur,location_Begur Road,location_Banashankari,location_Kothanur,location_Hormavu,location_Harlur,location_Akshaya Nagar,location_Jakkur,location_Electronics City Phase 1,location_Varthur,location_Chandapura,location_HSR Layout,location_Hennur,location_Ramamurthy Nagar,location_Ramagondanahalli,location_Kaggadasapura,location_Kundalahalli,location_Koramangala,location_Hulimavu,location_Budigere,location_Hoodi,location_Malleshwaram,location_Hegde Nagar,location_8th Phase JP Nagar,location_Gottigere,location_JP Nagar,location_Yeshwanthpur,location_Channasandra,location_Bisuvanahalli,location_Vittasandra,location_Indira Nagar,location_Vijayanagar,location_Kengeri,location_Brookefield,location_Sahakara Nagar,location_Hosa Road,location_Old Airport Road,location_Bommasandra,location_Balagere,location_Green Glen Layout,location_Old Madras Road,location_Rachenahalli,location_Panathur,location_Kudlu Gate,location_Thigalarapalya,location_Ambedkar Nagar,location_Jigani,location_Yelahanka New Town,location_Talaghattapura,location_Mysore Road,location_Kadugodi,location_Frazer Town,location_Dodda Nekkundi,location_Devanahalli,location_Kanakapura,location_Attibele,location_Anekal,location_Lakshminarayana Pura,location_Nagarbhavi,location_Ananth Nagar,location_5th Phase JP Nagar,location_TC Palaya,location_CV Raman Nagar,location_Kengeri Satellite Town,location_Kudlu,location_Jalahalli,location_Subramanyapura,location_Bhoganhalli,location_Doddathoguru,location_Kalena Agrahara,location_Horamavu Agara,location_Vidyaranyapura,location_BTM 2nd Stage,location_Hebbal Kempapura,location_Hosur Road,location_Horamavu Banaswadi,location_Domlur,location_Mahadevpura,location_Tumkur Road
2414,3.0,2.0,45.0,1400.0,3,3214.285714,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
85,2.0,0.0,190.0,1567.2,3,12123.532414,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5614,4.0,3.0,253.0,2121.0,3,11928.335691,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
996,2.0,2.0,52.0,1294.0,3,4018.547141,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1285,2.0,1.0,45.0,910.0,2,4945.054945,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Split X & y into train and test set

In [7]:
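# Note: in the sample rows above, price ≈ price_per_sqft * total_sqft_int / 1e5,
# so keeping price_per_sqft in X leaks the target; read the scores below with that in mind.
X = data_1.drop(['price'], axis=1)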
y = data_1[["price"]]
X_train,X_test, y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)

### Scaling data via StandardScaler()

In [10]:
sc = StandardScaler()
sc.fit(X_train)
X_train = sc.transform(X_train)
X_test = sc.transform(X_test)

### Polynomial Features 

In [13]:
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree=2)
poly_reg.fit(X_train)
X_train_poly = poly_reg.transform(X_train)
X_test_poly = poly_reg.transform(X_test)

In [15]:
X_train_poly.shape, X_test_poly.shape

((5696, 5886), (1424, 5886))
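
The width of 5886 follows from the number of monomials of total degree at most d in p variables, which is C(p + d, d). The sketch below checks that formula against `PolynomialFeatures` on a small dummy array and then evaluates it for 107 input columns. The figure 107 is an assumption, since the shape of `X` is not printed above. `n_poly_features` is a helper name introduced here for illustration.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures


def n_poly_features(n_features: int, degree: int) -> int:
    """Column count of PolynomialFeatures with the default include_bias=True."""
    return comb(n_features + degree, degree)


# Sanity check against scikit-learn on a dummy array with 5 columns.
demo = np.zeros((1, 5))
n_out = PolynomialFeatures(degree=2).fit(demo).n_output_features_
print(n_out, n_poly_features(5, 2))

# Assumed width of X in this notebook: 107 feature columns.
print(n_poly_features(107, 2))  # 5886
```

The same formula gives a rough sense of how quickly the width grows: with 107 inputs, degree 3 would already produce more than 200,000 columns.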

### Model - Linear Regression 

In [18]:
lr = LinearRegression()
lr.fit(X_train_poly, y_train)
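
The scale, expand and fit steps above could also be chained in a scikit-learn `Pipeline`, so that every transformer is fit on the training split only. This is a sketch of one possible arrangement, run on synthetic data; the arrays `X_demo` and `y_demo` and the step names are assumptions, not taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic target that is exactly quadratic in the inputs (illustrative only).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(300, 3))
y_demo = 2.0 + X_demo[:, 0] * X_demo[:, 1] + X_demo[:, 2] ** 2

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Same order as the notebook: scale, then expand, then fit.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2)),
    ("lr", LinearRegression()),
])
pipe.fit(X_tr, y_tr)

r2 = pipe.score(X_te, y_te)
print(r2)
```

Calling `pipe.predict(X_te)` then applies the fitted scaler and polynomial expansion automatically, which removes the need to keep separate `X_train_poly` and `X_test_poly` arrays in sync.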

### Prediction

In [19]:
y_pred = lr.predict(X_test_poly)
print("Prediction for X_test_poly ::",y_pred)

Prediction for X_test_poly :: [[47.        ]
 [60.        ]
 [65.        ]
 ...
 [56.5       ]
 [85.61032866]
 [49.        ]]


### Score for Linear regression 

In [21]:
# For regressors, score() returns the coefficient of determination (R²)
lr.score(X_test_poly, y_test)

0.9949473112369437
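
For reference, the value returned by `score` is R² = 1 − SS_res / SS_tot. The sketch below computes it by hand and compares it with `sklearn.metrics.r2_score`. The arrays `y_true` and `y_hat` are made-up numbers for illustration, not results from this notebook.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical targets and predictions, for illustration only.
y_true = np.array([45.0, 190.0, 253.0, 52.0, 45.0])
y_hat = np.array([47.0, 185.0, 250.0, 56.5, 49.0])

# Residual sum of squares and total sum of squares around the mean.
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1.0 - ss_res / ss_tot
print(r2_manual, r2_score(y_true, y_hat))
```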

### Linear regression Evaluation Metrics

In [22]:
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test,y_pred)
print("Mean Squared Error [MSE] ::",mse)
print("Mean Absolute Error [MAE] ::",mae)
print("Root Mean Squared Error [RMSE] ::",rmse)

Mean Squared Error [MSE] :: 57.363931674874415
Mean Absolute Error [MAE] :: 0.7027788800287276
Root Mean Squared Error [RMSE] :: 7.57389805020337
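
The RMSE above is roughly ten times the MAE. One possible reading, which this notebook does not verify, is that a small number of test rows carry very large errors: squaring weights large residuals much more heavily than taking absolute values. The sketch below shows the effect with a made-up error vector.

```python
import numpy as np

# 100 hypothetical residuals: 99 small ones and a single large outlier.
errors = np.full(100, 0.5)
errors[0] = 75.0

mae_demo = np.mean(np.abs(errors))
rmse_demo = np.sqrt(np.mean(errors ** 2))
print("MAE ::", mae_demo)
print("RMSE ::", rmse_demo)
```

Inspecting the largest values of `np.abs(y_test - y_pred)` in the notebook would be one way to check whether this explanation applies to the actual predictions.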
