##### Please upvote if you like the work!!!

##### Cells that need heavy computation (k-fold cross-validation) have been commented out; their recorded results are shown right after each such cell.

# Permanent Magnet Synchronous Motor

![alt text](https://alliedmarketresearch.files.wordpress.com/2017/02/permanent-magnet-synchronous-motor-pmsm.png?w=705)

The permanent-magnet synchronous machine (PMSM) drive is one of the best choices for a full range of motion control applications. For example, the PMSM is widely used in robotics, machine tools and actuators, and it is being considered for high-power applications such as industrial drives and vehicular propulsion. It is also used in residential and commercial applications. The PMSM is known for its low torque ripple, superior dynamic performance, high efficiency and high power density.

In [None]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [None]:
df = pd.read_csv('../input/electric-motor-temperature/pmsm_temperature_data.csv')

In [None]:
df_test = df[(df['profile_id'] == 65) | (df['profile_id'] == 72)]
df = df[(df['profile_id'] != 65) & (df['profile_id'] != 72)]

In [None]:
df

In [None]:
df.describe()

In [None]:
df.isnull().sum()

There are no missing values in the dataset.

In [None]:
plt.figure(figsize=(15,6))
df['profile_id'].value_counts().sort_values().plot(kind = 'bar')

As we can see, session ids 66, 6 and 20 have the largest number of recorded measurements.

In [None]:
for i in df.columns:
    sns.distplot(df[i],color='g')
    sns.boxplot(x=df[i],color = 'y')  # keyword x keeps the box horizontal across seaborn versions
    plt.vlines(df[i].mean(),ymin = -1,ymax = 1,color = 'r')
    plt.show()

As we can see from the above plots, the mean and median are very close to each other for most variables, so the data seems to have low skewness for almost all of them.

### Checking skewness and kurtosis numerically

In [None]:
import scipy.stats as stats
for i in df.columns:
    print(i,' :\nSkew : ',df[i].skew(),' : \nKurtosis : ',df[i].kurt())
    print()

The data is not highly skewed and, judging by the range of the values, it seems that some normalization has already been applied to the dataset.
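To make the skewness and kurtosis figures easier to read, here is a small sketch on synthetic data (not the motor dataset). A normal sample should give values near 0 for both statistics, while an exponential sample is strongly right-skewed. The sample size, seed and column names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "symmetric": rng.normal(size=20_000),
    "right_skewed": rng.exponential(size=20_000),
})

# pandas reports sample skewness and excess kurtosis
# (both are 0 for a normal distribution)
summary = pd.DataFrame({"skew": demo.skew(), "kurtosis": demo.kurt()})
print(summary.round(2))
```

Values close to 0 in both columns, as seen for most features here, are consistent with roughly symmetric, normal-like distributions.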

In [None]:
plt.figure(figsize=(14,7))
sns.heatmap(df.corr(),annot=True)

From the heatmap above, we can see that torque and the q-component of current are almost perfectly correlated. There is also a very high correlation between the temperature measurements of the stator yoke, stator tooth and stator winding.
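Reading strongly correlated pairs off a dense heatmap can be error-prone, so a short helper can list them instead. This is a sketch on synthetic data; the function name `top_correlated_pairs` and the 0.9 threshold are assumptions made for illustration, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def top_correlated_pairs(frame, threshold=0.9):
    """Return column pairs whose absolute Pearson correlation exceeds threshold."""
    corr = frame.corr().abs()
    # keep only the upper triangle so each pair appears once
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    return pairs[pairs > threshold].sort_values(ascending=False)

rng = np.random.default_rng(0)
a = rng.normal(size=1000)
demo = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.05, size=1000),  # nearly a copy of "a"
    "c": rng.normal(size=1000),                  # independent of both
})
pairs = top_correlated_pairs(demo)
print(pairs)
```

On the motor data, the same call would be `top_correlated_pairs(df)`.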

For a single measurement session (profile 20 here), we can compare the temperatures of the three stator components.

In [None]:
plt.figure(figsize=(20,5))
df[df['profile_id'] == 20]['stator_yoke'].plot(label = 'stator yoke')
df[df['profile_id'] == 20]['stator_tooth'].plot(label = 'stator tooth')
df[df['profile_id'] == 20]['stator_winding'].plot(label = 'stator winding')
plt.legend()

As we can see from the plot, all three stator components follow a similar pattern of variation.

The dataset author states that the records within each profile id are sorted by time, so we can treat these recordings as a time series.

From this we can infer that the motor was not given much time to cool down between recordings: initially the stator yoke temperature is lower than the stator winding temperature, but as time progresses the stator yoke temperature rises above that of the stator winding.

As profile_id is just an id for each measurement session, we can remove it from any further analysis and model building.

In [None]:
# reassign instead of dropping in place: both frames are filtered slices of the original data
df = df.drop('profile_id',axis = 1)
df_test = df_test.drop('profile_id',axis = 1)

# Statistical Analysis of Variables
We'll see which particular variables contribute to the rotor temperature individually by checking their statistical significance.

### Ambient Temperature

In [None]:
sns.distplot(df['ambient'])

In [None]:
from scipy.stats import shapiro
shapiro(df['ambient'])

In [None]:
shapiro(df['pm'])

H0 : variance_ambient = variance_pm

H1 : variance_ambient != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['ambient'],df['pm'])

pvalue is less than 0.05. So we reject the null hypothesis and can say that variance for ambient temperature is not equal to the variance of rotor temperature.
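As a reference for how this test behaves, the sketch below runs Bartlett's test on synthetic normal samples with equal and with unequal spread. It also runs `scipy.stats.levene`, which is less sensitive to non-normal data. The sample sizes and scales are arbitrary assumptions. The test compares variances only; it does not measure how strongly two variables are related.

```python
import numpy as np
from scipy.stats import bartlett, levene

rng = np.random.default_rng(0)
same_a = rng.normal(scale=1.0, size=5000)
same_b = rng.normal(scale=1.0, size=5000)
wider = rng.normal(scale=2.0, size=5000)

# H0: both samples have equal variance
_, p_equal = bartlett(same_a, same_b)
_, p_diff = bartlett(same_a, wider)
# Levene's test is less sensitive to departures from normality
_, p_diff_levene = levene(same_a, wider)

print(f"equal variances:     p = {p_equal:.3f}")
print(f"different variances: p = {p_diff:.3g} (Bartlett), {p_diff_levene:.3g} (Levene)")
```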

### Coolant Temperature

In [None]:
sns.distplot(df['coolant'])

In [None]:
from scipy.stats import shapiro
shapiro(df['coolant'])

In [None]:
shapiro(df['pm'])

H0 : variance_coolant = variance_pm

H1 : variance_coolant != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['coolant'],df['pm'])

pvalue is less than 0.05. So we reject the null hypothesis and can say that variance for coolant temperature is not equal to the variance of rotor temperature.

### Voltage d-component

In [None]:
sns.distplot(df['u_d'])

In [None]:
from scipy.stats import shapiro
shapiro(df['u_d'])

In [None]:
shapiro(df['pm'])

H0 : variance_u_d = variance_pm

H1 : variance_u_d != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['u_d'],df['pm'])

pvalue is less than 0.05. So we reject the null hypothesis and can say that variance for voltage d-component is not equal to the variance of rotor temperature.

### Voltage q-component

In [None]:
sns.distplot(df['u_q'])

In [None]:
from scipy.stats import shapiro
shapiro(df['u_q'])

In [None]:
shapiro(df['pm'])

H0 : variance_u_q = variance_pm

H1 : variance_u_q != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['u_q'],df['pm'])

pvalue is less than 0.05. So we reject the null hypothesis and can say that variance for voltage q-component is not equal to the variance of rotor temperature.

### Motor speed

In [None]:
sns.distplot(df['motor_speed'])

In [None]:
from scipy.stats import shapiro
shapiro(df['motor_speed'])

In [None]:
shapiro(df['pm'])

H0 : variance_motor_speed = variance_pm

H1 : variance_motor_speed != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['motor_speed'],df['pm'])

pvalue is less than 0.05. So we reject the null hypothesis and can say that variance of motor speed is not equal to the variance of rotor temperature.

### Current d-component

In [None]:
sns.distplot(df['i_d'])

In [None]:
from scipy.stats import shapiro
shapiro(df['i_d'])

In [None]:
shapiro(df['pm'])

H0 : variance_i_d = variance_pm

H1 : variance_i_d != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['i_d'],df['pm'])

The p-value is higher than 0.05, so we fail to reject the null hypothesis: we do not have enough evidence to say that the variance of the d-component of current differs from the variance of the rotor temperature.

### Current q-component

In [None]:
sns.distplot(df['i_q'])

In [None]:
from scipy.stats import shapiro
shapiro(df['i_q'])

In [None]:
shapiro(df['pm'])

H0 : variance_i_q = variance_pm

H1 : variance_i_q != variance_pm

In [None]:
from scipy.stats import bartlett
bartlett(df['i_q'],df['pm'])

The p-value is higher than 0.05, so we fail to reject the null hypothesis: we do not have enough evidence to say that the variance of the q-component of current differs from the variance of the rotor temperature.

### Shuffling the data

In [None]:
df = df.sample(frac=1,random_state=3)

In [None]:
df.head()

The data description does not give any information on the units of measurement, so it is difficult to interpret the measured values.

# EDA

In [None]:
sns.scatterplot(x=df['ambient'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['coolant'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['motor_speed'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['u_q'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['u_d'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['i_q'],y=df['pm'])

In [None]:
sns.scatterplot(x=df['i_d'],y=df['pm'])

## Basic multivariate regression (Base Model)

As we want to predict the temperatures of the stator components and the rotor (pm), we drop these columns from the feature set used for regression. Torque is also omitted, because it is not reliably measurable in field applications.

In [None]:
from sklearn.preprocessing import MinMaxScaler
X = df.drop(['pm','stator_yoke','stator_tooth','stator_winding','torque'],axis = 1)
X_df_test = df_test.drop(['pm','stator_yoke','stator_tooth','stator_winding','torque'],axis = 1)
mm = MinMaxScaler()
X = mm.fit_transform(X)
X_df_test = mm.transform(X_df_test)  # reuse the scaler fitted on the training features
y = df['pm']
y_df_test = df_test['pm']
X = pd.DataFrame(X,columns = ['ambient', 'coolant', 'u_d', 'u_q', 'motor_speed', 'i_d','i_q'])
X_df_test = pd.DataFrame(X_df_test,columns = ['ambient', 'coolant', 'u_d', 'u_q', 'motor_speed', 'i_d','i_q'])
y.reset_index(drop = True,inplace = True)
y_df_test.reset_index(drop = True,inplace = True)
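A scaler is normally fitted on the training features only and then reused on held-out data, so that both end up on the same scale. The sketch below shows the difference on synthetic arrays; the shapes and value ranges are assumptions chosen to make the effect visible.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
train = rng.uniform(0, 10, size=(100, 2))
held_out = rng.uniform(2, 8, size=(20, 2))    # narrower range than train

scaler = MinMaxScaler().fit(train)            # learn min/max from train only
train_scaled = scaler.transform(train)
held_out_scaled = scaler.transform(held_out)  # reuse the same min/max

# Refitting on the held-out rows stretches them to [0, 1] on their own,
# which puts them on a different scale from the training features.
refit_scaled = MinMaxScaler().fit_transform(held_out)
print(held_out_scaled.min(axis=0), refit_scaled.min(axis=0))
```

With the shared scaler, held-out values may fall slightly outside [0, 1]; that is expected and keeps the learned coefficients meaningful.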

In [None]:
print(X.shape)
print(y.shape)

In [None]:
for i in X.columns:
    print(X[i].skew())
    sns.distplot(X[i],color='g')
    sns.boxplot(x=X[i],color = 'y')  # keyword x keeps the box horizontal across seaborn versions
    plt.vlines(X[i].mean(),ymin = -1,ymax = 1,color = 'r')
    plt.show()

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=3)

In [None]:
import statsmodels.api as sm
X_train_const = sm.add_constant(X_train)
lin_reg = sm.OLS(y_train,X_train_const).fit()
lin_reg.summary()

In [None]:
from statsmodels.stats.diagnostic import linear_rainbow
linear_rainbow(lin_reg)

In [None]:
from statsmodels.stats.api import het_goldfeldquandt
het_goldfeldquandt(lin_reg.resid,lin_reg.model.exog)

In [None]:
from statsmodels.stats.outliers_influence import variance_inflation_factor
vif = [variance_inflation_factor(X_train_const.values,i) for i in range(X_train_const.shape[1])]
pd.DataFrame(vif,index=X_train_const.columns)

##### Observations :
1. Looking at the p-values of the features, all of them seem to be significant in predicting the rotor temperature (`pm`), as the p-values are very low.
2. The Durbin-Watson statistic is very close to 2, so there seems to be very little autocorrelation in the residuals. (The rows were shuffled earlier, so this says little about the original time ordering.)
3. The p-value for the Jarque-Bera test is less than 0.05, so we reject the null hypothesis that the residuals are normally distributed. We will also look at the distribution of the residuals and a QQ plot to check this visually.
4. The p-value for the rainbow test is greater than 0.05, so we fail to reject the null hypothesis and can treat the relationship as linear.
5. The p-value for the Goldfeld-Quandt test is greater than 0.05, so we fail to reject the null hypothesis and can treat the residuals as homoskedastic.
6. However, the VIF value for motor_speed is high, so there seems to be some multicollinearity among the features.
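The VIF values in observation 6 follow the definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on all the other features. The sketch below implements that definition with NumPy on synthetic data; the helper name `vif` and the generated columns are assumptions for illustration and are not the statsmodels routine used above.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of a 2-D feature array."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(X)), others])  # add intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1.0 - resid.var() / target.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = rng.normal(size=2000)
x3 = x1 + x2 + rng.normal(scale=0.1, size=2000)  # nearly a linear combination
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs.round(1))
```

Independent columns give values near 1, while a column that is almost a linear combination of the others inflates the VIF of every column involved.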

In [None]:
lin_reg.resid.plot(kind = 'density')

In [None]:
import scipy.stats as stats
import pylab
st_residual = lin_reg.get_influence().resid_studentized_internal
stats.probplot(st_residual, dist="norm", plot = pylab)
plt.show()

As we can see from the QQ plot and the KDE plot, the residuals are quite close to normally distributed around the centre but deviate from the normal distribution towards the extremes, which may be why the Jarque-Bera test rejects normality.
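The effect described here can be reproduced on synthetic data: a sample that looks bell-shaped in the centre but has heavy tails is rejected by the Jarque-Bera test. The sketch uses a Student-t sample with 3 degrees of freedom as a stand-in for such residuals; that choice and the sample size are assumptions.

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(0)
normal_resid = rng.normal(size=5000)
heavy_tailed = rng.standard_t(df=3, size=5000)  # bell-shaped centre, fat tails

jb_normal = jarque_bera(normal_resid)
jb_heavy = jarque_bera(heavy_tailed)
print(f"normal sample:       JB = {jb_normal[0]:.1f}, p = {jb_normal[1]:.3f}")
print(f"heavy-tailed sample: JB = {jb_heavy[0]:.1f}, p = {jb_heavy[1]:.3g}")
```

With large samples the test is sensitive to tail behaviour, so a rejection alongside a reasonable-looking QQ plot centre is not unusual.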

In [None]:
y_train_pred = lin_reg.predict(X_train_const)
train_rmse = np.sqrt(np.sum(((y_train-y_train_pred)**2))/len(y_train))
train_rmse

In [None]:
X_test_const = sm.add_constant(X_test)
y_test_pred = lin_reg.predict(X_test_const)
y_test_pred

In [None]:
test_rmse = np.sqrt(np.sum(((y_test-y_test_pred)**2))/len(y_test))
test_rmse

In [None]:
lin_reg.rsquared_adj

### Transforming skewed data and capping outliers

In [None]:
X_trans = X.copy()  # work on a copy so X is not modified in place
X_trans['coolant'] = np.power(X_trans['coolant'],1/3)
X_trans['ambient'] = np.power(X_trans['ambient'],3)
X_trans['i_d'] = np.power(X_trans['i_d'],3)

In [None]:
for i in X_trans.columns:
    print(X_trans[i].skew())
    sns.distplot(X_trans[i],color='g')
    sns.boxplot(x=X_trans[i],color = 'y')  # keyword x keeps the box horizontal across seaborn versions
    plt.vlines(X_trans[i].mean(),ymin = -1,ymax = 1,color = 'r')
    plt.show()

In [None]:
z = np.abs(stats.zscore(X_trans))
print(z)

In [None]:
X_trans = X_trans.drop(np.where(z > 3)[0][0:])
X_trans.reset_index(drop=True,inplace = True)
y = y.drop(np.where(z > 3)[0][0:])
y.reset_index(drop = True,inplace = True)

In [None]:
print(X_trans.shape)
print(y.shape)
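The row-dropping step above can also be written with a boolean mask, which keeps features and target aligned without passing index arrays around. This is a sketch on synthetic data with one planted outlier; the column names and the cut-off of 3 standard deviations mirror the cells above, but the data is made up.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["a", "b", "c"])
target = pd.Series(rng.normal(size=1000), name="y")
features.iloc[0, 0] = 15.0  # plant one obvious outlier

z = np.abs(stats.zscore(features.to_numpy()))
keep = (z < 3).all(axis=1)  # True for rows with no |z| >= 3 in any column

# apply the same mask to both so rows stay paired
features_kept = features.loc[keep].reset_index(drop=True)
target_kept = target.loc[keep].reset_index(drop=True)
print(features.shape, "->", features_kept.shape)
```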

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_trans, y, test_size=0.3, random_state=3)

In [None]:
import statsmodels.api as sm
X_train_const = sm.add_constant(X_train)
lin_reg = sm.OLS(y_train,X_train_const).fit()
lin_reg.summary()

In [None]:
y_train_pred = lin_reg.predict(X_train_const)
train_rmse = np.sqrt(np.sum(((y_train-y_train_pred)**2))/len(y_train))
train_rmse

In [None]:
X_test_const = sm.add_constant(X_test)
y_test_pred = lin_reg.predict(X_test_const)
y_test_pred

In [None]:
test_rmse = np.sqrt(np.sum(((y_test-y_test_pred)**2))/len(y_test))
test_rmse

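There is no improvement in our RMSE from transforming the data, so the transformation brings no benefit. Note that the next cell still sets `X = X_trans`, so the later experiments run on the transformed, outlier-trimmed features, which keeps `X` aligned with the trimmed `y`.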

In [None]:
X = X_trans

### Taking care of multicollinearity using PCA

In [None]:
from sklearn.decomposition import PCA
pca  = PCA()
pca.fit(X)

In [None]:
pca.explained_variance_ratio_

In [None]:
np.cumsum(pca.explained_variance_ratio_)

As we can see, 96 percent of the variance in data is explained by the first 5 principal components. So we'll choose these 5 components and see if there is any improvement in the Linear model.
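Choosing the number of components from the cumulative explained variance can be done programmatically. The sketch below uses synthetic data with a two-dimensional signal and a 95% target; both are assumptions for illustration. Passing a fraction as `n_components` lets scikit-learn make the same choice.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
# six observed columns driven by two latent factors plus small noise
data = latent @ mixing + rng.normal(scale=0.05, size=(500, 6))

pca_full = PCA().fit(data)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
# smallest number of components whose cumulative ratio reaches the target
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)

# scikit-learn can pick the count itself when given a fraction
pca_auto = PCA(n_components=0.95, svd_solver="full").fit(data)
print(n_keep, pca_auto.n_components_)
```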

In [None]:
pca5 = PCA(n_components=5)
X_pca = pca5.fit_transform(X)
X_pca

In [None]:
X_pca_train, X_pca_test, y_train, y_test = train_test_split(X_pca, y, test_size=0.3, random_state=3)

In [None]:
X_pca_train_const = sm.add_constant(X_pca_train)
lin_reg = sm.OLS(y_train,X_pca_train_const).fit()
lin_reg.summary()

In [None]:
y_train_pred = lin_reg.predict(X_pca_train_const)
train_rmse = np.sqrt(np.sum(((y_train-y_train_pred)**2))/len(y_train))
train_rmse

In [None]:
X_pca_test_const = sm.add_constant(X_pca_test)
y_test_pred = lin_reg.predict(X_pca_test_const)
y_test_pred

In [None]:
test_rmse = np.sqrt(np.sum(((y_test-y_test_pred)**2))/len(y_test))
test_rmse

There is no improvement in our RMSE from using PCA, so we will not go ahead with the PCA transformation.

### Dropping the d and q components of current (i) based on the statistical analysis

In [None]:
X_wo_dqi = X.drop(['i_d','i_q'],axis = 1)

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_wo_dqi, y, test_size=0.3, random_state=3)

In [None]:
import statsmodels.api as sm
X_train_const = sm.add_constant(X_train)
lin_reg = sm.OLS(y_train,X_train_const).fit()
lin_reg.summary()

In [None]:
y_train_pred = lin_reg.predict(X_train_const)
train_rmse = np.sqrt(np.sum(((y_train-y_train_pred)**2))/len(y_train))
train_rmse

In [None]:
X_test_const = sm.add_constant(X_test)
y_test_pred = lin_reg.predict(X_test_const)
y_test_pred

In [None]:
test_rmse = np.sqrt(np.sum(((y_test-y_test_pred)**2))/len(y_test))
test_rmse

There is no improvement in our RMSE from eliminating the d and q components of current, so we will not go ahead with the elimination.

### Dropping the motor speed based on the VIF values

In [None]:
X_wo_ms = X.drop(['motor_speed'],axis = 1)

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_wo_ms, y, test_size=0.3, random_state=3)

In [None]:
import statsmodels.api as sm
X_train_const = sm.add_constant(X_train)
lin_reg = sm.OLS(y_train,X_train_const).fit()
lin_reg.summary()

In [None]:
y_train_pred = lin_reg.predict(X_train_const)
train_rmse = np.sqrt(np.sum(((y_train-y_train_pred)**2))/len(y_train))
train_rmse

In [None]:
X_test_const = sm.add_constant(X_test)
y_test_pred = lin_reg.predict(X_test_const)
y_test_pred

In [None]:
test_rmse = np.sqrt(np.sum(((y_test-y_test_pred)**2))/len(y_test))
test_rmse

There is no improvement in our RMSE from eliminating the motor speed feature, so we will not go ahead with the elimination.

##### Reporting a range (mean and spread across folds) is usually more convincing than a single point estimate. This can be achieved with k-fold cross-validation.

In [None]:
y = pd.DataFrame(y)

In [None]:
from sklearn.linear_model import LinearRegression, Ridge, Lasso
# lr = LinearRegression()
# ridge = Ridge(alpha = 20000)
lasso = Lasso(alpha = 0.012)

In [None]:
# from sklearn.model_selection import KFold
# from sklearn import metrics
# kf = KFold(n_splits=5,shuffle=True,random_state=0)
# for model,name in zip([lr,ridge,lasso],['LR','Ridge','Lasso']):
#     mse_li = []
#     for train_idx,test_idx in kf.split(X,y):
#         X_train,X_test = X.iloc[train_idx,:],X.iloc[test_idx,:]
#         y_train,y_test = y.iloc[train_idx,:],y.iloc[test_idx,:]
#         model.fit(X_train,y_train)
#         y_pred = model.predict(X_test)
#         mse = metrics.mean_squared_error(y_test,y_pred)
#         mse_li.append(mse)
#     print('RMSE scores : %0.03f (+/- %0.08f) [%s]'%(np.mean(mse_li), np.var(mse_li,ddof = 1), name))
#     print()

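Recorded output of the cell above. Despite the `RMSE` label, the loop averages `mean_squared_error` without taking a square root, so each value is the mean MSE over the 5 folds, and the figure in parentheses is the variance across folds.

RMSE scores : 0.536 (+/- 0.00000489) [LR]

RMSE scores : 0.631 (+/- 0.00000388) [Ridge]

RMSE scores : 0.564 (+/- 0.00000385) [Lasso]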
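For comparison, a cross-validated RMSE (rather than MSE) can be obtained with `cross_val_score` and the `neg_root_mean_squared_error` scorer. The sketch below runs on synthetic regression data, so the numbers are not comparable to the recorded results; the data, coefficients and the `cv_rmse` dictionary are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(600, 4))
y_demo = X_demo @ np.array([1.5, -2.0, 0.0, 0.5]) + rng.normal(scale=0.3, size=600)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
cv_rmse = {}
for name, model in [("LR", LinearRegression()), ("Lasso", Lasso(alpha=0.012))]:
    # the scorer returns negated RMSE, so flip the sign
    scores = -cross_val_score(model, X_demo, y_demo, cv=kf,
                              scoring="neg_root_mean_squared_error")
    cv_rmse[name] = scores
    print(f"RMSE: {scores.mean():.3f} (+/- {scores.std(ddof=1):.3f}) [{name}]")
```

Reporting the standard deviation across folds keeps the spread in the same units as the RMSE itself.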

# Non Parametric Models

In [None]:
# from sklearn.ensemble import RandomForestRegressor
# from sklearn.tree import DecisionTreeRegressor
# from sklearn.model_selection import RandomizedSearchCV
# from scipy.stats import randint
# dt = DecisionTreeRegressor(random_state=0)
# rf = RandomForestRegressor(random_state=0,n_jobs = -1)
# param_dt = {
#         'criterion' : ['mse','mae'],
#         'max_depth' : randint(1,11)
# }
# param_rf = {
#         'n_estimators' : randint(1,70),
#         'max_depth' : randint(1,11)
# }
# rscv_dt = RandomizedSearchCV(dt,param_dt,scoring='neg_mean_squared_error',cv = 5,n_jobs=1,n_iter = 2,verbose = 1000,random_state = 0)
# rscv_rf = RandomizedSearchCV(rf,param_rf,scoring='neg_mean_squared_error',cv = 5,n_jobs=-1,n_iter = 2,verbose = 1000,random_state = 0)
# rscv_dt.fit(X,y)
# rscv_rf.fit(X,y)
# print(rscv_dt.best_params_)
# print(rscv_rf.best_params_)

DT : criterion=mse, max_depth=6

RF : max_depth=6, n_estimators=41

In [None]:
# from sklearn.ensemble import RandomForestRegressor
# from sklearn.tree import DecisionTreeRegressor
# from sklearn.neighbors import KNeighborsRegressor
# dt = DecisionTreeRegressor(criterion='mse',max_depth=6,random_state=0)
# rf = RandomForestRegressor(n_estimators=41,max_depth=6,random_state=0,n_jobs = -1)

In [None]:
# from sklearn.model_selection import KFold
# from sklearn import metrics
# kf = KFold(n_splits=5,shuffle=True,random_state=0)
# for model,name in zip([dt,rf],['DT','RF']):
#     mse_li = []
#     for train_idx,test_idx in kf.split(X,y):
#         X_train,X_test = X.iloc[train_idx,:],X.iloc[test_idx,:]
#         y_train,y_test = y.iloc[train_idx,:],y.iloc[test_idx,:]
#         model.fit(X_train,y_train)
#         y_pred = model.predict(X_test)
#         mse = metrics.mean_squared_error(y_test,y_pred)
#         mse_li.append(mse)
#     print('RMSE scores : %0.03f (+/- %0.08f) [%s]'%(np.mean(mse_li), np.var(mse_li,ddof = 1), name))
#     print()

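Recorded output of the cell above (mean MSE over the 5 folds, with the variance across folds in parentheses, despite the `RMSE` label):

RMSE scores : 0.380 (+/- 0.00000606) [DT]

RMSE scores : 0.374 (+/- 0.00000853) [RF]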

### Bagging Models
Finding the best number of estimators

In [None]:
from sklearn.ensemble import BaggingRegressor
# from sklearn.model_selection import KFold, cross_val_score
# models = []
# models.append(("LinearRegression",lr))
# models.append(("Lasso",lasso))
# models.append(("Ridge",ridge))
# models.append(("DT",dt))
# for name,model in models:
#     mse_var = []
#     for val in np.arange(1,21):
#         bg_model = BaggingRegressor(base_estimator=model,n_estimators=val,n_jobs=-1,verbose = 1000, random_state = 0)
#         kfold = KFold(n_splits=5,shuffle=True,random_state=0)
#         results = cross_val_score(bg_model,X,y,cv=kfold,n_jobs=-1,scoring='neg_mean_squared_error',verbose = 1000)
#         mse_var.append(np.var(results,ddof = 1))
#     print(name,np.argmin(mse_var)+1)

LinearRegression 12

Lasso 2

Ridge 2

DT 3

### Boosting Models
Finding the best number of estimators

In [None]:
# from sklearn.ensemble import AdaBoostRegressor
# from sklearn.model_selection import KFold, cross_val_score
# models = []
# models.append(("LinearRegression",lr))
# models.append(("Lasso",lasso))
# models.append(("Ridge",ridge))
# models.append(("DT",dt))
# models.append(("RF",rf))
# for name,model in models:
#     mse_mean = []
#     for val in np.arange(1,21):
#         bg_model = AdaBoostRegressor(base_estimator=model,n_estimators=val, random_state = 0)
#         kfold = KFold(n_splits=5,shuffle=True,random_state=0)
#         results = cross_val_score(bg_model,X,y,cv=kfold,n_jobs=-1,scoring='neg_mean_squared_error',verbose = 1000)
#         mse_mean.append(np.mean(results))
#     print(name,np.argmax(mse_mean)+1)

LinearRegression 1

Lasso 10

Ridge 3

DT 15

RF 8

In [None]:
# #Bagging Models
# LR_bag = BaggingRegressor(base_estimator = lr,n_estimators = 12,random_state = 0,n_jobs = -1)
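# estimator passed positionally: the keyword is base_estimator in older scikit-learn and estimator in newer releases
lasso_bag = BaggingRegressor(lasso,n_estimators = 2,random_state = 0,n_jobs = -1)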
# DT_bag = BaggingRegressor(base_estimator = dt,n_estimators = 3,random_state = 0,n_jobs = -1,verbose = 1000)
# ridge_bag = BaggingRegressor(base_estimator = ridge,n_estimators = 2,random_state = 0,n_jobs = -1) 
# # #Boosting models
# lasso_boost = AdaBoostRegressor(base_estimator = lasso,n_estimators = 10,random_state = 0)
# ridge_boost = AdaBoostRegressor(base_estimator = ridge,n_estimators = 3,random_state = 0)
# DT_boost = AdaBoostRegressor(base_estimator = dt,n_estimators = 15,random_state = 0)
# RF_boost = AdaBoostRegressor(base_estimator = rf,n_estimators = 8,random_state = 0)

In [None]:
# from sklearn.ensemble import GradientBoostingRegressor
# GBC = GradientBoostingRegressor(n_estimators = 100,random_state = 0)

In [None]:
# models = []
# models.append(('LR Bagged',LR_bag))
# models.append(('Lasso Bagged',lasso_bag))
# models.append(('Lasso Boosted',lasso_boost))
# models.append(('Ridge Bagged',ridge_bag))
# models.append(('Ridge Boosted',ridge_boost))
# models.append(('DTree Bagged',DT_bag))
# models.append(('DTree Boosted',DT_boost))
# models.append(('Gradient Boost',GBC))
# models.append(('RF Boosted',RF_boost))

In [None]:
# results = []
# names = []
# for name, model in models:
#     kfold = KFold(n_splits = 5,random_state = 0,shuffle = True)
#     cv_results = cross_val_score(model,X,y,cv = kfold,scoring='neg_mean_squared_error',n_jobs = -1)
#     results.append(cv_results)
#     names.append(name)
#     print(name,' : ',np.mean(cv_results),' -- ',np.var(cv_results,ddof = 1))

Recorded results. The first five rows are the positive mean MSE values printed by the manual k-fold loops; the remaining rows are means of `neg_mean_squared_error` from `cross_val_score`, hence the negative sign. In both cases, values closer to zero are better.

| Model | Bias error (mean CV score, as printed) | Variance error (variance across folds) |
|---|---|---|
| LR | 0.536 | 0.00000489 |
| Ridge | 0.631 | 0.00000388 |
| Lasso | 0.564 | 0.00000385 |
| DT | 0.380 | 0.00000606 |
| RF | 0.374 | 0.00000853 |
| LR Bagged | -0.5363121272813037 | 4.881295834654977e-06 |
| Lasso Bagged | -0.5645736874272845 | 3.2947376969099687e-06 |
| Lasso Boosted | -0.571080275441007 | 5.1543103985968674e-06 |
| Ridge Bagged | -0.6309426367758058 | 3.2555007823844387e-06 |
| Ridge Boosted | -0.6172788016072153 | 3.302273899220944e-06 |
| DTree Bagged | -0.37582901559820736 | 5.251632737569686e-06 |
| DTree Boosted | -0.3232454863288907 | 2.7692299178526762e-05 |
| Gradient Boost | -0.3165118622627031 | 8.654229505655393e-06 |
| RF Boosted | -0.315700682698399 | 4.0892902046352836e-05 |

As we can see from the results, Ridge Bagged gives the best result in terms of variance error, while Gradient Boost and RF Boosted give the best results in terms of bias error. Overall, Lasso Bagged gives a reasonable and acceptable result on both bias and variance error, so we will select Lasso Bagged as our final model.

In [None]:
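from sklearn.metrics import r2_score,mean_squared_error
lasso_bag.fit(X,y)
# X carries the power transforms applied earlier, so mirror them on the hold-out features
X_df_test_trans = X_df_test.assign(coolant=np.cbrt(X_df_test['coolant']), ambient=X_df_test['ambient']**3, i_d=X_df_test['i_d']**3)
test_pred = lasso_bag.predict(X_df_test_trans)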

In [None]:
test_pred
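The cell above produces predictions for the held-out profiles but does not score them, even though `r2_score` and `mean_squared_error` are imported. The sketch below shows how such a score could be computed. It runs on synthetic data with made-up coefficients, so the printed numbers say nothing about the motor dataset. On the real data, the comparison would be between `y_df_test` and `test_pred`.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)
coef = np.array([1.0, -0.5, 2.0])  # hypothetical true coefficients
X_fit = rng.normal(size=(500, 3))
y_fit = X_fit @ coef + rng.normal(scale=0.2, size=500)
X_hold = rng.normal(size=(200, 3))
y_hold = X_hold @ coef + rng.normal(scale=0.2, size=200)

# estimator passed positionally; its keyword name differs across scikit-learn versions
model = BaggingRegressor(Lasso(alpha=0.012), n_estimators=2, random_state=0)
model.fit(X_fit, y_fit)
hold_pred = model.predict(X_hold)

hold_rmse = np.sqrt(mean_squared_error(y_hold, hold_pred))
hold_r2 = r2_score(y_hold, hold_pred)
print(f"hold-out RMSE = {hold_rmse:.3f}, R^2 = {hold_r2:.3f}")
```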
