## Voting Regressor

In [21]:
import pandas as pd
import numpy as np
# Importing the California Housing dataset
from sklearn.datasets import fetch_california_housing


# Loading the dataset
housing_data = fetch_california_housing()
housing = pd.DataFrame(data = housing_data['data'], columns = housing_data['feature_names'])


In [22]:
housing['MedHouseValue'] = housing_data['target']

In [23]:
housing

,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude,MedHouseValue
0,8.3252,41.0,6.984127,1.023810,322.0,2.555556,37.88,-122.23,4.526
1,8.3014,21.0,6.238137,0.971880,2401.0,2.109842,37.86,-122.22,3.585
2,7.2574,52.0,8.288136,1.073446,496.0,2.802260,37.85,-122.24,3.521
3,5.6431,52.0,5.817352,1.073059,558.0,2.547945,37.85,-122.25,3.413
4,3.8462,52.0,6.281853,1.081081,565.0,2.181467,37.85,-122.25,3.422
...,...,...,...,...,...,...,...,...,...
20635,1.5603,25.0,5.045455,1.133333,845.0,2.560606,39.48,-121.09,0.781
20636,2.5568,18.0,6.114035,1.315789,356.0,3.122807,39.49,-121.21,0.771
20637,1.7000,17.0,5.205543,1.120092,1007.0,2.325635,39.43,-121.22,0.923
20638,1.8672,18.0,5.329513,1.171920,741.0,2.123209,39.43,-121.32,0.847


In [24]:
X,y = fetch_california_housing(return_X_y=True)

In [25]:
X.shape

(20640, 8)

In [26]:
y.shape

(20640,)

#### Base Models
- We evaluated two base models, `LinearRegression` and `DecisionTreeRegressor`, using the mean R² over 10-fold cross-validation. `SVR` is imported but left commented out.

In [27]:
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

In [28]:
lr = LinearRegression()
dt = DecisionTreeRegressor()
#svr = SVR()

In [29]:
estimators = [('lr',lr),('dt',dt)] #,('svr',svr)]


In [30]:
# Mean 10-fold cross-validated R² for each base model
for name, model in estimators:
  scores = cross_val_score(model,X,y,scoring='r2',cv=10)
  print(name,np.round(np.mean(scores),2))

lr 0.51
dt 0.23


#### Voting Regressor
- We combined the base models using `VotingRegressor`.
- Mean 10-fold cross-validation score (R²) for `VotingRegressor`: 0.54, above both `lr` (0.51) and `dt` (0.23).

In [31]:
from sklearn.ensemble import VotingRegressor

In [32]:
vr = VotingRegressor(estimators)
scores = cross_val_score(vr,X,y,scoring='r2',cv=10)
print("Voting Regressor",np.round(np.mean(scores),2))

Voting Regressor 0.54
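
With no weights given, `VotingRegressor` predicts the plain average of its base models' predictions. The sketch below checks this on a small synthetic dataset. The synthetic data, the `_demo` names, and `random_state=0` are assumptions made for illustration and are not part of the housing experiment above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption: used only to keep the example small and offline)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

vr_demo = VotingRegressor([
    ("lr", LinearRegression()),
    ("dt", DecisionTreeRegressor(random_state=0)),
])
vr_demo.fit(X_demo, y_demo)

# estimators_ holds the fitted copies of the base models, in the order given
base_preds = np.column_stack([est.predict(X_demo) for est in vr_demo.estimators_])
manual_avg = base_preds.mean(axis=1)

# The ensemble prediction should match the unweighted mean of the base predictions
matches = np.allclose(manual_avg, vr_demo.predict(X_demo))
print(matches)
```

Because the ensemble is an average, it tends to help most when the base models make different kinds of errors, as a linear model and a tree usually do.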


#### Weighted Voting
- We experimented with different weights for the base models in the `VotingRegressor`.


In [33]:
print("Cross-validation scores (R²) for different weight combinations:")
for i in range(1,4):
  for j in range(1,4):
    vr = VotingRegressor(estimators,weights=[i,j])
    scores = cross_val_score(vr,X,y,scoring='r2',cv=10)
    print("For i={},j={}".format(i,j),np.round(np.mean(scores),2))


Cross-validation scores (R²) for different weight combinations:
For i=1,j=1 0.54
For i=1,j=2 0.47
For i=1,j=3 0.43
For i=2,j=1 0.56
For i=2,j=2 0.54
For i=2,j=3 0.5
For i=3,j=1 0.56
For i=3,j=2 0.56
For i=3,j=3 0.54


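- The highest mean R² (0.56) was achieved with weight combinations `i=2, j=1`, `i=3, j=1`, and `i=3, j=2`, all of which weight `lr` (`i`) more heavily than `dt` (`j`).
- Equal weights (`i=1, j=1`, `i=2, j=2`, `i=3, j=3`) all give 0.54, the same as the unweighted ensemble, because only the ratio of the weights matters.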
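
The arithmetic behind weighted voting can be sketched with NumPy alone. The prediction values and the `weighted_vote` helper below are hypothetical and chosen only for illustration; the sketch assumes the ensemble takes a weighted average in which the weights are normalized by their sum.

```python
import numpy as np

# Hypothetical predictions from two base models for three samples
lr_pred = np.array([2.0, 3.0, 1.0])
dt_pred = np.array([2.6, 2.4, 1.6])
preds = np.column_stack([lr_pred, dt_pred])  # shape (n_samples, n_models)

def weighted_vote(preds, weights):
    """Weighted average across models; np.average divides by the sum of the weights."""
    return np.average(preds, axis=1, weights=weights)

vote_2_1 = weighted_vote(preds, [2, 1])  # first model counts twice as much
vote_1_1 = weighted_vote(preds, [1, 1])
vote_3_3 = weighted_vote(preds, [3, 3])

print(vote_2_1)
# Scaling all weights by the same factor leaves the result unchanged
print(np.allclose(vote_1_1, vote_3_3))
```

This is why the grid above contains only a handful of distinct ensembles: pairs with the same ratio `i/j` describe the same weighted average.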

#### Using the Same Algorithm
- We trained `DecisionTreeRegressor` models with different `max_depth` values:

In [34]:
# using the same algorithm

dt1 = DecisionTreeRegressor(max_depth=1)
dt2 = DecisionTreeRegressor(max_depth=3)
dt3 = DecisionTreeRegressor(max_depth=5)
dt4 = DecisionTreeRegressor(max_depth=7)
dt5 = DecisionTreeRegressor(max_depth=None)

In [35]:
estimators = [('dt1',dt1),('dt2',dt2),('dt3',dt3),('dt4',dt4),('dt5',dt5)]


In [36]:
# Mean 10-fold cross-validated R² for each tree depth
for name, model in estimators:
  scores = cross_val_score(model,X,y,scoring='r2',cv=10)
  print(name,np.round(np.mean(scores),2))

dt1 0.13
dt2 0.36
dt3 0.43
dt4 0.47
dt5 0.24


In [38]:
vr = VotingRegressor(estimators)
scores = cross_val_score(vr,X,y,scoring='r2',cv=10)
print("Voting Regressor",np.round(np.mean(scores),2))

Voting Regressor 0.5



- We combined these five trees using `VotingRegressor`.
- Mean 10-fold cross-validation score (R²) for `VotingRegressor`: 0.5, above the best single tree (`dt4`, 0.47).

#### Conclusion
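- Averaging models with `VotingRegressor` beat the best individual model in both experiments: 0.54 versus 0.51 when combining `lr` and `dt`, and 0.5 versus 0.47 when combining trees with different `max_depth` values.
- Tuning the weights improved results further: weighting `lr` more heavily than `dt` raised the mean R² from 0.54 to 0.56.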
