## Fine Tuning the Model

* **Bias Error**: Error the model makes on the data it was fitted to (the fitting or training stage).
* **Variance Error**: Change in the model's predictions when it is fitted to a different data set.

1. If the bias error is low and the variance error is high, the model is an ***Overfitted Model***.
2. If the bias error is high and the variance error is low, the model is an ***Underfitted Model***.
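
One way to see both cases is to fit models of different flexibility to the same noisy data and compare the training and test errors. The sketch below is only an illustration: the synthetic sine data, the polynomial degrees, and the helper name `mse_for_degree` are assumptions and are not part of the IPL analysis.

```python
import numpy as np

# Synthetic data (assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 40))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.size)

# Alternate points go to the training and test splits
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def mse_for_degree(degree):
    """Fit a polynomial of the given degree and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

for degree in (1, 3, 9):
    train_mse, test_mse = mse_for_degree(degree)
    print(f"degree={degree}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```

A high error on both splits points to underfitting, while a low training error paired with a clearly higher test error points to overfitting.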

How do we handle an overfitted model?
> **Regularization**: 
> * Ridge
> * Lasso
> * Elastic net

In [1]:
import pandas as pd
import numpy as np
import optuna
from sklearn.model_selection import cross_val_score



In [2]:
data=pd.read_csv('../data-sets/IPL_IMB_data.csv')

In [3]:
data.head(2)

Unnamed: 0,PLAYER NAME,AGE,COUNTRY,PLAYING ROLE,T-RUNS,T-WKTS,ODI-RUNS-S,ODI-SR-B,ODI-WKTS,ODI-SR-BL,...,HS,AVE,SR-B,SIXERS,RUNS-C,WKTS,AVE-BL,ECON,SR-BL,SOLD PRICE
0,"Abdulla, YA",2,SA,Allrounder,0,0,0,0.0,0,0.0,...,0,0.0,0.0,0,307,15,20.47,8.9,13.93,50000
1,Abdur Razzak,2,BAN,Bowler,214,18,657,71.41,185,37.6,...,0,0.0,0.0,0,29,0,0.0,14.5,0.0,50000


In [4]:
data.shape


(130, 22)

In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 130 entries, 0 to 129
Data columns (total 22 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   PLAYER NAME    130 non-null    object 
 1   AGE            130 non-null    int64  
 2   COUNTRY        130 non-null    object 
 3   PLAYING ROLE   130 non-null    object 
 4   T-RUNS         130 non-null    int64  
 5   T-WKTS         130 non-null    int64  
 6   ODI-RUNS-S     130 non-null    int64  
 7   ODI-SR-B       130 non-null    float64
 8   ODI-WKTS       130 non-null    int64  
 9   ODI-SR-BL      130 non-null    float64
 10  CAPTAINCY EXP  130 non-null    int64  
 11  RUNS-S         130 non-null    int64  
 12  HS             130 non-null    int64  
 13  AVE            130 non-null    float64
 14  SR-B           130 non-null    float64
 15  SIXERS         130 non-null    int64  
 16  RUNS-C         130 non-null    int64  
 17  WKTS           130 non-null    int64  
 18  AVE-BL    

In [6]:
data['AGE']=data['AGE'].astype('object')
data['CAPTAINCY EXP']=data['CAPTAINCY EXP'].astype('object')

In [7]:
data=data.drop(columns='PLAYER NAME')

In [8]:
num_data=data.select_dtypes(include=np.number)
cat_data=data.select_dtypes(exclude=np.number)

In [9]:
cat_data.head(2)

Unnamed: 0,AGE,COUNTRY,PLAYING ROLE,CAPTAINCY EXP
0,2,SA,Allrounder,0
1,2,BAN,Bowler,0


In [10]:
cat_data=pd.get_dummies(cat_data,drop_first=True,dtype=float)

In [11]:
cat_data.head(2)

Unnamed: 0,AGE_2,AGE_3,COUNTRY_BAN,COUNTRY_ENG,COUNTRY_IND,COUNTRY_NZ,COUNTRY_PAK,COUNTRY_SA,COUNTRY_SL,COUNTRY_WI,COUNTRY_ZIM,PLAYING ROLE_Batsman,PLAYING ROLE_Bowler,PLAYING ROLE_W. Keeper,CAPTAINCY EXP_1
0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0


In [12]:
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
num_data_sc=sc.fit_transform(num_data)
num_data_sc=pd.DataFrame(num_data_sc,columns=num_data.columns)
num_data_sc.head(2)

Unnamed: 0,T-RUNS,T-WKTS,ODI-RUNS-S,ODI-SR-B,ODI-WKTS,ODI-SR-BL,RUNS-S,HS,AVE,SR-B,SIXERS,RUNS-C,WKTS,AVE-BL,ECON,SR-BL,SOLD PRICE
0,-0.657994,-0.468108,-0.703043,-2.758455,-0.68676,-1.277132,-0.839099,-1.307954,-1.693829,-3.10288,-0.745369,-0.30301,-0.099814,-0.127413,0.547597,-0.226928,-1.162826
1,-0.593006,-0.34146,-0.518927,0.00952,0.983269,0.133821,-0.839099,-1.307954,-1.693829,-3.10288,-0.745369,-0.802864,-0.790019,-1.115257,1.685233,-1.142498,-1.162826


In [13]:
data_full=pd.concat([num_data_sc,cat_data],axis=1)

In [14]:
out=data_full['SOLD PRICE']
inp=data_full.drop('SOLD PRICE',axis=1)

In [15]:
from statsmodels.stats.outliers_influence import variance_inflation_factor

In [16]:
vif=pd.DataFrame()
vif['VIF']=[variance_inflation_factor(inp.values,i) for i in range(inp.shape[1])]
vif['feature']=inp.columns
vif.sort_values('VIF',ascending=False).head(5)

Unnamed: 0,VIF,feature
15,45.04742,SR-BL
13,44.645799,AVE-BL
11,21.792199,RUNS-C
12,20.39178,WKTS
2,10.972636,ODI-RUNS-S


In [17]:
inp1=inp.drop(['SR-BL','AVE-BL','RUNS-C'],axis=1)
vif=pd.DataFrame()
vif['VIF']=[variance_inflation_factor(inp1.values,i) for i in range(inp1.shape[1])]
vif['feature']=inp1.columns
vif.sort_values('VIF',ascending=False).head(5)

Unnamed: 0,VIF,feature
2,10.944706,ODI-RUNS-S
6,9.481883,RUNS-S
0,8.39543,T-RUNS
7,8.363663,HS
4,6.85509,ODI-WKTS


In [18]:
import statsmodels.api as sm
inpc=sm.add_constant(inp1)
ols=sm.OLS(out,inpc)
ols_mod=ols.fit()
ols_mod.summary()

0,1,2,3
Dep. Variable:,SOLD PRICE,R-squared:,0.545
Model:,OLS,Adj. R-squared:,0.419
Method:,Least Squares,F-statistic:,4.322
Date:,"Sun, 04 Aug 2024",Prob (F-statistic):,2.95e-08
Time:,15:27:00,Log-Likelihood:,-133.27
No. Observations:,130,AIC:,324.5
Df Residuals:,101,BIC:,407.7
Df Model:,28,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.2306,0.318,0.725,0.470,-0.401,0.862
T-RUNS,-0.2921,0.196,-1.489,0.140,-0.681,0.097
T-WKTS,-0.0933,0.169,-0.552,0.582,-0.428,0.242
ODI-RUNS-S,0.4246,0.222,1.910,0.059,-0.016,0.866
ODI-SR-B,0.0017,0.087,0.019,0.985,-0.170,0.174
ODI-WKTS,0.3344,0.176,1.900,0.060,-0.015,0.684
ODI-SR-BL,-0.0718,0.085,-0.845,0.400,-0.240,0.097
RUNS-S,0.2079,0.208,1.000,0.320,-0.204,0.620
HS,-0.3427,0.194,-1.763,0.081,-0.728,0.043

0,1,2,3
Omnibus:,7.05,Durbin-Watson:,2.058
Prob(Omnibus):,0.029,Jarque-Bera (JB):,6.694
Skew:,0.479,Prob(JB):,0.0352
Kurtosis:,3.565,Cond. No.,32.8


In [19]:
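# Backward elimination: refit OLS and drop the least significant feature
# until every remaining p-value is at most 0.05
while len(inp1.columns) > 0:
    inpc = sm.add_constant(inp1)
    ols_mod = sm.OLS(out, inpc).fit()

    # Skip the constant (first entry) when looking for the largest p-value
    pvalues = ols_mod.pvalues[1:]
    if pvalues.max() > 0.05:
        inp1 = inp1.drop(pvalues.idxmax(), axis=1)
    else:
        break

print('The Final Features are :', inp1.columns)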

The Final Features are : Index(['ODI-RUNS-S', 'ODI-WKTS', 'SIXERS', 'AGE_2', 'AGE_3', 'COUNTRY_ENG',
       'COUNTRY_IND', 'COUNTRY_WI'],
      dtype='object')


In [20]:
import statsmodels.api as sm
inpc=sm.add_constant(inp1)
ols=sm.OLS(out,inpc)
ols_mod=ols.fit()
ols_mod.summary()

0,1,2,3
Dep. Variable:,SOLD PRICE,R-squared:,0.462
Model:,OLS,Adj. R-squared:,0.427
Method:,Least Squares,F-statistic:,13.01
Date:,"Sun, 04 Aug 2024",Prob (F-statistic):,2.09e-13
Time:,15:27:00,Log-Likelihood:,-144.11
No. Observations:,130,AIC:,306.2
Df Residuals:,121,BIC:,332.0
Df Model:,8,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.2460,0.230,1.072,0.286,-0.209,0.701
ODI-RUNS-S,0.3151,0.083,3.795,0.000,0.151,0.479
ODI-WKTS,0.2521,0.071,3.543,0.001,0.111,0.393
SIXERS,0.3488,0.076,4.568,0.000,0.198,0.500
AGE_2,-0.4823,0.219,-2.206,0.029,-0.915,-0.049
AGE_3,-0.6935,0.285,-2.432,0.016,-1.258,-0.129
COUNTRY_ENG,1.7017,0.453,3.760,0.000,0.806,2.598
COUNTRY_IND,0.5269,0.152,3.466,0.001,0.226,0.828
COUNTRY_WI,-0.6852,0.330,-2.077,0.040,-1.338,-0.032

0,1,2,3
Omnibus:,7.341,Durbin-Watson:,1.979
Prob(Omnibus):,0.025,Jarque-Bera (JB):,6.947
Skew:,0.527,Prob(JB):,0.031
Kurtosis:,3.416,Cond. No.,9.03


# Regularization

The sum of squared errors (SSE) between the actual values $Y_a$ and the predicted values $Y_p$ is, for a model with one feature:

$SSE = \sum (Y_a - Y_p)^2 = \sum (y - b_1 x_1 - b_0)^2$

Regularization adds a penalty on the size of the coefficients $b_1, b_2$ (also written $\beta_1, \beta_2$) to the SSE. The penalty constant $\lambda$ is a hyperparameter. For a model with two features:

* ***Ridge Regularization*** ($L_2$ penalty): $\sum (y - b_1 x_1 - b_2 x_2 - b_0)^2 + \lambda (b_1^2 + b_2^2)$
* ***Lasso Regularization*** ($L_1$ penalty): $\sum (y - b_1 x_1 - b_2 x_2 - b_0)^2 + \lambda (|b_1| + |b_2|)$
* ***Elastic Net Regularization*** (both penalties): $\sum (y - b_1 x_1 - b_2 x_2 - b_0)^2 + \lambda_1 (|b_1| + |b_2|) + \lambda_2 (b_1^2 + b_2^2)$
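
The penalized losses can be evaluated directly for a given set of coefficients. The following is a minimal sketch; the function names and the tiny data set are hypothetical and chosen so the arithmetic can be checked by hand.

```python
import numpy as np

def sse(X, y, b0, b):
    """Sum of squared errors for intercept b0 and coefficient vector b."""
    residuals = y - (X @ b + b0)
    return float(np.sum(residuals ** 2))

def ridge_loss(X, y, b0, b, lam):
    """SSE plus the L2 penalty lam * sum(b_j ** 2)."""
    return sse(X, y, b0, b) + lam * float(np.sum(b ** 2))

def lasso_loss(X, y, b0, b, lam):
    """SSE plus the L1 penalty lam * sum(|b_j|)."""
    return sse(X, y, b0, b) + lam * float(np.sum(np.abs(b)))

def elastic_net_loss(X, y, b0, b, lam1, lam2):
    """SSE plus both the L1 and the L2 penalty."""
    l1 = lam1 * float(np.sum(np.abs(b)))
    l2 = lam2 * float(np.sum(b ** 2))
    return sse(X, y, b0, b) + l1 + l2

# Toy data (assumption): one feature, two observations
X = np.array([[1.0], [2.0]])
y = np.array([2.0, 4.0])
b = np.array([-2.0])

print(sse(X, y, 0.0, b))                         # 80.0
print(ridge_loss(X, y, 0.0, b, lam=0.5))         # 82.0
print(lasso_loss(X, y, 0.0, b, lam=0.5))         # 81.0
print(elastic_net_loss(X, y, 0.0, b, 0.5, 1.0))  # 85.0
```

Note that scikit-learn's `Lasso` and `ElasticNet` scale the squared-error term differently from `Ridge`, so `alpha` values are not directly comparable between those estimators.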

In [21]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score,mean_squared_error

In [22]:
xtrain, xtest, ytrain,ytest = train_test_split(inp1, out, test_size=0.3, random_state=48)

In [25]:
lr = LinearRegression()

lr.fit(xtrain, ytrain)

ypred_train = lr.predict(xtrain)
ypred_test = lr.predict(xtest)

r2_train = r2_score(ytrain, ypred_train)
r2_test = r2_score(ytest, ypred_test)

rmse_train = np.sqrt(mean_squared_error(ytrain, ypred_train))
rmse_test = np.sqrt(mean_squared_error(ytest, ypred_test))

res_lr = [rmse_train, rmse_test,r2_train, r2_test]
res_lr

[0.6935960236384503,
 0.8439460602872784,
 0.508838132286692,
 0.31795438052636205]

In [26]:
0.6935960236384503 - 0.8439460602872784, 0.508838132286692 - 0.31795438052636205

(-0.15035003664882807, 0.1908837517603299)

In [27]:
from sklearn.linear_model import Ridge

Here, `alpha` is the scikit-learn parameter used to pass the $\lambda$ value.
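
To check that `alpha` plays the role of $\lambda$, the ridge coefficients can be computed in closed form and compared with `Ridge`. This is a sketch on synthetic data (the data and variable names are assumptions); the features and target are centred first because the intercept is not penalized.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (assumption for illustration)
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=60)

lam = 4.0

# Closed form: solve (Xc'Xc + lam * I) b = Xc'yc on centred data
Xc = X - X.mean(axis=0)
yc = y - y.mean()
b_closed = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)

# scikit-learn: the same penalty constant passed as alpha
b_sklearn = Ridge(alpha=lam).fit(X, y).coef_

print(b_closed)
print(b_sklearn)
```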

In [51]:
def ridge_regul(inp):  # inp: penalty constant (alpha)
    rid = Ridge(alpha=inp)

    rid.fit(xtrain, ytrain)

    ypred_train = rid.predict(xtrain)
    ypred_test = rid.predict(xtest)

    r2_train = r2_score(ytrain, ypred_train)
    r2_test = r2_score(ytest, ypred_test)

    rmse_train = np.sqrt(mean_squared_error(ytrain, ypred_train))
    rmse_test = np.sqrt(mean_squared_error(ytest, ypred_test))

    res_r = [rmse_train, rmse_test,r2_train, r2_test]
    
    return res_r, rid

ridge_regul(1)

([0.6999914082289306,
  0.8342032740791273,
  0.499738741015878,
  0.3336109936891847],
 Ridge(alpha=1))

In [35]:
ridge_regul(4)

[0.7280827940618565,
 0.828061123169325,
 0.4587810569734947,
 0.34338797149862244]

In [36]:
ridge_regul(10)

[0.7606247345540222,
 0.8320239586999733,
 0.4093198985114397,
 0.33708826322377583]

In [37]:
ridge_regul(50)

[0.8295503859230371,
 0.8704451128459049,
 0.2974180434546707,
 0.27445086916536077]

In [38]:
from sklearn.linear_model import Lasso

In [46]:
def lasso_regul(inp):  # inp: penalty constant (alpha)
    las = Lasso(alpha=inp)

    las.fit(xtrain, ytrain)

    ypred_train = las.predict(xtrain)
    ypred_test = las.predict(xtest)

    r2_train = r2_score(ytrain, ypred_train)
    r2_test = r2_score(ytest, ypred_test)

    rmse_train = np.sqrt(mean_squared_error(ytrain, ypred_train))
    rmse_test = np.sqrt(mean_squared_error(ytest, ypred_test))

    res_r = [rmse_train, rmse_test,r2_train, r2_test]
    
    return res_r, las

lasso_regul(1)

([0.9896788143474161, 1.0244397037994115, 0.0, -0.004978677448944202],
 Lasso(alpha=1))

In [40]:
lasso_regul(4)

[0.9896788143474161, 1.0244397037994115, 0.0, -0.004978677448944202]

In [41]:
lasso_regul(0.05)

[0.7977374769545587,
 0.8492308266467251,
 0.35027220902360656,
 0.3093857351994026]

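Without regularization the test $R^2$ is 0.318; with Lasso (alpha = 0.05) it is 0.309. The two values are close, so alpha = 0.05 is a reasonable choice. Lasso penalizes much more strongly than Ridge at the same alpha: with alpha = 1 or alpha = 4 the train $R^2$ drops to 0, which means every coefficient has been shrunk to zero.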

In [47]:
from sklearn.linear_model import ElasticNet

def enet_regul(inp, ratio):  # inp: penalty constant (alpha), ratio: l1_ratio
    enet = ElasticNet(alpha=inp, l1_ratio=ratio)
    # l1_ratio=0 is a pure Ridge (L2) penalty; l1_ratio=1 is a pure Lasso (L1) penalty
    enet.fit(xtrain, ytrain)

    ypred_train = enet.predict(xtrain)
    ypred_test = enet.predict(xtest)

    r2_train = r2_score(ytrain, ypred_train)
    r2_test = r2_score(ytest, ypred_test)

    rmse_train = np.sqrt(mean_squared_error(ytrain, ypred_train))
    rmse_test = np.sqrt(mean_squared_error(ytest, ypred_test))

    res_r = [rmse_train, rmse_test,r2_train, r2_test]
    
    return res_r, enet

enet_regul(0.1, 0.5)  # l1_ratio=0.5 weights the Lasso and Ridge penalties equally

([0.8109053171740085,
  0.8540860296462246,
  0.3286457399355487,
  0.3014664336411256],
 ElasticNet(alpha=0.1))

In [52]:
result = pd.DataFrame()

result['LR'] = res_lr
result['Ridge_alpha_1'] = ridge_regul(1)[0]
result['Ridge_alpha_4'] = ridge_regul(4)[0]
result['Ridge_alpha_10'] = ridge_regul(10)[0]
result['Ridge_alpha_50'] = ridge_regul(50)[0]
result['Lasso_alpha_0.05'] = lasso_regul(0.05)[0]
result['Elasticnet_aplha0.1_l1_ration0.5'] = enet_regul(0.1, 0.5)[0]

result.index = ['rmse_train', 'rmse_test','r2_train', 'r2_test']

result

Unnamed: 0,LR,Ridge_alpha_1,Ridge_alpha_4,Ridge_alpha_10,Ridge_alpha_50,Lasso_alpha_0.05,Elasticnet_aplha0.1_l1_ration0.5
rmse_train,0.693596,0.699991,0.728083,0.760625,0.82955,0.797737,0.810905
rmse_test,0.843946,0.834203,0.828061,0.832024,0.870445,0.849231,0.854086
r2_train,0.508838,0.499739,0.458781,0.40932,0.297418,0.350272,0.328646
r2_test,0.317954,0.333611,0.343388,0.337088,0.274451,0.309386,0.301466


### As a rule of thumb, the difference between rmse_train and rmse_test should be less than 5% before we accept a model as the best model
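
A small helper makes this check explicit for the models in the results table above. It is a sketch under one assumption: the 5% is read as the gap relative to the training RMSE, which the text does not state, and `rmse_gap` is a hypothetical name.

```python
def rmse_gap(rmse_train, rmse_test):
    """Gap between test and train RMSE as a fraction of the train RMSE."""
    return abs(rmse_test - rmse_train) / rmse_train

# (rmse_train, rmse_test) pairs copied from the results table
results = {
    "LR": (0.693596, 0.843946),
    "Ridge_alpha_1": (0.699991, 0.834203),
    "Ridge_alpha_4": (0.728083, 0.828061),
    "Ridge_alpha_10": (0.760625, 0.832024),
    "Ridge_alpha_50": (0.829550, 0.870445),
    "Lasso_alpha_0.05": (0.797737, 0.849231),
    "Elasticnet_alpha_0.1": (0.810905, 0.854086),
}

for name, (rmse_train, rmse_test) in results.items():
    gap = rmse_gap(rmse_train, rmse_test)
    print(f"{name}: gap = {gap:.1%}, within 5%: {gap < 0.05}")
```

Under this reading of the rule, only the Ridge model with alpha = 50 falls inside the 5% band, although it also has the lowest test $R^2$ in the table, so the gap should not be the only selection criterion.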

In [53]:
coef = pd.DataFrame()

coef['LR'] = lr.coef_
coef['Ridge_alpha_1'] = ridge_regul(1)[1].coef_
coef['Ridge_alpha_4'] = ridge_regul(4)[1].coef_
coef['Ridge_alpha_10'] = ridge_regul(10)[1].coef_
coef['Ridge_alpha_50'] = ridge_regul(50)[1].coef_
coef['Lasso_alpha_0.05'] = lasso_regul(0.05)[1].coef_
coef['Elasticnet_aplha0.1_l1_ration0.5'] = enet_regul(0.1, 0.5)[1].coef_
coef.index = xtrain.columns
coef

Unnamed: 0,LR,Ridge_alpha_1,Ridge_alpha_4,Ridge_alpha_10,Ridge_alpha_50,Lasso_alpha_0.05,Elasticnet_aplha0.1_l1_ration0.5
ODI-RUNS-S,0.278054,0.270084,0.25265,0.230974,0.171995,0.196924,0.187882
ODI-WKTS,0.290678,0.274438,0.244025,0.209058,0.118899,0.161583,0.145262
SIXERS,0.427187,0.409033,0.378808,0.346878,0.24555,0.325875,0.310919
AGE_2,-0.284312,-0.223564,-0.139212,-0.084529,-0.033794,-0.0,-0.0
AGE_3,-0.434347,-0.358855,-0.239101,-0.145352,-0.033179,-0.0,-0.0
COUNTRY_ENG,1.760034,1.296947,0.719654,0.374819,0.084466,0.09596,0.021868
COUNTRY_IND,0.6198,0.588974,0.518679,0.418405,0.177952,0.406633,0.327365
COUNTRY_WI,-0.919388,-0.731246,-0.450653,-0.25364,-0.06015,-0.0,-0.0


Here Lasso and Elastic Net shrink the coefficients of AGE_2, AGE_3 and COUNTRY_WI to exactly zero, which removes those features from the model.
Ridge Regression does not set the $\beta$ coefficients to zero; it shrinks them close to zero, but never exactly to zero.

Because they drop features while fitting, Lasso and Elastic Net are embedded methods of feature selection.
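
The difference can be illustrated in the special case of an orthonormal design, where each penalty acts on the least-squares coefficients one at a time: the $L_2$ penalty rescales them, while the $L_1$ penalty soft-thresholds them. The sketch below assumes that special case (the IPL features are not orthonormal), and the function names and coefficient values are hypothetical.

```python
import numpy as np

def ridge_shrink(b_ols, lam):
    """L2 penalty with orthonormal features: every coefficient is divided by (1 + lam)."""
    return b_ols / (1.0 + lam)

def soft_threshold(b_ols, threshold):
    """L1 penalty with orthonormal features: move each coefficient toward zero by
    `threshold` and clip at zero (threshold = lam / 2 for SSE + lam * sum(|b|))."""
    return np.sign(b_ols) * np.maximum(np.abs(b_ols) - threshold, 0.0)

# Hypothetical least-squares coefficients
b_ols = np.array([0.43, -0.28, 0.05])

print(ridge_shrink(b_ols, lam=1.0))          # all three stay non-zero
print(soft_threshold(b_ols, threshold=0.1))  # the smallest one becomes exactly zero
```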

In [54]:
from sklearn.model_selection import GridSearchCV

In [55]:
ridge = Ridge()
param = { 'alpha': [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 2, 5,8,10, 50, 60, 70, 90, 100]}

grid = GridSearchCV(ridge, param_grid=param, cv=5, scoring='neg_mean_squared_error')

In [56]:
hyp_rid = grid.fit(xtrain, ytrain)

hyp_rid.best_params_

{'alpha': 2}

In [57]:
pd.DataFrame(hyp_rid.cv_results_)

Unnamed: 0,mean_fit_time,std_fit_time,mean_score_time,std_score_time,param_alpha,params,split0_test_score,split1_test_score,split2_test_score,split3_test_score,split4_test_score,mean_test_score,std_test_score,rank_test_score
0,0.001234,0.000865,0.001553,0.001227,1e-06,{'alpha': 1e-06},-0.810468,-0.723142,-0.473926,-1.068505,-0.589832,-0.733175,0.203115,11
1,0.001638,0.003275,0.000802,0.001605,1e-05,{'alpha': 1e-05},-0.810469,-0.723142,-0.473927,-1.068501,-0.589831,-0.733174,0.203114,10
2,0.0,0.0,0.0,0.0,0.0001,{'alpha': 0.0001},-0.810483,-0.723141,-0.473936,-1.068457,-0.589823,-0.733168,0.203099,9
3,0.00159,0.000918,0.000495,0.000448,0.001,{'alpha': 0.001},-0.810617,-0.723131,-0.474032,-1.068022,-0.589739,-0.733108,0.202953,8
4,0.0,0.0,0.0,0.0,0.01,{'alpha': 0.01},-0.811957,-0.723034,-0.474988,-1.063703,-0.58891,-0.732518,0.201509,7
5,0.00313,0.006261,0.0,0.0,0.1,{'alpha': 0.1},-0.825019,-0.722108,-0.484253,-1.023577,-0.581015,-0.727194,0.188588,5
6,0.003123,0.006247,0.0,0.0,1.0,{'alpha': 1},-0.927961,-0.715779,-0.553258,-0.800806,-0.528546,-0.70527,0.150444,2
7,0.0,0.0,0.0,0.0,2.0,{'alpha': 2},-1.00412,-0.712045,-0.597888,-0.705545,-0.497906,-0.703501,0.16963,1
8,0.0,0.0,0.0,0.0,5.0,{'alpha': 5},-1.133791,-0.708364,-0.653994,-0.61633,-0.456848,-0.713866,0.226064,3
9,0.003138,0.006277,0.0,0.0,8.0,{'alpha': 8},-1.206578,-0.709138,-0.671067,-0.589579,-0.439978,-0.723268,0.258704,4


GridSearchCV ranks the candidates by the `mean_test_score` column. It is better to also consider the `std_test_score` column when identifying the best value of alpha.
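
One possible way to take the spread into account is to rank the candidates by the mean score minus one standard deviation. This criterion is an assumption chosen for illustration, not something GridSearchCV does itself, and the sketch uses only the ten rows shown in the output above.

```python
import pandas as pd

# alpha, mean_test_score and std_test_score copied from the cv_results_ rows shown above
cv = pd.DataFrame({
    "alpha": [1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 2.0, 5.0, 8.0],
    "mean_test_score": [-0.733175, -0.733174, -0.733168, -0.733108, -0.732518,
                        -0.727194, -0.705270, -0.703501, -0.713866, -0.723268],
    "std_test_score": [0.203115, 0.203114, 0.203099, 0.202953, 0.201509,
                       0.188588, 0.150444, 0.169630, 0.226064, 0.258704],
})

# Pessimistic score: penalize candidates whose folds disagree strongly
cv["pessimistic_score"] = cv["mean_test_score"] - cv["std_test_score"]

best_by_mean = cv.loc[cv["mean_test_score"].idxmax(), "alpha"]
best_pessimistic = cv.loc[cv["pessimistic_score"].idxmax(), "alpha"]

print("best by mean score:", best_by_mean)               # 2.0
print("best by mean minus one std:", best_pessimistic)   # 1.0
```

Here alpha = 1 has almost the same mean score as alpha = 2 but a smaller spread across folds, so the two criteria disagree.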


## Gradient Descent

* Vanilla Gradient Descent
* Stochastic Gradient Descent
* Mini-batch Gradient Descent
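
The three variants differ only in how many rows are used for each update. The following is a minimal NumPy sketch for linear regression, not the implementation used by scikit-learn; the function name, learning rate, epoch count and synthetic data are assumptions.

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.05, epochs=300, seed=0):
    """Fit y ~ w[0] + X @ w[1:] by minimizing the mean squared error.

    batch_size = len(X) gives vanilla (full-batch) gradient descent,
    batch_size = 1 gives stochastic gradient descent,
    anything in between gives mini-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.column_stack([np.ones(n), X])  # prepend a column of ones for the intercept
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the rows every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            error = Xb[idx] @ w - y[idx]
            grad = Xb[idx].T @ error / len(idx)  # gradient of half the batch MSE
            w -= lr * grad
    return w

# Noiseless synthetic data (assumption): y = 1 + 3*x1 - 2*x2
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1]

w_vanilla = gradient_descent(X, y, batch_size=len(X))
w_sgd = gradient_descent(X, y, batch_size=1)
w_mini = gradient_descent(X, y, batch_size=32)
print(w_vanilla, w_sgd, w_mini)
```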

In [58]:
from sklearn.linear_model import SGDRegressor

def sgdr_regul(inp):  # inp: random seed (random_state), not a penalty constant
    sgdr = SGDRegressor(random_state=inp)

    sgdr.fit(xtrain, ytrain)

    ypred_train = sgdr.predict(xtrain)
    ypred_test = sgdr.predict(xtest)

    r2_train = r2_score(ytrain, ypred_train)
    r2_test = r2_score(ytest, ypred_test)

    rmse_train = np.sqrt(mean_squared_error(ytrain, ypred_train))
    rmse_test = np.sqrt(mean_squared_error(ytest, ypred_test))

    res_r = [rmse_train, rmse_test,r2_train, r2_test]
    
    return res_r, sgdr

sgdr_regul(10)

([0.7708853671680184,
  0.8354483479623287,
  0.3932761643849171,
  0.33162029729044706],
 SGDRegressor(random_state=10))

In [59]:
coef['SGDR_r_state10'] = sgdr_regul(10)[1].coef_

coef

Unnamed: 0,LR,Ridge_alpha_1,Ridge_alpha_4,Ridge_alpha_10,Ridge_alpha_50,Lasso_alpha_0.05,Elasticnet_aplha0.1_l1_ration0.5,SGDR_r_state10
ODI-RUNS-S,0.278054,0.270084,0.25265,0.230974,0.171995,0.196924,0.187882,0.231484
ODI-WKTS,0.290678,0.274438,0.244025,0.209058,0.118899,0.161583,0.145262,0.224366
SIXERS,0.427187,0.409033,0.378808,0.346878,0.24555,0.325875,0.310919,0.373653
AGE_2,-0.284312,-0.223564,-0.139212,-0.084529,-0.033794,-0.0,-0.0,-0.095132
AGE_3,-0.434347,-0.358855,-0.239101,-0.145352,-0.033179,-0.0,-0.0,-0.135261
COUNTRY_ENG,1.760034,1.296947,0.719654,0.374819,0.084466,0.09596,0.021868,0.231712
COUNTRY_IND,0.6198,0.588974,0.518679,0.418405,0.177952,0.406633,0.327365,0.416394
COUNTRY_WI,-0.919388,-0.731246,-0.450653,-0.25364,-0.06015,-0.0,-0.0,-0.179015


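With its default settings, `SGDRegressor` produces shrunken $\beta$ coefficients compared with `LinearRegression`: it applies an $L_2$ penalty by default (`penalty='l2'`, `alpha=0.0001`), and the iterative fit may stop before it reaches the least-squares solution.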

## Bayesian Optimization Using Optuna

The following methods can be used to identify the $\lambda$ value:
* Random Search CV
* Grid Search CV
* Bayesian Optimization
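
Random search is not demonstrated in this notebook, so here is a minimal sketch of it with `RandomizedSearchCV`. The synthetic data from `make_regression`, the search range and the number of iterations are assumptions chosen for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data (assumption); replace with the real training split
X, y = make_regression(n_samples=100, n_features=8, noise=10.0, random_state=0)

# Sample 20 alpha values on a log scale instead of trying a fixed grid
search = RandomizedSearchCV(
    Ridge(),
    param_distributions={"alpha": loguniform(1e-4, 1e2)},
    n_iter=20,
    cv=5,
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
```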

In [60]:
def objective(trial):
    # Sample alpha on a log scale between 1e-15 and 100
    alpha = trial.suggest_float('alpha', 1e-15, 100, log=True)
    ridge = Ridge(alpha=alpha)

    # Cross-validate on the training split only; X and y must have the same number of rows
    score = np.mean(cross_val_score(ridge, xtrain, ytrain, cv=3, scoring='neg_mean_squared_error'))
    return score

In [62]:
# Perform Bayesian Optimization with Optuna
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20, n_jobs=-1)

In [65]:
print("Best Parameters: ", study.best_params)

In [66]:
best_model = Ridge(alpha=study.best_params['alpha'])
best_model.fit(xtrain, ytrain)