In [1]:
pip install optuna 

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [2]:
import optuna 

In [42]:
optuna.__version__

'3.5.0'

In [3]:
pip list 

Package                       Version
----------------------------- ---------------
aiobotocore                   2.4.2
aiofiles                      22.1.0
aiohttp                       3.8.3
aiohttp-retry                 2.8.3
aioitertools                  0.7.1
aiosignal                     1.2.0
aiosqlite                     0.18.0
alabaster                     0.7.12
alembic                       1.12.1
altair                        5.2.0
amqp                          5.2.0
anaconda-catalogs             0.2.0
anaconda-client               1.11.3
anaconda-navigator            2.4.2
anaconda-project              0.11.1
annotated-types               0.6.0
antlr4-python3-runtime        4.9.3
anyio                         3.5.0
appdirs                       1.4.4
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arrow                         1.2.3
astroid                       2.14.2
astropy                       5.1
asttokens                     2.0.5
async-tim

**Optuna** is a hyperparameter optimization framework that helps in automating the process of tuning hyperparameters for machine learning models. Hyperparameter tuning is crucial for improving the performance of a model, as selecting the right set of hyperparameters can significantly impact the model's accuracy and generalization.

Here are some reasons why Optuna is commonly used for hyperparameter tuning:

- Efficient Search Space Exploration: Optuna provides an efficient way to explore the hyperparameter search space. It uses a tree-structured Parzen Estimator (TPE) algorithm, which balances exploration and exploitation to efficiently find promising regions in the hyperparameter space.

- Flexibility: Optuna is versatile and supports a wide range of machine learning libraries and frameworks, making it easy to integrate with different models and algorithms.

 - Objective Function Optimization: Optuna simplifies the process of defining and optimizing the objective function. Users only need to define the objective function to minimize or maximize, and Optuna takes care of the rest.

- Parallel and Distributed Computing: Optuna supports parallel and distributed optimization, allowing users to leverage multiple computing resources for faster hyperparameter search.

- Integration with ML Frameworks: Optuna integrates seamlessly with popular machine learning frameworks like scikit-learn, TensorFlow, PyTorch, and others, making it easy to incorporate hyperparameter tuning into existing workflows.

- Automatic Logging and Visualization: Optuna automatically logs and visualizes the hyperparameter search process. This includes visualizations of the search space, intermediate results, and convergence plots, providing insights into the optimization process.

- Pruning Mechanism: Optuna provides a pruning mechanism that allows early stopping of unpromising trials, saving computational resources and speeding up the hyperparameter search.

- Active Community: Optuna has an active and supportive community. This means that users can find resources, tutorials, and discussions that can help them effectively use Optuna for hyperparameter tuning.

1. Search Strategy:

         - Optuna: Uses a tree-structured Parzen Estimator (TPE) algorithm. TPE is a Bayesian optimization algorithm that efficiently balances exploration and exploitation to find promising regions in the hyperparameter space.
         - Random Search: Samples hyperparameters randomly from the defined search space.

2. Exploration vs. Exploitation:

        - Optuna: Focuses on exploring the hyperparameter space efficiently based on the information gathered during the optimization process. It adapts the search based on past trials to find the optimal hyperparameters.
        - Random Search: Does not use any information from past trials, and each hyperparameter configuration is selected randomly. It relies more on chance to find good hyperparameter combinations.

3. Efficiency:

         - Optuna: Generally more efficient than random search as it leverages a probabilistic model to guide the search towards promising regions.
         - Random Search: Can be less efficient because it doesn't adapt its search strategy based on the outcomes of previous trials.

4. Parallelization:

         - Optuna: Supports parallel and distributed optimization, allowing concurrent evaluation of multiple hyperparameter configurations, which can speed up the search.
         - Random Search: Can be parallelized by independently evaluating different sets of hyperparameters, but the method itself does not inherently support parallelization.

5. Early Stopping:

          - Optuna: Implements a pruning mechanism that allows early stopping of unpromising trials during the optimization process, saving computational resources.
          - Random Search: Typically does not have built-in mechanisms for early stopping.

6. Deterministic vs. Stochastic:

           - Optuna: Although the TPE algorithm has a stochastic element, the optimization process is more deterministic and guided by the probabilistic model.
           - Random Search: Fully stochastic, as each set of hyperparameters is chosen randomly without considering information from previous trials.

In [44]:
!pip install xgboost

Defaulting to user installation because normal site-packages is not writeable


In [50]:
xgb.__version__

'2.0.3'

In [47]:
import pandas as pd 
import numpy as np 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
import xgboost as xgb


In [51]:
pip list

Package                       Version
----------------------------- ---------------
aiobotocore                   2.4.2
aiofiles                      22.1.0
aiohttp                       3.8.3
aiohttp-retry                 2.8.3
aioitertools                  0.7.1
aiosignal                     1.2.0
aiosqlite                     0.18.0
alabaster                     0.7.12
alembic                       1.12.1
altair                        5.2.0
amqp                          5.2.0
anaconda-catalogs             0.2.0
anaconda-client               1.11.3
anaconda-navigator            2.4.2
anaconda-project              0.11.1
annotated-types               0.6.0
antlr4-python3-runtime        4.9.3
anyio                         3.5.0
appdirs                       1.4.4
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arrow                         1.2.3
astroid                       2.14.2
astropy                       5.1
asttokens                     2.0.5
async-tim

In [6]:
data=pd.read_csv("https://raw.githubusercontent.com/Chandrakant817/Admission-Prediction/main/Admission_Prediction.csv")

In [7]:
data.head()

Unnamed: 0,Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit
0,1,337.0,118.0,4.0,4.5,4.5,9.65,1,0.92
1,2,324.0,107.0,4.0,4.0,4.5,8.87,1,0.76
2,3,,104.0,3.0,3.0,3.5,8.0,1,0.72
3,4,322.0,110.0,3.0,3.5,2.5,8.67,1,0.8
4,5,314.0,103.0,2.0,2.0,3.0,8.21,0,0.65


In [8]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 9 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Serial No.         500 non-null    int64  
 1   GRE Score          485 non-null    float64
 2   TOEFL Score        490 non-null    float64
 3   University Rating  485 non-null    float64
 4   SOP                500 non-null    float64
 5   LOR                500 non-null    float64
 6   CGPA               500 non-null    float64
 7   Research           500 non-null    int64  
 8   Chance of Admit    500 non-null    float64
dtypes: float64(7), int64(2)
memory usage: 35.3 KB


In [9]:
data.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Serial No.,500.0,250.5,144.481833,1.0,125.75,250.5,375.25,500.0
GRE Score,485.0,316.558763,11.274704,290.0,308.0,317.0,325.0,340.0
TOEFL Score,490.0,107.187755,6.112899,92.0,103.0,107.0,112.0,120.0
University Rating,485.0,3.121649,1.14616,1.0,2.0,3.0,4.0,5.0
SOP,500.0,3.374,0.991004,1.0,2.5,3.5,4.0,5.0
LOR,500.0,3.484,0.92545,1.0,3.0,3.5,4.0,5.0
CGPA,500.0,8.57644,0.604813,6.8,8.1275,8.56,9.04,9.92
Research,500.0,0.56,0.496884,0.0,0.0,1.0,1.0,1.0
Chance of Admit,500.0,0.72174,0.14114,0.34,0.63,0.72,0.82,0.97


In [10]:
data.isnull().sum()

Serial No.            0
GRE Score            15
TOEFL Score          10
University Rating    15
SOP                   0
LOR                   0
CGPA                  0
Research              0
Chance of Admit       0
dtype: int64

In [11]:
data['GRE Score']=data['GRE Score'].fillna(data['GRE Score'].median())
data['TOEFL Score']=data['TOEFL Score'].fillna(data['TOEFL Score'].median())
data['University Rating']=data['University Rating'].fillna(data['University Rating'].median())

In [12]:
data.isnull().sum()

Serial No.           0
GRE Score            0
TOEFL Score          0
University Rating    0
SOP                  0
LOR                  0
CGPA                 0
Research             0
Chance of Admit      0
dtype: int64

In [13]:
data.head()

Unnamed: 0,Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit
0,1,337.0,118.0,4.0,4.5,4.5,9.65,1,0.92
1,2,324.0,107.0,4.0,4.0,4.5,8.87,1,0.76
2,3,317.0,104.0,3.0,3.0,3.5,8.0,1,0.72
3,4,322.0,110.0,3.0,3.5,2.5,8.67,1,0.8
4,5,314.0,103.0,2.0,2.0,3.0,8.21,0,0.65


In [14]:
X = data.drop(['Serial No.','Chance of Admit'],axis=1)

In [15]:
y= data['Chance of Admit']

In [16]:
y.head() 

0    0.92
1    0.76
2    0.72
3    0.80
4    0.65
Name: Chance of Admit, dtype: float64

In [52]:
X_train , X_test , y_train , y_test = train_test_split(X,y,test_size=0.25,random_state=25)

In [53]:
sc = StandardScaler()

In [19]:
X_train_sc = sc.fit_transform(X_train)

In [20]:
X_test_sc=sc.transform(X_test)

In [21]:
for col in X.columns:
    print(col)
    print(X['GRE Score'].apply (lambda x : isinstance(x,str)).sum())

GRE Score
0
TOEFL Score
0
University Rating
0
SOP
0
LOR
0
CGPA
0
Research
0


In [22]:
def objective(trail,data=X,target=y):
    train_x,test_x,train_y,test_y=train_test_split(data,target,test_size=0.25,random_state=25)
    param={
      'tree_method': 'gpu_hist',
      'lambda':trail.suggest_loguniform('lambda' , 1e-4,10.0),
      'alpha' :trail.suggest_loguniform('alpha' , 1e-4 , 10.0),
      'colsample_bytree' :trail.suggest_categorical('colsample_bytree',[.1,.2,.3,.4,.5,.6,.7,.8,.9,1]),
      'subsample' :trail.suggest_categorical('subsample',[.1,.2,.3,.4,.5,.6,.7,.8,.9,1]),
      'learning_rate' : trail.suggest_categorical('learning_rate' , [.00001,.0003,.008,.02,.01,1,8]),
      'n_estimators' :300,
      'max_depth' :trail.suggest_categorical('max_depth',[3,4,5,6,7,8,9,10,11,12]),
      'random_state' :trail.suggest_categorical('random_state',[10,20,30,2000,3454,243123]),
      'min_child_weight' :trail.suggest_int('min_child_weight',1,200)
      }

    model=xgb.XGBRegressor(**param)
    model.fit(train_x,train_y,eval_set=[(test_x,test_y)],verbose=True)
    pred=model.predict(test_x)
    mse=mean_squared_error(test_y,pred)
    return mse


In [67]:
def custom_eval_metric(y_true, y_pred):
    try:
        mse = mean_squared_error(y_true, y_pred)
        return mse
    except ValueError as e:
        print(f"Error calculating metric: {e}")
        return np.inf  # Return a high value for problematic cases

def objective(trial, data=X, target=y):
    X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.25, random_state=25)
    
    param = {
        'tree_method': 'auto',
        'lambda': trial.suggest_loguniform('lambda', 1e-4, 10.0),
        'alpha': trial.suggest_loguniform('alpha', 1e-4, 10.0),
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [.1, .2, .3, .4, .5, .6, .7, .8, .9, 1]),
        'subsample': trial.suggest_categorical('subsample', [.1, .2, .3, .4, .5, .6, .7, .8, .9, 1]),
        'learning_rate': trial.suggest_categorical('learning_rate', [.00001, .0003, .008, .02, .01, 1, 8]),
        'n_estimators': 300,
        'max_depth': trial.suggest_categorical('max_depth', [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),
        'random_state': trial.suggest_categorical('random_state', [10, 20, 30, 2000, 3454, 243123]),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 200)
    }
    
    model = xgb.XGBRegressor(**param)
    
    try:
        model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=True, eval_metric=custom_eval_metric)
        mse = mean_squared_error(y_test, model.predict(X_test))
        return mse
    except xgb.core.XGBoostError as e:
        print(f"XGBoostError during training: {e}")
        return np.inf
    except Exception as e:
        print(f"Other error during training: {e}")
        return np.inf

In [68]:
find_params=optuna.create_study()
find_params.optimize(objective,n_trials=10)
find_params.best_trial.params

[I 2024-02-05 12:31:37,889] A new study created in memory with name: no-name-7edeacca-dc82-4f7c-a869-5547f00e0355
  'lambda': trial.suggest_loguniform('lambda', 1e-4, 10.0),
  'alpha': trial.suggest_loguniform('alpha', 1e-4, 10.0),
[I 2024-02-05 12:31:37,901] Trial 0 finished with value: inf and parameters: {'lambda': 0.01703304318339626, 'alpha': 0.09455739729085484, 'colsample_bytree': 0.2, 'subsample': 0.4, 'learning_rate': 0.0003, 'max_depth': 4, 'random_state': 20, 'min_child_weight': 21}. Best is trial 0 with value: inf.
[I 2024-02-05 12:31:37,914] Trial 1 finished with value: inf and parameters: {'lambda': 0.7596478131140674, 'alpha': 4.736131298783449, 'colsample_bytree': 0.8, 'subsample': 0.8, 'learning_rate': 0.0003, 'max_depth': 7, 'random_state': 2000, 'min_child_weight': 17}. Best is trial 0 with value: inf.
[I 2024-02-05 12:31:37,927] Trial 2 finished with value: inf and parameters: {'lambda': 0.00028107613386610706, 'alpha': 0.0004046084185358585, 'colsample_bytree': 0.8

Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array-like, got <class 'xgboost.core.QuantileDMatrix'>
Other error during training: Expected sequence or array

{'lambda': 0.01703304318339626,
 'alpha': 0.09455739729085484,
 'colsample_bytree': 0.2,
 'subsample': 0.4,
 'learning_rate': 0.0003,
 'max_depth': 4,
 'random_state': 20,
 'min_child_weight': 21}

In [69]:
find_params.trials_dataframe()

Unnamed: 0,number,value,datetime_start,datetime_complete,duration,params_alpha,params_colsample_bytree,params_lambda,params_learning_rate,params_max_depth,params_min_child_weight,params_random_state,params_subsample,state
0,0,inf,2024-02-05 12:31:37.889640,2024-02-05 12:31:37.900642,0 days 00:00:00.011002,0.094557,0.2,0.017033,0.0003,4,21,20,0.4,COMPLETE
1,1,inf,2024-02-05 12:31:37.901640,2024-02-05 12:31:37.913651,0 days 00:00:00.012011,4.736131,0.8,0.759648,0.0003,7,17,2000,0.8,COMPLETE
2,2,inf,2024-02-05 12:31:37.914649,2024-02-05 12:31:37.926164,0 days 00:00:00.011515,0.000405,0.8,0.000281,0.01,12,161,30,1.0,COMPLETE
3,3,inf,2024-02-05 12:31:37.927164,2024-02-05 12:31:37.940169,0 days 00:00:00.013005,0.0001,1.0,0.070412,0.008,5,195,30,0.1,COMPLETE
4,4,inf,2024-02-05 12:31:37.941167,2024-02-05 12:31:37.953169,0 days 00:00:00.012002,0.016321,0.5,0.000794,0.0003,8,186,10,0.5,COMPLETE
5,5,inf,2024-02-05 12:31:37.954169,2024-02-05 12:31:37.966168,0 days 00:00:00.011999,0.000614,1.0,0.000168,0.0003,9,191,243123,1.0,COMPLETE
6,6,inf,2024-02-05 12:31:37.967169,2024-02-05 12:31:37.979167,0 days 00:00:00.011998,2.504939,0.1,0.001834,1e-05,6,19,3454,0.8,COMPLETE
7,7,inf,2024-02-05 12:31:37.980164,2024-02-05 12:31:37.991164,0 days 00:00:00.011000,0.001788,0.2,0.000778,0.008,4,126,3454,0.2,COMPLETE
8,8,inf,2024-02-05 12:31:37.992164,2024-02-05 12:31:38.003168,0 days 00:00:00.011004,0.000115,0.3,0.683067,8.0,6,182,10,0.3,COMPLETE
9,9,inf,2024-02-05 12:31:38.004168,2024-02-05 12:31:38.015175,0 days 00:00:00.011007,0.003162,0.9,1.264717,8.0,8,195,3454,0.3,COMPLETE


In [70]:
optuna.visualization.plot_optimization_history(find_params)

which represents the trail

In [71]:
optuna.visualization.plot_slice(find_params)

[W 2024-02-05 12:42:41,147] Trial 0 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,148] Trial 1 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,148] Trial 2 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,148] Trial 3 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,149] Trial 4 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,149] Trial 5 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,149] Trial 6 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,150] Trial 7 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,150] Trial 8 is omitted in visualization because its objective value is inf or nan.
[W 2024-02-05 12:42:41,150] Trial 9 i

In [25]:
!python --version

Python 3.11.3


In [26]:
!pip show optuna

Name: optuna
Version: 3.5.0
Summary: A hyperparameter optimization framework
Home-page: 
Author: Takuya Akiba
Author-email: 
License: MIT License

Copyright (c) 2018 Preferred Networks, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT H

In [58]:
best_params = {'lambda': 0.0045037035167713214,
 'alpha': 0.024637878711476057,
 'colsample_bytree': 0.6,
 'subsample': 0.9,
 'learning_rate': 0.01,
 'max_depth': 6,
 'random_state': 30,
 'min_child_weight': 95}

In [59]:
model = xgb.XGBRegressor(**best_params)

In [60]:
model.fit(X_train,y_train)

In [61]:
y_pred = model.predict(X_test)

In [62]:
y_pred

array([0.6474818 , 0.7419816 , 0.6403743 , 0.81478786, 0.81478786,
       0.8023725 , 0.64609414, 0.81478786, 0.78821045, 0.6981354 ,
       0.8123957 , 0.77724135, 0.7623051 , 0.6389285 , 0.6441678 ,
       0.7734029 , 0.63924205, 0.6932725 , 0.636761  , 0.7126943 ,
       0.81478786, 0.63924205, 0.81478786, 0.77327436, 0.81478786,
       0.78957194, 0.74692595, 0.69695187, 0.80445594, 0.6387594 ,
       0.6389285 , 0.636761  , 0.6839502 , 0.80753225, 0.6512642 ,
       0.81478786, 0.7010654 , 0.7090469 , 0.7205688 , 0.76331824,
       0.636761  , 0.74506277, 0.69549644, 0.81478786, 0.6405434 ,
       0.81478786, 0.63924205, 0.81478786, 0.81478786, 0.65441793,
       0.67661387, 0.81478786, 0.6387594 , 0.81478786, 0.76538575,
       0.6395858 , 0.81478786, 0.63837594, 0.80753225, 0.6480065 ,
       0.636761  , 0.81478786, 0.6816219 , 0.64063334, 0.7113567 ,
       0.6555364 , 0.6886481 , 0.636761  , 0.640857  , 0.7540566 ,
       0.81478786, 0.7781899 , 0.81478786, 0.6945416 , 0.67661

In [63]:
#r2score=[0,1]
from sklearn.metrics import r2_score

In [64]:
r2_score(y_test,y_pred)

0.6008779308142105

In [None]:
find_params.trials_dataframe()

In [None]:
optuna.visualization.plot_optimization_history(find_params)

In [None]:
optuna.visualization.plot_slice(find_params)

In [None]:
optuna.visualization.plot_contour(find_params,params=['alpha','lambda'])

In [None]:
model=xgb.XGBRegressor(**best_params)

In [None]:
model.fit(X_train,y_train)

In [None]:
y_pred=model.predict(X_test)

In [None]:
r2score=[0,1]
from sklearn.metrics import r2_score

r2_score(y_test,y_pred)

In [None]:
from sklearn.ensemble import RandomForestRegressor
model2=RandomForestRegressor()
model2.fit(X_train,y_train)
y_pred2=model2.predict(X_test)
r2_score(y_test,y_pred2)

In [None]:
xgboost intution--next class
difference bwetween all the techniques

In [None]:
you need to take one more data regression and explore more and more about this optuna
take classificaiton dataset and check with the accracy with different different hyperparameter(optuna)
write atleast 5 differences between optuna and grid search and random search