# Task 29: Hyperparameter Tuning Techniques

When working on a machine learning project, choosing the right model and tuning its hyperparameters 
both affect performance. Common scikit-learn models include Linear Regression for predicting continuous 
values, Logistic Regression for binary or multiclass classification, and Decision Trees, Random Forests, 
SVMs, and KNN for classification and regression tasks. Hyperparameter tuning techniques such as Grid 
Search, Random Search, and Bayesian Optimization search for the parameter values that give the best 
cross-validated score for these models. Apply these techniques and record the results to compare the 
effect of different models and hyperparameters on your dataset.
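
The notebook itself only uses Grid Search. As a rough illustration of Random Search, the sketch below uses `RandomizedSearchCV` on an SVM. It assumes scikit-learn's bundled Iris data in place of `Iris.csv`; the sampling range for `C` and the choice of `n_iter=10` are illustrative assumptions, not values from this notebook.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

# Assumption: the bundled Iris data stands in for the Iris.csv file.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Random search draws n_iter candidates instead of trying every combination.
param_distributions = {
    "C": loguniform(1e-2, 1e2),   # continuous range, sampled on a log scale
    "kernel": ["linear", "rbf"],  # lists are sampled uniformly
}
search = RandomizedSearchCV(
    SVC(),
    param_distributions,
    n_iter=10,        # number of sampled candidates (illustrative)
    cv=5,
    random_state=42,  # makes the sampled candidates reproducible
)
search.fit(X_train, y_train)

print("Best Params:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```

Unlike a grid, the cost here is fixed by `n_iter`, so continuous ranges can be explored without the number of fits growing with the grid size.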

### Importing necessary libraries and dataset

In [15]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

file_path = r'C:\Users\Huawei\Desktop\Iris.csv'
df = pd.read_csv(file_path)

### Map species to numbers

In [26]:
species_mapping = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
# Convert the categorical labels to integers so the models can process them.
df['Species'] = df['Species'].map(species_mapping)
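
`Series.map` returns `NaN` for any value missing from the dictionary, which also happens if the mapping cell is run a second time on the already-converted column. A minimal check, using a hypothetical three-row frame with a deliberate typo in the last label, could look like this:

```python
import pandas as pd

# Hypothetical mini-frame; the last label is misspelled on purpose.
sample = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-virginica", "Iris-virginca"]}
)
species_mapping = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

mapped = sample["Species"].map(species_mapping)

# Labels that were not found in the mapping come back as NaN.
unmapped = sample.loc[mapped.isna(), "Species"].unique().tolist()
print("Unmapped labels:", unmapped)
```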

### Define features and target

In [3]:
X = df[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
y = df['Species']

### Split dataset

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
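
The split above is purely random, so the three species need not be equally represented in the test set. One possible variation, sketched here with scikit-learn's bundled Iris data as a stand-in for `Iris.csv`, is to pass `stratify=y` so that each class keeps its share in both parts:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: the bundled Iris data stands in for the Iris.csv file.
X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

test_counts = np.bincount(y_test)  # samples per class in the test set
print("Test samples per class:", test_counts)
```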

### Define models and parameters

In [35]:
models = {
    'Logistic Regression': {
        'model': LogisticRegression(),
        # C is the inverse regularization strength (smaller C means stronger regularization);
        # solver lists the optimization algorithms to try.
        'params': {'C': [0.1, 1, 10], 'solver': ['lbfgs', 'liblinear']}
    },
    'Decision Tree': {
        'model': DecisionTreeClassifier(),
        'params': {'max_depth': [3, 5, 7]}
    },
    'Random Forest': {
        'model': RandomForestClassifier(),
        # 'sqrt' uses the square root of the number of features; None uses all of them.
        # ('auto' meant 'sqrt' for classifiers and was removed in scikit-learn 1.3.)
        'params': {'n_estimators': [10, 50, 100], 'max_features': ['sqrt', None]}
    },
    'SVM': {
        'model': SVC(),
        # kernel lists the kernel types to try: linear and RBF.
        'params': {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
    },
    'KNN': {
        'model': KNeighborsClassifier(),
        'params': {'n_neighbors': [3, 5, 7]}
    }
}
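
To get a feel for how much work Grid Search does with these grids, the sketch below counts the candidate combinations with `ParameterGrid` and multiplies by the number of folds. The grids are copied by size from the dictionary above (for Random Forest only the number of `max_features` options matters here), and `cv = 5` is assumed to match the search that follows.

```python
from sklearn.model_selection import ParameterGrid

# Grids with the same sizes as in the models dictionary.
param_grids = {
    "Logistic Regression": {"C": [0.1, 1, 10], "solver": ["lbfgs", "liblinear"]},
    "Decision Tree": {"max_depth": [3, 5, 7]},
    "Random Forest": {"n_estimators": [10, 50, 100], "max_features": ["sqrt", None]},
    "SVM": {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    "KNN": {"n_neighbors": [3, 5, 7]},
}
cv = 5  # number of cross-validation folds

# Each candidate is fitted once per fold (the final refit is not counted).
fit_counts = {
    name: len(ParameterGrid(grid)) * cv for name, grid in param_grids.items()
}
total_fits = sum(fit_counts.values())

for name, count in fit_counts.items():
    print(f"{name}: {count} fits")
print("Total:", total_fits)
```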

### Perform Grid Search for classification models

In [31]:
for name, model_info in models.items():
    clf = GridSearchCV(model_info['model'], model_info['params'], cv=5)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
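
The warning comes from the `lbfgs` solver of Logistic Regression stopping before it converged. Following the two remedies the message suggests, one possible fix is sketched below: scale the features inside a pipeline and raise `max_iter`. The value `max_iter=1000`, the bundled Iris data, and the reduced grid are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled Iris data stands in for the Iris.csv file.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling happens inside the pipeline, so each CV fold is scaled
# using statistics from its own training part only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# make_pipeline names each step after its lowercased class name.
param_grid = {"logisticregression__C": [0.1, 1, 10]}

clf = GridSearchCV(pipeline, param_grid, cv=5)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print("Best Params:", clf.best_params_)
print("Test accuracy:", test_accuracy)
```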

### Evaluate model

In [32]:
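# This cell runs after the loop has finished, so name, clf, and y_pred
# refer only to the last model in the dictionary (KNN).
accuracy = accuracy_score(y_test, y_pred)
print(name)
print("Best Params:", clf.best_params_)
print("Accuracy:", accuracy * 100)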



KNN
Best Params: {'n_neighbors': 3}
Accuracy: 100.0
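
Only the last model appears in the output because the evaluation runs once, after the loop. A possible way to report every model is to store the scores inside the loop, as in the sketch below. It uses two of the models to stay short, the bundled Iris data instead of `Iris.csv`, and a hypothetical `results` dictionary; `random_state=42` on the tree is added here only for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled Iris data stands in for the Iris.csv file.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Decision Tree": {
        "model": DecisionTreeClassifier(random_state=42),
        "params": {"max_depth": [3, 5, 7]},
    },
    "KNN": {
        "model": KNeighborsClassifier(),
        "params": {"n_neighbors": [3, 5, 7]},
    },
}

results = {}  # hypothetical container: one entry per model
for name, model_info in models.items():
    clf = GridSearchCV(model_info["model"], model_info["params"], cv=5)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    # Evaluate inside the loop so no model is overwritten by the next one.
    results[name] = {
        "best_params": clf.best_params_,
        "cv_accuracy": clf.best_score_,
        "test_accuracy": accuracy_score(y_test, y_pred),
    }

for name, result in results.items():
    print(name, result)
```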


### Linear Regression

In [33]:
linear_reg = LinearRegression()
linear_reg.fit(X_train, y_train)
y_pred_lr = linear_reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred_lr)
print('Linear Regression:')
print('Mean Squared Error: ', mse)


Linear Regression:
Mean Squared Error:  0.03723364456197502
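
The MSE above is computed on the class labels 0, 1, and 2 treated as numbers, so it is not an accuracy. If a number comparable with the classifiers is wanted, one rough option is to round each prediction to the nearest valid label first. The sketch below assumes the bundled Iris data in place of `Iris.csv`; the rounding step is an illustration, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: the bundled Iris data stands in for the Iris.csv file.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

linear_reg = LinearRegression()
linear_reg.fit(X_train, y_train)
y_pred_lr = linear_reg.predict(X_test)  # continuous values, not labels

# Round to the nearest integer and clip to the valid label range 0..2.
y_pred_class = np.clip(np.rint(y_pred_lr), 0, 2).astype(int)

mse = mean_squared_error(y_test, y_pred_lr)
rounded_accuracy = accuracy_score(y_test, y_pred_class)
print("Mean Squared Error:", mse)
print("Accuracy after rounding:", rounded_accuracy)
```

This treats the species as if they were ordered and evenly spaced, which is an artifact of the chosen encoding; a classifier avoids that assumption.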


### My analysis of the performance of models on the Iris dataset

- KNN reached 100% accuracy on the test set; Grid Search selected n_neighbors = 3 using 5-fold cross-validation.
- This suggests that KNN works well on the Iris dataset with n_neighbors set to 3, although the test set holds only 20% of the data, so the score is a rough estimate.
- Only KNN is reported because the evaluation cell runs after the loop and therefore sees just the last model fitted.
- Linear Regression gives a low Mean Squared Error (about 0.0372) when the class labels 0, 1, and 2 are treated as numeric targets.
- MSE is not a classification accuracy, however, and Linear Regression is suited to regression tasks rather than classification, so this number is not directly comparable with the KNN result.