# Grid Search
Grid search is a systematic approach to hyperparameter tuning: a model is trained and evaluated for every combination of values in a specified parameter grid, and the best-scoring combination is kept.
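
As a rough illustration of the idea, the loop below tries every combination in a small grid and keeps the best one. It is only a sketch: `toy_score` is a hypothetical stand-in for a cross-validated model score, and the grid values are made up for the example.

```python
from itertools import product


def toy_score(c, gamma):
    """Hypothetical stand-in for a cross-validated score; it peaks at c=0.5, gamma=0.5."""
    return 1.0 - (c - 0.5) ** 2 - (gamma - 0.5) ** 2


# Hypothetical grid: 3 x 3 = 9 combinations, all of which are evaluated.
grid = {"c": [0.1, 0.5, 0.9], "gamma": [0.1, 0.5, 0.9]}

best_score, best_params = float("-inf"), None
for values in product(*grid.values()):
    candidate = dict(zip(grid.keys(), values))
    score = toy_score(**candidate)
    if score > best_score:
        best_score, best_params = score, candidate

print(best_params, best_score)
```

The cost grows multiplicatively with each added parameter, which is why the grid is usually kept small.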

### Importing the libraries

In [81]:
import pandas as pd

### Importing the dataset

In [82]:
dataset = pd.read_csv('./filez/Social_Network_Ads.csv')
dataset.head()

Unnamed: 0,Age,EstimatedSalary,Purchased
0,19,19000,0
1,35,20000,0
2,26,43000,0
3,27,57000,0
4,19,76000,0


### Splitting the dataset into Train/Test sets

In [83]:
from sklearn.model_selection import train_test_split

X = dataset.drop("Purchased", axis=1)
y = dataset["Purchased"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

### Feature Scaling

In [84]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
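
To make the scaling step concrete, here is a small sketch of the arithmetic that standardization performs: subtract the training mean and divide by the training standard deviation, then reuse those same statistics on the test data. The numbers are the `Age` values from `dataset.head()`; the way they are divided into train and test here is hypothetical.

```python
from statistics import mean, pstdev

# Toy values taken from the Age column shown earlier; this split is hypothetical.
train_ages = [19, 35, 26, 27]
test_ages = [19]

mu = mean(train_ages)        # statistics are learned from the training data only
sigma = pstdev(train_ages)   # population standard deviation (ddof=0)

train_scaled = [(a - mu) / sigma for a in train_ages]
test_scaled = [(a - mu) / sigma for a in test_ages]  # reuse the training statistics

print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in test_scaled])
```

This mirrors why the notebook calls `fit_transform` on the training set but only `transform` on the test set: the test data must not influence the scaling statistics.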

### Training the Kernel SVM model on the Training set

In [85]:
from sklearn.svm import SVC

classifier = SVC(kernel="rbf", random_state=0)
classifier.fit(X_train, y_train)

### Making the Confusion Matrix

In [86]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

def evaluate_model(classifier):
    y_pred = classifier.predict(X_test)

    # Plain strings: no f-prefix is needed because nothing is interpolated.
    print("1) classification_report:\n\n", classification_report(y_test, y_pred))
    print("2) confusion_matrix:\n\n", confusion_matrix(y_test, y_pred), "\n")
    print("3) accuracy_score:\n\n", accuracy_score(y_test, y_pred))

evaluate_model(classifier)

1) classification_report:

               precision    recall  f1-score   support

           0       0.96      0.94      0.95        68
           1       0.88      0.91      0.89        32

    accuracy                           0.93       100
   macro avg       0.92      0.92      0.92       100
weighted avg       0.93      0.93      0.93       100

2) confusion_matrix:

 [[64  4]
 [ 3 29]] 

3) accuracy_score:

 0.93
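
The figures in the classification report can be re-derived by hand from the confusion matrix. The sketch below does that for class 1, assuming the usual scikit-learn layout in which rows are true classes and columns are predicted classes.

```python
# Confusion matrix printed above: rows are true classes, columns are predicted classes.
cm = [[64, 4],
      [3, 29]]

tn, fp = cm[0]  # true class 0: correctly rejected, wrongly flagged as 1
fn, tp = cm[1]  # true class 1: missed, correctly flagged as 1

accuracy = (tp + tn) / (tn + fp + fn + tp)
precision_1 = tp / (tp + fp)   # of everything predicted 1, how much really was 1
recall_1 = tp / (tp + fn)      # of everything that really was 1, how much was found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={accuracy:.2f} precision={precision_1:.2f} "
      f"recall={recall_1:.2f} f1={f1_1:.2f}")
```

The results match the report above: 0.93 accuracy and 0.88, 0.91 and 0.89 for the precision, recall and f1-score of class 1.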


### Applying K-Fold Cross Validation

In [87]:
from sklearn.model_selection import cross_val_score

def show_accuracy():
    accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train, cv=10)
    print(f"Accuracy {accuracies.mean() * 100:,.2f}%")
    print(f"Standard Deviation: {accuracies.std()*100:,.2f}%")

In [88]:
show_accuracy()

Accuracy 90.33%
Standard Deviation: 6.57%
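
For intuition about what `cv=10` does, the sketch below splits sample indices into 10 folds, each of which serves once as the validation set while the other 9 are used for training. It is a simplified, unshuffled and unstratified version; for a classifier with an integer `cv`, scikit-learn uses stratified folds that preserve class proportions. The helper `k_fold_indices` is hypothetical, and the size of 300 is inferred from the 100 test samples at `test_size=0.25`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for plain, unshuffled k-fold splitting."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# 300 training rows is an inferred size, used here only for illustration.
folds = list(k_fold_indices(n_samples=300, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 10 270 30
```

Each of the 10 accuracies is measured on 30 held-out rows, which partly explains why the fold-to-fold standard deviation is several percentage points.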


### Applying Grid Search to find the best model and the best parameters

- **C (Regularization parameter)**: controls how strongly the SVM penalizes misclassified training examples. Larger values of C favor a smaller-margin hyperplane if that hyperplane classifies more training points correctly, at the risk of overfitting; smaller values favor a wider margin and tolerate more misclassifications.
  Additional values: try more values around the current range, especially if the best score falls at an edge of that range. Consider smaller increments (e.g., 0.1, 0.2, 0.3) or larger values if overfitting is not an issue.

- **gamma (Kernel coefficient for ‘rbf’)**: defines how far the influence of a single training example reaches. Low values mean ‘far’ and high values mean ‘close’.
  Additional values: as with C, try more granular values or a wider range if needed. If the best gamma is at one end of the range, extend the range in that direction.
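
The "reach" described for gamma can be seen directly in the RBF kernel, $K(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$. The sketch below evaluates it for two made-up points at two gamma values from the grid's range; `rbf_kernel` is a hand-written helper for illustration, not the library's implementation.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF similarity between two points: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical points in scaled feature space; their squared distance is 2.
x = [0.0, 0.0]
z = [1.0, 1.0]

for gamma in (0.1, 0.9):
    print(f"gamma={gamma}: similarity={rbf_kernel(x, z, gamma):.3f}")
```

With a low gamma the two points still look similar to each other, so each training example influences a wide region; with a high gamma the similarity drops off quickly and the influence becomes local.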

In [89]:
from sklearn.model_selection import GridSearchCV

# Two parameter grids, one per kernel: gamma is ignored by the linear kernel,
# so it only needs to be searched together with the rbf kernel.
# dict 1: linear kernel
# dict 2: rbf kernel

ranges = [i/10 for i in range(1, 10)] # from 0.1 to 0.9

params = [
    {
        "C": ranges,
        "kernel": ["linear"],
    },
    {
        "C": ranges,
        "kernel": ["rbf"],
        "gamma": ranges,
    },
]

grid_search = GridSearchCV(
    estimator=classifier, param_grid=params, scoring="accuracy", cv=10, n_jobs=-1
)

grid_search.fit(X=X_train, y=y_train)

best_accuracy = grid_search.best_score_
best_params = grid_search.best_params_

print(f'best accuracy: {best_accuracy * 100:,.2f}%')
print(f'best params: {best_params}')

best accuracy: 91.00%
best params: {'C': 0.8, 'gamma': 0.9, 'kernel': 'rbf'}
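
To get a feel for how much work this search does, the sketch below expands the same two-dictionary grid by hand and counts the candidates. The `expand` helper is hypothetical and written only for this illustration; scikit-learn offers `ParameterGrid` for the same purpose.

```python
from itertools import product

ranges = [i / 10 for i in range(1, 10)]  # 0.1 to 0.9, as in the cell above
params = [
    {"C": ranges, "kernel": ["linear"]},
    {"C": ranges, "kernel": ["rbf"], "gamma": ranges},
]


def expand(grids):
    """Yield one dict per parameter combination, grid by grid."""
    for grid in grids:
        keys = list(grid)
        for values in product(*(grid[k] for k in keys)):
            yield dict(zip(keys, values))


candidates = list(expand(params))
n_fits = len(candidates) * 10  # each candidate is fitted once per fold (cv=10)
print(len(candidates), n_fits)  # 90 900
```

The linear grid contributes 9 candidates and the rbf grid 81, so the search runs 900 cross-validation fits, plus a final refit of the best candidate on the whole training set under the default `refit=True`.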


In [90]:
# Model with the best params:
classifier = SVC(kernel="rbf", random_state=0, C=0.8, gamma=0.9)
classifier.fit(X_train, y_train)

show_accuracy()

Accuracy 91.00%
Standard Deviation: 6.67%


**Evaluating the Trade-off**:
- Higher Average Accuracy: The second (tuned) model has a higher average cross-validation accuracy (91.00% vs. 90.33%). This suggests that, on average, it is slightly better at making correct predictions.
- Higher Standard Deviation: The second model also has a slightly higher standard deviation (6.67% vs. 6.57%). This means there's slightly more variability in its performance across the cross-validation folds.

**Decision Criteria**:
- Prioritizing Accuracy: If your primary goal is to achieve the highest possible accuracy, the second model is preferable. The increase is marginal, though (0.67 percentage points), and much smaller than the fold-to-fold standard deviation (about 6.6 percentage points), so it may not reflect a real improvement.
- Prioritizing Consistency: If consistency across different datasets or cross-validation folds is more critical, the first model may be more appropriate, as it has a slightly lower standard deviation. However, the difference in standard deviation is very small (0.1 percentage points), so this might not be a decisive factor.
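
One rough way to put the two differences in perspective is to compare the accuracy gain with the uncertainty of a 10-fold mean. The sketch below uses the numbers printed above and a simple standard-error approximation; it assumes the fold scores are independent, which is not strictly true for cross-validation, so treat it as a sanity check rather than a formal test.

```python
import math

# Cross-validation results printed above (mean and standard deviation, in %).
baseline_mean, baseline_std = 90.33, 6.57  # default SVC
tuned_mean, tuned_std = 91.00, 6.67        # C=0.8, gamma=0.9
n_folds = 10

gain = tuned_mean - baseline_mean
# Approximate standard error of the mean over 10 folds (independence assumed).
std_error = tuned_std / math.sqrt(n_folds)

print(f"gain: {gain:.2f} pp, approx. standard error: {std_error:.2f} pp")
```

The gain of 0.67 percentage points is well below the approximate standard error of about 2.11, so under these assumptions the two models are hard to tell apart on this data.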

### Accuracy vs. Consistency

#### Accuracy
- **What Is It?**: Accuracy measures how often a model makes the correct prediction. It's like the model's hit rate – how many times it gets things right.
- **Example**: Imagine a weather forecasting model that predicts "rain" or "no rain" for each of 100 days. If its prediction matches what actually happened on 70 of those days, its accuracy is 70%.

#### Consistency (Standard Deviation in Performance)*
- **What Is It?**: Consistency here refers to how stable or variable the model's performance is across different situations or datasets. A lower standard deviation in model performance implies greater consistency.
- **Example**: Let's say you use the same weather forecasting model in different cities. In City A, its accuracy is 68%; in City B, it's 72%; in City C, it's 69%. The small range in accuracy (68% to 72%) across cities indicates high consistency. Now imagine another scenario where in City A, its accuracy is 50%; in City B, 90%; in City C, 70%. Here, the wide range (50% to 90%) indicates low consistency.
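
The city example can be checked numerically. The sketch below computes the mean and the population standard deviation (the same default that NumPy's `std()` uses) for both scenarios.

```python
from statistics import mean, pstdev

# Accuracy (%) per city in the two scenarios described above.
scenarios = {
    "consistent": [68, 72, 69],
    "inconsistent": [50, 90, 70],
}

for name, accuracies in scenarios.items():
    print(f"{name}: mean={mean(accuracies):.2f}%, std={pstdev(accuracies):.2f}%")
```

Both scenarios average about 70%, but the standard deviation is roughly 1.70 in the first and 16.33 in the second, which is exactly the difference in consistency that the example describes.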

#### Simplified Comparison
- **Accuracy vs. Consistency**: A model could be highly accurate but not consistent. For example, it might work exceptionally well in certain conditions (e.g., predicting sunny days) but poorly in others (e.g., predicting rainy days), leading to high accuracy but low consistency. Conversely, a model could be consistently good but not the best in terms of accuracy. It might always be somewhat right but rarely perfect.

#### Clear Example
- **Weather Forecast Model**:
  - **High Accuracy, Low Consistency**: A model predicts weather with 95% accuracy during summer but only 55% accuracy during winter. Its peak accuracy is high, but its performance is inconsistent across seasons.
  - **Lower Accuracy, High Consistency**: Another model predicts weather with 75% accuracy year-round, regardless of the season. Its accuracy isn't as high as the first model in summer, but it's more consistent throughout the year.

In summary, accuracy tells you how often the model is right, while consistency tells you how stable the model's accuracy is across different conditions or datasets. In real-world applications, the choice between a highly accurate model and a consistently good model often depends on the specific needs and context of the task at hand.

(*) In the context of evaluating machine learning models, standard deviation typically refers to the variation in the model's performance across different scenarios or datasets. In this notebook, it is the variation in accuracy across the 10 cross-validation folds.