# __Machine Learning - Grid Search.__
Date : 29, March, 2024.

- The majority of machine learning models contain parameters that can be adjusted to vary how the model learns. 

- For example, sklearn's logistic regression model has a parameter C that controls regularization, which affects the complexity of the model.
How do we pick the best value for C? The best value depends on the data used to train the model.

- One method is to try out different values and then pick the value that gives the best score. This technique is known as a grid search.

- If we had to select values for two or more parameters, we would evaluate every combination of the candidate values, thus forming a grid of values.

- Before we get into the example, it is good to know what the parameter we are changing does.

    * Higher values of C tell the model that the training data resembles real-world information, so it should place greater weight on the training data (weaker regularization).

    * Lower values of C do the opposite: they apply stronger regularization and place less weight on the training data.
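
To make the idea of a grid concrete, here is a minimal sketch of how every combination of two parameters could be enumerated. The second parameter (`max_iter`) and all candidate values are assumptions chosen purely for illustration; they are not values used elsewhere in these notes.

```python
from itertools import product

# Hypothetical candidate values for two parameters (illustration only).
C_values = [0.5, 1, 2]
max_iter_values = [1000, 10000]

# product() yields every (C, max_iter) pair: the cells of the grid.
grid = [{"C": c, "max_iter": m} for c, m in product(C_values, max_iter_values)]

for params in grid:
    print(params)

print(len(grid))  # 3 values x 2 values = 6 combinations
```

Each dictionary could be passed to a model's `set_params(**params)` before fitting. The number of combinations is the product of the list lengths, so the grid grows quickly as parameters are added.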



__Using Default Parameters.__

- First, let's see what kind of results we can generate without a grid search, using only the base parameters.

- Code : 

from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
# print(iris)
X = iris['data']
y = iris['target']

logit = LogisticRegression(max_iter=10000)
print(logit.fit(X, y))
print(logit.score(X, y))

LogisticRegression(max_iter=10000)
0.9733333333333334


__Code Explanation.__

- logit = LogisticRegression(max_iter=10000): 

    * This line initializes a logistic regression model object named logit. The max_iter parameter sets the maximum number of iterations the optimization algorithm may run. Here it is set to a high value (10000) so the solver has enough iterations to converge.

- Keep in mind that the default value for C in a logistic regression model is 1; we will compare against this later.

- print(logit.score(X, y)): 
    
    * This line prints the accuracy score of the trained logistic regression model on the same data it was trained on. The score method calculates the accuracy by comparing the predicted labels with the actual labels in the dataset.
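
As a rough sketch of what the accuracy calculation described above amounts to, the snippet below counts how many predicted labels match the true labels and divides by the total. The function name `accuracy` and the label lists are made up for illustration; this is not sklearn's own implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration: 4 of the 5 predictions are correct.
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]
print(accuracy(y_true, y_pred))  # 0.8
```

With the fitted model from the example, calling `accuracy(y, logit.predict(X))` should give the same number as `logit.score(X, y)`.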

With the default setting of C = 1, we achieved a score of 0.973.    

Let's see if we can do any better by implementing a grid search with different values of C.

__Implementing Grid Search.__

- We will follow the same steps as before, except this time we will set a range of values for C.

- Knowing which values to set for the searched parameters will take a combination of domain knowledge and practice.

- Since the default value for C is 1, we will set a range of values surrounding it.

    * C = [0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2]

- Next, we will create a for loop that swaps in each value of C and evaluates the model after each change.

- First, we will create an empty list to store the scores.

    * scores = []

- To change the value of C, we must loop over the range of values and update the parameter each time.

    * for choice in C:               
        logit.set_params(C=choice)   
        logit.fit(X, y)     
        scores.append(logit.score(X, y))   

- With the scores stored in a list, we can evaluate what the best choice of C is.

    * print(scores)
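
Reading the best value off a printed list works for a handful of candidates, but it can also be done in code. The helper below is a minimal sketch: the name `best_choice` is hypothetical, and the example scores are placeholders for illustration only, not results from this tutorial.

```python
def best_choice(values, scores):
    """Return the (value, score) pair with the highest score.

    Ties go to the earliest value in the list, because max() keeps
    the first maximum it encounters.
    """
    if not values or len(values) != len(scores):
        raise ValueError("values and scores must be non-empty and equal in length")
    return max(zip(values, scores), key=lambda pair: pair[1])


# Placeholder scores for illustration only.
example_C = [0.5, 1, 2]
example_scores = [0.90, 0.95, 0.95]

best_C, best_score = best_choice(example_C, example_scores)
print(best_C, best_score)  # 1 0.95
```

After the loop above, `best_choice(C, scores)` would return the winning value of C and its score. Because the candidate list is in ascending order, a tie resolves to the smaller C, which corresponds to stronger regularization.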

Now, the overall code is: