# 4. Supervised Analysis and Classifiers

Supervised algorithms (or **supervised** models) observe how the feature inputs `X` (e.g. function word frequencies of known texts) correlate with class outputs `y` (e.g. the authorship label of a text). A supervised model ‘observes’ correctly labelled, preclassified X-y pairs in order to register meaningful correlations (e.g. between certain function word patterns X and certain authors y). This process is called ‘training’, and the X-y pairs on which the model trains are commonly referred to as `training data`. Once this learning process has taken place, the model can be confronted with `test data`, comprising previously unobserved and unclassified texts. On the basis of what it has observed, the supervised model can make a prediction (classification) and assign the unseen test data to a class, either by a hard decision or by outputting a probability score.
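
As a minimal sketch of this train-then-predict cycle, the toy example below fits a classifier on a handful of invented feature vectors and then classifies two unseen ones. The data, the variable names (`X_toy_train`, `toy_clf`, etc.) and the choice of `LogisticRegression` as a stand-in classifier are assumptions made purely for illustration; they are not part of this notebook's pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented data: 6 "known texts", each described by 2 made-up feature values
X_toy_train = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                        [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]])
y_toy_train = np.array([0, 0, 0, 1, 1, 1])  # 0 = author A, 1 = author B

# 'Training': the model observes the labelled X-y pairs
toy_clf = LogisticRegression()
toy_clf.fit(X_toy_train, y_toy_train)

# 'Testing': two unseen, unlabelled feature vectors
X_toy_test = np.array([[0.1, 0.1], [2.1, 2.0]])

hard_decisions = toy_clf.predict(X_toy_test)       # one class label per text
probabilities = toy_clf.predict_proba(X_toy_test)  # one probability per class, per text

print(hard_decisions)
print(probabilities.round(2))
```

Each row of `probabilities` sums to 1; the hard decision corresponds to the class with the highest probability in that row.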

Classification is a considerable field of research in its own right. There are **many types** of classifiers, and it is not always clear which one will perform best, or why. Different types of classifiers tend to react differently to different problems, offer a variety of parametrization options, and require different methods to optimize their performance during training. A big advantage of supervised machine-learning methods for ‘text classification’ (Sebastiani 2002) is the **possibility of evaluation**. By trying different combinations of parameters, such as the feature set, the vector length (number of features), sample length, vectorization method, scaling method, etc., and evaluating how well the resulting model predicts a class (author), scholars can fine-tune and optimize these parameters.

Before we proceed, we **repeat**, with the block of code below, some of **the steps from the previous notebook**.

These are:

1. Loading and segmentation of documents, containers `authors`, `titles`, `texts`
2. Vectorization of `texts` to matrix `X` containing vectors for all text segments
3. Scaling of `X` by applying `StandardScaler()`

Note that, as opposed to the three previous notebooks, we are now introducing a **test corpus** of allegedly **unknown authorship**. These files can be found in our `'corpus/test/'` folder. Below, we will apply a classifier to attribute the text to one of the **known classes**, i.e. our training set from the `'corpus/train/'` folder.

In [15]:
# Connect Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Change working directory
%cd /content/drive/MyDrive/didip_ss

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/didip_ss


In [16]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import StandardScaler
from string import punctuation
import glob
import numpy as np
import os
import pandas as pd
import re

current_directory = os.getcwd() # gets current directory
folder_path = 'D04/corpus' # gets directory path to corpus folder containing .txt files

# We declare some parameters — the 'settings' of our stylometric experiments
sample_len = 5000 # word length of text segment

data_dict = {}
for folder_name in glob.glob(folder_path + '/*'):

    dict_key = folder_name.split('/')[-1] # make train and test data split

    # Declare empty lists to fill up with our metadata and data
    authors = []
    titles = []
    texts = []

    # Open all file objects in folder and gather data
    for filename in glob.glob(folder_name + '/*'):
        author = filename.split("/")[-1].split(".")[0].split("_")[0]
        title = filename.split("/")[-1].split(".")[0].split("_")[1]

        bulk = []
        text = open(filename, encoding='utf-8-sig').read() # utf-8-sig encoding automatically handles and removes Unicode Byte Order Mark (BOM) if present

        # .strip() removes leading and trailing whitespace (spaces, tabs, newlines) from the string;
        # .split() then splits it into a list of substrings, using whitespace as the default delimiter.
        for word in text.strip().split():
            word = re.sub(r'\d+', '', word) # remove digits
            word = re.sub('[%s]' % re.escape(punctuation), '', word) # remove punctuation
            word = word.lower() # convert upper to lowercase
            bulk.append(word)

        # Split up the text into discrete chunks or segments
        bulk = [word for word in bulk if word != ""] # list comprehension that removes empty strings
        bulk = [bulk[i:i+sample_len] for i in range(0, len(bulk), sample_len)]
        for index, sample in enumerate(bulk):
            if len(sample) == sample_len:
                authors.append(author)
                titles.append(title + "_{}".format(str(index + 1)))
                texts.append(" ".join(sample))

    data_dict[dict_key] = [authors, titles, texts]

In [17]:
from sklearn.preprocessing import LabelEncoder

# In this block of code, we separately vectorize and scale the train and test partitions of our dataset

for partition, (authors, titles, texts) in data_dict.items(): # 'partition' rather than 'set', which would shadow a Python built-in
    if partition == 'train': # make sure your folder is called 'train'!
        # Vectorize by most common words
        model = CountVectorizer(max_features=250, # n features = vector length / vector dimensionality.
                                analyzer='word', # feature type
                                ngram_range=(1, 1))
        X_train = model.fit_transform(texts).toarray()

        feat_frequencies = np.asarray(X_train.sum(axis=0)).flatten()
        features = model.get_feature_names_out()
        feat_freq_df = pd.DataFrame({'feature': features, 'frequency': feat_frequencies})
        feat_freq_df = feat_freq_df.sort_values(by='frequency', ascending=False).reset_index(drop=True)
        sorted_features = feat_freq_df['feature'].tolist()
        sorted_indices = [model.vocabulary_[feat] for feat in sorted_features]
        X_train_sorted = X_train[:, sorted_indices]

        # Feed sorted features again to new model
        model = CountVectorizer(stop_words=[],
                                analyzer='word',
                                vocabulary=sorted_features,
                                ngram_range=((1,1)))
        X_train = model.fit_transform(texts).toarray()

        # Scale by StandardScaler()
        scaler = StandardScaler()
        X_train = scaler.fit_transform(X_train)

        le = LabelEncoder()
        y_train = le.fit_transform(authors)

# For vectorizing and scaling our test set, we reuse the vectorizer and scaler fitted on the training set (transform only, no refitting)!
for partition, (authors, titles, texts) in data_dict.items():
    if partition == 'test': # make sure your folder is called 'test'!
        X_test = model.transform(texts).toarray()
        X_test = scaler.transform(X_test)
        test_titles = titles

## 4.1 Training a Classifier (by applying SVM)

Especially in recent years, which have witnessed the rise of machine learning and computing power, classification algorithms such as **support vector machines** (SVMs) have become increasingly popular. An SVM is a supervised machine learning algorithm used for classification and regression tasks. It works by finding the hyperplane that separates data points of different classes with the largest possible margin in a high-dimensional space. SVMs are effective in high-dimensional spaces and, thanks to different kernel functions, can also handle non-linear classification.
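
To make the notion of a separating hyperplane concrete, here is a small sketch on invented two-dimensional data (all values and the names `X_2d`, `svm_toy` are hypothetical). For a linear kernel, scikit-learn exposes the hyperplane as a weight vector `coef_` and an offset `intercept_`; a point is assigned to one class or the other depending on the sign of `w · x + b`.

```python
import numpy as np
from sklearn.svm import SVC

# Invented, linearly separable 2D data: two clusters of three points
X_2d = np.array([[1.0, 1.0], [1.5, 0.5], [1.0, 0.0],
                 [3.0, 3.0], [3.5, 2.5], [4.0, 3.0]])
y_2d = np.array([0, 0, 0, 1, 1, 1])

svm_toy = SVC(kernel='linear', C=1.0)
svm_toy.fit(X_2d, y_2d)

# The hyperplane is described by a weight vector w and an offset b
w = svm_toy.coef_[0]
b = svm_toy.intercept_[0]

# Signed score of each point relative to the hyperplane: w . x + b
scores = X_2d @ w + b
manual_pred = (scores > 0).astype(int)  # positive side -> class 1, negative side -> class 0

print("w =", w, "b =", b)
print("support vectors:\n", svm_toy.support_vectors_)
print("manual predictions:", manual_pred)
```

The support vectors are the training points closest to the hyperplane; they alone determine where it lies.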

For now, we will not occupy ourselves too much with the hyperparameters of SVMs, and first take a look at some more general principles of training and evaluating classifiers.

### 4.1.1 Preparing the Dataset for Training → `train_test_split`

Above, we already declared a train and test set in separate folders. During training, however, it is considered good practice to hold out a **development set** (`X_dev`), also known as a **validation set**. This is a subset of the training set that is set aside while the model is trained: the held-out `X_dev` is not used for fitting, but to evaluate the model's performance during development. It allows you to assess how well your model is likely to perform on unseen data (i.e. `X_test`). This provides a better estimate of its generalization ability, offers a basis for tuning hyperparameters by comparing different settings, helps to detect overfitting early, and aids in selecting the best-performing model for production.

Here is a list of the variables that you will encounter in the process of partitioning our data set:

* `X_train` and `y_train`: The **full training set**: all vectorized text segments (`X_train`) labelled by authorship (`y_train`).
* `X_train_split` and `y_train_split`: The **remaining training set** after the validation set has been split off.
* `X_dev` and `y_dev`: The **validation set**, a subset of the training data temporarily held out in order to function as a kind of stand-in test set.
* `X_test`: The *actual* **test set**, i.e. texts unseen by the model, for which authorship is (truly, this time) unknown.

Parameters to take into account when splitting `X_dev` off from `X_train` are the **split ratio** (`test_size=0.33`) and a **random seed** (`random_state`), which ensures that the split, and therefore the results, are reproducible.
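
The effect of both parameters can be checked on dummy data. The sketch below uses a random matrix with the same dimensions as a 64-segment, 250-feature training set; the values, the label counts and the `_toy` / `_a` / `_b` names are invented for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-in for a vectorized training set: 64 segments x 250 features
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(64, 250))
y_toy = np.repeat([0, 1, 2], [20, 24, 20])  # hypothetical labels for 3 authors

# Two splits with the same ratio and the same random seed
X_a, X_dev_a, y_a, y_dev_a = train_test_split(X_toy, y_toy, test_size=0.33, random_state=1)
X_b, X_dev_b, y_b, y_dev_b = train_test_split(X_toy, y_toy, test_size=0.33, random_state=1)

print(X_a.shape, X_dev_a.shape)          # sizes of the remaining training set and the dev set
print(np.array_equal(X_dev_a, X_dev_b))  # identical dev sets: the split is reproducible
```

With `test_size=0.33`, scikit-learn rounds the size of the held-out part up, so 64 segments are divided into 42 for training and 22 for validation. Changing `random_state` yields a different, but again reproducible, partition.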

### 4.1.2 Evaluation: Accuracy, Precision, Recall, F1 score

Once our model is trained on `X_train_split` and the known labels `y_train_split`, we can test its quality by having it predict on the held-out `X_dev` set. This yields a vector of **predictions** `y_dev_pred`, which can be readily compared to the true labels `y_dev`, commonly referred to as the "**ground truth**" or "**gold standard**".

During training, each of our authors (let's say for now, authors A, B, and C) is assigned a class label corresponding to a digit, e.g. `0`, `1`, `2`.
- The `y_dev` array will, therefore, look something like this: `[0, 0, 1, 1, 2, 2]`, where each of the 3 authors in the training set is represented by 2 text samples in the development set.
- Our model might then output as its prediction (`y_dev_pred`) the array `[0, 1, 1, 1, 2, 2]`.

Clearly, it has made a mistake in misattributing the second text segment to class `1` (Author B) instead of class `0` (Author A).

When comparing the gold standard against the predictions, we can extract several interesting evaluation metrics from a `classification_report`, yielding `accuracy`, `precision`, `recall`, and `f1-score`.

* `accuracy`: *"How often did we correctly attribute the text segment to a given author?"*  
  I.e. the percentage of correct predictions out of all the predictions made. The answer in our case may be obvious: 5 out of 6 times, i.e. 0.83.
* `precision`: *"When we positively identified a text segment as written by a given author, how often was that true?"*  
  Precision is calculated for each class separately and then averaged across classes. For Authors A and C, the precision is 100%: all positive identifications (1/1 for Author A and 2/2 for Author C) were indeed correct. For Author B, however, the model once predicted `1` where the outcome should have been `0` (= Author A). This is a **false positive**, which lowers the precision for that class to 2/3 (out of 3 positive outcomes for Author B, only 2 were in fact correct), i.e. a score of 0.67. Macro-averaged across all classes, our precision is 0.89.
* `recall`: *"Were we able to attribute all text segments of a given author to that author?"*  
  Recall computes how many of the instances that should have been labelled positive were actually labelled as such. Again, each class is first looked at individually. For Author A, we only caught half of the `0`'s we should have caught, because we falsely attributed an observation belonging to `0` to class `1`. The missed instance for class `0` is what we call a **false negative**, which lowers the recall for that class to 1/2, i.e. 0.5. For Authors B and C, the results are 2/2 and 2/2, i.e. 100% in both cases. Averaged over all authors, the so-called macro recall comes to 0.83.
* `f1-score`: A balanced score that combines precision and recall as their harmonic mean, i.e. 2 × (precision × recall) / (precision + recall).
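
The worked example with authors A, B and C can be reproduced with scikit-learn's metric functions. This is a small sketch; the `_toy` variable names are chosen here so as not to overwrite the notebook's own `y_dev` and `y_dev_pred`.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Gold standard and predictions from the example: one segment of Author A (0)
# is wrongly attributed to Author B (1)
y_dev_toy = [0, 0, 1, 1, 2, 2]
y_dev_pred_toy = [0, 1, 1, 1, 2, 2]

acc = accuracy_score(y_dev_toy, y_dev_pred_toy)

# average=None returns one score per class (A, B, C)
prec_per_class = precision_score(y_dev_toy, y_dev_pred_toy, average=None)
rec_per_class = recall_score(y_dev_toy, y_dev_pred_toy, average=None)
f1_per_class = f1_score(y_dev_toy, y_dev_pred_toy, average=None)

# average='macro' takes the unweighted mean of the per-class scores
macro_prec = precision_score(y_dev_toy, y_dev_pred_toy, average='macro')
macro_rec = recall_score(y_dev_toy, y_dev_pred_toy, average='macro')
macro_f1 = f1_score(y_dev_toy, y_dev_pred_toy, average='macro')

print("accuracy:", round(acc, 2))
print("precision per class:", prec_per_class.round(2), "macro:", round(macro_prec, 2))
print("recall per class:   ", rec_per_class.round(2), "macro:", round(macro_rec, 2))
print("f1 per class:       ", f1_per_class.round(2), "macro:", round(macro_f1, 2))
```

The per-class F1 scores follow from the harmonic mean: for Author A, for instance, precision 1.0 and recall 0.5 give 2 × (1.0 × 0.5) / (1.0 + 0.5) ≈ 0.67.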

In [18]:
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import pandas as pd

"""
Train on partitioned X_train_split, y_train_split
Test on validation set X_dev => yields y_dev_pred
"""

# Split the full training set into random train and dev (validation) subsets.
X_train_split, X_dev, y_train_split, y_dev = train_test_split(X_train, y_train, test_size=0.33, random_state=1) # test_size: roughly 1/3 of the train data becomes the validation set

print('Dimensions of original data set:')
print(X_train.shape)
print('Dimensions of partitions (train and dev) set:')
print(X_train_split.shape)
print(X_dev.shape)

# Initialize an SVM-classifier
svm_classifier = SVC(kernel='linear', C=1.0, random_state=42) # random seed ensures reproducibility
svm_classifier.fit(X_train_split, y_train_split)

# Make predictions with model
y_dev_pred = svm_classifier.predict(X_dev) # y_pred = model predictions

print(classification_report(y_dev, y_dev_pred)) # compare predictions to ground truth / gold standard

"""
Test on test set
Yields y_pred (predictions) of authorship
"""

y_pred = svm_classifier.predict(X_test)
predictions = le.inverse_transform(y_pred)

print()
print("Predicted authorship:")

df = pd.DataFrame(predictions) # structures the predicted author labels as a DataFrame
df.columns = ['Prediction'] # assigns column labels
df.index = test_titles

print(df)

Dimensions of original data set:
(64, 250)
Dimensions of partitions (train and dev) set:
(42, 250)
(22, 250)
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         5
           1       1.00      1.00      1.00        11
           2       1.00      1.00      1.00         6

    accuracy                           1.00        22
   macro avg       1.00      1.00      1.00        22
weighted avg       1.00      1.00      1.00        22


Predicted authorship:
                                     Prediction
Liber-vitae-meritorum_1   Hildegardis-Bingensis
Liber-vitae-meritorum_2   Hildegardis-Bingensis
Liber-vitae-meritorum_3   Hildegardis-Bingensis
Liber-vitae-meritorum_4   Hildegardis-Bingensis
Liber-vitae-meritorum_5   Hildegardis-Bingensis
Liber-vitae-meritorum_6   Hildegardis-Bingensis
Liber-vitae-meritorum_7   Hildegardis-Bingensis
Liber-vitae-meritorum_8   Hildegardis-Bingensis
Liber-vitae-meritorum_9   Hildegardis-Bingensis
Liber-v

## 4.2 `GridSearchCV()`: Tuning Parameters and Hyperparameters of the SVM (Advanced)

In this section, we mainly repeat much of the above, but introduce the useful class `sklearn.model_selection.GridSearchCV`.
Think of what follows as a more advanced and specialized way of going about training your model. This time, we do not simply *choose* whatever parameters we think will work best; instead, we systematically evaluate a series of varying presets, in order to gauge their performance on a more objective basis.

An SVM has quite a few hyperparameters, such as the regularization parameter (`C`) and the kernel type (`kernel`, e.g. `'linear'`).
Moreover, from a stylometric methodological perspective, we may want to experiment with varying feature types (function words, character n-grams, ...), feature vector lengths (`n_features`), and segment lengths (`sample_len`). A grid search can help us find the best-performing settings among those we try, with the aim of a model that scores well and generalizes to unseen data. This process helps to avoid the pitfalls of manual tuning (which can be subjective) and makes for a more robust and reliable model.
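
Before turning to the full experiment, the core mechanism can be sketched on synthetic data. The example below searches only two values of `C` and two kernels; the dataset produced by `make_classification`, the `toy_` names and the small grid are assumptions for illustration, not the settings used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic 3-class dataset standing in for vectorized text segments
X_toy, y_toy = make_classification(n_samples=90, n_features=20, n_informative=5,
                                   n_classes=3, random_state=0)

# Every combination of these values is tried: 2 x 2 = 4 candidate models
toy_param_grid = {'C': [1, 10], 'kernel': ['linear', 'rbf']}

# Each candidate is evaluated with 3-fold cross-validation on macro-averaged F1
toy_grid = GridSearchCV(SVC(), param_grid=toy_param_grid, cv=3, scoring='f1_macro')
toy_grid.fit(X_toy, y_toy)

print(toy_grid.best_params_)           # best-scoring combination
print(round(toy_grid.best_score_, 2))  # its mean cross-validated macro F1
```

The full table of scores per combination is available in `toy_grid.cv_results_`. Note that the number of models to fit grows multiplicatively with every parameter added to the grid.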

Below, we first declare these various presets in containers, e.g. `sample_len_loop`, `feat_type_loop`, `feat_n_loop`, `c_options`, `kernel_options`, and `k_folds`.

**READ FIRST: Searching many parameters at the same time can be quite costly and take a long time. Try to start your gridsearch by focussing on only a few of these parameters at a time, just so you can get acquainted with them.**

In [19]:
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics import f1_score, make_scorer, recall_score, accuracy_score, precision_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import Normalizer, StandardScaler, FunctionTransformer, LabelBinarizer, MinMaxScaler
from string import punctuation
import glob
import numpy as np
import os
import re

current_directory = os.getcwd() # gets current directory

# preprocessing, feature type and vectorization params
sample_len_loop = [1000, 2000, 3000]
feat_type_loop = ['raw_MFW','raw_4grams','tfidf_MFW','tfidf_4grams']
feat_n_loop = [250, 500, 750, 1000]

# SVM model and training params
c_options = [1, 10, 100, 1000] # various C-parameters - adjust the decision margin's flexibility.
kernel_options = ['linear', 'poly', 'rbf', 'sigmoid']
k_folds = [3, 5, 7, 10]

# Function to turn sparse into dense matrices
def to_dense(X):
    X = X.todense()
    X = np.asarray(X) # Because TypeError: np.matrix is not supported.
    X = np.nan_to_num(X)
    return X

By using `GridSearchCV()`, one can efficiently navigate through the classifier's parameter space. Moreover, in the code that follows we expand its functionality with a number of options particularly suitable for non-traditional authorship attribution. The code block below is an example.

1. **Preprocessing**: First, we open and preprocess our files again (a step you are familiar with by now), and store our `texts` (text segments of a certain `sample_len`) in a variable `X_train`. Next, we label the segments by authorship and store the labels in `y_train` (if there are three authors, the labels should be `0` for author A, `1` for author B, and `2` for author C).
2. **Vectorization**: We initialize various vectorization options, where we take into account the feature type (`word` or `char`), declare our `n_gram` preferences, and decide whether we want to input raw frequencies (`CountVectorizer()`) or TF-IDF frequencies (`TfidfVectorizer()`).
3. **Pipeline and Parameter Grid**: We build a `pipe` (`Pipeline`), a tool for chaining together these data preprocessing steps and machine learning algorithms into a single object. This enables seamless and efficient handling of data transformation and model training, facilitating the creation of end-to-end machine learning workflows. We store our variables in a dictionary `param_grid`, which specifies the hyperparameter values to search over during the grid search. It allows for systematic exploration of different combinations of hyperparameters to identify the best configuration for a machine learning model.
4. **Grid Search and Cross-Validated Results**: Finally, we introduce an important new concept, that of cross-validation (CV). CV is a more advanced and more reliable alternative to the single `train_test_split` introduced in the earlier block of code. With CV, the dataset is divided into *k* (roughly) equal-sized **folds**. The model is then trained *k* times, where each time it is trained on *k*−1 folds (which corresponds to `X_train_split` above) and tested on the remaining fold (`X_dev`). The performance metrics (accuracy, precision, recall, f1-score) are averaged over the *k* trials to give a more reliable estimate of the model's performance (information you can extract from, for instance, `results['mean_test_accuracy_score']`).

This helps ensure that the model is not overly dependent on a particular subset of the data, and provides a more accurate estimate of the model’s performance on unseen data.
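
What cross-validation does internally can be written out as an explicit loop. The sketch below is an illustration on synthetic data, not the notebook's own code: `make_classification`, the `toy`/`fold_` names, the two-step pipeline and the use of `StratifiedKFold` (which keeps class proportions similar in every fold) are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class dataset standing in for vectorized text segments
X_toy, y_toy = make_classification(n_samples=90, n_features=20, n_informative=5,
                                   n_classes=3, random_state=0)

k = 5
folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)

fold_scores = []
for train_idx, dev_idx in folds.split(X_toy, y_toy):
    # A fresh pipeline per fold: the scaler is fitted on the k-1 training folds only
    toy_pipe = Pipeline([
        ('feature_scaling', StandardScaler()),
        ('classifier', SVC(kernel='linear', C=1.0)),
    ])
    toy_pipe.fit(X_toy[train_idx], y_toy[train_idx])

    # Evaluate on the one fold that was held out in this round
    fold_pred = toy_pipe.predict(X_toy[dev_idx])
    fold_scores.append(f1_score(y_toy[dev_idx], fold_pred, average='macro'))

mean_f1 = float(np.mean(fold_scores))
print("F1 per fold:", np.round(fold_scores, 2))
print("mean F1 over", k, "folds:", round(mean_f1, 2))
```

Wrapping the scaler and the classifier in one `Pipeline` matters here: it prevents information from the held-out fold from leaking into the scaling step.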

Try to tweak the various parameter settings above, and then run the code below.

In [None]:
from datetime import datetime
import glob
import re
from string import punctuation
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn import svm
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer, precision_score, recall_score, f1_score, accuracy_score
import tqdm
from joblib import Parallel, delayed
import multiprocessing
import scipy.sparse as sp

# Debug flag
DEBUG = True

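# Reporting thresholds
N_ITER_NO_CHANGE = 5  # currently not used anywhere in this cell
TOLERANCE = 0.001     # minimum F1 improvement before a combination's results are added to the report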

# Function for debug logging
def debug_log(message):
    if DEBUG:
        print(f"DEBUG: {message}")

folder_directory ='D04/corpus/train'

# Preprocessing function
def preprocess_file(filename, sample_len):
    author, title = filename.split('/')[-1].split('.')[0].split('_')[:2]
    with open(filename, encoding='utf-8-sig') as file:
        text = file.read().strip()
        words = re.sub(r'[\d%s]' % re.escape(punctuation), '', text.lower()).split()
        bulk = [words[i:i + sample_len] for i in range(0, len(words), sample_len)]
        return [(author, f"{title}_{index + 1}", " ".join(sample)) for index, sample in enumerate(bulk) if len(sample) == sample_len]

# Function to convert sparse matrix to numpy array
def to_numpy_array(X):
    if sp.issparse(X):
        return X.toarray()
    return np.asarray(X)

# Main processing function
def process_parameters(feat_type, n_feats, sample_len, k_folds, X_train, y_train):
    try:
        vectorizer_params = {
            'raw_MFW': (CountVectorizer, 'word', (1, 1)),
            'tfidf_MFW': (TfidfVectorizer, 'word', (1, 1)),
            'raw_4grams': (CountVectorizer, 'char', (4, 4)),
            'tfidf_4grams': (TfidfVectorizer, 'char', (4, 4))
        }

        vectorizer_class, analyzer, ngram_range = vectorizer_params[feat_type]
        vectorizer = vectorizer_class(analyzer=analyzer, ngram_range=ngram_range, max_features=n_feats)

        pipe = Pipeline([
            ('vectorizer', vectorizer),
            ('to_dense', FunctionTransformer(to_numpy_array, accept_sparse=True)),
            ('feature_scaling', StandardScaler()),
            ('classifier', svm.SVC(probability=True))
        ])

        param_grid = [{
            'vectorizer': [vectorizer],
            'feature_scaling': [StandardScaler()],
            'classifier__C': c_options,
            'classifier__kernel': kernel_options,
        }]

        # List each class label once, so that the macro average treats all classes equally
        class_labels = np.unique(y_train)

        grid = GridSearchCV(pipe, cv=k_folds, n_jobs=-1, param_grid=param_grid,
                            scoring={
                                'precision_score': make_scorer(precision_score, labels=class_labels, average='macro'),
                                'recall_score': make_scorer(recall_score, labels=class_labels, average='macro'),
                                'f1_score': make_scorer(f1_score, labels=class_labels, average='macro'),
                                'accuracy_score': make_scorer(accuracy_score),
                            },
                            refit='f1_score',
                            verbose=False,
                            error_score='raise'  # This will raise the error instead of returning a warning
                        )

        grid.fit(X_train, y_train)
        results = grid.cv_results_

        return (feat_type, n_feats, sample_len, k_folds, results)
    except Exception as e:
        debug_log(f"Error in process_parameters: {str(e)}")
        return (feat_type, n_feats, sample_len, k_folds, None)

# Main execution
if __name__ == "__main__":
    all_grid_scores = []
    all_parameter_combos = []

    try:
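        # Preprocess all files
        all_files = glob.glob(folder_directory + '/*')
        # NOTE: texts are segmented once, at the largest value in sample_len_loop (defined in the previous cell).
        # The sample_len passed to process_parameters below is only recorded in the report; it does not re-segment the texts.
        max_sample_len = max(sample_len_loop)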

        with Parallel(n_jobs=-1) as parallel:
            all_samples = parallel(delayed(preprocess_file)(filename, max_sample_len) for filename in all_files)

        all_samples = [item for sublist in all_samples for item in sublist]
        authors, titles, texts = zip(*all_samples)

        debug_log(f"Preprocessed {len(texts)} samples")

        X_train = texts
        label_encoder = LabelEncoder()
        y_train = label_encoder.fit_transform(authors)

        debug_log(f"Prepared X_train with {len(X_train)} samples and y_train with {len(y_train)} labels")

        # Create parameter combinations
        param_combinations = [(feat_type, n_feats, sample_len, k)
                              for feat_type in feat_type_loop
                              for n_feats in feat_n_loop
                              for sample_len in sample_len_loop
                              for k in k_folds]

        # Run grid search in parallel
        with Parallel(n_jobs=-1) as parallel:
            results = parallel(delayed(process_parameters)(feat_type, n_feats, sample_len, k, X_train, y_train)
                               for feat_type, n_feats, sample_len, k in tqdm.tqdm(param_combinations, desc="Grid Search Progress"))

        # Process results
        best_score = -np.inf
        for feat_type, n_feats, sample_len, k, grid_results in results:
            if grid_results is None:
                continue
            current_best_score = grid_results['mean_test_f1_score'].max()

            if current_best_score > best_score + TOLERANCE:
                best_score = current_best_score

                params = grid_results['params']
                c_settings = [i['classifier__C'] for i in params]
                kernel_settings = [i['classifier__kernel'] for i in params]
                k_fold_settings = [k for _ in params]
                feat_type_settings = [feat_type for _ in params]
                n_feat_settings = [n_feats for _ in params]
                sample_len_settings = [sample_len for _ in params]

                accuracies = grid_results['mean_test_accuracy_score']
                precisions = grid_results['mean_test_precision_score']
                recalls = grid_results['mean_test_recall_score']
                f1_scores = grid_results['mean_test_f1_score']

                for evaluations in zip(accuracies, precisions, recalls, f1_scores):
                    all_grid_scores.append(evaluations)

                for parameter_combo in zip(c_settings, kernel_settings, k_fold_settings, feat_type_settings, n_feat_settings, sample_len_settings):
                    all_parameter_combos.append(parameter_combo)

        # Create and save report
        full_report = []
        for (acc, prec, recall, f1), params in zip(all_grid_scores, all_parameter_combos):
            model_name = '-'.join([str(i) for i in params])
            full_report.append((model_name, acc, prec, recall, f1))

        df = pd.DataFrame(full_report, columns=['model', 'accuracy', 'precision', 'recall', 'f1 score'])
        df_sorted = df.sort_values(by='f1 score', ascending=False)
        print(df_sorted)

        debug_log("Dataframe created and sorted")

        current_time = datetime.now()
        formatted_time = current_time.strftime("date %d-%m at %Hh%Mm")
        df_sorted.to_excel(f'D04/output/gridsearch_evaluation_report-{formatted_time}.xlsx', index=False)

        debug_log(f"Results written to Excel file: gridsearch_evaluation_report-{formatted_time}.xlsx")

    except Exception as e:
        debug_log(f"An error occurred: {str(e)}")

DEBUG: Preprocessed 109 samples
DEBUG: Prepared X_train with 109 samples and y_train with 109 labels




Grid Search Progress:   0%|          | 0/192 [00:00<?, ?it/s]
Grid Search Progress:   2%|▏         | 4/192 [00:47<37:11, 11.87s/it]
Grid Search Progress:   3%|▎         | 6/192 [02:41<1:35:00, 30.65s/it]
Grid Search Progress:   4%|▍         | 8/192 [04:04<1:46:29, 34.73s/it]
Grid Search Progress:   5%|▌         | 10/192 [05:54<2:06:28, 41.69s/it]
Grid Search Progress:   6%|▋         | 12/192 [07:38<2:15:41, 45.23s/it]
Grid Search Progress:   7%|▋         | 14/192 [09:34<2:26:22, 49.34s/it]
Grid Search Progress:   8%|▊         | 16/192 [10:49<2:13:39, 45.57s/it]
Grid Search Progress:   9%|▉         | 18/192 [12:40<2:20:58, 48.61s/it]
Grid Search Progress:  10%|█         | 20/192 [14:16<2:18:53, 48.45s/it]
Grid Search Progress:  11%|█▏        | 22/192 [16:04<2:22:01, 50.13s/it]
Grid Search Progress:  12%|█▎        | 24/192 [17:11<2:06:24, 45.14s/it]
Grid Search Progress:  15%|█▍        | 28/192 [20:25