<a href="https://colab.research.google.com/github/SammyGbabs/ChatBot/blob/main/QA_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Defining the Chatbot’s Purpose and Domain Alignment**  
The **Agricultural Q&A Chatbot** is designed to assist farmers by providing accurate and reliable answers to their agriculture-related questions. It leverages the **BERT transformer model** from Hugging Face, along with the **[Mahesh2841/Agriculture dataset](https://huggingface.co/datasets/Mahesh2841/Agriculture)**, to ensure responses are contextually relevant to farming practices, crop management, soil health, pest control, and other key agricultural concerns.  

### **Relevance and Necessity of the Chatbot**  
1. **Bridging the Knowledge Gap**: Many smallholder farmers lack access to expert agricultural advice. The chatbot democratizes knowledge by providing instant, AI-driven recommendations.  
2. **Localized & Domain-Specific Insights**: By training on an agriculture-specific dataset, the chatbot ensures **relevant and accurate** responses tailored to farming needs rather than generic answers.  
3. **Scalability & Accessibility**: Unlike human experts who have limited availability, the chatbot can serve **hundreds of farmers simultaneously**, offering 24/7 support.  
4. **Improved Decision-Making**: Timely advice on crop diseases, fertilizers, and weather conditions can enhance **crop yield and sustainability**, directly benefiting food security and the economy.  
5. **Cost-Effective**: Hiring agricultural consultants can be expensive for small-scale farmers. A chatbot provides a **free or low-cost alternative**, improving accessibility to crucial farming information.  

In [None]:
!pip install datasets
!pip install transformers
!pip install sentence-transformers
!pip install accelerate -U
!pip install nltk
!pip install rouge-score


In [None]:
# importing the necessary libraries and packages
import numpy as np
import torch
import re
import json, os
from datasets import load_dataset
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer
from sklearn.metrics import accuracy_score
from sklearn.metrics.pairwise import cosine_similarity
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer
from transformers import AutoTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
from sklearn.metrics import classification_report, confusion_matrix
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

In [None]:
#Loading the Dataset from Hugging face
dataset = load_dataset("Mahesh2841/Agriculture")
print(dataset)  # Checking the dataset structure


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'response'],
        num_rows: 5916
    })
})


# **Splitting the Dataset**
*   Splits the dataset into training (80%) and testing (20%).
*   Further splits the test data into validation (10%) and test (10%) for model evaluation.
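
The resulting row counts can be checked by hand. The sketch below assumes the held-out share is rounded up to a whole row and the remainder goes to training, an assumption that matches the split sizes printed further down; `split_sizes` is a hypothetical helper, not part of the `datasets` API.

```python
import math

def split_sizes(n_rows, holdout_frac=0.2, test_frac_of_holdout=0.5):
    """Return (train, validation, test) row counts for a two-stage split."""
    # Stage 1: hold out 20% of the rows (assumed to round up).
    holdout = math.ceil(n_rows * holdout_frac)
    train = n_rows - holdout
    # Stage 2: split the held-out rows in half.
    test = math.ceil(holdout * test_frac_of_holdout)
    validation = holdout - test
    return train, validation, test

print(split_sizes(5916))  # the dataset loaded above has 5,916 rows
```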





In [None]:
#Splitting the dataset
train_test_valid = dataset['train'].train_test_split(test_size=0.2, seed=42)
test_valid = train_test_valid['test'].train_test_split(test_size=0.5, seed=42)

train_ds = train_test_valid['train']
val_ds = test_valid['train']
test_ds = test_valid['test']

print(train_ds)
print(val_ds)
print(test_ds)

Dataset({
    features: ['instruction', 'input', 'response'],
    num_rows: 4732
})
Dataset({
    features: ['instruction', 'input', 'response'],
    num_rows: 592
})
Dataset({
    features: ['instruction', 'input', 'response'],
    num_rows: 592
})


# **Creating Unique Labels for Classification**
*   Extracts unique responses (answers) from the dataset.
*   Assigns a numerical label to each unique response for classification.
*   Saves the label mapping as a JSON file for later use.
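
A minimal sketch of this labeling scheme on a few made-up responses (the strings are placeholders, not rows from the dataset). It sorts the unique responses first, an assumption added here so that a response gets the same ID on every run, and it builds the inverse lookup that inference needs. The `toy_` names are hypothetical.

```python
import json

# Placeholder responses; one of them is duplicated on purpose.
toy_responses = [
    "Rotate crops each season.",
    "Test soil pH before liming.",
    "Rotate crops each season.",
    "Scout fields weekly for pests.",
]

# Sorting makes the response -> ID assignment reproducible across runs.
toy_labels = sorted(set(toy_responses))
toy_mapping = {label: idx for idx, label in enumerate(toy_labels)}

# Round-trip through JSON, mirroring how the mapping is saved to disk.
restored = json.loads(json.dumps(toy_mapping))

# Inverse mapping: class ID -> response text.
id_to_label = {idx: label for label, idx in restored.items()}
print(id_to_label[0])
```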

In [None]:
# extracting unique outputs for use as unique labels
# (sorted so that each response gets the same ID on every run; a bare set has no stable order)
unique_labels = sorted(set(train_ds['response'] + val_ds['response'] + test_ds['response']))

# label mapping
label_mapping = {label: idx for idx, label in enumerate(unique_labels)}

# saving the label mapping to a json file
label_mapping_path = "./label_mapping.json"
with open(label_mapping_path, 'w') as file:
    json.dump(label_mapping, file)


# **Creating Sentence Embeddings for Response Matching**


*  Loads Sentence-BERT (all-MiniLM-L6-v2) to encode text into vector embeddings.
*  Converts all possible responses into vector embeddings for fast similarity matching.
*  Saves the embeddings and model for later use in inference.



In [None]:
# Load a pre-trained sentence transformer model
sent_transf = SentenceTransformer('all-MiniLM-L6-v2')

# Encode all possible responses
response = sent_transf.encode(list(label_mapping.keys()))

# Save the response embeddings
response_path = "./response_embeddings.npy"
np.save(response_path, response)

# Save the semantic model
sent_transf_path = "./sent_transf"
os.makedirs(sent_transf_path, exist_ok=True)
sent_transf.save(sent_transf_path)



# **Implementing Semantic Search for Q&A Matching**
* Takes a user's question as input.
* Encodes the question using Sentence-BERT.
* Compares the question against stored response embeddings using cosine similarity.
* Returns the top-k most relevant answers.

This allows the chatbot to match user questions with the most relevant response, even if the wording differs.
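
The ranking step can be sketched without any model at all. The example below uses hypothetical 3-dimensional vectors in place of real Sentence-BERT embeddings and plain Python in place of scikit-learn; `cosine` and `top_k` are illustrative helpers, not the notebook's functions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, candidates, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical 3-d embeddings; real sentence embeddings have hundreds of dimensions.
toy_embeddings = {
    "Agroforestry integrates trees with crops.": [0.9, 0.1, 0.0],
    "Drip irrigation saves water.": [0.1, 0.9, 0.1],
    "Neem oil deters aphids.": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for an encoded question about agroforestry

for text, score in top_k(query, toy_embeddings):
    print(f"{score:.3f}  {text}")
```

Because cosine similarity compares direction rather than length, a differently worded question can still rank the right response first as long as its embedding points the same way.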



In [None]:
def semantic_search(query, top_k=5):
    # Encode
    query_embedding = sent_transf.encode([query])

    # Cosine similarity
    similarities = cosine_similarity(query_embedding, response)[0]

    # Get top-k similar responses
    top_indices = similarities.argsort()[-top_k:][::-1]

    return [(list(label_mapping.keys())[i], similarities[i]) for i in top_indices]


# **Mapping Text Labels to Numerical Labels**
* Converts each text response (output) into its numerical label using the saved mapping.
* Applies this mapping to train, validation, and test datasets.


In [None]:
# mapping the text labels to numerical labels
def map_label(example):
    example['label'] = label_mapping[example['response']]
    return example

# applying the label mappings
train_ds = train_ds.map(map_label)
val_ds = val_ds.map(map_label)
test_ds = test_ds.map(map_label)



# **Loading a Pre-trained BERT Model**
*   Loads the BERT tokenizer from the `bert-base-uncased` model, a general-purpose model trained on a large corpus of diverse text rather than on agricultural text specifically.
*   Since agricultural questions are mostly phrased in natural language and everyday terminology, this model can process general agricultural inquiries effectively.



In [None]:
# loading the BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")


# **DATA PRE-PROCESSING**

**Text Cleaning and Normalization:** The goal is to remove unwanted characters, reduce noise, and put the text into a uniform format before tokenization. Farmers' questions vary with region, dialect, and education, and often contain informal language, abbreviations, and spelling errors. The `clean_text` function below covers only the basic part of this: it lowercases the text, strips special characters (keeping periods, commas, question marks, and exclamation points), and collapses repeated whitespace. It does not correct spelling or expand abbreviations.

In [None]:
def clean_text(text):
    # Lowercase the text
    text = text.lower()
    # Remove special characters and punctuation (excluding some)
    text = re.sub(r'[^\w\s.,!?]', '', text)  # Keep commas, periods, question marks, and exclamation points
    # Remove extra whitespace
    text = re.sub(r'\s+', ' ', text).strip()

    return text
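
To see what this cleaning does in practice, the sketch below repeats the same three steps and applies them to a made-up question. Note the side effects: apostrophes, hyphens, and parentheses are dropped, so `N-P-K` becomes `npk`.

```python
import re

def clean_text(text):
    # Same steps as the function above: lowercase, strip symbols, collapse spaces.
    text = text.lower()
    text = re.sub(r'[^\w\s.,!?]', '', text)
    return re.sub(r'\s+', ' ', text).strip()

sample = "  What's the BEST N-P-K ratio for maize (Zea mays)?? "
print(clean_text(sample))
# whats the best npk ratio for maize zea mays??
```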

**Tokenization:** Tokenization splits text into smaller units (tokens) that the model can process. Because the model is BERT, `AutoTokenizer` loads the BERT tokenizer, which uses the WordPiece method.
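
The core idea of WordPiece can be sketched as a greedy longest-match-first split of each word. The vocabulary below is hypothetical and tiny; the real tokenizer also handles casing, punctuation, and special tokens, which this sketch leaves out.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a continuation piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces

# Hypothetical vocabulary; the real bert-base-uncased vocabulary has about 30,000 entries.
toy_vocab = {"agro", "##forest", "##ry", "soil", "##s"}

print(wordpiece("agroforestry", toy_vocab))
print(wordpiece("soils", toy_vocab))
print(wordpiece("maize", toy_vocab))
```

Splitting rare words into known pieces is what lets a general-purpose vocabulary cover domain terms it never stored as whole words.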

In [None]:
# tokenization
def tokenize(examples):

    # clean the instruction and input text fields
    cleaned_instructions = [clean_text(inst) for inst in examples['instruction']]
    cleaned_inputs = [clean_text(inp) for inp in examples['input']]

    # combine the cleaned instruction and input into a single string so the model sees both as context
    inputs = [f"{inst} {inp}" for inst, inp in zip(cleaned_instructions, cleaned_inputs)]

    # the BERT tokenizer splits the text into tokens, pads every sequence to max_length, and truncates longer ones
    tokenized_inputs = tokenizer(inputs, truncation=True, padding="max_length", max_length=512)

    # add the numeric label to the tokenized input
    tokenized_inputs["labels"] = examples["label"]
    return tokenized_inputs


# tokenize datasets
tokenized_train = train_ds.map(tokenize, batched=True)
tokenized_val = val_ds.map(tokenize, batched=True)
tokenized_test = test_ds.map(tokenize, batched=True)

# formatting for pytorch
tokenized_train.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])
tokenized_val.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])
tokenized_test.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])


In [None]:
#defining the model
num_labels = len(label_mapping)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=num_labels)

for param in model.parameters():
    if not param.is_contiguous():
        param.data = param.data.contiguous()


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
# define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    report_to="none"
)
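
How `warmup_steps` relates to the length of a run is easy to check by hand. The sketch assumes a single device, no gradient accumulation, and the 4,732 training rows from the split above; `total_steps` is a hypothetical helper. Batch sizes 16 and 32 are included because the hyperparameter search later in this notebook tries them.

```python
import math

def total_steps(n_rows, batch_size, epochs):
    """Optimizer steps for a run, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(n_rows / batch_size)
    return steps_per_epoch * epochs

WARMUP_STEPS = 500
for batch_size in (8, 16, 32):
    steps = total_steps(4732, batch_size, epochs=3)
    note = "warmup never finishes" if steps < WARMUP_STEPS else "warmup completes"
    print(f"batch_size={batch_size}: {steps} steps ({note})")
```

Under these assumptions a batch size of 32 gives fewer steps than the warmup length, so the learning rate would still be ramping up when training ends. If a 500-step logging interval applies, such a run would also never report a training loss.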



In [None]:
# defining a custom Trainer to calculate loss
class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        # Get labels from inputs dictionary
        labels = inputs.get("labels")
        if labels is None:
            raise ValueError("Labels should not be None")
        outputs = model(**inputs)
        logits = outputs.get("logits")

        logits = logits.contiguous()
        labels = labels.contiguous()

        loss = F.cross_entropy(logits, labels)

        return (loss, outputs) if return_outputs else loss

# initializing the trainer
trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_val,
)

# training the model
trainer.train()



In [None]:
# saving the model
model_path = "./saved_model"
os.makedirs(model_path, exist_ok=True)
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)

('./saved_model/tokenizer_config.json',
 './saved_model/special_tokens_map.json',
 './saved_model/vocab.txt',
 './saved_model/added_tokens.json',
 './saved_model/tokenizer.json')

In [None]:
# Evaluating the model
evaluation = trainer.evaluate()
evaluation

{'eval_loss': 7.1653828620910645,
 'eval_runtime': 15.9373,
 'eval_samples_per_second': 37.146,
 'eval_steps_per_second': 4.643,
 'epoch': 3.0}
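
To put an `eval_loss` of about 7.17 in context: the cross-entropy of a classifier that spreads its probability evenly over N classes is ln N, so exp(loss) can be read as the number of equally likely classes the model is, in effect, choosing among. A minimal sketch in plain Python (an illustration, not the notebook's PyTorch code):

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]

# A model with identical logits for N classes has loss ln(N).
n_classes = 1000
uniform_loss = cross_entropy([0.0] * n_classes, target=3)
print(uniform_loss, math.log(n_classes))

# Effective number of equally likely classes implied by the reported loss.
effective_classes = math.exp(7.1653828620910645)
print(effective_classes)
```

The last value comes out a little under 1,300, which suggests that after three epochs the classifier is still close to guessing among a very large set of responses.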

In [None]:
# Evaluate model on test data
predictions = trainer.predict(tokenized_test)

# Get the unique true label IDs present in the test split
unique_labels_in_data = np.unique(predictions.label_ids)

# Keep only the mapping entries whose ID occurs in the test split
filtered_label_mapping = {
    label: idx for label, idx in label_mapping.items() if idx in unique_labels_in_data
}

# Use the numeric IDs (as strings) as report names, since the response texts are long.
# Sorting by ID keeps target_names aligned with unique_labels_in_data.
target_names = [
    str(idx) for _label, idx in sorted(filtered_label_mapping.items(), key=lambda item: item[1])
]

# Function to compute metrics
def compute_metrics(pred):
    labels = pred.label_ids
    preds = np.argmax(pred.predictions, axis=1)

    # Ensure the report uses the correct label names
    report = classification_report(labels, preds, labels=unique_labels_in_data, target_names=target_names)
    conf_matrix = confusion_matrix(labels, preds, labels=unique_labels_in_data)
    return report, conf_matrix

# Compute the metrics
report, conf_matrix = compute_metrics(predictions)
print("Classification Report:\n", report)
print("Confusion Matrix:\n", conf_matrix)


Classification Report:
               precision    recall  f1-score   support

           1       0.00      0.00      0.00         1
           3       0.00      0.00      0.00         1
           6       0.00      0.00      0.00         1
          19       0.00      0.00      0.00         1
          23       0.00      0.00      0.00         1
          26       0.00      0.00      0.00         1
          31       0.00      0.00      0.00         1
          46       0.00      0.00      0.00         1
          48       0.00      0.00      0.00         1
          50       0.00      0.00      0.00         1
          56       0.00      0.00      0.00         1
          59       0.00      0.00      0.00         2
          62       0.00      0.00      0.00         1
          66       0.00      0.00      0.00         1
          69       0.00      0.00      0.00         1
          71       0.00      0.00      0.00         1
          81       0.00      0.00      0.00         2
   

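
With most classes appearing only once or twice in the test split, a single top-1 accuracy figure may be easier to read than a per-class report. A minimal sketch on made-up logits (the numbers are illustrative, not model output; `top1_accuracy` is a hypothetical helper):

```python
def top1_accuracy(logit_rows, true_ids):
    """Fraction of rows whose highest-scoring class equals the true class."""
    correct = 0
    for logits, true_id in zip(logit_rows, true_ids):
        predicted_id = max(range(len(logits)), key=logits.__getitem__)
        correct += predicted_id == true_id
    return correct / len(true_ids)

toy_logits = [
    [2.0, 0.1, -1.0],  # predicts class 0
    [0.3, 0.2, 1.5],   # predicts class 2
    [0.0, 0.9, 0.4],   # predicts class 1
    [1.2, 1.1, 0.0],   # predicts class 0
]
toy_true = [0, 2, 2, 1]

print(top1_accuracy(toy_logits, toy_true))  # 2 of 4 correct -> 0.5
```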


In [None]:
# Generating a response category
def predict_category(instruction, input_text):

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    # Clean the instruction and input text
    cleaned_instruction = clean_text(instruction)
    cleaned_input = clean_text(input_text)

    # Getting the BERT prediction
    inputs = tokenizer(f"{cleaned_instruction} {cleaned_input}", return_tensors="pt", truncation=True, padding="max_length", max_length=512)
    inputs = {k: v.to(device) for k, v in inputs.items()}

    model.eval()  # disable dropout for inference
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_id = torch.argmax(logits, dim=-1).item()
    # label IDs were assigned by enumerating unique_labels, so the ID indexes that list directly
    bert_prediction = unique_labels[predicted_id]

    # Getting the semantic search results
    semantic_results = semantic_search(f"{cleaned_instruction} {cleaned_input}")

    if bert_prediction in [result[0] for result in semantic_results]:
        return bert_prediction
    else:
        return semantic_results[0][0]



instruction_input = "Answer the following question."
sample_input = "What is agroforestry?"
predicted_label = predict_category(instruction_input, sample_input)
print(f"Instruction: {instruction_input}")
print(f"Input: {sample_input}")
print(f"Predicted Response: {predicted_label}")

Instruction: Answer the following question.
Input: What is agroforestry?
Predicted Response: Agroforestry involves the integration of trees with crops and/or livestock, which can help to increase soil organic matter, reduce erosion, and improve soil structure.
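
The decision rule that combines the two components can be isolated from the models. In the sketch below, `choose_response` is a hypothetical helper and the ranked list is made up; it keeps the classifier's answer only when semantic search also places it in its top results.

```python
def choose_response(classifier_prediction, semantic_results):
    """Keep the classifier's answer only if semantic search also ranks it in the top k."""
    top_texts = [text for text, _score in semantic_results]
    if classifier_prediction in top_texts:
        return classifier_prediction
    return semantic_results[0][0]  # otherwise fall back to the best semantic match

# Hypothetical (text, similarity) pairs, best match first.
ranked = [("answer A", 0.91), ("answer B", 0.85), ("answer C", 0.40)]

print(choose_response("answer B", ranked))  # both agree -> "answer B"
print(choose_response("answer Z", ranked))  # not in the top k -> "answer A"
```

One consequence of this rule is that the final answer always comes from the semantic top-k list, so a weak classifier can only reorder those candidates, never introduce a new one.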


In [None]:
# Load GPT-2 model and tokenizer for Perplexity calculation
gpt2_model = GPT2LMHeadModel.from_pretrained("gpt2")
gpt2_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# Ensure models are on the correct device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
gpt2_model.to(device)  # Move GPT-2 model to device

def calculate_perplexity(text):
    input_ids = gpt2_tokenizer.encode(text, return_tensors='pt').to(device)  # Move input to device
    with torch.no_grad():
        outputs = gpt2_model(input_ids, labels=input_ids)
    loss = outputs.loss
    perplexity = torch.exp(loss).item()
    return perplexity

# Function to calculate BLEU, ROUGE, and Perplexity scores
def calculate_metrics(prediction, true_output):
    bleu_score = sentence_bleu([true_output.split()], prediction.split())
    rouge_scorer_obj = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
    rouge_scores = rouge_scorer_obj.score(true_output, prediction)
    perplexity = calculate_perplexity(prediction)
    metrics = {
        "BLEU": bleu_score,
        "ROUGE-1": rouge_scores['rouge1'].fmeasure,
        "ROUGE-L": rouge_scores['rougeL'].fmeasure,
        "Perplexity": perplexity
    }
    return metrics

# Function to generate response category and calculate metrics
def predict_category(instruction, input_text, true_output):
    # Clean the instruction and input text
    cleaned_instruction = clean_text(instruction)
    cleaned_input = clean_text(input_text)

    # Get BERT prediction
    inputs = tokenizer(
        f"{cleaned_instruction} {cleaned_input}",
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=512
    ).to(device)  # Move inputs to device

    model.to(device)  # Move BERT model to device

    model.eval()  # disable dropout for inference
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_id = torch.argmax(logits, dim=-1).item()
    # label IDs were assigned by enumerating unique_labels, so the ID indexes that list directly
    bert_prediction = unique_labels[predicted_class_id]

    # Get semantic search results
    semantic_results = semantic_search(f"{cleaned_instruction} {cleaned_input}")

    # If BERT prediction is in top semantic results, return it; otherwise, return top semantic result
    if bert_prediction in [result[0] for result in semantic_results]:
        prediction = bert_prediction
    else:
        prediction = semantic_results[0][0]  # Return the top semantic search result

    # Calculate metrics
    metrics = calculate_metrics(prediction, true_output)

    return prediction, metrics

# Test the model with a sample input
sample_instruction = "Answer the following question"
sample_input = "What are some techniques for reducing nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation?"
true_output = "Reducing nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation involves practices such as optimizing flood and drain cycles to minimize excess nutrient solution runoff and leaching, implementing recirculating nutrient systems to capture and reuse drained nutrient solution, and adjusting nutrient solution formulations to match plant uptake rates and minimize waste. Implementing proper system drainage and aeration to prevent waterlogging and promote oxygenation of root zones, utilizing root zone barriers or substrates with high water retention capacity to prevent nutrient solution migration, and monitoring nutrient solution EC and pH levels to prevent nutrient imbalances can also help reduce nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation."
predicted_label, metrics = predict_category(sample_instruction, sample_input, true_output)
print(f"Instruction: {sample_instruction}")
print(f"Input: {sample_input}")
print(f"True Output: {true_output}")
print(f"Predicted Response: {predicted_label}")
print(f"Metrics: {metrics}")


Instruction: Answer the following question
Input: What are some techniques for reducing nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation?
True Output: Reducing nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation involves practices such as optimizing flood and drain cycles to minimize excess nutrient solution runoff and leaching, implementing recirculating nutrient systems to capture and reuse drained nutrient solution, and adjusting nutrient solution formulations to match plant uptake rates and minimize waste. Implementing proper system drainage and aeration to prevent waterlogging and promote oxygenation of root zones, utilizing root zone barriers or substrates with high water retention capacity to prevent nutrient solution migration, and monitoring nutrient solution EC and pH levels to prevent nutrient imbalances can also help reduce nutrient leaching in ebb and flow hydroponic systems for lettuce cultivation.
Predicted Response: R
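
Two of the metrics used above reduce to short formulas. The sketch below shows perplexity as the exponential of the mean negative log-probability, and a ROUGE-1-style F-measure on whitespace tokens. Both helpers are simplified stand-ins written for illustration (no GPT-2, no stemming), and the probabilities and sentences are made up.

```python
import math
from collections import Counter

def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to each token."""
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

def unigram_f1(reference, prediction):
    """ROUGE-1-style F-measure on whitespace tokens (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Made-up per-token probabilities from a language model.
print(perplexity([0.5, 0.25, 0.125]))

print(unigram_f1("reduce nutrient leaching in hydroponic systems",
                 "reduce leaching in systems"))
```

Since the chatbot returns responses retrieved from the dataset, perplexity here reflects how fluent the retrieved text looks to the language model, not whether it answers the question; the overlap metrics are the ones that compare against the reference.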

In [None]:
import torch
import torch.nn.functional as F
from transformers import Trainer, TrainingArguments  # AdamW was imported but never used
import numpy as np

# Define a set of hyperparameters to tune
hyperparameter_sets = [
    {"learning_rate": 1e-5, "batch_size": 8, "weight_decay": 0.01},
    {"learning_rate": 1e-5, "batch_size": 16, "weight_decay": 0.01},
    {"learning_rate": 1e-5, "batch_size": 32, "weight_decay": 0.05},
]

best_loss = float("inf")
best_params = None

# Define your custom Trainer class outside of the loop
class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.get("labels")
        if labels is None:
            raise ValueError("Labels should not be None")
        outputs = model(**inputs)
        logits = outputs.get("logits")

        # Make sure logits and labels are contiguous for cross entropy
        logits = logits.contiguous()
        labels = labels.contiguous()

        loss = F.cross_entropy(logits, labels)
        return (loss, outputs) if return_outputs else loss

# Iterate over different hyperparameters
for params in hyperparameter_sets:
    print(f"Training with: {params}")

    training_args = TrainingArguments(
        output_dir="./results",
        num_train_epochs=3,
        per_device_train_batch_size=params["batch_size"],
        per_device_eval_batch_size=params["batch_size"],
        warmup_steps=500,
        weight_decay=params["weight_decay"],
        logging_dir="./logs",
        # You might want to change these to the new parameter names if available:
        evaluation_strategy="epoch",  # (Deprecated; consider using `eval_strategy`)
        save_strategy="epoch",
        load_best_model_at_end=True,
        learning_rate=params["learning_rate"],
        report_to="none"
    )

    # Initialize the custom trainer for each set of hyperparameters.
    # Note: the same `model` object is reused, so each run continues from the
    # weights left by the previous run instead of starting from a fresh checkpoint.
    trainer = CustomTrainer(
        model=model,
        args=training_args,
        train_dataset=tokenized_train,
        eval_dataset=tokenized_val,
    )

    trainer.train()

    # Evaluate the model
    eval_result = trainer.evaluate()
    # No compute_metrics function is passed to the Trainer, so the validation
    # loss (lower is better) is the only metric available to compare runs.
    val_loss = eval_result.get("eval_loss", None)

    print(f"Validation loss for {params}: {val_loss}")

    # Keep track of the parameters with the lowest validation loss
    if val_loss is not None and val_loss < best_loss:
        best_loss = val_loss
        best_params = params

# Print the best hyperparameters
print("\nBest Hyperparameters:")
print(best_params)
print(f"Best Validation Loss: {best_loss}")


Training with: {'learning_rate': 1e-05, 'batch_size': 8, 'weight_decay': 0.01}

Epoch,Training Loss,Validation Loss
1,6.6584,7.098837
2,6.4803,6.996778
3,6.3361,6.968686

Validation loss for {'learning_rate': 1e-05, 'batch_size': 8, 'weight_decay': 0.01}: 6.968685626983643
Training with: {'learning_rate': 1e-05, 'batch_size': 16, 'weight_decay': 0.01}

Epoch,Training Loss,Validation Loss
1,No log,6.903368
2,6.355900,6.839574
3,6.355900,6.824488

Validation loss for {'learning_rate': 1e-05, 'batch_size': 16, 'weight_decay': 0.01}: 6.824488162994385
Training with: {'learning_rate': 1e-05, 'batch_size': 32, 'weight_decay': 0.05}

Epoch,Training Loss,Validation Loss
1,No log,6.792413
2,No log,6.768473
3,No log,6.749439

Validation loss for {'learning_rate': 1e-05, 'batch_size': 32, 'weight_decay': 0.05}: 6.749438762664795

Best Hyperparameters:
{'learning_rate': 1e-05, 'batch_size': 32, 'weight_decay': 0.05}
Best Validation Loss: 6.749438762664795
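
Since `eval_loss` is a loss, the run to prefer is the one with the smallest value. A minimal sketch of that selection rule in isolation, using the three final validation losses logged above. Because all three runs share one `model` object, later runs start from already-trained weights, so the comparison is indicative only.

```python
# Final validation losses copied from the training logs above.
results = [
    ({"learning_rate": 1e-5, "batch_size": 8, "weight_decay": 0.01}, 6.968685626983643),
    ({"learning_rate": 1e-5, "batch_size": 16, "weight_decay": 0.01}, 6.824488162994385),
    ({"learning_rate": 1e-5, "batch_size": 32, "weight_decay": 0.05}, 6.749438762664795),
]

# A loss is better when it is smaller, so select with min(), not max().
best_params, best_loss = min(results, key=lambda item: item[1])
print(best_params, best_loss)
```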
