# ***Installing Dependencies***

In [1]:
!pip install --upgrade transformers datasets evaluate rouge_score bert_score nltk

import nltk
# Download required NLTK data for METEOR
nltk.download('wordnet')
nltk.download('punkt')

Collecting datasets
  Downloading datasets-4.4.1-py3-none-any.whl.metadata (19 kB)
Collecting evaluate
  Downloading evaluate-0.4.6-py3-none-any.whl.metadata (9.5 kB)
Collecting rouge_score
  Downloading rouge_score-0.1.2.tar.gz (17 kB)
  Preparing metadata (setup.py) ... done
Collecting bert_score
  Downloading bert_score-0.3.13-py3-none-any.whl.metadata (15 kB)
Collecting nltk
  Downloading nltk-3.9.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pyarrow>=21.0.0 (from datasets)
  Downloading pyarrow-22.0.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (3.2 kB)
Downloading datasets-4.4.1-py3-none-any.whl (511 kB)
Downloading evaluate-0.4.6-py3-none-any.whl (84 kB)
Downloading bert_score-0.3.13-py3-none-any.whl (61 kB)

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

First, `!pip install --upgrade transformers datasets evaluate rouge_score bert_score nltk` installs the Python packages needed for our summarization task and upgrades any that are already present. The leading `!` tells the notebook to run the line as a shell command rather than as Python. The packages include `transformers` for the summarization model, `datasets` for loading and preparing data, `evaluate` for computing metrics, and `rouge_score`, `bert_score`, and `nltk`, which back the individual metrics used to check summary quality.

Next, `import nltk` imports the NLTK library into the project so we can use its tools for text processing. We’re including it because some evaluation methods, like METEOR, depend on NLTK’s functions to compare text. It allows the program to handle words, sentences, and grammar more effectively during evaluation.

Then `nltk.download('wordnet')` downloads WordNet, a lexical database that groups English words by meaning. METEOR uses it to match synonyms between a generated summary and its reference, so without this download the metric stops with a missing-resource error.

Lastly, `nltk.download('punkt')` downloads the "punkt" tokenizer data, which NLTK uses to split text into sentences and words. We download it because token-based metrics need that splitting before they can compare a generated summary with its reference. Recent NLTK releases may look for a resource named `punkt_tab` instead, so if a tokenizer lookup error appears later, running `nltk.download('punkt_tab')` as well is a reasonable first thing to try.

In [2]:
!pip install textstat

Collecting textstat
  Downloading textstat-0.7.11-py3-none-any.whl.metadata (15 kB)
Collecting pyphen (from textstat)
  Downloading pyphen-0.17.2-py3-none-any.whl.metadata (3.2 kB)
Downloading textstat-0.7.11-py3-none-any.whl (176 kB)
Downloading pyphen-0.17.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pyphen, textstat
Successfully installed pyphen-0.17.2 textstat-0.7.11


This line is a command that installs the textstat library into the environment. We’re using it so the program can measure how easy or hard a piece of text is to read. The textstat package gives us several ways to check readability, such as scoring a summary’s simplicity or complexity. By installing it, we make sure the program can later calculate different reading scores, which help us understand the quality of the generated summaries.
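To make the idea of a reading score concrete, here is a minimal sketch of the Flesch Reading Ease formula, the score this notebook later requests from textstat. The helper names and the vowel-group syllable counter are assumptions for illustration only; textstat counts syllables more carefully, so its numbers will differ from this sketch.

```python
import re

def count_syllables(word: str) -> int:
    """Rough estimate (hypothetical helper): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease:
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words).
    Higher scores mean easier text.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

simple = "The cat sat on the mat. It was a warm day."
dense = "Institutional investors anticipated considerable macroeconomic deterioration."

# Short sentences with short words score higher (easier) than long, dense ones.
print(round(flesch_reading_ease(simple), 1))
print(round(flesch_reading_ease(dense), 1))
```

Short words and short sentences push the score up, which is why a headline-style summary tends to score as easier to read than a dense technical sentence.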

In [3]:
!pip install textstat datasets transformers



This cell installs `textstat`, `datasets`, and `transformers` in one command. All three were already installed by the two cells above, so pip finds them already satisfied and the cell can be skipped. As a reminder, `textstat` scores readability, `datasets` loads and prepares the data, and `transformers` provides the summarization model and tokenizer.

# ***Setting up the Environment and Loading Data***

In [4]:
import pandas as pd
from datasets import Dataset, DatasetDict
import re # Import the regular expression library

# --- (A) CREATE A CLEANING FUNCTION ---
def clean_text(text):
    if not isinstance(text, str): # Handle potential non-string data
        return ""
    text = text.lower()
    text = re.sub(r'<.*?>', '', text)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub(r'\s+', ' ', text).strip()
    return text

# --- 1. Load Your Custom Dataset ---
try:
    # Read the file as UTF-8, the usual encoding for Kaggle datasets
    df = pd.read_csv('news-article-categories.csv', encoding='utf-8')
    print("Successfully loaded 'news-article-categories.csv'")

except FileNotFoundError:
    print("Error: 'news-article-categories.csv' not found.")
    df = None # Set df to None if file not found

if df is not None:
    # --- 2. Preprocess and Prepare the Dataset ---
    # Select the article body and title columns from the dataset
    df = df[['body', 'title']]
    # Rename them to the standard names the rest of the script expects ('text' and 'summary')
    df.columns = ['text', 'summary']

    # Handle potential missing values in the new dataset
    df.dropna(inplace=True)

    # --- (B) APPLY THE CLEANING FUNCTION TO YOUR DATA ---
    print("\n--- Applying preprocessing to the dataset ---")
    df['text'] = df['text'].apply(clean_text)
    df['summary'] = df['summary'].apply(clean_text)
    print("Preprocessing complete. Example of cleaned article:")
    print(df.iloc[0]['text'])

    # --- 3. Convert to a Hugging Face Dataset ---
    hg_dataset = Dataset.from_pandas(df)

    # --- 4. Split into Training and Validation Sets ---
    train_test_split = hg_dataset.train_test_split(test_size=0.2)
    dataset = DatasetDict({
        'train': train_test_split['train'],
        'test': train_test_split['test']
    })

    print("\nDataset structure:")
    print(dataset)

Successfully loaded 'news-article-categories.csv'

--- Applying preprocessing to the dataset ---
Preprocessing complete. Example of cleaned article:

Dataset structure:
DatasetDict({
    train: Dataset({
        features: ['text', 'summary', '__index_level_0__'],
        num_rows: 5497
    })
    test: Dataset({
        features: ['text', 'summary', '__index_level_0__'],
        num_rows: 1375
    })
})


This `import pandas as pd` command brings in the pandas library and gives it the short name "pd" so that it is easier to use in the code. Pandas makes it possible for the program to work with data in tables, which is similar to how spreadsheets work. We can read files, clean the data, and get it ready for later steps like training a model with this library.

This line of code, `from datasets import Dataset, DatasetDict`, brings in two tools from the datasets library: Dataset and DatasetDict. These are used to change pandas data into a format that text models can better understand. It also lets us divide the data into different sets, like groups for training and testing, to make it easier to keep track of and check performance.

This `import re` imports the re library, which helps the program work with text patterns. It allows us to find and remove certain parts of a text, like links, symbols, or tags, using simple pattern rules. This makes cleaning messy text easier and ensures the data looks consistent.

This line, `def clean_text(text):` starts a function called clean_text that is meant to clean and fix the text before the model uses it. The function will take one piece of text at a time and give it back in a cleaner form. It's a simple but important step to make sure the data is clean and ready.

The `if not isinstance(text, str):` check catches values that are not strings, such as numbers or missing values. Calling string methods like `.lower()` on those values would raise an error, so the function handles them before any cleaning starts.

In that case, `return ""` sends back an empty string. The row itself stays in the dataset, but its text becomes blank instead of crashing the cleaning step.

This `text = text.lower()` command makes all the letters in the text lowercase. This helps make sure that the words "News" and "news" are treated the same way. It's a small but useful step toward making the data more consistent.

This line of code, `text = re.sub(r'<.*?>', '', text)`, removes HTML tags from the text. Scraped articles often contain tags like `<p>` or `<br>` that carry no meaning for the model. The `.*?` part matches as few characters as possible, so each tag is removed on its own and the text between tags is kept.

This `text = re.sub(r'https?://\S+|www\.\S+', '', text)` takes out any links to websites that are in the text. Links don't help us understand what an article means, so they are removed. This keeps the text focused on the main points.

The line `text = re.sub(r'\s+', ' ', text).strip()` replaces every run of spaces, tabs, or newlines with a single space. The trailing `.strip()` then removes any whitespace left at the beginning or end of the text. The result is one tidy line of text per article.

The `return text` finishes the cleaning process and sends back the cleaned text. We can then use the function anywhere in the code to get text ready for analysis. It makes sure that all the words are neat and in order.
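As a quick check of the steps described above, the sketch below repeats the same `clean_text` function and runs it on a made-up string. The sample text and URL are invented for illustration.

```python
import re

def clean_text(text):
    """Same cleaning steps as the notebook cell above."""
    if not isinstance(text, str):  # non-string input becomes an empty string
        return ""
    text = text.lower()                                 # normalize case
    text = re.sub(r'<.*?>', '', text)                   # drop HTML tags
    text = re.sub(r'https?://\S+|www\.\S+', '', text)   # drop URLs
    text = re.sub(r'\s+', ' ', text).strip()            # collapse whitespace
    return text

# Hypothetical sample input, not taken from the dataset
raw = "<p>Breaking   NEWS:</p>\nRead more at https://example.com/story today"

print(clean_text(raw))         # breaking news: read more at today
print(repr(clean_text(None)))  # ''
```

Note that removing the link leaves the fragment "read more at today" behind. The function deletes URLs but not the words around them, which is worth keeping in mind when inspecting cleaned articles.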

This `df = pd.read_csv('news-article-categories.csv', encoding='utf-8')` reads a CSV file named news-article-categories.csv using pandas. It loads all the data into a table format called a DataFrame. The encoding “utf-8” makes sure special characters like accents or symbols are read properly.

This `print("Successfully loaded 'news-article-categories.csv'")` just prints a message to say that the file was loaded without any problems. It lets the user know that the file is in the right place and is ready to be used. It's helpful to get confirmation messages like this to make sure that each step works as it should.

This `except FileNotFoundError:` part takes care of what happens when the file isn't in the folder. It tells the program what to do if the file is missing so it doesn't just stop working. This lets the program keep running smoothly and let the user know about the problem.

This `print("Error: 'news-article-categories.csv' not found.")` tells the user that the file could not be found. It helps the user figure out what went wrong so they can fix it. Giving clear error messages keeps testing from getting confusing.

This `df = None` marks the DataFrame as missing. The later `if df is not None:` check uses it to skip the preprocessing steps instead of failing with a `NameError`. Cells further down that need `dataset` will still fail, so the file has to be supplied before continuing.

This `df = df[['body', 'title']]` keeps only the `body` and `title` columns from the dataset. The body holds the full article, and the title serves as its reference summary, since a headline is a very short summary of the article. Dropping the other columns leaves just the input and the target that training needs.

This line of code, `df.columns = ['text', 'summary']`, renames the two columns: `body` becomes `text` and `title` becomes `summary`. Later parts of the script expect these names, and they make the role of each column obvious.

This `df.dropna(inplace=True)` takes out any rows from the dataset that have empty or missing values. It makes sure that the model is only trained on complete data. This step helps keep the model from trying to read text that isn't there.

This `df['text'] = df['text'].apply(clean_text)` uses the clean_text function to clean every article in the text column. Each piece of text goes through the cleaning process we defined earlier. This makes all the articles neat, readable, and uniform.

This `df['summary'] = df['summary'].apply(clean_text)` does the same cleaning process but for the summary column. It ensures that even the short summaries are free from unwanted symbols or spaces. Both columns now have clean and ready-to-use text.

This line of code, `hg_dataset = Dataset.from_pandas(df)`, turns the cleaned pandas DataFrame into a Hugging Face Dataset. When using machine learning tools, the new format is easier to work with. It also makes it easier and faster to work with data during training.

This line of code, `train_test_split = hg_dataset.train_test_split(test_size=0.2)`, splits the dataset into two parts: one for training and one for testing. The test_size=0.2 setting means that 20% of the data will be used for testing. This helps the program see how well the model works on data it hasn't seen before.
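The arithmetic behind an 80/20 split can be sketched with the standard library alone. The function below is a hypothetical stand-in, not the datasets implementation: it shuffles row indices with a fixed seed and sets aside 20% of them, rounded up. The total of 6,872 rows is inferred from the 5,497 and 1,375 rows reported in the cell output, and the `seed` value is an assumption, since the notebook's call passes no seed and therefore produces a different split on each run.

```python
import math
import random

def split_indices(n_rows: int, test_size: float = 0.2, seed: int = 42):
    """Shuffle row indices reproducibly and reserve a test fraction."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)     # same seed -> same shuffle
    n_test = math.ceil(n_rows * test_size)   # test share, rounded up
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(6872)
print(len(train_idx), len(test_idx))  # 5497 1375
```

Fixing the seed makes results comparable between runs, because every run then trains and evaluates on the same rows.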

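Lastly, `print(dataset)` shows the structure of the prepared dataset: the split names, the columns in each split, and the number of rows. In the output above, the training set has 5,497 rows and the test set has 1,375. The extra `__index_level_0__` column is the old pandas index, which `Dataset.from_pandas` carries over when the index is no longer a simple 0 to n-1 range, as happens after `dropna` removes rows.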



# ***Tokenization***

In [5]:
from transformers import AutoTokenizer

# --- 4. Define the Model Checkpoint ---
# BART base checkpoint on the Hugging Face Hub
model_checkpoint = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

# --- 5. Create a BART-Specific Preprocessing Function ---
def preprocess_function(examples):
    # Tokenize the inputs
    model_inputs = tokenizer(examples["text"], max_length=1024, truncation=True)

    # Tokenize the target summaries (labels)
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["summary"], max_length=128, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# --- 6. Apply the Tokenization ---
dataset = dataset.filter(lambda x: len(x["text"].split()) < 500)
tokenized_datasets = dataset.map(preprocess_function, batched=True)
print("\nSample of tokenized data prepared for BART:")
print(tokenized_datasets['train'][0].keys())

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Filter:   0%|          | 0/5497 [00:00<?, ? examples/s]

Filter:   0%|          | 0/1375 [00:00<?, ? examples/s]

Map:   0%|          | 0/2797 [00:00<?, ? examples/s]



Map:   0%|          | 0/730 [00:00<?, ? examples/s]


Sample of tokenized data prepared for BART:
dict_keys(['text', 'summary', '__index_level_0__', 'input_ids', 'attention_mask', 'labels'])


This `from transformers import AutoTokenizer` brings in the `AutoTokenizer` class from the transformers library. Given a model name, it loads the matching tokenizer automatically, so we do not have to know which tokenizer class BART uses. We use it below to turn articles and summaries into the token IDs the model understands.

This line of code, `model_checkpoint = "facebook/bart-base"`, stores the model name "facebook/bart-base" in a variable called `model_checkpoint`. It tells the program which version of BART we want to fine-tune for summarization. Keeping the name in one variable also means the model can be swapped later by changing a single line.

We are making a tokenizer with the model name we saved earlier in the line `tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)`. The "from_pretrained" part means we're getting the tokenizer that comes with the "facebook/bart-base" model. This line makes sure that the tokenizer will handle text in the same way that the model expects it to be set up.

This line, `def preprocess_function(examples):`, starts a function called `preprocess_function`. It tokenizes a batch of samples before they are sent to the model. Its one input, `examples`, holds the article texts and their summaries for the whole batch.

This line of code, `model_inputs = tokenizer(examples["text"], max_length=1024, truncation=True)`, uses the tokenizer to convert each article into token IDs the model can read. `max_length=1024` caps each article at 1,024 tokens, the longest input `facebook/bart-base` accepts. With `truncation=True`, any tokens beyond that limit are cut off.
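The effect of `max_length` with truncation can be illustrated with a toy tokenizer. This is only a sketch under loud assumptions: `toy_tokenize` is a hypothetical helper that treats whitespace-separated words as tokens, whereas BART's real tokenizer splits text into subword pieces, so real token counts are usually higher than word counts. What it does mirror is that the start and end markers count toward the limit.

```python
def toy_tokenize(text: str, max_length: int) -> list[str]:
    """Toy stand-in for truncation: words play the role of tokens."""
    words = text.split()
    body = words[: max_length - 2]    # reserve two slots for <s> and </s>
    return ["<s>"] + body + ["</s>"]  # markers survive truncation

article = "one two three four five six seven eight"

print(toy_tokenize(article, max_length=5))
# ['<s>', 'one', 'two', 'three', '</s>']

print(len(toy_tokenize(article, max_length=1024)))  # 10, nothing is cut
```

Anything past the limit is simply dropped, so for very long articles the model never sees the ending.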

The line `with tokenizer.as_target_tokenizer():` switches the tokenizer into target mode for the summaries. Some models tokenize inputs and targets differently, and this context manager makes sure the target settings apply inside the block. It is deprecated in recent versions of transformers, which recommend passing the summaries through the `text_target` argument of the tokenizer call instead.

The tokenizer then turns the summaries into tokens in this line: `labels = tokenizer(examples["summary"], max_length=128, truncation=True)`. We set a lower maximum length of 128 because the summaries here are titles, which are much shorter than the articles. Truncation again cuts off any tokens beyond the limit.

This line of code, `model_inputs["labels"] = labels["input_ids"]`, links the tokenized summaries (labels) to the tokenized text data we made earlier. It basically connects the input tokens of each article to the tokens that make up its summary. This pairing is important because it helps the model learn which summary goes with which article.

Finally, `return model_inputs` sends back the prepared batch. It contains the tokenized articles together with their tokenized summaries under the `labels` key, which is the form the trainer expects.

This line, `dataset = dataset.filter(lambda x: len(x["text"].split()) < 500)`, keeps only articles with fewer than 500 words, counted by splitting on whitespace. The goal is to drop very long entries that slow down training. In the progress output above, this reduces the training set from 5,497 to 2,797 rows and the test set from 1,375 to 730.
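The same word-count test can be tried on plain Python dictionaries. The rows below are made up for illustration, and `is_short_enough` is a hypothetical name for the predicate that the notebook writes as a lambda.

```python
def is_short_enough(example: dict, max_words: int = 500) -> bool:
    """Keep a row only if its text has fewer than max_words words."""
    return len(example["text"].split()) < max_words

# Hypothetical rows: one 20-word article and one 600-word article
rows = [
    {"text": "short article " * 10, "summary": "short"},
    {"text": "word " * 600, "summary": "long"},
]

kept = [row for row in rows if is_short_enough(row)]
print([row["summary"] for row in kept])  # ['short']
```

The comparison is strict, so an article with exactly 500 words is dropped. Word counts are also not token counts, so this filter limits article length only approximately from the tokenizer's point of view.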

In this line, `tokenized_datasets = dataset.map(preprocess_function, batched=True)`, we apply `preprocess_function` to the whole dataset. With `batched=True`, the function receives many rows at once instead of one at a time, which speeds up tokenization. The result is a new dataset that keeps the original columns and adds `input_ids`, `attention_mask`, and `labels`.
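What "batched" means for the function's input can be sketched without the datasets library. In the toy version below, `batched_map` and `toy_preprocess` are hypothetical helpers: rows are grouped into chunks, each chunk is turned into a dictionary of lists, and the function's output lists are merged back into the rows as new columns. It is an illustration of the calling convention, not the library's implementation.

```python
def toy_preprocess(examples: dict) -> dict:
    """Receives a dict of lists and returns new columns as lists."""
    return {"n_words": [len(text.split()) for text in examples["text"]]}

def batched_map(rows: list[dict], fn, batch_size: int = 2) -> list[dict]:
    """Apply fn to chunks of rows, passing each chunk as a dict of lists."""
    out_rows = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        result = fn(batch)
        for i, row in enumerate(chunk):
            new_row = dict(row)               # keep the original columns
            for key, values in result.items():
                new_row[key] = values[i]      # add the new column value
            out_rows.append(new_row)
    return out_rows

rows = [
    {"text": "a b c", "summary": "x"},
    {"text": "d e", "summary": "y"},
    {"text": "f", "summary": "z"},
]
print(batched_map(rows, toy_preprocess))
```

This is why `preprocess_function` indexes `examples["text"]` and gets a list of articles back: inside a batched call, every column arrives as a list.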

This `print("\nSample of tokenized data prepared for BART:")` command sends a message to the screen so we can see that tokenization has been done. It lets the user know that the process has reached this point successfully. The \n adds a line break before the message to make it easier to read.

Lastly, this `print(tokenized_datasets['train'][0].keys())` shows the keys or labels of the first item in the tokenized training dataset. We can see what kind of information is in each entry, like "input_ids" or "labels." This printout shows that the preprocessing worked and that the structure looks the way it should.

# ***Model Training***

## ***Fine-Tuning the Model***

In [6]:
# --- INSTALL REQUIRED LIBRARIES FOR HYPERPARAMETER SEARCH ---
!pip install optuna

Collecting optuna
  Downloading optuna-4.5.0-py3-none-any.whl.metadata (17 kB)
Collecting colorlog (from optuna)
  Downloading colorlog-6.10.1-py3-none-any.whl.metadata (11 kB)
Downloading optuna-4.5.0-py3-none-any.whl (400 kB)
Downloading colorlog-6.10.1-py3-none-any.whl (11 kB)
Installing collected packages: colorlog, optuna
Successfully installed colorlog-6.10.1 optuna-4.5.0


In [7]:
import transformers
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer
import numpy as np
import textstat
import evaluate
import optuna # Import Optuna for hyperparameter search

print("Transformers library version:", transformers.__version__)

# --- Model Checkpoint ---
model_checkpoint = "facebook/bart-base"

# --- 1. DEFINE A MODEL INITIALIZATION FUNCTION ---
def model_init():
    return AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

# --- Initialize ROUGE metric ---
rouge = evaluate.load("rouge")

# --- 2. DEFINE METRICS ---
def compute_metrics(eval_pred):
    predictions, labels = eval_pred

    if predictions.ndim > 2:
        predictions = predictions[:, 0, :]

    # Decode
    decoded_preds = []
    for pred in predictions:
        pred = np.clip(pred, 0, tokenizer.vocab_size - 1)
        text = tokenizer.decode(pred, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        decoded_preds.append(text)

    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = [tokenizer.decode(l, skip_special_tokens=True, clean_up_tokenization_spaces=True) for l in labels]

    # --- ROUGE ---
    filtered_preds_labels = [(p, l) for p, l in zip(decoded_preds, decoded_labels) if p.strip() and l.strip()]
    if filtered_preds_labels:
        filtered_preds, filtered_labels = zip(*filtered_preds_labels)
        rouge_scores = rouge.compute(
            predictions=list(filtered_preds),
            references=list(filtered_labels),
            use_stemmer=True
        )
        rouge1 = rouge_scores["rouge1"] * 100
        rouge2 = rouge_scores["rouge2"] * 100
        rougeL = rouge_scores["rougeL"] * 100
        rougeLsum = rouge_scores["rougeLsum"] * 100
    else:
        rouge1 = rouge2 = rougeL = rougeLsum = 0.0

    # --- Readability ---
    readability_scores = [textstat.flesch_reading_ease(pred) for pred in decoded_preds if pred.strip()]
    avg_readability = np.mean(readability_scores) if readability_scores else 0

    # --- Average Length ---
    prediction_lens = [len(pred.split()) for pred in decoded_preds if pred.strip()]
    avg_length = np.mean(prediction_lens) if prediction_lens else 0

    return {
        "rouge1": round(rouge1, 4),
        "rouge2": round(rouge2, 4),
        "rougeL": round(rougeL, 4),
        "rougeLsum": round(rougeLsum, 4),
        "avg_readability": round(avg_readability, 2),
        "avg_length": round(avg_length, 2),
    }

# --- Data Collator ---
# Pass only tokenizer to DataCollatorForSeq2Seq, the trainer will handle the model
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer)

# --- 3. DEFINE STATIC TRAINING ARGUMENTS ---
training_args = Seq2SeqTrainingArguments(
    output_dir="./bart_hyperparameter_search_batch_epochs",
    do_eval=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    learning_rate=3e-5,          # fixed
    weight_decay=0.02,           # fixed
    warmup_steps=500,            # fixed
    predict_with_generate=True,
    fp16=True,
    report_to="none"
)

# --- 4. INITIALIZE TRAINER ---
trainer = Seq2SeqTrainer(
    args=training_args,
    model_init=model_init,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    tokenizer=tokenizer,
    # processing_class=tokenizer, # Alternative to address FutureWarning
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

# --- 5. DEFINE SEARCH SPACES ---

# Random Search
def random_search_hp_space(trial):
    return {
        "per_device_train_batch_size": trial.suggest_int("per_device_train_batch_size", 2, 4), # Increased lower bound
        "per_device_eval_batch_size": trial.suggest_int("per_device_eval_batch_size", 4, 8),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 6), # Decreased upper bound
    }

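# Categorical search space (2 x 2 x 3 = 12 combinations).
# With n_trials=2 below, only two of these combinations are sampled,
# so this is not an exhaustive grid search.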
def grid_search_hp_space(trial):
    return {
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 2]),
        "per_device_eval_batch_size": trial.suggest_categorical("per_device_eval_batch_size", [4, 8]),
        "num_train_epochs": trial.suggest_categorical("num_train_epochs", [2, 3, 5]),
    }

# --- 6. RUN SEARCH ---
print("\nStarting automated hyperparameter search...")

best_trial = trainer.hyperparameter_search(
    direction="maximize",
    compute_objective=lambda metrics: metrics["eval_avg_readability"],
    n_trials=2,  # adjust for speed or coverage - decreased for faster results
    hp_space=grid_search_hp_space,
    backend="optuna" # Specify optuna as the backend
)

# --- 7. DISPLAY RESULTS ---
print("\n--- Hyperparameter Search Complete ---")
print(f"Best Objective (Readability): {best_trial.objective}")
print("Best Hyperparameters:")
for param, value in best_trial.hyperparameters.items():
    print(f"  - {param}: {value}")

# --- 8. TRAIN FINAL MODEL ---
print("\n--- Training final model with best hyperparameters ---")
# Update training_args with best hyperparameters from best_trial.hyperparameters
for param, value in best_trial.hyperparameters.items():
    setattr(training_args, param, value)

final_trainer = Seq2SeqTrainer(
    model_init=model_init,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    tokenizer=tokenizer,
    # processing_class=tokenizer, # Alternative to address FutureWarning
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

final_trainer.train()
model_save_path = "./my_best_bart_model_automated_batch_epochs"
final_trainer.save_model(model_save_path)
print(f"Final optimized BART model saved to {model_save_path}")

Transformers library version: 4.57.1



[I 2025-11-07 20:27:31,325] A new study created in memory with name: no-name-2e3171ab-6b6f-4913-8101-b84147c8cc62



Starting automated hyperparameter search...


Epoch,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum,Avg Readability,Avg Length
1,No log,1.938679,39.3954,20.1564,36.2064,36.2113,55.17,9.85
2,2.508100,1.884513,39.2172,19.5954,35.9328,35.9711,54.05,9.32
3,1.833400,1.876926,40.5294,20.3032,36.865,36.9081,54.66,9.7


[I 2025-11-07 20:39:29,332] Trial 0 finished with value: 54.66 and parameters: {'per_device_train_batch_size': 8, 'per_device_eval_batch_size': 4, 'num_train_epochs': 3}. Best is trial 0 with value: 54.66.


Epoch,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum,Avg Readability,Avg Length
1,2.3882,2.02412,38.9221,19.6379,35.6467,35.674,53.26,10.23
2,1.7739,1.984813,40.3824,20.203,36.9553,36.9891,54.79,9.23
3,1.3076,2.031463,40.3793,19.8086,36.9404,36.9578,54.95,10.03
4,1.0,2.111982,40.0385,19.4625,36.2858,36.3282,54.24,10.24
5,0.8448,2.179211,39.7763,19.2,36.0109,36.057,54.67,9.9


[I 2025-11-07 21:07:03,415] Trial 1 finished with value: 54.67 and parameters: {'per_device_train_batch_size': 2, 'per_device_eval_batch_size': 4, 'num_train_epochs': 5}. Best is trial 1 with value: 54.67.



--- Hyperparameter Search Complete ---
Best Objective (Readability): 54.67
Best Hyperparameters:
  - per_device_train_batch_size: 2
  - per_device_eval_batch_size: 4
  - num_train_epochs: 5

--- Training final model with best hyperparameters ---


Epoch,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum,Avg Readability,Avg Length
1,2.3889,2.02326,38.9507,19.6743,35.5683,35.6297,52.99,10.2
2,1.7752,1.983928,40.2072,20.1259,36.7819,36.8196,55.01,9.27
3,1.3072,2.03048,40.2359,19.6139,36.8176,36.845,55.83,10.06
4,1.0004,2.109243,40.0115,19.4476,36.2727,36.3119,54.58,10.18
5,0.8444,2.178728,39.7107,19.0453,35.8251,35.8916,54.65,9.92




Final optimized BART model saved to ./my_best_bart_model_automated_batch_epochs
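To put the search results in context, the sketch below enumerates every combination in the categorical space from `grid_search_hp_space` and compares it with the two trials that actually ran. The values are copied from the cell and its log output; the enumeration itself uses only `itertools` and is an illustration rather than part of the notebook's pipeline.

```python
from itertools import product

# Candidate values copied from grid_search_hp_space in the cell above
search_space = {
    "per_device_train_batch_size": [8, 2],
    "per_device_eval_batch_size": [4, 8],
    "num_train_epochs": [2, 3, 5],
}

names = list(search_space)
grid = [dict(zip(names, combo)) for combo in product(*search_space.values())]

# The two trials reported in the Optuna log lines above
tried = [
    {"per_device_train_batch_size": 8, "per_device_eval_batch_size": 4, "num_train_epochs": 3},
    {"per_device_train_batch_size": 2, "per_device_eval_batch_size": 4, "num_train_epochs": 5},
]
untried = [combo for combo in grid if combo not in tried]

print(len(grid), len(tried), len(untried))  # 12 2 10
```

With `n_trials=2`, ten of the twelve combinations were never evaluated, and the two that were scored 54.66 and 54.67 on readability, a gap small enough that the "best" setting should be read with caution. Optuna also provides a `GridSampler` for exhaustive search; wiring it into this notebook is not shown here.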
