# AgreeMate Finetuner Notebook

## Table of Contents
1. [Introduction](#introduction)
2. [Setup and Initialization](#setup-and-initialization)
3. [Data Loading and Preparation](#data-loading-and-preparation)
4. [Model Loading](#model-loading)
5. [Dataset Creation](#dataset-creation)
6. [Training Configuration](#training-configuration)
7. [Model Fine-tuning](#model-fine-tuning)
8. [Finetuned Models Evaluation](#finetuned-models-evaluation)


## 1. Introduction

Welcome to the **AgreeMate Finetuner Notebook**. This notebook is designed to fine-tune the **Llama-3.2-1B-Instruct** model for specialized negotiation roles within bargaining environments. The objective is to create three distinct model variants:

1. **Buyer Specialist**: Optimized for buyer-specific negotiation strategies.
2. **Seller Specialist**: Optimized for seller-specific negotiation strategies.
3. **Generalist Negotiator**: Capable of handling both buyer and seller roles effectively.

The process reuses the project's existing modules (`model_loader.py` and `data_loader.py`) so that model caching and dataset paths are resolved the same way inside the Jupyter environment. The notebook also saves periodic checkpoints so that an unexpected failure does not discard all training progress.

## 2. Setup and Initialization

In [1]:
import os, logging, shutil, torch
from copy import deepcopy
from logging.handlers import RotatingFileHandler
from tqdm.auto import tqdm
from transformers import TrainingArguments, get_linear_schedule_with_warmup
from accelerate import Accelerator
from sklearn.model_selection import train_test_split
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128,expandable_segments:True"
from torch.utils.data import Dataset, DataLoader
from torch.optim import AdamW

from data_loader import NegotiationDialogueDataLoader
from model_loader import ModelLoader


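# configure logging
os.makedirs('./progress', exist_ok=True) # RotatingFileHandler does not create missing directories
handler = RotatingFileHandler('./progress/finetuner.log', maxBytes=10**6, backupCount=5)
logging.basicConfig(
    handlers=[handler], # rotate at ~1 MB per file and keep 5 backups (about 6 MB in total)
    format='%(asctime)s - %(levelname)s - %(message)s',
    level=logging.DEBUG
)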
logger = logging.getLogger(__name__)

logger.info("Starting AgreeMate Finetuner Notebook")
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
logger.info(f"Training on {device}")


# define directories
finetuning_dir = os.getcwd()
pretrained_dir = os.path.join(finetuning_dir, 'models--meta-llama--Llama-3.2-1B-Instruct')
buyer_finetuned_dir = os.path.join(finetuning_dir, 'models--buyer-finetuned--Llama-3.2-1B-Instruct')
seller_finetuned_dir = os.path.join(finetuning_dir, 'models--seller-finetuned--Llama-3.2-1B-Instruct')
generalist_finetuned_dir = os.path.join(finetuning_dir, 'models--generalist-finetuned--Llama-3.2-1B-Instruct')
os.makedirs(buyer_finetuned_dir, exist_ok=True)
os.makedirs(seller_finetuned_dir, exist_ok=True)
os.makedirs(generalist_finetuned_dir, exist_ok=True)
logger.info(f"Finetuned models will be saved to: {finetuning_dir}")

INFO:__main__:Starting AgreeMate Finetuner Notebook
INFO:__main__:Training on cuda
INFO:__main__:Finetuned models will be saved to: c:\Tocho\umd\fall_2024\CMSC723\agreemate\finetuning

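The rotating log configured above bounds disk usage: a file rolls over before it reaches `maxBytes`, and only `backupCount` older files are kept. The following is a minimal sketch of that behaviour in a temporary directory. The function `demo_rotation`, the logger name `rotation_demo`, and the tiny size limits are illustrative assumptions, not part of the notebook.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def demo_rotation(max_bytes=200, backup_count=2, n_messages=50):
    """Write n_messages to a rotating log and return {file name: size in bytes}."""
    with tempfile.TemporaryDirectory() as log_dir:
        log_path = os.path.join(log_dir, "demo.log")
        handler = RotatingFileHandler(log_path, maxBytes=max_bytes, backupCount=backup_count)
        demo_logger = logging.getLogger("rotation_demo")  # hypothetical logger name
        demo_logger.setLevel(logging.INFO)
        demo_logger.propagate = False  # keep the demo out of the root logger
        demo_logger.addHandler(handler)
        try:
            for i in range(n_messages):
                demo_logger.info("message number %03d", i)
        finally:
            # release the file before the temporary directory is removed
            demo_logger.removeHandler(handler)
            handler.close()
        return {
            name: os.path.getsize(os.path.join(log_dir, name))
            for name in sorted(os.listdir(log_dir))
        }


sizes = demo_rotation()
print(sizes)  # one active file plus at most backup_count rotated files
```

With the notebook's settings (`maxBytes=10**6`, `backupCount=5`) the same rule gives one active file and up to five backups.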

## 3. Data Loading and Preparation

Use the existing `NegotiationDialogueDataLoader` to load and prepare the datasets for finetuning.

In [2]:
data_loader = NegotiationDialogueDataLoader()
logger.info("Initialized NegotiationDialogueDataLoader")

# load datasets
buyer_df = data_loader.load_dataset("buyer")
seller_df = data_loader.load_dataset("seller")
generalist_df = data_loader.load_dataset("generalist")
logger.info(f"Loaded buyer dataset with {len(buyer_df)} examples")
logger.info(f"Loaded seller dataset with {len(seller_df)} examples")
logger.info(f"Loaded generalist dataset with {len(generalist_df)} examples")

INFO:__main__:Initialized NegotiationDialogueDataLoader
INFO:data_loader:Loading buyer dataset from c:\Tocho\umd\fall_2024\CMSC723\agreemate\data\deal_or_no_deal\buyer_training.csv
INFO:data_loader:Loaded 36596 examples
INFO:data_loader:Loading seller dataset from c:\Tocho\umd\fall_2024\CMSC723\agreemate\data\deal_or_no_deal\seller_training.csv
INFO:data_loader:Loaded 36595 examples
INFO:data_loader:Loading generalist dataset from c:\Tocho\umd\fall_2024\CMSC723\agreemate\data\deal_or_no_deal\generalist_training.csv
INFO:data_loader:Loaded 73191 examples
INFO:__main__:Loaded buyer dataset with 36596 examples
INFO:__main__:Loaded seller dataset with 36595 examples
INFO:__main__:Loaded generalist dataset with 73191 examples


### Prepare Finetuning Data

Format the loaded data into input-target pairs suitable for model training.

In [3]:
buyer_data = data_loader.prepare_for_training(buyer_df)
seller_data = data_loader.prepare_for_training(seller_df)
generalist_data = data_loader.prepare_for_training(generalist_df)
logger.info("Prepared training data for buyer, seller, and generalist models")

INFO:__main__:Prepared training data for buyer, seller, and generalist models


## 4. Model Loading

Use the provided `ModelLoader` to handle model loading and caching.

Load the base Llama-3.2-1B-Instruct model, which will be finetuned for each specific role.

In [4]:
model_loader = ModelLoader()
logger.info("Initialized ModelLoader")

# load base model and tokenizer
base_model, base_tokenizer = model_loader.load_model_and_tokenizer()
if base_tokenizer.pad_token_id is None: # set pad token to eos token
    base_tokenizer.pad_token = base_tokenizer.eos_token
    base_tokenizer.pad_token_id = base_tokenizer.eos_token_id
torch.cuda.empty_cache() # clear any residual memory
torch.cuda.reset_peak_memory_stats() # reset memory tracking stats
logger.info("Loaded base Llama-3.2-1B-Instruct model and tokenizer")

INFO:__main__:Initialized ModelLoader
Using cache directory: c:\Tocho\umd\fall_2024\CMSC723\agreemate\finetuning
✓ Tokenizer loaded successfully
INFO:__main__:Loaded base Llama-3.2-1B-Instruct model and tokenizer
✓ Model loaded successfully


## 5. Dataset Creation

### Define Custom Dataset Class

Define a `NegotiationDataset` class that wraps the tokenized tensors so that PyTorch's `DataLoader` can serve them to the training loop. Each item holds the input IDs, the attention mask, and the labels for one example.

In [5]:
class NegotiationDataset(Dataset):
    """
    Custom Dataset for Negotiation Finetuning.

    Attributes:
        encodings (Dict[str, torch.Tensor]): Tokenized inputs (input_ids, attention_mask).
        labels (torch.Tensor): Label ids, with -100 at positions excluded from the loss.
    """
    def __init__(self, encodings, labels):
        """
        Initializes the dataset with encodings and labels.

        Args:
            encodings (Dict[str, torch.Tensor]): Tokenized input data.
            labels (torch.Tensor): Label ids aligned with input_ids; -100 marks ignored positions.
        """
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        """
        Retrieves the input-target pair at the specified index.

        Args:
            idx (int): Index of the data point.

        Returns:
            Dict: A dictionary containing input_ids, attention_mask, and labels.
        """
        return {
            'input_ids': self.encodings['input_ids'][idx].clone().detach(),
            'attention_mask': self.encodings['attention_mask'][idx].clone().detach(),
            'labels': self.labels[idx].clone().detach()
        }

    def __len__(self):
        """
        Returns the total number of examples in the dataset.

        Returns:
            int: Number of examples.
        """
        return len(self.labels)

### Split Data and Tokenize Inputs and Targets

Split the prepared data into training and validation sets at a 90-10 ratio, then tokenize each split with the base model's tokenizer, which all three variants share.

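As a quick check on the 90-10 split, the sizes of each split can be worked out from the dataset sizes logged earlier. This is a small sketch that assumes the validation share is rounded up to a whole example, which is how `train_test_split` treats a fractional `test_size` as far as we know; the helper name `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_samples, test_size=0.1):
    """Return (n_train, n_val), rounding the validation share up to a whole example."""
    n_val = math.ceil(test_size * n_samples)
    return n_samples - n_val, n_val


# dataset sizes taken from the loader logs above
for name, n_samples in [("buyer", 36596), ("seller", 36595), ("generalist", 73191)]:
    n_train, n_val = split_sizes(n_samples)
    print(f"{name}: {n_train} train / {n_val} validation")
# buyer: 32936 train / 3660 validation
# seller: 32935 train / 3660 validation
# generalist: 65871 train / 7320 validation
```

With a batch size of 1, the buyer figure of 32936 matches the number of steps shown in the training progress bar later in the notebook.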
In [6]:
def split_dataset(data, test_size=0.1):
    """Splits the dataset into training and validation sets."""
    inputs_train, inputs_val, labels_train, labels_val = train_test_split(
        data['input'], data['target'], test_size=test_size, random_state=42
    )
    return inputs_train, inputs_val, labels_train, labels_val

def tokenize_data(inputs, targets, tokenizer):
    """Tokenizes input and target data by concatenating them and setting labels to -100 for input tokens."""

    # concatenate inputs and targets with a separator (e.g., space or special token if needed)
    concatenated = [f"{inp} {tgt}" for inp, tgt in zip(inputs, targets)]

    # tokenize the concatenated sequences
    encodings = tokenizer(
        concatenated,
        truncation=True,
        padding=True,
        return_tensors='pt'
    )

    # tokenize inputs separately to determine the boundary between input and target tokens
    input_encodings = tokenizer(
        inputs,
        truncation=True,
        padding=True,
        return_tensors='pt'
    )

    # initialize labels with -100 (representing tokens to be ignored in loss computation)
    labels = torch.full_like(encodings['input_ids'], -100)

    # unmask only the target tokens; prompt and padding positions stay at -100
    for i in range(len(inputs)):
        # number of non-padding tokens in the prompt alone
        input_len = input_encodings['input_ids'][i].ne(tokenizer.pad_token_id).sum().item()
        # number of non-padding tokens that follow the prompt in the concatenated sequence
        target_len = encodings['input_ids'][i].ne(tokenizer.pad_token_id).sum().item() - input_len
        if target_len > 0:
            labels[i, input_len:input_len + target_len] = encodings['input_ids'][i, input_len:input_len + target_len]

    return NegotiationDataset(encodings, labels)

def create_dataloaders(train_dataset, val_dataset, batch_size):
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True)
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, pin_memory=True)
    return train_loader, val_loader


# buyer, seller, and generalist dataset splitting
buyer_inputs_train, buyer_inputs_val, buyer_labels_train, buyer_labels_val = split_dataset(buyer_data)
seller_inputs_train, seller_inputs_val, seller_labels_train, seller_labels_val = split_dataset(seller_data)
generalist_inputs_train, generalist_inputs_val, generalist_labels_train, generalist_labels_val = split_dataset(generalist_data)

# tokenize datasets
buyer_train_dataset = tokenize_data(buyer_inputs_train, buyer_labels_train, base_tokenizer)
buyer_val_dataset = tokenize_data(buyer_inputs_val, buyer_labels_val, base_tokenizer)
seller_train_dataset = tokenize_data(seller_inputs_train, seller_labels_train, base_tokenizer)
seller_val_dataset = tokenize_data(seller_inputs_val, seller_labels_val, base_tokenizer)
generalist_train_dataset = tokenize_data(generalist_inputs_train, generalist_labels_train, base_tokenizer)
generalist_val_dataset = tokenize_data(generalist_inputs_val, generalist_labels_val, base_tokenizer)

# create dataloaders
batch_size = 1
buyer_train_loader, buyer_val_loader = create_dataloaders(buyer_train_dataset, buyer_val_dataset, batch_size)
seller_train_loader, seller_val_loader = create_dataloaders(seller_train_dataset, seller_val_dataset, batch_size)
generalist_train_loader, generalist_val_loader = create_dataloaders(generalist_train_dataset, generalist_val_dataset, batch_size)

logger.info("Created DataLoaders for buyer, seller, and generalist models")

INFO:__main__:Created DataLoaders for buyer, seller, and generalist models

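The label masking performed by `tokenize_data` can be illustrated on toy token ids without a tokenizer. In this sketch the ids, the padding id `0`, and the helper name `build_labels` are made up for illustration; only the `-100` ignore value comes from the notebook.

```python
IGNORE_INDEX = -100  # label value that the loss skips


def build_labels(prompt_ids, target_ids, total_length, pad_id=0):
    """Return (input_ids, labels) for one right-padded example.

    Prompt and padding positions get IGNORE_INDEX, so only the target
    tokens contribute to the loss.
    """
    sequence = (prompt_ids + target_ids)[:total_length]  # truncate if too long
    n_pad = total_length - len(sequence)
    input_ids = sequence + [pad_id] * n_pad
    labels = [IGNORE_INDEX] * total_length
    for pos in range(len(prompt_ids), len(sequence)):
        labels[pos] = sequence[pos]
    return input_ids, labels


input_ids, labels = build_labels([11, 12, 13], [21, 22], total_length=7)
print(input_ids)  # [11, 12, 13, 21, 22, 0, 0]
print(labels)     # [-100, -100, -100, 21, 22, -100, -100]
```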

## 6. Training Configuration

Set up the training parameters, including learning rate, batch size, number of epochs, and checkpointing strategies.

In [7]:
# base training arguments
base_training_args = {
    "num_train_epochs": 2,               # number of training epochs
    "per_device_train_batch_size": 1,    # batch size per device during training
    "per_device_eval_batch_size": 1,     # batch size for evaluation
    "gradient_accumulation_steps": 16,   # accumulates gradients over 16 steps, simulating batch size of 1*16=16
    "fp16": True,                        # use mixed precision training to reduce memory usage
    "warmup_steps": 500,                 # number of warmup steps for learning rate scheduler
    "learning_rate": 5e-5,               # learning rate for optimizer
    "weight_decay": 0.01,                # weight decay for optimizer
    "logging_steps": 10,                 # log every 10 steps
    "eval_strategy": "steps",            # evaluate every few steps
    "save_strategy": "steps",            # save model every few steps (CHECKPOINTS)
    "save_steps": 100,                   # save checkpoint every 100 steps
    "load_best_model_at_end": True,      # load/save the best model when finished training (also works with CHECKPOINTS)
    "save_total_limit": 3,               # only keep the last 3 checkpoints to save disk space
    "report_to": "none"                  # disable reporting to WandB or other services
}

# buyer, seller, generalist output, logging specifications
buyer_training_args = TrainingArguments(
    output_dir=os.path.join(finetuning_dir, 'progress', 'buyer'),
    logging_dir=os.path.join(finetuning_dir, 'progress', 'buyer', 'logs'),
    **base_training_args
)
seller_training_args = TrainingArguments(
    output_dir=os.path.join(finetuning_dir, 'progress', 'seller'),
    logging_dir=os.path.join(finetuning_dir, 'progress', 'seller', 'logs'),
    **base_training_args
)
generalist_training_args = TrainingArguments(
    output_dir=os.path.join(finetuning_dir, 'progress', 'generalist'),
    logging_dir=os.path.join(finetuning_dir, 'progress', 'generalist', 'logs'),
    **base_training_args
)

logger.info("Defined TrainingArguments for Trainer")

INFO:__main__:Defined TrainingArguments for Trainer

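To see how these settings interact, the number of optimizer updates and the warmup schedule can be worked out by hand. The sketch below assumes 32936 buyer training examples (the step count shown in the training progress bar later in the notebook), one optimizer update per 16 batches, and a final partial update at the end of each epoch. `linear_warmup_factor` mirrors the usual linear warmup and decay shape; the function names are hypothetical.

```python
import math


def optimizer_updates(n_examples, batch_size, accum_steps, epochs):
    """Number of optimizer updates when gradients are accumulated over accum_steps batches."""
    batches_per_epoch = math.ceil(n_examples / batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / accum_steps)
    return updates_per_epoch * epochs


def linear_warmup_factor(step, warmup_steps, total_steps):
    """Multiplier on the base learning rate: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total = optimizer_updates(32936, batch_size=1, accum_steps=16, epochs=2)
print(total)  # 4118
for step in (0, 250, 500, total):
    print(step, 5e-5 * linear_warmup_factor(step, warmup_steps=500, total_steps=total))
```

Under these assumptions the 500 warmup steps cover roughly the first 12% of the 4118 updates.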

## 7. Model Fine-tuning

Train each model variant sequentially, ensuring that progress is logged and checkpoints are saved.

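Checkpointing only protects progress if old checkpoints are pruned safely and the newest one can be found again after a crash. The following is a minimal sketch of both steps using the standard library. It assumes checkpoint directories named `checkpoint-<step>`, which is an illustrative convention and may differ from the names the training loop writes; `list_checkpoints` and `prune_checkpoints` are hypothetical helpers.

```python
import os
import re
import shutil
import tempfile

CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def list_checkpoints(output_dir):
    """Return checkpoint directories under output_dir, oldest first by step number."""
    found = []
    for name in os.listdir(output_dir):
        match = CHECKPOINT_PATTERN.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def prune_checkpoints(output_dir, keep_last):
    """Delete all but the keep_last most recent checkpoints and return the survivors."""
    checkpoints = list_checkpoints(output_dir)
    n_remove = max(0, len(checkpoints) - keep_last)
    for path in checkpoints[:n_remove]:
        shutil.rmtree(path)
    return checkpoints[n_remove:]


with tempfile.TemporaryDirectory() as output_dir:
    for step in (100, 200, 300, 400, 1000):
        os.makedirs(os.path.join(output_dir, f"checkpoint-{step}"))
    kept = prune_checkpoints(output_dir, keep_last=3)
    print([os.path.basename(path) for path in kept])
    # ['checkpoint-300', 'checkpoint-400', 'checkpoint-1000']
```

Sorting by the parsed step number rather than by name keeps `checkpoint-1000` after `checkpoint-400`. The last entry of the returned list is the one a resumed run would load.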
In [8]:
def train_model(model, train_loader, optimizer, training_args, save_path, tokenizer):
    """
    Trains the given model using the provided training data loader and optimizer,
    then saves the final model and tokenizer to save_path.
    """
    accum_steps = training_args.gradient_accumulation_steps
    num_epochs = int(training_args.num_train_epochs)

    # initialize accelerator; it handles device placement, fp16 autocasting,
    # loss scaling, and gradient accumulation, so no separate GradScaler is needed
    accelerator = Accelerator(
        mixed_precision="fp16" if training_args.fp16 else "no",
        gradient_accumulation_steps=accum_steps
    )

    # create scheduler for warmup; it advances once per optimizer update, not once per batch
    updates_per_epoch = (len(train_loader) + accum_steps - 1) // accum_steps
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=training_args.warmup_steps,
        num_training_steps=updates_per_epoch * num_epochs
    )

    model.gradient_checkpointing_enable() # for memory optimization
    model.config.use_cache = False # disable for compatibility with gradient checkpointing

    # prepare model, optimizer, scheduler and dataloader
    model, optimizer, scheduler, train_loader = accelerator.prepare(
        model, optimizer, scheduler, train_loader
    )

    # clear memory
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.reset_peak_memory_stats()

    best_loss = float('inf')
    saved_checkpoints = []
    global_step = 0 # number of optimizer updates performed so far
    model.train() # set model to training mode

    for epoch in range(num_epochs):
        total_loss = 0.0
        progress_bar = tqdm(train_loader, desc=f"Epoch {epoch + 1}/{num_epochs}")

        for batch in progress_bar:
            # accumulate() defers the optimizer update until accum_steps batches have been seen
            with accelerator.accumulate(model):
                outputs = model(**batch) # the prepared loader already places batches on the device
                loss = outputs.loss
                accelerator.backward(loss) # applies fp16 loss scaling and accumulation scaling
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()

            batch_loss = loss.item()
            total_loss += batch_loss
            progress_bar.set_postfix({"Loss": batch_loss})

            # log and checkpoint only after a real optimizer update
            if not accelerator.sync_gradients:
                continue
            global_step += 1

            # logging based on TrainingArguments
            if global_step % training_args.logging_steps == 0:
                logger.info(f"Epoch {epoch + 1}, Step {global_step}: Loss = {batch_loss}")

            # save checkpoints based on strategy on main process only
            if (training_args.save_strategy == "steps"
                    and global_step % training_args.save_steps == 0
                    and accelerator.is_main_process):
                checkpoint_path = os.path.join(training_args.output_dir, f"checkpoint-{global_step}")
                unwrapped_model = accelerator.unwrap_model(model)
                unwrapped_model.save_pretrained(checkpoint_path)
                tokenizer.save_pretrained(checkpoint_path)

                # manage checkpoint limit
                saved_checkpoints.append(checkpoint_path)
                limit = training_args.save_total_limit
                if limit and len(saved_checkpoints) > limit:
                    shutil.rmtree(saved_checkpoints.pop(0))

        avg_loss = total_loss / len(train_loader)
        print(f"Epoch {epoch + 1} Loss: {avg_loss}")

        # save best model (judged by the average training loss of the epoch)
        if training_args.load_best_model_at_end and avg_loss < best_loss:
            best_loss = avg_loss
            if accelerator.is_main_process:
                best_model_path = os.path.join(training_args.output_dir, "best_model")
                unwrapped_model = accelerator.unwrap_model(model)
                unwrapped_model.save_pretrained(best_model_path)
                tokenizer.save_pretrained(best_model_path)

    # save the final model where the verification and evaluation cells expect it
    if accelerator.is_main_process:
        unwrapped_model = accelerator.unwrap_model(model)
        unwrapped_model.save_pretrained(save_path)
        tokenizer.save_pretrained(save_path)
    print(f"Training completed. Model saved to {save_path}")


# training loop to sequentially finetune buyer, seller, and generalist models
for model_type, loader, save_dir, args in [
    ("buyer", buyer_train_loader, buyer_finetuned_dir, buyer_training_args),
    ("seller", seller_train_loader, seller_finetuned_dir, seller_training_args),
    ("generalist", generalist_train_loader, generalist_finetuned_dir, generalist_training_args)
]:
    print(f"\nTraining {model_type} model...")
    model_copy = deepcopy(base_model).to(device).to(torch.float32)

    optimizer = AdamW(
        model_copy.parameters(),
        lr=args.learning_rate,
        weight_decay=args.weight_decay
    )

    train_model(
        model=model_copy,
        train_loader=loader,
        optimizer=optimizer,
        training_args=args,
        save_path=save_dir,
        tokenizer=base_tokenizer
    )

    # clear memory
    del model_copy, optimizer
    torch.cuda.empty_cache()


Training buyer model...


Epoch 1/2:   0%|          | 0/32936 [00:00<?, ?it/s]

OutOfMemoryError: CUDA out of memory. Tried to allocate 1002.00 MiB. GPU 0 has a total capacity of 6.00 GiB of which 0 bytes is free. Of the allocated memory 11.76 GiB is allocated by PyTorch, and 201.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

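A back-of-the-envelope estimate helps explain the error above. Full fine-tuning in float32 with AdamW keeps weights, gradients, and two optimizer moment estimates per parameter, roughly 16 bytes in total. The sketch below assumes about 1.23 billion parameters for Llama-3.2-1B (an approximate figure) and ignores activations entirely; the helper name `training_memory_gib` is hypothetical.

```python
def training_memory_gib(n_params, bytes_weights=4, bytes_grads=4, bytes_optimizer=8):
    """Memory for weights, gradients, and AdamW moments in GiB (activations excluded)."""
    total_bytes = n_params * (bytes_weights + bytes_grads + bytes_optimizer)
    return total_bytes / 2**30


n_params = 1.23e9  # approximate parameter count (assumption)
print(f"weights only:   {n_params * 4 / 2**30:.1f} GiB")
print(f"training state: {training_memory_gib(n_params):.1f} GiB")
```

Under these assumptions the fixed training state alone is about 18 GiB, roughly three times the 6 GiB reported for the GPU, before any activations are counted.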
## 8. Finetuned Models Evaluation

Ensure that each finetuned model is saved correctly in its designated directory with all necessary files.

In [None]:
def verify_model_save(save_path, model_name):
    """
    Verifies that the finetuned model is saved correctly by checking for essential files.

    Args:
        save_path (str): Directory path where the model is saved.
        model_name (str): Name identifier for logging purposes.

    Raises:
        FileNotFoundError: If essential model files are missing.
    """
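    essential_files = ['config.json', 'tokenizer.json', 'tokenizer_config.json']
    for file in essential_files:
        if not os.path.exists(os.path.join(save_path, file)):
            logger.error(f"Missing {file} in {model_name} at {save_path}")
            raise FileNotFoundError(f"Missing {file} in {model_name} at {save_path}")
    # recent transformers releases save weights as model.safetensors, older ones as pytorch_model.bin
    weight_files = ['model.safetensors', 'pytorch_model.bin']
    if not any(os.path.exists(os.path.join(save_path, file)) for file in weight_files):
        logger.error(f"Missing model weights in {model_name} at {save_path}")
        raise FileNotFoundError(f"Missing model weights in {model_name} at {save_path}")
    logger.info(f"All essential files found for {model_name} at {save_path}")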

# verify buyer, seller, and generalist specialist models
verify_model_save(buyer_finetuned_dir, "Buyer Specialist")
verify_model_save(seller_finetuned_dir, "Seller Specialist")
verify_model_save(generalist_finetuned_dir, "Generalist Negotiator")
logger.info("All finetuned models have been successfully saved and verified.")

Do a quick evaluation of the finetuned models to ensure that they are functioning as expected.

In [None]:
def evaluate_model(model, val_loader):
    """
    Evaluates the model on the validation dataset using DataLoader.

    Args:
        model: Trained model to evaluate.
        val_loader: DataLoader for validation data.

    Returns:
        Perplexity of the model on the validation dataset.
    """
    model = model.to(device)
    model.eval()
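    total_loss = 0.0
    total_tokens = 0

    with torch.no_grad():
        for batch in tqdm(val_loader, desc="Evaluating"):
            batch = {key: val.to(device) for key, val in batch.items()}
            # the model shifts labels by one position, so count supervised tokens after the shift
            num_tokens = (batch['labels'][:, 1:] != -100).sum().item()
            if num_tokens == 0:
                continue # no target tokens in this batch, so its loss is undefined
            outputs = model(**batch)
            # outputs.loss is the mean over supervised tokens; convert it back to a sum
            total_loss += outputs.loss.item() * num_tokens
            total_tokens += num_tokens

    # calculate average per-token loss and perplexity
    avg_loss = total_loss / total_tokens
    perplexity = torch.exp(torch.tensor(avg_loss))
    return perplexity.item()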


# load, evaluate, and remove buyer model
buyer_model = model_loader.load_model(buyer_finetuned_dir)
buyer_perplexity = evaluate_model(buyer_model, buyer_val_loader)
logger.info(f"Buyer Specialist Perplexity: {buyer_perplexity}")
del buyer_model
torch.cuda.empty_cache()

# load, evaluate, and remove seller model
seller_model = model_loader.load_model(seller_finetuned_dir)
seller_perplexity = evaluate_model(seller_model, seller_val_loader)
logger.info(f"Seller Specialist Perplexity: {seller_perplexity}")
del seller_model
torch.cuda.empty_cache()

# load, evaluate, and remove generalist model
generalist_model = model_loader.load_model(generalist_finetuned_dir)
generalist_perplexity = evaluate_model(generalist_model, generalist_val_loader)
logger.info(f"Generalist Negotiator Perplexity: {generalist_perplexity}")
del generalist_model
torch.cuda.empty_cache()

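Perplexity is the exponential of the average negative log-likelihood per target token, so batches should be weighted by how many target tokens they contain. A small worked sketch with made-up numbers follows; the helper name `perplexity_from_batches` is hypothetical.

```python
import math


def perplexity_from_batches(batches):
    """batches: list of (mean_loss, n_tokens) pairs, where mean_loss is the average
    negative log-likelihood over the n_tokens target tokens of that batch."""
    total_nll = sum(mean_loss * n_tokens for mean_loss, n_tokens in batches)
    total_tokens = sum(n_tokens for _, n_tokens in batches)
    return math.exp(total_nll / total_tokens)


batches = [(2.0, 10), (1.0, 30)]  # illustrative values, not measured results
print(perplexity_from_batches(batches))  # exp(50 / 40) = exp(1.25), about 3.49
print(math.exp((2.0 + 1.0) / 2))         # unweighted mean of batch losses, about 4.48
```

The gap between the two numbers shows why averaging per-batch losses without token weights distorts the result when batches differ in length.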
Generate sample responses from each model variant to verify role-specific behaviors and negotiation strategies.

In [None]:
def generate_response(model, tokenizer, prompt, max_tokens=50):
    """
    Generates a response from the model based on the provided prompt.

    Args:
        model (AutoModelForCausalLM): The finetuned model.
        tokenizer (AutoTokenizer): The corresponding tokenizer.
        prompt (str): The input prompt for the model.
        max_tokens (int): Maximum number of tokens to generate.

    Returns:
        str: The generated response.
    """
    model = model.to(device) # move model to device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_tokens,
        do_sample=True, # use sampling
        temperature=0.7, # control randomness
        top_p=0.9, # nucleus sampling
        pad_token_id=tokenizer.eos_token_id
    )
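    # decode only the newly generated tokens, not the echoed prompt
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response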


# define sample prompts
buyer_prompt = (
    "You are a buyer negotiating over items.\n"
    "Analyze the situation and determine if you should accept, reject, or counteroffer.\n"
    "If you counteroffer, provide a new price for the item.\n"
    "Your item values: {'book': {'count':1, 'value':10}}\n"
    "Partner's values: {'book': {'count':1, 'value':8}}\n"
    "Previous messages:\n"
    "Seller: I can offer the book for $15.\n"
    "Your response:"
)
seller_prompt = (
    "You are a seller negotiating over items.\n"
    "Analyze the situation and determine if you should accept, reject, or counteroffer.\n"
    "If you counteroffer, provide a new price for the item.\n"
    "Your item values: {'book': {'count':1, 'value':10}}\n"
    "Partner's values: {'book': {'count':1, 'value':8}}\n"
    "Previous messages:\n"
    "Buyer: I can offer $5 for the book.\n"
    "Your response:"
)
generalist_buyer_prompt = (
    "You are a buyer negotiating over items.\n"
    "Analyze the situation and determine if you should accept, reject, or counteroffer.\n"
    "If you counteroffer, provide a new price for the item.\n"
    "Your item values: {'book': {'count':1, 'value':10}}\n"
    "Partner's values: {'book': {'count':1, 'value':8}}\n"
    "Previous messages:\n"
    "Seller: I can offer the book for $15.\n"
    "Your response:"
)
generalist_seller_prompt = (
    "You are a seller negotiating over items.\n"
    "Analyze the situation and determine if you should accept, reject, or counteroffer.\n"
    "If you counteroffer, provide a new price for the item.\n"
    "Your item values: {'book': {'count':1, 'value':10}}\n"
    "Partner's values: {'book': {'count':1, 'value':8}}\n"
    "Previous messages:\n"
    "Buyer: I can offer $5 for the book.\n"
    "Your response:"
)


# load, generate response, and remove buyer model
buyer_model = model_loader.reload_model(buyer_finetuned_dir)
logger.info("Loaded finetuned buyer model")
buyer_response = generate_response(buyer_model, base_tokenizer, buyer_prompt)
print(f"Buyer Specialist Prompt:\n{buyer_prompt}\n")
print(f"Buyer Specialist Response:\n{buyer_response}\n")
del buyer_model
torch.cuda.empty_cache()

# load, generate response, and remove seller model
seller_model = model_loader.reload_model(seller_finetuned_dir)
logger.info("Loaded finetuned seller model")
seller_response = generate_response(seller_model, base_tokenizer, seller_prompt)
print(f"Seller Specialist Prompt:\n{seller_prompt}\n")
print(f"Seller Specialist Response:\n{seller_response}\n")
del seller_model
torch.cuda.empty_cache()

# load generalist model once, generate responses for both roles, and remove it
generalist_model = model_loader.reload_model(generalist_finetuned_dir)
logger.info("Loaded finetuned generalist model")
generalist_buyer_response = generate_response(generalist_model, base_tokenizer, generalist_buyer_prompt)
print(f"Generalist Negotiator (Buyer) Prompt:\n{generalist_buyer_prompt}\n")
print(f"Generalist Negotiator (Buyer) Response:\n{generalist_buyer_response}\n")
generalist_seller_response = generate_response(generalist_model, base_tokenizer, generalist_seller_prompt)
print(f"Generalist Negotiator (Seller) Prompt:\n{generalist_seller_prompt}\n")
print(f"Generalist Negotiator (Seller) Response:\n{generalist_seller_response}\n")
del generalist_model
torch.cuda.empty_cache()
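
The `temperature` and `top_p` arguments passed to `generate` control how the next token is sampled. The following is a simplified pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering on a toy distribution. The function names are hypothetical, and the library's own implementation may differ in details such as tie-breaking.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample one token index after temperature scaling and top-p filtering."""
    candidates = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    tokens = list(candidates)
    return rng.choices(tokens, weights=[candidates[t] for t in tokens], k=1)[0]


filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9)
print(sorted(filtered))  # [0, 1, 2]
print(sample_token([2.0, 1.0, 0.1, -3.0]))
```

In the toy distribution the least likely token is dropped because the first three already cover 95% of the probability mass.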