<a href=''># LLM Prompt Recovery using LLM
This notebook investigates the potential of pre-trained or fine-tuned LLMs to recover the original prompt used to generate a text. 

### Table of contents¶
* [Offline Installation](#installation)
* [Importing Libraries](#libraries)
* [Model Training](#training)
    - [Fine-tuning Gemma-2b model using Keras library](#gemma-2b)
    - [Fine-tuning Gemma-7b model using pytorch library](#pretrained)
* [Model Inference](#infer)
    - [Generate prompts using fine-tuned Phi LLM](#phi)
    - [Generate the prompts using pretrained Gemma-7b LLM](#llm)
    - <a href='#mistral'>Generate the prompts using pretrained Mistral-7b LLM (version 2)]</a>
    - [Modify the prompts using Spacy](#mean-prompt)
    - [Evaluation](#evaluation)
    
### Change Logs:

- [version 7] (LB=0.63) use Spacy library to rewrite the prompts with most representative words  
- [version 5] (LB=0.63) use pretrained Gemma-7b and only one rewrite template

   


## Offline Installation <a class="anchor"  id="installation"></a>
This section details the installation of required packages for offline use: bitandbytes (if applicable), accelerate, datasets, transformers, TRL (supervised fine-tuning trainer), and PEFT (LoRa layers), optimum (transformer performance optimization tools)

In [1]:
# Install bitsandbytes for loading the LLM model faster
!pip install --no-index --find-links=/kaggle/input/bitsandbytes -r /kaggle/input/bitsandbytes/requirements.txt
!pip install --no-index --find-links=/kaggle/input/accelerate -r /kaggle/input/accelerate/requirements.txt
!pip install --no-index --find-links=/kaggle/input/transformers -r /kaggle/input/transformers/requirements.txt
# Install datasets
!pip install --no-index --find-links=/kaggle/input/datasets-installation -r /kaggle/input/datasets-installation/requirements.txt
# Install TRL for using Supervised Fine-tuning Trainer
!pip install --no-index --find-links=/kaggle/input/transformer-reinforcement-learning -r /kaggle/input/transformer-reinforcement-learning/requirements.txt
# Install PEFT
!pip install --no-index --find-links=/kaggle/input/peft-installation -r /kaggle/input/peft-installation/requirements.txt
# Install optimum
!pip install --no-index --find-links=/kaggle/input/optimum-installation -r /kaggle/input/optimum-installation/requirements.txt
# Install sentence transformer
!pip install --no-index --find-links=/kaggle/input/sentence-transformers -r /kaggle/input/sentence-transformers/requirements.txt

Looking in links: /kaggle/input/bitsandbytes
Processing /kaggle/input/bitsandbytes/bitsandbytes-0.43.0-py3-none-manylinux_2_24_x86_64.whl (from -r /kaggle/input/bitsandbytes/requirements.txt (line 1))
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.43.0
Looking in links: /kaggle/input/accelerate
Looking in links: /kaggle/input/transformers
Looking in links: /kaggle/input/datasets-installation
Processing /kaggle/input/datasets-installation/datasets-2.16.0-py3-none-any.whl (from -r /kaggle/input/datasets-installation/requirements.txt (line 1))
Processing /kaggle/input/datasets-installation/dill-0.3.7-py3-none-any.whl (from datasets==2.16.0->-r /kaggle/input/datasets-installation/requirements.txt (line 1))
Processing /kaggle/input/datasets-installation/fsspec-2023.10.0-py3-none-any.whl (from fsspec[http]<=2023.10.0,>=2023.1.0->datasets==2.16.0->-r /kaggle/input/datasets-installation/requirements.txt (line 1))
INFO: pip is looking at multiple ver

# Import libraries <a class="anchor"  id="libraries"></a>

In [2]:
import os, random
import pandas as pd
import numpy as np
# from string import Template
from pathlib import Path

from torch import nn
# Transformer
from accelerate import Accelerator
import transformers
from transformers import (pipeline, AutoTokenizer, AutoModelForCausalLM, 
                          BitsAndBytesConfig, AutoConfig, TrainingArguments)
# Supervised Trainser
from datasets import Dataset

# Split data into training and test (valid) dataset
from sklearn.model_selection import train_test_split

# For quantization
import bitsandbytes, accelerate
from tqdm.notebook import tqdm
from sklearn.metrics.pairwise import cosine_similarity
import optimum

2024-04-16 04:16:22.893135: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-16 04:16:22.893247: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-16 04:16:23.030910: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


In [3]:
import ctypes, gc
import torch

libc = ctypes.CDLL("libc.so.6")
# Seed the same seed to all 
def seed_everything(seed=42):
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = True
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    
def clear_memory():
    libc.malloc_trim(0)
    torch.cuda.empty_cache()
    gc.collect()

SEED = 42
seed_everything(SEED)
# Set the GPUs
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Model Training <a class='anchor' id='training'></a>
## Fine-tuning Gemma-2b model using Keras library <a class="anchor"  id="gemma-2b"></a>


Ref: @JUAN MERINO [Fine Tuning with Gemma 2b](https://www.kaggle.com/code/juanmerinobermejo/fine-tuning-with-gemma-2b)

In [4]:
# Import keras and Keras-NLP for training
import keras
import keras_nlp
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM
from peft import LoraConfig, get_peft_model, TaskType, PeftConfig, PeftModel
import os
os.environ["KERAS_BACKEND"] = "jax"
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"

import warnings
warnings.simplefilter("ignore")

In [5]:
class CFG:
    model_name = 'gemma_2b_en'
    model_path = '/kaggle/input/gemma/keras/gemma_2b_en/2'
    data_path = '/kaggle/input/rewritten-texts-with-gemma-2b/rewritten_texts_csv.csv'
    output_path = f'outputs'
    model_save_path =  f'{model_name}_adapter'
    
    # Model training argument
    epochs=20
    batch_size=1 
    max_length=512 
    lr = 1e-3
    
print(CFG.model_save_path)

gemma_2b_en_adapter


### Load and Train the model

Quantization technique is used to reduce memory and computational costs by representing weights and activations with lower-precision data types like 8-bit integers (int8). 

In [6]:
## Load data and split into training and valid dataset
def load_data():
    df = pd.read_csv(CFG.data_path, encoding='latin-1')
    output_texts = []
    for index in range(len(df)):
        row = df.iloc[index]
        original_text = row['original_text']
        prompt = row['prompt']
        rewritten_text = row['rewritten_text']
        # Format the prompt with original and rewritten texts
        formatted_prompt = f"""Original Text:\n{original_text}\n\n
                               Prompt:\n{prompt}\n\n
                               Rewritten text:\n{rewritten_text}"""
        if len(formatted_prompt) < CFG.max_length:
            output_texts.append(formatted_prompt)
    del df
    return output_texts

In [7]:
def train_model():
    # Load the training data
    training_data = load_data() 
    # Load the Gemma and add lora layer
    # ref: https://ai.google.dev/gemma/docs/lora_tuning
    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(CFG.model_name)
    gemma_lm.summary()
    # This will freeze all weights on the backbone,
    # while enabling Lora on the query & value layers of the attention layers.
    gemma_lm.backbone.enable_lora(rank=4)
    gemma_lm.preprocessor.sequence_length = CFG.max_length
    # Create the optimizer (AdamW)
    optimizer = keras.optimizers.AdamW(learning_rate=CFG.lr,
                                       weight_decay=0.001,
                                       beta_1=0.9,
                                       beta_2=0.999)
    optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])
    # Add optimizer, loss function and evalution metrics
    gemma_lm.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                     optimizer=optimizer,
                     weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()])
    # Train the model with 
    gemma_lm.fit(training_data, epochs=CFG.epochs, batch_size=1, verbose=1)      
    # Save the model
    gemma_lm.save_weights(CFG.model_save_path)
    gemma_lm.preprocessor.tokenizer.save_assets(CFG.model_save_path)

In [8]:
TRAINING = False # True: Enable training, False: Infer only
if TRAINING:
    train_model()
    os._exit(0)

##  Fine-tuning Gemma-7b using pytorch library<a class="anchor" id="pretrained"></a>
Fine-tuned pretrained LLM (Gemma/Mistral/Phi) to infer the testing data's prompt.

- @ZHANSAYA YUSSUPOVA [Gemma 7B with LoRa | Prompt Recovery](https://www.kaggle.com/code/yujansaya/gemma-7b-with-lora-prompt-recovery)


In [9]:
class CFG:
    model_name = 'gemma_7b'
    model_paths = {'gemma_7b': '/kaggle/input/gemma/transformers/7b-it/2'}
    model_path = model_paths[model_name]
    
    # Model training argument
    data_path = '/kaggle/input/gemma-rewrite-nbroad/nbroad-v2.csv'
    model_save_path =  f'{model_name}_adapter'
    max_length=150 # truncate the text to the first 150 words to avoid OOM issues.
    NROWS = 10 # Read 1000 texts from dataset
    batch_size = 1
    lr = 2e-4

## Load the model

In [10]:
def load_model():
    accelerator = Accelerator()
    # Use quantization technique to reduce the memory usage
    quantization_config = BitsAndBytesConfig(
        load_in_4bit = True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=True,
    )
    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(CFG.model_path)
    # Load the model
    model = AutoModelForCausalLM.from_pretrained(
                                CFG.model_path,
                                device_map = "auto",
                                trust_remote_code = True,
                                quantization_config=quantization_config)
    model = accelerator.prepare(model)
    return model, tokenizer

## Model training with prompts generated by Gemma LLM

In [11]:
# Formate the row (example) data with an instruction
def formatting_func(example):
    prompt = f"""Original Essay:\n{example['original_text'][0]}\n\n
               Rewritten Essay:\n{example['rewritten_text'][0]}\n\n
               Instruction:\n Given are 2 essays, the Rewritten essay was created from the Original essay using the google Gemma model.
               You are trying to understand how the original essay was transformed into a new version. 
               Analyzing the changes in style, theme, etc., please come up with a prompt that must have been used to guide the transformation from the original to the rewritten essay.
               Only give me the PROMPT. Start directly with the prompt, that's all I need.
               Output should be only line ONLY.\n\n
               Response: \n{example['rewrite_prompt'][0]}"""
    return [prompt]

def train_model(model, tokenizer):
    # Load the training data
    df = pd.read_csv(CFG.data_path, nrows=CFG.NROWS)
    # Create the dataset
    training_ds = Dataset.from_pandas(df)
    # Tokenizer 
    training_ds = training_ds.map(lambda samples: tokenizer(samples["original_text"]), batched=True)
    training_ds = training_ds.map(lambda samples: tokenizer(samples["rewritten_text"]), batched=True)
    training_ds = training_ds.map(lambda samples: tokenizer(samples["rewrite_prompt"]), batched=True)    
    # Add PEFT (lora) layer
    lora_config = LoraConfig(r=32, # Rank
                             lora_alpha=32,
                             target_modules=["q_proj", "o_proj", "k_proj", 
                                             "v_proj", "gate_proj", "up_proj", "down_proj"],
                             lora_dropout=0.05,
                             bias="none",
                             task_type=TaskType.CAUSAL_LM)
    # Training arguments
    args = TrainingArguments(
            per_device_train_batch_size=CFG.batch_size,
            gradient_accumulation_steps=4,
            warmup_steps=2,
            max_steps=10,
            learning_rate=CFG.lr,
            fp16=True,
            logging_steps=1,
            output_dir="outputs",
            optim="paged_adamw_8bit",
            report_to="none"
        )
    # Create a trainer (supervised fine-tuned trainer)
    trainer = SFTTrainer(model=model,
                         train_dataset=training_ds,
                         args=args,
                         peft_config=lora_config,
                         formatting_func=formatting_func)
    trainer.train()
    # Save the model
    trainer.save_model(CFG.model_save_path)
    tokenizer.save_pretrained(CFG.save_path)
    print(f"Save the model to {CFG.save_path}")
    

In [12]:
TRAINING = False # True: Enable training, False: Infer only
if TRAINING:
    model, tokenizer = load_model()
    train_model(model, tokenizer)
    os._exit(0)

# Model Inference <a class='anchor' id='infer'></a>
- [Load testing data](#load_data)
- [Generate prompts using fine-tuned Phi LLM](#phi)
- [Generate the prompts using pretrained Gemma-7b LLM](#llm)
- [Generate the prompts using pretrained Mistral-7b LLM (version 2)](#mistral)
- [Modify the prompts using Spacy](#mean-prompt): This is inspired by [Spacy mean-prompt modification (just CPU!)](https://www.kaggle.com/code/richolson/spacy-mean-prompt-modification-just-cpu/notebook)

In [13]:
class CFG:
    # Get device (CPUs or GPUs)
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model_paths = {'phi': '/kaggle/input/phi/transformers/2/1',
                   'gemma-7b': '/kaggle/input/gemma/transformers/7b-it/2', 
                   'mistral-7b': '/kaggle/input/mistral/pytorch/7b-instruct-v0.1-hf/1',
                   'mistral-7b-v2': '/kaggle/input/mistral-7b-it-v02',
                   'mistral-8x7b':'/kaggle/input/mixtral/pytorch/8x7b-instruct-v0.1-hf/1'
                   }


## Load testing data <a class='anchor' id='load_data'></a> 

In [14]:
# Load the testing data
test_df = pd.read_csv('/kaggle/input/llm-prompt-recovery/test.csv', index_col='id')
test_df["rewrite_prompt"] = "-" # Empty
test_df.head()

Unnamed: 0_level_0,original_text,rewritten_text,rewrite_prompt
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
-1,The competition dataset comprises text passage...,Here is your shanty: (Verse 1) The text is rew...,-


# Generate prompts using fine-tuned Phi LLM <a class='anchor' id='phi'></a>
Use the Microsoft Phi LLM fined-tuned by @LUMOS [phi2-public-data-sft-adapter](https://www.kaggle.com/models/mozhiwenmzw/phi2-public-data-sft-adapter/frameworks/PyTorch/variations/public-data-sft/versions/1) to generate the prompts of testing data

Credits:
- @Lumos [[0.61+]LLMPR phi2 sft model training](https://www.kaggle.com/code/mozhiwenmzw/0-61-llmpr-phi2-sft-model-training)
- @Lumos [[0.61+]LLMPR phi2 sft model generate infer](https://www.kaggle.com/code/mozhiwenmzw/0-61-llmpr-phi2-sft-model-generate-infer)

In [15]:
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
adapter_path = '/kaggle/input/phi2-public-data-sft-adapter/pytorch/public-data-sft/1/phi2_public_data_sft'

In [16]:
class PhiModelRecover:
    def __init__(self):
        self.model_name = 'phi'
        self.load_model()
        self.input_token_len = 1024
        self.output_token_len = 100 
        
    # Load tokenizer and model
    def load_model(self):
        model_path = CFG.model_paths[self.model_name]
        print(f"model_path = {model_path}")
         # Load the tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        # Load the pretrained LLM in 4bit quantization  
        q_config = BitsAndBytesConfig(
            load_in_4bit = True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
            bnb_4bit_use_double_quant=True,
        )
        # Load the model
        base_model = AutoModelForCausalLM.from_pretrained(model_path,
                                                          device_map="auto",
                                                          trust_remote_code=True,
                                                          quantization_config=q_config)
        # Load PEFT adapter to the model
        self.model = PeftModel.from_pretrained(base_model, adapter_path)
        print(f"Complete loading PEFT adapter {adapter_path}")
        self.model.to(CFG.DEVICE)
        self.model.eval()
        print("Complete loading the model")
        
    # Generate the prompts using Phi models
    def prompt_generate(self, original_text, rewrite_text):
        prompt = f"""Instruct: Original Text:{original_text}\n
                     Rewritten Text:{rewrite_text}\n
                     Write a prompt that was likely given to the LLM to rewrite original text
                     to rewritten text.\nOutput:"""
        # print(f"prompt = {prompt}")
        # Tokenize the prompt and truncate to '1024' tokens
        inputs = self.tokenizer(prompt, max_length=self.input_token_len,
                                truncation=True, return_tensors="pt", return_attention_mask=False)
        try:
            max_length = len(inputs.input_ids[0]) + self.output_token_len
            #print(f"max_length = {max_length}")
            # Move inputs to GPU
            inputs = {k:v.to(CFG.DEVICE) for k,v in inputs.items()}
            # print(f"inputs = {inputs}")        
            # Generate the prompt
            outputs = self.model.generate(**inputs,
                                         do_sample=False,
                                         max_length=max_length,
                                         pad_token_id=self.tokenizer.pad_token_id)
            # Encode the output to texts (strings)
            text = self.tokenizer.batch_decode(outputs,
                                               skip_special_tokens=True,
                                               clean_up_tokenization_spaces=False)[0]
            text_arr = text.split("Output:")
            generated_prompt = text_arr[1].strip()
            # print(f"generated_prompt = {generated_prompt}")
            return generated_prompt
        except Exception as e:
            print(f"ERROR: {e}")
            pass # Add the default prompt if errors occur
    
    def infer(self, test_df):
        default_prompt = """Please improve the following text using the writing style of, 
                            maintaining the original meaning but altering the tone, diction, 
                            and stylistic elements to match the new style.Enhance the clarity, 
                            elegance, and impact of the following text by adopting the writing style of,
                            ensuring the core message remains intact while transforming the tone,
                            word choice, and stylistic features to align with the specified style."""
        rewrite_prompts = []
        for i in range(len(test_df)):
            row = test_df.iloc[i]
            prompt = default_prompt
            try:
                prompt = self.prompt_generate(row['original_text'], row['rewritten_text'])
            except Exception as e:
                print(f"ERROR: {e}")
                pass # Add the default prompt if errors occur
            rewrite_prompts.append(prompt)
        del self.model, self.tokenizer
        return rewrite_prompts

In [17]:
SUBMISSION = False
if SUBMISSION:
    recover = PhiModelRecover() 
    rewrite_prompts = recover.infer(test_df)
    print(f"rewrite_prompts = {rewrite_prompts}")
    del recover
    # Submission
    submission = pd.read_csv('/kaggle/input/llm-prompt-recovery/sample_submission.csv')
    submission["rewrite_prompt"] = rewrite_prompts
    submission.to_csv('submission.csv', index=False)
    display(submission)


# Generate the prompts using pretrained Gemma-7b LLM <a class='anchor' id='llm'></a>
Use pretrained Gemma-7b LLM to generate the prompts directly from testing data.
- @RENOIR [Perplexity Baseline [Phi-2,Gemma-7b-it]](https://www.kaggle.com/code/itahiro/perplexity-baseline-phi-2-gemma-7b-it)
- @PSI [h2oGPT Perplexity Ranking](https://www.kaggle.com/code/philippsinger/h2ogpt-perplexity-ranking)

In [18]:
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [19]:
# Perplexity is a metric that measures the quality of language models
# Perplexity is calculated as the exponent of the loss obtained from the model.
class Perplexity(nn.Module):
    def __init__(self, reduce: bool = True):
        super().__init__()
        self.loss_fn = nn.CrossEntropyLoss()
        self.reduce = reduce

    def forward(self, logits, labels):
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()

        perplexity = []
        for i in range(labels.shape[0]):
            perplexity.append(self.loss_fn(shift_logits[i], shift_labels[i]))
        perplexity = torch.stack(perplexity, dim=0)
        #perplexity = torch.exp(perplexity)
        if self.reduce:
            perplexity = torch.mean(perplexity)
        return perplexity

In [20]:
rewrite_prompt_templates = [
"""Please improve this text using the writing style with maintaining the original meaning but
   altering the tone.""",
"""Please improve the following text by reimagining it through the lens of 
   [insert desired style here], retaining the original essence while elevating its clarity,
   eloquence, and potency by modulating the tone, word choice, and stylistic nuances to 
   harmoniously embody the stylistic features while ensuring the core message remains intact.""",
"""Please improve the following text using the writing style of, 
   maintaining the original meaning but altering the tone, diction,  
   and stylistic elements to match the new style.Enhance the clarity, 
   elegance, and impact of the following text by adopting the writing style of,
   ensuring the core message remains intact while transforming the tone,
   word choice, and stylistic features to align with the specified style.""",
]

In [21]:
class GemmaModelRecover:
    def __init__(self):
        self.model_name = 'gemma-7b'
        self.perp_nn = Perplexity() # Compute the perplexity
        self.load_model()
        
    # Load tokenizer and model
    def load_model(self):
        model_path = CFG.model_paths[self.model_name]
        print(f"model_path = {model_path}")
         # Load the tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        # Load the pretrained LLM in 4bit quantization  
        q_config = BitsAndBytesConfig(
            load_in_4bit = True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
            bnb_4bit_use_double_quant=True,
        )
        # Load the model
        self.model = AutoModelForCausalLM.from_pretrained(model_path,
                                                          device_map="auto",
                                                          trust_remote_code=True,
                                                          quantization_config=q_config)
        print("Complete loading the model")
        
    # Infer the prompt for given texts (df)
    def infer(self, df):
        prompts = []
        for idx in range(len(df)):
            row = df.iloc[idx]
            p_scores = []
            with torch.no_grad():
                 # # Combine the rewrite prompt with row data (original text, rewritten text) as a prompt
                rw_prompts = []
                for rw_prompt in rewrite_prompt_templates:
                    rw_prompt = rw_prompt.replace("\n", " ")
                    rw_prompt = rw_prompt.replace("\s+", " ")
                    # print(f"rw_prompt = {rw_prompt}")
                    rw_prompts.append(f"""<start_of_turn>
                                            user {rw_prompt} {row["original_text"]}
                                          <end_of_turn>
                                          <start_of_turn>
                                              model{row["rewritten_text"]}
                                          <end_of_turn>""")
                # Encode prompts to embeddings
                inputs = self.tokenizer(rw_prompts, return_tensors="pt",
                                        add_special_tokens=False,
                                        padding=True, truncation=True).to(CFG.DEVICE)
                # Get the output
                output = self.model(input_ids=inputs["input_ids"],
                                    attention_mask=inputs["attention_mask"])
                logits = output.logits

                labels = inputs["input_ids"]
                # Attention masks has three kinds of scores:
                # 1 = attend; 0 = ignore; -100: nullifying their impact on the sequence.
                labels.masked_fill_(~inputs["attention_mask"].bool(), -100) # -100 

                # Compute the perplexity of model output (logits) and actual labels
                for i in range(len(rewrite_prompt_templates)):
                    p_score = self.perp_nn(logits[i].unsqueeze(0), 
                                           labels[i].unsqueeze(0))
                    p_scores.append(p_score.detach().cpu())
                del inputs, labels, output, logits
            # Convert 'perps' as numpy array
            p_scores = np.array(p_scores)
            # Display the perplexity metric
            print(f"p_scores = {p_scores}")
            # Get the best output results of the lowest 
            best_pred = [np.array(rewrite_prompt_templates)[np.argsort(p_scores)][0]]
            print(f"best_pred = {best_pred}")
            prompts.append(str(best_pred[0]))
            clear_memory()
        return prompts

In [22]:
SUBMISSION = False
if SUBMISSION:
    recover = GemmaModelRecover() 
    rewrite_prompts = recover.infer(test_df)
    print(f"rewrite_prompts = {rewrite_prompts}")
    del recover
    # Submission
    submission = pd.read_csv('/kaggle/input/llm-prompt-recovery/sample_submission.csv')
    submission["rewrite_prompt"] = rewrite_prompts
    submission.to_csv('submission.csv', index=False)
    display(submission)


# Generate the prompts using pretrained Mistral-8x7b/ Mistral-7b-v2 <a class='anchor' id='mistral'></a>
Use pretrained Mistral LLM to generate the prompts directly from testing data. No training is required. To guide its predictions, provide a few examples and leverage the Mistral LLM to predict prompts for the testing data.

- @RICH OLSON [Mistral 7B Prompt Recovery (Version 2)](https://www.kaggle.com/code/richolson/mistral-7b-prompt-recovery-version-2)
- @AATIF FRAZ [Prompt Prediction w/ Mixtral/Mistral7B/Gemma/Llama](https://www.kaggle.com/code/aatiffraz/prompt-prediction-w-mixtral-mistral7b-gemma-llama/notebook)

In [23]:
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [24]:
# # Disable effiency to avoid the issues reported by https://github.com/Lightning-AI/lit-gpt/issues/327
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)

In [25]:
MISTRAL_INSTRUCTION = """
Provide the new text and I will tell you what new element was added or change in tone was made
to improve it - with no references to the original.
I will avoid mentioning names of characters.
It is crucial no person, place or thing from the original text be mentioned.
For example - I will not say things like 'change the puppet show into a book report'
- I would just say 'improve this text into a book report'.
If the original text mentions a specific idea, person, place, or thing - I will not mention it in my answer.
For example if there is a 'dog' or 'office' in the original text - the word 'dog' or 'office' must not be in my response.
My answer will be a single sentence."""

MISTRAL_INSTRUCTION = MISTRAL_INSTRUCTION.replace("\n", " ")
MISTRAL_INSTRUCTION = MISTRAL_INSTRUCTION.replace("\s+", " ")

In [26]:
# 10 examples of rewritten prompts 
df1 = pd.read_csv('/kaggle/input/rewrite-prompts-examples/rewrite_examples.csv')
df2 = pd.read_csv('/kaggle/input/gemma-rewrite-nbroad/nbroad-v2.csv')

example_df = pd.concat([df1, df2], axis=0)
example_df = example_df.sample(n=10) # Sample 10 examples
del df1, df2

display(example_df)
examples = []
# Add 10 examples
for example_text, example_rewrite, example_prompt in zip(example_df['original_text'],
                                                         example_df['rewritten_text'],
                                                         example_df['rewrite_prompt']):
    examples.append({"role": "user", "content": f"Original Text: {example_text}"})
    examples.append({"role": "assistant", "content": MISTRAL_INSTRUCTION})
    examples.append({"role": "user", "content": f"Re-written Text: {example_rewrite}"})
    examples.append({"role": "assistant", "content": f"The request was:  {example_prompt}"})

Unnamed: 0.1,Unnamed: 0,original_text,rewritten_text,rewrite_prompt,id
2173,,"`` Jacques, you're digging your teeth into my ...","In the dim, smoky haze of the Parisian catacom...",Rewrite the story as a noir with gangsters and...,opuqnkOAyP
2159,,"There is wind in her eyes, playing with her la...","There's wind in her eyes, like a playful breez...",Rewrite the story as a comedy where they have ...,IVxxhZAhPH
1370,,"Sometimes, a girl up and gets herself pregnant...","However it happens, it happens often enough th...",Rewrite the story with more absurd humor and r...,xwYYIkBtms
1522,,"I'm a simple guy, and I do a simple job. Get i...","The music cuts out, and the hallway erupts in ...",Rewrite the story as a chase scene from a Holl...,SRzHlChqxE
1117,,He never did care for anything. Not the smile ...,He never did care for anything. Not the smile ...,Rewrite the essay with a darker twist,ZrcyKRomnC
1754,,"Darius hits the wall with a loud thump, flung ...","Darius hits the wall with a loud thump, flung ...","Rewrite by merging this prompt with: ""In a fan...",GPxUDTLGvY
2395,,"`` Dave, have you seen this?'' There is no ans...","Alas, poor Dave, a victim of fate's cruel hand...",Rewrite this as a Shakespearean tragedy,ljSaWTXjFc
1019,,"`` You know, Pickering,'' said the man, `` fir...","In the twilight realm of dreams and illusion, ...",Rewrite your description to be surreal and dre...,gZFRWPfQCP
921,,"Dear Santa, \n \n Thanks for the BB gas gun. I...","In the quaint town of Snow Creek, where snowfl...",Rewrite the story as a heartwarming tale,SCwCSnMXwE
202,,"Oh, thank you doctor. I've been excited to lea...","Oh, thank you doctor. I've been excited to lea...",Rewrite the essay so the results come back fro...,VwwAXvxbdL


## Utility function
Utility function processes the output/responses from the Mistral LLM

In [27]:
#mistral v02 tends to respond with the input after providing the answer  
#This trims response text to the requested number of sentences (or first LF or double-space sequence)
def trim_to_first_num_sentences(text, num_sentences):
    if num_sentences <= 0:
        return "" # Return empty string

    # Split text at the first linefeed
    text_chunks = text.split('\n', 1)
    first_chunk = text_chunks[0]

    # Split the first chunk into sentences, considering the space after each period
    sentences = [sentence.strip() for sentence in first_chunk.split('.') if sentence]

    # If there's a linefeed, return the text up to the first linefeed
    if len(text_chunks) > 1:
        # Check if the first chunk has fewer sentences than x, and if so, just return it
        if len(sentences) < num_sentences:
            trimmed_text = first_chunk
        else:
            # Otherwise, trim to x sentences within the first chunk
            trimmed_text = '. '.join(sentences[:num_sentences]).strip()
    else:
        # If there's no linefeed, determine if the number of sentences is less than or equal to x
        if len(sentences) <= num_sentences:
            trimmed_text = '. '.join(sentences).strip()  # Ensure space is preserved after periods
        else:
            # Otherwise, return the first x sentences, again ensuring space after periods
            trimmed_text = '. '.join(sentences[:num_sentences]).strip()

    # Add back the final period if it was removed and the text needs to end with a sentence.
    if len(sentences) > 0 and not trimmed_text.endswith('.'):
        trimmed_text += '.'

    return trimmed_text


# Get text after last [/INST]
def trim_output(text):
    TERMINATE = "[/INST]"
    text = text.replace('</s>', '')
    #just in case it puts things in quotes
    text = text.replace('"', '')
    text = text.replace("'", '')
    # Get the last [/INST]
    last_pos = text.rfind(TERMINATE)
    return text[last_pos + len(TERMINATE):] if last_pos != -1 else text

# remove all number bullets
def remove_numbered_bullets(text):
    processed_lines = []
    lines = text.split('\n')
    for line in lines:
        # Split each line at the first occurrence of '. '
        parts = line.split('. ', 1)
        # Part is likely a numbered list item, remove the numbering
        if len(parts) > 1 and parts[0].isdigit():
            processed_lines.append(parts[1])
        else: # Not a numbered lis. Add the line
            processed_lines.append(line)
    # Combine all processed lines to a single text
    return '\n'.join(processed_lines)

# Returns only response text that occurs after "the request was: "
# for example, "The request was:  Improve this text by making it a shanty."
def get_response(text):
    repsonse = text
    parts = text.rsplit("The request was: ", 1)
    if len(parts) > 1: # Check if the text contain "The request was: "
        response = parts[1].strip()  # Get the texts after "The request was"
    #Clean up numbered lists
    response = remove_numbered_bullets(response)
    return response

### Model Inference

In [28]:
default_prompt = """
Refine the following passage by emulating the writing style of [insert desired style here], 
with a focus on enhancing its clarity, elegance, and overall impact.
Preserve the essence and original meaning of the text, while meticulously adjusting its tone, vocabulary, and stylistic elements to resonate with the chosen style.
Please improve the following text using the writing style of, maintaining the original meaning but altering the tone, diction, and stylistic elements to match the new style.
Enhance the clarity, elegance, and impact of the following text by adopting the writing style of ,
ensuring the core message remains intact while transforming the tone, word choice, and stylistic features
to align with the specified style.
"""

In [29]:
class MistralModelRecover:
    def __init__(self, examples=examples):
        self.model_name = 'mistral-7b-v2'
        self.examples = examples
        self.max_new_tokens = 40 # number of generated prompts (output)
        self.max_sentences = 1 # number of sentences of generated prompts (output)
        self.load_model()
        
    # Load tokenizer and model
    def load_model(self):
        model_path = CFG.model_paths[self.model_name]
        print(f"model_path = {model_path}")
         # Load the tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.tokenizer.padding_side = 'left'
        self.tokenizer.add_eos_token = True
        self.tokenizer.add_bos_token = True
        # Load the pretrained LLM in 4bit quantization  
        q_config = BitsAndBytesConfig(
            load_in_4bit = True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
            bnb_4bit_use_double_quant=True,
        )
        # Get model configuration
        config = AutoConfig.from_pretrained(model_path)
        config.gradient_checkpointing = True
        # Load the model
        self.model = AutoModelForCausalLM.from_pretrained(model_path,
                                                          device_map="auto",
                                                          trust_remote_code=True,
                                                          quantization_config=q_config, 
                                                          torch_dtype=torch.bfloat16,
                                                          config=config) 
        print(f"Complete loading the model")
      
    def generate_prompt(self, original_text, rewritten_text):
        messages = self.examples.copy() 
        # Add testing data
        messages.append({"role": "user", "content": f"Original Text: {original_text}"})
        messages.append({"role": "assistant", "content": MISTRAL_INSTRUCTION})
        messages.append({"role": "user", "content": f"Re-written Text: {rewritten_text}"})
        messages.append({"role": "assistant", "content": f"The request was:  Improve this text by"})

        # Pass messages to Mistral
        model_inputs = self.tokenizer.apply_chat_template(messages, return_tensors="pt")
        # Move to GPUs
        model_inputs = model_inputs.to(CFG.DEVICE) 
        # Generate the prompts 
        generated_ids = self.model.generate(model_inputs,
                                            max_new_tokens=self.max_new_tokens,
                                            pad_token_id=self.tokenizer.eos_token_id)

        # Decode and trim to actual response
        decoded_output = self.tokenizer.batch_decode(generated_ids)
        # print(f"decoded_output[0] = {decoded_output[0]}")
        trimed_output = trim_output(decoded_output[0])
        # print(f"trimed_output = {trimed_output}")
        response = get_response(trimed_output)
        # Trim the first number of sentences
        # print(f"Before trimming first number of sentences: {response}")
        response = trim_to_first_num_sentences(response, self.max_sentences)
        # print(f"After trimming first number of sentences: {response}")
        del messages, model_inputs, generated_ids, decoded_output, trimed_output
        clear_memory()
        return response

    # Infer the prompt for given texts (df)
    def infer(self, df):
        prompts = []
        for idx in range(len(df)):
            row = df.iloc[idx]
            prompt = self.generate_prompt(row['original_text'], row['rewritten_text'])
            #default to baseline if empty or unusually short
            if len(prompt) < 15:
                prompt = default_prompt
            prompts.append(prompt)
            
        del self.model, self.tokenizer
        return prompts

In [30]:
SUBMISSION = False
if SUBMISSION:
    recover = MistralModelRecover() 
    rewrite_prompts = recover.infer(test_df)
    print(f"rewrite_prompts = {rewrite_prompts}")
    del recover
    # Submission
    submission = pd.read_csv('/kaggle/input/llm-prompt-recovery/sample_submission.csv')
    submission["rewrite_prompt"] = rewrite_prompts
    submission.to_csv('submission.csv', index=False)
    display(submission)

# Modify the prompts using Spacy <a class='anchor' id='mean-prompt'></a>
This approach is inspired by @ RICH OLSON [Spacy mean-prompt modification (just CPU!)](https://www.kaggle.com/code/richolson/spacy-mean-prompt-modification-just-cpu/notebook)

- **Identify unique words:** find words present in the rewritten text but absent from the original.
- **Discover similar words:** search for semantically similar words (synonyms) for this set of unique words.
- **Replace unique words with similar words:** replace these unique words with their similar words while ensuring they maintain the same part of speech. 

The goal is to refine the prompt to bring it closer to the actual prompt, hopefully improving performance.

## Import spacy

In [31]:
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
REWRITE_PREFIX = 'Please improve this text'

In [32]:
from collections import OrderedDict

import spacy
from scipy.spatial import distance

# Load SpaCy's English language model
nlp = spacy.load("en_core_web_lg")

# loads spacy's vocuabulary...
for s in nlp.vocab.vectors:
    _ = nlp.vocab[s]

In [33]:
#how many words get swapped out of base prompt
words_to_swap = 5

#start this many away from the word that most represent the entire text 
#increasing this number will result in more specific words (0 = most general)
word_semantic_offset = 3

### Step 1. Identify unique words

In [34]:
# Get the texts that contain the words unique to rewrites (not in originals)
def identify_unique_words(original_text, rewrite_text):
    # Tokenize original and rewrite text using NLP
    original_doc = nlp(original_text)
    rewrite_doc = nlp(rewrite_text)
    # Collect all nouns and verbs from original text as a set of removing words
    words_to_remove = set([token.lemma_.lower() for token in original_doc
                           if token.pos_ in ["NOUN", "VERB"] or token.is_stop])
    
    # Remove nouns and verbs from the rewrite text, ensuring no noun and verb duplicates
    words_to_keep = OrderedDict((token.text, None) for token in rewrite_doc 
                             if token.lemma_.lower() not in words_to_remove 
                             and token.pos_ != "PROPN" and token.text_with_ws.strip())
    
    # Combine all kept words to form the filtered text
    filtered_text = " ".join(words_to_keep.keys())
    filtered_text = filtered_text.replace("'s", "") # remove 's
    filtered_text = filtered_text.replace('"', '') # remove "
    # print(f"filtered_text = {filtered_text}")
    unique_words = nlp(filtered_text) # Tokenize the text to doc
#     print(f"Unique words = {unique_words}")
    return unique_words

### Step 2. Discover similar words of unique words

In [35]:
# Collect all words from NLP library
all_words = [word for word in nlp.vocab 
             if word.has_vector and not word.is_stop # filter stop words, alpahbets, and words of single character (like a)
             and word.is_alpha and len(word.text) > 1]

def get_most_similar_words(unique_words):
#     print(f"unique_words = {unique_words}")
    # Get word vectors of all unique words
    unique_word_vectors = [token.vector for token in unique_words if token.has_vector]
    if unique_word_vectors:
        # Get the mean vector
        average_word_vector = np.mean(unique_word_vectors, axis=0) # Compute the average word vector
        # print(f"all_words = {all_words}")
#         print(f"all_words[0] = {all_words[0]}")
        # Get word vectors to compute word similarity
        all_word_vectors = np.array([word.vector for word in all_words])
        # Compute cosine distances and find the closest words
        distances = distance.cdist([average_word_vector], all_word_vectors, "cosine")[0]
        # Sort the distance (from)
        sorted_indexes = distances.argsort()[:(word_semantic_offset + words_to_swap)]
        # Sort the words by its similarity to average vector
        sorted_words = [all_words[idx].text for idx in sorted_indexes]
        # print(f"sorted words = {sorted_words}")
        similar_words = sorted_words[words_to_swap * -1:]
#         print(f"Most similar word to average unique word = {similar_words}")
        # Map the similar words to their parts of speech (pos) for a single use
        similar_words_by_pos = {}
        for token in nlp(" ".join(similar_words)):
            if token.pos_ not in similar_words_by_pos:
                similar_words_by_pos[token.pos_] = []
            similar_words_by_pos[token.pos_].append(token.text)
        return similar_words, similar_words_by_pos
    else:
        print("No average vector could be computed.")
        return []

### Step 3. Replace the prompt with simiar words

In [36]:
#Substitutes words matching part-of-speech, managing spacing, preserves structure
def replace_prompt_with_similar_words_pos(doc_prompt, similar_words, similar_words_by_pos):
    rewrite_words = [] # 
    # List of POS tags of similar words 
    pos_tags = [token.pos_ for token in nlp(" ".join(similar_words))]  
    # Iterate over the doc tokens
    for token in doc_prompt:
        rewrite_word = token.text_with_ws
        # Check if the token's POS is in our list of POS tags for substitution
        if token.pos_ in pos_tags and not token.is_punct and not token.is_space:
            for pos in pos_tags:
                if token.pos_ == pos and pos in similar_words_by_pos and similar_words_by_pos[pos]:
                    rewrite_word = similar_words_by_pos[pos].pop(0)
                    break
        rewrite_words.append(rewrite_word)
    # Combine words to a text
    return " ".join(rewrite_words)

#returns prompt with relevant words swapped
def replace_prompt_with_similar_words(rewrite_prompt, similar_words, similar_words_by_pos):
    # Process the template sentence
    doc_prompt = nlp(rewrite_prompt)

    # Perform substitutions while managing spacing and punctuation
    rewritten_prompt = replace_prompt_with_similar_words_pos(doc_prompt,
                                                             similar_words,
                                                             similar_words_by_pos)
    # Clean up the prompt
    rewritten_prompt = rewritten_prompt.lower()
    rewritten_prompt = rewritten_prompt.replace("  ", " ")
    rewritten_prompt = rewritten_prompt.replace("  ", " ")
    rewritten_prompt = rewritten_prompt.replace(" ,", ",")
    rewritten_prompt = rewritten_prompt.replace("[ ", "[")
    rewritten_prompt = rewritten_prompt.replace(" ]", "]")
    rewritten_prompt = rewritten_prompt.replace(" .", ".")
    # Special case for incorrect spelling word 'somethin'?
    rewritten_prompt = rewritten_prompt.replace("somethin ", "something ")
    rewritten_prompt = rewritten_prompt.replace("somethin, ", "something, ")
    return f"{rewritten_prompt}"

### Load demo data 
Load @DIPAM CHAKRABORTY [3000 Rewritten texts - Prompt recovery Challenge](https://www.kaggle.com/datasets/dipamc77/3000-rewritten-texts-prompt-recovery-challenge) that contain original text, rewrite text and rewrite prompt.

In [37]:
TESTING = False
if TESTING:
    demo_df = pd.read_csv("/kaggle/input/3000-rewritten-texts-prompt-recovery-challenge/prompts_0_500_wiki_first_para_3000.csv")
    demo_texts = demo_df[:5]
    # Test with demo data
    for index, row in demo_texts.iterrows():
        print("------------")
        print("Actual Rewrite Prompt:", row['rewrite_prompt'])
        unique_words = identify_unique_words(row['original_text'], row['rewritten_text'])
        # print("\nUnique words:", unique_words)
        similar_words, similar_words_by_pos = get_most_similar_words(unique_words)
        # print("\nMost similar words to unqiue words:", similar_words)
        rewrite_prompt = 'Please improve this text using the writing style with maintaining the original meaning but altering the tone.'
        print ("\nPredicted prompt:", replace_prompt_with_similar_words(rewrite_prompt, similar_words))

### Load and submit testing data

In [38]:
SUBMISSION = True
if SUBMISSION:
    test_df = pd.read_csv("/kaggle/input/llm-prompt-recovery/test.csv")
    test_df['original_text'] = test_df['original_text'].fillna('')
    test_df['rewritten_text'] = test_df['rewritten_text'].fillna('')
    # # Add a list of rewrite prompt
#     rewrite_prompt = 'Please improve this text using the writing style with maintaining the original meaning but altering the tone.'
#     rewrite_prompts = [rewrite_prompt for i in range(len(test_df))]
    # # # # Use fine-tuned Phi to generate the prompts
#     recover = PhiModelRecover() 
#     rewrite_prompts = recover.infer(test_df)
    # # # Use pretrained Gemma-7b to generate the prompts
    recover = GemmaModelRecover()  
    rewrite_prompts = recover.infer(test_df)    
    # Use Mistral-7b-v2 model to generate the prompts
#     recover = MistralModelRecover() 
#     rewrite_prompts = recover.infer(test_df)
    del recover
    clear_memory()
    print(f"rewrite_prompts = {rewrite_prompts}")
    rewrite_prompts_spacy = []
    for index, row in test_df.iterrows():
        unique_words = identify_unique_words(row['original_text'], row['rewritten_text'])
        similar_words, similar_words_by_pos  = get_most_similar_words(unique_words)
        print(f"unique_words = {unique_words} Similar words = {similar_words}")
        rewrite_prompt = rewrite_prompts[index]
        rewrite_prompt_spacy = replace_prompt_with_similar_words(rewrite_prompt,
                                                                 similar_words,
                                                                 similar_words_by_pos)
        print(f"Rewrite prompt spacy = {rewrite_prompt_spacy}")
        del unique_words, similar_words, similar_words_by_pos
        rewrite_prompts_spacy.append(rewrite_prompt_spacy)
    # Add rewrite prompts
    test_df['rewrite_prompt'] = rewrite_prompts_spacy
    test_df.to_csv('test_df.csv', index=False)
    # Write to csv file
    test_df[['id', 'rewrite_prompt']].to_csv('submission.csv', index=False)
    display(test_df['rewrite_prompt'])

model_path = /kaggle/input/gemma/transformers/7b-it/2


Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu`   instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


Complete loading the model
p_scores = [7.1796036 7.777508  7.0690274]
best_pred = ['Please improve the following text using the writing style of, \n   maintaining the original meaning but altering the tone, diction,  \n   and stylistic elements to match the new style.Enhance the clarity, \n   elegance, and impact of the following text by adopting the writing style of,\n   ensuring the core message remains intact while transforming the tone,\n   word choice, and stylistic features to align with the specified style.']
rewrite_prompts = ['Please improve the following text using the writing style of, \n   maintaining the original meaning but altering the tone, diction,  \n   and stylistic elements to match the new style.Enhance the clarity, \n   elegance, and impact of the following text by adopting the writing style of,\n   ensuring the core message remains intact while transforming the tone,\n   word choice, and stylistic features to align with the specified style.']
unique_words = Here 

0    please overlaying the overlapping thermosettin...
Name: rewrite_prompt, dtype: object

# Evaluation metric <a class='anchor' id='evaluation'></a>
Here, we describe how to evaluate the recovered prompts using cosine similarity with sentence-t5-base from HuggingFace:

- **Compute Embeddings:**
We first compute the sentence embeddings for both the predicted prompts and the original prompts using sentence-t5-base.
- **Calculate Cosine Similarity:**
For each pair of corresponding predicted and original prompts (each row), we calculate the cosine similarity. Cosine similarity measures how similar the embeddings are, ranging from -1 (completely dissimilar) to 1 (identical).

- **Average the Scores:** 
Finally, we average the cosine similarity scores (or the cubed values if applied in step 3) across all prompt pairs. This provides a single overall score (CV score) representing the average similarity between predicted and original prompts.


@SAMBHAV DIXIT [submission with evaluation metric](https://www.kaggle.com/code/sambhavdixit/submission-with-evaluation-metric)

In [39]:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

def CVScore():
    test = pd.read_csv("/kaggle/working/test_df.csv")
    
    scs = lambda row: abs((cosine_similarity(row["actual_embeddings"],
                                             row["pred_embeddings"])) ** 3)
    
    model = SentenceTransformer('/kaggle/input/sentence-t5-base-hf/sentence-t5-base')

    test["actual_embeddings"] = test["rewrite_prompt"].apply(lambda x: model.encode(x, normalize_embeddings=True,
                                                                                    show_progress_bar=False).reshape(1, -1))
    test["pred_embeddings"] = test["rewrite_prompt"].apply(lambda x: model.encode(x, normalize_embeddings=True,
                                                                                  show_progress_bar=False).reshape(1, -1))
    test["score"] = test.apply(scs, axis=1)
    
    return np.mean(test['score'])[0][0]

In [40]:


# print(f"CV Score: {CVScore(test_df)}")