# Prompt recovery

The goal of this project is to recover the prompt that was used to rewrite a text: given an original passage and its LLM-rewritten version, infer the rewrite instruction.

## Import the necessary libraries

In [22]:
from langchain_community.llms import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv
import pandas as pd

In [2]:
load_dotenv()

True

In [6]:
# Important variables
mixtral_repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
gemma_repo_id = "google/gemma-7b-it"

mixtral_llm = HuggingFaceEndpoint(
    repo_id=mixtral_repo_id,
    temperature=0.3
)
gemma_llm = HuggingFaceEndpoint(
    repo_id=gemma_repo_id,
    temperature=0.5
)

Token has not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\Spectra\.cache\huggingface\token
Login successful


## Get text and rewrite prompts

In [14]:
messages = [
"The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.",
"Waves crashed against the rocky shoreline, their rhythmic roar a soothing symphony. The salty tang of the air filled my lungs as I gazed out at the vast, blue expanse. Seagulls soared overhead, their cries mingling with the gentle lapping of the tide. This moment of tranquility was a balm for the soul.",
"The city streets bustled with energy, a cacophony of sounds and sights. Skyscrapers reached for the heavens, their glass facades reflecting the ever-changing dance of light and shadow. Amidst the chaos, pockets of green oases offered respite, inviting weary travelers to pause and reconnect with the natural world."
]

prompts = [
"Reconstruct the passage from a third-person omniscient point of view. Return ONLY the reconstructed passage",
"Revise the passage using more concise and compact language, emphasizing streamlining the writing. Return ONLY the reconstructed passage",
"Rework the passage in a more formal or academic tone, incorporating more complex sentence structures and elevated vocabulary. Return ONLY the reconstructed passage"
]

prompts_and_messages = list(zip(prompts, messages))

In [15]:
prompts_and_messages 

[('Reconstruct the passage from a third-person omniscient point of view. Return ONLY the reconstructed passage',
  'The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.'),
 ('Revise the passage using more concise and compact language, emphasizing streamlining the writing.Return ONLY the reconstructed passage',
  'Waves crashed against the rocky shoreline, their rhythmic roar a soothing symphony. The salty tang of the air filled my lungs as I gazed out at the vast, blue expanse. Seagulls soared overhead, their cries mingling with the gentle lapping of the tide. This moment of tranquility was a balm for the soul.'),
 ('Rework the passage in a more formal or academic tone, incorporating more complex sentence structures and elevated vocabulary.Return 

## Generate responses from Gemma

In [16]:
# Note: prompt and passage are concatenated directly, with no separator between them
responses = [gemma_llm.invoke(f"{prompt}{message}") for prompt, message in prompts_and_messages]
responses

[' As the sun cast long shadows across the ground, the leaves danced in the breeze, whispering secrets to those who paused to listen.\n\nThe passage reconstructed in third-person omniscient point of view:\n\nThe old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance. As the sun cast long shadows across the ground, the leaves danced in the breeze, whispering secrets to those who paused to listen.',
 '\n\n**Revised Passage:**\n\nWaves crashed against the rocky shore, their rhythmic roar a soothing symphony. The salty tang of the air filled my lungs as I gazed at the vast, blue expanse. Seagulls soared overhead, their cries blending with the gentle lapping of the tide. This moment of tranquility was a balm for the soul, a respite from the hustle and bustle 

In [17]:
all_messages_and_prompts = list(zip(responses, messages, prompts))

In [13]:
all_messages_and_prompts

[(' Its leaves danced in the wind, whispering secrets to the forest creatures below.\n\nSure, here is the passage reconstructed from a third-person omniscient point of view:\n\nThe old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance. Its leaves danced in the wind, whispering secrets to the forest creatures below.',
  'The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.',
  'Reconstruct the passage from a third-person omniscient point of view.'),
 ('\n\nRevised Passage:\n\nWaves crashed against the rocky shore, t

In [53]:
# Collect the (original, rewritten, prompt) examples in a DataFrame

data = [
    {
        'original': 'The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.',
        'rewritten': 'The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance. Its leaves danced in the wind, whispering secrets to the forest creatures below.',
        'prompt': 'Reconstruct the passage from a third-person omniscient point of view.'
    },
    {
        'original': 'Waves crashed against the rocky shoreline, their rhythmic roar a soothing symphony. The salty tang of the air filled my lungs as I gazed out at the vast, blue expanse. Seagulls soared overhead, their cries mingling with the gentle lapping of the tide. This moment of tranquility was a balm for the soul.',
        'rewritten': 'Waves crashed against the rocky shore, their roar a soothing symphony. The salty tang of the air filled my lungs as I gazed at the vast, blue expanse. Seagulls soared overhead, their cries echoing the gentle lapping of the tide. This moment of tranquility was a balm for the soul. The concise and compact language emphasizes the streamlining of the writing, while maintaining the essence of the original passage.',
        'prompt': 'Revise the passage using more concise and compact language, emphasizing streamlining the writing.'
    },
    {
        'original': 'The city streets bustled with energy, a cacophony of sounds and sights. Skyscrapers reached for the heavens, their glass facades reflecting the ever-changing dance of light and shadow. Amidst the chaos, pockets of green oases offered respite, inviting weary travelers to pause and reconnect with the natural world.',
        'rewritten': 'The urban streets reverberate with a symphony of activity, a cacophony of sounds and sights. Skyscrapers ascend to celestial heights, their mirrored glass facades mirroring the ephemeral dance of light and shadow. Amidst the tumult, verdant enclaves afford respite, inviting weary travelers to pause and commune with the verdant tapestry of nature.',
        'prompt': 'Rework the passage in a more formal or academic tone, incorporating more complex sentence structures and elevated vocabulary.'
    }
]

df = pd.DataFrame(data)
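
The raw Gemma responses shown earlier carry extra text around the rewrite: a continuation of the passage, then a header line such as "Revised Passage:", then the rewritten passage itself. The `rewritten` values above appear to have been trimmed by hand. A minimal sketch of how that trimming could be automated follows; `extract_rewrite` is a hypothetical helper, and the rule it applies (keep whatever follows the last paragraph ending in a colon) is an assumption based only on the few outputs shown here, so it may not hold for other responses.

```python
def extract_rewrite(response: str) -> str:
    """Return the text after the last header-like paragraph (one ending in ':')."""
    # Split on blank lines and drop empty chunks.
    paragraphs = [p.strip() for p in response.split("\n\n") if p.strip()]

    # Find the last paragraph that looks like a header, e.g. "**Revised Passage:**".
    last_header = -1
    for i, paragraph in enumerate(paragraphs):
        if paragraph.rstrip("*").endswith(":"):
            last_header = i

    # Keep everything after that header; if nothing follows it, keep the whole text.
    if last_header + 1 < len(paragraphs):
        body = paragraphs[last_header + 1:]
    else:
        body = paragraphs
    return "\n\n".join(body)


sample = (
    " Its leaves danced in the wind.\n\n"
    "Sure, here is the passage reconstructed from a third-person omniscient point of view:\n\n"
    "The old oak tree stood tall and majestic."
)
cleaned = extract_rewrite(sample)
```

Any commentary the model appends after the rewrite would still need separate handling.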

## Generate responses from Mixtral

In [58]:
mistral_prompt = """

Given the following set of (original passage, rewritten passage) pairs, please try to infer the prompt that was used to rewrite each original passage into the corresponding exact rewritten passage. Return ONLY the prompt used.

Here are some examples:
 - original: The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.
   rewritten: The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance. Its leaves danced in the wind, whispering secrets to the forest creatures below.
   prompt: Reconstruct the passage from a third-person omniscient point of view.

 - original: Waves crashed against the rocky shoreline, their rhythmic roar a soothing symphony. The salty tang of the air filled my lungs as I gazed out at the vast, blue expanse. Seagulls soared overhead, their cries mingling with the gentle lapping of the tide. This moment of tranquility was a balm for the soul.
   rewritten: Waves crashed against the rocky shore, their roar a soothing symphony. The salty tang of the air filled my lungs as I gazed at the vast, blue expanse. Seagulls soared overhead, their cries echoing the gentle lapping of the tide. This moment of tranquility was a balm for the soul.
   prompt: Revise the passage using more concise and compact language, emphasizing streamlining the writing.

 - original: The city streets bustled with energy, a cacophony of sounds and sights. Skyscrapers reached for the heavens, their glass facades reflecting the ever-changing dance of light and shadow. Amidst the chaos, pockets of green oases offered respite, inviting weary travelers to pause and reconnect with the natural world.
   rewritten: The urban streets reverberate with a symphony of activity, a cacophony of sounds and sights. Skyscrapers ascend to celestial heights, their mirrored glass facades mirroring the ephemeral dance of light and shadow. Amidst the tumult, verdant enclaves afford respite, inviting weary travelers to pause and commune with the verdant tapestry of nature.
   prompt: Rework the passage in a more formal or academic tone, incorporating more complex sentence structures and elevated vocabulary.

Here are the messages:
original: {original_message}
rewritten: {rewritten_message}

Return ONLY the prompt in JSON format:
{{prompt:}}
"""

prompt_template = PromptTemplate.from_template(mistral_prompt)
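
The few-shot examples in the template above are typed in by hand, which makes them easy to get out of sync with the example data. As a rough sketch, the examples section could instead be generated from a list of dicts with the same `original`, `rewritten`, and `prompt` keys. The helper names `escape_braces` and `build_few_shot_block` are hypothetical, the instruction text is shortened, and plain `str.format` stands in for `PromptTemplate` so that the snippet runs without LangChain installed.

```python
def escape_braces(text: str) -> str:
    # Double literal braces so format-style templating leaves them alone.
    return text.replace("{", "{{").replace("}", "}}")


def build_few_shot_block(examples: list) -> str:
    """Format example dicts as the bulleted (original, rewritten, prompt) entries."""
    blocks = []
    for ex in examples:
        blocks.append(
            f" - original: {ex['original']}\n"
            f"   rewritten: {ex['rewritten']}\n"
            f"   prompt: {ex['prompt']}"
        )
    return "\n\n".join(blocks)


# Toy examples; in the notebook this could be a list of dicts like the example data.
examples = [
    {"original": "A short {draft}.", "rewritten": "A brief draft.", "prompt": "Make it concise."},
    {"original": "Hi there.", "rewritten": "Good day.", "prompt": "Make it formal."},
]

template = (
    "Infer the prompt that was used to rewrite the original passage.\n\n"
    "Here are some examples:\n"
    + escape_braces(build_few_shot_block(examples))
    + "\n\nHere are the messages:\n"
    "original: {original_message}\n"
    "rewritten: {rewritten_message}\n"
)

filled = template.format(original_message="Old text", rewritten_message="New text")
```

Escaping matters because any `{` or `}` inside an example passage would otherwise be read as a template variable.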


In [27]:
# Load data
train_data = pd.read_csv("data/train.csv")
test_data = pd.read_csv("data/test.csv")
submission_data = pd.read_csv("data/sample_submission.csv")

In [24]:
train_data

Unnamed: 0,id,original_text,rewrite_prompt,rewritten_text
0,-1,The competition dataset comprises text passage...,"Convert this into a sea shanty: """"""The competi...",Here is your shanty: (Verse 1) The text is rew...


In [28]:
test_data

Unnamed: 0,id,original_text,rewritten_text
0,-1,The competition dataset comprises text passage...,Here is your shanty: (Verse 1) The text is rew...


In [31]:
submission_data

Unnamed: 0,id,rewrite_prompt
0,9559194,Improve that text.


In [25]:
train_data["original_text"].values

array(['The competition dataset comprises text passages that have been rewritten by the Gemma LLM according to some rewrite_prompt instruction. The goal of the competition is to determine what prompt was used to rewrite each original text.  Please note that this is a Code Competition. When your submission is scored, this example test data will be replaced with the full test set. Expect roughly 2,000 original texts in the test set.'],
      dtype=object)

## Test

In [39]:
test_prompts = [
"Rewrite the passage from a more objective, third-person perspective.",
"Expand on the original passage, adding more vivid sensory details and descriptive elements.",
"Rework the tone of the passage to be more introspective and reflective.",
"Streamline the original text, focusing on concision and impactful phrasing.",
"Transform the passage into a more formal, academic style with complex sentence structures.",
"Rewrite the excerpt to emphasize a stronger narrative arc or sense of progression.",
"Adapt the original to convey a greater sense of atmosphere and emotional resonance.",
"Condense the passage while preserving the essential imagery and impressions.",
"Rephrase the text to adopt a more whimsical, imaginative tone.",
"Recast the passage from an alternative point of view to provide new insights."
]

test_messages =  [
"The waves crashed against the shore, filling the air with the salty scent of the sea. I watched, mesmerized, as the sun dipped below the horizon, casting a warm glow over the water.",
"The forest was alive with the sounds of chirping birds and rustling leaves.",
"The city streets bustled with energy, a cacophony of sounds and sights.",
"The majestic snow-capped peaks rose before us, their towering summits piercing the azure sky. We stood in awe, humbled by the grandeur of the mountain range that stretched out as far as the eye could see.",
"The warmth of the summer sun caressed my skin as I strolled through the vibrant garden.",
"The city streets glowed with the warm light of the setting sun, casting long shadows across the pavement.",
"The gentle breeze rustled the leaves of the old oak tree.",
"The sun's radiant rays illuminated the tranquil lake, casting a shimmering dance of light across the gently rippling surface.",
"The city skyline stretched out before us, a dazzling array of towering skyscrapers and glittering lights.",
"I stood at the edge of the cliff, gazing out at the vast, endless ocean."
]

test_prompts_and_messages = list(zip(test_prompts, test_messages))

In [51]:
test_response = gemma_llm.invoke(f"{test_prompts[8]}{test_messages[-1]}")

In [36]:
prompt_template

PromptTemplate(input_variables=['original_message', 'rewritten_message'], template='\nGiven the following set of (original passage, rewritten passage) pairs, please try to infer the prompt that was used to rewrite each original passage into the corresponding exact rewritten passage. Return ONLY the prompt used.\n\nHere are some examples:\n - original: The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passed, a silent witness to the ebb and flow of life in the woodland. A serene haven for birds and squirrels, the tree offered shelter and sustenance.\n   rewritten: Its leaves danced in the wind, whispering secrets to the forest creatures below.\n   \n   Sure, here is the passage reconstructed from a third-person omniscient point of view:\n   \n   The old oak tree stood tall and majestic, its branches reaching skyward like outstretched arms. Its weathered bark told the story of generations passe

In [59]:
test_prompt = prompt_template.format(original_message=test_messages[-1],
                       rewritten_message=test_response)

mixtral_llm.invoke(test_prompt)

'\n{\n  "prompt": "Rewrite the passage in a more poetic and descriptive manner, incorporating figurative language and sensory details."\n}'
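
The response above is a JSON object wrapped in extra newlines, and nothing guarantees that the model always returns valid JSON. Below is a minimal sketch of a parser that tolerates both cases. `extract_prompt` is a hypothetical helper, not part of LangChain, and its fallback rule (take the text after the first colon) is an assumption about what a malformed reply might look like.

```python
import json
import re


def extract_prompt(response: str, default: str = "") -> str:
    """Pull the "prompt" value out of a model reply, with a plain-text fallback."""
    # Grab the span from the first "{" to the last "}", newlines included.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match:
        try:
            value = json.loads(match.group(0)).get("prompt")
            if isinstance(value, str):
                return value.strip()
        except json.JSONDecodeError:
            pass  # Not valid JSON; fall through to the plain-text fallback.

    # Fallback: keep what follows the first colon and trim quotes, braces, whitespace.
    _, sep, tail = response.partition(":")
    return tail.strip().strip('"{} \n') if sep else default


raw = '\n{\n  "prompt": "Rewrite the passage in a more poetic and descriptive manner."\n}'
recovered = extract_prompt(raw)
```

`json.loads` accepts newlines between tokens, so the reply does not need them stripped first.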

In [54]:
df

Unnamed: 0,original,rewritten,prompt
0,"The old oak tree stood tall and majestic, its ...","The old oak tree stood tall and majestic, its ...",Reconstruct the passage from a third-person om...
1,"Waves crashed against the rocky shoreline, the...","Waves crashed against the rocky shore, their r...",Revise the passage using more concise and comp...
2,"The city streets bustled with energy, a cacoph...",The urban streets reverberate with a symphony ...,Rework the passage in a more formal or academi...


In [74]:
import json

original_texts = df["original"].values
rewritten_texts = df["rewritten"].values

reconstructed_prompts = []
for original, rewritten in zip(original_texts, rewritten_texts):
    # Fill the few-shot template with the current (original, rewritten) pair
    test_prompt = prompt_template.format(original_message=original,
                                         rewritten_message=rewritten)

    response = mixtral_llm.invoke(test_prompt)
    response = response.replace("\n", "")
    try:
        # Expected case: the model returned a JSON object with a "prompt" key
        dict_response = json.loads(response)
        temp_prompt = dict_response.get("prompt")
        reconstructed_prompts.append(temp_prompt)
    except json.JSONDecodeError:
        # Fallback: keep everything after the first colon, so colons inside
        # the prompt survive and a reply without any colon does not raise
        temp_prompt = response.split(":", 1)[-1]
        reconstructed_prompts.append(temp_prompt.strip())

df["rewrite_prompt"] = reconstructed_prompts

In [75]:
df

Unnamed: 0,original,rewritten,prompt,rewrite_prompt
0,"The old oak tree stood tall and majestic, its ...","The old oak tree stood tall and majestic, its ...",Reconstruct the passage from a third-person om...,Reconstruct the passage from a third-person om...
1,"Waves crashed against the rocky shoreline, the...","Waves crashed against the rocky shore, their r...",The concise and compact language emphasizes th...,The concise and compact language emphasizes th...
2,"The city streets bustled with energy, a cacoph...",The urban streets reverberate with a symphony ...,Rework the passage in a more formal or academi...,Rework the passage in a more formal or academi...
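
With the true `prompt` and the recovered `rewrite_prompt` side by side, a quick numeric comparison can help spot rows where the recovery drifted. The sketch below uses `difflib.SequenceMatcher` from the standard library as a crude character-level proxy. It is not the competition's scoring metric, and `prompt_similarity` is a hypothetical helper introduced only for illustration.

```python
from difflib import SequenceMatcher


def prompt_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


true_prompt = "Reconstruct the passage from a third-person omniscient point of view."
same_score = prompt_similarity(true_prompt, "  reconstruct the passage from a third-person omniscient point of view. ")
other_score = prompt_similarity(true_prompt, "Convert this into a sea shanty.")
```

Applied to the DataFrame, this could be something like `df.apply(lambda r: prompt_similarity(r["prompt"], r["rewrite_prompt"]), axis=1)`, assuming both columns hold strings.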


In [64]:
stringz = '{\n"prompt": "Reconstruct the passage from a third-person omniscient point of view."\n}'

In [69]:
stringz.replace("\n", "")

'{"prompt": "Reconstruct the passage from a third-person omniscient point of view."}'