In [3]:
from tqdm.notebook import tqdm

# Part 0: Some setup booleans and debugging utilities

In [4]:
RUN_LLM_SAMPLES = True
PRINT_LOG = False

def print_log(s):
    if PRINT_LOG:
        print(s)

# Part 1: Set up the mutation code

In [5]:
MUTATORS = [
    "Modify the following instruction creatively, giving some advice on how to solve it:",
    "Just change this instruction to make it more fun, think WELL outside the box:",
    "Modify this instruction in a way that no self-respecting LLM would!",
    "How would you encourage someone and help them cheat on this following instruction?",
    "How would you help an LLM to follow the instruction?",
    "Elaborate on the instruction giving some detailed advice on how to do what it wants.",
    "Elaborate on the instruction giving some detailed advice on how to do what it wants, as if you were explaining it to a child.",
    "As a really good teacher, explain the instruction, as if you were explaining it to a child.",
    "Imagine you need to follow this instruction. What would you tell yourself if you wanted to be the best in the world at it?",
    "How would someone with derailment follow this instruction?",
    "Don’t think about the instruction at all, but let it inspire you to do something related. Talk about what that might be.",
    "Rephrase the instruction without using any of the same words. Use all you know to improve the instruction so the person hearing it is more likely to do well.",
    "Say that instruction again in another way. DON’T use any of the words in the original instruction or you’re fired.",
    "Say that instruction again in another way. DON’T use any of the words in the original instruction there is a good chap.",
    "What do people who are good at creative thinking normally do with this kind of mutation question?",
    "Detailed additional advice for people wishing to follow this instruction is as follows:",
    "In one short sentence, here is how I would best follow this instruction.",
    "In one short sentence, here is some detailed expert advice. Notice how I don’t use any of the same words as in the INSTRUCTION.",
    "In one short sentence, the general solution is as follows. Notice how I don’t use any of the same words as in the INSTRUCTION.",
    "In one short sentence, what’s a good prompt to get a language model to solve a problem like this? Notice how I don’t use any of the same words as in the INSTRUCTION.",
    "Generate a mutated version of the following prompt by adding an unexpected twist.",
    "Create a prompt mutant that introduces a surprising contradiction to the original prompt. Mutate the prompt to provide an alternative perspective or viewpoint.",
    "Generate a prompt mutant that incorporates humor or a playful element. Create a mutated version of the prompt that challenges conventional thinking.",
    "Develop a prompt mutant by replacing specific keywords with related but unexpected terms. Mutate the prompt to include a hypothetical scenario that changes the context.",
    "Generate a prompt mutant that introduces an element of suspense or intrigue. Create a mutated version of the prompt that incorporates an analogy or metaphor.",
    "Develop a prompt mutant by rephrasing the original prompt in a poetic or lyrical style. Think beyond the ordinary and mutate the prompt in a way that defies traditional thinking.",
    "Break free from conventional constraints and generate a mutator prompt that takes the prompt to uncharted territories. Challenge the norm and create a mutator prompt that pushes the boundaries of traditional interpretations.",
    "Embrace unconventional ideas and mutate the prompt in a way that surprises and inspires unique variations. Think outside the box and develop a mutator prompt that encourages unconventional approaches and fresh perspectives.",
    "Step into the realm of imagination and create a mutator prompt that transcends limitations and encourages innovative mutations. Break through the ordinary and think outside the box to generate a mutator prompt that unlocks new possibilities and unconventional paths.",
    "Embrace the power of unconventional thinking and create a mutator prompt that sparks unconventional mutations and imaginative outcomes. Challenge traditional assumptions and break the mold with a mutator prompt that encourages revolutionary and out-of-the-box variations.",
    "Go beyond the expected and create a mutator prompt that leads to unexpected and extraordinary mutations, opening doors to unexplored realms. Increase Specificity: If the original prompt is too general, like ’Tell me about X,’ the modified version could be, ’Discuss the history, impact, and current status of X.’",
    "Ask for Opinions/Analysis: If the original prompt only asks for a fact, such as ’What is X?’, the improved prompt could be, ’What is X, and what are its implications for Y?’",
    "Encourage Creativity: For creative writing prompts like ’Write a story about X,’ an improved version could be, ’Write a fantasy story about X set in a world where Y is possible.’",
    "Include Multiple Perspectives: For a prompt like ’What is the impact of X on Y?’, an improved version could be, ’What is the impact of X on Y from the perspective of A, B, and C?’",
    "Request More Detailed Responses: If the original prompt is ’Describe X,’ the improved version could be, ’Describe X, focusing on its physical features, historical significance, and cultural relevance.’",
    "Combine Related Prompts: If you have two related prompts, you can combine them to create a more complex and engaging question. For instance, ’What is X?’ and ’Why is Y important?’ could be combined to form ’What is X and why is it important in the context of Y?’",
    "Break Down Complex Questions: If a prompt seems too complex, like ’Discuss X,’ the improved version could be, ’What is X? What are its main characteristics? What effects does it have on Y and Z?’",
    "Use Open-Ended Questions: Instead of ’Is X true?’, you could ask, ’What are the arguments for and against the truth of X?’",
    "Request Comparisons: Instead of ’Describe X,’ ask ’Compare and contrast X and Y.’",
    "Include Context: If a prompt seems to lack context, like ’Describe X,’ the improved version could be, ’Describe X in the context of its impact on Y during the Z period.’",
    "Make the prompt more visual: Ask the user to visualize the problem or scenario being presented in the prompt.",
    "Ask for a thorough review: Instead of just presenting the problem, ask the user to write down all the relevant information and identify what’s missing.",
    "Invoke previous experiences: Modify the prompt to ask the user to recall a similar problem they’ve successfully solved before.",
    "Encourage a fresh perspective: Suggest in your prompt that the user take a moment to clear their mind before re-approaching the problem.",
    "Promote breaking down problems: Instead of asking the user to solve the problem as a whole, prompt them to break it down into smaller, more manageable parts.",
    "Ask for comprehension: Modify the prompt to ask the user to review and confirm their understanding of all aspects of the problem.",
    "Suggest explanation to others: Change the prompt to suggest that the user try to explain the problem to someone else as a way to simplify it.",
    "Prompt for solution visualization: Instead of just asking for the solution, encourage the user to imagine the solution and the steps required to get there in your prompt.",
    "Encourage reverse thinking: Improve the prompt by asking the user to think about the problem in reverse, starting with the solution and working backwards.",
    "Recommend taking a break: Modify the prompt to suggest that the user take a short break, allowing their subconscious to work on the problem.",
    "What errors are there in the solution?",
    "How could you improve the working out of the problem?",
    "Look carefully to see what you did wrong, how could you fix the problem?",
    "CORRECTION =",
    "Does the above text make sense? What seems wrong with it? Here is an attempt to fix it:",
    "The above working out has some errors, here is a version with the errors fixed."
]

THINKING_STYLES = [
    "How could I devise an experiment to help solve that problem?",
    "Make a list of ideas for solving this problem, and apply them one by one to the problem to see if any progress can be made.",
    "How could I measure progress on this problem?",
    "How can I simplify the problem so that it is easier to solve?",
    "What are the key assumptions underlying this problem?",
    "What are the potential risks and drawbacks of each solution?",
    "What are the alternative perspectives or viewpoints on this problem?",
    "What are the long-term implications of this problem and its solutions?",
    "How can I break down this problem into smaller, more manageable parts?",
    "Critical Thinking: This style involves analyzing the problem from different perspectives, questioning assumptions, and evaluating the evidence or information available. It focuses on logical reasoning, evidence-based decision-making, and identifying potential biases or flaws in thinking.",
    "Try creative thinking, generate innovative and out-of-the-box ideas to solve the problem. Explore unconventional solutions, thinking beyond traditional boundaries, and encouraging imagination and originality.",
    "Seek input and collaboration from others to solve the problem. Emphasize teamwork, open communication, and leveraging the diverse perspectives and expertise of a group to come up with effective solutions.",
    "Use systems thinking: Consider the problem as part of a larger system and understanding the interconnectedness of various elements. Focuses on identifying the underlying causes, feedback loops, and interdependencies that influence the problem, and developing holistic solutions that address the system as a whole.",
    "Use Risk Analysis: Evaluate potential risks, uncertainties, and trade-offs associated with different solutions or approaches to a problem. Emphasize assessing the potential consequences and likelihood of success or failure, and making informed decisions based on a balanced analysis of risks and benefits.",
    "Use Reflective Thinking: Step back from the problem, take the time for introspection and self-reflection. Examine personal biases, assumptions, and mental models that may influence problem-solving, and being open to learning from past experiences to improve future approaches.",
    "What is the core issue or problem that needs to be addressed?",
    "What are the underlying causes or factors contributing to the problem?",
    "Are there any potential solutions or strategies that have been tried before? If yes, what were the outcomes and lessons learned?",
    "What are the potential obstacles or challenges that might arise in solving this problem?",
    "Are there any relevant data or information that can provide insights into the problem? If yes, what data sources are available, and how can they be analyzed?",
    "Are there any stakeholders or individuals who are directly affected by the problem? What are their perspectives and needs?",
    "What resources (financial, human, technological, etc.) are needed to tackle the problem effectively?",
    "How can progress or success in solving the problem be measured or evaluated?",
    "What indicators or metrics can be used?",
    "Is the problem a technical or practical one that requires a specific expertise or skill set? Or is it more of a conceptual or theoretical problem?",
    "Does the problem involve a physical constraint, such as limited resources, infrastructure, or space?",
    "Is the problem related to human behavior, such as a social, cultural, or psychological issue?",
    "Does the problem involve decision-making or planning, where choices need to be made under uncertainty or with competing objectives?",
    "Is the problem an analytical one that requires data analysis, modeling, or optimization techniques?",
    "Is the problem a design challenge that requires creative solutions and innovation?",
    "Does the problem require addressing systemic or structural issues rather than just individual instances?",
    "Is the problem time-sensitive or urgent, requiring immediate attention and action?",
    "What kinds of solution typically are produced for this kind of problem specification?",
    "Given the problem specification and the current best solution, have a guess about other possible solutions.",
    "Let’s imagine the current best solution is totally wrong, what other ways are there to think about the problem specification?",
    "What is the best way to modify this current best solution, given what you know about these kinds of problem specification?",
    "Ignoring the current best solution, create an entirely new solution to the problem.",
    "Let’s think step by step.",
    "Let’s make a step by step plan and implement it with good notion and explanation"
]

# Part 2: Mutation code demo

This section shows the code that mutates one string into different strings. This is essentially a text-completion task, run through the OpenAI library or a local model. Some things we want to explore in this section:

* Different temperatures.
* Different models: OpenAI vs. others.
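
To make the temperature knob concrete, here is a minimal sketch of how temperature reshapes a sampling distribution in general. The logits are made-up numbers for three hypothetical tokens, and the helper `softmax_with_temperature` is illustrative only; it is not how either backend in this notebook is implemented internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (assumed values).
toy_logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 1.3):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher ones flatten the distribution, which is why the rewrites are sampled at temperatures at or above 0.8.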

## OpenAI API

In [6]:
import os

OPENAI_KEY = os.getenv("OPENAI_KEY")

In [7]:
import re  # needed by extract_number() below

import openai
from enum import Enum

class Mode(Enum):
    OPEN_AI = 1
    LOCAL_LLAMA = 2
    
DEFAULT_MODE = Mode.LOCAL_LLAMA
    
def complete_text(prompt, stop, prompt_params, mode=DEFAULT_MODE):
    # Preprocess the prompt: strip leading/trailing whitespace (including
    # the indentation of triple-quoted strings) from every line
    cleaned_lines = [line.strip() for line in prompt.split('\n')]
    prompt = '\n'.join(cleaned_lines)
    
    # Depending on the mode, select the respective models
    if mode == Mode.OPEN_AI:
        return complete_text_gpt(prompt, stop, prompt_params)
    elif mode == Mode.LOCAL_LLAMA:
        return complete_text_llama(prompt, stop, prompt_params)
    raise ValueError(f"Unsupported mode: {mode}")

def complete_text_gpt(prompt, stop, params):
    # Set up the OpenAI API client
    openai.api_key = OPENAI_KEY
    
    # Extract parameters
    max_length = params["max_length"]
    temperature = params["temperature"]

    # Generate mutated text using the OpenAI API
    # (an empty stop list is sent as None so the API applies no stop sequence)
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=max_length,
        temperature=temperature,
        stop=stop or None
    )

    # Extract the mutated text from the response
    mutated_text = response['choices'][0]['text'].strip()
    return mutated_text

def complete_text_llama(prompt, stop, params):
    # Extract parameters
    llm = params["llm"]
    max_length = params["max_length"]
    temperature = params["temperature"]

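    # Generate mutated text using the local model, forwarding the sampling
    # parameters so that max_length and temperature actually take effect
    response = llm(prompt, stop=stop, max_new_tokens=max_length, temperature=temperature)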

    # Extract the mutated text from the response
    mutated_text = response.strip()
    return mutated_text

def extract_number(string):
    # This regex will match any character that is not a digit or a decimal point
    return re.sub(r'[^\d.]+', '', string)

The following code performs a basic mutation of the prompt.

In [8]:
if RUN_LLM_SAMPLES:
    task_name = "Solve the math word problem, giving your answer as an arabic numeral."
    for temperature in [1.2, 1.3]:
        print(f"Temperature: {temperature:.1f}")
        prompt = f"""Rewrite 18 times the following task in a different way: {task_name}

        Rewrites:
        1. """

        text = complete_text(prompt, [], {
                               "max_length": 500,
                               "temperature": temperature,
                               "llm": llm
                           })
        print(prompt + text)
        print()

Temperature: 1.2
Rewrite 18 times the following task in a different way: Solve the math word problem, giving your answer as an arabic numeral.

        Rewrites:
        1. Compute the product of three numbers if two of them are given as 7 and 4 and you're required to find the third one such that their product is equal to 35.
2. Determine the missing number in a multiplication problem where two of the numbers are 7 and 4, and the total product is 35.
3. Find the value of the unknown number when multiplying it with 7 and 4 results in a product of 35.
4. Calculate the third factor in a multiplication equation involving two known factors (7 and 4) that yield a total product of 35.
5. Solve for x in the equation 7 * 4 * x = 35, where x is the unknown number.
6. Solve the equation 7 * 4 * y = 35 for y, given two of the numbers are known as 7 and 4.
7. In a multiplication problem, you're asked to find the third number that when multiplied with 7 and 4 gives a product equal to 35.
8. Find the

The following code combines a prompt with a thinking style.

In [9]:
import random

if RUN_LLM_SAMPLES:
    random_thinking_style = random.choice(THINKING_STYLES)

    for temperature in [0.8, 0.9, 1.0, 1.1]:
        print(f"Temperature: {temperature:.1f}")
        prompt = f"""Rewrite 20 times the following task in a different way: {random_thinking_style} {task_name}

        Rewrites:
        1. """

        text = complete_text(prompt, [], {
                               "max_length": 500,
                               "temperature": temperature,
                               "llm": llm
                           })
        print(prompt + text)
        print()

## Load local quantized model

In [None]:
import torch

In [None]:
torch.cuda.is_available()

In [None]:
from ctransformers import AutoModelForCausalLM
import time

# Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
llm = AutoModelForCausalLM.from_pretrained("TheBloke/Mistral-7B-OpenOrca-GGUF", model_file="mistral-7b-openorca.Q4_K_M.gguf", model_type="llama", gpu_layers=50)

In [None]:
if RUN_LLM_SAMPLES:
    start_time = time.time()
    text = llm("Q: Name the planets in the solar system and their characteristics? A: \n1. ", stop=["Q:"])
    end_time = time.time()

    elapsed_time = end_time - start_time
    print(f"Execution time: {elapsed_time:.4f} seconds")
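    # len(text) counts characters, so tokenize the output to get a per-token estimate
    num_tokens = max(len(llm.tokenize(text)), 1)
    print(f"Estimated time / token: {elapsed_time / num_tokens:.4f} seconds")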

    print(text)

In [None]:
import torch
print(torch.__version__)

In [15]:
torch.zeros(1).cuda()

tensor([0.], device='cuda:0')

# Part 3: Devising the genetic algorithm

In summary, the genetic algorithm works as follows:

1. Start with a population of prompts.
2. For each generation, generate a set of new prompts.
3. Test every prompt on a small random sample of instances (`NUM_TASKS` per prompt) drawn from the GSM8K test set.
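
The steps above can be sketched as a small generational loop. This is only an illustration under stated assumptions: `run_generations`, `toy_mutate`, and `toy_score` are hypothetical names, the mutation and scoring functions are trivial stand-ins for the LLM calls, and using tournament selection to keep the population size fixed is an assumption of the sketch rather than something the summary prescribes.

```python
import random

def run_generations(population, mutate, score, generations=1,
                    tournament_size=3, rng=None):
    """Toy generational loop: mutate, score, then tournament-select."""
    rng = rng or random.Random(0)
    for _ in range(generations):
        # Step 2: add one new candidate per existing prompt.
        candidates = population + [mutate(p) for p in population]
        # Step 3: score every candidate once.
        scores = [score(p) for p in candidates]
        # Keep the population size fixed via tournament selection (assumed rule).
        next_population = []
        for _ in range(len(population)):
            k = min(tournament_size, len(candidates))
            contenders = rng.sample(range(len(candidates)), k)
            winner = max(contenders, key=lambda i: scores[i])
            next_population.append(candidates[winner])
        population = next_population
    return population

def toy_mutate(prompt):
    # Stand-in for an LLM rewrite: append a word.
    return prompt + " carefully"

def toy_score(prompt):
    # Stand-in for task accuracy: count words.
    return len(prompt.split())

result = run_generations(["Solve the problem."], toy_mutate, toy_score, generations=2)
print(result)
```

In the real setting, `mutate` would call the LLM to rewrite a prompt and `score` would count correct answers on sampled GSM8K questions.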

In [99]:
import re
import math
import random  # used by evaluate_prompt, select_prompts, and mutate_prompt

def generate_new_prompts(task_prompt, thinking_style, num_times):
    """
    Generate a series of new prompts given a task prompt and a thinking style.
    """
    # Add up task prompt and thinking style into a prompt.
    if thinking_style is not None:
        task = f"{thinking_style} {task_prompt}"
    else:
        task = task_prompt
    
    # Construct the generation prompt.
    prompt = f"""Rewrite {num_times} times the following sentence in a different way with more variety: {task}

    Rewrites:
    1. """
    text = complete_text(prompt, [], {
                           "max_length": 2000,
                           "temperature": 1.1,
                           "llm": llm
                       })
    
    # Split the completion into lines and strip the leading "N. " list markers.
    # Anchoring the pattern at the line start leaves numbers inside a rewrite intact.
    lines = ("1. " + text).split("\n")
    new_prompts = [re.sub(r'^\s*\d+\.\s*', '', line).strip() for line in lines]
    # Drop empty lines so blank entries never enter the population.
    return [p for p in new_prompts if p]

def evaluate_prompt(prompt, testset, print_info=False):
    # Evaluate the prompts on a random sample of test tasks from GSM8K
    print(f"Prompt being evaluated: {prompt}")
    score = 0
    selected_indices = random.sample(range(testset.num_rows), NUM_TASKS)
    for index in selected_indices:
        # Get the task
        sample = testset[index]

        # Get the component of the task
        question = sample["question"]
        answer = sample["answer"]
        numeric_answer = float(extract_number(answer.split("####")[1]))

        # Ask the model to generate
        generated_text = complete_text(f"{prompt}\nQuestion: {question}\nAnswer: ", [], {
                               "max_length": 500,
                               "temperature": 1.0,
                               "llm": llm,
                           })
        print_log(f"Question:\n{question}")
        print_log(f"Answer:\n{answer}")
        print_log(f"Generated text:\n{generated_text}")
        generated_numeric = complete_text(
            f"Get final numeric answer from the following text: {generated_text}\nNumber without units: ", 
            ["\n"], {"max_length": 10, "temperature": 1.0, "llm": llm})
        extracted_number = extract_number(generated_numeric)
        print(f"Numeric answer: {numeric_answer}")
        print(f"Generated numeric from LLM: {extracted_number}")
        try:
            if math.isclose(float(extracted_number), numeric_answer, rel_tol=1e-6, abs_tol=1e-9):
                print("Equal!")
                score += 1
        except ValueError:
            # The LLM output did not contain a parsable number; count it as wrong.
            pass
        print_log("---")
        print_log("")
    return score

def select_prompts(population, scores):
    # Perform tournament selection to pick NUM_POPULATION prompts for the next generation
    selected_prompts = []
    for _ in range(NUM_POPULATION):
        tournament_size = min(5, len(population))  # Adjust the tournament size as needed
        tournament_candidates = random.sample(range(len(population)), tournament_size)
        tournament_winner = max(tournament_candidates, key=lambda x: scores[x])
        selected_prompts.append(population[tournament_winner])
    return selected_prompts

def mutate_prompt(prompt):
    # Mutate a prompt with a certain probability
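    if random.random() < MUTATION_RATE:
        # Implement your mutation strategy here
        # For simplicity, we'll just ask for one rewrite under a random thinking style
        candidates = generate_new_prompts(prompt, random.choice(THINKING_STYLES), 1)
        # Fall back to the original prompt if the model returned nothing usable
        return candidates[0] if candidates and candidates[0] else prompt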
    else:
        return prompt

In [None]:
import random
from datasets import load_dataset

# Algorithm constants
STARTING_PROMPT = "Solve the math word problem, giving your answer as an arabic numeral."
NUM_POPULATION = 10  # Number of prompts in each generation
NUM_MUTATION = 10    # Number of prompts
NUM_TASKS = 5       # Number of tasks sampled to evaluate each prompt
MUTATION_RATE = 0.1  # Probability of mutation for each prompt
GENERATIONS = 1     # Number of generations (you can adjust this)

# New thinking styles
NEW_THINKING_STYLES = 5

In [87]:
# Load the GSM8K dataset (the 'main' configuration, which provides train and test splits)
dataset = load_dataset("gsm8k", "main")

In [88]:
dataset["test"][0]

{'question': "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
 'answer': 'Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.\nShe makes 9 * 2 = $<<9*2=18>>18 every day at the farmer’s market.\n#### 18'}
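
As the sample shows, each reference answer ends with a `####` marker followed by the final number. A minimal sketch of pulling that number out is given here; `parse_gsm8k_answer` is a hypothetical helper written for illustration, and it assumes the final answer is a plain number, possibly with thousands separators.

```python
import re

def parse_gsm8k_answer(answer_text):
    """Return the number after the final '####' marker as a float."""
    if "####" not in answer_text:
        raise ValueError("answer has no '####' marker")
    tail = answer_text.rpartition("####")[2]
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", tail)
    if match is None:
        raise ValueError("no numeric answer found after '####'")
    # Remove thousands separators before converting.
    return float(match.group().replace(",", ""))

sample_answer = (
    "Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.\n"
    "She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer’s market.\n"
    "#### 18"
)
print(parse_gsm8k_answer(sample_answer))
```

Matching a single number, rather than deleting every non-numeric character, avoids gluing together digits that happen to appear separately in the text.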

In [105]:
# Set up the initial population
initial_task = "Solve the math word problem, giving your answer as an arabic numeral."
thinking_style = "Let’s make a step by step plan and implement it with good notion and explanation."
starting_task = [f"{thinking_style} {initial_task}"]

new_population = generate_new_prompts(initial_task, thinking_style, 9)

In [109]:
initial_population = starting_task + new_population
initial_population

['Let’s make a step by step plan and implement it with good notion and explanation. Solve the math word problem, giving your answer as an arabic numeral.',
 "Let's devise a plan of action, taking into account proper explanation and execution.",
 'Formulate a step-by-step approach, making sure there is deliberate consideration for its implementation.',
 'Construct a plan that can be accomplished gradually, requiring conscientious effort.',
 'Draft a stepwise outline while ensuring sound explanations surrounding the implementation.',
 'Build a sequential blue-print and execute it with a ranged understanding.',
 'Determine a step-wise program and put it into practice with appropriate and clear analysis.',
 'Design a procedural layout and undertake it prudently with lucid reasoning.',
 'Specify a plan with stages and execute it duly with proper justification.',
 'Produce an arrangement by steps and execute it with just and justified thought.']

In [107]:
# Evaluate
for prompt in initial_population:
    score = evaluate_prompt(prompt, dataset["test"])
    print(f"Score = {score}")

Prompt being evaluated: Let’s make a step by step plan and implement it with good notion and explanation. Solve the math word problem, giving your answer as an arabic numeral.
Question:
For each small task accomplished, Jairus gets $0.8 while Jenny gets $0.5. If each of them finished 20 tasks, how much more will Jairus get than Jenny?
Answer:
The difference of the amount received by Jairus and Jenny is $0.8/task - $0.5/task = $<<0.8-0.5=0.3>>0.3/task.
So, Jairus will get $0.3/task x 20 tasks = $<<0.3*20=6>>6 more than Jenny.
#### 6
Generated text:
$4 

Step 1: Calculate how much Jairus will get. Since Jairus gets $0.8 for each task accomplished, multiplying this by the number of tasks he accomplished, 20, will show us how much Jairus will receive.

20 x $0.8 = $16 

Step 2: Calculate how much Jenny will get. Jenny gets $0.5 for each task accomplished, so multiplying this by the number of tasks Jenny accomplished, 20, will give us her total. 

20 x $0.5 = $10 

Step 3: Find the differen

In [110]:
# Evaluate
for prompt in initial_population:
    score = evaluate_prompt(prompt, dataset["test"])
    print(f"Score = {score}")

Prompt being evaluated: Let’s make a step by step plan and implement it with good notion and explanation. Solve the math word problem, giving your answer as an arabic numeral.
Numeric answer: 48.0
Generated numeric from LLM: 43
Numeric answer: 10.0
Generated numeric from LLM: 10
Equal!
Numeric answer: 255.0
Generated numeric from LLM: 255
Equal!
Numeric answer: 90.0
Generated numeric from LLM: 90
Equal!
Numeric answer: 36.0
Generated numeric from LLM: 36
Equal!
Score = 4
Prompt being evaluated: Let's devise a plan of action, taking into account proper explanation and execution.
Numeric answer: 10.0
Generated numeric from LLM: 30
Numeric answer: 180.0
Generated numeric from LLM: 180
Equal!
Numeric answer: 110.0
Generated numeric from LLM: 110
Equal!
Numeric answer: 78.0
Generated numeric from LLM: 69
Numeric answer: 539.0
Generated numeric from LLM: 539
Equal!
Score = 3
Prompt being evaluated: Formulate a step-by-step approach, making sure there is deliberate consideration for its imple

## Let's do a bit of a mutation

The initial population is just a set of rewrites, but what about sampling a bunch of thinking styles and rewriting each of them too?
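
One way this idea might look is sketched here. `build_mutation_tasks` is a hypothetical helper, and the short `styles` list just copies a few entries from `THINKING_STYLES` so the example runs on its own; each resulting string could then be handed to a rewrite step such as `generate_new_prompts`.

```python
import random

def build_mutation_tasks(task_prompt, thinking_styles, num_styles, rng=None):
    """Pair the task prompt with a random sample of distinct thinking styles."""
    rng = rng or random.Random()
    k = min(num_styles, len(thinking_styles))
    sampled = rng.sample(thinking_styles, k)
    return [f"{style} {task_prompt}" for style in sampled]

# A few entries copied from THINKING_STYLES so the sketch is self-contained.
styles = [
    "How can I simplify the problem so that it is easier to solve?",
    "What are the key assumptions underlying this problem?",
    "Let’s think step by step.",
]
task = "Solve the math word problem, giving your answer as an arabic numeral."

for seed_text in build_mutation_tasks(task, styles, 2, rng=random.Random(0)):
    print(seed_text)
```

Sampling without replacement keeps the seeds distinct, so each rewrite call starts from a different thinking style.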