Skip to content

WeeAris/gpt-prompt-engineer

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 

Repository files navigation

gpt-prompt-engineer

Twitter Follow Open In Colab

Overview

Prompt engineering is kind of like alchemy. There's no clear way to predict what will work best. It's all about experimenting until you find the right prompt. gpt-prompt-engineer is a tool that takes this experimentation to a whole new level.

Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.

Features

  • Prompt Generation: Using GPT-4 and GPT-3.5-Turbo, gpt-prompt-engineer can generate a variety of possible prompts based on a provided use-case and test cases.

  • Prompt Testing: The real magic happens after the generation. The system tests each prompt against all the test cases, comparing their performance and ranking them using an ELO rating system.

Screen Shot 2023-07-04 at 11 41 54 AM
  • ELO Rating System: Each prompt starts with an ELO rating of 1200. As they compete against each other in generating responses to the test cases, their ELO ratings change based on their performance. This way, you can easily see which prompts are the most effective.

  • Weights & Biases Logging: Optional logging to Weights & Biases of your configs such as temperature and max tokens, the system and user prompts for each part, the test cases used and the final ranked ELO rating for each candidate prompt. Set use_wandb to True to use.

Setup

  1. Open the notebook in Google Colab or in a local Jupyter notebook.

  2. Add your OpenAI API key to the line openai.api_key = "ADD YOUR KEY HERE".

  3. If you have GPT-4 access, you're ready to move on. If not, change CANDIDATE_MODEL='gpt-4' to CANDIDATE_MODEL='gpt-3.5-turbo'.

How to Use

  1. Define your use-case and test cases. The use-case is a description of what you want the AI to do. Test cases are specific prompts that you would like the AI to respond to. For example:
description = "Given a prompt, generate a landing page headline." # this style of description tends to work well

test_cases = [
    {
        'prompt': 'Promoting an innovative new fitness app, Smartly',
    },
    {
        'prompt': 'Why a vegan diet is beneficial for your health',
    },
    {
        'prompt': 'Introducing a new online course on digital marketing',
    },
    {
        'prompt': 'Launching a new line of eco-friendly clothing',
    },
    {
        'prompt': 'Promoting a new travel blog focusing on budget travel',
    },
    {
        'prompt': 'Advertising a new software for efficient project management',
    },
    {
        'prompt': 'Introducing a new book on mastering Python programming',
    },
    {
        'prompt': 'Promoting a new online platform for learning languages',
    },
    {
        'prompt': 'Advertising a new service for personalized meal plans',
    },
    {
        'prompt': 'Launching a new app for mental health and mindfulness',
    }
]
  1. Choose how many prompts to generate. Keep in mind, this can get expensive if you generate many prompts. 10 is a good starting point.

  2. Call generate_optimal_prompt(description, test_cases, number_of_prompts) to generate a list of potential prompts, and test and rate their performance.

  3. The final ELO ratings will be printed in a table, sorted in descending order. The higher the rating, the better the prompt.

Screen Shot 2023-07-04 at 11 48 45 AM

Contributions are welcome! Some ideas:

  • have a number of different system prompt generators that create different styles of prompts, to cover more ground (ex. examples, verbose, short, markdown, etc.)
  • automatically generate the test cases

License

This project is MIT licensed.

Contact

Matt Shumer - @mattshumer_

Project Link: https://github.com/mshumer/gpt-prompt-engineer

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 100.0%