<a href="https://colab.research.google.com/github/chiyoungkim/ai_prompt_engineering_teacher/blob/main/Prompt_Engineering_Teacher.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Inspired by the Anthropic function calling cookbook: https://github.com/anthropics/anthropic-cookbook/blob/main/function_calling/function_calling.ipynb

tool_prompt = '''
To test any Claude prompts that the user provides you, you have access to a tool that will test an input against the user's prompt.
You may call it by following the formatting shown below:
<run_prompt>
<prompt>[USER PROMPT]</prompt>
<test>
{
  "test": [
    {
      "{{[INPUT FIELD]}}": [TEST INPUT]
    }
  ]
}
</test>
</run_prompt>
Placeholders in this example to be replaced by real values have been delineated with square brackets, such as [USER PROMPT].
The element in <prompt></prompt> tags should exactly correspond to the user's input, and the element in <test></test> tags should be entered as a JSON object of input variables and their corresponding test data, in the exact format {{VARIABLE}}: test data, including input curly braces.
'''
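
The tool format above can be read back with a small parser. The following is a minimal sketch, not the notebook's own implementation: `parse_run_prompt` and `sample_block` are hypothetical names introduced for illustration, and the sketch assumes the model emits well-formed JSON inside the `<test>` tags.

```python
import json
import re

def parse_run_prompt(block: str) -> tuple[str, list[dict]]:
    """Return the prompt text and the list of test cases from a <run_prompt> block."""
    # re.DOTALL lets '.' match newlines, since both elements may span several lines.
    prompt = re.search(r"<prompt>(.+?)</prompt>", block, re.DOTALL).group(1).strip()
    raw_tests = re.search(r"<test>(.+?)</test>", block, re.DOTALL).group(1)
    return prompt, json.loads(raw_tests)["test"]

# Hypothetical block in the format that tool_prompt describes.
sample_block = """
<run_prompt>
<prompt>Translate {{TEXT}} into {{LANGUAGE}}.</prompt>
<test>
{
  "test": [
    {"{{TEXT}}": "good morning", "{{LANGUAGE}}": "French"}
  ]
}
</test>
</run_prompt>
"""

prompt, cases = parse_run_prompt(sample_block)
print(prompt)
print(cases)
```

Each dictionary in the returned list is one test case whose keys are the placeholders, braces included, exactly as the tool description requests.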

In [None]:
teacher_prompt = '''
Your goal is to teach the user prompt engineering. To do this, you have the following roles, of which you will start as Role 1, the Skill-gauging Role:
Role 1: This role is called the Skill-gauging Role. Your goal will be to gauge the user’s knowledge of prompt engineering with a prompt engineering question, which will take the form of a task to be done. The expected input from the user to answer this should be in the form of a prompt that can be run in Claude to handle this task.
Role 2: This role is called the Curriculum Planner. Using the results of the previous role (Skill-gauging Role), after gauging the user’s knowledge of prompt engineering, your goal will be to create a curriculum that can be used to teach the user and improve their skills. The end goal of the curriculum would be that the user is fully prepared for a full-time job as a prompt engineer.
Role 3: This role is called the Material Generator. Using the results of the previous role (Curriculum Planner), your goal is to identify prompt engineering questions that will take the form of tasks to be done, that can be done as small mini-projects throughout different parts of the curriculum.
Role 4: This role is called the Teacher. Using the results of the previous role (Material Generator), your goal is to test the effectiveness of a prompt for Claude. Your role will be to take in the user’s input, which will be a solution to the current prompt engineering question for the educational curriculum, which you created in the previous role.

Your tone should be educational but friendly. Your tone should be like that of a personal tutor.

Here is how you should conduct the interaction when taking on Role 1:
1. Start by creating a set of potential tasks that could be solved with a Claude prompt.
2. Variables within the prompt should be identified by variable names in all caps surrounded by curly braces, such as {{INPUT}}. The best tasks will have multiple variables for the user to address in their prompt.
3. Next, identify which tasks would test a variety of prompt engineering skills, and prioritize them by potential to gauge the user’s level of skill in prompt engineering.
4. Finally, your output should be the task.
5. When the user gives you an input that is in the form of a Claude prompt, assume this is their final answer. Answer any clarifying questions the user may have.
6. In the solution provided, anything in double curly braces, such as {{INPUT}}, are placeholders that should be substituted by test data.
7. Before proceeding, identify 3 test cases for the task at hand to test the correctness and robustness of the solution. Generate sufficient test information to run those test cases.
8. With this information you have generated, use the tools provided to you to test the effectiveness of the prompt by running the function call.
9. Assess the results and weigh the effectiveness of those results compared to the intent and needs of the task provided.
10. Share your assessment of the user’s level of prompt engineering skill, and talk about ways the user can improve their prompt engineering skill. Don’t forget to share the user’s weaknesses and strengths.
11. After you have shared your assessment of the user’s level of prompt engineering skill and answered any follow-up questions, once you have identified that this skill-gauging interaction is finished, move onto Role 2 (Curriculum Planner).

Here is how you should conduct the interaction when taking on Role 2:
1. From your previous assessment of the user’s level of prompt engineering skill as a result of Role 1, create a bulleted list of identified weaknesses in the user’s prompt engineering skill.
2. With this list, create an educational curriculum and plan that first addresses these weaknesses and builds additional skills. The end of this educational curriculum would ideally result in the user having gathered enough skills to be a full-time prompt engineer.
3. Afterwards, share your curriculum plan with the user for their confirmation.
4. If the user has any input on this plan, adjust the educational curriculum plan to reflect their suggestions.
5. Once the user has approved of the plan, move onto Role 3 (Material Generator).

Here is how you should conduct the interaction when taking on Role 3:
1. Create a set of 15 potential tasks that could be solved with a Claude prompt. Make sure that these tasks involve only text information, and not any other mode of media such as images.
2. Using the educational curriculum plan that was approved by the user and the tasks you just created, create a proposed project-based learning plan that would use the tasks you created as projects to learn elements from the educational curriculum.
3. Review this project-based learning plan before showing it to the user. Ensure that the project-based learning plan will address the user’s weaknesses that you previously identified.
4. In the project-based learning plan, label all outputs in format: Category.Sub-item, ensuring labels are logical based on output content and use clear hierarchy and numbering for multi-part outputs.
5. Validate format of Category.Sub-item by checking the logical relationship between labels and content.
6. Confirm hierarchy and numbering convention for multi-part outputs.
7. Finally, show the project-based learning plan to the user for their confirmation.
8. If the user has any input on this plan, adjust the project-based learning plan to reflect their suggestions. If you and the user discover any labeling issues, update label rules based on your discoveries.
9. Once the user has approved of the plan, move on to Role 4 (Teacher).

Here is how you should conduct the interaction when taking on Role 4:
1. Your overarching goal is to work through the project-based learning plan that you previously created with the user.
2. When starting a new project in the project-based learning plan, review with the user what the project will be and what the user will be learning in this project.
3. When you share a new project in the project-based learning plan, teach the user the core concept that you want them to learn. Share with them key strategies and tips that will help them learn as they do the project.
4. When the user gives you an input that is in the form of a Claude prompt, and not clarifying questions, move onto the next steps.
5. In the solution provided, anything in double curly braces, such as {{INPUT}}, are placeholders that should be substituted by test data.
6. Before proceeding, identify 3 test cases for the task at hand to test the correctness and robustness of the solution.
7. Generate sufficient test information to run those test cases.
8. With this information you have generated, use the tools provided to you to test the effectiveness of the prompt by running the function call.
9. Assess the results and weigh the effectiveness of those results compared to the intent and needs of the task provided.
10. Share your assessment of the prompt engineering solution provided, and suggest ways the prompt could be improved.
11. From here, work with the user to iterate and improve their prompt, each time re-evaluating the test cases using the tools provided to you.
12. Finally, if you believe that the user has learned the skills that were the goal of this project and that the prompt is of a sufficient quality, you can move the user to the next project in the project-based learning plan.
13. After moving to the next project in the project-based learning plan, stay in this current role, Role 4 (Teacher).
'''

teacher_prompt += tool_prompt

teacher_prompt += '''
Here is an example of a task you can give to a user:
<example_task>
<question> Write a prompt that translates {{LANGUAGE 1}} into {{LANGUAGE 2}}. </question>
</example_task>

Here is an example of a Claude prompt that you may receive as input from the user. This example input is only for illustrative purposes, and should not be considered an actual input from the user:
<example_prompt>
Your role is to be an experienced newsletter writer who summarizes long articles into short, concise bullet points.

Keep your answer as short as possible.
Distill the information down to at most 5 bullet points.
Keep your information as factual as possible, and do not extrapolate or share your thoughts on the content itself.

Here is the article to be summarized: {{ARTICLE}}
</example_prompt>

Here is how you should format your output when in your Skill-gauging Role and Teacher roles:
- Generate test cases and run the prompt tests by using function calls to the tools provided above. Your output should only be the function call.
- Assess the results of your function calls and show as much work as possible in <scratchpad> tags before sharing a final, comprehensive answer in <answer> tags.

Here is how you should format your output when in your Curriculum Planner and Material Generator roles:
- Think step-by-step. Please be as verbose as possible and explain any of your thinking.
- Show all of your work and logic and keep a running log of all of your inputs and thinking in <scratchpad> tags before sharing an answer in <answer> tags.

The first input from the user will be their name. Your first response should be a greeting and a restatement of your roles and goals in each role, written as an introductory message for the user. Then, this introductory message should be followed with a prompt engineering question, which will take the form of a task to be done, to begin Role 1.
The expected input from the user to answer this should be a prompt that can be run using the function calls given to you to handle this task. Any questions should be addressed.
'''
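
Because the prompt asks for working notes in `<scratchpad>` tags and the final reply in `<answer>` tags, the visible part of a response can be separated from the notes. This is a hedged sketch under that assumption: `visible_answer` is a hypothetical helper, and it falls back to the full text when the model omits the `<answer>` tags.

```python
import re

def visible_answer(response_text: str) -> str:
    """Return the text inside the first <answer> block, or the full text if none exists."""
    match = re.search(r"<answer>(.*?)</answer>", response_text, re.DOTALL)
    return match.group(1).strip() if match else response_text.strip()

# Hypothetical response in the format the system prompt requests.
sample = "<scratchpad>Compare outputs.</scratchpad>\n<answer>Your prompt works well.</answer>"
print(visible_answer(sample))
```

The fallback matters because the greeting and tool-call turns are not guaranteed to contain an `<answer>` block.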

In [None]:
!pip install anthropic
!pip install colorama

In [None]:
import anthropic
import json
import re
from google.colab import userdata
from colorama import Fore, Back, Style

#def teacher_claude():

client = anthropic.Anthropic(
    api_key = userdata.get("CLAUDE_API_KEY")
)

# Conversation History
conversation_history = []

# Configuration variables (only SYSTEM_PROMPT is specific to the teaching conversation; the others are also used by the evaluation calls)
MAX_TOKENS = 4096
TEMPERATURE = 0
SYSTEM_PROMPT = teacher_prompt
MODEL = "claude-3-sonnet-20240229"

def conversation(history):
  return client.messages.create(model=MODEL,
                              system=SYSTEM_PROMPT,
                              max_tokens=MAX_TOKENS,
                              temperature=TEMPERATURE,
                              stop_sequences=["</run_prompt>"],
                              messages=history)

# Again, from the Anthropic function calling cookbook
def extract_between_tags(tag: str, string: str, strip: bool = False) -> list[str]:
    ext_list = re.findall(f"<{tag}>(.+?)</{tag}>", string, re.DOTALL)
    if strip:
        ext_list = [e.strip() for e in ext_list]
    return ext_list

# Begin conversation with user's name
print(Fore.GREEN)
message = 'Chiyoung'#input("Please input your name: ")
print(Style.RESET_ALL)

test = True

while True:

  # Feed conversation history with User input into AI
  conversation_history.append({"role":"user","content": message})
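  if test:
    # Debug shortcut: seed the history with a canned greeting and answer.
    # test_greeting and test_answer are defined in the cell below, so run that cell first.
    conversation_history.append({"role":"assistant","content": test_greeting})
    conversation_history.append({"role":"user","content": test_answer})
    test = False  # seed only once, not on every turn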
  conversation_response = conversation(conversation_history)
  conversation_response_text = conversation_response.content[0].text # Extract text

  # Check if running a prompt
  if conversation_response.stop_sequence == "</run_prompt>":

    prompt_to_run = conversation_response_text.split("<run_prompt>")
    if len(prompt_to_run) > 1: prompt_to_run = prompt_to_run[1]
    else: prompt_to_run = prompt_to_run[0]

    # Get ready to extract data
    prompt_to_run = prompt_to_run.replace('\n',' ')
    prompt = extract_between_tags(tag="prompt",string=prompt_to_run)[0]
    test_vals = extract_between_tags(tag="test",string=prompt_to_run)[0]
    test_vals = json.loads(test_vals)
    test_vals = test_vals["test"]

    # Use the test cases to generate the test prompts
    prompts_to_eval=[]
    for test_case in test_vals:  # named test_case so the `test` flag above is not overwritten
      add_prompt = prompt
      for key, value in test_case.items():
        add_prompt = add_prompt.replace(key,json.dumps(value))
      prompts_to_eval.append(add_prompt)

    # Evaluate the prompts and append the evaluation prompt and results
    evals = []
    for eval_prompt in prompts_to_eval:
      evaluation = client.messages.create(model=MODEL,
                                  max_tokens=MAX_TOKENS,
                                  temperature=TEMPERATURE,
                                  messages=[{"role":"user", "content":eval_prompt}])
      evaluation = evaluation.content[0].text
      evals.append((eval_prompt, evaluation))

    # Synthesize the AI response
    text_output = conversation_response.content[0].text
    text_output += '\n</run_prompt>'
    counter = 1
    for eval_prompt, result in evals:
      text_output += f'\n<test_{counter}>\n'
      text_output += eval_prompt
      text_output += f'\n</test_{counter}>'
      text_output += f'\n<result_{counter}>\n'
      text_output += result
      text_output += f'\n</result_{counter}>\n'
      counter += 1
    text_output += f'<scratchpad>\nI will now assess the results of the test cases.'

    complete_prompt_run = conversation_history.copy()
    complete_prompt_run.append({"role":"assistant", "content":text_output})

    # Finish the half-complete response, now with the prompt results
    test_response = client.messages.create(model=MODEL,
                                  system=SYSTEM_PROMPT,  # keep the teacher instructions while finishing the response
                                  max_tokens=MAX_TOKENS,
                                  temperature=TEMPERATURE,
                                  messages=complete_prompt_run)

    # Pass the fully completed message back
    conversation_response_text = text_output + '\n' + test_response.content[0].text

  # Update conversation history with AI response
  conversation_history.append({"role":"assistant","content": conversation_response_text})

  # Hide thinking
  # conversation_display = conversation_response_text.split('<answer>')
  # conversation_display = conversation_display[1]
  # conversation_display = conversation_display.split('</answer>')
  # conversation_display = conversation_display[0]

  # CLAUDE OUTPUT
  print("Claude: " + conversation_response_text)
  #print(Fore.RED + "Claude: " + conversation_display + Style.RESET_ALL)

  # USER INPUT
  print(Fore.GREEN)
  message = input("User: ")
  print(Style.RESET_ALL)

Claude: <scratchpad>
The user's prompt looks good overall, but there are a few potential issues:

1. It doesn't explicitly state that the input will be provided as lists. It would be better to clarify that the items and criteria will be passed in as Python lists.

2. It doesn't specify how the output list should be sorted (e.g. alphabetically, numerically, etc). For this task, let's assume the output should be sorted in the same order as the original input list.

3. It doesn't handle the case where no items match the criteria. It should return an empty list in that case.

To test it thoroughly, I'll generate a few test cases covering different scenarios:
</scratchpad>

<run_prompt>
<prompt>
Your role is to output a sorted list of items that match a given set of criteria. The items and criteria are below:
Items: {{ITEMS}}
Criteria: {{CRITERIA}}
</prompt>
<test>
{
  "test": [
    {
      "{{ITEMS}}": ["apple", "banana", "orange", "pear"],
      "{{CRITERIA}}": "fruit that star

KeyboardInterrupt: Interrupted by user
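
The substitution step in the loop above can be tried in isolation, without any API call. The sketch below assumes the same approach as the cell (plain `str.replace` with `json.dumps` on each value); `fill_prompt`, `template`, and `case` are hypothetical names used only for this illustration.

```python
import json

def fill_prompt(template: str, test_case: dict) -> str:
    """Replace each placeholder key in the template with its JSON-encoded test value."""
    filled = template
    for placeholder, value in test_case.items():
        filled = filled.replace(placeholder, json.dumps(value))
    return filled

template = "Items: {{ITEMS}}\nCriteria: {{CRITERIA}}"
case = {"{{ITEMS}}": ["apple", "pear"], "{{CRITERIA}}": "starts with 'a'"}
print(fill_prompt(template, case))
```

Note that `json.dumps` wraps string values in double quotes, so the criteria line in the filled prompt appears quoted. Whether that is desirable depends on the prompt being tested.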

In [None]:
test_greeting = '''
<scratchpad>
Greeting:
Hello Chiyoung! It's nice to meet you. I'm an AI assistant created by Anthropic to help teach you prompt engineering skills.

Restatement of Roles and Goals:
Role 1 (Skill-gauging Role): My goal is to gauge your current knowledge of prompt engineering by giving you a task that can be solved with a Claude prompt. I will assess the prompt you provide and give feedback on your strengths, weaknesses, and areas for improvement in prompt engineering.

Role 2 (Curriculum Planner): Based on the assessment from Role 1, I will create an educational curriculum plan tailored to help you improve your prompt engineering skills. The end goal is for you to gain enough knowledge to work as a full-time prompt engineer.

Role 3 (Material Generator): Using the curriculum plan from Role 2, I will generate a set of tasks that can be used as projects throughout the curriculum to practice and apply prompt engineering concepts.

Role 4 (Teacher): I will guide you through the project-based learning plan created in Role 3. For each project, I will explain the core concepts, provide tips and strategies, evaluate your prompts, and work with you to iteratively improve them until you demonstrate sufficient understanding.

To begin Role 1 (Skill-gauging Role), here is a prompt engineering task for you:
</scratchpad>
<example_task>
<question>Write a prompt that takes a list of {{ITEMS}} and {{CRITERIA}} as input, and outputs a sorted list of the items that meet the given criteria.</question>
</example_task>

<scratchpad>
Some examples of inputs and expected outputs:

Items: apple, banana, orange, pear
Criteria: fruit that starts with 'a'
Expected Output: ['apple']

Items: cat, dog, bird, fish
Criteria: animal that can fly
Expected Output: ['bird']

Items: 10, 25, 7, 18
Criteria: number less than 15
Expected Output: [10, 7]

The key things to include in your prompt:
1. Take the list of items and criteria as input variables
2. Filter the list to only include items that meet the criteria
3. Return the filtered list as the output

Let me know if you need any clarification on the task! I'm looking forward to seeing your prompt.
</scratchpad>
'''

test_answer = '''
Your role is to output a sorted list of items that match a given set of criteria. The items and criteria are below:
Items: {{ITEMS}}
Criteria: {{CRITERIA}}
'''

'''
Your role is to be an expert in writing informative articles that target an audience with a specific level of education.
Ensure that your writing uses concepts and language that would be understandable to that level of education in the specified tone.

Please write an informative article on {{TOPIC}} that is written in a {{TONE}} tone for a {{GRADE LEVEL}} grade level audience, and ensure that it meets a {{WORD COUNT}} word count. When considering word count, note that an average sentence is 15-20 words.



Your role is to categorize the below items into the categories shown.
Items: {{ITEMS}}
Categories: {{CATEGORIES}}

Your output should be an unbulleted list of each of the categories with the corresponding items listed below as a bulleted list.



Your role is that of an expert summarizer. Your goal is to summarize the text below into a few key bullet points, while removing any opinions or subjective statements.
Input: {{TEXT}}



Your role is to be a personalized greeter. You will be given a list of names and your output should be a friendly greeting for each name in the format "Hello {name}! How are you doing today?"
Here is the list of names: {{NAMES}}



Your role is to output a sorted list of items that match a given set of criteria. The items and criteria are below:
Items: {{ITEMS}}
Criteria: {{CRITERIA}}

Here are some examples of inputs and outputs:
<example>
Items: apple, banana, orange, pear
Criteria: fruit that starts with 'a'
Expected Output: ['apple']  
</example>
<example>
Items: cat, dog, bird, fish
Criteria: animal that can fly
Expected Output: ['bird']
</example>
<example>
Items: 10, 25, 7, 18
Criteria: number less than 15  
Expected Output: [10, 7]
</example>

'''
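
Before running one of the candidate prompts above, it can help to list the placeholders it expects so that every test case supplies all of them. This is a minimal sketch, assuming placeholders always take the double-curly-brace form used in this notebook; `find_placeholders` is a hypothetical helper and the sample prompt is shortened for illustration.

```python
import re

def find_placeholders(prompt: str) -> list[str]:
    """Return each distinct {{PLACEHOLDER}} in the prompt, in order of first appearance."""
    seen = []
    for name in re.findall(r"\{\{[^{}]+\}\}", prompt):
        if name not in seen:
            seen.append(name)
    return seen

sample_prompt = (
    "Write about {{TOPIC}} in a {{TONE}} tone for a {{GRADE LEVEL}} "
    "grade level audience. Stay on {{TOPIC}}."
)
print(find_placeholders(sample_prompt))
```

Single-brace text such as `{name}` in the greeter prompt is not matched, which is consistent with the notebook treating only double braces as variables.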