# Agents Prompt Modification

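In this notebook, we demonstrate an agent that rewrites the prompt of another agent based on how well that agent performed. The aim is a simple form of automated quality improvement: the task agent's output should get better without any change to the underlying model.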

In [27]:
# an evaluator agent evaluates the performance of a task agent
# the evaluator agent stores a prompt for a task agent in a file
# the task agent performs a task and stores a transcript of its actions to a file
# the evaluator reads the transcript and evaluates the task agent's performance
# the evaluator decides how to change the task agent's prompt based on the performance
# the task agent reads the new prompt and performs the task again
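
The loop described in the comments above can be sketched in plain Python. This is a minimal sketch rather than the notebook's implementation: `run_task_agent`, `evaluate`, and `revise_prompt` are hypothetical stand-ins for the LLM-backed agents, and the prompt lives in a temporary file to mirror the file hand-off between the two agents.

```python
import tempfile
from pathlib import Path

def run_task_agent(prompt: str, article: str) -> str:
    # Hypothetical stand-in for the task agent: it only classifies when told to.
    if "classify" in prompt.lower():
        return "label: rec.autos"
    return "summary: " + article[:20]

def evaluate(transcript: str, expected_label: str) -> str:
    # Hypothetical stand-in for the evaluator: a simple pass/fail check.
    return "ok" if expected_label in transcript else "did not output a label"

def revise_prompt(prompt: str, evaluation: str) -> str:
    # Hypothetical stand-in for the prompt modifier.
    return prompt + " Classify the article and answer with the newsgroup label."

def improvement_loop(prompt_path: Path, article: str, expected_label: str,
                     max_rounds: int = 3) -> int:
    """Run evaluate-and-revise rounds; return the number of rounds used."""
    for round_number in range(1, max_rounds + 1):
        prompt = prompt_path.read_text()                 # task agent reads the prompt
        transcript = run_task_agent(prompt, article)     # task agent acts
        evaluation = evaluate(transcript, expected_label)
        if evaluation == "ok":
            return round_number
        # evaluator stores the revised prompt for the next round
        prompt_path.write_text(revise_prompt(prompt, evaluation))
    return max_rounds

with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "task_agent_prefix.txt"
    prompt_file.write_text("You are a helpful assistant.")
    rounds_used = improvement_loop(prompt_file, "I saw a 2-door sports car ...", "rec.autos")
    final_prompt = prompt_file.read_text()
```

With these stubs the first round fails because the prompt never asks for a label, the prompt is revised once, and the second round passes.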

We're going to work on the 20_newsgroups dataset for classification.

## Evaluator Agent

In [33]:
from langchain import SerpAPIWrapper
from langchain.agents import AgentExecutor, OpenAIFunctionsAgent
from langchain.tools import BaseTool
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from pydantic import BaseModel, Field
import os
from dotenv import load_dotenv
from typing import Type

load_dotenv()

search = SerpAPIWrapper(serpapi_api_key=os.environ['SERPAPI_API_KEY'])
llm = ChatOpenAI(model='gpt-3.5-turbo-0613', openai_api_key=os.environ['OPENAI_API_KEY'])

class EvaluateSingleTranscriptToolInput(BaseModel):
    task_agent_prompt: str = Field(description="The prompt used by the task agent to perform the task.")
    transcript: str = Field(description="The transcript of a task agent's actions.")

class EvaluateSingleTranscriptTool(BaseTool):
    """
    This evaluates a single transcript of a task agent's actions. It tries to assess whether
    the task agent's actions were successful or not. It could be a simple loss function, or
    it could be a more complex function that tries to determine the cause of the task agent's
    failure.
    """
    name = "transcript-evaluator"
    description = "Evaluate the transcript of a task agent's actions."
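    args_schema: Type[BaseModel] = EvaluateSingleTranscriptToolInput

    # Parameter names must match the fields of the input schema.
    def _run(self, task_agent_prompt: str, transcript: str):
        pass  # placeholder: could be some sort of loss function?

    async def _arun(self, task_agent_prompt: str, transcript: str):
        raise NotImplementedError("transcript-evaluator does not support async")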

class ModifyPromptToolInput(BaseModel):
    task_agent_prompt: str = Field(description="The prompt used by the task agent to perform the task.")
    evaluation: str = Field(description="The evaluation of the task agent's performance.")

class ModifyPromptTool(BaseTool):
    """
    Seems like it's going to be limited if it can only modify the prompt based on a single transcript.
    Probably the EvaluateTranscriptsTool needs to be able to evaluate multiple transcripts, then
    come up with an abstraction for a common cause as to why the task agent failed, then modify the prompt
    based on that abstraction.
    """
    name = "prompt-modifier"
    description = "Modify the prompt of a task agent based on its performance."
    args_schema: Type[BaseModel] = ModifyPromptToolInput
    def _run(self, evaluation: str, task_agent_prompt: str):
        # The input schema carries only the prompt and the evaluation, so the
        # template must not reference a transcript variable.
        prompt = PromptTemplate(input_variables=[
            'task_agent_prompt',
            'evaluation'
        ], template="""
        You are an evaluator of AI agents. You review the diagnosed problem of an agent and decide how to change its prompt.
        The task agent's original prompt was: {task_agent_prompt}
        The task agent's evaluation is: {evaluation}

        You decide to change the task agent's prompt to:
        """)
        chain = LLMChain(llm=llm, prompt=prompt)
        return chain.run({'task_agent_prompt': task_agent_prompt, 'evaluation': evaluation})

    async def _arun(self, evaluation: str, task_agent_prompt: str):
        raise NotImplementedError("prompt-modifier does not support async")

tools = [
    EvaluateSingleTranscriptTool(),
    ModifyPromptTool()
]

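from langchain.schema import SystemMessage

# OpenAIFunctionsAgent takes its system prompt through `system_message`, not `prefix`.
evaluator_agent = OpenAIFunctionsAgent.from_llm_and_tools(
    llm=llm,
    tools=tools,
    system_message=SystemMessage(content="You are an evaluator of AI agents. You review task agents' transcripts and decide how to change their prompts.")
    )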

evaluator_agent_executor = AgentExecutor.from_agent_and_tools(evaluator_agent, tools)
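
The `_run` method of `EvaluateSingleTranscriptTool` is left as a placeholder above. One simple option is an exact-match loss against the known label, aggregated over several transcripts so that the prompt modifier sees a pattern rather than a single failure. The sketch below assumes each transcript comes paired with its expected label; `transcript_loss` and `summarize_failures` are hypothetical helpers, not part of LangChain, and the transcripts are made up for illustration.

```python
from collections import Counter
from typing import List, Tuple

def transcript_loss(transcript: str, expected_label: str) -> int:
    """0 if the expected label appears in the transcript, otherwise 1."""
    return 0 if expected_label.lower() in transcript.lower() else 1

def summarize_failures(results: List[Tuple[str, str]]) -> str:
    """Turn (transcript, expected_label) pairs into a short evaluation string."""
    if not results:
        return "no transcripts to evaluate"
    losses = [transcript_loss(t, label) for t, label in results]
    # Count which labels the task agent failed to produce.
    missed = Counter(label for (_, label), loss in zip(results, losses) if loss)
    error_rate = sum(losses) / len(losses)
    worst = ", ".join(f"{label} ({n})" for label, n in missed.most_common())
    return f"error rate {error_rate:.2f}; missed labels: {worst or 'none'}"

# Made-up transcripts paired with their expected labels.
results = [
    ("The article is about a car called the Bricklin.", "rec.autos"),
    ("I classify this as sci.space.", "sci.space"),
    ("This looks like a hardware question.", "comp.sys.mac.hardware"),
]
evaluation = summarize_failures(results)
print(evaluation)
```

A string like this could be passed as the `evaluation` argument of `ModifyPromptTool`, which gives the modifier an aggregate view instead of one transcript at a time.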

## Task Agent

In [36]:
# Read the task agent's prompt from disk; the context manager closes the file.
with open('./task_agent_prefix.txt', 'r') as f:
    task_agent_prefix = f.read()

class ClassifyArticleToolInput(BaseModel):
    article: str = Field(description="The article to classify.")

class ClassifyArticleTool(BaseTool):
    """
    This tool classifies an article based on its content.
    """
    name = "article-classifier"
    description = "Classify an article based on its content."
    args_schema: Type[BaseModel] = ClassifyArticleToolInput

    def _run(self, article: str):
        prompt = PromptTemplate(input_variables=['article'], template="""
                                You are a classifier of articles. You classify articles based on their content.
                                The article you are classifying is: {article}
                                You classify the article as:
                                """)
        chain = LLMChain(llm=llm, prompt=prompt)
        return chain.run({'article': article})

    async def _arun(self, article: str):
        raise NotImplementedError("article-classifier does not support async")

task_agent_tools = [ClassifyArticleTool()]
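from langchain.schema import SystemMessage

# OpenAIFunctionsAgent takes its system prompt through `system_message`, not `prefix`.
task_agent = OpenAIFunctionsAgent.from_llm_and_tools(
    llm=llm,
    tools=task_agent_tools,
    system_message=SystemMessage(content=task_agent_prefix)
    )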

task_agent_executor = AgentExecutor.from_agent_and_tools(task_agent, task_agent_tools, verbose=True)
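
Because the task agent reads its prefix from `task_agent_prefix.txt`, the evaluator can hand over a revised prompt by rewriting that file. The sketch below shows one way this might be done. `load_prompt` and `save_prompt` are hypothetical helpers, and keeping numbered backups is an assumption added here so that a bad revision can be rolled back.

```python
import tempfile
from pathlib import Path

def load_prompt(path: Path) -> str:
    return path.read_text(encoding="utf-8")

def save_prompt(path: Path, new_prompt: str) -> Path:
    """Back up the current prompt, then overwrite it. Returns the backup path."""
    # Existing backups are named <file>.v1, <file>.v2, ...
    version = len(list(path.parent.glob(path.name + ".v*"))) + 1
    backup = path.with_name(f"{path.name}.v{version}")
    backup.write_text(load_prompt(path), encoding="utf-8")
    path.write_text(new_prompt, encoding="utf-8")
    return backup

with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "task_agent_prefix.txt"
    prompt_file.write_text("You are a helpful assistant.", encoding="utf-8")
    backup_file = save_prompt(prompt_file, "Classify each article into a newsgroup.")
    current = load_prompt(prompt_file)
    previous = load_prompt(backup_file)
    backup_name = backup_file.name
```

Since the prefix is read once when the cell runs, the task agent would have to be rebuilt after the file changes for a new prompt to take effect.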

## Prepare the 20_newsgroups dataset

In [37]:
# prep the 20_newsgroups dataset

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import Pipeline
import numpy as np
import pandas as pd
import re

def clean_text(text):
    text = re.sub(r'[^A-Za-z0-9]+', ' ', text)
    text = text.lower()
    return text

def get_data():
    newsgroups_train = fetch_20newsgroups(subset='train')
    newsgroups_test = fetch_20newsgroups(subset='test')
    X_train = pd.DataFrame(newsgroups_train.data)
    X_train.columns = ['text']
    X_test = pd.DataFrame(newsgroups_test.data)
    X_test.columns = ['text']
    X_train['text'] = X_train['text'].apply(clean_text)
    X_train['target'] = newsgroups_train.target
    X_train['label'] = X_train['target'].apply(lambda x: newsgroups_train.target_names[x])
    
    X_test['text'] = X_test['text'].apply(clean_text)
    X_test['target'] = newsgroups_test.target
    X_test['label'] = X_test['target'].apply(lambda x: newsgroups_test.target_names[x])

    return X_train, X_test

X_train, X_test = get_data()
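
To see what `clean_text` does to a post, the snippet below repeats the function and applies it to a made-up header fragment (not taken from the dataset). Punctuation inside e-mail addresses is replaced by spaces as well, so addresses are split into separate tokens.

```python
import re

def clean_text(text):
    # Same logic as above: collapse each run of non-alphanumeric characters
    # into a single space, then lowercase.
    text = re.sub(r'[^A-Za-z0-9]+', ' ', text)
    return text.lower()

sample = "Subject: WHAT car is this!?\nFrom: someone@example.edu"
cleaned = clean_text(sample)
print(cleaned)
# → subject what car is this from someone example edu
```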

## Putting it together

### Run the task agent on an example article

In [39]:
for i in range(1):
    sample = dict(X_train.iloc[i])
    text = sample['text']
    label = sample['label']

    print(task_agent_executor.run(text))


Based on the article, it seems that you are looking for information about a car called the Bricklin. The car is described as a 2-door sports car from the late 60s or early 70s. It is known for having small doors and a separate front bumper from the rest of the body. 

Unfortunately, the article does not provide any additional information about the model name, engine specs, years of production, where the car is made, or its history. If you have any further information or if anyone can provide more details about the Bricklin, please email the requester.
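
The output above is a free-text summary rather than a newsgroup name, so it cannot be compared with `label` directly. The following is a minimal sketch of one way to score such answers, assuming the task agent is asked to mention a newsgroup name verbatim. `extract_label` and `accuracy` are hypothetical helpers, `LABELS` is a small subset of the newsgroup names chosen for illustration, and the second answer is made up.

```python
from typing import List, Optional

# Illustrative subset; the full list is available as `target_names` on the dataset.
LABELS = ["rec.autos", "sci.space", "comp.graphics", "talk.politics.misc"]

def extract_label(answer: str, labels: List[str]) -> Optional[str]:
    """Return the known label mentioned earliest in the answer, or None."""
    lowered = answer.lower()
    hits = [(lowered.find(label), label) for label in labels if label in lowered]
    return min(hits)[1] if hits else None

def accuracy(answers: List[str], expected: List[str], labels: List[str]) -> float:
    """Fraction of answers whose extracted label equals the expected one."""
    predictions = [extract_label(a, labels) for a in answers]
    return sum(p == e for p, e in zip(predictions, expected)) / len(expected)

answers = [
    "It seems that you are looking for information about a car called the Bricklin.",
    "This post belongs in sci.space, not rec.autos.",
]
expected = ["rec.autos", "sci.space"]
score = accuracy(answers, expected, LABELS)
```

An answer that names no label counts as wrong, which is the signal the evaluator agent would need in order to push the prompt toward explicit classification.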
