## Bidding Architecture

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate, ChatPromptTemplate, MessagesPlaceholder, HumanMessagePromptTemplate
from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage, AIMessage
from langchain_core.output_parsers.openai_functions import JsonOutputFunctionsParser

from langgraph.graph import END, StateGraph, MessageGraph
from langgraph.checkpoint.sqlite import SqliteSaver

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

import functools
import operator
from typing import List, Sequence, TypedDict, Annotated
import json
import os
import random

from IPython.display import Image, display

import concurrent.futures

In [3]:
unique_id = "Market Optimisation"
os.environ["LANGCHAIN_PROJECT"] = f"Tracing Walkthrough - {unique_id}"

In [4]:
# from langsmith import Client

# client = Client()

In [11]:
class CorePrinciples:
    def __init__(self, core_principles: List[str]):
        self.core_principles = core_principles
    
    def add_principle(self, principle: str):
        """
        Adds a principle to the core principles list.
        
        :param principle: The principle to be added.
        """
        self.core_principles.append(principle)
        
    def __str__(self):
        """
        Returns a string representation of the core principles, each principle is listed on a new line with a preceding dash.
        
        Example:
        - principle 1
        - principle 2
        ...
        """
        return "\n".join([f"- {principle}" for principle in self.core_principles])

class ExpertAgent:
    """
    Expert Agent class defining agents that provide feedback on prompts.
    """

    def __init__(self, position: str, core_principles: CorePrinciples, llm=None):
        self.position = position
        self.core_principles = core_principles
        self.system_message = f"""You are an experienced: {self.position}. Your core principles are:
{self.core_principles}"""
        # Build the default model lazily so that defining the class does not instantiate a client
        self.llm = llm if llm is not None else ChatOpenAI(temperature=0.5, model="gpt-4o")

    def bid(self, state: Sequence[BaseMessage]) -> dict:
        """
        Bids on the prompt based on the expert's expertise.

        Returns the parsed function-call arguments: {"expert": <position>, "bid": <"1".."10">}.
        """
        prompt_text = f"""Your task is to bid on the prompt in the conversation above in light of your core principles.
The bid must reflect the prompt's quality and alignment with your core principles.

The bid must be an integer between 1 and 10 and should be based on the following scale:
- 1 (Exceptional Alignment): Perfectly aligns with your core principles. No modifications needed.
- 2 (Strong Alignment): Demonstrates strong alignment. Minimal to no adjustments required.
- 3 (Good Alignment): Well-aligned with minor tweaks needed.
- 4 (Moderate Alignment): Moderately aligned but requires moderate adjustments.
- 5 (Adequate Alignment): Adequate alignment with room for improvement.
- 6 (Fair Alignment): Fairly aligned but lacking in certain areas. Significant improvements needed.
- 7 (Marginal Alignment): Marginally aligns, requiring substantial reworking.
- 8 (Poor Alignment): Poorly aligns, necessitating major revisions.
- 9 (Very Poor Alignment): Significantly misaligned, requiring a comprehensive overhaul.
- 10 (Wholesale Changes Needed): In direct conflict with your core principles, requiring wholesale changes.

Your bid process should be as follows:
1. Read the prompt carefully as an experienced: {self.position}. Understand its content and intent.
2. Based on your assessment of how well the prompt aligns with your core principles, assign a bid using the bidding scale. Ensure your bid reflects the prompt's quality and alignment accurately.
3. Submit your bid."""
        function_def = {
            "name": "bid",
            "description": "Submit a bid for the prompt",
            "parameters": {
                "type": "object",
                "properties": {
                    "expert": {"type": "string", "enum": [self.position]},
                    "bid": {"type": "string", "enum": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]},
                },
                "required": ["expert", "bid"],
            },
        }
        messages = [
            ("system", self.system_message),
            MessagesPlaceholder(variable_name="messages"),
            ("system", prompt_text),
        ]

        prompt = ChatPromptTemplate.from_messages(messages)

        chain = (
            prompt
            | self.llm.bind_functions(functions=[function_def], function_call="bid")
            | JsonOutputFunctionsParser()
        )
        result = chain.invoke({"messages": state})
        return result

    def self_reflection_graph(self, criteria) -> MessageGraph:
        """
        Constructs a graph for self-reflection and improvement of prompts.
        """

        def generation_node(state: Sequence[BaseMessage]):
            prompt_text = f"""Your task is to improve the prompt in the conversation above in light of your core principles.
If you receive feedback and recommendations for the prompt, respond with a revised version of your previous attempt that actions the feedback.
Always think outside the box and consider unconventional ideas on how to implement the feedback.

The success criteria for the prompt are as follows:
{criteria}
You will be penalized if the prompt does not meet these criteria.

Below are strict guidelines that you MUST follow if making changes to the prompt:
- DO NOT modify existing restrictions.
- DO NOT modify or remove negations.
- DO NOT add, modify or remove placeholders denoted by curly braces. If you wish to use curly braces in your response, use double curly braces to avoid confusion with placeholders.
- ALWAYS treat placeholders as the actual content.
You will be penalized if you do not follow these guidelines.

Your update process should be as follows:
1. Read the prompt as an experienced: {self.position}. Understand its content and intent.
2. Think carefully about how you can implement the most recent feedback and revise the prompt.
3. Explicitly go through each success criterion and ensure the prompt meets it. If not, revise the prompt to make sure it does.
4. Explicitly go through each guideline and ensure the changes adhere to them. If not, revise the prompt to make sure it does.
5. Submit your revised prompt."""
            messages = [
                ("system", self.system_message),
                MessagesPlaceholder(variable_name="messages"),
                ("system", prompt_text),
            ]
            prompt = ChatPromptTemplate.from_messages(messages)
            chain = prompt | self.llm
            return chain.invoke({"messages": state})
            
        def reflection_node(state: Sequence[BaseMessage]):
            prompt_text = f"""Your task is to provide feedback on the prompt in the conversation above in light of your core principles.
Always think outside the box and consider unconventional ideas on how to enforce your core principles in the prompt.

The success criteria for the updated prompt are as follows:
{criteria}
You must use this information to inform your feedback.

Your review process should be as follows:
1. Read the prompt carefully as an experienced: {self.position}. Understand its content and intent.
2. Explain how you think the prompt can be improved in light of your core principles.
3. Submit your feedback."""
            messages = [
                ("system", self.system_message),
                MessagesPlaceholder(variable_name="messages"),
                ("system", prompt_text),
            ]
            prompt = ChatPromptTemplate.from_messages(messages)
            chain = prompt | self.llm
            result = chain.invoke({"messages": state})
            return HumanMessage(content=result.content)

        builder = MessageGraph()
        builder.add_node("generate", generation_node)
        builder.add_node("reflect", reflection_node)
        builder.set_entry_point("generate")

        def should_continue(state: List[BaseMessage]):
            if len(state) > 2:
                return END
            return "reflect"

        builder.add_conditional_edges("generate", should_continue)
        builder.add_edge("reflect", "generate")
        graph = builder.compile()

        return graph
    
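    def update_prompt(self, state: dict, criteria) -> dict:
        """
        Uses self_reflection_graph to iteratively act on feedback and update the prompt.

        Expects the graph state, where state["messages"][-1] is the Moderator's announcement
        and state["messages"][-2] is the most recent version of the prompt.
        """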
        graph = self.self_reflection_graph(criteria)
        result = graph.invoke([HumanMessage(content=state["messages"][-2].content)])
        return {"messages": result, "next": "Moderator"}
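
The control flow of `self_reflection_graph` can be sketched without LangGraph. The following is a minimal illustration, assuming plain strings in place of messages and caller-supplied `generate` and `reflect` callables; `run_reflection_loop` and the string `END` sentinel are hypothetical stand-ins, not part of the notebook or of LangGraph. It shows that the `len(state) > 2` condition allows one round of feedback before the loop ends.

```python
from typing import Callable, List

END = "END"  # hypothetical stand-in for langgraph's END sentinel


def should_continue(state: List[str]) -> str:
    # Same condition as the notebook: stop once more than two messages exist.
    return END if len(state) > 2 else "reflect"


def run_reflection_loop(
    prompt: str,
    generate: Callable[[List[str]], str],
    reflect: Callable[[List[str]], str],
) -> List[str]:
    """Alternate generate and reflect steps until should_continue returns END."""
    state = [prompt]
    while True:
        # The generate node appends a revised prompt.
        state.append(generate(state))
        if should_continue(state) == END:
            return state
        # The reflect node appends feedback on the latest revision.
        state.append(reflect(state))


if __name__ == "__main__":
    history = run_reflection_loop(
        "base prompt",
        generate=lambda s: f"revision after {len(s)} message(s)",
        reflect=lambda s: "feedback",
    )
    # Base prompt, first revision, feedback, second revision
    print(len(history))  # 4
```

With a single input message the loop therefore produces four messages, and the last one is the revised prompt that is handed back to the moderator.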

In [17]:
class MarketPlace:
    """
    Marketplace class that moderates the bidding process between the supplied expert agents.
    """

    def __init__(self, base_prompt: str, criteria: str = None, experts: List[ExpertAgent] = None):
        self.base_prompt = base_prompt
        self.criteria = criteria
        self.experts = experts
        self.iteration = 0

    def reset(self):
        """
        Resets the bidding process.
        """
        self.iteration = 0

    def bidding_process(self, state: dict) -> dict:
        """
        Execution of bidding process to select the next expert to process the prompt.
        """
        self.iteration += 1
        # print(f"Iteration: {self.iteration}")
        # Collect bids from all experts
        with concurrent.futures.ThreadPoolExecutor() as executor:
            futures = [executor.submit(expert.bid, state["messages"]) for expert in self.experts]
            bids = [future.result() for future in concurrent.futures.as_completed(futures)]
        # Only consider valid bid formats
        bids = [bid for bid in bids if 'expert' in bid]
        # Convert bids to integers
        for bid in bids:
            bid["bid"] = int(bid["bid"])
        # Sort the list of bids from highest to lowest
        bids = sorted(bids, key=lambda x: x["bid"], reverse=True)
        # for i in range(len(bids)):
        #     print("Expert: {expert}, Bid: {bid}".format(expert=bids[i]["expert"], bid=bids[i]["bid"]))
        # Select the expert with the highest bid; if several experts tie, pick one of them at random
        max_bid = bids[0]["bid"]
        tied_experts = [bid for bid in bids if bid["bid"] == max_bid]
        highest_bidder = random.choice(tied_experts)
        next_expert = highest_bidder["expert"]
        print(f"Highest Bidder: {next_expert}, Bid: {max_bid}")
        # Update the state with the next expert to process
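        if max_bid <= 2 or self.iteration >= 6:
            # Report which stopping condition fired: all bids at or below 2, or the iteration cap
            reason = "All bids <= 2" if max_bid <= 2 else "Iteration limit reached"
            self.reset()
            return {
                "next": "FINISH",
                "messages": [AIMessage(content=f"Bidding over. {reason}", name="Moderator")]
            }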
        else:
            return {
                "next": next_expert, 
                "messages": [AIMessage(content=f"Highest Bidder: {next_expert}, Bid: {max_bid}", name="Moderator")]
            }
        
    def construct_expert_graph(self):
        """
        Constructs a graph of expert agents based on their roles and functions.
        """

        def agent_node(state, agent):
            return agent.update_prompt(state, self.criteria)
        
        # The agent state is the input to each node in the graph
        class AgentState(TypedDict):
            # The annotation tells the graph to append new messages to the existing list rather than overwrite it
            messages: Annotated[Sequence[BaseMessage], operator.add]
            next: str

        workflow = StateGraph(AgentState)
        for expert in self.experts:
            # Create a node for each expert agent
            node = functools.partial(agent_node, agent=expert)
            workflow.add_node(expert.position, node)
        workflow.add_node("Moderator", self.bidding_process)

        members = [expert.position for expert in self.experts]
        for member in members:
            # We want our experts to ALWAYS "report back" to the moderator when done
            workflow.add_edge(member, "Moderator")
        # The moderator populates the "next" field in the graph state with routes to a node or finishes
        conditional_map = {k: k for k in members}
        conditional_map["FINISH"] = END
        workflow.add_conditional_edges("Moderator", lambda x: x["next"], conditional_map)
        # Finally, add entrypoint
        workflow.set_entry_point("Moderator")

        memory = SqliteSaver.from_conn_string(":memory:")
        graph = workflow.compile(checkpointer=memory)
        
        return graph
    
    def optimise_prompt(self):
        """
        Optimises a prompt by invoking a graph of expert agents.
        """
        # Initial state
        initial_state = {
            "messages": [HumanMessage(content=self.base_prompt, name="User")],
        }

        # Construct the graph
        graph = self.construct_expert_graph()
        # display(Image(graph.get_graph().draw_mermaid_png()))

        n = random.randint(1, 1000)
        config = {
            "configurable": {"thread_id": n},
            "recursion_limit": 50,
            }    

        # Run the graph
        for s in graph.stream(
            initial_state,
            config,
            stream_mode="values",
            ):
            if "__end__" not in s:
                continue
                # if len(s["messages"]) > 0:
                #     s["messages"][-1].pretty_print()
                    
        def message_to_dict(obj):
            if isinstance(obj, HumanMessage) or isinstance(obj, AIMessage):
                return {obj.name: obj.content}
            raise TypeError(f'Object of type {obj.__class__.__name__} is not JSON serializable')

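        # MarketPlace stores no llm of its own, so read the model settings from the first expert
        # (this assumes all experts share the same model)
        llm = self.experts[0].llm
        model = llm.model_name
        temp = int(llm.temperature)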
        path = f"/Users/iwatson/Documents/Research Project/prompt-optimisation/src/conversations/{model}/conversations_market_{temp}.json"
        if not os.path.exists(path):
            with open(path, "w") as f:
                json.dump([], f)
        
        with open(path, "r") as f:
            # Load the conversations recorded so far
            data = json.load(f)
            # Use the current number of entries as the key for the new conversation
            key = len(data)
            data.append({key: json.dumps(s, default=message_to_dict)})
            
        with open(path, "w") as f:
            json.dump(data, f, indent=4)
        
        return s
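
The selection rule inside `bidding_process` can be isolated from the LLM calls. The sketch below is an illustration under stated assumptions, not the notebook's implementation: `select_next` and its parameters `stop_bid`, `max_iterations` and `rng` are hypothetical names, with defaults taken from the thresholds in the code above (a bid of 2 or lower, or six iterations). Bids arrive as strings because the function schema declares them as a string enum.

```python
import random
from typing import Dict, List, Optional, Tuple


def select_next(
    bids: List[Dict[str, str]],
    iteration: int,
    stop_bid: int = 2,
    max_iterations: int = 6,
    rng: Optional[random.Random] = None,
) -> Tuple[str, int]:
    """Return (next node, winning bid); the node is "FINISH" when bidding should stop."""
    rng = rng or random
    # Keep only well-formed bids and convert the string enum values to integers.
    valid = [
        {"expert": b["expert"], "bid": int(b["bid"])}
        for b in bids
        if "expert" in b and "bid" in b
    ]
    if not valid:
        raise ValueError("no valid bids were submitted")
    max_bid = max(b["bid"] for b in valid)
    # Stop when every expert is satisfied or the iteration cap is reached.
    if max_bid <= stop_bid or iteration >= max_iterations:
        return "FINISH", max_bid
    # Break ties uniformly at random among the highest bidders.
    tied = [b["expert"] for b in valid if b["bid"] == max_bid]
    return rng.choice(tied), max_bid


if __name__ == "__main__":
    sample = [
        {"expert": "Mathematician", "bid": "3"},
        {"expert": "Word_Problem_Solver", "bid": "7"},
    ]
    print(select_next(sample, iteration=1))  # ('Word_Problem_Solver', 7)
```

Because 1 means exceptional alignment and 10 means wholesale changes are needed, the highest bidder is the expert that sees the most room for improvement, which is why that expert is chosen to revise the prompt next.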

In [18]:
# llm = ChatAnthropic(temperature=1.0, model="claude-3-5-sonnet-20240620")
llm = ChatOpenAI(temperature=1.0, model="gpt-4o")

In [19]:
from agent_suite import PromptDesignAgents, HumanEvalAgents, GSM8kAgents, SST2Agents

prompt_design_agents = PromptDesignAgents()

style_and_structure_expert = ExpertAgent("Style_and_Structure_Expert", CorePrinciples(prompt_design_agents.get_style_and_structure_principles()), llm)
conciseness_and_clarity_expert = ExpertAgent("Conciseness_and_Clarity_Expert", CorePrinciples(prompt_design_agents.get_conciseness_and_clarity_principles()), llm)
contextual_relevance_expert = ExpertAgent("Contextual_Relevance_Expert", CorePrinciples(prompt_design_agents.get_contextual_relevance_principles()), llm)
task_alignment_expert = ExpertAgent("Task_Alignment_Expert", CorePrinciples(prompt_design_agents.get_task_alignment_principles()), llm)
example_demonstration_expert = ExpertAgent("Example_Demonstration_Expert", CorePrinciples(prompt_design_agents.get_example_demonstration_principles()), llm)
incremental_prompting_expert = ExpertAgent("Incremental_Prompting_Expert", CorePrinciples(prompt_design_agents.get_incremental_prompting_principles()), llm)

human_eval_agents = HumanEvalAgents()

code_reviewer = ExpertAgent("Code_Reviewer", CorePrinciples(human_eval_agents.get_code_reviewer_principles()), llm)
software_engineer = ExpertAgent("Software_Engineer", CorePrinciples(human_eval_agents.get_software_engineering_principles()), llm)
software_architect = ExpertAgent("Software_Architect", CorePrinciples(human_eval_agents.get_software_architecture_principles()), llm)

gsm8k_agents = GSM8kAgents()

mathematician = ExpertAgent("Mathematician", CorePrinciples(gsm8k_agents.get_mathematician_principles()), llm)
word_problem_solver = ExpertAgent("Word_Problem_Solver", CorePrinciples(gsm8k_agents.get_word_problem_solver_principles()), llm)

sst2_agents = SST2Agents()

graded_sentiment_analyst = ExpertAgent("Graded_Sentiment_Analyst", CorePrinciples(sst2_agents.get_graded_sentiment_analyst_principles()), llm)
emotive_sentiment_analyst = ExpertAgent("Emotive_Sentiment_Analyst", CorePrinciples(sst2_agents.get_emotive_sentiment_analyst_principles()), llm)
aspect_based_sentiment_analyst = ExpertAgent("Aspect_Based_Sentiment_Analyst", CorePrinciples(sst2_agents.get_aspect_based_sentiment_analyst_principles()), llm)


In [20]:
from prompts.gpt_4o.human_eval_prompts import HumanEvalPrompts
from prompts.gpt_4o.gsm8k_prompts import GSM8KPrompts
from prompts.gpt_4o.sst2_prompts import SST2Prompts

human_eval_prompts = HumanEvalPrompts()
gsm8k_prompts = GSM8KPrompts()
sst2_prompts = SST2Prompts()

baseline_prompt = sst2_prompts.get_baseline_prompt()
criteria = sst2_prompts.get_criteria()

market = MarketPlace(
    base_prompt=baseline_prompt,
    criteria=criteria,
    experts=[
        style_and_structure_expert,
        conciseness_and_clarity_expert,
        contextual_relevance_expert,
        task_alignment_expert,
        example_demonstration_expert,
        incremental_prompting_expert,
        graded_sentiment_analyst,
        emotive_sentiment_analyst,
        aspect_based_sentiment_analyst,
    ],
    )
for expert in market.experts:
    print("Position: ", expert.position + "\nCore Principles: ", expert.core_principles)

Position:  Style_and_Structure_Expert
Core Principles:  - Always structure prompts logically for the task
- Always use a style and tone in prompts that is appropriate for the task
- Always assign a role to the language model that is relevant to the task
Position:  Conciseness_and_Clarity_Expert
Core Principles:  - Always write clear and concise prompts
- Always use simple and direct language in prompts
- Always avoid ambiguity in prompts
Position:  Contextual_Relevance_Expert
Core Principles:  - Always provide context to help the model understand the task
- Always write prompts informed by the context of the task
- Always design contextually relevant roles for the language model
Position:  Task_Alignment_Expert
Core Principles:  - Always write prompts that align with the task criteria
- Always tailor instructions to the task to guide the model
- Always make the task abundantly clear to the model in the prompt
Position:  Example_Demonstration_Expert
Core Principles:  - Always provide ex

In [21]:
import time

times = []
for _ in range(5):
    start = time.time()
    result = market.optimise_prompt()
    end = time.time()
    print(f"Time taken: {end - start}")
    times.append(end - start)
    result["messages"][-2].pretty_print()
    print("--------------------")

Highest Bidder: Aspect_Based_Sentiment_Analyst, Bid: 7
Highest Bidder: Conciseness_and_Clarity_Expert, Bid: 4
Highest Bidder: Contextual_Relevance_Expert, Bid: 5
Highest Bidder: Incremental_Prompting_Expert, Bid: 3
Highest Bidder: Style_and_Structure_Expert, Bid: 3
Highest Bidder: Incremental_Prompting_Expert, Bid: 3
Time taken: 176.52764797210693

### Revised Prompt Proposal

**Context:** Understanding customer feedback requires analyzing various aspects such as service quality and product performance to gauge overall satisfaction. If sentiments are mixed, base the overall sentiment on the aspect with the strongest or most frequent sentiment.

**Task:** Analyze the following text and identify its aspects and their corresponding sentiments. After analyzing each aspect, classify the overall sentiment of the text as either positive or negative.

{content}

**Output Requirements:**
1. **Identify Aspects:**
    - Read the text and identify different aspects mentioned (e.g., customer servic

In [23]:
print("Max time: ", max(times))
print("Min time: ", min(times))
print("Average time: ", sum(times) / len(times))

Max time:  257.2571382522583
Min time:  90.8610348701477
Average time:  178.66373562812805
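
The timing cells above can be wrapped in two small helpers. This is a sketch only: `time_runs` and `summarise` are hypothetical names, and it swaps `time.time()` for `time.perf_counter()`, a monotonic clock intended for measuring elapsed intervals.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call fn `repeats` times and return the duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarise(durations: List[float]) -> Dict[str, float]:
    """Return the maximum, minimum and mean of the recorded durations."""
    return {
        "max": max(durations),
        "min": min(durations),
        "mean": statistics.mean(durations),
    }


if __name__ == "__main__":
    # In the notebook, fn would be market.optimise_prompt
    print(summarise(time_runs(lambda: sum(range(1000)), repeats=3)))
```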
