# File to Test Prompts in the Multi-Agent System

In [9]:
"""
This file is used to test different prompts based on the LLM's responses.
In this case, the multi-agent system is evaluated.

Important notes on constructing prompts for the multi-agent system are given below.

TOOLS AVAILABLE:

    * get_wear -> function to get the torque wear:
        INPUTS:
            - type of material, the LLM should be able to map the material to its correct letter code [P, N, K]
            - feed rate, a float value in [mm/min]
            - cutting speed, a float value in [m/min]

    * get_quality -> function to determine GOOD/BAD quality:
        INPUTS:
            - cooling, a float value in [%]
            - feed rate, a float value in [mm/min]
            - cutting speed, a float value in [m/min]
            # drill bit material, optional; only needed for the CCF criteria [N, H, K]
 

BEHAVIOUR OF USER + LLM:

    The user can ask well-structured questions, but the goal is to evaluate the LLM with inexperienced users.
    Listed below are some of the "mistakes" that can be present in a query; the LLM should be able to deal with them,
    given the correct prompt.

    1) Missing values for mandatory parameters:
        Apart from the drill bit material, all values are required by their corresponding tool.
        The LLM should either:
            1.1) Tell the user that some values are missing and request them, or
            1.2) Tell the user that the values were missing and that it used default ones.

        We need to evaluate whether 1.1) or 1.2) is better overall.

    2) Missing value for drill bit material:
        If the user does not provide this value when asking about quality,
        the LLM should tell the user that it can only evaluate the BEF criteria and not the CCF criteria.
        The LLM should retrieve the BEF quality and ask whether the user also wants the CCF result by providing the drill bit material.

        If the user asks about tool wear, the LLM should not ask for the drill bit material at any point.

    3) The user gives the values without units or with the wrong units:
        The LLM should tell the user that it is assuming the units defined in the tool,
        or be able to correctly convert these values to the units used by the tool.

        

SPECIFICATIONS OF THE MULTI-AGENT:

    1) The supervisor needs to call the correct agents.

    2) It should be able to coordinate both agents when a request demands the usage of both tools.

    3) It should pass only the relevant information to each agent, meaning that the quality agent does not need to know about wear, and vice versa.
"""


1) Prompts for Wear Agent

In [10]:
WEAR_AGENT_PROMPTS = """
You are a specialized agent responsible for estimating the wear of tools based on material type, feed rate, and cutting speed. 
Your task is to compute the wear using the following inputs:
    - material ('P', 'N', or 'K')
    - feed rate in mm/min
    - cutting speed in m/min. 
    
If any of these parameters are missing, ask the user to provide them. 
If the user does not specify units, inform them that you are assuming the standard units defined for these values. 
Do not ask for drill bit material, as it is not required for wear estimation."""




2) Prompts for Quality Agent

In [11]:
QUALITY_AGENT_PROMPTS = """You are a specialized agent responsible for evaluating the quality of the drilling process based on cooling, feed rate, cutting speed, and drill bit material. 
Your task is to compute the quality using the following inputs: 
    - cooling percentage
    - feed rate in mm/min 
    - cutting speed in m/min
    - drill bit material ('N', 'H', or 'K'). 
    
If the drill bit material is missing, inform the user that only BEF criteria can be evaluated and ask if they would like to provide the material for CCF criteria. 
If any of the required parameters are missing, ask the user to provide them or use default values as needed. 
You should also inform the user if units are missing and assume standard SI units if not specified."""
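The rules that the two agent prompts express in natural language (mandatory inputs per tool, drill bit material optional and only needed for CCF) can also be written as a deterministic check, for example to score whether the LLM asked for the right values. The following is a minimal sketch; the parameter names and the function `check_request` are hypothetical and not taken from `Tools_LLM`.

```python
# Mandatory inputs per tool, as described in the notes at the top of this file.
REQUIRED = {
    "wear": ("material", "feed_rate", "cutting_speed"),
    "quality": ("cooling", "feed_rate", "cutting_speed"),
}


def check_request(tool, params):
    """Return (missing mandatory parameters, quality criteria that can be evaluated)."""
    missing = [name for name in REQUIRED[tool] if params.get(name) is None]
    if tool == "quality":
        # Without a drill bit material only the BEF criteria can be evaluated.
        criteria = ["BEF", "CCF"] if params.get("drill_bit_material") else ["BEF"]
    else:
        # The wear tool never needs the drill bit material.
        criteria = []
    return missing, criteria


# Quality request without cooling and without drill bit material.
print(check_request("quality", {"feed_rate": 0.15, "cutting_speed": 75}))
```

For this request the check reports `cooling` as missing and only `BEF` as available, which is the behaviour the quality prompt asks the agent to communicate to the user.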



3) Prompts for Supervisor

In [12]:
SUPERVISOR_AGENT_PROMPTS = """
You are a supervisor agent. 
When managing user requests for wear and quality analysis, ensure the following steps: 
    1) Identify whether the user needs only wear estimation or quality evaluation or both. 
    2) Handle requests for both in the correct sequence: wear estimation first, followed by quality evaluation. 
    3) If any parameters are missing (such as feed rate, cutting speed, cooling, etc.), ask the user for them. 
    4) Once the parameters are confirmed, call the appropriate tools and coordinate the results. 
    5) Deliver the results to the user in a clear and concise manner, ensuring the order of execution is followed.

Think step by step, following the steps above.
"""


User Queries

In [13]:
user_queries = {

    # EASY: These queries are straightforward: all required values are given and no unit conversion is needed (units are not stated explicitly). Designed to test whether the system can handle well-structured queries.
    "EASY": [
        ("I want to estimate the wear of my model. I am going to use the material 'P' and I want to have a feed rate value of 0.2 and cutting speed of 100."), 
        ("Can you help me evaluate the quality of my drilling process? I am using cooling of 12.5, feed rate of 0.13, and cutting speed of 16.7."), 
        ("Please estimate the wear for material 'N' with a feed rate of 0.3 and a cutting speed of 120.")
    ],


    # MEDIUM: These scenarios involve:
        # Unit conversion: The feed rate and cutting speed are given in units other than those the tools expect (e.g. meters per minute, inches per minute).
        # The LLM needs to convert these to the units used by the tools (mm/min for feed rate, m/min for cutting speed).

        # Missing values: Some queries are missing parameters (e.g., feed rate or cutting speed), and the LLM has to either ask for the missing values or use defaults.
    "MEDIUM": [
        ("I want to estimate the wear of my model using material 'K', feed rate of 0.00025 (in meters per minute), and cutting speed of 100 (in mm/min)."),  # Unit conversion test
        ("Can you calculate the drilling quality for cooling 10.5, feed rate 0.2 (in inches/min), and cutting speed 10 (in inches/min)?"),  # Unit conversion test
        ("I want to evaluate my drilling quality with feed rate 0.15 and cutting speed 75, but I forgot to mention the cooling value. Could you assume cooling 15?"),  # Missing parameter, assume default
        ("Estimate wear for material 'P'. I provided the cutting speed 100, but I forgot the feed rate. Can you assume a default of 0.2?"),  # Missing parameter, assume default
        
    ],

    # HARD: These queries require both tools (get_wear and get_quality) to be used together. 
    # Some of the queries also introduce missing values, which will test if the LLM can handle 
    # situations where it has to infer missing information (e.g., assuming a drill bit material).
    "HARD": [
        ("Can you calculate both the wear and the quality of my model? The material is 'P', feed rate is 0.3, cutting speed is 120, and cooling is 12.5. Also, assume the drill bit is 'N' for quality evaluation."),  # Using both tools
        ("I want to estimate the wear for material 'K' with a feed rate of 0.25 and cutting speed of 100, but I also want to know the quality of my drilling with cooling 10. Can you calculate both for me?"),  # Using both tools
        ("I need to evaluate both wear and quality for my model. I'm using material 'N', feed rate of 0.2, and cutting speed of 80, but I forgot the drill bit material for quality. Can you assume 'H' and 10.0 for cooling and calculate both?")  # Missing drill bit material, both tools
    ]
}





Running the Code

In [14]:
# True to see the logic behind the LLM
# False to just get final response
DETAILED_RESPONSES = False

Decide which part to test

In [15]:
# DO NOT CHANGE!
import json

from langchain_ollama import ChatOllama
from langchain_ollama.llms import OllamaLLM
from Tools_LLM import get_wear, get_quality
from langchain.agents import create_agent
from langchain_core.tools import tool
import time

llm = ChatOllama(
    model="granite3.3:2b",
    temperature=0.2,
)

time_elapsed = {}

LLM_RESPONSES = {}

@tool
def wear_tool(request: str) -> str:
    """Tool for doing calculating wear in drilling.
    """
    result = wear_agent.invoke({
        "messages": [{"role": "user", "content": request}]
    })
    return result["messages"][-1].text

@tool
def quality_tool(request: str) -> str:
    """Tool for analysing quality of drilling
    """
    result = quality_agent.invoke({
        "messages": [{"role": "user", "content": request}]
    })
    return result["messages"][-1].text



wear_agent = create_agent(
    llm,
    tools=[get_wear],
    system_prompt=WEAR_AGENT_PROMPTS,
)

quality_agent = create_agent(
    llm,
    tools=[get_quality],
    system_prompt=QUALITY_AGENT_PROMPTS,
)

supervisor_agent = create_agent(
    llm,
    tools=[wear_tool, quality_tool],
    system_prompt=SUPERVISOR_AGENT_PROMPTS,
)


    
for level in user_queries.keys():

    start_time = time.time()

    LLM_RESPONSES[level] = []

    print("Level of query: ", level)

    for query in user_queries[level]:

        print("USER QUERY: ", query)

        if DETAILED_RESPONSES:
            # Stream the agent's steps, including all information about tool calls.
            # Note: in this mode the responses are only printed, not stored in LLM_RESPONSES.
            for step in supervisor_agent.stream({"messages": [{"role": "user", "content": query}]}):
                for update in step.values():
                    for message in update.get("messages", []):
                        message.pretty_print()

        else:
            final_response = supervisor_agent.invoke({
                "messages": [{"role": "user", "content": query}]
            })
            

            # Extract the AI's final response (the last AI message in the conversation)
            ai_response = None
            for message in final_response['messages']:
                if message.type == "ai":  # LangChain messages expose their role via `.type`
                    ai_response = message.content

            if ai_response:
                print(ai_response)  # Print only the AI's response
            else:
                print("No AI response found.")  # Handle case if AI message is not found
            
            LLM_RESPONSES[level].append(ai_response)

    end_time = time.time()
    duration = end_time - start_time
    
    time_elapsed[level] = duration



with open("MA_FINAL.json", "w") as f:
    json.dump(LLM_RESPONSES, f, indent=4)

with open("MA_FINAL_TIME.json", "w") as f:
    json.dump(time_elapsed, f, indent=4)

Level of query:  EASY
USER QUERY:  I want to estimate the wear of my model. I am going to use the material 'P' and I want to have a feed rate value of 0.2 and cutting speed of 100.
Understood, you need wear estimation for your drilling process using material 'P', with a feed rate of 0.2 and a cutting speed of 100. Let's proceed with this request first. After obtaining the results, I will then perform quality evaluation. Do these parameters cover all necessary information? If not, please provide any additional details required for the analysis.
USER QUERY:  Can you help me evaluate the quality of my drilling process? I am using cooling of 12.5, feed rate of 0.13, and cutting speed of 16.7.
Sure, I can assist with that. However, before we proceed to quality evaluation, could you please confirm if the wear estimation is also required? 

If yes, let's first perform a wear analysis using the provided parameters: feed rate = 0.13, cutting speed = 16.7, and cooling = 12.5. Once this is comple

4) Evaluation of Responses

In [16]:
print(time_elapsed)

{'EASY': 61.62889361381531, 'MEDIUM': 125.5024926662445, 'HARD': 172.38211846351624}
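Because the levels contain different numbers of queries (3, 4, and 3), the totals above are easier to compare as a mean time per query. The following sketch redoes that arithmetic with the values printed above; `time.time()` differences are in seconds, and the dictionary `queries_per_level` is introduced here only for illustration.

```python
# Totals copied from the output above (seconds per difficulty level).
time_elapsed = {
    "EASY": 61.62889361381531,
    "MEDIUM": 125.5024926662445,
    "HARD": 172.38211846351624,
}

# Number of queries per level in `user_queries`.
queries_per_level = {"EASY": 3, "MEDIUM": 4, "HARD": 3}

seconds_per_query = {
    level: time_elapsed[level] / queries_per_level[level] for level in time_elapsed
}

for level, seconds in seconds_per_query.items():
    print(f"{level}: {seconds:.1f} s per query")
```

This gives roughly 20.5 s per EASY query, 31.4 s per MEDIUM query, and 57.5 s per HARD query.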
