### Hierarchical Skill Selection

If your scenario involves a large number of skills, however, you might need to consider Hierarchical Skill Selection. This is especially true if many of those skills are semantically similar and you are willing to trade higher latency and complexity for better skill selection accuracy. In this pattern, you organize your skills into groups and provide a description for each group. Your skill selection (either Generative or Semantic) first selects a group, and then performs a secondary search only among the skills in that group. Because the second search depends on the result of the first, the two steps run one after the other, which makes this approach slower and expensive to parallelize. In exchange, it splits the skill selection task into two smaller problems and frequently results in higher overall skill selection accuracy. Crafting and maintaining these skill groups takes time and effort, so this is not recommended as a starting technique.

In [1]:
import os
import requests
import logging
import numpy as np
 
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_community.vectorstores import FAISS
import faiss
from dotenv import load_dotenv


In [3]:
# Load environment variables from .env file
load_dotenv()

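# Retrieve the API keys and tokens used below
# (the Wolfram Alpha and Slack variable names are assumed; match them to your .env file)
api_key = os.getenv("OPENAI_API_KEY")
WOLFRAM_ALPHA_APP_ID = os.getenv("WOLFRAM_ALPHA_APP_ID")
SLACK_BOT_TOKEN = os.getenv("SLACK_BOT_TOKEN")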

In [4]:
embeddings = OpenAIEmbeddings(openai_api_key=api_key)

In [5]:
# Define tool groups with descriptions
tool_groups = {
    "Computation": {
        "description": "Tools related to mathematical computations and data analysis.",
        "tools": []
    },
    "Automation": {
        "description": "Tools that automate workflows and integrate different services.",
        "tools": []
    },
    "Communication": {
        "description": "Tools that facilitate communication and messaging.",
        "tools": []
    }
}

In [None]:
# Define Tools
@tool
def query_wolfram_alpha(expression: str) -> str:
    """
    Query Wolfram Alpha to evaluate a mathematical expression or short factual question.

    Args:
        expression (str): The expression or query to evaluate.

    Returns:
        str: The short-answer result returned by the API.
    """
    api_url = f"https://api.wolframalpha.com/v1/result?i={requests.utils.quote(expression)}&appid={WOLFRAM_ALPHA_APP_ID}"
    try:
        response = requests.get(api_url)
        if response.status_code == 200:
            return response.text
        else:
            raise ValueError(f"Wolfram Alpha API Error: {response.status_code} - {response.text}")
    except requests.exceptions.RequestException as e:
        raise ValueError(f"Failed to query Wolfram Alpha: {e}")
 

In [None]:
@tool
def trigger_zapier_webhook(zap_id: str, payload: dict) -> str:
    """
    Trigger a Zapier webhook to execute a predefined Zap.
 
    Args:
        zap_id (str): The unique identifier for the Zap to be triggered.
        payload (dict): The data to send to the Zapier webhook.
 
    Returns:
        str: Confirmation message upon successful triggering of the Zap.
 
    Raises:
        ValueError: If the API request fails or returns an error.
    """
    zapier_webhook_url = f"https://hooks.zapier.com/hooks/catch/{zap_id}/"
    try:
        response = requests.post(zapier_webhook_url, json=payload)
        if response.status_code == 200:
            return f"Zapier webhook '{zap_id}' successfully triggered."
        else:
            raise ValueError(f"Zapier API Error: {response.status_code} - {response.text}")
    except requests.exceptions.RequestException as e:
        raise ValueError(f"Failed to trigger Zapier webhook '{zap_id}': {e}")
 

In [None]:
@tool
def send_slack_message(channel: str, message: str) -> str:
    """
    Send a message to a specified Slack channel.
 
    Args:
        channel (str): The Slack channel ID or name where the message will be sent.
        message (str): The content of the message to send.
 
    Returns:
        str: Confirmation message upon successful sending of the Slack message.
 
    Raises:
        ValueError: If the API request fails or returns an error.
    """
    api_url = "https://slack.com/api/chat.postMessage"
    headers = {
        "Authorization": f"Bearer {SLACK_BOT_TOKEN}",
        "Content-Type": "application/json"
    }
    payload = {
        "channel": channel,
        "text": message
    }
    try:
        response = requests.post(api_url, headers=headers, json=payload)
        response_data = response.json()
        if response.status_code == 200 and response_data.get("ok"):
            return f"Message successfully sent to Slack channel '{channel}'."
        else:
            error_msg = response_data.get("error", "Unknown error")
            raise ValueError(f"Slack API Error: {error_msg}")
    except requests.exceptions.RequestException as e:
        raise ValueError(f"Failed to send message to Slack channel '{channel}': {e}")
 

In [None]:
# Assign tools to their respective groups
tool_groups["Computation"]["tools"].append(query_wolfram_alpha)
tool_groups["Automation"]["tools"].append(trigger_zapier_webhook)
tool_groups["Communication"]["tools"].append(send_slack_message)

In [None]:
# -------------------------------
# Embed Group and Tool Descriptions
# -------------------------------
# Embed group descriptions
group_names = []
group_embeddings = []
for group_name, group_info in tool_groups.items():
    group_names.append(group_name)
    group_embeddings.append(embeddings.embed_query(group_info["description"]))

In [None]:
# Create FAISS index for groups
group_embeddings_np = np.array(group_embeddings).astype('float32')
faiss.normalize_L2(group_embeddings_np)
group_index = faiss.IndexFlatL2(len(group_embeddings_np[0]))
group_index.add(group_embeddings_np)
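
The cell above normalizes each embedding to unit length before adding it to an `IndexFlatL2`. For unit vectors, squared L2 distance and cosine similarity are tied together by the identity ‖a − b‖² = 2 − 2·cos(a, b), so ranking by smallest L2 distance gives the same order as ranking by highest cosine similarity. The snippet below is a small, dependency-free check of that identity. The vectors are made up and the helper names are illustrative; none of them are part of the example above.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    # For unit vectors the dot product equals the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# Made-up vectors standing in for a query and two group embeddings
query = normalize([1.0, 2.0, 0.5])
group_a = normalize([0.9, 2.1, 0.4])
group_b = normalize([-1.0, 0.3, 2.0])

for name, vec in [("group_a", group_a), ("group_b", group_b)]:
    d2 = squared_l2(query, vec)
    cos = cosine(query, vec)
    print(f"{name}: squared L2 = {d2:.4f}, 2 - 2*cos = {2 - 2 * cos:.4f}")
```

Both columns print the same value for each group, and the group with the smaller distance is also the one with the larger cosine similarity.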

In [None]:
# Embed tool descriptions within each group
tool_indices = {}  # Maps group name to its FAISS index and tool functions
for group_name, group_info in tool_groups.items():
    tools = group_info["tools"]
    tool_descriptions = []
    tool_functions = []
    for tool_func in tools:
        description = tool_func.description.strip().split('\n')[0]  # First line of the tool description (taken from its docstring)
        tool_descriptions.append(description)
        tool_functions.append(tool_func)
    if tool_descriptions:
        tool_embeddings = embeddings.embed_documents(tool_descriptions)
        tool_embeddings_np = np.array(tool_embeddings).astype('float32')
        faiss.normalize_L2(tool_embeddings_np)
        tool_index = faiss.IndexFlatL2(len(tool_embeddings_np[0]))
        tool_index.add(tool_embeddings_np)
        tool_indices[group_name] = {
            "index": tool_index,
            "functions": tool_functions,
            "embeddings": tool_embeddings_np
        }

In [None]:
 
# -------------------------------
# Hierarchical Skill Selection
# -------------------------------
def select_group(query: str, top_k: int = 1) -> list:
    # embed_query returns a list of floats; FAISS expects a 2D float32 array
    query_embedding = np.array([embeddings.embed_query(query)], dtype='float32')
    faiss.normalize_L2(query_embedding)
    D, I = group_index.search(query_embedding, top_k)
    # FAISS pads missing results with -1, so skip those indices
    selected_groups = [group_names[idx] for idx in I[0] if idx != -1]
    return selected_groups

In [None]:
def select_tool(query: str, group_name: str, top_k: int = 1) -> list:
    tool_info = tool_indices[group_name]
    # embed_query returns a list of floats; FAISS expects a 2D float32 array
    query_embedding = np.array([embeddings.embed_query(query)], dtype='float32')
    faiss.normalize_L2(query_embedding)
    D, I = tool_info["index"].search(query_embedding, top_k)
    # FAISS pads missing results with -1, so keep only valid indices
    selected_tools = [tool_info["functions"][idx] for idx in I[0] if 0 <= idx < len(tool_info["functions"])]
    return selected_tools

In [None]:
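# Initialize the LLM with GPT-4 and set temperature to 0 for more deterministic responses
llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# The function name is illustrative; it wraps the two selection steps and the tool call
def run_hierarchical_selection(user_query: str) -> None:
    # Step 1: Select the most relevant group
    selected_groups = select_group(user_query, top_k=1)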
    if not selected_groups:
        print("No relevant skill group found for your query.")
        return
    
    selected_group = selected_groups[0]
    logging.info(f"Selected Group: {selected_group}")
    print(f"Selected Skill Group: {selected_group}")
    
    # Step 2: Select the most relevant tool within the group
    selected_tools = select_tool(user_query, selected_group, top_k=1)
    
    if not selected_tools:
        print("No relevant tool found within the selected group.")
        return
    
    selected_tool = selected_tools[0]
    logging.info(f"Selected Tool: {selected_tool.name}")
    print(f"Selected Tool: {selected_tool.name}")
    
    # Prepare arguments based on the tool
    args = {}
    if selected_tool == query_wolfram_alpha:
        # Assume the entire query is the expression
        args["expression"] = user_query
    elif selected_tool == trigger_zapier_webhook:
        # For demonstration, use placeholders
        args["zap_id"] = "123456"  # Replace with actual Zap ID
        args["payload"] = {"message": user_query}
    elif selected_tool == send_slack_message:
        # For demonstration, use placeholders
        args["channel"] = "#general"  # Replace with actual Slack channel
        args["message"] = user_query
    else:
        print("Selected tool is not recognized.")
        return
    
    # Invoke the selected tool
    try:
        tool_result = selected_tool.invoke(args)
        print(f"Tool '{selected_tool.name}' Result: {tool_result}")
    except ValueError as e:
        print(f"Error: {e}")
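
The example above needs credentials for OpenAI, Wolfram Alpha, Zapier, and Slack before it will run. To see the two-step control flow in isolation, the sketch below replaces the embedding model with a simple bag-of-words cosine similarity and uses only the standard library. It is a simplified stand-in rather than the approach used above: the group names, skill names, and descriptions are hypothetical, and a real system would use proper embeddings or a generative model for each step.

```python
import math
import re
from collections import Counter

def bow(text):
    """Bag-of-words vector: lowercase word counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_match(query, descriptions):
    """Return the key whose description is most similar to the query."""
    q = bow(query)
    return max(descriptions, key=lambda name: cosine(q, bow(descriptions[name])))

# Hypothetical groups and skills, for illustration only
GROUPS = {
    "computation": {
        "description": "math calculation compute numbers statistics equation",
        "skills": {
            "calculator": "compute an arithmetic or algebra equation",
            "stats": "compute statistics such as mean and median of numbers",
        },
    },
    "communication": {
        "description": "send message email chat notify team",
        "skills": {
            "send_email": "send an email message to a recipient",
            "send_chat": "send a chat message to a team channel",
        },
    },
}

def select_skill(query):
    # Step 1: choose a group using the group descriptions only
    group = best_match(query, {g: info["description"] for g, info in GROUPS.items()})
    # Step 2: search only among the skills inside that group
    skill = best_match(query, GROUPS[group]["skills"])
    return group, skill

print(select_skill("send a message to the team channel"))  # ('communication', 'send_chat')
print(select_skill("compute the mean of these numbers"))   # ('computation', 'stats')
```

Note that the second step never sees skills outside the chosen group, so a wrong group choice in step 1 cannot be corrected in step 2. That is one reason group descriptions deserve careful wording.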

### Machine Learned Skill Selection

Machine Learned Skill Selection uses machine learning to learn which skill to select from past experience and task feedback. Generic generative and embedding models are often larger, slower, and more expensive than skill selection requires, so training a dedicated model on task-skill pairs can reduce the cost and latency of this part of your agent-based solution. Both historical data and samples generated by a foundation model can be used to train your skill selection model. Similarly, you could fine-tune a smaller model to improve classification performance on your skill selection task. The key drawback is that it introduces a new model that your team will need to maintain. Carefully consider the costs before going down this path, as it may require extensive training data and computational resources to perform well.
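
As a rough illustration of training a small, dedicated model on task-skill pairs, the sketch below fits a multinomial Naive Bayes classifier using only the standard library. Naive Bayes is chosen here purely for brevity and is an assumption, not a recommendation from the text; a fine-tuned small language model or another classifier would fill the same role. The training pairs are made up, and the skill labels reuse the tool names from the earlier example.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayesSkillSelector:
    """Multinomial Naive Bayes with add-one smoothing over word counts."""

    def fit(self, examples):
        self.skill_counts = Counter()            # number of examples per skill
        self.word_counts = defaultdict(Counter)  # word frequencies per skill
        self.vocab = set()
        for task, skill in examples:
            self.skill_counts[skill] += 1
            for word in tokenize(task):
                self.word_counts[skill][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, task):
        total = sum(self.skill_counts.values())
        scores = {}
        for skill, count in self.skill_counts.items():
            words = self.word_counts[skill]
            denom = sum(words.values()) + len(self.vocab)
            score = math.log(count / total)  # log prior
            for word in tokenize(task):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((words[word] + 1) / denom)
            scores[skill] = score
        return max(scores, key=scores.get)

# Made-up task-skill pairs; in practice these would come from logs or a foundation model
TRAINING_DATA = [
    ("what is the integral of x squared", "query_wolfram_alpha"),
    ("solve this equation for x", "query_wolfram_alpha"),
    ("calculate the square root of 144", "query_wolfram_alpha"),
    ("post a message in the team channel", "send_slack_message"),
    ("tell the team the deploy is done", "send_slack_message"),
    ("notify the channel about the outage", "send_slack_message"),
    ("run the onboarding workflow", "trigger_zapier_webhook"),
    ("start the invoice automation", "trigger_zapier_webhook"),
    ("kick off the lead sync workflow", "trigger_zapier_webhook"),
]

selector = NaiveBayesSkillSelector().fit(TRAINING_DATA)
print(selector.predict("solve the equation x squared equals 9"))  # query_wolfram_alpha
print(selector.predict("notify the team channel"))                # send_slack_message
print(selector.predict("start the onboarding workflow"))          # trigger_zapier_webhook
```

A model this small runs in microseconds and costs nothing per call, but it only knows the vocabulary in its training data, so it must be retrained as skills and phrasing change. This is the maintenance burden described above.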