[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/weaviate/recipes/blob/main/integrations/llm-agent-frameworks/function-calling/baseten/baseten-query-agent.ipynb)

## Weaviate Query Agent with Baseten

This notebook shows how to expose the Weaviate Query Agent as a tool to a model served on Baseten.

### Requirements
1. A Weaviate Cloud (WCD) instance: the Weaviate Query Agent is currently only accessible through WCD. You can create a serverless cluster or a free 14-day sandbox [here](https://console.weaviate.cloud/).
1. The Weaviate Agents package: install it with `pip install weaviate-agents`.
1. A Weaviate cluster with data: if you don't have one, check out [this notebook](integrations/Weaviate-Import-Example.ipynb) to import the Weaviate Blogs.


### Setup Instructions for Baseten

This notebook uses the Llama 3.1 8B TensorRT-LLM engine served on Baseten. The reference linked at the end of this section walks through the full setup.

In short, you need to run:

```bash
pip install --upgrade truss
truss init llama-3-1-8b-trt-llm
cd llama-3-1-8b-trt-llm
rm model/model.py
```

Then make sure `config.yaml` contains the following compute config:

```yaml
model_name: Llama 3.1 8B Engine
resources:
  accelerator: A100
secrets:
  hf_access_token: "set token in baseten workspace"
trt_llm:
  build:
    base_model: llama
    checkpoint_repository:
      repo: meta-llama/Llama-3.1-8B-Instruct
      source: HF
    max_seq_len: 8192
```

Deploy on Baseten with:

```bash
truss push --publish --trusted
```

Reference: https://docs.baseten.co/performance/examples/llama-trt

[Learn more about Baseten](https://www.youtube.com/watch?v=rzJ8hDx1Kic) on Weaviate Podcast #105 with Philip Kiely!

### Import libraries and keys

In [None]:
import weaviate
from weaviate_agents.query import QueryAgent
import os
import json
import requests
import re


In [None]:
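os.environ["WEAVIATE_URL"] = ""
os.environ["WEAVIATE_API_KEY"] = ""
os.environ["OPENAI_API_KEY"] = ""  # read by send_query_agent_request
os.environ["BASETEN_ENDPOINT"] = ""  # model endpoint URL, read by run_assistant
os.environ["BASETEN_API_KEY"] = ""  # read by run_assistant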
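
The later cells read `OPENAI_API_KEY`, `BASETEN_ENDPOINT`, and `BASETEN_API_KEY` from the environment in addition to the two Weaviate variables. As a minimal sketch, the check below reports which of these are still unset or empty before any request is made. The helper name `missing_env_vars` and the `REQUIRED_VARS` list are illustrative, not part of any library.

```python
import os

# Variable names read by the cells in this notebook.
REQUIRED_VARS = [
    "WEAVIATE_URL",
    "WEAVIATE_API_KEY",
    "OPENAI_API_KEY",
    "BASETEN_ENDPOINT",
    "BASETEN_API_KEY",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this right after setting the keys catches an empty value early, rather than as an authentication error in a later cell.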

### Define Query Agent function

In [45]:
def send_query_agent_request(query: str) -> str:
    """
    Send a query to the database and get the response.

    Args:
        query (str): The question or query to search for in the database. This can be any natural language question related to the content stored in the database.

    Returns:
        str: The response from the database containing relevant information.
    """

    # connect to your Weaviate Cloud instance
    weaviate_client = weaviate.connect_to_weaviate_cloud(
        cluster_url=os.getenv("WEAVIATE_URL"),
        auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WEAVIATE_API_KEY")),
        headers={"X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")}, # API key for the model provider used by your Weaviate collection
    )

    try:
        # connect the query agent to your Weaviate collection(s)
        query_agent = QueryAgent(
            client=weaviate_client,
            collections=["Blogs"]
        )
        return query_agent.run(query).final_answer
    finally:
        # release the connection once the query has finished
        weaviate_client.close()

### Define tool

In [48]:
tools_json = """
[
  {
    "type": "function",
    "function": {
      "name": "send_query_agent_request",
      "description": "Send a query to the database and get the response.",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "The question or query to search for in the database. This can be any natural language question related to the content stored in the database."
          }
        },
        "required": [
          "query"
        ],
        "additionalProperties": false
      },
      "strict": true
    }
  }
]
"""

# Parse the JSON string to use it as a Python object
tools = json.loads(tools_json)
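
The tool definition above repeats the name, description, and parameter of `send_query_agent_request` by hand. As a hedged sketch, the same structure can be derived from a function's signature and docstring so the two stay in sync. The helper `build_tool_schema` and the stand-in `example_tool` are hypothetical, and the sketch assumes every parameter is a required string.

```python
import inspect
import json


def build_tool_schema(func, param_descriptions):
    """Build an OpenAI-style tool definition for a function with string parameters."""
    properties = {
        name: {"type": "string", "description": param_descriptions.get(name, "")}
        for name in inspect.signature(func).parameters
    }
    # Use the first line of the docstring as the tool description.
    doc = inspect.getdoc(func) or ""
    description = doc.splitlines()[0] if doc else ""
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
                "additionalProperties": False,
            },
            "strict": True,
        },
    }


def example_tool(query: str) -> str:
    """Send a query to the database and get the response."""
    return query  # stand-in body, for illustration only


schema = build_tool_schema(example_tool, {"query": "The question to search for."})
print(json.dumps(schema, indent=2))
```

For a single tool the handwritten JSON is just as workable. Generating it starts to pay off once several functions are exposed.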

### Define function calling loop

In [41]:


def run_assistant(message, chat_history=None):
    if chat_history is None:
        chat_history = []
    
    # Step 1: Get user message
    print(f"Question:\n{message}")
    print("="*50)
    
    # Initialize messages
    if chat_history:
        messages = chat_history + [{"role": "user", "content": message}]
    else:
        messages = [
            {"role": "system", "content": "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls."},
            {"role": "user", "content": message}
        ]
    
    # Function to detect and extract tool calls from response
    def extract_tool_calls(text):
        # Try to parse as JSON first
        try:
            data = json.loads(text)
            # Check if it's a list of tool calls or a single tool call
            if isinstance(data, list) and len(data) > 0 and "name" in data[0]:
                return data
            elif isinstance(data, dict) and "name" in data and "parameters" in data:
                return [data]
        except json.JSONDecodeError:
            pass
        
        # Fall back to scanning the text for embedded JSON objects.
        # A non-greedy regex stops at the first "}", which cuts off the nested
        # "parameters" object, so decode from each "{" with raw_decode instead.
        decoder = json.JSONDecoder()
        found = []
        pos = text.find("{")
        while pos != -1:
            try:
                obj, end = decoder.raw_decode(text, pos)
            except json.JSONDecodeError:
                # Not valid JSON from this brace; try the next one
                pos = text.find("{", pos + 1)
                continue
            if isinstance(obj, dict) and "name" in obj and "parameters" in obj:
                found.append(obj)
            pos = text.find("{", end)

        return found or None
    
    # Initial call to the model
    try:
        payload = {
            "messages": messages,
            "tools": tools,  # tool definitions parsed from tools_json above
            "tool_choice": "auto",
        }
        
        resp = requests.post(
            os.environ.get('BASETEN_ENDPOINT'),
            headers={"Authorization": f"Api-Key {os.environ.get('BASETEN_API_KEY')}"},
            json=payload,
            timeout=120,  # fail instead of hanging if the endpoint does not respond
        )
        
        if resp.status_code != 200:
            print(f"API Error: Status code {resp.status_code}")
            print(resp.text)
            return messages
            
        # Process the response
        response_text = resp.content.decode('utf-8', errors='replace')
        tool_calls = extract_tool_calls(response_text)
        
        if tool_calls:
            # This is a tool call response
            print("Tool plan:")
            print("I need to use a tool to help answer your question.", "\n")
            
            # Add assistant message with tool calls
            assistant_message = {
                "role": "assistant",
                "content": "I need to use a tool to help answer your question.",
                "tool_calls": []
            }
            
            # Process each tool call
            for i, tool_call in enumerate(tool_calls):
                call_id = f"call_{i}"
                
                # Handle parameters that might arrive as a JSON-encoded string
                params = tool_call["parameters"]
                if isinstance(params, str):
                    try:
                        params = json.loads(params)
                    except json.JSONDecodeError:
                        pass  # keep the raw string if it is not valid JSON
                
                tool_call_obj = {
                    "id": call_id,
                    "type": "function",
                    "function": {
                        "name": tool_call["name"],
                        "arguments": json.dumps(params) if isinstance(params, dict) else params
                    }
                }
                assistant_message["tool_calls"].append(tool_call_obj)
                
                # Print tool call info
                print(f"Tool name: {tool_call['name']} | Parameters: {params}")
            
            print("="*50)
            
            # Add assistant message to conversation history
            messages.append(assistant_message)
            
            # Execute each tool call
            for tool_call in assistant_message["tool_calls"]:
                function_name = tool_call["function"]["name"]
                args_str = tool_call["function"]["arguments"]
                
                # Parse arguments
                if isinstance(args_str, str):
                    try:
                        function_args = json.loads(args_str)
                    except json.JSONDecodeError:
                        # Treat a non-JSON string as the query itself
                        function_args = {"query": args_str}
                else:
                    function_args = args_str
                
                try:
                    # Look up the tool function by name among the notebook's globals and call it
                    function_response = globals()[function_name](function_args.get("query"))
                    
                    # Add tool response to messages
                    tool_message = {
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "name": function_name,
                        "content": str(function_response)
                    }
                    messages.append(tool_message)
                    
                    print(f"Executed tool {function_name} successfully")
                    
                    # Since we have a tool response, we'll just format it nicely as the final response
                    # This avoids making another API call that might fail
                    final_content = f"Based on the information I found:\n\n{function_response}"
                    
                    final_message = {
                        "role": "assistant",
                        "content": final_content
                    }
                    
                    messages.append(final_message)
                    
                    print("Final response (based on tool output):")
                    print(final_content)
                    print("="*50)
                    
                except Exception as e:
                    error_msg = f"Error executing tool {function_name}: {str(e)}"
                    print(error_msg)
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "name": function_name,
                        "content": f"Error: {str(e)}"
                    })
                    
                    # In case of error, add a simple response
                    messages.append({
                        "role": "assistant",
                        "content": "I encountered an error while trying to find information for you."
                    })
            
        else:
            # This is a direct response with no tool calls
            clean_text = re.sub(r'<\|.*?\|>', '', response_text).strip()
            
            if not clean_text:
                clean_text = "I'm not sure how to answer that question."
            
            assistant_message = {
                "role": "assistant",
                "content": clean_text
            }
            
            messages.append(assistant_message)
            
            print("Direct response:")
            print(clean_text)
            print("="*50)
    
    except Exception as e:
        print(f"Error in request: {str(e)}")
        messages.append({
            "role": "assistant",
            "content": f"I encountered an error while processing your request: {str(e)}"
        })
    
    return messages
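
The loop above resolves the function name returned by the model through `globals()`, which means any global callable could be invoked. One hedged alternative is an explicit registry that maps tool names to functions. The sketch below uses hypothetical names (`TOOL_REGISTRY`, `dispatch_tool_call`) and a stub in place of `send_query_agent_request`, so it runs without a Weaviate connection.

```python
import json


def search_stub(query: str) -> str:
    """Hypothetical stand-in for send_query_agent_request."""
    return f"results for: {query}"


# Explicit allow-list: only functions listed here can be called by the model.
TOOL_REGISTRY = {"send_query_agent_request": search_stub}


def dispatch_tool_call(tool_call, registry):
    """Run one OpenAI-style tool call and return a 'tool' role message."""
    name = tool_call["function"]["name"]
    raw_args = tool_call["function"]["arguments"]
    if name not in registry:
        content = f"Error: unknown tool {name!r}"
    else:
        try:
            # Arguments may arrive as a JSON string or as an already-parsed dict.
            args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
            content = str(registry[name](**args))
        except (json.JSONDecodeError, TypeError) as exc:
            content = f"Error: bad arguments for {name}: {exc}"
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": name,
        "content": content,
    }


call = {
    "id": "call_0",
    "type": "function",
    "function": {
        "name": "send_query_agent_request",
        "arguments": '{"query": "How to run Weaviate with Docker?"}',
    },
}
print(dispatch_tool_call(call, TOOL_REGISTRY)["content"])
```

An unknown tool name produces an error message in the `tool` role instead of raising, so the conversation history stays well-formed either way.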

### Query time

In [42]:
chat_history = run_assistant("How do I run Weaviate with Docker?")

Question:
How do I run Weaviate with Docker?
Tool plan:
I need to use a tool to help answer your question. 

Tool name: send_query_agent_request | Parameters: {'query': 'How to run Weaviate with Docker?'}
Executed tool send_query_agent_request successfully
Final response (based on tool output):
Based on the information I found:

To run Weaviate with Docker, you will need to follow these steps:

1. **Install Docker and Docker Compose**: Ensure that you have both the `docker` and `docker-compose` CLI tools installed on your system. Installation processes may differ depending on your operating system, with specific installation guides available for [Mac](https://docs.docker.com/desktop/install/mac-install/), [Windows](https://docs.docker.com/desktop/install/windows-install/), and [Ubuntu Linux](https://docs.docker.com/engine/install/ubuntu/).

2. **Obtain the Docker Compose File**: You can either use a pre-prepared Docker Compose file or create a custom one using the [Weaviate configurati
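
`run_assistant` returns the full message list, not just the answer. As a small sketch, the hypothetical helper below pulls the most recent assistant reply out of such a list. The sample history is shortened, made-up data in the same shape as the returned messages.

```python
def last_assistant_reply(messages):
    """Return the content of the most recent assistant message, or None."""
    for message in reversed(messages):
        if message.get("role") == "assistant":
            return message.get("content")
    return None


# Shortened sample history in the shape that run_assistant returns.
sample_history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I run Weaviate with Docker?"},
    {
        "role": "assistant",
        "content": "I need to use a tool to help answer your question.",
        "tool_calls": [],
    },
    {
        "role": "tool",
        "tool_call_id": "call_0",
        "name": "send_query_agent_request",
        "content": "tool output",
    },
    {"role": "assistant", "content": "Based on the information I found:\n\ntool output"},
]

print(last_assistant_reply(sample_history))
```

To continue the conversation, pass the returned list back as the `chat_history` argument of the next `run_assistant` call.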

In [44]:
chat_history

[{'role': 'system',
  'content': 'You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls.'},
 {'role': 'user', 'content': 'How do I run Weaviate with Docker?'},
 {'role': 'assistant',
  'content': 'I need to use a tool to help answer your question.',
  'tool_calls': [{'id': 'call_0',
    'type': 'function',
    'function': {'name': 'send_query_agent_request',
     'arguments': '{"query": "How to run Weaviate with Docker?"}'}}]},
 {'role': 'tool',
  'tool_call_id': 'call_0',
  'name': 'send_query_agent_request',
  'content': 'To run Weaviate with Docker, you will need to follow these steps:\n\n1. **Install Docker and Docker Compose**: Ensure that you have both the `docker` and `docker-compose` CLI tools installed on your system. Installation processes may differ depending on your operating system, with specific installation guide