# Creating Custom Tools with Unity Catalog Functions

This notebook demonstrates how to create custom tools using Unity Catalog functions for use with Databricks Agent Framework. Unity Catalog functions allow you to create reusable, versioned tools that can be called by AI agents.

## Overview

Unity Catalog functions enable you to:
- Create custom tools that AI agents can invoke
- Version and manage your functions centrally
- Control access and permissions
- Execute functions securely in serverless compute

## Prerequisites

- Databricks Runtime 15.0+
- Python 3.10+
- Serverless compute enabled
- Access to a Unity Catalog schema in which you can create functions (this notebook uses `users.ashwin_srikant`; substitute your own)

## Step 1: Install Required Dependencies

First, we need to install the Unity Catalog AI packages that provide the tools for creating and managing functions.

In [None]:
%pip install unitycatalog-ai[databricks]
%pip install unitycatalog-langchain[databricks]
dbutils.library.restartPython()

## Step 2: Initialize Unity Catalog Client

Create a client to interact with Unity Catalog for function management. The client will use your current Databricks authentication.

In [None]:
from unitycatalog.ai.core.databricks import DatabricksFunctionClient

# Define our schema location
CATALOG = "users"
SCHEMA = "ashwin_srikant"

# Initialize the Databricks Function client
client = DatabricksFunctionClient()

print("Databricks Function client initialized")
print(f"Target schema: {CATALOG}.{SCHEMA}")

## Step 3: Create Simple Mathematical Tools

Let's start with basic mathematical functions. These demonstrate the key requirements:
- **Type hints**: All parameters and return values must have type annotations
- **Docstrings**: Clear Google-style docstrings help the LLM understand when to use the function
- **No variable arguments**: Functions cannot use `*args` or `**kwargs`
- **Dependencies**: Import statements should be inside the function body

In [None]:
def add_numbers(number_1: float, number_2: float) -> float:
    """Adds two floating point numbers and returns their sum.
    
    Args:
        number_1: The first number to add
        number_2: The second number to add
        
    Returns:
        The sum of the two input numbers
    """
    return number_1 + number_2

def multiply_numbers(number_1: float, number_2: float) -> float:
    """Multiplies two floating point numbers and returns their product.
    
    Args:
        number_1: The first number to multiply
        number_2: The second number to multiply
        
    Returns:
        The product of the two input numbers
    """
    return number_1 * number_2

def calculate_percentage(part: float, whole: float) -> float:
    """Calculates what percentage one number is of another.
    
    Args:
        part: The partial value
        whole: The total value
        
    Returns:
        The percentage on a 0-100 scale (e.g., 25.0 for 25%)
    """
    if whole == 0:
        return 0.0
    return (part / whole) * 100
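
The requirements listed at the start of this step can be checked locally before anything is sent to Unity Catalog. The following is a minimal sketch using only the standard library; `check_tool_signature` and `untyped_sum` are hypothetical names introduced for illustration and are not part of the `unitycatalog-ai` package, which performs its own validation at registration time.

```python
import inspect
from typing import get_type_hints


def check_tool_signature(func) -> list[str]:
    """Return a list of problems that would likely block registration (empty if none)."""
    problems = []
    signature = inspect.signature(func)
    hints = get_type_hints(func)

    for name, param in signature.parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            problems.append(f"'{name}' is a variable argument (*args/**kwargs)")
        elif name not in hints:
            problems.append(f"parameter '{name}' has no type hint")

    if "return" not in hints:
        problems.append("return value has no type hint")
    if not inspect.getdoc(func):
        problems.append("function has no docstring")
    return problems


def add_numbers(number_1: float, number_2: float) -> float:
    """Adds two floating point numbers and returns their sum."""
    return number_1 + number_2


def untyped_sum(*values):
    # Deliberately breaks every rule: varargs, no hints, no docstring
    return sum(values)


print(check_tool_signature(add_numbers))
print(check_tool_signature(untyped_sum))
```

A function that passes this check can still be rejected by Unity Catalog for other reasons, so treat it as an early warning and not a guarantee.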

## Step 4: Register Mathematical Functions

Now we'll register these functions with Unity Catalog. The `replace=True` parameter allows us to update existing functions.

In [None]:
# Register the addition function
add_function_info = client.create_python_function(
    func=add_numbers,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{add_function_info.name}")

# Register the multiplication function
multiply_function_info = client.create_python_function(
    func=multiply_numbers,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{multiply_function_info.name}")

# Register the percentage function
percentage_function_info = client.create_python_function(
    func=calculate_percentage,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{percentage_function_info.name}")
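
Because every registration call has the same shape, the three calls can also be written as a loop. The sketch below assumes only the `create_python_function(func=..., catalog=..., schema=..., replace=True)` call used in this step; `register_functions` and `StubClient` are hypothetical names, and the stub stands in for `DatabricksFunctionClient` so the example runs without a workspace.

```python
from types import SimpleNamespace


def register_functions(client, funcs, catalog: str, schema: str) -> list[str]:
    """Register each function and return the fully qualified names."""
    registered = []
    for func in funcs:
        info = client.create_python_function(
            func=func, catalog=catalog, schema=schema, replace=True
        )
        registered.append(f"{catalog}.{schema}.{info.name}")
    return registered


class StubClient:
    """Hypothetical stand-in for DatabricksFunctionClient so the sketch runs offline."""

    def create_python_function(self, func, catalog, schema, replace=False):
        # The real client returns function metadata; only `.name` is used here
        return SimpleNamespace(name=func.__name__)


def add_numbers(number_1: float, number_2: float) -> float:
    """Adds two floating point numbers and returns their sum."""
    return number_1 + number_2


names = register_functions(StubClient(), [add_numbers], "users", "ashwin_srikant")
print(names)
```

In the notebook itself, pass the real `client` and the list of functions defined in Step 3 in place of the stub.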

## Step 5: Create Data Processing Tools

Let's create more sophisticated functions that work with data. These examples show how to:
- Import dependencies within the function
- Work with different data types
- Handle error cases gracefully

In [None]:
def analyze_text_sentiment(text: str) -> str:
    """Analyzes the sentiment of a given text string using simple keyword matching.
    
    This is a simplified sentiment analysis that categorizes text as positive,
    negative, or neutral based on keyword matching.
    
    Args:
        text: The text string to analyze for sentiment
        
    Returns:
        A string indicating the sentiment: 'positive', 'negative', or 'neutral'
    """
    # Import dependencies inside the function
    import re
    
    if not text or not isinstance(text, str):
        return "neutral"
    
    # Convert to lowercase for analysis
    text_lower = text.lower()
    
    # Define simple keyword lists
    positive_words = ['good', 'great', 'excellent', 'amazing', 'wonderful', 'fantastic', 'love', 'happy', 'successful']
    negative_words = ['bad', 'terrible', 'awful', 'horrible', 'hate', 'sad', 'failed', 'disappointed', 'angry']
    
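    # Count keywords that appear as whole words; a plain substring test
    # would wrongly match 'sad' inside 'saddle' or 'bad' inside 'badge'
    positive_count = sum(1 for word in positive_words if re.search(rf"\b{re.escape(word)}\b", text_lower))
    negative_count = sum(1 for word in negative_words if re.search(rf"\b{re.escape(word)}\b", text_lower))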
    
    if positive_count > negative_count:
        return "positive"
    elif negative_count > positive_count:
        return "negative"
    else:
        return "neutral"

def format_currency(amount: float, currency_code: str) -> str:
    """Formats a numeric amount as currency with proper formatting.
    
    Args:
        amount: The numeric amount to format
        currency_code: The three-letter currency code (USD, EUR, GBP, JPY)
        
    Returns:
        A formatted currency string
    """
    # Define currency symbols
    currency_symbols = {
        "USD": "$",
        "EUR": "€",
        "GBP": "£",
        "JPY": "¥"
    }
    
    symbol = currency_symbols.get(currency_code.upper(), currency_code)
    
    # Format with commas and two decimal places
    formatted_amount = f"{amount:,.2f}"
    
    return f"{symbol}{formatted_amount}"

def calculate_compound_interest(principal: float, rate: float, time: int, compound_frequency: int) -> float:
    """Calculates compound interest for an investment.
    
    Args:
        principal: The initial amount invested
        rate: The annual interest rate as a decimal (e.g., 0.05 for 5%)
        time: The number of years
        compound_frequency: How many times per year interest is compounded
        
    Returns:
        The final amount after compound interest
    """
    import math
    
    if principal <= 0 or rate < 0 or time <= 0 or compound_frequency <= 0:
        return 0.0
    
    # A = P(1 + r/n)^(nt)
    amount = principal * math.pow((1 + rate / compound_frequency), compound_frequency * time)
    
    return round(amount, 2)

def get_workspace_id() -> str:
    """Gets the current Databricks workspace ID using the Databricks SDK.
    
    This function attempts to determine the workspace ID from the current
    Databricks environment using the WorkspaceClient. It extracts the ID
    from the workspace URL.
    
    Returns:
        The workspace ID as a string, or an error message if unable to determine
    """
    try:
        from databricks.sdk import WorkspaceClient
        import re
        
        # Initialize the workspace client with default authentication
        w = WorkspaceClient()
        
        # Get the workspace URL from the client configuration
        workspace_url = w.config.host
        
        if not workspace_url:
            return "Error: Unable to determine workspace URL"
        
        # Extract workspace ID from different URL patterns
        # Pattern 1: https://dbc-12345678-abcd.cloud.databricks.com
        match = re.search(r'dbc-([a-f0-9]+)', workspace_url)
        if match:
            return match.group(1)
        
        # Pattern 2: https://12345678901234567890.1.azuredatabricks.net
        match = re.search(r'(\d{20})', workspace_url)
        if match:
            return match.group(1)
        
        # Pattern 3: Other patterns - try to extract meaningful identifier
        match = re.search(r'([a-f0-9-]+)\.cloud\.databricks\.com', workspace_url)
        if match:
            return match.group(1)
        
        # If no pattern matches, return the host URL
        return f"Host: {workspace_url}"
        
    except ImportError:
        return "Error: Databricks SDK not available"
    except Exception as e:
        return f"Error: {str(e)}"
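
Since these are ordinary Python functions, their logic can be sanity-checked locally before registration. As a worked example of the formula A = P(1 + r/n)^(nt) used in `calculate_compound_interest`, the sketch below compares monthly and annual compounding for the same investment; `compound_amount` is a hypothetical standalone helper that omits the input validation.

```python
def compound_amount(principal: float, rate: float, years: int, periods_per_year: int) -> float:
    """Apply A = P(1 + r/n)^(nt) and round to cents."""
    growth_per_period = 1 + rate / periods_per_year
    total_periods = periods_per_year * years
    return round(principal * growth_per_period ** total_periods, 2)


# $1,000 at 5% for 10 years, compounded monthly versus once a year
monthly = compound_amount(1000.0, 0.05, 10, 12)
annual = compound_amount(1000.0, 0.05, 10, 1)

print(f"Monthly compounding: ${monthly:,.2f}")
print(f"Annual compounding:  ${annual:,.2f}")
```

More frequent compounding produces a slightly larger final amount, which is a quick way to confirm that the exponent and the per-period rate are not swapped.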

## Step 6: Register Data Processing Functions

Register our more complex functions with Unity Catalog.

In [None]:
# Register the sentiment analysis function
sentiment_function_info = client.create_python_function(
    func=analyze_text_sentiment,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{sentiment_function_info.name}")

# Register the currency formatting function
currency_function_info = client.create_python_function(
    func=format_currency,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{currency_function_info.name}")

# Register the compound interest function
interest_function_info = client.create_python_function(
    func=calculate_compound_interest,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{interest_function_info.name}")

# Register the workspace ID function
workspace_function_info = client.create_python_function(
    func=get_workspace_id,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)

print(f"✅ Registered function: {CATALOG}.{SCHEMA}.{workspace_function_info.name}")

## Step 7: Test the Functions

Let's test our registered functions to ensure they work correctly. We can execute them in two modes:
- **Serverless mode** (default): Functions execute remotely in Databricks serverless compute
- **Local mode**: Functions execute locally for debugging

In [None]:
# Test mathematical functions
print("=== Testing Mathematical Functions ===")

# Test addition
result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.add_numbers",
    parameters={"number_1": 15.5, "number_2": 24.7}
)
print(f"15.5 + 24.7 = {result.value}")

# Test multiplication
result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.multiply_numbers",
    parameters={"number_1": 12.0, "number_2": 8.5}
)
print(f"12.0 × 8.5 = {result.value}")

# Test percentage calculation - ensure parameters are floats
result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.calculate_percentage",
    parameters={"part": 25.0, "whole": 100.0}
)
print(f"25 is {result.value}% of 100")

In [None]:
# Test data processing functions
print("\n=== Testing Data Processing Functions ===")

# Test sentiment analysis
test_texts = [
    "This is a wonderful and amazing product!",
    "I hate this terrible service.",
    "The weather is okay today."
]

for text in test_texts:
    sentiment = client.execute_function(
        function_name=f"{CATALOG}.{SCHEMA}.analyze_text_sentiment",
        parameters={"text": text}
    )
    print(f"Text: '{text}' → Sentiment: {sentiment.value}")

# Test currency formatting - both parameters are required
amounts = [(1234.56, "USD"), (2500.00, "EUR"), (999.99, "GBP")]
for amount, currency in amounts:
    formatted = client.execute_function(
        function_name=f"{CATALOG}.{SCHEMA}.format_currency",
        parameters={"amount": amount, "currency_code": currency}
    )
    print(f"{amount} {currency} → {formatted.value}")

# Test compound interest - handle result properly
final_amount_result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.calculate_compound_interest",
    parameters={"principal": 1000.0, "rate": 0.05, "time": 10, "compound_frequency": 12}
)
# Extract the actual value and convert to float for formatting
final_amount = final_amount_result.value if hasattr(final_amount_result, 'value') else final_amount_result
try:
    final_amount_float = float(final_amount)
    print(f"\n$1,000 invested at 5% annually for 10 years (monthly compounding): ${final_amount_float:,.2f}")
except (ValueError, TypeError):
    print(f"\n$1,000 invested at 5% annually for 10 years (monthly compounding): {final_amount}")

# Test workspace ID function
print("\n=== Testing Workspace Information ===")
workspace_id_result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.get_workspace_id",
    parameters={}
)
workspace_id = workspace_id_result.value if hasattr(workspace_id_result, 'value') else workspace_id_result
print(f"Workspace ID: {workspace_id}")
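
The `hasattr` check and `float` conversion used for the compound interest result can be folded into one small helper. This is a sketch under the same assumption the cell above makes: the execution result either exposes a `value` attribute or is already a plain value. `result_as_float` and `fake_result` are hypothetical names, and the stand-in object replaces a real call to `client.execute_function`.

```python
from types import SimpleNamespace


def result_as_float(result) -> float:
    """Return the numeric value carried by an execution result.

    Falls back to the object itself when it has no `value` attribute.
    """
    raw = getattr(result, "value", result)
    return float(raw)


# Hypothetical stand-in for what client.execute_function returns
fake_result = SimpleNamespace(value="1647.01")

print(f"${result_as_float(fake_result):,.2f}")
print(result_as_float(40.2))
```

The helper raises `ValueError` or `TypeError` when the value is not numeric, so callers that expect error strings (as `get_workspace_id` can return) should not route those results through it.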

## Step 8: List All Registered Functions

Let's view all the functions we've created in our schema.

In [None]:
# List all functions in our schema
functions = client.list_functions(catalog=CATALOG, schema=SCHEMA)

print(f"\n=== Functions in {CATALOG}.{SCHEMA} ===")
for func in functions:
    print(f"📦 {func.name}")
    if hasattr(func, 'comment') and func.comment:
        print(f"   Description: {func.comment}")
    print()

## Step 9: Using Functions with LangChain (Optional)

Unity Catalog functions can be easily integrated with LangChain for use in AI agents. Here's a quick example:

In [None]:
# Example of using Unity Catalog functions with LangChain
from unitycatalog.ai.langchain.toolkit import UCFunctionToolkit

# Create a toolkit with our functions
toolkit = UCFunctionToolkit(
    function_names=[
        f"{CATALOG}.{SCHEMA}.add_numbers",
        f"{CATALOG}.{SCHEMA}.multiply_numbers",
        f"{CATALOG}.{SCHEMA}.calculate_percentage",
        f"{CATALOG}.{SCHEMA}.analyze_text_sentiment",
        f"{CATALOG}.{SCHEMA}.format_currency",
        f"{CATALOG}.{SCHEMA}.calculate_compound_interest",
        f"{CATALOG}.{SCHEMA}.get_workspace_id"
    ],
    client=client
)

# Get the LangChain tools from the toolkit
tools = toolkit.tools

print(f"Created {len(tools)} LangChain tools from Unity Catalog functions:")
for tool in tools:
    print(f"- {tool.name}: {tool.description}")
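
The list of fully qualified names passed to the toolkit can be generated from the short function names, which keeps the catalog and schema in one place. A minimal sketch follows; `qualify` is a hypothetical helper, not part of any library.

```python
def qualify(catalog: str, schema: str, names: list[str]) -> list[str]:
    """Build catalog.schema.function names for each short function name."""
    return [f"{catalog}.{schema}.{name}" for name in names]


function_names = qualify(
    "users",
    "ashwin_srikant",
    ["add_numbers", "multiply_numbers", "calculate_percentage"],
)

for name in function_names:
    print(name)
```

The resulting list can be passed as the `function_names` argument shown in the cell above.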

### 📝 Important Limitations:
- **No default parameters**: Unity Catalog Python functions cannot have default parameter values
- **All parameters required**: Every parameter must be provided when calling the function
- **Type hints mandatory**: All parameters and return values must have type annotations
- **Strict type checking**: Parameters must match exact types (e.g., `float` parameters cannot accept `int` values)
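
One way to work within the "all parameters required" and "strict type checking" limitations is to normalize a parameter dictionary on the caller's side before executing a function. The sketch below reads the type hints of the original Python function, fails early when a parameter is missing, and converts `int` values to `float` where the hint asks for `float`. `coerce_parameters` is a hypothetical helper written for illustration; it is not provided by the Unity Catalog client.

```python
from typing import get_type_hints


def coerce_parameters(func, parameters: dict) -> dict:
    """Check that every parameter is supplied and convert ints to floats where hinted."""
    hints = get_type_hints(func)
    hints.pop("return", None)

    missing = [name for name in hints if name not in parameters]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")

    coerced = {}
    for name, value in parameters.items():
        # bool is a subclass of int, so exclude it explicitly
        if hints.get(name) is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        coerced[name] = value
    return coerced


def calculate_percentage(part: float, whole: float) -> float:
    """Calculates what percentage one number is of another."""
    return 0.0 if whole == 0 else (part / whole) * 100


params = coerce_parameters(calculate_percentage, {"part": 25, "whole": 100})
print(params)
```

The cleaned dictionary can then be passed as the `parameters` argument of `client.execute_function`.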