# Function Call Parser - LLM Function Calling Support

The function call parser enables **Pythonic function calling syntax for LLM tool invocation**. Instead of raw JSON, LLMs can generate natural Python function calls that are parsed into structured tool invocations.

**Core Features:**
- **Natural Syntax**: `search(query="AI", limit=5)` instead of JSON blobs
- **Positional & Keyword Args**: Supports both calling conventions
- **Schema-Aware Nesting**: Automatically structures flat args into nested schemas
- **Type-Safe Parsing**: AST-based parsing accepts only literal argument values, which prevents code injection
- **Pydantic Integration**: Works seamlessly with Pydantic models

**Use Case**: LLM outputs `recall("ocean preferences", limit=10)` → the parser, followed by positional-argument mapping (Section 5), yields `{"tool": "recall", "arguments": {"query": "ocean preferences", "limit": 10}}`

In [None]:
from pydantic import BaseModel

from lionherd_core.libs.schema_handlers import (
    map_positional_args,
    nest_arguments_by_schema,
    parse_function_call,
)

## 1. Basic Function Call Parsing

Parse Python function call syntax into JSON tool invocation format.

In [2]:
# Simple keyword arguments
result = parse_function_call('search(query="AI", limit=5)')
print(f"Tool: {result['tool']}")
print(f"Arguments: {result['arguments']}")

Tool: search
Arguments: {'query': 'AI', 'limit': 5}


In [3]:
# Positional arguments (indexed as _pos_0, _pos_1, ...)
result = parse_function_call('recall("ocean preferences", 10)')
print(f"Tool: {result['tool']}")
print(f"Arguments: {result['arguments']}")
print("Positional args will be mapped to param names later")

Tool: recall
Arguments: {'_pos_0': 'ocean preferences', '_pos_1': 10}
Positional args will be mapped to param names later


In [4]:
# Mixed positional and keyword arguments
result = parse_function_call('create("MyEntity", entity_type="person", confidence=0.95)')
print(f"Tool: {result['tool']}")
print(f"Arguments: {result['arguments']}")

Tool: create
Arguments: {'_pos_0': 'MyEntity', 'entity_type': 'person', 'confidence': 0.95}
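
The behavior shown above can be approximated with the standard library alone. The following is a minimal sketch, assuming the parser walks the call's AST and evaluates each argument as a literal; `parse_call_sketch` is a hypothetical name and this is not the library's implementation.

```python
import ast


def parse_call_sketch(source: str) -> dict:
    """Illustrative stand-in for parse_function_call; not the library's code."""
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Invalid function call syntax: {exc}") from exc

    call = tree.body
    if not isinstance(call, ast.Call):
        raise ValueError("Invalid function call syntax: not a call expression")

    # Plain name (search) or attribute access (client.search): keep the last part.
    if isinstance(call.func, ast.Name):
        tool = call.func.id
    elif isinstance(call.func, ast.Attribute):
        tool = call.func.attr
    else:
        raise ValueError("Invalid function call syntax: unsupported callee")

    arguments = {}
    # Positional arguments get placeholder keys until they are mapped to names.
    for index, node in enumerate(call.args):
        arguments[f"_pos_{index}"] = ast.literal_eval(node)
    for keyword in call.keywords:
        if keyword.arg is None:  # **kwargs unpacking carries no keyword name
            raise ValueError("Invalid function call syntax: **kwargs not supported")
        # literal_eval accepts only literals, so nothing in the input is executed.
        arguments[keyword.arg] = ast.literal_eval(keyword.value)

    return {"tool": tool, "arguments": arguments}


print(parse_call_sketch('create("MyEntity", entity_type="person", confidence=0.95)'))
# {'tool': 'create', 'arguments': {'_pos_0': 'MyEntity', 'entity_type': 'person', 'confidence': 0.95}}
```

Keeping positional values under `_pos_N` keys lets parsing stay independent of any schema; parameter names are only needed at the later mapping step.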


## 2. Complex Argument Types

Parser handles strings, numbers, booleans, lists, dicts, and None.

In [5]:
# Various data types
result = parse_function_call(
    'configure(name="test", count=42, enabled=True, tags=["a", "b"], config={"key": "value"}, default=None)'
)
print("Parsed arguments:")
for key, value in result["arguments"].items():
    print(f"  {key}: {value!r} (type: {type(value).__name__})")

Parsed arguments:
  name: 'test' (type: str)
  count: 42 (type: int)
  enabled: True (type: bool)
  tags: ['a', 'b'] (type: list)
  config: {'key': 'value'} (type: dict)
  default: None (type: NoneType)


In [6]:
# Nested structures
result = parse_function_call(
    'analyze(data=[{"id": 1, "value": 100}, {"id": 2, "value": 200}], options={"deep": True})'
)
print(f"Data: {result['arguments']['data']}")
print(f"Options: {result['arguments']['options']}")

Data: [{'id': 1, 'value': 100}, {'id': 2, 'value': 200}]
Options: {'deep': True}


## 3. Method Call Syntax

Handles both simple function calls and method calls (extracts final method name).

In [7]:
# Simple function call
result1 = parse_function_call('search(query="test")')
print(f"Simple: {result1['tool']}")

# Method call (client.search)
result2 = parse_function_call('client.search(query="test")')
print(f"Method: {result2['tool']}")

# Both extract the same tool name
print(f"Same tool: {result1['tool'] == result2['tool']}")

Simple: search
Method: search
Same tool: True


## 4. Error Handling

The parser raises `ValueError` for input that is not a function call, that uses `**kwargs` unpacking, or that passes non-literal values.

In [8]:
# Invalid syntax examples
invalid_calls = [
    "not a function call",  # Not a call
    "func(**kwargs)",  # **kwargs not supported
    "func(x=undefined_var)",  # Non-literal value
]

for call in invalid_calls:
    try:
        parse_function_call(call)
        print(f"❌ Should have failed: {call}")
    except ValueError as e:
        print(f"✓ Rejected: {call[:30]}... - {str(e)[:50]}")

✓ Rejected: not a function call... - Invalid function call syntax: invalid syntax (<unk
✓ Rejected: func(**kwargs)... - Invalid function call syntax: **kwargs not support
✓ Rejected: func(x=undefined_var)... - Invalid function call syntax: malformed node or st
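
The truncated message for the third case ("malformed node or st...") matches the wording of the standard library's `ast.literal_eval` error, which suggests, though this notebook does not confirm it, that argument values are evaluated as literals. The sketch below uses only `ast.literal_eval` to show which argument texts such a check would accept; `is_literal` is a hypothetical helper.

```python
import ast


def is_literal(text: str) -> bool:
    """Return True if text is a Python literal that ast.literal_eval accepts."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True


candidates = [
    '"AI"',                       # string literal
    "[1, 2, {'deep': True}]",     # nested containers of literals
    "undefined_var",              # bare name: not a literal
    "__import__('os').getcwd()",  # call expression: not a literal
]
for text in candidates:
    print(f"{text}: {'accepted' if is_literal(text) else 'rejected'}")
```

Names and call expressions are refused without being evaluated, which is the property that makes literal-only parsing safe for untrusted LLM output.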


## 5. Mapping Positional Arguments

Map positional placeholders (`_pos_0`, `_pos_1`, ...) to the actual parameter names from the function signature or schema.

In [9]:
# Parse function call with positional args
parsed = parse_function_call('search("AI research", 10, True)')
print(f"Before mapping: {parsed['arguments']}")

# Define parameter names from function signature
param_names = ["query", "limit", "include_archived"]

# Map positional args to param names
mapped = map_positional_args(parsed["arguments"], param_names)
print(f"After mapping: {mapped}")

Before mapping: {'_pos_0': 'AI research', '_pos_1': 10, '_pos_2': True}
After mapping: {'query': 'AI research', 'limit': 10, 'include_archived': True}


In [10]:
# Mixed positional and keyword args
parsed = parse_function_call('create("Entity", "person", confidence=0.9)')
print(f"Before mapping: {parsed['arguments']}")

param_names = ["name", "entity_type"]
mapped = map_positional_args(parsed["arguments"], param_names)
print(f"After mapping: {mapped}")
print("Note: Keyword args preserved as-is")

Before mapping: {'_pos_0': 'Entity', '_pos_1': 'person', 'confidence': 0.9}
After mapping: {'name': 'Entity', 'entity_type': 'person', 'confidence': 0.9}
Note: Keyword args preserved as-is


In [11]:
# Too many positional args - error
parsed = parse_function_call("func(1, 2, 3, 4)")
param_names = ["a", "b"]

try:
    map_positional_args(parsed["arguments"], param_names)
except ValueError as e:
    print(f"✓ Caught error: {e}")

✓ Caught error: Too many positional arguments (expected 2)
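
The mapping step can be pictured as a rename of the `_pos_N` keys by index. This is a minimal sketch under that assumption; `map_positional_sketch` is a hypothetical name, and details such as how the library treats a keyword that repeats a positional parameter are not covered here.

```python
def map_positional_sketch(arguments: dict, param_names: list[str]) -> dict:
    """Illustrative stand-in for map_positional_args; not the library's code."""
    count = sum(1 for key in arguments if key.startswith("_pos_"))
    if count > len(param_names):
        raise ValueError(f"Too many positional arguments (expected {len(param_names)})")

    # _pos_0 takes the first parameter name, _pos_1 the second, and so on.
    mapped = {param_names[i]: arguments[f"_pos_{i}"] for i in range(count)}
    # Keyword arguments pass through unchanged.
    mapped.update({k: v for k, v in arguments.items() if not k.startswith("_pos_")})
    return mapped


print(map_positional_sketch(
    {"_pos_0": "Entity", "_pos_1": "person", "confidence": 0.9},
    ["name", "entity_type"],
))
# {'name': 'Entity', 'entity_type': 'person', 'confidence': 0.9}
```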


## 6. Schema-Based Nesting

Automatically restructure flat arguments into nested objects based on a Pydantic schema.

In [12]:
# Define schemas with nested structure
class SearchOptions(BaseModel):
    include_archived: bool = False
    fuzzy_match: bool = True


class SearchRequest(BaseModel):
    query: str
    limit: int = 10
    options: SearchOptions | None = None  # Nested schema (optional)


# Flat arguments (as LLM might generate)
flat_args = {
    "query": "AI research",
    "limit": 20,
    "include_archived": True,  # Should nest into options
    "fuzzy_match": False,  # Should nest into options
}

print("Flat arguments:")
print(flat_args)

Flat arguments:
{'query': 'AI research', 'limit': 20, 'include_archived': True, 'fuzzy_match': False}


In [13]:
# Nest arguments according to schema
nested = nest_arguments_by_schema(flat_args, SearchRequest)

print("\nNested arguments:")
print(nested)

# Verify structure matches schema
request = SearchRequest(**nested)
print("\n✓ Successfully created SearchRequest:")
print(f"  query: {request.query}")
print(f"  limit: {request.limit}")
print(f"  options.include_archived: {request.options.include_archived}")
print(f"  options.fuzzy_match: {request.options.fuzzy_match}")


Nested arguments:
{'query': 'AI research', 'limit': 20, 'options': {'include_archived': True, 'fuzzy_match': False}}

✓ Successfully created SearchRequest:
  query: AI research
  limit: 20
  options.include_archived: True
  options.fuzzy_match: False
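
One way to picture the nesting step: any key that is not a top-level field is moved under the nested field whose schema declares it. The sketch below assumes that rule and describes the schema with plain sets instead of Pydantic models so that it runs without dependencies; `nest_flat_args` and its parameters are hypothetical.

```python
def nest_flat_args(flat: dict, top_level: set[str], nested: dict[str, set[str]]) -> dict:
    """Move keys that belong to a nested schema under that schema's field name."""
    result = {}
    for key, value in flat.items():
        if key in top_level:
            result[key] = value  # top-level names win in this sketch
            continue
        # Find the nested field whose schema declares this key.
        owner = next((name for name, names in nested.items() if key in names), None)
        if owner is None:
            result[key] = value  # unknown keys are left for validation to handle
        else:
            result.setdefault(owner, {})[key] = value
    return result


flat_args = {"query": "AI research", "limit": 20, "include_archived": True, "fuzzy_match": False}
print(nest_flat_args(
    flat_args,
    top_level={"query", "limit", "options"},
    nested={"options": {"include_archived", "fuzzy_match"}},
))
# {'query': 'AI research', 'limit': 20, 'options': {'include_archived': True, 'fuzzy_match': False}}
```

Giving top-level names precedence is a choice made for this sketch; how the library resolves a name declared at both levels is not shown in this notebook.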


## 7. Union Type Nesting

Handles union types (e.g., `EpisodicOption | SemanticOption`) by collecting fields from all union members.

In [14]:
# Define union schemas
class EpisodicOption(BaseModel):
    context: str
    importance: float


class SemanticOption(BaseModel):
    concept: str
    confidence: float


class MemoryRequest(BaseModel):
    content: str
    options: EpisodicOption | SemanticOption  # Union type


# Flat args for episodic memory
flat_episodic = {
    "content": "User completed authentication refactor",
    "context": "session_work",  # From EpisodicOption
    "importance": 0.8,  # From EpisodicOption
}

nested_episodic = nest_arguments_by_schema(flat_episodic, MemoryRequest)
print("Episodic memory (nested):")
print(nested_episodic)

# Validate it works
memory = MemoryRequest(**nested_episodic)
print(f"✓ Created: {memory.content[:30]}... with context={memory.options.context}")

Episodic memory (nested):
{'content': 'User completed authentication refactor', 'options': {'context': 'session_work', 'importance': 0.8}}
✓ Created: User completed authentication ... with context=session_work


In [15]:
# Flat args for semantic memory
flat_semantic = {
    "content": "JWT tokens expire after 1 hour in production",
    "concept": "authentication_patterns",  # From SemanticOption
    "confidence": 0.95,  # From SemanticOption
}

nested_semantic = nest_arguments_by_schema(flat_semantic, MemoryRequest)
print("Semantic memory (nested):")
print(nested_semantic)

# Validate
memory = MemoryRequest(**nested_semantic)
print(f"✓ Created: {memory.content[:30]}... with concept={memory.options.concept}")

Semantic memory (nested):
{'content': 'JWT tokens expire after 1 hour in production', 'options': {'concept': 'authentication_patterns', 'confidence': 0.95}}
✓ Created: JWT tokens expire after 1 hour... with concept=authentication_patterns


## 8. End-to-End LLM Function Calling

Complete workflow: LLM output → Parse → Map → Nest → Validate

In [16]:
# Simulate LLM output (Pythonic function call)
llm_output = 'search("ocean preferences", 5, fuzzy_match=True, include_archived=False)'

print(f"LLM Output: {llm_output}")
print()

LLM Output: search("ocean preferences", 5, fuzzy_match=True, include_archived=False)



In [17]:
# Step 1: Parse function call
parsed = parse_function_call(llm_output)
print("Step 1 - Parsed:")
print(f"  Tool: {parsed['tool']}")
print(f"  Raw arguments: {parsed['arguments']}")
print()

Step 1 - Parsed:
  Tool: search
  Raw arguments: {'_pos_0': 'ocean preferences', '_pos_1': 5, 'fuzzy_match': True, 'include_archived': False}



In [18]:
# Step 2: Map positional args to param names
param_names = ["query", "limit"]
mapped = map_positional_args(parsed["arguments"], param_names)
print("Step 2 - Mapped positional args:")
print(f"  {mapped}")
print()

Step 2 - Mapped positional args:
  {'query': 'ocean preferences', 'limit': 5, 'fuzzy_match': True, 'include_archived': False}



In [19]:
# Step 3: Nest according to schema
nested = nest_arguments_by_schema(mapped, SearchRequest)
print("Step 3 - Nested by schema:")
print(f"  {nested}")
print()

Step 3 - Nested by schema:
  {'query': 'ocean preferences', 'limit': 5, 'options': {'fuzzy_match': True, 'include_archived': False}}



In [20]:
# Step 4: Validate and create typed object
request = SearchRequest(**nested)
print("Step 4 - Validated SearchRequest:")
print(f"  query: {request.query}")
print(f"  limit: {request.limit}")
print(f"  options.fuzzy_match: {request.options.fuzzy_match}")
print(f"  options.include_archived: {request.options.include_archived}")
print()
print("✓ LLM function call successfully converted to typed request")

Step 4 - Validated SearchRequest:
  query: ocean preferences
  limit: 5
  options.fuzzy_match: True
  options.include_archived: False

✓ LLM function call successfully converted to typed request


## 9. Batch Function Calls

Parse multiple function calls for parallel execution.

In [21]:
# Simulate LLM generating batch of function calls
batch_calls = [
    'search("ocean preferences", 5)',
    'search("orchestration patterns", 10)',
    'search("khive mcp", 3)',
]

print("Parsing batch of function calls:")
print()

parsed_batch = []
for call in batch_calls:
    parsed = parse_function_call(call)
    # Map and nest
    mapped = map_positional_args(parsed["arguments"], ["query", "limit"])
    nested = nest_arguments_by_schema(mapped, SearchRequest)

    # Create request
    request = SearchRequest(**nested)
    parsed_batch.append(request)

    print(f"  {call}")
    print(f"    → query={request.query!r}, limit={request.limit}")

print()
print(f"✓ Parsed {len(parsed_batch)} function calls for parallel execution")

Parsing batch of function calls:

  search("ocean preferences", 5)
    → query='ocean preferences', limit=5
  search("orchestration patterns", 10)
    → query='orchestration patterns', limit=10
  search("khive mcp", 3)
    → query='khive mcp', limit=3

✓ Parsed 3 function calls for parallel execution
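
Once a batch is parsed, the requests can be dispatched together. The sketch below runs a hypothetical async handler, `run_search`, over plain-dict requests with `asyncio.gather`; the handler and the request shape are assumptions for illustration, not part of the library.

```python
import asyncio


async def run_search(request: dict) -> str:
    """Hypothetical tool handler; a real one would call the search backend."""
    await asyncio.sleep(0)  # yield control, standing in for I/O
    return f"{request['limit']} results for {request['query']!r}"


async def run_batch(requests: list[dict]) -> list[str]:
    # gather runs the handlers concurrently and returns results in input order.
    return await asyncio.gather(*(run_search(r) for r in requests))


requests = [
    {"query": "ocean preferences", "limit": 5},
    {"query": "orchestration patterns", "limit": 10},
]
results = asyncio.run(run_batch(requests))
for line in results:
    print(line)
```

`asyncio.gather` interleaves the handlers on a single thread, which suits I/O-bound tools; CPU-bound tools would need a process pool to run truly in parallel.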


## 10. Integration with Real Schemas

Example with realistic MCP-like schemas.

In [22]:
# Define realistic MCP schemas
from typing import Literal


class RecallOptions(BaseModel):
    memory_type: Literal["episodic", "semantic", "working"] | None = None
    limit: int = 10
    verbose: bool = False


class RecallRequest(BaseModel):
    action: Literal["search", "by_entity"] = "search"
    query: str
    options: RecallOptions | None = None


# LLM generates: recall("ocean preferences", memory_type="episodic", limit=5)
llm_call = 'recall("ocean preferences", memory_type="episodic", limit=5)'

# Parse and process
parsed = parse_function_call(llm_call)
mapped = map_positional_args(parsed["arguments"], ["query"])
nested = nest_arguments_by_schema(mapped, RecallRequest)

print(f"LLM call: {llm_call}")
print("\nNested structure:")
print(nested)

# Create typed request
recall_req = RecallRequest(**nested)
print("\n✓ Created RecallRequest:")
print(f"  action: {recall_req.action}")
print(f"  query: {recall_req.query}")
print(f"  options.memory_type: {recall_req.options.memory_type if recall_req.options else None}")
print(f"  options.limit: {recall_req.options.limit if recall_req.options else None}")

LLM call: recall("ocean preferences", memory_type="episodic", limit=5)

Nested structure:
{'query': 'ocean preferences', 'options': {'memory_type': 'episodic', 'limit': 5}}

✓ Created RecallRequest:
  action: search
  query: ocean preferences
  options.memory_type: episodic
  options.limit: 5


## Summary Checklist

**Function Call Parser Essentials:**
- ✅ Parse Pythonic function calls into structured tool invocations
- ✅ Support positional and keyword arguments
- ✅ Handle complex types (lists, dicts, nested structures)
- ✅ Extract the tool name from method calls (`client.search(...)` → `search`)
- ✅ Map positional args to parameter names
- ✅ Auto-nest flat args based on Pydantic schemas
- ✅ Support union types (`EpisodicOption | SemanticOption`)
- ✅ Type-safe AST parsing (prevents code injection)
- ✅ End-to-end LLM function calling workflow
- ✅ Batch parsing for parallel execution

**Benefits:**
- Natural syntax for LLMs (Python calls instead of JSON)
- Automatic schema alignment
- Type safety and validation
- Reduces LLM output verbosity

**Next Steps:**
- See `_schema_to_model.py` for schema extraction
- See `_typescript.py` for TypeScript schema generation
- See MCP tool handlers for real-world usage