# 01. Basic TensorZero Gateway

This notebook demonstrates basic TensorZero gateway functionality including:
- Setting up the client
- Making inference calls
- Using different providers
- Understanding the response structure

In [1]:
import os
import json
from tensorzero import TensorZeroGateway
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Verify API keys are set
api_keys = {
    "OpenAI": os.getenv("OPENAI_API_KEY"),
    "Anthropic": os.getenv("ANTHROPIC_API_KEY"),
    "xAI": os.getenv("XAI_API_KEY")
}

for provider, key in api_keys.items():
    if key:
        print(f"✅ {provider} API key is set")
    else:
        print(f"✗ {provider} API key is missing")

✅ OpenAI API key is set
✅ Anthropic API key is set
✅ xAI API key is set
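
The check above only prints a status line per provider. As a minimal sketch, the same test can be wrapped in a helper that returns the names of the missing variables, so a later cell can stop early. The helper name `missing_keys` is hypothetical and not part of TensorZero.

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(names: Iterable[str], environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the variable names that are unset or empty in environ."""
    return [name for name in names if not environ.get(name)]


# Same three variables the cell above checks
required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "XAI_API_KEY"]
missing = missing_keys(required)
if missing:
    print("Missing keys:", ", ".join(missing))
else:
    print("All API keys are set")
```

Passing a plain dict as `environ` makes the helper easy to try without touching the real environment.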


## 1. Initialize TensorZero Client

TensorZero can be used in two modes:
1. **Standalone Gateway**: Connect to a running gateway service
2. **Embedded Gateway**: Run gateway within your Python process

In [None]:
# Option 1: Connect to standalone gateway (requires docker compose up)
# Note: the old constructor is deprecated; use build_http instead
try:
    # Building the client is the only step that can fail in this cell;
    # no inference request is sent yet
    gateway_client = TensorZeroGateway.build_http(gateway_url="http://localhost:3000")
    print("✅ TensorZero gateway client ready")
    print("🌐 Gateway API: http://localhost:3000")
    print("🎨 TensorZero UI: http://localhost:4000")
    print("📊 ClickHouse: http://localhost:8123")
except Exception as e:
    print(f"✗ Failed to connect: {e}")
    print("Make sure to run 'poe up' or 'docker compose up' first!")
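
The cell above lists the service URLs without probing them. A rough sketch of an explicit probe follows, using only the standard library. It assumes the gateway answers `GET /health` with HTTP 200 when it is up. That path is an assumption here, so check it against your gateway version. The function name `gateway_is_healthy` is hypothetical.

```python
import urllib.request


def gateway_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    # Assumption: the gateway serves a health endpoint at /health.
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections all derive from OSError.
        return False
```

Calling `gateway_is_healthy("http://localhost:3000")` before the first inference gives a clear yes/no answer instead of a stack trace from a later cell.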

In [None]:
# Option 2: Embedded gateway (runs in-process)
embedded_client = TensorZeroGateway.build_embedded(
    clickhouse_url="http://chuser:chpassword@localhost:8123/tensorzero",
    config_file="../config/tensorzero.toml",
)

# For this notebook, we'll use the standalone gateway
client = gateway_client

## 2. Basic Chat Inference

Let's start with a simple chat completion using our configured functions.

In [None]:
# Test all 8 configured variants across the three providers
variants_to_test = [
    ("gpt4", "OpenAI GPT-4"),
    ("gpt4_mini", "OpenAI GPT-4o Mini"),
    ("claude3_opus", "Anthropic Claude 3 Opus"),
    ("claude3_sonnet", "Anthropic Claude 3 Sonnet"),
    ("claude3_haiku", "Anthropic Claude 3 Haiku"),
    ("grok3_mini", "xAI Grok-3 Mini"),
    ("grok_code_fast", "xAI Grok Code Fast"),
    ("grok4", "xAI Grok-4"),
]

test_prompt = "Write a haiku about machine learning"

for variant_name, display_name in variants_to_test:
    try:
        response = client.inference(
            function_name="chat",
            variant_name=variant_name,  # Specify which variant to use
            input={
                "messages": [
                    {"role": "user", "content": test_prompt}
                ]
            }
        )
        print(f"\n{'='*50}")
        print(f"{display_name} ({variant_name}):\n")
        print(response.content[0].text if response.content else "No content")
    except Exception as e:
        print(f"\n{'='*50}")
        print(f"{display_name} ({variant_name}): ❌ Failed - {str(e)[:100]}...")
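
The loop above reads `response.content[0].text`, which assumes the first content block is a text block. A slightly more defensive sketch joins every block that has a `text` attribute. The helper name `response_text` is hypothetical, and the `SimpleNamespace` objects are stand-ins for illustration, not real TensorZero response types.

```python
from types import SimpleNamespace


def response_text(response, default: str = "") -> str:
    """Join the text of every block in response.content that has a text attribute."""
    blocks = getattr(response, "content", None) or []
    texts = [b.text for b in blocks if getattr(b, "text", None) is not None]
    return "\n".join(texts) if texts else default


# Stand-in object for illustration only; a real response comes from client.inference(...)
fake = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="Weights shift in silence")]
)
print(response_text(fake, default="No content"))
```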

## 3. Using Specific Variants

We can request specific model variants for our functions.

In [None]:
# Test different providers
variants_to_test = [
    ("gpt4", "OpenAI GPT-4"),
    ("gpt35", "OpenAI GPT-3.5"),
    ("claude3_opus", "Anthropic Claude 3 Opus"),
    ("claude3_sonnet", "Anthropic Claude 3 Sonnet"),
    ("grok", "xAI Grok")
]

test_prompt = "Write a haiku about machine learning"

for variant_name, display_name in variants_to_test:
    try:
        response = client.inference(
            function_name="chat",
            variant_name=variant_name,  # Specify which variant to use
            input={
                "messages": [
                    {"role": "user", "content": test_prompt}
                ]
            }
        )
        print(f"\n{'='*50}")
        print(f"{display_name} ({variant_name}):\n")
        print(response.content)
    except Exception as e:
        print(f"\n{'='*50}")
        print(f"{display_name} ({variant_name}): ❌ Failed - {e}")

In [None]:
# Test sentiment analysis with structured output, one variant per provider
test_texts = [
    "I absolutely love using TensorZero! It makes LLM development so much easier.",
    "The service is down again. This is really frustrating and impacting our work.",
    "The documentation is okay, but could use more examples.",
    "Mixed feelings - great features but the setup was complicated."
]

# Test with different providers that support structured output
structured_variants = [
    ("gpt4_json", "OpenAI GPT-4"),
    ("claude_json", "Anthropic Claude"),  
    ("grok3_json", "xAI Grok-3 (with structured output!)")
]

for variant_name, provider_name in structured_variants:
    print(f"\n{'='*50}")
    print(f"Testing {provider_name} - Structured Output")
    print("="*50)
    
    for text in test_texts[:2]:  # Test first 2 texts
        try:
            response = client.inference(
                function_name="analyze_sentiment",
                variant_name=variant_name,
                input={
                    "system": {"text": text},  # Note: system input may be required
                    "messages": [
                        {"role": "user", "content": text}
                    ]
                }
            )
            
            # Parse the JSON response
            result = json.loads(response.content[0].text)
            
            print(f"\nText: {text[:50]}...")
            print(f"Sentiment: {result['sentiment']} (confidence: {result['confidence']:.2f})")
            print(f"Explanation: {result['explanation']}")
        except Exception as e:
            print(f"\n❌ Failed for '{text[:30]}...': {str(e)[:100]}")

## 4. Structured Output with JSON Schema

TensorZero supports structured outputs using JSON schema validation.

In [None]:
# Test sentiment analysis with structured output
test_texts = [
    "I absolutely love using TensorZero! It makes LLM development so much easier.",
    "The service is down again. This is really frustrating and impacting our work.",
    "The documentation is okay, but could use more examples.",
    "Mixed feelings - great features but the setup was complicated."
]

for text in test_texts:
    response = client.inference(
        function_name="analyze_sentiment",
        input={
            "messages": [
                {"role": "user", "content": text}
            ]
        }
    )
    
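    # Parse the JSON response (content is a list of blocks, as in the cell above;
    # this assumes the first block carries the JSON text)
    result = json.loads(response.content[0].text)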
    
    print(f"\nText: {text[:50]}...")
    print(f"Sentiment: {result['sentiment']} (confidence: {result['confidence']:.2f})")
    print(f"Explanation: {result['explanation']}")
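
The printing code above indexes `result['sentiment']`, `result['confidence']`, and `result['explanation']` directly, so a malformed reply surfaces as a `KeyError` or a format error. As a small sketch, the parse step can check those three fields first. The helper name `parse_sentiment` is hypothetical, and the field list is taken from the prints above rather than from the schema file.

```python
import json


def parse_sentiment(raw: str) -> dict:
    """Parse a JSON string and check the three fields the notebook prints."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    missing = {"sentiment", "confidence", "explanation"} - result.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    confidence = result["confidence"]
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    return result


# Hand-written sample payload for illustration, not real model output
sample = '{"sentiment": "positive", "confidence": 0.93, "explanation": "Enthusiastic wording"}'
parsed = parse_sentiment(sample)
print(f"{parsed['sentiment']} ({parsed['confidence']:.2f})")
```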

## 5. Multi-turn Conversations

TensorZero supports multi-turn conversations with message history.

In [None]:
# Build a conversation
# The system prompt goes in the input's "system" field; "messages" holds only
# user and assistant turns. This assumes the chat function accepts a plain-text
# system prompt.
system_prompt = "You are a helpful AI assistant specializing in LLM infrastructure."
messages = [
    {"role": "user", "content": "What are the key components of TensorZero?"},
]

# First turn
response1 = client.inference(
    function_name="chat",
    variant_name="gpt4",
    input={"system": system_prompt, "messages": messages}
)

# content is a list of content blocks; take the text of the first block
answer1 = response1.content[0].text
print("Assistant:", answer1[:200] + "...\n")

# Add response to conversation
messages.append({"role": "assistant", "content": answer1})
messages.append({"role": "user", "content": "Tell me more about the observability features."})

# Second turn
response2 = client.inference(
    function_name="chat",
    variant_name="gpt4",
    input={"system": system_prompt, "messages": messages}
)

print("Follow-up response:", response2.content[0].text[:200] + "...")
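
The append-then-call pattern above repeats for every turn. A minimal sketch of a helper that keeps the history consistent is shown here. The names `run_turn` and `echo_send` are hypothetical, and `echo_send` is only a stand-in for a function that would wrap `client.inference(...)` and return the reply text.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_turn(
    messages: List[Message],
    user_text: str,
    send: Callable[[List[Message]], str],
) -> str:
    """Append a user message, get a reply via send, and record it in the history."""
    messages.append({"role": "user", "content": user_text})
    reply = send(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply


# Stand-in for a wrapper around client.inference(...); it just echoes the last message.
def echo_send(history: List[Message]) -> str:
    return f"(reply to: {history[-1]['content']})"


history: List[Message] = []
run_turn(history, "What are the key components of TensorZero?", echo_send)
run_turn(history, "Tell me more about the observability features.", echo_send)
print([m["role"] for m in history])
```

Injecting `send` keeps the history logic testable without a running gateway.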

## 6. Response Metadata and Observability

TensorZero provides rich metadata with each response for observability.
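
As a rough sketch, the fields worth logging can be gathered into one dictionary. Only `inference_id` appears elsewhere in this notebook. The other attribute names (`episode_id`, `variant_name`, `usage.input_tokens`, `usage.output_tokens`) are assumptions about the response object, so the helper reads them with `getattr` and falls back to `None`. The helper name `summarize_response` and the stand-in object are hypothetical.

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Collect commonly logged fields, using None for any that are absent."""
    usage = getattr(response, "usage", None)
    return {
        "inference_id": getattr(response, "inference_id", None),
        # The attributes below are assumed; missing ones become None.
        "episode_id": getattr(response, "episode_id", None),
        "variant_name": getattr(response, "variant_name", None),
        "input_tokens": getattr(usage, "input_tokens", None),
        "output_tokens": getattr(usage, "output_tokens", None),
    }


# Stand-in object with made-up values; a real response comes from client.inference(...)
fake_response = SimpleNamespace(
    inference_id="example-inference-id",
    variant_name="gpt4",
    usage=SimpleNamespace(input_tokens=12, output_tokens=34),
)
for field, value in summarize_response(fake_response).items():
    print(f"{field}: {value}")
```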

## Key Learnings

1. **Gateway Modes**: TensorZero supports both standalone and embedded gateway modes
2. **Multi-Provider**: 8 variants configured across OpenAI, Anthropic, and xAI
3. **Structured Output**: JSON schema validation for reliable outputs (all Grok models support this!)
4. **Observability**: Each inference has a unique ID for tracking
5. **Feedback Loop**: Built-in feedback collection for optimization
6. **Client API**: Use `TensorZeroGateway.build_http()` (constructor is deprecated)

## Advanced Capabilities
- **xAI Grok Models**: All support structured output, reasoning, and function calling
- **grok-4-0709**: Supports image input + text output
- **JSON Functions**: Configured with schema files in `config/functions/`
- **Services**: Gateway (3000), UI (4000), ClickHouse (8123)


## 7. Error Handling and Fallbacks

Let's test how TensorZero handles errors and provider failures.

In [None]:
# Test with invalid variant
try:
    response = client.inference(
        function_name="chat",
        variant_name="non_existent_variant",
        input={
            "messages": [{"role": "user", "content": "Test"}]
        }
    )
except Exception as e:
    print(f"Expected error for invalid variant: {e}")

# Test with invalid function
try:
    response = client.inference(
        function_name="non_existent_function",
        input={
            "messages": [{"role": "user", "content": "Test"}]
        }
    )
except Exception as e:
    print(f"\nExpected error for invalid function: {e}")
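
The cell above shows that a bad variant or function name raises an exception. One way to build on that, sketched here on the client side, is to try a list of variants in order and keep the first that succeeds. The helper name `first_successful` and the fake `call` function are hypothetical. In real use, `call` would wrap `client.inference(..., variant_name=name)`.

```python
from typing import Callable, Dict, Iterable, Tuple


def first_successful(
    variants: Iterable[str],
    call: Callable[[str], object],
) -> Tuple[str, object]:
    """Try each variant in order and return (variant_name, result) for the first success."""
    errors: Dict[str, str] = {}
    for name in variants:
        try:
            return name, call(name)
        except Exception as exc:  # record the failure and move on to the next variant
            errors[name] = str(exc)
    raise RuntimeError(f"all variants failed: {errors}")


# Fake call for illustration: only "claude3_haiku" succeeds.
def fake_call(variant_name: str) -> str:
    if variant_name != "claude3_haiku":
        raise ValueError(f"unknown variant {variant_name}")
    return "ok"


used, result = first_successful(["non_existent_variant", "claude3_haiku"], fake_call)
print(used, result)
```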

## 8. Collecting Feedback

TensorZero allows collecting feedback on inferences for optimization.

In [None]:
# Make an inference
response = client.inference(
    function_name="creative_write",
    input={
        "messages": [
            {"role": "user", "content": "Write a creative tagline for TensorZero"}
        ]
    }
)

print(f"Tagline: {response.content}")
print(f"\nInference ID: {response.inference_id}")

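# Collect feedback: one call per metric, each with metric_name and value.
# "comment" is a built-in metric; "score", "helpful", and "creative" are
# assumed to be declared as metrics in tensorzero.toml.
feedback_values = {
    "score": 0.9,
    "helpful": True,
    "creative": True,
    "comment": "Great tagline!",
}

for metric_name, value in feedback_values.items():
    try:
        client.feedback(
            metric_name=metric_name,
            value=value,
            inference_id=response.inference_id,
        )
        print(f"✓ Feedback submitted for '{metric_name}'")
    except Exception as e:
        print(f"✗ Failed to submit feedback for '{metric_name}': {e}")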

## Key Learnings

1. **Gateway Modes**: TensorZero supports both standalone and embedded gateway modes
2. **Multi-Provider**: Easy to switch between providers using variants
3. **Structured Output**: JSON schema validation for reliable outputs
4. **Observability**: Each inference has a unique ID for tracking
5. **Feedback Loop**: Built-in feedback collection for optimization

Next notebook: We'll explore multi-provider testing and performance comparisons.