# Basic Evaluation with RAIL Score

This notebook demonstrates basic content evaluation using the RAIL Score Python SDK.

## What You'll Learn

- How to initialize the RAIL Score client
- How to perform basic evaluations across all 8 dimensions
- How the response is structured
- How to interpret scores and confidence levels

## Setup

First, install the RAIL Score SDK if you haven't already:

In [None]:
# Uncomment to install
# !pip install rail-score

In [None]:
import os
from rail_score import RailScore

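# Initialize client
# Option 1: Read the key from an environment variable (recommended)
api_key = os.getenv("RAIL_API_KEY")
if not api_key:
    # os.getenv returns None when the variable is unset; fail early with a clear message
    raise RuntimeError("Set the RAIL_API_KEY environment variable before running this notebook.")
client = RailScore(api_key=api_key)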

# Option 2: Direct API key (not recommended for production)
# client = RailScore(api_key="your-api-key-here")

## Example 1: Basic Evaluation

Let's evaluate a simple AI-generated statement:

In [None]:
# Content to evaluate
content = "Our AI system prioritizes user privacy and data security through encryption and access controls."

# Perform evaluation
result = client.evaluation.basic(content)

# Display overall score
print(f"Overall RAIL Score: {result.rail_score.score:.2f}/10")
print(f"Confidence: {result.rail_score.confidence:.2%}")

## Example 2: Exploring Dimension Scores

Let's look at the individual dimension scores:

In [None]:
print("\nDimension Scores:")
print("-" * 60)

for dim_name, dim_score in result.scores.items():
    print(f"\n{dim_name.upper().replace('_', ' ')}")
    print(f"  Score: {dim_score.score:.2f}/10")
    print(f"  Confidence: {dim_score.confidence:.2%}")
    print(f"  Explanation: {dim_score.explanation}")
    
    if dim_score.issues:
        print(f"  Issues: {', '.join(dim_score.issues)}")
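
Once the per-dimension scores are printed, a common follow-up is to pick out the dimensions that need attention. The sketch below is one way to do that, not part of the SDK: `DimensionScore` is a stand-in class that mirrors only the `score` and `confidence` attributes used above, the sample values are invented placeholders, and the thresholds (`min_score=7.0`, `min_confidence=0.6`) are arbitrary assumptions to tune for your use case.

```python
from dataclasses import dataclass


@dataclass
class DimensionScore:
    """Hypothetical stand-in for the SDK's per-dimension result."""
    score: float
    confidence: float


def flag_dimensions(scores, min_score=7.0, min_confidence=0.6):
    """Return dimensions with a low score or low confidence, lowest score first."""
    flagged = [
        name
        for name, dim in scores.items()
        if dim.score < min_score or dim.confidence < min_confidence
    ]
    return sorted(flagged, key=lambda name: scores[name].score)


# Placeholder values for illustration only, not real evaluation output
sample_scores = {
    "privacy": DimensionScore(score=8.9, confidence=0.92),
    "transparency": DimensionScore(score=6.1, confidence=0.85),
    "fairness": DimensionScore(score=7.4, confidence=0.41),
    "safety": DimensionScore(score=5.2, confidence=0.88),
}

for name in flag_dimensions(sample_scores):
    dim = sample_scores[name]
    print(f"{name}: score={dim.score:.2f}, confidence={dim.confidence:.0%}")
```

With a real response, `flag_dimensions(result.scores)` should work unchanged as long as each value exposes `score` and `confidence`, as the loop above suggests.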

## Example 3: Visualizing Scores

Let's create a bar chart to visualize the dimension scores:

In [None]:
import matplotlib.pyplot as plt

# Extract dimension names and scores
dimensions = [dim.replace('_', ' ').title() for dim in result.scores.keys()]
scores = [dim_score.score for dim_score in result.scores.values()]

# Create bar chart
plt.figure(figsize=(12, 6))
plt.bar(dimensions, scores, color='steelblue')

# Add a horizontal line at the overall score
plt.axhline(y=result.rail_score.score, color='red', linestyle='--', 
            label=f'Overall Score: {result.rail_score.score:.2f}')

# Customize chart
plt.xlabel('RAIL Dimensions')
plt.ylabel('Score (0-10)')
plt.title('RAIL Score Evaluation by Dimension')
plt.xticks(rotation=45, ha='right')
plt.ylim(0, 10)
plt.legend()
plt.tight_layout()
plt.grid(axis='y', alpha=0.3)
plt.show()

## Example 4: Comparing Multiple Contents

Let's evaluate and compare multiple AI-generated statements:

In [None]:
contents = [
    "Our AI makes automated decisions without human oversight.",
    "We collect user data to improve our services with explicit consent.",
    "The AI model is a black box - we don't know how it works.",
    "Our system provides clear explanations for all decisions and protects user privacy."
]

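# Evaluate each statement
# (distinct loop names avoid overwriting `content` and `result` from earlier cells)
results = []
for i, text in enumerate(contents, 1):
    item_result = client.evaluation.basic(text)
    results.append(item_result)
    print(f"Content {i}: {item_result.rail_score.score:.2f}/10")
    # Only add an ellipsis when the text is actually truncated
    preview = text if len(text) <= 60 else text[:60] + "..."
    print(f"  Text: {preview}")
    print()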
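
To make the comparison explicit, the statements can be ranked by their overall score. The following is a minimal sketch that works on plain lists so it runs without an API key; `rank_by_score` is a hypothetical helper, and the scores in `sample_scores` are invented placeholders rather than real evaluation output.

```python
def rank_by_score(texts, scores):
    """Pair each text with its score and sort from highest to lowest score."""
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    return sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)


sample_texts = [
    "Our AI makes automated decisions without human oversight.",
    "We collect user data to improve our services with explicit consent.",
    "The AI model is a black box - we don't know how it works.",
    "Our system provides clear explanations for all decisions and protects user privacy.",
]
# Placeholder scores for illustration only
sample_scores = [4.2, 7.8, 3.1, 9.0]

for rank, (text, score) in enumerate(rank_by_score(sample_texts, sample_scores), 1):
    print(f"{rank}. {score:.2f}/10  {text[:50]}")
```

With the results collected above, the same helper could be called as `rank_by_score(contents, [r.rail_score.score for r in results])`.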

## Example 5: Understanding Metadata

Each evaluation response includes request metadata such as the request ID, plan tier, credits consumed, and timing information:

In [None]:
# Get metadata from the last result
metadata = results[-1].metadata

print("Request Metadata:")
print(f"  Request ID: {metadata.req_id}")
print(f"  Plan Tier: {metadata.tier}")
print(f"  Credits Consumed: {metadata.credits_consumed}")
print(f"  Processing Time: {metadata.processing_time_ms:.0f}ms")
print(f"  Queue Wait Time: {metadata.queue_wait_time_ms:.0f}ms")
print(f"  Timestamp: {metadata.timestamp}")

## Example 6: Custom Weights

You can apply custom importance weights to different dimensions:

In [None]:
# Define custom weights (must sum to 100)
custom_weights = {
    "safety": 35,
    "privacy": 30,
    "reliability": 15,
    "accountability": 10,
    "transparency": 5,
    "fairness": 3,
    "inclusivity": 1,
    "user_impact": 1
}

# Evaluate with custom weights
weighted_result = client.evaluation.basic(
    "Healthcare AI system for patient diagnosis",
    weights=custom_weights
)

print(f"Weighted RAIL Score: {weighted_result.rail_score.score:.2f}/10")
print("\nTop 3 Dimensions (by weight):")
# Derive the top three from the weights instead of hard-coding them
for dim in sorted(custom_weights, key=custom_weights.get, reverse=True)[:3]:
    score = weighted_result.scores[dim]
    print(f"  {dim.title()}: {score.score:.2f}/10 (weight: {custom_weights[dim]}%)")
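
Because the weights must sum to 100, it can help to check them locally before spending credits on a request. The sketch below shows one such check, plus a plain weighted mean to illustrate how weights shift an aggregate. Both `validate_weights` and `weighted_average` are hypothetical helpers, the per-dimension scores are invented, and the weighted mean is an assumption for illustration: this notebook does not state how the SDK actually combines dimension scores.

```python
import math


def validate_weights(weights, total=100):
    """Raise ValueError if any weight is negative or the weights do not sum to `total`."""
    negative = [name for name, w in weights.items() if w < 0]
    if negative:
        raise ValueError(f"Negative weights: {negative}")
    actual = sum(weights.values())
    if not math.isclose(actual, total):
        raise ValueError(f"Weights sum to {actual}, expected {total}")


def weighted_average(scores, weights):
    """Weighted mean of per-dimension scores (illustrative, not the SDK's formula)."""
    validate_weights(weights)
    missing = sorted(set(weights) - set(scores))
    if missing:
        raise KeyError(f"No score for dimensions: {missing}")
    return sum(scores[name] * w for name, w in weights.items()) / sum(weights.values())


example_weights = {
    "safety": 35, "privacy": 30, "reliability": 15, "accountability": 10,
    "transparency": 5, "fairness": 3, "inclusivity": 1, "user_impact": 1,
}
# Placeholder per-dimension scores for illustration only
example_scores = {
    "safety": 9.0, "privacy": 8.0, "reliability": 7.0, "accountability": 6.0,
    "transparency": 5.0, "fairness": 5.0, "inclusivity": 5.0, "user_impact": 5.0,
}

validate_weights(example_weights)
print(f"Illustrative weighted mean: {weighted_average(example_scores, example_weights):.2f}/10")
```

Calling `validate_weights(custom_weights)` before the request turns a mistyped weight into an immediate local error.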

## Example 7: Error Handling

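The SDK raises dedicated exceptions for authentication, credit, validation, and rate-limit failures. Catch the specific exceptions first and keep a generic handler as the last resort: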

In [None]:
from rail_score import (
    AuthenticationError,
    InsufficientCreditsError,
    ValidationError,
    RateLimitError
)

try:
    result = client.evaluation.basic("Test content")
    print(f"✅ Evaluation successful: {result.rail_score.score:.2f}/10")
    
except AuthenticationError:
    print("❌ Authentication failed. Check your API key.")
    
except InsufficientCreditsError as e:
    print(f"❌ Insufficient credits. Balance: {e.balance}, Required: {e.required}")
    
except ValidationError as e:
    print(f"❌ Validation error: {e}")
    
except RateLimitError as e:
    print(f"❌ Rate limit exceeded. Retry after {e.retry_after}s")
    
except Exception as e:
    print(f"❌ Unexpected error: {e}")
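
Rate-limit errors are usually worth retrying rather than just reporting. The sketch below shows one possible retry loop that waits for the `retry_after` interval before trying again. To stay runnable offline it defines its own stand-in `RateLimitError` with the `retry_after` attribute used above; `call_with_retry` and `flaky_evaluation` are hypothetical names, not part of the SDK.

```python
import time


class RateLimitError(Exception):
    """Stand-in for rail_score.RateLimitError so this sketch runs without the SDK."""

    def __init__(self, retry_after):
        super().__init__(f"Rate limited; retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(), waiting retry_after seconds after each rate-limit error.

    Re-raises the last RateLimitError once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)


# Simulated call that is rate limited twice before succeeding
attempts = {"count": 0}


def flaky_evaluation():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError(retry_after=0.01)
    return 8.5  # placeholder score


score = call_with_retry(flaky_evaluation)
print(f"Succeeded after {attempts['count']} attempts: {score:.2f}/10")
```

With the real client, the call would look like `call_with_retry(lambda: client.evaluation.basic("Test content"))`, with the stand-in class replaced by the `RateLimitError` imported from `rail_score`.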

## Next Steps

Now that you understand basic evaluation, explore:

- **02_dimension_specific.ipynb** - Focus on specific dimensions
- **03_batch_processing.ipynb** - Evaluate multiple items efficiently
- **04_compliance_checks.ipynb** - Check GDPR, HIPAA, CCPA compliance
- **05_rag_evaluation.ipynb** - Detect hallucinations in RAG systems

## Resources

- [API Reference](https://responsibleailabs.ai/docs/api-reference)
- [Full Documentation](https://responsibleailabs.ai/docs)
- [GitHub Repository](https://github.com/Responsible-AI-Labs/rail-score)