# Fiddler Evals SDK - Using Fiddler Evaluators

This quickstart shows how to set up and use Fiddler's built-in evaluators to assess your AI application outputs.

**Prerequisites:**
- A Fiddler account with API access
- An LLM credential configured in **Settings > LLM Gateway**

## 1. Setup

Connect to Fiddler and initialize evaluators:

In [None]:
import pandas as pd
from fiddler_evals import init
from fiddler_evals.evaluators import RAGFaithfulness, AnswerRelevance

URL = ''  # e.g., 'https://your-org.fiddler.ai'
TOKEN = ''  # From Settings > Credentials
LLM_CREDENTIAL_NAME = ''  # From Settings > LLM Gateway
LLM_MODEL_NAME = ''  # e.g., 'fiddler/llama3.1-8b'

init(url=URL, token=TOKEN)

# Initialize evaluators
faithfulness = RAGFaithfulness(model=LLM_MODEL_NAME, credential=LLM_CREDENTIAL_NAME)
relevance = AnswerRelevance(model=LLM_MODEL_NAME, credential=LLM_CREDENTIAL_NAME)
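
Pasting a token directly into a shared notebook makes it easy to leak. As a minimal sketch, the settings could instead be read from environment variables. The variable names below (`FIDDLER_URL`, `FIDDLER_TOKEN`, `FIDDLER_LLM_CREDENTIAL`, `FIDDLER_LLM_MODEL`) and the `load_settings` helper are hypothetical and not defined by the SDK; substitute whatever your environment uses.

```python
import os

# Hypothetical environment variable names; the SDK does not define these.
REQUIRED_VARS = (
    'FIDDLER_URL',
    'FIDDLER_TOKEN',
    'FIDDLER_LLM_CREDENTIAL',
    'FIDDLER_LLM_MODEL',
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values can then be assigned to `URL`, `TOKEN`, `LLM_CREDENTIAL_NAME`, and `LLM_MODEL_NAME` before calling `init`.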

## 2. Test Cases

Create a small DataFrame of sample cases covering one grounded answer, one hallucination, and one irrelevant answer:

In [None]:
test_cases = pd.DataFrame(
    [
        {
            'scenario': '✅ Perfect Match',
            'query': 'What is the capital of France?',
            'context': ['Paris is the capital of France.'],
            'response': 'The capital of France is Paris.',
        },
        {
            'scenario': '❌ Hallucination',
            'query': 'What are the office hours?',
            'context': ['We are closed on weekends.'],
            'response': 'We are open 9 AM to 5 PM every day.',
        },
        {
            'scenario': '❌ Irrelevant Answer',
            'query': 'How do I reset my password?',
            'context': ['To reset, click "Forgot Password".'],
            'response': 'Our system is very secure and uses 256-bit encryption.',
        },
    ]
)
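
Before spending evaluator calls on your own data, it can help to check that each case has the fields used in this quickstart. The sketch below is an optional pre-flight check, not part of the SDK; `validate_cases` and `REQUIRED_KEYS` are hypothetical names, and the rules simply mirror the shape of the sample cases above (`context` as a list of strings).

```python
REQUIRED_KEYS = ('scenario', 'query', 'context', 'response')


def validate_cases(records):
    """Return a list of problems found; an empty list means the cases look well-formed."""
    problems = []
    for i, rec in enumerate(records):
        # Every case needs the four fields used in this quickstart.
        for key in REQUIRED_KEYS:
            if key not in rec:
                problems.append(f"case {i}: missing '{key}'")
        # The sample cases pass context as a list of document strings.
        context = rec.get('context')
        if context is not None and not (
            isinstance(context, list) and all(isinstance(doc, str) for doc in context)
        ):
            problems.append(f"case {i}: 'context' should be a list of strings")
        # Blank queries or responses are almost certainly data errors.
        for key in ('query', 'response'):
            value = rec.get(key)
            if isinstance(value, str) and not value.strip():
                problems.append(f"case {i}: '{key}' is empty")
    return problems
```

With the DataFrame above, this could be called as `validate_cases(test_cases.to_dict('records'))`.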

## 3. Evaluate

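Score each test case with both evaluators. Each call to `score` returns a result with a numeric `value` (between 0 and 1) and a `label`. The helper below marks a row `HEALTHY` only when both values are above 0.6, and `ISSUE DETECTED` otherwise: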

In [None]:
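def evaluate_row(row):
    """Score one test case and summarize the result as a pandas Series."""
    # Faithfulness: is the response supported by the retrieved documents?
    f_score = faithfulness.score(
        user_query=row['query'],
        rag_response=row['response'],
        retrieved_documents=row['context'],
    )

    # Relevance: does the response address the user's query?
    r_score = relevance.score(user_query=row['query'], rag_response=row['response'])

    # A row is healthy only if both scores are above the 0.6 threshold.
    is_healthy = f_score.value > 0.6 and r_score.value > 0.6

    return pd.Series(
        {
            'Faithfulness': f_score.label,
            'Relevance': r_score.label,
            'Status': 'HEALTHY' if is_healthy else 'ISSUE DETECTED',
        }
    )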
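
The status rule can be tried without calling any evaluator. The sketch below uses a stand-in `StubScore` class that mirrors only the `value` and `label` fields used above; the class, the `status_for` helper, and the label strings are all hypothetical and are not the SDK's own types or outputs. It shows that the comparison is strict, so a value of exactly 0.6 does not count as healthy.

```python
from dataclasses import dataclass


@dataclass
class StubScore:
    """Stand-in for an evaluator result; mirrors only `value` and `label`."""
    value: float
    label: str


def status_for(scores, threshold=0.6):
    """Return 'HEALTHY' only if every score's value is strictly above the threshold."""
    return 'HEALTHY' if all(s.value > threshold for s in scores) else 'ISSUE DETECTED'


# Label strings here are made up for illustration.
print(status_for([StubScore(0.9, 'high'), StubScore(0.8, 'high')]))    # HEALTHY
print(status_for([StubScore(0.9, 'high'), StubScore(0.6, 'medium')]))  # ISSUE DETECTED
```

Passing `threshold` as a parameter makes it easy to experiment with a stricter or looser cutoff than the 0.6 used in this quickstart.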

## 4. View Results

Join the evaluation results onto the test cases and display them with the `Status` column color-coded:

In [None]:
results = test_cases.join(test_cases.apply(evaluate_row, axis=1))


def color_status(val):
    color = '#d9534f' if val == 'ISSUE DETECTED' else '#5cb85c'
    return f'background-color: {color}; color: white; font-weight: bold'


results[['scenario', 'Faithfulness', 'Relevance', 'Status']].style.map(
    color_status, subset=['Status']
)
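
`Styler.map` was introduced in pandas 2.1 as the replacement for `Styler.applymap`, so the cell above may raise an `AttributeError` on older pandas versions. A minimal compatibility sketch, using a hypothetical helper named `style_cells`, picks whichever method the installed version provides:

```python
def style_cells(styler, func, subset=None):
    """Apply an element-wise CSS function via Styler.map, or Styler.applymap on older pandas."""
    method = getattr(styler, 'map', None)
    if method is None:
        # pandas < 2.1 only provides the older method name.
        method = styler.applymap
    return method(func, subset=subset)
```

It could be used as `style_cells(results[['scenario', 'Faithfulness', 'Relevance', 'Status']].style, color_status, subset=['Status'])`.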

## Next Steps

- Try more of Fiddler's out-of-the-box evaluators
- Try custom evaluators in [Part 2: Custom Judge](./Fiddler_Quickstart_Evals_Pt2_CustomJudge.ipynb)
- Try Datasets and Experiments in [Part 3: Datasets and Experiments](./Fiddler_Quickstart_Evals_Pt3_Datasets_Experiments.ipynb)
