### Opik Configuration Guide

This notebook demonstrates the basic usage of the `opik` library. We'll cover:

- Logging test cases
- Running evaluations
- Viewing and saving results locally
- Posting Opik metric scores to the CognitiveView Trace metrics API


In [2]:
!pip install opik





In [None]:
import os
OPENAI_API_KEY = "your_openai_api_key_here"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
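
Hardcoding a key in a notebook makes it easy to commit by accident. As a minimal sketch of an alternative, the helper below reuses a key that is already exported in the environment and only prompts when none is set. The function name `ensure_api_key` is hypothetical and not part of `opik`.

```python
import os
from getpass import getpass


def ensure_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, prompting for it only if it is unset."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key never appears in cell output.
        key = getpass(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before creating any metrics leaves `OPENAI_API_KEY` set for the rest of the session.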

## Metric Evaluation Using `opik.evaluation.metrics`

This cell scores a generated text (`output`) using the `opik.evaluation.metrics` package: an exact-match check against a reference text (`reference`) and a moderation check on the output itself.

### Inputs

- **Output**: A sample response that might come from a language model.
- **Reference**: A human-written or expected answer used for comparison.

### Metrics Used

- **Equals**: Checks whether the output and reference are exactly equal. Here it is created with `case_sensitive=False`, so differences in letter case are ignored.
- **Moderation**: Scores the output for moderation concerns (e.g., safety, appropriateness).
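
To make the exact-match idea concrete, here is a plain-Python sketch of what a case-insensitive equality score computes. It is an illustration only, not Opik's implementation, and `exact_match_score` is a hypothetical name.

```python
def exact_match_score(output: str, reference: str, case_sensitive: bool = False) -> float:
    """Return 1.0 when the two strings are identical, otherwise 0.0."""
    if not case_sensitive:
        # Normalise case on both sides so "Paris" and "paris" compare equal.
        output, reference = output.lower(), reference.lower()
    return 1.0 if output == reference else 0.0


print(exact_match_score("Paris", "paris"))                       # 1.0
print(exact_match_score("Paris", "paris", case_sensitive=True))  # 0.0
print(exact_match_score("Paris is safe.", "Paris is the capital of France."))  # 0.0
```

Any wording difference beyond letter case yields 0.0, so an exact-match metric is best suited to short, canonical answers rather than free-form paragraphs.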

### Processing Flow

1. A list of metric instances is created.
2. Each metric is applied to the `output` and `reference`.
3. The result (`.value`) from each metric is stored in a dictionary for easy access.


In [4]:
from opik.evaluation.metrics import Equals, Moderation

# Example output and reference
output = """Paris is the capital of France and one of the most visited cities in the world. 
While some tourists express concerns about safety in certain neighborhoods, Paris remains a vibrant and welcoming city. 
Visitors are advised to stay vigilant, especially in crowded areas, but overall, the city is considered safe for travelers."""
reference = """Paris is the capital of France and a major tourist destination. 
While no city is entirely without risk, Paris is generally safe for visitors who take standard precautions."""

metrics = [
    Equals(case_sensitive=False),
    Moderation()
]

# Score the output with each metric and key the results by metric class name.
# Both metrics in the list accept the same keyword arguments, so no per-type branching is needed.
metric_results = {}
for m in metrics:
    result = m.score(output=output, reference=reference)
    metric_results[type(m).__name__] = result.value



In [5]:
print("Metric Results:", metric_results)

Metric Results: {'Equals': 0.0, 'Moderation': 0.0}


## Posting Evaluation Metrics to CognitiveView Trace API

This script sends the evaluation metric results (e.g., from Opik) to the CognitiveView Trace API using an authenticated HTTP POST request.

### Authentication

- Uses an **Authorization token** (`AUTH_TOKEN`) for secure access to the API.
- Includes an **X-User-Id** header to identify the user performing the operation.
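
As a sketch of how these two credentials might be kept out of the notebook source, the helper below assembles the request headers and falls back to environment variables. The function name `build_headers` and the variable names `CV_AUTH_TOKEN` and `CV_USER_ID` are assumptions made for this example, not names defined by CognitiveView.

```python
import os
from typing import Optional


def build_headers(auth_token: Optional[str] = None, user_id: Optional[str] = None) -> dict:
    """Assemble the request headers, falling back to environment variables."""
    # CV_AUTH_TOKEN and CV_USER_ID are hypothetical names chosen for this sketch.
    auth_token = auth_token or os.environ.get("CV_AUTH_TOKEN")
    user_id = user_id or os.environ.get("CV_USER_ID")
    if not auth_token or not user_id:
        raise ValueError("Both an authorization token and a user ID are required.")
    return {
        "Authorization": auth_token,
        "Content-Type": "application/json",
        "X-User-Id": user_id,
    }
```

Failing early with a `ValueError` avoids sending a request that is certain to be rejected for missing credentials.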

### Endpoint

- **Base URL**: `https://api.cognitiveview.com`
- **API Path**: `/cv/v1/metrics`
- **Full Endpoint**: `https://api.cognitiveview.com/cv/v1/metrics`

### Payload Structure

#### `metric_metadata`
Metadata describing the context of the evaluation:
- `application_name`: Name of the application being evaluated.
- `version`: Version of the application.
- `resource_name`: The evaluated resource (e.g., a model or endpoint).
- `resource_id`: Unique ID of the resource.
- `url`: The endpoint URL of the resource.
- `provider`: Source of the metric system (e.g., `Opik`).
- `use_case`: The business or functional use case (e.g., `transportation`).

#### `metric_data`
Data containing the metric scores:
- `resource_id`: The ID of the instance or model run being scored.
- `resource_name`: Name of the evaluated resource.
- `opik`: Dictionary of computed metric scores, keyed by metric name (here, the `metric_results` dictionary).
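
The structure above can be sketched as a small builder that checks the metadata fields before assembling the payload. The helper name `build_payload` is hypothetical, and treating every listed metadata field as required is an assumption of this sketch rather than a documented API rule. The scores are stored under the `opik` key, matching the payload used in the request cell.

```python
import json

# Metadata fields listed in the payload description; assumed required for this sketch.
METADATA_FIELDS = (
    "application_name", "version", "resource_name",
    "resource_id", "url", "provider", "use_case",
)


def build_payload(metadata: dict, resource_id: str, resource_name: str,
                  scores: dict, scores_key: str = "opik") -> dict:
    """Combine metadata and metric scores into the two-part payload structure."""
    missing = [field for field in METADATA_FIELDS if field not in metadata]
    if missing:
        raise ValueError(f"metric_metadata is missing: {', '.join(missing)}")
    return {
        "metric_metadata": dict(metadata),
        "metric_data": {
            "resource_id": resource_id,
            "resource_name": resource_name,
            scores_key: dict(scores),
        },
    }


example = build_payload(
    metadata={
        "application_name": "chat-application",
        "version": "1.0.0",
        "resource_name": "chat-completion",
        "resource_id": "R-756",
        "url": "https://api.example.com/chat",
        "provider": "opik",
        "use_case": "transportation",
    },
    resource_id="res_123456",
    resource_name="chat-completion",
    scores={"Equals": 0.0, "Moderation": 0.0},
)
print(json.dumps(example, indent=2))
```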

In [None]:
import requests


BASE_URL = "https://api.cognitiveview.com"
AUTH_TOKEN = "Your-Authorization-Token-Here"  # Replace with your actual token
url = f"{BASE_URL}/cv/v1/metrics"

headers = {
    "Authorization": AUTH_TOKEN,
    "Content-Type": "application/json",
    "X-User-Id": "C473421_T181751",  # Replace with your actual user ID
}

payload = {
  "metric_metadata": {
    "application_name": "chat-application",
    "version": "1.0.0",
    "resource_name": "chat-completion",
    "resource_id": "R-756",
    "url": "https://api.example.com/chat",
    "provider": "opik",
    "use_case": "transportation"
  },
  "metric_data": {
    "resource_id": "res_123456",
    "resource_name": "chat-completion",
    "opik": metric_results
  } 
}

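# A timeout keeps the cell from hanging indefinitely if the API is unreachable.
response = requests.post(url, headers=headers, json=payload, timeout=30)

# Output the response; fall back to the raw text if the body is not valid JSON.
print(f"Status Code: {response.status_code}")
try:
    print("Response JSON:", response.json())
except ValueError:
    print("Response Text:", response.text)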

