# Scorer Evaluations
This notebook demonstrates how to retrieve metrics for a Scorer and how to run an evaluation on a Scorer using a dataset of sample assistant responses and manually assigned human scores. Currently, this is only implemented for the `SelfAskLikertScorer`, specifically for the `hate_speech` and `violence` scales.

## Running evaluation on a Scorer
The following cell demonstrates how to run a custom evaluation on a `Scorer` using a dataset of LLM assistant responses and human-labeled scores. The example uses a small sample `hate_speech` dataset to show how to configure and run the evaluation on the `SelfAskLikertScorer` with the PyRIT `HATE_SPEECH_SCALE`.
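
The column names passed to `run_evaluation_from_csv_async` in the cell below suggest the shape of the input CSV: one response column and one column per human rater. As a rough sketch of that layout, the snippet here builds a tiny CSV in memory. The rows and score values are hypothetical placeholders, not taken from the sample dataset, and whether the real file stores raw Likert values or normalized scores is an assumption left open here.

```python
import csv
import io

# Column names taken from the evaluation call in this notebook.
COLUMNS = [
    "assistant_response",
    "human_likert_score_1",
    "human_likert_score_2",
    "human_likert_score_3",
]

# Hypothetical placeholder rows: one response per row, one score per human rater.
rows = [
    {
        "assistant_response": "<placeholder response A>",
        "human_likert_score_1": 1,
        "human_likert_score_2": 1,
        "human_likert_score_3": 2,
    },
    {
        "assistant_response": "<placeholder response B>",
        "human_likert_score_1": 4,
        "human_likert_score_2": 5,
        "human_likert_score_3": 4,
    },
]

# Write the rows to an in-memory CSV, then read them back to check the layout.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)

parsed = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(parsed[0].keys())
```

Using several rater columns is what allows inter-rater agreement among humans to be reported separately from agreement between the humans and the scorer.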

In [None]:
import json
from dataclasses import asdict

from pyrit.common import IN_MEMORY, initialize_pyrit
from pyrit.common.path import SCORER_EVALS_PATH
from pyrit.prompt_target import OpenAIChatTarget
from pyrit.score import LikertScalePaths, ScorerEvaluator, SelfAskLikertScorer

initialize_pyrit(memory_db_type=IN_MEMORY)

target = OpenAIChatTarget()
likert_scorer = SelfAskLikertScorer(chat_target=target, likert_scale_path=LikertScalePaths.HATE_SPEECH_SCALE.value)

evaluator = ScorerEvaluator(scorer=likert_scorer)
csv_path = f"{str(SCORER_EVALS_PATH)}/eval_datasets/likert_hate_speech_sample_dataset.csv"
metrics = await evaluator.run_evaluation_from_csv_async(
    csv_path=csv_path,
    type="harm",
    assistant_response_col="assistant_response",
    # One column per human rater
    gold_label_col_names=["human_likert_score_1", "human_likert_score_2", "human_likert_score_3"],
    top_level_harm="hate_speech",
)
metrics

HarmScorerMetrics(mean_absolute_error=0.14444444444444446, mae_standard_error=0.026604866660841435, t_statistic=-1.517026652044617, p_value=0.15151253623732802, krippendorff_alpha_combined=0.6774828895065603, krippendorff_alpha_humans=0.6594786524638434, krippendorff_alpha_model=None, type='harm')
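
To give a feel for what fields such as `mean_absolute_error`, `mae_standard_error`, and `t_statistic` describe, here is a minimal sketch of one common way to compute them from per-response human and model scores. It is an assumption for illustration, not necessarily how PyRIT computes them: the function name `harm_metrics_sketch`, the choice to average the human raters into a single gold label, the sign convention (model minus human), and the sample scores are all hypothetical.

```python
import math
import statistics


def harm_metrics_sketch(human_scores, model_scores):
    """Illustrative error metrics between model scores and averaged human scores."""
    # Assumption: the gold label for each response is the mean of its human ratings.
    gold = [statistics.mean(ratings) for ratings in human_scores]

    # Signed differences (model minus human) and their absolute values.
    diffs = [m - g for m, g in zip(model_scores, gold)]
    abs_errors = [abs(d) for d in diffs]
    n = len(diffs)

    mae = statistics.mean(abs_errors)
    # Standard error of the mean absolute error (sample standard deviation / sqrt(n)).
    mae_se = statistics.stdev(abs_errors) / math.sqrt(n)

    # One-sample t-statistic testing whether the mean signed difference is zero.
    diff_sd = statistics.stdev(diffs)
    t_stat = statistics.mean(diffs) / (diff_sd / math.sqrt(n)) if diff_sd > 0 else float("nan")

    return {
        "mean_absolute_error": mae,
        "mae_standard_error": mae_se,
        "t_statistic": t_stat,
    }


# Hypothetical scores on a 0-1 scale: three human raters per response, one model score each.
human_scores = [[0.0, 0.25, 0.0], [0.5, 0.5, 0.75], [1.0, 1.0, 1.0], [0.25, 0.25, 0.5]]
model_scores = [0.0, 0.5, 1.0, 0.5]

result = harm_metrics_sketch(human_scores, model_scores)
print(result)
```

Under this sign convention, a negative t-statistic would mean the model tends to score lower than the human raters on average, and a large p-value would mean that tendency is not statistically distinguishable from zero.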

In [None]:
# Either call works for fetching the hate_speech metrics
evaluator.get_scorer_metrics("hate_speech")
likert_scorer.get_scorer_metrics(file_name="hate_speech")

HarmScorerMetrics(mean_absolute_error=0.14444444444444446, mae_standard_error=0.026604866660841435, t_statistic=-1.517026652044617, p_value=0.15151253623732802, krippendorff_alpha_combined=0.6774828895065603, krippendorff_alpha_humans=0.6594786524638434, krippendorff_alpha_model=None, type='harm')