# LLM Metrics Lite — Visualization Demo

This notebook demonstrates how to visualize metrics generated by **llm-metrics-lite**.

Plots included:
- Metric bar chart (single response)
- Radar chart (overall quality shape)
- Side-by-side comparison of two LLM outputs


In [None]:
import matplotlib.pyplot as plt
import numpy as np

from llm_metrics_lite import (
    evaluate_output,
    train_char_ngram_model,
)

## Example LLM Outputs

In [None]:
output_a = '''
Electric vehicles are growing rapidly worldwide.
In 2023, sales reached around 14 million units.
This growth may continue despite supply chain issues.
'''

output_b = '''
EVs are the future. EVs are the future. EVs are the future.
Sales were big last year, probably.
'''

context = '''
Global EV sales reached approximately 14 million units in 2023.
Supply chain constraints have impacted production rates.
'''

## Train Lightweight Perplexity Model

In [None]:
training_corpus = [
    'Electric vehicles are transforming transportation.',
    'Battery technology continues to improve.',
    'Supply chain disruptions affect manufacturing.'
]

perplexity_model = train_char_ngram_model(training_corpus, n=3)
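
For intuition about what a character n-gram perplexity model measures, here is a minimal sketch. It is not the implementation inside llm-metrics-lite: the names `train_char_ngram_sketch` and `perplexity_sketch` are hypothetical, and the add-one smoothing is an assumption chosen for simplicity.

```python
import math
from collections import Counter


def train_char_ngram_sketch(texts, n=3):
    """Count character n-grams and their (n-1)-character contexts.

    Hypothetical helper for illustration only; not the library's API.
    """
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for text in texts:
        # Pad the start so the first characters also have a full context.
        padded = " " * (n - 1) + text
        vocab.update(padded)
        for i in range(len(text)):
            gram = padded[i:i + n]
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return {"n": n, "ngrams": ngrams, "contexts": contexts, "vocab_size": len(vocab)}


def perplexity_sketch(model, text):
    """Return exp of the average negative log-probability per character."""
    if not text:
        return float("inf")
    n, vocab_size = model["n"], model["vocab_size"]
    padded = " " * (n - 1) + text
    log_prob = 0.0
    for i in range(len(text)):
        gram = padded[i:i + n]
        # Add-one smoothing (an assumption); the extra +1 in the denominator
        # reserves probability mass for characters never seen in training.
        p = (model["ngrams"][gram] + 1) / (model["contexts"][gram[:-1]] + vocab_size + 1)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(text))


sketch_model = train_char_ngram_sketch(
    [
        "Electric vehicles are transforming transportation.",
        "Battery technology continues to improve.",
        "Supply chain disruptions affect manufacturing.",
    ],
    n=3,
)
seen = perplexity_sketch(sketch_model, "Battery technology continues to improve.")
unseen = perplexity_sketch(sketch_model, "zzzz qqqq xxxx")
print(f"in-domain: {seen:.1f}, gibberish: {unseen:.1f}")
```

Lower perplexity means the text looks more like the training corpus. With a corpus of only three sentences the absolute numbers carry little meaning; the relative ordering is what matters.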

## Evaluate Outputs

In [None]:
res_a = evaluate_output(
    output_text=output_a,
    context_text=context,
    perplexity_model=perplexity_model,
)

res_b = evaluate_output(
    output_text=output_b,
    context_text=context,
    perplexity_model=perplexity_model,
)

res_a, res_b

## Bar Chart — Metric Breakdown (Output A)

In [None]:
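# Put every metric on a 0-1, higher-is-better scale so the bars are comparable.
metrics = {
    'Coherence': res_a.coherence,
    'Groundedness': res_a.groundedness,
    # Assumes readability is reported on a 0-100 scale.
    'Readability': res_a.readability / 100,
    # Repetition and uncertainty are "lower is better", so invert them.
    'Repetition (inv)': 1 - res_a.repetition,
    'Uncertainty (inv)': 1 - res_a.uncertainty,
}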

plt.figure()
plt.bar(list(metrics.keys()), list(metrics.values()))
plt.xticks(rotation=30)
plt.title('Metric Breakdown — Output A')
plt.ylim(0, 1)
plt.show()
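
Because the chart fixes the y-axis to 0-1, any value outside that range is silently cut off or hidden. A small helper can make the rescaling explicit and clip out-of-range values. This is a sketch, not part of llm-metrics-lite: `to_unit_scores` is a hypothetical name, it takes a plain dict rather than the result object, and it assumes the same scales as the cell above (readability on 0-100, repetition and uncertainty on 0-1 with lower being better).

```python
def to_unit_scores(raw):
    """Map raw metric values to [0, 1] where higher is always better.

    Hypothetical helper; the input scales are assumptions, not documented
    guarantees of the library.
    """
    def clip(value):
        # Keep every bar inside the fixed 0-1 plotting range.
        return max(0.0, min(1.0, float(value)))

    return {
        'Coherence': clip(raw['coherence']),
        'Groundedness': clip(raw['groundedness']),
        'Readability': clip(raw['readability'] / 100),
        'Repetition (inv)': clip(1 - raw['repetition']),
        'Uncertainty (inv)': clip(1 - raw['uncertainty']),
    }
```

With a result object, the input dict could be built from its attributes, for example `{'coherence': res_a.coherence, ...}`.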

## Radar Chart — Quality Shape

In [None]:
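# Reuses the normalized `metrics` dict from the bar chart cell above.
labels = list(metrics.keys())
values = list(metrics.values())
# Repeat the first value so the polygon closes back on itself.
values += values[:1]

# One evenly spaced angle per metric, plus the closing angle.
angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
angles += angles[:1]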

fig = plt.figure()
ax = fig.add_subplot(111, polar=True)
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.1)
ax.set_ylim(0, 1)  # same 0-1 range as the bar chart
ax.set_thetagrids(np.degrees(angles[:-1]), labels)
ax.set_title('Quality Radar — Output A')
plt.show()
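
The same radar layout can overlay several outputs on one set of axes, which makes differences in shape easier to spot. The sketch below wraps the steps above in a function; `plot_radar_overlay` is a hypothetical name, and the numbers in the usage lines are illustrative placeholders, not results from llm-metrics-lite.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_radar_overlay(series, labels, title='Quality Radar'):
    """Draw one closed polygon per entry in `series` on shared polar axes.

    `series` maps a legend name to a list of 0-1 scores, one per label.
    """
    angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
    closed_angles = angles + angles[:1]

    fig, ax = plt.subplots(subplot_kw={'projection': 'polar'})
    for name, values in series.items():
        if len(values) != len(labels):
            raise ValueError(f'{name!r} needs {len(labels)} values, got {len(values)}')
        # Repeat the first point so each polygon closes.
        closed = list(values) + list(values[:1])
        ax.plot(closed_angles, closed, label=name)
        ax.fill(closed_angles, closed, alpha=0.1)

    ax.set_thetagrids(np.degrees(angles), labels)
    ax.set_ylim(0, 1)
    ax.set_title(title)
    ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))
    return fig, ax


# Placeholder scores for illustration only.
demo_labels = ['Coherence', 'Groundedness', 'Readability']
fig, ax = plot_radar_overlay(
    {'Output A': [0.8, 0.7, 0.6], 'Output B': [0.4, 0.3, 0.7]},
    demo_labels,
)
```

In the notebook, call `plt.show()` afterwards to display the figure.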

## Side-by-Side Comparison

In [None]:
labels = ['coherence', 'groundedness', 'quality_index']
a_vals = [res_a.coherence, res_a.groundedness, res_a.quality_index]
b_vals = [res_b.coherence, res_b.groundedness, res_b.quality_index]

x = np.arange(len(labels))
width = 0.35

plt.figure()
plt.bar(x - width/2, a_vals, width, label='Output A')
plt.bar(x + width/2, b_vals, width, label='Output B')
plt.xticks(x, labels)
plt.ylim(0, 1)  # assumes all three metrics are on a 0-1 scale
plt.title('Output Comparison')
plt.legend()
plt.show()

## Summary

These visualizations help you:
- Interpret why an LLM response scores well or poorly
- Compare multiple outputs or prompts
- Generate figures for blogs, reports, and papers
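
To reuse a figure in a blog post, report, or paper, it usually needs to be written to disk rather than only shown inline. A minimal sketch using Matplotlib's `Figure.savefig` follows; the helper name `save_figure`, the default formats, and the 300 dpi setting are illustrative choices, not part of llm-metrics-lite.

```python
from pathlib import Path


def save_figure(fig, stem, formats=('png', 'pdf'), dpi=300):
    """Save a Matplotlib figure once per format and return the written paths.

    `stem` is the output path without an extension, e.g. 'figures/output_a'.
    """
    stem = Path(stem)
    stem.parent.mkdir(parents=True, exist_ok=True)
    paths = []
    for fmt in formats:
        path = stem.with_name(f'{stem.name}.{fmt}')
        # bbox_inches='tight' trims surrounding whitespace so rotated tick
        # labels are not cut off.
        fig.savefig(path, dpi=dpi, bbox_inches='tight')
        paths.append(path)
    return paths
```

Pass the figure object before calling `plt.show()`, for example `save_figure(plt.gcf(), 'figures/output_a_bar')`. PNG suits web pages, while a vector format such as PDF generally scales better in print.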
