# LLM Metrics Lite: End-to-End Demo

This notebook demonstrates lightweight, dependency-free evaluation metrics for LLM outputs.

In [1]:

from llm_metrics_lite import (
    evaluate_output,
    train_char_ngram_model,
    coherence_score,
    ngram_perplexity,
    groundedness_score,
    flesch_reading_ease,
    lexical_diversity,
    repetition_ratio,
    uncertainty_score,
    numeric_density,
    quality_index,
    estimated_cost,
)


In [3]:

llm_output = '''
The global electric vehicle market is expected to grow rapidly.
In 2023, sales reached approximately 14 million units.
However, growth may slow slightly due to supply chain constraints.
Overall, the outlook remains positive.
'''

context_text = '''
Global EV sales reached 14 million units in 2023 according to industry reports.
Supply chain disruptions have affected growth.
'''


In [5]:

training_corpus = [
    "Electric vehicles are transforming transportation.",
    "Battery technology is improving rapidly.",
    "Supply chain disruptions can affect production."
]

perplexity_model = train_char_ngram_model(training_corpus, n=3)
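For intuition about what a character n-gram perplexity model does, here is a minimal sketch. It is not the library's implementation: the function names `train_char_ngrams` and `char_perplexity`, the lowercasing, the space padding, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_char_ngrams(corpus, n=3):
    """Count character n-grams and their (n-1)-character prefixes (hypothetical helper)."""
    ngrams, prefixes, vocab = Counter(), Counter(), set()
    for text in corpus:
        # Pad with spaces so the first characters also have a full-length prefix.
        padded = " " * (n - 1) + text.lower()
        vocab.update(padded)
        for i in range(len(padded) - n + 1):
            gram = padded[i:i + n]
            ngrams[gram] += 1
            prefixes[gram[:-1]] += 1
    return {"n": n, "ngrams": ngrams, "prefixes": prefixes, "vocab_size": len(vocab)}


def char_perplexity(model, text):
    """Perplexity = exp of the negative mean log-probability per character."""
    n = model["n"]
    v = model["vocab_size"] + 1  # +1 reserves probability mass for unseen characters
    padded = " " * (n - 1) + text.lower()
    count = len(padded) - n + 1
    if count <= 0:
        return float("inf")
    log_prob = 0.0
    for i in range(count):
        gram = padded[i:i + n]
        # Add-one (Laplace) smoothing, an assumed choice, keeps every probability non-zero.
        p = (model["ngrams"][gram] + 1) / (model["prefixes"][gram[:-1]] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / count)


sketch_corpus = [
    "Electric vehicles are transforming transportation.",
    "Battery technology is improving rapidly.",
    "Supply chain disruptions can affect production.",
]
sketch_model = train_char_ngrams(sketch_corpus, n=3)
in_domain = char_perplexity(sketch_model, "Battery technology is improving rapidly.")
gibberish = char_perplexity(sketch_model, "zqxj kvw zzqx pqqz")
```

In this sketch, text that resembles the training corpus scores a lower perplexity than text built from unseen characters. The absolute values depend on the smoothing choice and will not necessarily match what `ngram_perplexity` reports.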


In [7]:

results = evaluate_output(
    output_text=llm_output,
    context_text=context_text,
    perplexity_model=perplexity_model,
    perplexity_n=3,
)

results


LLMEvaluationResult(coherence=0, perplexity=2.258063745646645, groundedness=0.3333333333333333, readability=36.063, lexical_diversity=0.9394, repetition=0.0, uncertainty=0.0606, numeric_density=0.0606, quality_index=0.4361, estimated_cost=8.4e-05, word_count=33, char_count=226, approx_token_count=42)

In [9]:

print("Coherence:", results.coherence)
print("Perplexity:", results.perplexity)
print("Groundedness:", results.groundedness)
print("Readability:", results.readability)
print("Lexical Diversity:", results.lexical_diversity)
print("Repetition Ratio:", results.repetition)
print("Uncertainty Score:", results.uncertainty)
print("Numeric Density:", results.numeric_density)
print("Quality Index:", results.quality_index)
print("Estimated Cost ($):", results.estimated_cost)


Coherence: 0
Perplexity: 2.258063745646645
Groundedness: 0.3333333333333333
Readability: 36.063
Lexical Diversity: 0.9394
Repetition Ratio: 0.0
Uncertainty Score: 0.0606
Numeric Density: 0.0606
Quality Index: 0.4361
Estimated Cost ($): 8.4e-05


## Individual Metric Usage

In [11]:

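# Each metric can also be called on its own.
# Caution: without enclosing parentheses, every line below is a separate
# statement, so the notebook displays only the value of the final call
# (numeric_density). Wrap the six calls in parentheses to see them as one tuple.
coherence_score(llm_output),
flesch_reading_ease(llm_output),
lexical_diversity(llm_output),
repetition_ratio(llm_output),
uncertainty_score(llm_output),
numeric_density(llm_output)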


0.0606
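Two of these signals can be approximated with simple token ratios. The sketch below is an assumption about how such metrics are commonly computed, not the library's code: the tokenizer regex and the helper names `type_token_ratio` and `numeric_token_share` are hypothetical. On the sample text it happens to give the same rounded values as the lexical diversity and numeric density reported above, though the library's own tokenization may differ on other inputs.

```python
import re


def tokenize(text):
    """Lowercase word/number tokens (assumed tokenization, for illustration only)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(text):
    """Unique tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def numeric_token_share(text):
    """Share of tokens that consist only of digits."""
    tokens = tokenize(text)
    return sum(t.isdigit() for t in tokens) / len(tokens) if tokens else 0.0


sample = (
    "The global electric vehicle market is expected to grow rapidly. "
    "In 2023, sales reached approximately 14 million units. "
    "However, growth may slow slightly due to supply chain constraints. "
    "Overall, the outlook remains positive."
)

print(len(tokenize(sample)))                   # 33 tokens
print(round(type_token_ratio(sample), 4))      # 0.9394 (31 unique / 33 total)
print(round(numeric_token_share(sample), 4))   # 0.0606 (2 numeric / 33 total)
```

Only "the" and "to" repeat in the sample, which is why the type-token ratio is close to 1. Short texts tend to score high on this ratio, so comparisons are most meaningful between outputs of similar length.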

## Performance & Cost Simulation

In [13]:

tokens = results.approx_token_count
latency_seconds = 0.42  # simulated value; substitute a measured latency

tokens_per_second = tokens / latency_seconds
cost_estimate = estimated_cost(tokens)

tokens_per_second, cost_estimate


(100.0, 8.4e-05)
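The figures above are consistent with a flat per-token price: 8.4e-05 for 42 tokens works out to 0.002 per 1,000 tokens. The sketch below makes that arithmetic explicit and shows one way to replace the simulated latency with a measured one. The rate is inferred from the output rather than taken from the library, and `cost_for_tokens`, `throughput`, and `timed_call` are hypothetical helper names.

```python
import time

# Assumed rate, inferred from the notebook output (8.4e-05 for 42 tokens).
PRICE_PER_1K_TOKENS = 0.002


def cost_for_tokens(tokens, price_per_1k=PRICE_PER_1K_TOKENS):
    """Flat-rate cost: tokens / 1000 * price per 1K tokens."""
    return tokens / 1000 * price_per_1k


def throughput(tokens, latency_seconds):
    """Tokens generated per second of wall-clock latency."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return tokens / latency_seconds


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for a real model call; replace with the actual request.
reply, elapsed = timed_call(lambda prompt: prompt.upper(), "hello")

print(throughput(42, 0.42))
print(cost_for_tokens(42))
```

Real providers often price input and output tokens differently, so a single flat rate is a simplification that suits rough budgeting only.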

## Summary

LLM Metrics Lite provides interpretable, dependency-free evaluation signals for LLM outputs.