# Validating Large Language Model Outputs

This is the notebook companion for the blog post [Validating Large Language Model Outputs](https://txt.cohere.ai/validating-llm-outputs).

One key property of LLMs that’s different from traditional software is that the output is probabilistic in nature. The same input (i.e., the prompt) may not always produce the same response. While this property makes it possible to build entirely new classes of natural language applications, it also means that those applications require a mechanism for validating their outputs.
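
To make the probabilistic behavior concrete, here is a minimal toy sketch of temperature-based sampling. It is not how any particular model is implemented: the `toy_logits` scores and the `sample_next_token` helper are hypothetical, chosen only to show that one fixed input can yield several different surface forms of the same answer.

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Sample one token from a softmax over `logits` divided by `temperature`."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]

# Hypothetical next-token scores for one fixed prompt (illustrative values only)
toy_logits = {"Male": 2.0, "male": 1.5, "M": 1.0, "unknown": 0.2}

rng = random.Random(0)
samples = [sample_next_token(toy_logits, temperature=1.0, rng=rng) for _ in range(200)]
distinct = set(samples)
print(distinct)
```

Because the same prompt can come back as `Male`, `male`, or `M`, downstream code cannot assume a single format, which is the gap that output validation fills.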

An output validation step helps keep an LLM application robust and predictable. In this notebook, we look at what output validation is and how to implement it using [Guardrails AI](http://getguardrails.ai/).


## 1: Setup

In [24]:
# TODO: upgrade to "cohere>5"
! pip install "cohere<5" guardrails-ai -q

In [3]:
import cohere
import guardrails as gd
from guardrails.validators import ValidRange, ValidChoices
from pydantic import BaseModel, Field
from rich import print
from typing import List

## 2: Define the Output Schema

Our goal is to extract detailed patient information from a medical record.
As an example, we will use the following medical record:

In [3]:
doctors_notes = """49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares.
Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream"""

We want our extracted information to contain the following fields:

1. Patient's gender
2. Patient's age
3. A list of symptoms, each with an affected area
4. A list of medications, each with information about the patient's response to the medication

Let's define the Pydantic classes below.

In [4]:
class Symptom(BaseModel):
    symptom: str = Field(..., description="Symptom that a patient is experiencing")
    affected_area: str = Field(
        ...,
        description="What part of the body the symptom is affecting",
        validators=[ValidChoices(["Head", "Face", "Neck", "Chest"], on_fail="reask")]
    )

class CurrentMed(BaseModel):
    medication: str = Field(..., description="Name of the medication the patient is taking")
    response: str = Field(..., description="How the patient is responding to the medication")


class PatientInfo(BaseModel):
    gender: str = Field(..., description="Patient's gender")
    age: int = Field(..., description="Patient's age", validators=[ValidRange(0, 100)])
    symptoms: List[Symptom] = Field(..., description="Symptoms that the patient is experiencing")
    current_meds: List[CurrentMed] = Field(..., description="Medications that the patient is currently taking")
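
As a rough illustration of what the two validators in this schema are checking, the sketch below re-implements the same rules in plain Python. It is an assumption-laden stand-in, not Guardrails' own code: `find_schema_violations`, `ALLOWED_AREAS`, and the two sample records are hypothetical names and values introduced only for this example.

```python
ALLOWED_AREAS = ["Head", "Face", "Neck", "Chest"]  # same choices as in the schema

def find_schema_violations(patient: dict) -> list:
    """Return human-readable problems; an empty list means the record passed."""
    problems = []
    age = patient.get("age")
    # Mirrors the intent of ValidRange(0, 100) on `age`
    if not isinstance(age, int) or not 0 <= age <= 100:
        problems.append(f"age={age!r} is not an integer in [0, 100]")
    # Mirrors the intent of ValidChoices([...]) on `affected_area`
    for i, symptom in enumerate(patient.get("symptoms", [])):
        area = symptom.get("affected_area")
        if area not in ALLOWED_AREAS:
            problems.append(f"symptoms[{i}].affected_area={area!r} is not an allowed choice")
    return problems

# Hypothetical candidate outputs, for illustration only
good = {"gender": "Male", "age": 49,
        "symptoms": [{"symptom": "macular rash", "affected_area": "Face"}]}
bad = {"gender": "Male", "age": 149,
       "symptoms": [{"symptom": "macular rash", "affected_area": "Beard"}]}

print(find_schema_violations(good))
print(find_schema_violations(bad))
```

Declaring these rules on the Pydantic fields, as the notebook does, keeps the constraints next to the data definition instead of scattering them through post-processing code.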


## 3: Initialize a Guard Object Based on the Schema

In [5]:
PROMPT = """Given the following doctor's notes about a patient,
please extract a dictionary that contains the patient's information.

${doctors_notes}

${gr.complete_json_suffix_v2}
"""

In [6]:
# Initialize a Guard object from the Pydantic model PatientInfo
guard = gd.Guard.from_pydantic(PatientInfo, prompt=PROMPT)
print(guard.base_prompt)
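
The `${doctors_notes}` placeholder in `PROMPT` is filled in later from the parameters passed at call time. Python's standard `string.Template` happens to use the same `${name}` syntax, so it can serve as a stand-in to show the idea. This is only an analogy, under the assumption that the substitution behaves similarly; it does not describe Guardrails' internals, and `toy_prompt` is a hypothetical, shortened template.

```python
from string import Template

# Hypothetical, shortened version of the prompt with a single placeholder
toy_prompt = Template(
    "Given the following doctor's notes about a patient,\n"
    "please extract a dictionary that contains the patient's information.\n\n"
    "${doctors_notes}\n"
)

# Fill the placeholder, analogous to passing prompt parameters at call time
filled = toy_prompt.substitute(
    doctors_notes="49 y/o Male with chronic macular rash to face & hair"
)
print(filled)
```

Keeping the notes as a parameter means the same prompt, and the same guard, can be reused across many records.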

## 4: Wrap an LLM Call with the Guard Object

In [7]:
# Create a Cohere client (replace the placeholder string with your own API key)
co = cohere.Client(api_key='COHERE_API_KEY')

In [21]:
# Wrap the Cohere API call with the `guard` object
raw_llm_output, validated_output = guard(
    co.generate,
    prompt_params={"doctors_notes": doctors_notes},
    model='command',
    max_tokens=1024,
    temperature=0.3,
)

# Print the validated output from the LLM
print(validated_output)
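
Conceptually, wrapping the call means: generate, validate, and, for fields marked `on_fail="reask"`, send the errors back to the model and try again. The sketch below shows that loop with a fake model so it runs offline. It is a simplified illustration under stated assumptions, not the Guardrails implementation: `guarded_call`, `find_errors`, `make_fake_llm`, and the `max_reasks` parameter are all hypothetical.

```python
import json

ALLOWED_AREAS = {"Head", "Face", "Neck", "Chest"}

def find_errors(record: dict) -> list:
    """Check the same two rules as the schema: age range and affected area."""
    errors = []
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 100:
        errors.append("age must be an integer between 0 and 100")
    for symptom in record.get("symptoms", []):
        if symptom.get("affected_area") not in ALLOWED_AREAS:
            errors.append(f"affected_area must be one of {sorted(ALLOWED_AREAS)}")
    return errors

def guarded_call(llm, prompt: str, max_reasks: int = 1):
    """Call `llm`, validate its JSON output, and re-ask with the errors on failure."""
    raw = llm(prompt)
    for attempt in range(max_reasks + 1):
        try:
            record = json.loads(raw)
            errors = find_errors(record)
        except json.JSONDecodeError:
            record, errors = None, ["output was not valid JSON"]
        if not errors:
            return raw, record
        if attempt < max_reasks:
            raw = llm(f"{prompt}\n\nYour previous answer had problems: {errors}. Please correct it.")
    return raw, None  # validation still failing after all re-asks

def make_fake_llm():
    """Hypothetical model stand-in: wrong area on the first call, fixed on the re-ask."""
    calls = []
    def fake_llm(prompt: str) -> str:
        calls.append(prompt)
        area = "Beard" if len(calls) == 1 else "Face"
        return json.dumps({"gender": "Male", "age": 49,
                           "symptoms": [{"symptom": "rash", "affected_area": area}]})
    return fake_llm, calls

fake_llm, calls = make_fake_llm()
raw_output, validated = guarded_call(fake_llm, "Extract the patient's information.")
print(validated)
```

Here the first response fails the `affected_area` check, the second passes, and the caller only ever sees the validated record.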



In [23]:
# Display the log tree for the most recent guard call
guard.guard_state.most_recent_call.tree