<a target="_blank" href="https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/llmu/Validating_Large_Language_Model_Outputs.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Validating Large Language Model Outputs

One key property of LLMs that’s different from traditional software is that the output is probabilistic in nature. The same input (i.e., the prompt) may not always produce the same response. While this property makes it possible to build entirely new classes of natural language applications, it also means that those applications require a mechanism for validating their outputs.

An output validation step helps make an LLM application robust and predictable. This notebook looks at what output validation is and how to implement it using [Guardrails AI](https://www.guardrailsai.com/).

Read the accompanying [article here](https://docs.cohere.com/docs/validating-outputs).


## 1: Setup

In [None]:
! pip install cohere git+https://github.com/guardrails-ai/guardrails.git@main

Collecting git+https://github.com/guardrails-ai/guardrails.git@main
  Cloning https://github.com/guardrails-ai/guardrails.git (to revision main) to /tmp/pip-req-build-vcruxxjc
  Running command git clone --filter=blob:none --quiet https://github.com/guardrails-ai/guardrails.git /tmp/pip-req-build-vcruxxjc
  Resolved https://github.com/guardrails-ai/guardrails.git to commit 6de5641b8f269164cd57cd95f32dacb9e7d83537
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [None]:
!guardrails hub install hub://guardrails/valid_range
!guardrails hub install hub://guardrails/valid_choices

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

Installing hub://guardrails/valid_range...

[  ==] Fetching manifest
[    ] Downloading dependencies
[    ] Running post-install setup
✅Successfully installed hub://guardrails/valid_range!

Import validator:
from guardrails.hub import ValidRange

Get more info:
https://hub.guardrailsai.com/validator/guardrails/valid_range


Installing hub://guardrails/valid_choices...

[=   ] Fetching manifest
[====] Downloading dependencies
[    ] Running post-install setup
✅Successfully installed hub://guardrails/valid_choices!

Import validator:
from guardrails.hub import ValidChoices

Get more info:
https://hub.guardrailsai.co

In [None]:
import os
import cohere
import guardrails as gd
from guardrails.hub import ValidRange, ValidChoices
from pydantic import BaseModel, Field
from rich import print
from typing import List

# Create a Cohere client (replace the placeholder string with your own API key)
co = cohere.Client(api_key="COHERE_API_KEY")

# Configure the API key for Guardrails (same placeholder; use your own key)
os.environ["COHERE_API_KEY"] = "COHERE_API_KEY"
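
Hard-coding the key is fine for a quick test, but a shared notebook usually reads it from the environment instead. The following is a minimal sketch of that idea; `load_api_key` is a hypothetical helper written for illustration and is not part of Cohere or Guardrails.

```python
import os


def load_api_key(name="COHERE_API_KEY", env=None):
    """Return the API key stored under `name`, or raise a clear error.

    Hypothetical helper for illustration only; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key
```

With the variable set outside the notebook, the client could then be created as `cohere.Client(api_key=load_api_key())`.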

## 2: Define the Output Schema

Our goal is to extract detailed patient information from a medical record.
As an example, we will use the following medical record:

In [None]:
doctors_notes = """49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares.
Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream"""

We want our extracted information to contain the following fields:

1. Patient's gender
2. Patient's age
3. A list of symptoms, each with an affected area
4. A list of medications, each with information about the patient's response to the medication

Let's define the Pydantic classes below.

In [None]:
class Symptom(BaseModel):
    symptom: str = Field(..., description="Symptom that a patient is experiencing")
    affected_area: str = Field(
        ...,
        description="What part of the body the symptom is affecting",
        validators=[ValidChoices(["Head", "Face", "Neck", "Chest"], on_fail="reask")]
    )

class CurrentMed(BaseModel):
    medication: str = Field(..., description="Name of the medication the patient is taking")
    response: str = Field(..., description="How the patient is responding to the medication")


class PatientInfo(BaseModel):
    gender: str = Field(..., description="Patient's gender")
    age: int = Field(..., description="Patient's age", validators=[ValidRange(0, 100)])
    symptoms: List[Symptom] = Field(..., description="Symptoms that the patient is experiencing")
    current_meds: List[CurrentMed] = Field(..., description="Medications that the patient is currently taking")
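
To make the two validators concrete, here is a plain-Python sketch of the kind of checks they express: an age within a range and an affected area drawn from a fixed list. It is an approximation for intuition only, not how Guardrails implements `ValidRange` or `ValidChoices`; the inclusive bounds, the `check_patient` helper, and the hand-written `record` are all assumptions made for this example.

```python
ALLOWED_AREAS = ["Head", "Face", "Neck", "Chest"]  # same choices as the schema


def check_patient(record):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Range check on age (bounds assumed to be inclusive)
    if not 0 <= record["age"] <= 100:
        problems.append(f"age {record['age']} is outside 0-100")
    # Choice check on every symptom's affected area
    for item in record["symptoms"]:
        if item["affected_area"] not in ALLOWED_AREAS:
            problems.append(
                f"affected_area {item['affected_area']!r} is not an allowed choice"
            )
    return problems


# Hand-written example record (not model output)
record = {
    "gender": "Male",
    "age": 49,
    "symptoms": [
        {"symptom": "macular rash", "affected_area": "Face"},
        {"symptom": "itchy, flaky skin", "affected_area": "Beard"},
    ],
    "current_meds": [{"medication": "OTC steroid cream", "response": "Moderate"}],
}

print(check_patient(record))
```

Here the second symptom fails the choice check because "Beard" is not in the allowed list. In the schema, `on_fail="reask"` asks for this kind of failure to be handled by asking the model again.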


## 3: Initialize a Guard Object Based on the Schema

In [None]:
PROMPT = """Given the following doctor's notes about a patient,
please extract a dictionary that contains the patient's information.

${doctors_notes}

${gr.complete_json_suffix_v2}
"""
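
The `${doctors_notes}` placeholder is filled in from `prompt_params` when the guard is called. As a rough analogy only, the standard library's `string.Template` uses the same `${name}` syntax. Guardrails has its own template handling, and this sketch does not expand `${gr.complete_json_suffix_v2}`; `fill_prompt` and the shortened template text are assumptions for illustration.

```python
from string import Template


def fill_prompt(template_text, params):
    """Substitute ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template_text).safe_substitute(params)


# Shortened, hypothetical template containing only the notes placeholder
template_text = "Given the following doctor's notes about a patient:\n\n${doctors_notes}\n"

filled = fill_prompt(
    template_text,
    {"doctors_notes": "49 y/o Male with chronic macular rash"},
)
print(filled)
```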

In [None]:
# Initialize a Guard object from the Pydantic model PatientInfo
guard = gd.Guard.from_pydantic(PatientInfo, prompt=PROMPT)

    Importing validators from `guardrails.validators` is deprecated.
    All validators are now available in the Guardrails Hub. Please install
    and import them from the hub instead. All validators will be
    removed from this module in the next major release.

    Install with: `guardrails hub install hub://<namespace>/<validator_name>`
    Import as: from guardrails.hub import `ValidatorName`
    
  warn(


## 4: Wrap an LLM Call with the Guard Object

In [None]:
# Wrap the Cohere API call with the `guard` object
response = guard(
    instructions=PROMPT,
    prompt_params={"doctors_notes": doctors_notes},
    model='command-r',
    temperature=0,
    num_reasks=3,
)

# Print the validated output from the LLM
print(response.validated_output)
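
The `num_reasks=3` argument caps how many times the model is asked again after a failed validation. The control flow can be pictured roughly as follows. This is a simplified sketch, not Guardrails' actual implementation: `validate`, `call_with_reasks`, and `make_fake_llm` are hypothetical names, and the fake model simply replays canned answers.

```python
def validate(output):
    """Hypothetical check: the age must be an integer between 0 and 100."""
    age = output.get("age")
    if isinstance(age, int) and 0 <= age <= 100:
        return None
    return f"age {age!r} must be an integer between 0 and 100"


def call_with_reasks(llm, prompt, num_reasks=3):
    """Call `llm`, re-asking with the error message until valid or out of retries."""
    output = llm(prompt)
    for _ in range(num_reasks):
        error = validate(output)
        if error is None:
            return output
        # Feed the validation error back to the model and try again
        output = llm(f"{prompt}\n\nYour previous answer was invalid: {error}. Please correct it.")
    # Final check on the last re-ask; None signals that validation never passed
    return output if validate(output) is None else None


def make_fake_llm(answers):
    """Build a stand-in model that returns the given answers in order."""
    remaining = iter(answers)
    return lambda prompt: next(remaining)


# Wrong on the first call, correct on the second
fake_llm = make_fake_llm([{"age": 490}, {"age": 49}])
print(call_with_reasks(fake_llm, "Extract the patient's age."))
```

The first answer fails the range check, the re-ask returns a valid one, and the loop stops early without using the remaining retries.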

In [None]:
guard.history.last.tree