<a href="https://colab.research.google.com/github/aljebraschool/ai-startup-idea-generator/blob/master/LLM_university_Validating_Outputs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

One key property of LLMs that’s different from traditional software is that the output is probabilistic in nature. The same input (i.e., the prompt) may not always produce the same response. While this property makes it possible to build entirely new classes of natural language applications, it also means that those applications require a mechanism for validating their outputs.

An output validation step helps keep an LLM application robust and predictable. In this notebook, we look at what output validation is and how to implement it using [Guardrails AI](https://www.guardrailsai.com/).
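To make the idea concrete, here is a minimal sketch of an output validation step in plain Python, independent of Guardrails. The helper name `validate_output` and the required keys are hypothetical and chosen only for illustration; a real application would check whatever its own schema requires.

```python
import json

def validate_output(raw_text, required_keys):
    """Return (parsed_dict, errors) for a raw LLM response string.

    Hypothetical helper for illustration; not part of Guardrails.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["expected a JSON object"]
    # Collect one error message per missing key.
    errors = [f"missing key: {key}" for key in required_keys if key not in data]
    return (data if not errors else None), errors

# A well-formed response passes; a free-text response is rejected.
good, good_errors = validate_output('{"gender": "Male", "age": 49}', ["gender", "age"])
bad, bad_errors = validate_output("Sure! The patient is male.", ["gender", "age"])
```

Guardrails builds on the same principle, but derives the checks from a schema and can re-ask the model when a check fails.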

Read the accompanying [article here](https://docs.cohere.com/docs/validating-outputs).


Let’s look at an example of using Guardrails in a text extraction task. The task is to extract the information from a doctor’s note into a JSON object. The note itself is introduced after the setup steps.

# Setup

In [2]:
!pip install cohere guardrails-ai -q

In [3]:
!pip install guardrails-ai --upgrade



In [4]:
!guardrails configure

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
Enable anonymous metrics reporting? [Y/n]: y
Do you wish to use remote inferencing? [Y/n]: y

Enter API Key below 👉 You can find your API Key at https://hub.guardrailsai.com/keys

API Key: <redacted>

            Login successful.

            Get started by installing our RegexMatch validator:
            https://hub.guardrailsai.com/validator/guardrails_ai/regex_match

            You can install it by running:
            guardrails hub install hub://guardrails/regex_match

            Find more validators at https://hub.guardrailsai.com
            


In [5]:
!guardrails hub install hub://guardrails/valid_range
!guardrails hub install hub://guardrails/valid_choices

Installing hub://guardrails/valid_range...
✅Successfully installed guardrails/valid_range!

Import validator:
from guardrails.hub import ValidRange

Get more info:
https://hub.guardrailsai.com/validator/guardrails/valid_range

Installing hub://guardrails/valid_choices...
✅Successfully installed guardrails/valid_choices!

Import validator:
from guardrails.hub import ValidChoices

Get more info:
https://hub.guardrailsai.com/validator/guardrails/valid_choices



In [15]:
import os
import cohere
import guardrails as gd
from guardrails.hub import ValidRange, ValidChoices
from pydantic import BaseModel, Field
from rich import print

In [7]:
co = cohere.ClientV2("COHERE_API_KEY") # Get your free API key: https://dashboard.cohere.com/api-keys

In [8]:
# Configure the API key for Guardrails
os.environ["COHERE_API_KEY"] = "COHERE_API_KEY"

# Define the output schema

Next, we define the output schema, which specifies what the LLM response should look like. Guardrails provides an option to define the schema using Pydantic. We’ll use this option for the doctor's notes extraction task.

Our goal is to extract detailed patient information from a medical record.
As an example, we will use the following medical record:

In [9]:
doctors_notes = """49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares.
Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream"""

We want our extracted information to contain the following fields:

1. Patient's gender
2. Patient's age
3. A list of symptoms, each with an affected area
4. A list of medications, each with information about the patient's response to the medication

Let's define the Pydantic classes below.

In [26]:
class Symptom(BaseModel):
    # One symptom and the body area it affects
    symptoms: str = Field(..., description="symptom that the patient is suffering from")
    affected_area: str = Field(
        ...,
        description="what part of the body the symptom is affecting",
        # Re-ask the LLM if the area is not one of the allowed choices
        validators=[ValidChoices(["Head", "Face", "Neck", "Chest"], on_fail="reask")],
    )

class CurrentMed(BaseModel):
    # One medication and the patient's response to it
    medication: str = Field(..., description="The name of the medication the patient is taking")
    response: str = Field(..., description="The patient's response to the medication")

class PatientInfo(BaseModel):
    # Top-level schema for the extracted record
    gender: str = Field(..., description="The patient's gender")
    age: int = Field(..., description="The patient's age", validators=[ValidRange(0, 100)])
    symptoms: list[Symptom] = Field(..., description="A list of symptoms the patient is suffering from")
    current_medications: list[CurrentMed] = Field(..., description="A list of medications the patient is taking")
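To illustrate what the two validators attached to the schema are checking, here is a plain-Python sketch. The functions `check_range` and `check_choice` are hypothetical stand-ins, not the Guardrails implementations; the sketch assumes the range bounds are inclusive and that choices are matched exactly, including case.

```python
def check_range(value, low, high):
    """Stand-in for a range check: True when low <= value <= high."""
    return low <= value <= high

def check_choice(value, choices):
    """Stand-in for a choice check: True when value is exactly one of choices."""
    return value in choices

# Same allowed values as in the Symptom schema
AFFECTED_AREAS = ["Head", "Face", "Neck", "Chest"]

age_ok = check_range(49, 0, 100)        # within the allowed range
age_bad = check_range(130, 0, 100)      # outside the allowed range
area_ok = check_choice("Face", AFFECTED_AREAS)
area_bad = check_choice("Beard", AFFECTED_AREAS)  # not in the allowed list
```

In the schema, a failed `ValidChoices` check on `affected_area` is configured with `on_fail="reask"`, so Guardrails asks the LLM again instead of accepting the value.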


# Initialize a Guard object based on the schema

Next, we initialize a Guard object based on the schema we have defined.

First, we define the base instruction prompt for the LLM as follows.

In [27]:
PROMPT = """Given the following doctor's notes about a patient, please extract a dictionary that contains the patient's information.

${doctors_notes}

${gr.complete_json_suffix_v2}
"""

Then, we initialize a Guard object from the PatientInfo Pydantic model.

In [28]:
# Initialize the Guard from the Pydantic model
guard = gd.Guard.from_pydantic(PatientInfo)

Finally, we call the LLM through the `guard` object, passing the doctor's notes via `prompt_params` so that they fill the `${doctors_notes}` placeholder in the prompt.

In [29]:
# Call the LLM through the `guard` object
response = guard(
    model='command-r',
    messages=[{"role": "user", "content": PROMPT}],
    prompt_params={"doctors_notes": doctors_notes},  # fills ${doctors_notes} in PROMPT
    temperature=0,
    num_reasks=3,
)

# Print the validated output from the LLM
print(response.validated_output)
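The `num_reasks` setting and the `on_fail="reask"` action describe a validate-and-retry loop. The sketch below imitates that loop in plain Python, assuming a hypothetical `fake_llm` stub in place of a real model call; it is not how Guardrails is implemented internally, only an illustration of the control flow.

```python
import json

ALLOWED_AREAS = ["Head", "Face", "Neck", "Chest"]

def fake_llm(prompt, attempt):
    """Hypothetical stand-in for an LLM call; the first answer is invalid."""
    if attempt == 0:
        return '{"affected_area": "Beard"}'
    return '{"affected_area": "Face"}'

def call_with_reasks(prompt, num_reasks=3):
    """Call the model, validate the output, and re-ask up to num_reasks times."""
    for attempt in range(num_reasks + 1):
        data = json.loads(fake_llm(prompt, attempt))
        if data["affected_area"] in ALLOWED_AREAS:
            return data, attempt
        # Tell the model what was wrong before asking again.
        prompt += (
            f"\nThe value '{data['affected_area']}' is not allowed; "
            f"choose one of {ALLOWED_AREAS}."
        )
    return None, num_reasks

result, reasks_used = call_with_reasks("Extract the affected area.")
```

Here the first response fails the choice check, the prompt is extended with a correction, and the second response passes, so one re-ask is used.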

