### Guardrails AI validation using a built-in validator

- In Guardrails AI, a validator is a plug-in that inspects the text going into or coming out of the LLM and decides whether it violates a rule (schema, PII, toxicity, topic drift, etc.).

- Each validator (such as toxic_language, pii, bias, or faithfulness) is a plug-in, not a standalone PyPI package.

- Each validator must be installed from the Guardrails Hub, not from PyPI.
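To make the plug-in idea concrete, here is a minimal sketch in plain Python. It is not the Guardrails AI implementation: `REGISTRY`, `register`, `run_checks`, and the two toy checks are hypothetical names invented for illustration. It only shows the general pattern of named checks that each inspect a piece of text and report a rule violation.

```python
from typing import Callable, Dict, List, Optional

# A check takes text and returns an error message, or None when the text passes.
Check = Callable[[str], Optional[str]]

# Hypothetical registry that maps a check name to its function.
REGISTRY: Dict[str, Check] = {}


def register(name: str) -> Callable[[Check], Check]:
    """Decorator that adds a check to the registry under the given name."""
    def decorator(fn: Check) -> Check:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("max_length")
def max_length(text: str) -> Optional[str]:
    return "text longer than 80 characters" if len(text) > 80 else None


@register("no_email")
def no_email(text: str) -> Optional[str]:
    has_email = any("@" in word for word in text.split())
    return "text contains an email-like token" if has_email else None


def run_checks(text: str, names: List[str]) -> List[str]:
    """Run the named checks in order and collect their error messages."""
    errors = []
    for name in names:
        message = REGISTRY[name](text)
        if message is not None:
            errors.append(f"{name}: {message}")
    return errors


print(run_checks("Contact me at jane@example.com", ["max_length", "no_email"]))
```

Real Hub validators are far more capable than these toy checks, but the shape is similar: each one is installed separately and then attached to a guard by name.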

In [4]:
# Install the Guardrails AI profanity validator (ProfanityFree)
!guardrails hub install hub://guardrails/profanity_free

Installing hub://guardrails/profanity_free...
Fetching manifest
Downloading dependencies
Running post-install setup
✅Successfully installed guardrails/profanity_free version 0.0.0!


Import validator:
from guardrails.hub import ProfanityFree

Get more info:
https://hub.guardrailsai.com/validator/guardrails/profanity_free



In [5]:
!guardrails hub list

Installed Validators:
- ProfanityFree


In [6]:
from guardrails import Guard
from guardrails.hub import ProfanityFree     

In [10]:

try:
    # Create a guard that raises an exception when profanity is detected
    guard = Guard().use(ProfanityFree(on_fail="exception"))

    # Simulate a message returned by an LLM
    # output = "You are a beautiful person!"  # this one passes validation
    output = "You are a pathetic person!"

    # Raises here because the validator flags the text and on_fail="exception"
    res = guard.validate(output)

    # Reached only when validation passes
    print(res)
    print(res.validation_passed)

except Exception as e:
    print(e)

Validation failed for field with errors: You are a pathetic person! contains profanity. Please return profanity-free output.




In [13]:
import os

from dotenv import load_dotenv
from guardrails import Guard
from guardrails.hub import ProfanityFree
from openai import OpenAI

# Load environment variables (including OPENAI_API_KEY) from a local .env file
load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

In [14]:
# Wrap LLM call to match Guardrails expectations
def llm_wrapper(messages=None, model=None, **kwargs):
    # `messages` is a list of dicts: [{"role": "...", "content": "..."}]
    return client.chat.completions.create(model=model, messages=messages, **kwargs).choices[0].message.content
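
The wrapper above takes `messages` and `model` as keyword arguments and returns the reply text as a plain string. A stand-in with the same shape can be handy for trying out a guard without an API key or network access. The sketch below is hypothetical: `fake_llm` and `"stub-model"` are made-up names, and whether a given Guardrails version accepts it in place of `llm_wrapper` is an assumption to verify.

```python
from typing import Dict, List, Optional


def fake_llm(
    messages: Optional[List[Dict[str, str]]] = None,
    model: Optional[str] = None,
    **kwargs,
) -> str:
    """Offline stand-in with the same call shape as llm_wrapper: returns a string."""
    messages = messages or []
    # Pick the most recent user turn, if there is one.
    user_turns = [m["content"] for m in messages if m.get("role") == "user"]
    last = user_turns[-1] if user_turns else ""
    return f"[{model}] You said: {last}"


reply = fake_llm(messages=[{"role": "user", "content": "Hello"}], model="stub-model")
print(reply)
```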
     

In [20]:
# Create a Guard with the ProfanityFree validator
# on_fail sets what happens when validation fails, e.g. "exception", "fix", "filter", "noop"
guard = Guard().use(ProfanityFree(on_fail="fix"))

# Use the guard to call your LLM
response = guard(
    llm_wrapper,
    messages=[{"role": "user", "content": "how to troll to my best friend with abusive language."}],
    model="gpt-4o"
)

print("Validated output:", response.validated_output)




Validated output: I’m sorry, I can’t assist with that request.
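
The cells so far use two `on_fail` values: `"exception"`, which aborts with an error, and `"fix"`, which returns a repaired output. The toy sketch below illustrates that difference in plain Python. It is not Guardrails internals: `BLOCKED_WORDS`, `is_clean`, and `apply_policy` are hypothetical, and the idea that a fix drops the offending sentences is an assumption made for illustration.

```python
import re

# Hypothetical word list, for illustration only.
BLOCKED_WORDS = {"pathetic"}


def is_clean(sentence: str) -> bool:
    """Return True when the sentence contains no blocked word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return not any(word in BLOCKED_WORDS for word in words)


def apply_policy(text: str, on_fail: str) -> str:
    """Validate text sentence by sentence, then act according to on_fail."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    clean = [s for s in sentences if is_clean(s)]
    if len(clean) == len(sentences):
        return text  # nothing to fix
    if on_fail == "exception":
        raise ValueError(f"Validation failed: {text!r} contains blocked words")
    if on_fail == "fix":
        # Assumed repair strategy: keep only the sentences that passed.
        return " ".join(clean)
    raise ValueError(f"unsupported on_fail value: {on_fail!r}")


draft = "Thanks for asking. You are a pathetic person!"
print(apply_policy(draft, on_fail="fix"))
```

With `on_fail="exception"` the same draft raises instead of returning, which mirrors the earlier cell where the error message was printed from the `except` branch.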




In [19]:
# gpt-4o has its own built-in safety behavior, so it refuses this prompt by itself.
# Calling the model directly, without the guard, returns the same refusal.

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "how to troll to my best friend with abusive language."}],
)
print(response.choices[0].message.content)

I’m sorry, I can’t assist with that request.


In [21]:
# Positive scenario

response = guard(
    llm_wrapper,
    messages=[{"role": "user", "content": "Tell me a story about a beautiful person in 100 words"}],
    model="gpt-4o"
)

print("Validated output:", response.validated_output)
print("Validation passed:", response.validation_passed)



Validated output: In a quaint village, lived Elara, whose beauty was not just in her appearance but in her heart. Her eyes sparkled with kindness, and her smile was a balm to weary souls. Every morning, she tended to the village garden, nurturing vibrant blooms that mirrored her spirit. Elara's laughter was a melody that lifted spirits, and her gentle words mended broken hearts. When a storm threatened the village, she organized a shelter, ensuring everyone was safe. Her selflessness inspired unity, and the village thrived. Elara's true beauty lay in her compassion, leaving an indelible mark on every life she touched.
Validation passed: True


