## Check whether an LLM-generated response contains sensitive words

### Using the `SensitiveWords` validator

This walkthrough shows how to use the `SensitiveWords` validator to check whether an LLM-generated response contains outdated, culturally inappropriate, or sensitive words. Such words can appear in LLM output because they are present in the training data, and they can be offensive in certain contexts. The validator detects these words and replaces them with more appropriate alternatives. It relies on a static, manually curated list of words and phrases, so it is not exhaustive and may miss some terms.
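
To make the static-list idea concrete, here is a minimal sketch of list-based replacement. It is not the validator's actual implementation: the `REPLACEMENTS` table and the `replace_sensitive` helper are hypothetical names, and the table holds only the word pairs mentioned later in this walkthrough.

```python
import re

# Hypothetical replacement table (an assumption for illustration);
# the real validator ships its own curated list.
REPLACEMENTS = {
    "actress": "actor",
    "handicapped parking": "accessible parking",
    "maternity leave": "parental leave",
    "crazy": "chaotic",
}

# Try longer phrases first so multi-word entries win over shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(REPLACEMENTS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def replace_sensitive(text: str) -> str:
    """Replace each listed phrase with its counterpart (case-insensitive match)."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(0).lower()], text)


sample = "Jennifer is an actress. The workload has been crazy!"
print(replace_sensitive(sample))
```

A lookup like this only catches exact phrases on the list and does not preserve the capitalization of the matched text, which illustrates why a static list cannot be exhaustive.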


In [1]:
# Import the guardrails package
# and the SensitiveWords validator

import guardrails as gd
from guardrails.validators import SensitiveWords
from rich import print

In [2]:
# Create a Guard object with this validator.
# on_fail="fix" tells the validator to replace
# any sensitive words it detects.

guard = gd.Guard.from_string(
    validators=[SensitiveWords(on_fail="fix")],
    description="testmeout",
)

In [3]:
# Parse the generated response
response = """
Jennifer is an actress who lives in New York City. The other day, her car got towed after she parked it in the handicapped parking zone. 
She was very upset. She is also soon to be out of office on her maternity leave. She says, the workload has been crazy!
"""

output = guard.parse(
    llm_output=response,
)

# Print the output
print(output)

As you can see, words like `actress`, `handicapped parking`, `maternity leave`, and `crazy` have been replaced with `actor`, `accessible parking`, `parental leave`, and `chaotic` respectively.


In [4]:
# Parse another response that contains other sensitive words
response = """
As a rule of thumb, you should not drink more than 2 cups of coffee a day.
In the up-and-coming neighborhood of Brooklyn, there are a lot of new developments going on to 
support the growing population. The neighborhood is also very diverse, with people from all walks of life.
Unlike Manhattan, Brooklyn is a lot more affordable and has less addicts roaming the streets. As a sanity check,
we should make sure that this data is not corrupted.
"""

output = guard.parse(
    llm_output=response,
)

# Print the output
print(output)

Here, phrases such as `rule of thumb`, `up-and-coming neighborhood`, `addicts`, and `sanity check` were detected and replaced with more appropriate alternatives.
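
If you only want to see which phrases would be flagged, without rewriting the text, a detection-only pass can be sketched as follows. This is an illustration under assumptions, not part of the `guardrails` API: `PHRASES` and `find_sensitive` are hypothetical names, and the list contains only the phrases named above.

```python
import re

# Phrases named in the example above; the list is an illustrative assumption.
PHRASES = ["rule of thumb", "up-and-coming neighborhood", "addicts", "sanity check"]


def find_sensitive(text, phrases=PHRASES):
    """Return (phrase, offset) pairs, ordered by position in the normalized text."""
    # Collapse whitespace so phrases split across line breaks still match.
    normalized = " ".join(text.split())
    hits = []
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, normalized, flags=re.IGNORECASE):
            hits.append((phrase, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


for phrase, offset in find_sensitive("As a rule of thumb, run a sanity\ncheck first."):
    print(f"{offset}: {phrase}")
```

The offsets refer to the whitespace-normalized text rather than the original string, a simplification that keeps the sketch short.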


In [5]:
# Parse another response that has no sensitive words
response = """
What a beautiful day! The sun is shining and the birds are chirping. I am so happy to be alive!
"""

output = guard.parse(
    llm_output=response,
)

# Print the output
print(output)

No sensitive words were detected here, so the original response is returned unchanged.
