90 changes: 90 additions & 0 deletions docs/faq.md
@@ -1,5 +1,76 @@
# Frequently Asked Questions

## I'm seeing a PromptCallableException when invoking my Guard. What should I do?

If you see an exception that looks like this:

```
PromptCallableException: The callable `fn` passed to `Guard(fn, ...)` failed with the following error: `custom_llm_func() got an unexpected keyword argument 'messages'`. Make sure that `fn` can be called as a function that takes in a single prompt string and returns a string.
```

It means that the call to the LLM failed. This is usually triggered for one of the following reasons:

1. An API key is not present or not passed correctly to the LLM
1. The LLM API was passed arguments it doesn't expect. We recommend following the LiteLLM standard and passing arguments that conform to it directly in the guard call. As a debugging step, it helps to remove all other arguments or to try the same arguments in a LiteLLM client directly.
1. The LLM API is down or experiencing issues. This is usually temporary, and you can use LiteLLM or the LLM client directly to verify if the API is working as expected.
1. You passed a custom LLM callable, and it either doesn't conform to the expected signature or it throws an error during execution. Make sure that the custom LLM callable can be called as a function that takes in a single prompt string and returns a string; a minimal sketch follows this list.
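
For illustration only, here is a hedged sketch of a custom callable. The exact keyword arguments Guardrails forwards vary by version (newer releases pass `messages`), so accepting `**kwargs` keeps the callable tolerant; the function name and echo behavior are placeholders, not part of the official API.

```python
from guardrails import Guard

def my_llm_callable(messages=None, **kwargs) -> str:
    # Call your model however you like here; this stub just echoes the last
    # user message so the example stays self-contained.
    last_user_message = messages[-1]["content"] if messages else ""
    return f"Echo: {last_user_message}"

guard = Guard()
res = guard(
    my_llm_callable,
    messages=[{"role": "user", "content": "Hello"}],
)
print(res.raw_llm_output)
```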

## How can I host Guardrails as its own server?

Guardrails can run entirely on a server as of version 0.5.0. You can use the `guardrails-ai` package to run Guardrails as a standalone service. You can find more information on how to do this in the [Guardrails AI documentation](https://docs.guardrails.ai/guardrails_ai/installation).

## Which validators should I use? Where can I find them?

You can find a variety of validators on the [Guardrails Hub](https://hub.guardrailsai.com). We recommend starting by drilling down into your use case in the left nav of that page (chatbot, customer support, etc.). We're also releasing starter packs soon that apply to common use cases.

## How does Guardrails impact my LLM app's latency?

tl;dr: Guardrails aims to add less than 100 ms to each LLM request; use the recommendations below to speed things up.

We've done a lot of work to make Guardrails perform well. Validating LLM output is not trivial, and because different kinds of validation take different approaches, performance can vary. Performance can be split into two categories: Guard execution and validation execution. Guard execution is the time it takes to amend prompts, parse LLM output, delegate validation, and compile validation results.

Guard execution time is minimal, and should run on the order of microseconds.

Validation execution time is usually on the order of tens of milliseconds. There are definitely standouts here. For example, some ML-based validators like [RestrictToTopic](https://hub.guardrailsai.com/validator/tryolabs/restricttotopic) can take seconds to run when the model is cold and running locally on a CPU. However, we've seen it run in sub-100ms time when the model is running on GPUs.

Here are our recommendations:

1. Use streaming for user-facing applications. Streaming lets us validate smaller chunks (sentences, phrases, etc., depending on the validator), and this can happen in parallel while the LLM generates the rest of the output; see the sketch after this list.
1. Host your validator models on GPUs. Guardrails provides inference endpoints for some popular validators, and we're working on making this more accessible.
1. Run Guardrails on its own dedicated server. This lets the library take advantage of a full set of compute resources to parallelize over, and it allows you to scale Guardrails independently of your application.
1. In production and performance-testing environments, use telemetry to track validator latency and how it changes over time. This will help you right-size your infrastructure and identify bottlenecks in guard execution.
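
As a rough sketch of the streaming recommendation (assuming an OpenAI key is configured and that your installed version supports `stream=True` on the guard call), validated chunks can be consumed as they arrive:

```python
from guardrails import Guard

guard = Guard()

# With stream=True the guard yields validation outcomes incrementally,
# so chunks can be shown to the user while the rest is still generating.
for fragment in guard(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "How do I make a cake?"}],
    stream=True,
):
    print(fragment.validated_output, end="")
```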

## How do I set up my own `fix` function for validators in a guard?

If we have a validator that looks like this:
```python
from guardrails.validators import PassResult, FailResult, register_validator

@register_validator(name="is_cake", data_type="string")
def is_cake(value, metadata):
    if value == "cake":
        return PassResult()
    return FailResult(error_message="This is not a cake.")
```

You can override the `fix` behavior by passing it as a function to the Guard object when the validator is declared.

```python
from guardrails import Guard

def fix_is_cake(value, metadata):
return "IT IS cake"

guard = Guard().use(is_cake, on_fail=fix_is_cake)

res = guard.parse(
llm_output="not cake"
)

print(res.validated_output)  # Validated output
# Prints "IT IS cake"
```

## I'm encountering an XMLSyntaxError when creating a `Guard` object from a `RAIL` specification. What should I do?

Make sure that you are escaping the `&` character in your `RAIL` specification. The `&` character has a special meaning in XML, and so you need to escape it with `&amp;`. For example, if you have a prompt like this:
@@ -18,4 +89,23 @@ You need to escape the `&` character like this:
</prompt>
```

## Are validators all model-based? Are they proprietary to Guardrails?

Some validators are model-based, some use LLMs, and some are pure logic. They are usually not proprietary; you can browse a variety of them on the [Guardrails Hub](https://hub.guardrailsai.com), where they are largely open source and licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).

This doesn't preclude the creation and use of custom private validators with any given technology, but we are huge believers in the power of open source software.

## Issues logging in with `guardrails configure`

If you encounter issues logging in using an API key with `guardrails configure`, they may be caused by one of the following:

1. Your API key has been revoked, expired, or is invalid. You can check your existing tokens and create a new one at [https://hub.guardrailsai.com/tokens](https://hub.guardrailsai.com/tokens) if necessary.
2. If you're using Windows Powershell, ensure that you paste the token by right-clicking instead of using the keyboard shortcut `CTRL+V`, as the shortcut may cause caret characters to be pasted instead of the clipboard contents.

If your login issues persist, please check the contents of the `~/.guardrailsrc` file to ensure the correct token has been persisted.

## Where do I get more help if I need it?

If you're still encountering issues, please [open an issue](https://github.com/guardrails-ai/guardrails/issues/new) and we'll help you out!

We're also available on [Discord](https://discord.gg/U9RKkZSBgx) if you want to chat with us directly.
32 changes: 15 additions & 17 deletions docs/guardrails_ai/getting_started.md
@@ -2,16 +2,17 @@

## Installation

Install the Guardrails core package and CLI using pip.

```bash
pip install guardrails-ai
```

## Create Input and Output Guards for LLM Validation

1. Configure the Guardrails Hub CLI.

```bash
guardrails configure
```
2. Install a guardrail from Guardrails Hub.
@@ -38,8 +39,8 @@ pip install guardrails-ai
First, install the necessary guardrails from Guardrails Hub.

```bash
guardrails hub install hub://guardrails/regex_match
guardrails hub install hub://guardrails/valid_length
```

Then, create a Guard from the installed guardrails.
@@ -48,7 +49,7 @@ pip install guardrails-ai
from guardrails.hub import RegexMatch, ValidLength
from guardrails import Guard

guard = Guard().use_many(
    RegexMatch(regex="^[A-Z][a-z]*$"),
    ValidLength(min=1, max=32)
)
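
For illustration, a hedged sketch of exercising this guard directly (continuing the snippet above; whether a failure raises or is reported on the returned outcome depends on each validator's `on_fail` setting):

```python
# Passes: a single capitalized word within the length limits.
print(guard.validate("Caesar").validation_passed)

# Fails the RegexMatch check; depending on on_fail this either raises
# or returns an outcome with validation_passed == False.
try:
    outcome = guard.validate("caesar the dog")
    print(outcome.validation_passed)
except Exception as err:
    print(f"Validation failed: {err}")
```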
@@ -71,13 +72,12 @@ class Pet(BaseModel):
name: str = Field(description="a unique pet name")
```

Now, create a Guard from the `Pet` class. The Guard can be used to call the LLM so that the output is formatted to match the `Pet` class. Under the hood, this is done by either of two methods:
1. Function calling: For LLMs that support function calling, we generate structured data using the function call syntax.
2. Prompt optimization: For LLMs that don't support function calling, we add the schema of the expected output to the prompt so that the LLM can generate structured data.

```py
from guardrails import Guard

prompt = """
What kind of pet should I get and what should I name it?
@@ -86,19 +86,17 @@ prompt = """
"""
guard = Guard.from_pydantic(output_class=Pet)

res = guard(
    messages=[{"role": "user", "content": prompt}],
    model="gpt-3.5-turbo"
)

print(res.validated_output)
```

This prints:
```
{'pet_type': 'dog', 'name': 'Buddy'}
```

This output is a dict that matches the structure of the `Pet` class.
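
Continuing the example, if you'd rather work with a typed object than a dict, the validated output can be loaded back into the Pydantic model (a small illustrative sketch, not part of the original guide):

```python
pet = Pet(**res.validated_output)  # convert the validated dict into a Pet instance
print(pet.name)  # "Buddy"
```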
4 changes: 2 additions & 2 deletions docs/guardrails_ai/installation.md
@@ -28,7 +28,7 @@ To install a specific version, run:
# pip install guardrails-ai==[version-number]

# Example:
pip install guardrails-ai==0.5.0a10
```

## Install from GitHub
@@ -39,7 +39,7 @@ Installing directly from GitHub is useful when a release has not yet been cut wi
# pip install git+https://github.com/guardrails-ai/guardrails.git@[branch/commit/tag]
# Examples:
pip install git+https://github.com/guardrails-ai/guardrails.git@main
pip install git+https://github.com/guardrails-ai/guardrails.git@0.5.0-dev
```


73 changes: 44 additions & 29 deletions docs/the_guard.md
@@ -1,6 +1,6 @@
# What is a Guard?

The guard object is the main interface for Guardrails. It can be used without configuration for string-based LLM apps, and it accepts a Pydantic model for structured-data use cases. The Guard is then used to run the Guardrails AI engine: it wraps LLM calls, orchestrates validation, and keeps track of call history.


## How it works
@@ -11,74 +11,89 @@ The guard object is the main interface for GuardRails. It is seeded with a RailS

## Two main flows
### __call__
After initializing a guard, you can invoke it with a model name and messages for your prompt, much as you would call an LLM SDK directly. Guardrails will call the LLM and then validate the output against the guardrails you've configured. The result is returned as a `GuardResponse` object, which contains the raw LLM output, the validated output, and whether or not validation passed.

```py
from guardrails import Guard
import os

os.environ["OPENAI_API_KEY"] = "<YOUR API KEY>"

guard = Guard()

res = guard(
    model="gpt-3.5-turbo-instruct",
    messages=[{
        "role": "user",
        "content": "How do I make a cake?"
    }]
)

print(res.raw_llm_output)
print(res.validated_output)
print(res.validation_passed)
```

### parse
If you would rather call the LLM yourself, or at least make the first call yourself, you can use `Guard.parse` to apply your guard's validation to the LLM output as a post-process. You can also allow Guardrails to re-ask the LLM by specifying the `num_reasks` argument, or keep it purely as a post-processor by setting it to zero. `Guard.parse` returns the same fields as `__call__`.

Calling `Guard.parse` as a post-processor:
```py
import openai
from guardrails import Guard

guard = Guard()

output = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "How do I make a cake?"
    }]
).choices[0].message.content

res = guard.parse(
    llm_output=output
)

print(res.validated_output)  # Validated output
```

Calling `Guard.parse` with reasks:
```py
import openai
from guardrails import Guard

guard = Guard()

output = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "How do I make a cake?"
    }]
).choices[0].message.content

res = guard.parse(
    llm_output=output,
    model="gpt-3.5-turbo-instruct",
    num_reasks=1
)

print(res.validated_output)  # Validated output
print(guard.history.last.reasks)  # A list of reasks
```

## Error Handling and Retries
Guardrails currently performs automatic retries with exponential backoff when any of the following errors occur when calling the LLM:

- openai.error.APIConnectionError
- openai.error.APIError
- openai.error.TryAgain
- openai.error.Timeout
- openai.error.RateLimitError
- openai.error.ServiceUnavailableError
- An incorrect structure was returned from the LLM

Note that this list is not exhaustive of the possible errors that could occur. In the event that errors other than these arise during LLM calls, an exception will be raised. The messaging of this exception is intended to help troubleshoot common problems, especially with custom LLM wrappers, as well as communicate the underlying error. This type of exception would look like the following:
```log