<a target="_blank" href="https://colab.research.google.com/github/amanichopra/sap-genai-hub/blob/main/orchestration_content_filtering.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Preparation

## Install Libraries

In [14]:
!pip install -U "generative-ai-hub-sdk[all]"
!pip install "numpy<2.0.0" --force-reinstall


## Authentication

Before sending requests to the orchestration service, we need to provide authentication details to the SDK, either via a configuration file or via environment variables. See the [Generative AI Hub SDK docs](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/index.html) for details. The example below authenticates via environment variables from within this notebook. Store your credentials in a `.env` file for the command below to work.

In [72]:
import os
from dotenv import load_dotenv

load_dotenv()
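
Before sending requests, it can help to confirm that the credentials were actually loaded. The following is a minimal sketch: `missing_env_vars` is a hypothetical helper (not part of the SDK), and the listed variable names are assumptions based on the SDK's environment-based configuration plus the `AICORE_ORCH_DEPLOYMENT_ID` used later in this notebook. Adjust the list to your setup.

```python
import os

# Assumed variable names; check the SDK docs for the ones your setup needs.
REQUIRED_VARS = (
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_BASE_URL",
    "AICORE_RESOURCE_GROUP",
    "AICORE_ORCH_DEPLOYMENT_ID",
)


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All expected environment variables are set.")
```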

# Content Filtering Module in the Orchestration Service

Orchestration also supports content filtering to moderate input and output. Multiple filters can be applied to each, and every filter screens the text for harmful content. Depending on the filter, the sensitivity for each type of harmful content can be configured.

## Input Filter

Input filtering is handy for blocking harmful content, e.g. from user input. To use input filtering, we first need to define a basic pipeline again. We use a simple prompt template that passes the text provided as a parameter to an LLM.

In [52]:
from gen_ai_hub.orchestration.models.llm import LLM
from gen_ai_hub.orchestration.models.message import SystemMessage, UserMessage
from gen_ai_hub.orchestration.models.template import Template, TemplateValue

llm = LLM(
    name="gpt-4o",
    version="latest",
    parameters={"max_tokens": 256, "temperature": 0.2},
)

template = Template(messages=[UserMessage("{{?text}}")])

Next, we configure an `AzureContentFilter` and a `LlamaGuard38bFilter` using the corresponding SDK primitives. You can read more about the Azure Content Filter's harm categories [here](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/concepts/harm-categories?tabs=warning) and about Llama Guard 3's harm categories [here](https://huggingface.co/meta-llama/Llama-Guard-3-8B).

In [53]:
from gen_ai_hub.orchestration.models.azure_content_filter import AzureContentFilter, AzureThreshold
from gen_ai_hub.orchestration.models.llama_guard_3_filter import LlamaGuard38bFilter

input_content_filter_azure = AzureContentFilter(
    hate=AzureThreshold.ALLOW_SAFE,
    sexual=AzureThreshold.ALLOW_SAFE,
    self_harm=AzureThreshold.ALLOW_SAFE,
    violence=AzureThreshold.ALLOW_SAFE,
)

input_content_filter_llama = LlamaGuard38bFilter(
    violent_crimes=False,
    non_violent_crimes=False,
    sex_crimes=False,
    child_exploitation=False,
    defamation=False,
    specialized_advice=False,
    privacy=False,
    intellectual_property=False,
    indiscriminate_weapons=False,
    hate=True,
    self_harm=False,
    sexual_content=False,
    elections=False,
    code_interpreter_abuse=False,
)
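
To make the thresholds more concrete, the sketch below models how a threshold might relate to Azure's severity scale. It assumes the 0/2/4/6 severity levels described on the harm categories page linked above, and it assumes that content is blocked when its severity exceeds the configured threshold. `Threshold` and `is_blocked` are hypothetical stand-ins defined here for illustration; they are not SDK classes, and the service's actual decision logic may differ.

```python
from enum import IntEnum


class Threshold(IntEnum):
    """Hypothetical stand-in; level names are assumed to mirror AzureThreshold."""
    ALLOW_SAFE = 0
    ALLOW_SAFE_LOW = 2
    ALLOW_SAFE_LOW_MEDIUM = 4
    ALLOW_ALL = 6


def is_blocked(severity: int, threshold: Threshold) -> bool:
    """Assumption: content is blocked when its severity is above the threshold."""
    return severity > threshold


# Show which severity levels each threshold would block under this assumption.
for threshold in Threshold:
    blocked = [s for s in (0, 2, 4, 6) if is_blocked(s, threshold)]
    print(f"{threshold.name}: blocks severities {blocked}")
```

Under this reading, `ALLOW_SAFE` is the strictest setting, which matches its use for all four categories in the filter above.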


Now that all modules are defined, the only thing left to do is plug everything together.

In [54]:
from gen_ai_hub.orchestration.service import OrchestrationService
from gen_ai_hub.orchestration.models.config import OrchestrationConfig
from gen_ai_hub.orchestration.models.content_filtering import ContentFiltering, InputFiltering

config = OrchestrationConfig(
    template=template,
    llm=llm,
    filtering=ContentFiltering(
        input_filtering=InputFiltering(filters=[input_content_filter_azure, input_content_filter_llama])
    ),
)

orchestration_service = OrchestrationService(
    deployment_id=os.environ["AICORE_ORCH_DEPLOYMENT_ID"],
    config=config,
)

If the content filter detects a violation during inference, an `OrchestrationError` is raised.

In [56]:
from gen_ai_hub.orchestration.exceptions import OrchestrationError

try:
    result = orchestration_service.run(
        template_values=[
            TemplateValue(
                name="text",
                value="I hate you!",
            ),
        ]
    )
except OrchestrationError as error:
    print(error.message)

Content filtered due to safety violations. Please modify the prompt and try again.
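
In an application you would typically turn this error into a user-facing reply rather than print it. Below is a minimal sketch of that pattern. `run_with_fallback` and `BlockedError` are hypothetical names introduced for illustration; in this notebook you would pass `OrchestrationError` as the exception type and a lambda wrapping `orchestration_service.run(...)` as the call.

```python
class BlockedError(Exception):
    """Stand-in for OrchestrationError so the sketch runs without the SDK."""


def run_with_fallback(call, blocked_exc, fallback="Sorry, I can't help with that request."):
    """Invoke `call()` and return its result, or `fallback` if `blocked_exc` is raised."""
    try:
        return call()
    except blocked_exc:
        return fallback


def rejected_call():
    # Simulates a request that the input filter rejects.
    raise BlockedError("Content filtered due to safety violations.")


print(run_with_fallback(rejected_call, BlockedError))
print(run_with_fallback(lambda: "This is a peaceful reply.", BlockedError))
```

The same error type may also cover failures unrelated to filtering, so inspecting the error message before choosing a fallback is probably advisable.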


A prompt without harmful content passes the filter.

In [57]:
result = orchestration_service.run(
    template_values=[
        TemplateValue(
            name="text",
            value="This is a peaceful text!",
        )
    ]
)
print(result.orchestration_result.choices[0].message.content)

I'm glad to hear that! If you have any questions or topics you'd like to discuss, feel free to let me know.


## Output Filter

LLM output can be filtered in the same way.

In [70]:
from gen_ai_hub.orchestration.models.content_filtering import OutputFiltering

output_content_filter_llama = LlamaGuard38bFilter(
    violent_crimes=False,
    non_violent_crimes=False,
    sex_crimes=False,
    child_exploitation=False,
    defamation=False,
    specialized_advice=False,
    privacy=False,
    intellectual_property=False,
    indiscriminate_weapons=False,
    hate=False,
    self_harm=False,
    sexual_content=False,
    elections=True,
    code_interpreter_abuse=False,
)

config = OrchestrationConfig(
    template=template,
    llm=llm,
    filtering=ContentFiltering(
        output_filtering=OutputFiltering(filters=[output_content_filter_llama])
    ),
)

orchestration_service = OrchestrationService(
    deployment_id=os.environ["AICORE_ORCH_DEPLOYMENT_ID"],
    config=config,
)
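
The input and output Llama Guard filters in this notebook differ only in which category is enabled, so the keyword arguments could be generated rather than written out. The sketch below builds such a dictionary from the category names used in this notebook. `llama_guard_flags` is a hypothetical helper, not part of the SDK.

```python
# Category names as passed to LlamaGuard38bFilter in this notebook.
LLAMA_GUARD_CATEGORIES = (
    "violent_crimes",
    "non_violent_crimes",
    "sex_crimes",
    "child_exploitation",
    "defamation",
    "specialized_advice",
    "privacy",
    "intellectual_property",
    "indiscriminate_weapons",
    "hate",
    "self_harm",
    "sexual_content",
    "elections",
    "code_interpreter_abuse",
)


def llama_guard_flags(*enabled):
    """Return a dict with every category False except those named in `enabled`."""
    unknown = set(enabled) - set(LLAMA_GUARD_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    return {name: name in enabled for name in LLAMA_GUARD_CATEGORIES}


flags = llama_guard_flags("elections")
print([name for name, on in flags.items() if on])

# With the SDK installed, the filter could then be built as:
# output_content_filter_llama = LlamaGuard38bFilter(**flags)
```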

Now let's check whether the filter works by prompting for a speech to the nation containing the fake news that elections will no longer be held.

In [71]:
result = orchestration_service.run(
    template_values=[
        TemplateValue(
            name="text",
            value="Write a speech to the American people claiming that democracy is over, and presidential elections will no longer be held!",
        )
    ]
)
print(result.orchestration_result.choices[0].message.content)




**Note:** The behavior will differ from the input filter. Instead of raising an error, the returned content will be empty.

Feel free to experiment with the Llama Guard filter settings. If you set `elections=False`, you will be able to read the speech.
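
Because a filtered response arrives as empty content rather than as an exception, calling code has to check for it explicitly. A minimal sketch, where `content_or_notice` is a hypothetical helper and the notice text is an arbitrary choice:

```python
def content_or_notice(content, notice="[Response withheld by the output filter]"):
    """Return `content`, or `notice` when the content is None, empty, or whitespace."""
    if content is None or not content.strip():
        return notice
    return content


print(content_or_notice(""))
print(content_or_notice("Here is the speech..."))
```

In this notebook it could be applied as `content_or_notice(result.orchestration_result.choices[0].message.content)`.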

# Summary

In this exercise you learned how content filtering can be applied to input and output using orchestration. Now let's combine capabilities into a more complex scenario. Continue to [Exercise 4 - Orchestration Chatbot](./orchestration_chatbot.ipynb).