# Demo: Filtering Requests/Responses for Amazon Bedrock using Langfuse

### Context
Previously we saw how to trace requests and responses and how to inspect the context and prompt that accompany each request. Here we demonstrate how to use an open source library, LLM Guard, together with Langfuse to control what input is passed to the LLM and what output is returned to end users. This is particularly useful when you need to enforce organization-wide Responsible AI guidelines.

### Challenges
- Do not send a request to the LLM if it is classified as a banned topic
- Anonymize the input before sending the request to the LLM
- Anonymize the output before sending the response to users

### How
Part 1 - We will use LLM Guard's banned-topic scanner to check whether the request touches a topic that the organization has not approved. If it does, the request is not forwarded, and the user is told that the request is not safe to send to the LLM.

Part 2 - We will use LLM Guard's capability to anonymize and de-anonymize text. This is particularly useful when you work with sensitive data, for example in healthcare.
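
The two parts combine into a single request path. The sketch below is a minimal, dependency-free illustration of that control flow, not LLM Guard's implementation. `toy_topic_risk`, `toy_anonymize`, `toy_deanonymize`, `guarded_call`, and `fake_llm` are hypothetical stand-ins (a keyword check and a title-plus-surname regex) for the model-based scanners and the Bedrock call used later in this notebook.

```python
import re
from typing import Callable, Dict

BANNED_KEYWORDS = {"fight", "weapon"}  # hypothetical stand-in for a topic classifier


def toy_topic_risk(text: str) -> float:
    """Return 1.0 if any banned keyword appears, else 0.0 (illustration only)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return 1.0 if words & BANNED_KEYWORDS else 0.0


def toy_anonymize(text: str, vault: Dict[str, str]) -> str:
    """Replace 'Mr./Ms./Dr. Name' with a placeholder and remember the mapping."""
    def _swap(match: re.Match) -> str:
        placeholder = f"[REDACTED_PERSON_{len(vault) + 1}]"
        vault[placeholder] = match.group(0)
        return placeholder
    return re.sub(r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+", _swap, text)


def toy_deanonymize(text: str, vault: Dict[str, str]) -> str:
    """Put the original values back in place of their placeholders."""
    for placeholder, original in vault.items():
        text = text.replace(placeholder, original)
    return text


def guarded_call(question: str, llm: Callable[[str], str], max_risk: float = 0.4) -> str:
    # Part 1: refuse before anything reaches the model
    if toy_topic_risk(question) > max_risk:
        return "Request blocked: banned topic"
    # Part 2: the model only ever sees placeholders
    vault: Dict[str, str] = {}
    safe_question = toy_anonymize(question, vault)
    answer = llm(safe_question)
    return toy_deanonymize(answer, vault)


def fake_llm(prompt: str) -> str:
    """Hypothetical model call that simply echoes its prompt."""
    return f"SCENE 1. Based on: {prompt}"


print(guarded_call("Mr. Gregor, you can now submit your evidence", fake_llm))
print(guarded_call("Write a weapon fight scene", fake_llm))
```

The vault is what makes de-anonymization possible: without the stored mapping, the placeholders in the model's answer could not be turned back into the original values.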


## Implementation
Here is the high-level implementation flow. The open source LLM Guard library filters the requests and responses, and Langfuse is integrated with it to trace each step.
More details on LLM Guard: https://llm-guard.com/

- **LLM (Large Language Model)**: Anthropic Claude 3 Sonnet, available through Amazon Bedrock.
  This model will be used to generate a screen play for a given situation.
- **LLM Guard**: An open source library that scans and sanitizes prompts and responses
- **Langfuse**: Traces are inspected in the Langfuse UI console for deeper analysis


In [None]:
%pip install langchain langchain-community langchain-core langfuse boto3 llm-guard
%pip install -U langchain-aws

In [None]:
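# Set credentials for LangChain (AWS) and for Langfuse
import os
# os.system('export ...') only affects a short-lived subshell, so set the variable in-process
os.environ["AWS_PROFILE"] = "default"
# Replace the keys and host below with those of your own Langfuse project
os.environ["LANGFUSE_PUBLIC_KEY"] = 'pk-lf-c8ec60a4-3f7e-4e65-8eda-09e76f796b3f'
os.environ["LANGFUSE_SECRET_KEY"] = 'sk-lf-0ffdfee6-4e88-4110-85ef-b6e153382c81'
os.environ["LANGFUSE_HOST"] = 'http://localhost:3000'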

In [None]:
import warnings
import sys
import textwrap
import os
import boto3
from typing import Optional
from io import StringIO
# External Dependencies:
from botocore.config import Config

warnings.filterwarnings('ignore')

def print_ww(*args, width: int = 100, **kwargs):
    """Like print(), but wraps output to `width` characters (default 100)"""
    buffer = StringIO()
    try:
        _stdout = sys.stdout
        sys.stdout = buffer
        print(*args, **kwargs)
        output = buffer.getvalue()
    finally:
        sys.stdout = _stdout
    for line in output.splitlines():
        print("\n".join(textwrap.wrap(line, width=width)))
        

#### Initialize Bedrock LLM
The Bedrock runtime client is created in the us-east-1 region.
- The model is Anthropic Claude 3 Sonnet, and the model parameters are set in `model_kwargs`.


In [None]:
import boto3
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_aws import ChatBedrock
from langchain_aws import BedrockEmbeddings

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-east-1",
)

model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

model_kwargs =  { 
    "max_tokens": 2048,
    "temperature": 0.0,
    "top_k": 250,
    "top_p": 1,
    "stop_sequences": ["\n\nHuman"],
}

llm = ChatBedrock(
    client=bedrock_runtime,
    model_id=model_id,
    model_kwargs=model_kwargs,
)

#### Filtering requests and responses so that LLMs get sanitized input and users get sanitized output
Library used: LLM Guard (https://llm-guard.com/)
The `preamble` argument of the `Anonymize` scanner is text that is prepended to the sanitized prompt, for example to direct the LLM to bypass the redacted content.
Note that anonymizing, de-anonymizing, and invoking the LLM are all decorated with `@observe`. Each call then appears as its own observation in the Langfuse UI, where it can be analyzed and inspected further.
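
To make the role of the decorator concrete, here is a minimal sketch of what a tracing decorator does in general. It is not Langfuse's implementation: `observe_sketch` and `TRACE_LOG` are hypothetical names, and the records stay in a local list instead of being sent to a server or nested under a trace.

```python
import functools
import time
from typing import Any, Callable, Dict, List

TRACE_LOG: List[Dict[str, Any]] = []  # hypothetical in-memory stand-in for a tracing backend


def observe_sketch(func: Callable) -> Callable:
    """Record the name, inputs, output, and duration of each call (illustration only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        TRACE_LOG.append({
            "name": func.__name__,
            "input": {"args": args, "kwargs": kwargs},
            "output": result,
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper


@observe_sketch
def mask_digits(text: str) -> str:
    """Toy sanitizer: replace every digit with '#'."""
    return "".join("#" if ch.isdigit() else ch for ch in text)


@observe_sketch
def handle(text: str) -> str:
    """Toy pipeline step that calls the sanitizer, so two records are produced."""
    return mask_digits(text).upper()


handle("call 555")
for entry in TRACE_LOG:
    print(entry["name"], "->", entry["output"])
```

Because every decorated function records its own inputs and outputs, the anonymized prompt, the model answer, and the final sanitized response can each be inspected separately rather than only the end result.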


In [None]:
#Set the trace details for inspection
trace_name = "FilterTrace"
session_id = "SecuritySession"
user_id = "Developer_FILTER"

### The prompt is managed within Langfuse
You can edit the prompt in Langfuse, and the updated version becomes available to the application without a code change.

#### Create a prompt for Claude 3 and manage it through Langfuse
Check the Langfuse Prompts console. If the prompt already exists, edit it there to try out this prompt management capability.
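
The idea behind managed prompts can be sketched with a small in-memory registry: every save adds a new version, and the application asks for whichever version carries a label such as `production`. `PromptRegistry` below is a hypothetical illustration of that idea only. It is not the Langfuse client, and its method names are assumptions made for this example.

```python
from typing import Dict, List, Optional


class PromptRegistry:
    """Hypothetical in-memory registry: each create() call adds a new version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}
        self._labels: Dict[str, Dict[str, int]] = {}

    def create(self, name: str, prompt: str, labels: Optional[List[str]] = None) -> int:
        versions = self._versions.setdefault(name, [])
        versions.append(prompt)
        version = len(versions)  # versions are numbered from 1
        for label in labels or []:
            # A label always points at the most recent version it was assigned to
            self._labels.setdefault(name, {})[label] = version
        return version

    def get(self, name: str, label: str = "production") -> str:
        version = self._labels[name][label]
        return self._versions[name][version - 1]


registry = PromptRegistry()
registry.create("screen-play-creator", "Write a screen play about: {input}", labels=["production"])
# A later draft without the label does not change what the application receives
registry.create("screen-play-creator", "Be creative. Write a screen play about: {input}")
print(registry.get("screen-play-creator").format(input="The Laughing Buddha"))
```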

In [41]:
from langfuse import Langfuse

# The client must exist before create_prompt() is called
langfuse = Langfuse()

str_prompt = (
    "system: You are a screen play writer. "
    "Use the situation provided as "
    "the question. If you don't know the answer, say that you "
    "don't know. Keep the screen play interesting."
    "\n\n"
    "human: {input}"
)

# Register the prompt in Langfuse and label this version as production
langfuse.create_prompt(
    name="screen-play-creator",
    prompt=str_prompt,
    config={
        "model": model_id,
        "temperature": 0,
    },
    labels=["production"]
);

In [45]:
from langfuse import Langfuse
 
langfuse = Langfuse()
# Get current production version of prompt
langfuse_prompt = langfuse.get_prompt("screen-play-creator")
print(langfuse_prompt.prompt)

system: You are a screen play writer. Use the situation provided as the question. Be creative. Keep the screen play interesting.

human: {input}


In [46]:
from langchain_core.prompts import ChatPromptTemplate
from langfuse.decorators import observe, langfuse_context

from llm_guard.vault import Vault
from llm_guard.input_scanners import Anonymize, BanTopics
from llm_guard.input_scanners.anonymize_helpers import BERT_LARGE_NER_CONF
from llm_guard.output_scanners import Deanonymize

# The vault stores the placeholder-to-original mapping needed for de-anonymizing
vault = Vault()

topic_scanner = BanTopics(topics=["violence"], threshold=0.25)

@observe()
def anonymize(input: str):
    # Note: the scanner (and its NER model) is rebuilt on every call
    scanner = Anonymize(vault, preamble="Insert before prompt", allowed_names=["Priya Dube"],
                        recognizer_conf=BERT_LARGE_NER_CONF, language="en")
    sanitized_prompt, is_valid, risk_score = scanner.scan(input)
    return sanitized_prompt

@observe()
def deanonymize(sanitized_prompt: str, answer: str):
    scanner = Deanonymize(vault)
    sanitized_model_output, is_valid, risk_score = scanner.scan(sanitized_prompt, answer)
    return sanitized_model_output

@observe()
def invokeLLMWithSecurityFilter(question):
    # Part 1: check the request against the banned topics
    sanitized_prompt, is_valid, risk_score = topic_scanner.scan(question)

    print(sanitized_prompt)
    print(is_valid)
    print(risk_score)

    # Attach the risk score to the current observation in Langfuse
    langfuse_context.score_current_observation(
        name=question,
        value=risk_score
    )

    # Stop here so that a banned request never reaches the LLM
    if risk_score > 0.4:
        return "This is not child safe, please request another topic"

    prompt = ChatPromptTemplate.from_template(langfuse_prompt.get_langchain_prompt())

    # Add the Langfuse context
    langfuse_context.update_current_trace(
        name=trace_name,
        session_id=session_id,
        user_id=user_id,
    )
    langfuse_handler = langfuse_context.get_current_langchain_handler()
    chain = prompt | llm | StrOutputParser()

    # Part 2: anonymize the question and send only the anonymized text to the LLM
    a_question = anonymize(question)
    result = chain.invoke({"input": a_question}, config={"callbacks": [langfuse_handler]})

    # To return a de-anonymized response instead, use:
    # sanitized_output = deanonymize(a_question, result)
    # return sanitized_output

    # Return an anonymized response
    return anonymize(result)


2024-09-09 17:26:31 [debug    ] Initialized classification model device=device(type='cpu') model=Model(path='MoritzLaurer/roberta-base-zeroshot-v2.0-c', subfolder='', revision='d825e740e0c59881cf0b0b1481ccf726b6d65341', onnx_path='protectai/MoritzLaurer-roberta-base-zeroshot-v2.0-c-onnx', onnx_revision='fde5343dbad32f1a5470890505c72ec656db6dbe', onnx_subfolder='', onnx_filename='model.onnx', kwargs={}, pipeline_kwargs={'batch_size': 1, 'device': device(type='cpu'), 'return_token_type_ids': False, 'max_length': 512, 'truncation': True}, tokenizer_kwargs={})


#### Demonstrate the filtering of banned content
**Responsible AI 1**

In [47]:
#YOUR QUERY HERE ...
situation = "With Atul Agnihotri as a director of the movie, give me a screen play on The Laughing Buddha"
result = invokeLLMWithSecurityFilter(situation)
print_ww (result)


2024-09-09 17:26:35 [debug    ] No banned topics detected      scores={'violence': 0.09190163016319275}
With Atul Agnihotri as a director of the movie, give me a screen play on The Laughing Buddha
True
0.0
2024-09-09 17:26:35 [debug    ] No entity types provided, using default default_entities=['CREDIT_CARD', 'CRYPTO', 'EMAIL_ADDRESS', 'IBAN_CODE', 'IP_ADDRESS', 'PERSON', 'PHONE_NUMBER', 'US_SSN', 'US_BANK_NUMBER', 'CREDIT_CARD_RE', 'UUID', 'EMAIL_ADDRESS_RE', 'US_SSN_RE']


Some weights of the model checkpoint at dslim/bert-large-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


2024-09-09 17:26:36 [debug    ] Initialized NER model          device=device(type='cpu') model=Model(path='dslim/bert-large-NER', subfolder='', revision='13e784dccceca07aee7a7aab4ad487c605975423', onnx_path='dslim/bert-large-NER', onnx_revision='13e784dccceca07aee7a7aab4ad487c605975423', onnx_subfolder='onnx', onnx_filename='model.onnx', kwargs={}, pipeline_kwargs={'batch_size': 1, 'device': device(type='cpu'), 'aggregation_strategy': 'simple', 'ignore_labels': ['O', 'CARDINAL']}, tokenizer_kwargs={'model_input_names': ['input_ids', 'attention_mask']})
2024-09-09 17:26:36 [debug    ] Loaded regex pattern           group_name=CREDIT_CARD_RE
2024-09-09 17:26:36 [debug    ] Loaded regex pattern           group_name=UUID
2024-09-09 17:26:36 [debug    ] Loaded regex pattern           group_name=

Some weights of the model checkpoint at dslim/bert-large-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


2024-09-09 17:26:56 [debug    ] Initialized NER model          device=device(type='cpu') model=Model(path='dslim/bert-large-NER', subfolder='', revision='13e784dccceca07aee7a7aab4ad487c605975423', onnx_path='dslim/bert-large-NER', onnx_revision='13e784dccceca07aee7a7aab4ad487c605975423', onnx_subfolder='onnx', onnx_filename='model.onnx', kwargs={}, pipeline_kwargs={'batch_size': 1, 'device': device(type='cpu'), 'aggregation_strategy': 'simple', 'ignore_labels': ['O', 'CARDINAL']}, tokenizer_kwargs={'model_input_names': ['input_ids', 'attention_mask']})
2024-09-09 17:26:56 [debug    ] Loaded regex pattern           group_name=CREDIT_CARD_RE
2024-09-09 17:26:56 [debug    ] Loaded regex pattern           group_name=UUID
2024-09-09 17:26:56 [debug    ] Loaded regex pattern           group_name=

#### Demonstrate anonymize/de-anonymize capability
**Responsible AI 2**

In [None]:
#YOUR QUERY HERE ...
situation = "Mr. Gregor, you can now submit your evidence for the theft charge. With this, create a screen play"
result = invokeLLMWithSecurityFilter(situation)
print_ww (result)

#### Flush Langfuse context (Clear off)

In [None]:
# The SDK sends events in the background; flush() blocks until all queued events are sent
langfuse.flush()
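
Why a flush is needed can be illustrated with a small buffered sender: events are queued and shipped by a worker thread, so a short-lived script could exit before the queue is empty. `BufferedSender` below is a hypothetical sketch of that pattern and makes no claim about how the Langfuse SDK is built internally.

```python
import queue
import threading
from typing import List


class BufferedSender:
    """Hypothetical client that queues events and ships them from a worker thread."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue()
        self.delivered: List[str] = []
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self) -> None:
        while True:
            event = self._queue.get()
            self.delivered.append(event)  # stand-in for an HTTP request
            self._queue.task_done()

    def send(self, event: str) -> None:
        self._queue.put(event)  # returns immediately, nothing is delivered yet

    def flush(self) -> None:
        self._queue.join()  # block until every queued event has been processed


sender = BufferedSender()
for i in range(3):
    sender.send(f"event-{i}")
sender.flush()
print(sender.delivered)
```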