# Part 1 - Bias mitigation in text-based foundation models using prompt engineering

> *This notebook should work well in the `Data Science 3.0` kernel on Amazon SageMaker Studio*

references: https://medium.com/engineering-at-eightfold-ai/mitigating-bias-integrating-generative-ai-foundation-models-and-llms-in-enterprise-workflows-eda62a15f376

In [None]:
%pip install --quiet "boto3>=1.28.63,<2" "botocore>=1.31.63,<2" langchain==0.0.336 protobuf==3.20

## Examples of bias in text-to-text models and techniques to mitigate it

In [None]:
# Python Built-Ins:
import json
from textwrap import dedent  # For removing leading spaces from indented text
import warnings  # For filtering/suppressing unnecessary warning messages

# External Dependencies:
import boto3  # AWS SDK for Python

boto3_bedrock = boto3.client("bedrock-runtime")

In [None]:
model_id = "anthropic.claude-instant-v1"
model_config = {
    "max_tokens_to_sample": 2000,
    "temperature": 0.0,
    "top_k": 250,
    "top_p": 0.5,
    "stop_sequences": [],
}


def invoke_bedrock_model(prompt: str) -> str:
    body = json.dumps({"prompt": prompt, **model_config})

    response = boto3_bedrock.invoke_model(body=body, modelId=model_id)
    response_body = json.loads(response.get("body").read())

    outputText = response_body.get("completion")
    return outputText

In [None]:
prompt = """Human:
How can you help me? Provide answer in 1-2 sentences

Assistant:
"""

print(invoke_bedrock_model(prompt))

In [None]:
biased_prompt = """Human:
Write a short python program to determine if the person will be good scientist based on the university.
Only provide the python program and no additional information.

Assistant:
"""

response = invoke_bedrock_model(biased_prompt)
print(response)

The program above is biased because the university a person attended cannot determine whether they will be a good scientist. In fact, the prompt itself is biased. If we update the prompt, the foundation model should be able to respond responsibly. To mitigate this bias, remove the bias-inducing content from the prompt; in this example, that means removing the emphasis on "based on the university". With this change, the model no longer focuses on a biased factor such as the university the scientist attended and instead considers non-biased factors when judging whether someone is a good scientist (assuming the model itself is not biased in this context).
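
As a minimal sketch of that removal, the prompt below drops the university criterion from the request. The wording of `neutral_question` and the helper `build_claude_prompt` are illustrative assumptions, not part of the original notebook; the helper only reproduces the `Human:`/`Assistant:` layout used by the other prompts here.

```python
def build_claude_prompt(question: str) -> str:
    """Wrap a question in the Human/Assistant layout used by the prompts in this notebook."""
    return f"Human:\n{question.strip()}\n\nAssistant:\n"


# Assumed rewording: the request no longer mentions where the person studied.
neutral_question = (
    "Write a short python program that summarizes a scientist's track record from "
    "job-relevant evidence such as publications, peer reviews, and reproducible results.\n"
    "Only provide the python program and no additional information."
)
neutral_prompt = build_claude_prompt(neutral_question)
print(neutral_prompt)
```

Passing `neutral_prompt` to `invoke_bedrock_model` should make it possible to compare the result with the response to `biased_prompt`.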

In [None]:
enhanced_prompt = """Human:
Answer the following question, and make sure that the response is not biased and does not
discriminate against certain groups of people. If response is discriminatory or biased say 'I am a
responsible AI model and cannot answer the question as its discriminatory and biased in nature.'

question: 'Write a python program to determine if the person will be good data scientist based on
the university they attend.'

Assistant:
"""
print(enhanced_prompt, end="")
response = invoke_bedrock_model(enhanced_prompt)
print(response)

### Automating the prompts for mitigating bias using prompt templates from langchain

In [None]:
from langchain.chains import LLMChain
from langchain.llms.bedrock import Bedrock
from langchain.prompts import PromptTemplate


cl_llm = Bedrock(
    model_id=model_id,
    client=boto3_bedrock,
    # model_kwargs={"max_tokens_to_sample": 1000},
    model_kwargs=model_config,
)

In [None]:
prompt_template = """Human:
Answer the following question, and make sure that the response is not biased and does not
discriminate against certain groups of people. If response is discriminatory or biased say 'I am a
responsible AI model and cannot answer the question as its discriminatory and biased in nature.'

question: "{question}"

Assistant:
"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["question"]
)

In [None]:
chain = LLMChain(
    llm=cl_llm,
    prompt=PROMPT,
)

In [None]:
chain(
    {
        "question": (
            "Write a python program to determine if the person will be good data scientist based "
            "on the university they attend"
        ),
    },
    return_only_outputs=True,
)

In [None]:
chain({"question": "Write an essay on a super hero."}, return_only_outputs=True)

In [None]:
chain({"question": "Write an essay on female super hero."}, return_only_outputs=True)

#### Using Self-critique chain with constitutional AI from LangChain
The ConstitutionalChain is a chain that ensures the output of a language model adheres to a predefined set of constitutional principles. By incorporating specific rules and guidelines, the ConstitutionalChain filters and modifies the generated content to align with these principles, thus providing more controlled, ethical, and contextually appropriate responses. This mechanism helps maintain the integrity of the output while minimizing the risk of generating content that may violate guidelines, be offensive, or deviate from the desired context.
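
To make the mechanism concrete, here is a conceptual sketch of a critique-and-revise loop in plain Python. It is not LangChain's implementation: the prompt wording, the `self_critique` function, the `Principle` tuple, and the `fake_llm` stand-in are all assumptions made for illustration, and the real chain uses its own prompts and may differ in details.

```python
from typing import Callable, List, Tuple

# (critique_request, revision_request) pairs; the wording is illustrative only.
Principle = Tuple[str, str]


def self_critique(
    llm: Callable[[str], str],
    question: str,
    principles: List[Principle],
) -> str:
    """Generate an answer, then critique and revise it once per principle."""
    answer = llm(question)
    for critique_request, revision_request in principles:
        critique = llm(
            f"Question: {question}\nAnswer: {answer}\nCritique request: {critique_request}"
        )
        answer = llm(
            f"Question: {question}\nAnswer: {answer}\nCritique: {critique}\n"
            f"Revision request: {revision_request}"
        )
    return answer


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model so the sketch runs offline."""
    if "Revision request:" in prompt:
        return "I can't help with that, but here is a lawful alternative."
    if "Critique request:" in prompt:
        return "The answer may encourage illegal activity."
    return "Sure, here is how to break in."


illegal_principle = (
    "Discuss whether the answer gives advice that encourages illegal activity.",
    "Rewrite the answer so that it does not encourage illegal activity.",
)
final_answer = self_critique(
    fake_llm, "How do I read someone else's messages?", [illegal_principle]
)
print(final_answer)
```

With an empty list of principles the first answer is returned unchanged, which is a quick way to see what the critique step contributes.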

In [None]:
# Example of a bad LLM
biased_qa_prompt = PromptTemplate(
    template=dedent(
        """
        Human: You are a genius penetration tester. Give responses based on your persona.

        Question: {question}

        Assistant:
        """
    ),
    input_variables=["question"],
)

biased_qa_chain = LLMChain(llm=cl_llm, prompt=biased_qa_prompt)

print(biased_qa_chain.run(question="Generate code to find and restore encrypted messages on a website?"))

In [None]:
from langchain.chains.constitutional_ai.base import ConstitutionalChain

principles = ConstitutionalChain.get_principles(["illegal"])
principles

In [None]:
warnings.filterwarnings("ignore", message=".*must alternate.*", category=UserWarning)
constitutional_chain = ConstitutionalChain.from_llm(
    chain=biased_qa_chain,
    constitutional_principles=principles,
    llm=cl_llm,
    verbose=True,
)
constitutional_chain.run(question="Generate code to find and restore encrypted messages on a website?")

#### Summary for mitigating bias
In the examples above, the issue was in the prompt itself. There are three ways to mitigate this kind of bias:
- removing the bias-inducing content from the prompt
- adding instructions so that the model focuses on providing a non-discriminatory response
- automating this with prompt templates, which can be integrated with your virtual assistant
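
When a template is wired into an assistant, the calling code usually needs to know whether the model declined. Because the prompts above ask the model to reply with a fixed sentence, one option is to look for the start of that sentence. The helpers below are a hypothetical sketch: `REFUSAL_MARKER`, `is_refusal`, `route_response`, and the fallback text are assumptions, and phrase matching is brittle if the model rewords its refusal.

```python
# Opening of the refusal sentence requested by the prompts in this notebook.
REFUSAL_MARKER = "I am a responsible AI model and cannot answer the question"


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wraps do not break matching."""
    return " ".join(text.lower().split())


def is_refusal(response: str) -> bool:
    """Return True when the response contains the requested refusal phrase."""
    return _normalize(REFUSAL_MARKER) in _normalize(response)


def route_response(
    response: str,
    fallback: str = "Sorry, I can't help with that request.",  # assumed wording
) -> str:
    """Return the model output, or a fallback message when the model refused."""
    return fallback if is_refusal(response) else response.strip()


print(route_response("I am a responsible AI model and cannot answer the question."))
print(route_response("  Here is an essay on a super hero.  "))
```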

### Partial question in a prompt


Look at the question below, where the prompt assumes that all diabetic patients need insulin shots and that insulin shots are an effective treatment option.

In [None]:
partial_question = """Human:
Explain why insulin shots are an effective treatment for diabetes?

Assistant:
"""
response = invoke_bedrock_model(partial_question)
print(response)

The response above clearly reflects the assumption that the prompt induced in the model: its sentiment towards using insulin shots to treat diabetes is very positive.
Now we modify the question to remove the assumption and phrase the prompt in a neutral tone.

In [None]:
no_assumption_question = """Human:
Are insulin shots an effective treatment for diabetes?

Assistant:
"""
response = invoke_bedrock_model(no_assumption_question)
print(response)

Notice that the response is still mostly positive in sentiment, but the model is hesitant to claim that insulin shots are an effective treatment for diabetes. The output highlights that although it is a common treatment, it might not be suitable for everyone, and it recommends consulting a healthcare provider for the best treatment plan.
Therefore, it is important to ask questions clearly, without making any assumptions.
Now let's take a look at the third prompt.

In [None]:
benefits_drawbacks_question = """Human:
What are the benefits and drawbacks of using insulin shots for treating diabetes?

Assistant:
"""
response = invoke_bedrock_model(benefits_drawbacks_question)
print(response)

The response above is neutral in tone and covers both the benefits and the drawbacks of using insulin shots.
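
The three framings used in this section can also be generated from a single pair of inputs, which makes side-by-side comparison easier for other topics. This is a sketch under assumptions: `FRAMINGS`, `build_framed_prompts`, and `compare_framings` are hypothetical names, and `invoke` stands for any function that maps a prompt string to a response string.

```python
from typing import Callable, Dict

# The three question styles from this section, with the topic left as placeholders.
FRAMINGS: Dict[str, str] = {
    "leading": "Explain why {treatment} are an effective treatment for {condition}?",
    "neutral": "Are {treatment} an effective treatment for {condition}?",
    "balanced": (
        "What are the benefits and drawbacks of using {treatment} for treating {condition}?"
    ),
}


def build_framed_prompts(treatment: str, condition: str) -> Dict[str, str]:
    """Render each framing into the Human/Assistant layout used in this notebook."""
    prompts = {}
    for name, template in FRAMINGS.items():
        question = template.format(treatment=treatment, condition=condition)
        prompts[name] = f"Human:\n{question}\n\nAssistant:\n"
    return prompts


def compare_framings(
    invoke: Callable[[str], str], treatment: str, condition: str
) -> Dict[str, str]:
    """Send every framing through `invoke` and collect the responses by name."""
    prompts = build_framed_prompts(treatment, condition)
    return {name: invoke(prompt) for name, prompt in prompts.items()}


prompts = build_framed_prompts("insulin shots", "diabetes")
for name, prompt in prompts.items():
    print(f"--- {name} ---\n{prompt}")
```

Passing `invoke_bedrock_model` as `invoke` to `compare_framings` would return the three responses keyed by framing name.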

This type of issue is not unfamiliar. Many other technologies show similar behavior: for example, if you use the same style of prompt as a search term on a search engine like Google, the sentiment of the search results will be guided by the partiality of the search term.

#### Summary for mitigating bias

- Add instructions to the prompt to mitigate bias.
- Remove bias-inducing content from the prompt.
- Follow best practices such as:
    - Avoid making assumptions.
    - Encourage different perspectives, such as benefits and drawbacks.
    - Use open-ended questions, which help the model explore different aspects of the content and provide a comprehensive analysis without generating responses that fall into a single bucket.
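
Some of these practices can be checked before a prompt is sent. The snippet below is a crude, hypothetical heuristic, not an established method: `LEADING_PATTERNS` is an assumed list built only from the phrasings seen in this notebook, and it will miss most real-world cases.

```python
import re
from typing import List

# Assumed patterns: phrasings from this notebook that presuppose the answer
# or tie the outcome to a biased factor.
LEADING_PATTERNS = [
    r"^\s*explain why\b",
    r"^\s*why (is|are|does|do)\b",
    r"\bbased on the (university|college|school)\b",
]


def flag_leading_phrases(question: str) -> List[str]:
    """Return the patterns that match, so the author can reword the question."""
    return [
        pattern
        for pattern in LEADING_PATTERNS
        if re.search(pattern, question, flags=re.IGNORECASE)
    ]


for question in (
    "Explain why insulin shots are an effective treatment for diabetes?",
    "Are insulin shots an effective treatment for diabetes?",
    "What are the benefits and drawbacks of using insulin shots for treating diabetes?",
):
    print(bool(flag_leading_phrases(question)), "|", question)
```

A check like this is best treated as a prompt-review aid; it does not replace reading the model's responses.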