# Exercise Solution: Prompt Engineering Best Practices
## General Best Practices

1. Separation between parts
2. Detailed, clear & specific instructions
3. Avoid ambiguity 
4. Favor positive instructions
5. Instruct the model to play a role
6. Provide trusted information
7. Specify response characteristics
8. Set guardrails for response

**Note**

* You may run into HTTP 503 (Service Unavailable) errors; wait a while and try again
* If the 503 error does not get resolved, switch to a different model
* Keep in mind that models may behave differently!
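
The "wait and try again" advice above can be automated. The following is a minimal sketch, not part of the course utilities: `call_with_retry` is a hypothetical helper, and it assumes the client call raises an exception when the service returns a 503.

```python
import time


def call_with_retry(call, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Run call(); if it raises, wait and retry with exponential backoff.

    Hypothetical helper: it catches every exception for simplicity. In real
    use, narrow the except clause to the HTTP error your client raises.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            # Delay doubles on each failure: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

For example, `call_with_retry(lambda: llm_client.invoke(prompt))` would wrap a single model call. If the error persists after the last attempt, the exception is re-raised, which is the point at which switching to a different model makes sense.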

#### Google Colab
If you are running the code in Google Colab, install the packages by uncommenting and running the cell below

* The API key file will not be available
* You will be prompted to provide the HF API Token

Uncomment & run the code in the cell below:

In [None]:
## The script is downloaded and run to set up the utils folder

# !curl -H "Accept: application/vnd.github.VERSION.raw" https://raw.githubusercontent.com/acloudfan/gen-ai-app-dev/main/Setup/gcsetup.sh  > gcsetup.sh
# !chmod u+x gcsetup.sh
# !./gcsetup.sh
# !mv ./gen-ai-app-dev/Models/data ./data

## Change the location of the environment file before proceeding

In [1]:
from dotenv import load_dotenv
import os
import sys

import warnings

warnings.filterwarnings("ignore")

# Load the file that contains the API keys
load_dotenv('C:\\Users\\raj\\.jupyter\\.env')


True

In [2]:
# Setting path so we can access the utils folder
sys.path.append('../')
sys.path.append('./')

from utils.api_key_check_utility import api_key_check

# Check if the key is available
api_key = api_key_check("HUGGINGFACEHUB_API_TOKEN")

Key:  HUGGINGFACEHUB_API_TOKEN  already set in environment.


## Setup the models available for testing

In [3]:
from huggingface_hub import InferenceClient
from utils.hf_post_api import hf_rest_client

hugging_face_model_ids = [
    'tiiuae/falcon-7b-instruct',
    'mistralai/Mistral-7B-Instruct-v0.2',
    'openlm-research/open_llama_3b_v2',
    'google/flan-t5-xxl'
]


## 1. Guide the model to avoid hallucinations

Add guidance that tells the model not to make up a response.

### Default model

hugging_face_model_ids[2]    

'openlm-research/open_llama_3b_v2'

In [9]:
prompt_bad  = """as of 2023, who is the prime minister of UK"""

# Fixed prompt
prompt_good = """as of 2023, who is the prime minister of UK

Do not make up an answer, if you do not know the answer say "I don't know"

"""

# Change the index to try out different models
## Inference Client
# llm = InferenceClient(model=hugging_face_model_ids[2])
# llm.text_generation(prompt_good)

## HTTP Post
llm_client = hf_rest_client(hugging_face_model_ids[2])
llm_client.invoke(prompt_good)

[{'generated_text': 'as of 2023, who is the prime minister of UK\n\nDo not make up an answer, if you do not know the answer say "I don\'t know"\n\n\nA: The Prime Minister of the United Kingdom is the leader of the Conservative Party.\nThe leader of the Conservative Party is the Prime Minister.\nThe Prime Minister is the leader of the Conservative Party.\nThe leader of the Conservative Party is the Prime Minister.\nThe Prime Minister is the leader of the Conservative Party.\nThe leader of the Conservative Party is the Prime Minister.\nThe Prime Minister is the leader of the Conservative Party.\nThe leader of the Conservative Party is the Prime'}]
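
The response above echoes the full prompt before the generated answer, which makes the output hard to read. A small sketch for pulling out only the completion is shown below. `extract_completion` is a hypothetical helper, and it assumes the response has the shape shown above: a list with one dict that has a `generated_text` key.

```python
def extract_completion(response, prompt):
    """Return the generated text with the echoed prompt removed.

    Assumes `response` looks like [{'generated_text': '<prompt><completion>'}].
    """
    text = response[0]["generated_text"]
    if text.startswith(prompt):
        # Drop the echoed prompt and keep only what the model added
        text = text[len(prompt):]
    return text.strip()
```

Calling `extract_completion(llm_client.invoke(prompt_good), prompt_good)` would then return only the part after the prompt, which makes it easier to see whether the model followed the "I don't know" instruction.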

## 2. Instruct the model to play roles


### Default model

hugging_face_model_ids[3]

'google/flan-t5-xxl'

In [5]:
prompt = "question:what are large language models. answer:"

instruction_roles = [
    "you are a 5th grade science teacher answer the question below. {}",
    "you are a college level professor of computer science answer the question below. {}",
    "you are a doctorate of computer science answer the question below. {}"
]

## Inference Client
# llm = InferenceClient(model=hugging_face_model_ids[3])
# without_role = llm.text_generation(prompt, max_new_tokens=100)
# print("Without role: ", without_role)

## try different roles for LLM
# role_index = 2
# with_role = llm.text_generation(instruction_roles[role_index].format(prompt), max_new_tokens=100)
# print("With role: ", with_role)


In [7]:
## HTTP Post
llm_client = hf_rest_client(hugging_face_model_ids[3])
without_role = llm_client.invoke(prompt)
print("Without role: ", without_role)

## try different roles for LLM
role_index = 2
with_role = llm_client.invoke(instruction_roles[role_index].format(prompt))
print("With role: ", with_role)

Without role:  [{'generated_text': 'question:what are large language models. answer: Large language models are artificial intelligence (AI) systems designed to generate human-like text based on the input they receive. These models are trained on vast amounts of text data, allowing them to understand and generate language with a high degree of accuracy and fluency. They can be used for various applications, such as text completion, translation, summarization, and chatbots. Some popular large language models include BERT, GPT-3, and T5. These models are called "large"'}]
With role:  [{'generated_text': 'you are a doctorate of computer science answer the question below. question:what are large language models. answer: Large language models are artificial intelligence systems designed to generate human-like text based on the input they receive. These models are trained on vast amounts of text data, allowing them to understand and generate language with a high degree of fluency and accuracy
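
To compare roles side by side rather than changing `role_index` by hand, the prompts for all roles can be built up front. This is a minimal sketch: `build_role_prompts` is a hypothetical helper, and the templates are copied from the cell above.

```python
prompt = "question:what are large language models. answer:"

instruction_roles = [
    "you are a 5th grade science teacher answer the question below. {}",
    "you are a college level professor of computer science answer the question below. {}",
    "you are a doctorate of computer science answer the question below. {}",
]


def build_role_prompts(roles, question_prompt):
    """Fill each role template with the same question prompt."""
    return [role.format(question_prompt) for role in roles]


role_prompts = build_role_prompts(instruction_roles, prompt)
for index, role_prompt in enumerate(role_prompts):
    print(index, role_prompt)
```

Each entry in `role_prompts` can then be passed to the model in a loop, so the responses for all three roles can be read together.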

## 3. Guide the model to generate a precise response

### Default model
hugging_face_model_ids[3]

'google/flan-t5-xxl'

In [8]:
prompt_samples = [
    "how much would be the cost of 'Digital Camera DSLR'.",
    "share details of home security camera",
    "do you sell ceiling fan?",
    "share details of Robot Cleaner?",
]

with open('./data/acme-product-catalog.txt') as f:
    product_catalog = f.read()


prompt_good = """To answer the question below, use the information provided in product-catalog.
If you don't find the product in the context then say 'sorry, we don't sell it'

question: {}
product-catalog:\n {}"""

llm = InferenceClient(model=hugging_face_model_ids[3])

# Change the index to try out different prompts
prompt_index = 0

no_context_prompt = prompt_samples[prompt_index]
with_context_prompt = prompt_good.format(prompt_samples[prompt_index], product_catalog)

# print(with_context_prompt)

no_context_response = llm.text_generation(no_context_prompt)

with_context_response = llm.text_generation(with_context_prompt)




print("Response without context: ", no_context_response)
print("----------------------------------------------")
print("Response with context: ", with_context_response)

HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://api-inference.huggingface.co/models/google/flan-t5-xxl (Request ID: BxWC4AaufSlLIQeOl5yaY)

Rate limit reached. You reached free usage limit (reset hourly). Please subscribe to a plan at https://huggingface.co/pricing to use the API at this rate
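
Because the call above hit the rate limit, no response is shown for this prompt. Independent of the model call, the prompt itself can apply the "separation between parts" practice more strictly. The sketch below is one possible layout, not the course's template: `build_catalog_prompt` and the `<product-catalog>` delimiters are assumptions, and the sample catalog is made-up data for illustration only.

```python
def build_catalog_prompt(question, catalog):
    """Assemble a grounded prompt with clearly delimited sections."""
    return (
        "Answer the question using only the information inside the "
        "<product-catalog> tags.\n"
        "If the product is not listed there, say 'sorry, we don't sell it'.\n\n"
        f"<product-catalog>\n{catalog.strip()}\n</product-catalog>\n\n"
        f"question: {question.strip()}\n"
        "answer:"
    )


# Made-up catalog entries, used only to show the prompt layout
sample_catalog = "Digital Camera DSLR: $450\nRobot Cleaner: $300"
print(build_catalog_prompt("do you sell ceiling fan?", sample_catalog))
```

Putting the catalog between explicit delimiters and ending with `answer:` keeps the instruction, the trusted information, and the question visibly separate.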

## 4. Address bias

In [60]:
prompt = "for the engineering jobs we should only hire men 100% of the time, do you agree? "

prompt_with_guardrails = """You are a human resources expert. You always assist with care, respect, and truth.
Your answers promote fairness and positivity.

question: {}

answer:

"""

# Change the index to try out different models
llm = InferenceClient(model=hugging_face_model_ids[1])  

without_guardrails = llm.text_generation(prompt)

with_guardrails = llm.text_generation(prompt_with_guardrails.format(prompt))

print("Response without guardrails: ", without_guardrails)
print("----------------------------------------------")
print("Response with guardrails: ", with_guardrails)

# prompt_with_guardrails.format(prompt)

Response without guardrails:  

No, I do not agree. Hiring decisions should be based on qualifications, skills
----------------------------------------------
Response with guardrails:  I strongly disagree with the notion that engineering jobs should only be filled by men 100%
