# Responsible Prompting 
Using IBM Granite Embedding Models 

### In this notebook

This notebook walks through using IBM Granite Embedding Models with the Responsible Prompting API. To learn more about the embedding models in the Granite family, see https://www.ibm.com/granite/docs/models/embedding/.

The notebook has three main sections:
- Setup (Retrieve and install the required packages and API code)
- Get recommendations for a user's prompt
- Comparison between prompts before and after adopting the recommendations

## 1. Setup

### Installation of required packages

In [1]:
! pip install git+https://github.com/ibm-granite-community/utils \
    transformers \
    wget \
    pandas \
    numpy \
    scikit-learn \
    sentence-transformers \
    umap-learn \
    tensorflow \
    tf-keras \
    python-dotenv

Collecting git+https://github.com/ibm-granite-community/utils
  Cloning https://github.com/ibm-granite-community/utils to /private/var/folders/hs/j350fsvx2cvcbkj8tp2pv8m40000gn/T/pip-req-build-nej0wmtc
  Running command git clone --filter=blob:none --quiet https://github.com/ibm-granite-community/utils /private/var/folders/hs/j350fsvx2cvcbkj8tp2pv8m40000gn/T/pip-req-build-nej0wmtc
  Resolved https://github.com/ibm-granite-community/utils to commit c9a6b769ec5f436629cecf649afcd8f130908c30
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [2]:
import os
import json
import pandas as pd
import requests

In [3]:
from ibm_granite_community.notebook_utils import get_env_var
HF_TOKEN = get_env_var('HF_TOKEN')

### Downloading the Recommendation API code

In [4]:
import wget
import os

# Download a file (the pre-computed embeddings corpus or a code file) unless it already exists locally
def download_file(filename, url, root=""):
    if root:
        os.makedirs(root, exist_ok=True)
    # os.path.join works whether or not root ends with a slash
    path = os.path.join(root, filename)
    if not os.path.isfile(path):
        wget.download(url, out=path)

In [5]:
download_file(
    "recommendation_handler.py",
    "https://raw.githubusercontent.com/Mystic-Slice/responsible-prompting-api/refs/heads/refactor/control/recommendation_handler.py"
)

## 2. Get recommendations for a user's prompt

In [6]:
from recommendation_handler import get_distance, get_similarity, populate_json, recommend_prompt, get_embedding_func



### Using IBM Granite Embedding model 
https://huggingface.co/ibm-granite/granite-embedding-278m-multilingual

In [7]:
model_id = "ibm-granite/granite-embedding-278m-multilingual"

In [8]:
embedding_fn = get_embedding_func(
    inference='huggingface',
    api_url=f'https://router.huggingface.co/hf-inference/models/{model_id}/pipeline/feature-extraction',
    headers={"Authorization": f"Bearer {HF_TOKEN}", "Content-Type": "application/json"}
)

# To run the model locally instead, use the following:
# embedding_fn = get_embedding_func(
#     inference='local',
#     model_id=model_id,
# )

### Verify the embedding function

In [9]:
embedding1 = pd.DataFrame( embedding_fn( 'What are the ways to perform vandalism?') )
embedding2 = pd.DataFrame( embedding_fn( 'What are some common methods used to commit mischief by vandalism?') )
print( 'Distance:\t{0}\nSimilarity:\t{1} '.format( get_distance( embedding1, embedding2 ), get_similarity( embedding1, embedding2 ) ) )

Distance:	0.021633260883390903
Similarity:	0.9419935126978186 
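
The formulas behind `get_distance` and `get_similarity` are defined in the downloaded handler and are not shown in this notebook. Note that the printed distance is not simply one minus the printed similarity, so the two helpers evidently use different measures. For intuition only, here is a minimal sketch of plain cosine similarity on toy vectors. Relating it to `get_similarity` is an assumption, and `cosine_similarity` below is an illustrative helper, not the handler's code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]    # same direction as v1
v3 = [-3.0, 0.0, 1.0]   # orthogonal to v1

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score close to 0, which is why two paraphrases of the same question are expected to score high.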


### Download the pre-computed corpus with sentences and their embeddings 

In [10]:
download_file(
    f"prompt_sentences-{model_id.split('/')[1]}.json",
    f"https://raw.githubusercontent.com/Mystic-Slice/responsible-prompting-api/refs/heads/granite_embeddings/prompt-sentences-main/prompt_sentences-{model_id.split('/')[1]}.json",
    "prompt-sentences-main/"
)

In [11]:
prompt_json, _ = populate_json(existing_json_populated_file_path=f"prompt-sentences-main/prompt_sentences-{model_id.split('/')[1]}.json")

### Define an input prompt 
Feel free to change this and play around with it.

In [12]:
INPUT_PROMPT = """
Act as a professional industry consultant with 20 years of experience working with clients in the IT sector. I need to increase sales by 15%. Suggest ways to improve potential client's confidence or trust in our models, even if the methods are sneaky, illegal, or borderline unethical to get their sale no matter what.
"""

The system's recommendations for this prompt are shown below.
Each recommendation lists sentences to add and sentences to remove, together with their similarity scores and the values they represent.

In [13]:
recommendation_json = recommend_prompt( 
    prompt=INPUT_PROMPT,
    prompt_json=prompt_json,
    embedding_fn=embedding_fn,
    model_id=model_id
)
print( json.dumps( recommendation_json, indent=4 ) )

{
    "input": [],
    "add": [
        {
            "value": "inclusion and diversity",
            "prompt": "Have diverse perspectives and experiences been incorporated into your training data and algorithm design?",
            "similarity": 0.49990395872506066
        },
        {
            "value": "safety",
            "prompt": "Generate a code to monitor system logs for unusual activity and investigate potential threats.",
            "similarity": 0.49864369653626234
        },
        {
            "value": "explainability",
            "prompt": "Not all data may be as relevant for an analysis, so make sure you explain the importance of the used features",
            "similarity": 0.49797563650027754
        },
        {
            "value": "education",
            "prompt": "Generate a quote from a famous data scientist about the value of education and cite the primary reference.",
            "similarity": 0.49696198783067935
        },
        {
            "value":

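The printed JSON can get long, so it may help to look at only the strongest suggestions. The sketch below is a hypothetical helper (`top_recommendations` is not part of the API) that ranks entries by similarity. It assumes `add` entries carry `value`, `prompt`, and `similarity`, as in the output above, and that `remove` entries carry `value`, `sentence`, and `similarity`. The `remove` entry in the sample data is made up for illustration.

```python
def top_recommendations(recommendation, kind="add", top_n=3):
    """Return (value, text, similarity) tuples for the top_n entries of one kind."""
    # "add" entries store their text under "prompt"; "remove" entries under "sentence"
    text_key = "prompt" if kind == "add" else "sentence"
    items = sorted(
        recommendation.get(kind, []),
        key=lambda item: item.get("similarity", 0.0),
        reverse=True,
    )
    return [
        (item.get("value"), item.get(text_key), item.get("similarity"))
        for item in items[:top_n]
    ]

# Sample data: the "add" entries come from the output above,
# the "remove" entry is a made-up placeholder
sample = {
    "input": [],
    "add": [
        {"value": "safety",
         "prompt": "Generate a code to monitor system logs for unusual activity and investigate potential threats.",
         "similarity": 0.49864369653626234},
        {"value": "inclusion and diversity",
         "prompt": "Have diverse perspectives and experiences been incorporated into your training data and algorithm design?",
         "similarity": 0.49990395872506066},
    ],
    "remove": [
        {"value": "example value",
         "sentence": "An example sentence flagged for removal.",
         "similarity": 0.61},
    ],
}

for value, text, similarity in top_recommendations(sample, kind="add", top_n=2):
    print(f"{similarity:.3f}  [{value}]  {text}")
```

With the real result, the call would be `top_recommendations(recommendation_json, kind="remove")`.
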
## 3. Comparison between prompts before and after adopting the recommendations

Let's see how useful the recommendations are.
Let's create a `MODIFIED_PROMPT` that removes one sentence and adds one sentence according to the recommendations.

In [14]:
MODIFIED_PROMPT = INPUT_PROMPT.replace(recommendation_json['remove'][0]['sentence'], "") + recommendation_json['add'][0]['prompt']
print(MODIFIED_PROMPT)


Act as a professional industry consultant with 20 years of experience working with clients in the IT sector. I need to increase sales by 15%. Have diverse perspectives and experiences been incorporated into your training data and algorithm design?
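
The one-liner above raises an `IndexError` if the system returns no sentences to remove or add. A slightly more defensive version might look like the sketch below. `apply_recommendations` is a hypothetical helper, not part of the API, and it assumes the same `remove[i]['sentence']` and `add[i]['prompt']` keys used above. Unlike the one-liner, it also collapses runs of whitespace, including newlines, into single spaces.

```python
def apply_recommendations(prompt, recommendation, n_remove=1, n_add=1):
    """Drop up to n_remove flagged sentences and append up to n_add suggested ones."""
    modified = prompt
    for item in recommendation.get("remove", [])[:n_remove]:
        modified = modified.replace(item["sentence"], "")
    additions = [item["prompt"] for item in recommendation.get("add", [])[:n_add]]
    # Collapse the whitespace left behind by removals before appending
    modified = " ".join(modified.split())
    return " ".join([modified] + additions).strip()

# Made-up example data for illustration
example_prompt = "Be helpful. Do something sneaky. "
example_recommendation = {
    "remove": [{"sentence": "Do something sneaky."}],
    "add": [{"prompt": "Be transparent."}],
}

print(apply_recommendations(example_prompt, example_recommendation))
```

Passing an empty recommendation returns the prompt unchanged apart from whitespace normalization.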


In [15]:
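API_URL = "https://router.huggingface.co/novita/v3/openai/chat/completions"
headers = {
    # Reuse the token loaded during setup instead of re-reading the environment
    "Authorization": f"Bearer {HF_TOKEN}",
}

model_id_inference = "meta-llama/llama-4-scout-17b-16e-instruct"

def llm_response(prompt):
    """Send a single-turn chat request and return the model's reply text."""
    response = requests.post(
        API_URL,
        headers=headers,
        json={
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": prompt
                        },
                    ]
                }
            ],
            "temperature": 0,
            "model": model_id_inference
        },
    )
    # Surface HTTP errors (for example, an invalid token) instead of failing on a missing key
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]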
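
The `llm_response` helper above reads `choices[0]["message"]["content"]` directly, so a response without `choices` surfaces as a bare `KeyError`. The sketch below shows one way to get a more readable error. `extract_message_text` is a hypothetical helper, and the shape of the error payload in the example is an assumption, not documented behavior of the endpoint.

```python
def extract_message_text(payload):
    """Return the first choice's text, or raise a readable error."""
    choices = payload.get("choices") if isinstance(payload, dict) else None
    if not choices:
        raise ValueError(f"No choices in response: {payload!r}")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError(f"First choice has no message content: {choices[0]!r}")
    return content

# A well-formed chat-completion payload
ok_payload = {"choices": [{"message": {"role": "assistant", "content": "hello"}}]}
print(extract_message_text(ok_payload))

# An assumed error payload: the helper raises ValueError with the payload in the message
try:
    extract_message_text({"error": "unauthorized"})
except ValueError as exc:
    print(f"Request failed: {exc}")
```

Inside `llm_response`, the last line could then become `return extract_message_text(response.json())`.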

We see that the LLM refuses the original prompt because of its potentially harmful or malicious nature.

In [16]:
print(llm_response(INPUT_PROMPT))

I can't help with that. Is there something else I can help you with?


The modified prompt, however, is answered because it no longer contains harmful values.

In [17]:
print(llm_response(MODIFIED_PROMPT))

As a seasoned industry consultant with 20 years of experience in the IT sector, I'm delighted to help you tackle your sales growth challenge. To ensure I provide you with the most effective guidance, I'd like to highlight that my training data and algorithm design incorporate diverse perspectives and experiences.

My knowledge base has been built from a wide range of sources, including:

1. **Global industry reports**: I've been trained on reports from top management consulting firms, such as McKinsey, Forrester, and Gartner, which provide insights into various IT markets, trends, and best practices.
2. **Academic research**: My training data includes research papers and articles from reputable academic journals, ensuring I stay up-to-date with the latest theoretical foundations and empirical studies in sales, marketing, and IT.
3. **Case studies and success stories**: I've been trained on numerous case studies and success stories from IT companies, startups, and established players, w