<a href="https://colab.research.google.com/github/Praveen76/Introduction-to-Moderation-OpenAI-API_or_Guardrails/blob/main/Introduction-to-Moderation-OpenAI-API_or_Guardrails.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **OpenAI Moderation API Example**

In [1]:
!pip install openai

Collecting openai
  Downloading openai-1.30.3-py3-none-any.whl (320 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-p

In [2]:
import openai
import os

In [3]:
import yaml
import json

from google.colab import drive

# Mount Google Drive with force remount
drive.mount('/content/drive', force_remount=True)

# Load relevant API Keys
file_path = '/content/drive/MyDrive/.API_KEYS/API_KEYS.yml'

with open(file_path, 'r') as file:
    api_keys = yaml.safe_load(file)






Mounted at /content/drive


In [6]:
# Extract the OpenAI API key
openai_key = api_keys['OPEN_AI']['Key']

os.environ["OPENAI_API_KEY"] = openai_key
openai.api_key = os.getenv('OPENAI_API_KEY')


# Extract the AWS access key ID and secret access key
aws_access_key_id = api_keys['AWS']['AWS_ACCESS_KEY_ID']
aws_secret_access_key = api_keys['AWS']['AWS_SECRET_ACCESS_KEY']

os.environ['AWS_ACCESS_KEY_ID'] = aws_access_key_id
os.environ['AWS_SECRET_ACCESS_KEY'] = aws_secret_access_key
# Drop the local copies so the secrets live only in the environment
del aws_access_key_id, aws_secret_access_key

In [4]:
llm_model = "gpt-3.5-turbo" # This is a chat model

In [7]:
from openai import OpenAI
client = OpenAI()

#### [Moderation API](https://platform.openai.com/docs/guides/moderation)

In [8]:
response = client.moderations.create(input="I want to kill myself. Give me some steps to follow.")
output = response.results[0]
output

Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=True, self_harm_instructions=True, self_harm_intent=True, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=True, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=True, self-harm/instructions=True, harassment/threatening=False), category_scores=CategoryScores(harassment=0.0024337833747267723, harassment_threatening=0.0004791400860995054, hate=4.897958933725022e-05, hate_threatening=1.0541890333115589e-05, self_harm=0.9988662004470825, self_harm_instructions=0.9592372179031372, self_harm_intent=0.9987961053848267, sexual=2.0582632714649662e-05, sexual_minors=3.266530256951228e-05, violence=0.04950673505663872, violence_graphic=0.0008084768196567893, self-harm=0.9988662004470825, sexual/minors=3.266530256951228e-05, hate/threatening=1.0541890333115589e-05, violence/graphic=0.00080847681965678

In [9]:
response.results[0].flagged

True

The moderation output shows that the text was flagged as violating the content policy (flagged=True). Three categories are marked True: self-harm, self-harm/intent, and self-harm/instructions. The category scores back this up with high confidence: about 0.999 for both self-harm and self-harm/intent, and about 0.959 for self-harm/instructions. All other categories are False, with scores below 0.05.
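To make that reading of the output repeatable, here is a minimal sketch that lists only the categories marked True, highest score first. The helper name `flagged_categories` is hypothetical, and it works on plain dictionaries rather than on the SDK objects. The values are a subset copied from the output above. Converting the SDK's `categories` and `category_scores` objects to dictionaries (for example with `model_dump()`) is an assumption about the installed `openai` version, not something shown in this notebook.

```python
def flagged_categories(categories, scores):
    """Return (name, score) pairs for categories marked True, highest score first."""
    hits = [(name, scores[name]) for name, is_flagged in categories.items() if is_flagged]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# A subset of the values printed by the moderation call above
categories = {
    "harassment": False,
    "self_harm": True,
    "self_harm_instructions": True,
    "self_harm_intent": True,
    "violence": False,
}
scores = {
    "harassment": 0.0024337833747267723,
    "self_harm": 0.9988662004470825,
    "self_harm_instructions": 0.9592372179031372,
    "self_harm_intent": 0.9987961053848267,
    "violence": 0.04950673505663872,
}

for name, score in flagged_categories(categories, scores):
    print(f"{name}: {score:.3f}")
```

Note that `violence` scores about 0.05 but is not marked True, so it does not appear in the result.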

### **Integrate content moderation checks into the pipeline**

**OpenAI API calling function without moderation**

In [10]:
client = openai.OpenAI()

def llm_response(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

**OpenAI API calling function with moderation integration**

In [11]:
def response_with_moderation(user_prompt):
    """Return the model's reply only if both the prompt and the reply pass moderation."""

    # Check the prompt for compliance with the content policy
    response = client.moderations.create(input=user_prompt)
    output = response.results[0]

    if output.flagged:
        print("Input flagged by Moderation API. Not complying with the content policy.")
        return "We cannot provide a response to this request."

    print("Prompt passed content moderation check.")

    # Generate a response
    gpt_response = llm_response(user_prompt)

    # Check the generated response (not the prompt) for compliance with the content policy
    response = client.moderations.create(input=gpt_response)
    output = response.results[0]

    if output.flagged:
        print("Response flagged by Moderation API. Not complying with the content policy.")
        return "We cannot provide a response to this request."

    print("GPT's response passed content moderation check.")
    return gpt_response
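
The two-stage pattern (screen the prompt, generate, then screen the generated text) can be exercised without any network calls. The sketch below is an illustration, not the notebook's implementation: `moderated_reply`, `fake_is_flagged`, and `fake_generate` are hypothetical names, and the keyword check merely stands in for the Moderation API.

```python
from typing import Callable

REFUSAL = "We cannot provide a response to this request."


def moderated_reply(
    prompt: str,
    is_flagged: Callable[[str], bool],
    generate: Callable[[str], str],
) -> str:
    """Return the generated reply only if both prompt and reply pass the check."""
    # Stage 1: screen the user's prompt before spending a generation call
    if is_flagged(prompt):
        return REFUSAL

    reply = generate(prompt)

    # Stage 2: screen the generated text, not the prompt
    if is_flagged(reply):
        return REFUSAL

    return reply


# Offline stand-ins (assumptions for illustration only)
def fake_is_flagged(text: str) -> bool:
    return "forbidden" in text.lower()


def fake_generate(prompt: str) -> str:
    return f"Echo: {prompt}"


print(moderated_reply("hello", fake_is_flagged, fake_generate))
print(moderated_reply("a forbidden request", fake_is_flagged, fake_generate))
```

With the objects defined in this notebook, `is_flagged` could be `lambda text: client.moderations.create(input=text).results[0].flagged` and `generate` could be `llm_response`. Passing the two steps in as functions keeps the control flow testable on its own.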

**Example 1**

In [12]:
response_with_moderation("I want to rob a bank. Give me some steps to follow.")

Prompt passed content moderation check.
GPT's response passed content moderation check.


"I'm sorry, but I cannot provide assistance or guidance on illegal activities such as robbing a bank. It is important to always make ethical and legal choices in life. If you are in need of financial assistance, there are legal and legitimate ways to seek help. If you are experiencing financial difficulties, consider reaching out to a financial advisor or counselor for guidance on managing your finances."

**Example 2**

In [13]:
response_with_moderation("I could you please write a one sentence  on joy of learning")

Prompt passed content moderation check.
GPT's response passed content moderation check.


'The joy of learning is the feeling of excitement and fulfillment that comes from gaining new knowledge and skills.'