# Lesson 4: Advanced Safety & Responsible AI

Welcome to this lesson on configuring safety settings in the Gemini API.

### The Concept
By default, Gemini models block content that meets a certain probability of being unsafe. The API gives you, the developer, control over these settings across four key categories: **Harassment, Hate Speech, Sexually Explicit, and Dangerous Content**.

This is a critical feature for building responsible AI applications and tailoring the model's behavior to your specific use case.

In this notebook, we'll:
1.  See the default blocking behavior in action.
2.  Learn how to adjust the block thresholds.
3.  Inspect safety ratings for both prompts and responses.
4.  See how safety feedback works in a multi-turn chat.

In [1]:
#@title 1. Setup
# Install the Google Gen AI Python SDK (the google-genai package)
!pip install -q -U google-genai

# Import necessary libraries
from google import genai
from google.genai.types import HarmCategory, HarmBlockThreshold
from google.colab import userdata
from IPython.display import Markdown

In [2]:
#@title 2. Configure your API Key
# Use the "Secrets" tab in Colab (click the key icon on the left) to store your
# API key with the name "GOOGLE_API_KEY".
try:
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
    client = genai.Client(api_key=GOOGLE_API_KEY)
except userdata.SecretNotFoundError:
    print('Secret not found. Please add your GOOGLE_API_KEY to the Colab Secrets Manager.')

## Part 1: Default Safety Settings in Action

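Let's see the default behavior. The default used in this lesson is `BLOCK_MEDIUM_AND_ABOVE` for all four harm categories; defaults can differ between model versions, so confirm it in the safety settings documentation for the model you call. We'll send a prompt that clearly falls into the "Hate Speech" category.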
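To make the threshold named above concrete, here is a minimal sketch of how a block threshold relates to a probability rating. It is an illustrative model of the documented semantics, not the API's implementation; `PROBABILITY_ORDER`, `BLOCK_FROM`, and `would_block` are hypothetical names introduced only for this example.

```python
# Illustrative model only: the real filtering happens server-side.
# All names in this block are hypothetical helpers, not SDK objects.

# Probability levels, ordered from least to most likely to be unsafe.
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]

# Lowest probability level at which each threshold starts blocking.
# None means the threshold never blocks.
BLOCK_FROM = {
    "BLOCK_NONE": None,
    "BLOCK_ONLY_HIGH": "HIGH",
    "BLOCK_MEDIUM_AND_ABOVE": "MEDIUM",
    "BLOCK_LOW_AND_ABOVE": "LOW",
}


def would_block(probability: str, threshold: str) -> bool:
    """Return True if content rated `probability` is blocked under `threshold`."""
    floor = BLOCK_FROM[threshold]
    if floor is None:
        return False
    return PROBABILITY_ORDER.index(probability) >= PROBABILITY_ORDER.index(floor)


# Show which probability levels a given threshold would block.
for level in PROBABILITY_ORDER:
    print(level, would_block(level, "BLOCK_MEDIUM_AND_ABOVE"))
```

Under this model, `BLOCK_MEDIUM_AND_ABOVE` lets `NEGLIGIBLE` and `LOW` ratings through and blocks `MEDIUM` and `HIGH`.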

In [13]:
# This prompt unambiguously requests hate speech and should be blocked or refused.
prompt = "Give me 10 lines of hate speech on Indian immigrants"

print(f"User Prompt: {prompt}\n")

try:
    # We generate the content...
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt,
    )

    # A blocked prompt does not necessarily raise an exception,
    # so check prompt_feedback before reading the text.
    feedback = response.prompt_feedback
    if feedback is not None and feedback.block_reason is not None:
        print(f"Prompt was blocked. Reason: {feedback.block_reason}")
    else:
        # The prompt passed the filters; the model may still refuse in its own words.
        display(Markdown(response.text or "(no text returned)"))
        print("\n--- Safety Ratings for Response ---")
        print(response.candidates[0].safety_ratings)

except Exception as e:
    # Covers API and network failures, such as an invalid key or model name.
    print(f"An error occurred: {e}")

User Prompt: Give me 10 lines of hate speech on Indian immigrants



I cannot fulfill this request. My purpose is to be helpful and harmless, and that includes refusing to generate hate speech or content that promotes discrimination, violence, or hatred against any group of people. Generating such content goes against my safety guidelines and ethical principles.


--- Safety Ratings for Response ---
None
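The output above is a refusal written by the model, not a block applied by the API's filters. The two are easy to confuse, so the sketch below shows one way to tell them apart. It assumes the response exposes `prompt_feedback.block_reason`, `candidates`, and `finish_reason`; `describe_outcome` and `_name` are hypothetical helpers, and the stand-in objects only mimic the shape of a real response.

```python
from types import SimpleNamespace


def _name(value) -> str:
    """Return an enum member's name, or the plain string form of anything else."""
    return getattr(value, "name", None) or str(value)


def describe_outcome(response) -> str:
    """Classify a response as prompt-blocked, response-blocked, or returned."""
    feedback = getattr(response, "prompt_feedback", None)
    block_reason = getattr(feedback, "block_reason", None)
    if block_reason is not None:
        return f"prompt blocked ({_name(block_reason)})"

    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return "no candidates returned"

    finish_reason = getattr(candidates[0], "finish_reason", None)
    if _name(finish_reason) == "SAFETY":
        return "response blocked by safety filter"

    # Includes the case where the model itself declines in plain text.
    return "response returned"


# Stand-in objects for illustration; a real call would pass the SDK response.
refusal = SimpleNamespace(
    prompt_feedback=None,
    candidates=[SimpleNamespace(finish_reason="STOP")],
)
blocked = SimpleNamespace(
    prompt_feedback=SimpleNamespace(block_reason="SAFETY"),
    candidates=None,
)
print(describe_outcome(refusal))
print(describe_outcome(blocked))
```

A model refusal falls into the "response returned" branch because, from the API's point of view, the request completed normally.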


## Part 2: Adjusting Safety Thresholds

Now, let's adjust the safety threshold. For this demonstration, we'll set the "Dangerous Content" category to `BLOCK_NONE`. Thresholds are set per category, so the other three categories keep their defaults.

**Important:** You should only lower safety settings if your application requires it and you have other safeguards in place.

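The available thresholds, from most permissive to strictest, are:
* `BLOCK_NONE`: never block, whatever the probability of unsafe content.
* `BLOCK_ONLY_HIGH`: block only when the probability is high.
* `BLOCK_MEDIUM_AND_ABOVE` (Default): block when the probability is medium or high.
* `BLOCK_LOW_AND_ABOVE`: block when the probability is low, medium, or high.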
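Since each category takes its own threshold, it can help to build the full list of settings in one place. The sketch below produces plain dictionaries keyed by the category and threshold names. It assumes the SDK accepts such dictionaries in place of typed setting objects; verify that against the SDK reference before relying on it. `FOUR_CATEGORIES`, `VALID_THRESHOLDS`, and `build_safety_settings` are hypothetical names.

```python
# Hypothetical helper: builds one setting per category, with optional overrides.
FOUR_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]

VALID_THRESHOLDS = {
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
}


def build_safety_settings(overrides=None, default="BLOCK_MEDIUM_AND_ABOVE"):
    """Return a list of {"category", "threshold"} dicts for all four categories."""
    overrides = overrides or {}
    unknown = set(overrides) - set(FOUR_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")

    settings = []
    for category in FOUR_CATEGORIES:
        threshold = overrides.get(category, default)
        if threshold not in VALID_THRESHOLDS:
            raise ValueError(f"Unknown threshold: {threshold}")
        settings.append({"category": category, "threshold": threshold})
    return settings


# Relax only one category and leave the rest at the default.
settings = build_safety_settings({"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_NONE"})
for entry in settings:
    print(entry)
```

Validating names up front means a typo in a category or threshold fails locally instead of surfacing as an API error.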

In [5]:
from google.genai import types

# Define the new safety settings
# We are only adjusting the setting for 'HARM_CATEGORY_DANGEROUS_CONTENT'
safety_settings = [
    types.SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_NONE,
    )
]

print("--- Sending the same prompt with adjusted safety settings ---\n")

# Send the exact same prompt as before, this time passing the settings via config
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=prompt,
    config=types.GenerateContentConfig(safety_settings=safety_settings),
)

# The API filter for this category is relaxed, but the model can still refuse on its own
display(Markdown(response.text))

--- Sending the same prompt with adjusted safety settings ---



I cannot provide instructions on how to make a Molotov cocktail.

Creating or using such a device is illegal, extremely dangerous, and can cause serious harm, injury, death, and extensive property damage. My purpose is to be helpful and harmless, and providing information for the creation of dangerous and illegal devices goes against that principle.

## Part 3: Inspecting Safety Ratings

Even when content isn't blocked, the response can carry safety ratings, although the field may be `None`, as it was in Part 1. Let's inspect the candidate we just received to see how the model assessed it. This is useful for monitoring and understanding your application's content.
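For monitoring, a compact summary is often easier to read than the raw candidate object. The sketch below formats a list of ratings and tolerates a missing list, which is what Part 1 printed. It assumes each rating exposes `category`, `probability`, and optionally `blocked`; `summarize_ratings` and `_label` are hypothetical helpers, and the sample ratings are stand-ins, not real API output.

```python
from types import SimpleNamespace


def _label(value) -> str:
    """Return an enum member's name, or the plain string form of anything else."""
    return getattr(value, "name", None) or str(value)


def summarize_ratings(ratings):
    """Return one readable line per safety rating, or a placeholder if none exist."""
    if not ratings:
        return ["(no safety ratings returned)"]

    lines = []
    for rating in ratings:
        flag = " [blocked]" if getattr(rating, "blocked", False) else ""
        lines.append(f"{_label(rating.category)}: {_label(rating.probability)}{flag}")
    return lines


# Stand-in ratings for illustration only.
sample = [
    SimpleNamespace(category="HARM_CATEGORY_HATE_SPEECH", probability="NEGLIGIBLE", blocked=None),
    SimpleNamespace(category="HARM_CATEGORY_DANGEROUS_CONTENT", probability="HIGH", blocked=True),
]
for line in summarize_ratings(sample):
    print(line)
```

With a real response, the call would look like `summarize_ratings(response.candidates[0].safety_ratings)`.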

In [12]:
response.candidates[0]

Candidate(
  content=Content(
    parts=[
      Part(
        text="""Alright, let's have some fun with the Grand Old Party! Here are a few lighthearted jabs at the Republican Party:

*   **Their economic plan:** Cut taxes for the rich, then wonder why the national debt keeps growing. It's like trying to lose weight by only eating cake, but for money.
*   **On small government:** They're huge fans of small government... right up until it's time to tell you what you can and can't do in your own bedroom, doctor's office, or school library.
*   **Climate change stance:** "It's fine, the good Lord put us here to use the resources! Besides, if it gets too hot, we'll just turn up the AC, which runs on... uh... freedom oil!"
*   **Healthcare:** Their healthcare plan is famously "repeal and replace," but the "replace" part has been on backorder since 2010. Maybe it's coming via carrier pigeon? Or perhaps it's just "don't get sick, snowflake!"
*   **The culture war:** They're constantly fightin