##### Copyright 2024 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Safety Quickstart

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Safety.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

The Gemini API has adjustable safety settings. This notebook walks you through how to use them. You'll write a prompt that's blocked, see the reason why, and then adjust the filters to unblock it.

Safety is an important topic, and you can learn more with the links at the end of this notebook. Here, you will focus on the code.

In [1]:
!pip install -q -U google-generativeai # Install the Python SDK

## Import the Gemini Python SDK

After the installation completes, import the Gemini SDK:

In [2]:
import google.generativeai as genai

## Set up your API key

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example.

In [3]:
from google.colab import userdata
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)

## Send your prompt request to Gemini

Pick the prompt you want to use to test the safety filter settings. One example is `Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark`, which was previously tested and triggered the `HARM_CATEGORY_HARASSMENT` and `HARM_CATEGORY_DANGEROUS_CONTENT` categories.

The result returned by the [GenerativeModel.generate_content](https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content) method is a [genai.types.GenerateContentResponse](https://ai.google.dev/api/python/google/generativeai/types/GenerateContentResponse).

In [4]:
model = genai.GenerativeModel('gemini-1.5-flash')

unsafe_prompt = "I support Martians Soccer Club and I think Jupiterians Football Club sucks! Write a ironic phrase about them."
response = model.generate_content(unsafe_prompt)

This response object gives you safety feedback about the candidate answers Gemini generates for you.

For each candidate in `response.candidates`, you need to check `candidate.finish_reason`.

As described in the [Gemini API safety filters documentation](https://ai.google.dev/gemini-api/docs/safety-settings#safety-feedback):
- if `candidate.finish_reason` is `FinishReason.STOP`, your generation request ran successfully
- if `candidate.finish_reason` is `FinishReason.SAFETY`, your generation request was blocked for safety reasons. It also means that `response.text` has no content to return.
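The two cases above can be wrapped in a small helper. The following is a minimal sketch, not part of the SDK: `summarize_finish_reasons` is a hypothetical name, and `FakeFinishReason` is a stand-in that only mimics the `finish_reason.name` attribute so the snippet runs without an API call. With a real response you would pass `response.candidates` instead.

```python
from enum import Enum
from types import SimpleNamespace


class FakeFinishReason(Enum):
    """Stand-in for the SDK's FinishReason enum (assumption: only .name is used)."""
    STOP = 1
    SAFETY = 3


def summarize_finish_reasons(candidates):
    """Return one status string per candidate, based on finish_reason.name."""
    summary = []
    for index, candidate in enumerate(candidates):
        reason = candidate.finish_reason.name
        if reason == "STOP":
            status = "completed"
        elif reason == "SAFETY":
            status = "blocked by safety filters"
        else:
            # Any other finish reason is reported by name, unchanged.
            status = f"ended with {reason}"
        summary.append(f"candidate {index}: {status}")
    return summary


# Hypothetical candidates: the first blocked, the second successful.
fake_candidates = [
    SimpleNamespace(finish_reason=FakeFinishReason.SAFETY),
    SimpleNamespace(finish_reason=FakeFinishReason.STOP),
]
for line in summarize_finish_reasons(fake_candidates):
    print(line)
```

Comparing on the enum's `name` keeps the helper independent of how the SDK package is imported, at the cost of relying on string matches.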

In [5]:
print(response.candidates[0].finish_reason)

FinishReason.SAFETY


If the `finish_reason` is `FinishReason.SAFETY`, you can find out which filter caused the block by checking the candidate's `safety_ratings` list:

In [6]:
print(response.candidates[0].safety_ratings)

[category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
, category: HARM_CATEGORY_HATE_SPEECH
probability: LOW
, category: HARM_CATEGORY_HARASSMENT
probability: MEDIUM
, category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
]
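To pick out the categories that most likely caused a block, you can filter the ratings by probability. The sketch below is illustrative only: `ratings_at_or_above` and `fake_rating` are hypothetical helpers, the stand-in objects mimic just the `category.name` and `probability.name` attributes, and the default level of `MEDIUM` is an assumption chosen to match the output above rather than a statement about the API's default threshold.

```python
from types import SimpleNamespace

# Harm probability levels in increasing order, as they appear in the output above.
PROBABILITY_RANK = {"NEGLIGIBLE": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3}


def ratings_at_or_above(safety_ratings, level="MEDIUM"):
    """Return the category names whose probability is at least `level`."""
    minimum = PROBABILITY_RANK[level]
    flagged = []
    for rating in safety_ratings:
        # Unknown probability names rank below every known level and are skipped.
        rank = PROBABILITY_RANK.get(rating.probability.name, -1)
        if rank >= minimum:
            flagged.append(rating.category.name)
    return flagged


def fake_rating(category, probability):
    """Build a stand-in rating with the two attributes the helper reads."""
    return SimpleNamespace(
        category=SimpleNamespace(name=category),
        probability=SimpleNamespace(name=probability),
    )


# Same categories and probabilities as the printed safety_ratings above.
sample_ratings = [
    fake_rating("HARM_CATEGORY_SEXUALLY_EXPLICIT", "NEGLIGIBLE"),
    fake_rating("HARM_CATEGORY_HATE_SPEECH", "LOW"),
    fake_rating("HARM_CATEGORY_HARASSMENT", "MEDIUM"),
    fake_rating("HARM_CATEGORY_DANGEROUS_CONTENT", "NEGLIGIBLE"),
]
print(ratings_at_or_above(sample_ratings))
```

With a real response, you would pass `response.candidates[0].safety_ratings` in place of `sample_ratings`.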


Because the safety filters blocked the request, the model generated nothing, and accessing `response.text` raises an error instead of returning text:

In [7]:
try:
    print(response.text)
except ValueError:
    # response.text raises ValueError when the candidate contains no text.
    print("No information generated by the model.")

No information generated by the model.
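If you read `response.text` in several places, the same guard can live in one function. This is a sketch under a stated assumption: that a blocked response raises `ValueError` when `.text` is accessed. `text_or_default` and the two `Fake...` classes are hypothetical names used so the snippet runs without an API call.

```python
class FakeBlockedResponse:
    """Stand-in for a blocked response (assumption: .text raises ValueError)."""

    @property
    def text(self):
        raise ValueError("the response contains no text")


class FakeOkResponse:
    """Stand-in for a successful response with generated text."""
    text = "Hello from the model."


def text_or_default(response, default="No information generated by the model."):
    """Return response.text, or `default` when no text is available."""
    try:
        return response.text
    except ValueError:
        return default


print(text_or_default(FakeBlockedResponse()))
print(text_or_default(FakeOkResponse()))
```

Catching only `ValueError` lets unrelated failures, such as network errors, surface instead of being silently replaced by the fallback message.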


## Customizing safety settings

Depending on your scenario, you may need to customize the safety filters to allow results that the default settings would block.

To do so, pass a `safety_settings` dictionary as part of your `model.generate_content()` request. In the example below, every filter is set to not block content.

**Important:** In keeping with Google's commitment to Responsible AI development and its [AI Principles](https://ai.google/responsibility/principles/), Gemini will decline to generate results for some prompts even if you set all the filters to `BLOCK_NONE`.

In [8]:
response = model.generate_content(
    unsafe_prompt,
    safety_settings={
        'HATE': 'BLOCK_NONE',
        'HARASSMENT': 'BLOCK_NONE',
        'SEXUAL' : 'BLOCK_NONE',
        'DANGEROUS' : 'BLOCK_NONE'
    })

Check `candidate.finish_reason` again. If the request was not too unsafe, it should now be `FinishReason.STOP`, which means that Gemini processed the request successfully.

In [9]:
print(response.candidates[0].finish_reason)

FinishReason.STOP


Since the request was generated successfully, you can read the result in `response.text`:

In [10]:
try:
    print(response.text)
except ValueError:
    # response.text raises ValueError when the candidate contains no text.
    print("No information generated by the model.")

"Jupiterians Football Club: Proof that even on a gas giant, you can still be a flat tire." 



If you check the safety ratings, you will see that they are still reported for each category, but because every filter was set to `BLOCK_NONE`, none of them blocked the response:

In [11]:
print(response.candidates[0].safety_ratings)

[category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
, category: HARM_CATEGORY_HATE_SPEECH
probability: LOW
, category: HARM_CATEGORY_HARASSMENT
probability: MEDIUM
, category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
]


## Learning more

Learn more with these articles on [safety guidance](https://ai.google.dev/docs/safety_guidance) and [safety settings](https://ai.google.dev/docs/safety_setting_gemini).

## Useful API references

There are 4 configurable safety settings for the Gemini API:
* `HARM_CATEGORY_DANGEROUS_CONTENT`
* `HARM_CATEGORY_HARASSMENT`
* `HARM_CATEGORY_SEXUALLY_EXPLICIT`
* `HARM_CATEGORY_HATE_SPEECH`

You can refer to the harm categories either by their full names or by aliases such as `DANGEROUS`, as in the Python code above.

- Safety settings can be set in the [genai.GenerativeModel](https://ai.google.dev/api/python/google/generativeai/GenerativeModel) constructor.

- They can also be passed on each request to [GenerativeModel.generate_content](https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content) or [ChatSession.send_message](https://ai.google.dev/api/python/google/generativeai/ChatSession?hl=en#send_message).

- The [genai.protos.GenerateContentResponse](https://ai.google.dev/api/python/google/generativeai/protos/GenerateContentResponse) returns [SafetyRatings](https://ai.google.dev/api/python/google/generativeai/protos/SafetyRating) for the prompt in the [GenerateContentResponse.prompt_feedback](https://ai.google.dev/api/python/google/generativeai/protos/GenerateContentResponse/PromptFeedback), and for each [Candidate](https://ai.google.dev/api/python/google/generativeai/protos/Candidate) in the `safety_ratings` attribute.

- A [genai.protos.SafetySetting](https://ai.google.dev/api/python/google/generativeai/protos/SafetySetting) contains a [genai.protos.HarmCategory](https://ai.google.dev/api/python/google/generativeai/protos/HarmCategory) and a [genai.protos.HarmBlockThreshold](https://ai.google.dev/api/python/google/generativeai/types/HarmBlockThreshold).

- A [genai.protos.SafetyRating](https://ai.google.dev/api/python/google/generativeai/protos/SafetyRating) contains a [HarmCategory](https://ai.google.dev/api/python/google/generativeai/protos/HarmCategory) and a [HarmProbability](https://ai.google.dev/api/python/google/generativeai/types/HarmProbability).

- The [genai.protos.HarmCategory](https://ai.google.dev/api/python/google/generativeai/protos/HarmCategory) enum includes the categories for both PaLM and Gemini models.

- When specifying enum values, the SDK accepts the enum values themselves, or their integer or string representations.

- The SDK also accepts abbreviated string representations: `["HARM_CATEGORY_DANGEROUS_CONTENT", "DANGEROUS_CONTENT", "DANGEROUS"]` are all valid. Strings are case insensitive.
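Because the SDK accepts string representations, a settings dictionary can be built and validated in plain Python before it is sent. The following is a rough sketch rather than an SDK feature: `build_safety_settings` is a hypothetical helper, and the threshold names other than `BLOCK_NONE` do not appear in this notebook; they are taken from the `HarmBlockThreshold` enum, so check the reference linked above before relying on them.

```python
# The four configurable Gemini harm categories listed above.
HARM_CATEGORIES = [
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_HATE_SPEECH",
]

# Threshold names assumed from the HarmBlockThreshold enum.
BLOCK_THRESHOLDS = {
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_NONE",
}


def build_safety_settings(threshold, overrides=None):
    """Map every category to `threshold`, then apply per-category overrides."""
    settings = {category: threshold for category in HARM_CATEGORIES}
    settings.update(overrides or {})
    for category, value in settings.items():
        if category not in HARM_CATEGORIES:
            raise ValueError(f"Unknown harm category: {category}")
        if value not in BLOCK_THRESHOLDS:
            raise ValueError(f"Unknown block threshold: {value}")
    return settings


# Example: a moderate default, with a looser threshold for harassment only.
settings = build_safety_settings(
    "BLOCK_MEDIUM_AND_ABOVE",
    overrides={"HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH"},
)
print(settings)
```

The resulting dictionary has the same shape as the one passed earlier, so it could be supplied as `safety_settings=settings` in a `model.generate_content()` call. Validating the names locally surfaces typos before a request is made.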

In [15]:
response = model.generate_content('Teach me about how an LLM works')

print(response.text)

## How LLMs Work: A Simplified Explanation

Large Language Models (LLMs) are complex, but their core functioning can be understood with a simplified explanation:

**1. Training:**

* **Data:** LLMs are trained on massive datasets of text and code, containing everything from books and articles to social media posts and code repositories.
* **Neural Networks:** LLMs use a type of artificial neural network called a transformer, which excels at understanding relationships between words in a sentence.
* **Learning:** During training, the LLM learns to predict the next word in a sequence based on the preceding words. This process allows the model to understand grammar, semantics, and the nuances of language.

**2. Understanding Input:**

* **Tokenization:** When you provide input, the LLM first breaks it down into individual tokens (words or parts of words).
* **Embedding:** Each token is then represented by a vector, a set of numbers that captures its meaning and relationship to other words

In [16]:
response = model.generate_content('How big is softmax vector of Gemini which generate the token at last step of the the generation')

print(response.text)

Unfortunately, there is no publicly available information on the exact size of the softmax vector used by Gemini for its final token generation.  Here's why:

* **Proprietary Information:** Google keeps the inner workings of Gemini, including details like the softmax vector size, as proprietary information.
* **Model Complexity:** Gemini is a highly complex model, and the size of the softmax vector likely depends on factors like:
    * The number of tokens in its vocabulary.
    * The specific architecture of the model (e.g., number of layers, hidden size).
    * Any special adaptations made to the model's output. 

**What we can infer:**

* **Large Vocabulary:** Since Gemini is a powerful language model, it likely has a large vocabulary, potentially containing tens of thousands or even millions of tokens.  This would imply a large softmax vector.
* **Dynamic Sizing:**  It's possible that the model doesn't use a fixed-size softmax vector but dynamically adjusts its size based on the co

In [17]:
response = model.generate_content('How all possible tokens in the llama3, which are used by softmax layer to generate text')

print(response.text)

It's impossible to list *all* possible tokens used by a language model like Llama 3. Here's why:

* **Vast Number:** The vocabulary size of a large language model can be enormous, potentially in the millions or even billions of tokens. 
* **Dynamic Vocabulary:** Some models don't have a fixed, pre-defined vocabulary. They may learn new tokens on the fly based on the data they are trained on.
* **Tokenization:** Tokens aren't always single words. They can be subword units (like "un-", "ing", "able"), characters, or even combinations of these.

**What we can tell you:**

* **Tokenization:** Llama 3 likely uses a tokenizer like BPE (Byte Pair Encoding) to break down text into smaller units. This allows the model to handle a wider range of words and phrases, including those not seen during training.
* **Softmax Layer:** The softmax layer in a language model takes the output of the model's hidden layers and converts it into a probability distribution over all possible tokens. This distribut