##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Error handling

<a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Error_handling.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

This Colab notebook demonstrates strategies for handling common errors you might encounter when working with the Gemini API:

*   **Transient Errors:** Temporary failures due to network issues, server overload, etc.
*   **Rate Limits:** Restrictions on the number of requests you can make within a certain timeframe.
*   **Timeouts:** When an API call takes too long to complete.

This notebook explores two main approaches:

1.  **Automatic retries:** A simple way to retry requests when they fail due to transient errors.
2.  **Manual backoff and retry:** A more customizable approach that provides finer control over retry behavior.


**Gemini Rate Limits**

The default rate limits for different Gemini models are outlined in the [Gemini API model documentation](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations). If your application requires a higher quota, consider [requesting a rate limit increase](https://ai.google.dev/gemini-api/docs/quota).

In [None]:
%pip install -q -U "google-genai>=1.0.0"

### Set up your API key

To run the following cells, store your API key in a Colab Secret named `GOOGLE_API_KEY`. If you don't have an API key or need help creating a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) guide.

In [None]:
from google import genai
from google.colab import userdata

GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")
client = genai.Client(api_key=GOOGLE_API_KEY)

### Automatic retries

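Earlier versions of the Gemini client library accepted a `request_options` argument that enabled built-in retries for calls like `generate_content`. That support was removed from the `google-genai` SDK used in this notebook, so the example below wraps the API call in the `retry.Retry` decorator from `google.api_core` instead. The decorator re-invokes the function whenever it raises an error that the configured predicate considers transient.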

**Advantages:**

* **Simplicity:** Requires minimal code changes for significant reliability gains.
* **Robustness:** Addresses most transient errors without additional logic.

**Customize retry behavior:**

Use these settings in [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) to customize retry behavior:

* `predicate`:  (callable) Determines if an exception is retryable. Default: [`if_transient_error`](https://github.com/googleapis/python-api-core/blob/main/google/api_core/retry/retry_base.py#L75C4-L75C13)
* `initial`: (float) Initial delay in seconds before the first retry. Default: `1.0`
* `maximum`: (float) Maximum delay in seconds between retries. Default: `60.0`
* `multiplier`: (float) Factor by which the delay increases after each retry. Default: `2.0`
* `timeout`: (float) Total retry duration in seconds. Default: `120.0`
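
To make the interaction between these settings concrete, here is a minimal sketch that computes the sequence of delays implied by `initial`, `maximum`, `multiplier`, and `timeout`. It is an illustration only, not the library's implementation: the helper name `backoff_schedule` is hypothetical, and the sketch ignores both the time each request takes and any random jitter the library may apply to the delays.

```python
def backoff_schedule(initial=1.0, maximum=60.0, multiplier=2.0, timeout=120.0):
    """Return the un-jittered delays (in seconds) that fit inside `timeout`.

    Illustrative only: request time and random jitter are ignored.
    """
    delays = []
    delay = initial
    elapsed = 0.0
    while elapsed + delay <= timeout:
        delays.append(delay)
        elapsed += delay
        # Grow the delay geometrically, but never beyond `maximum`.
        delay = min(delay * multiplier, maximum)
    return delays


print(backoff_schedule())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

With the default values, the delays double from 1 second up to 32 seconds. The next delay would be capped at 60 seconds, but waiting that long would exceed the 120-second budget, so the schedule stops there.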

In [None]:
from google.api_core import retry

MODEL_ID = "gemini-2.0-flash" # @param ["gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-pro-exp-03-25"] {"allow-input":true, isTemplate: true}

prompt = "Write a story about a magic backpack."

# Built-in retry support was removed from the SDK, so use the retry module from google.api_core.
@retry.Retry(
    predicate=retry.if_transient_error,
)
def generate_with_retry():
  return client.models.generate_content(
      model=MODEL_ID,
      contents=prompt
  )

generate_with_retry().text

'Elara wasn’t looking for magic. She was looking for a backpack. Her old one, affectionately nicknamed “The Beast,” had finally given up the ghost, its seams ripped and its zipper permanently jammed. So, she found herself in Mrs. Willowby’s Oddity Emporium, a place smelling of mothballs and forgotten dreams.\n\nThe backpack in question was tucked away in a dusty corner, almost hidden behind a taxidermied two-headed duck. It was made of a deep indigo fabric, embroidered with silver constellations that shimmered faintly even in the dim light. It was perfect.\n\n“That one’s been here for ages,” Mrs. Willowby croaked, dusting it off with a flourish. “Nobody seems to want it.”\n\nElara didn\'t care. She paid the paltry sum, slung the backpack over her shoulder, and hurried home.\n\nThe first sign that something was amiss came the next day. Packing for school, Elara discovered the backpack was inexplicably larger inside than out. She could fit her textbooks, lunch, a bulky art project, and s
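
The `predicate` setting accepts any callable that takes an exception and returns `True` when the call should be retried. `retry.if_transient_error` matches exception types from `google.api_core`. If the exceptions raised in your environment are of a different type, a custom predicate may be needed. The following is a minimal sketch under two assumptions: that the exception exposes an integer HTTP status in a `code` attribute (verify this against your installed SDK version), and that 429, 500, and 503 are the codes worth retrying. `is_retryable` and `FakeAPIError` are hypothetical names used only for illustration.

```python
RETRYABLE_STATUS_CODES = {429, 500, 503}  # assumed set; adjust to your needs


def is_retryable(exc):
    """Return True when `exc` carries an HTTP status code worth retrying."""
    code = getattr(exc, "code", None)
    return code in RETRYABLE_STATUS_CODES


class FakeAPIError(Exception):
    """Hypothetical stand-in for an SDK exception with a `code` attribute."""

    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


print(is_retryable(FakeAPIError(503)))        # True
print(is_retryable(FakeAPIError(400)))        # False
print(is_retryable(ValueError("bad input")))  # False
```

A function like this could then be passed as `predicate=is_retryable` to the `Retry` decorator in place of `retry.if_transient_error`.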

### Manually increase timeout when responses take time

If you encounter `ReadTimeout` or `DeadlineExceeded` errors, an API call has exceeded the default timeout (600 seconds). You can raise the limit by setting `timeout`, in milliseconds, on `types.HttpOptions` and passing it as the `http_options` field of `GenerateContentConfig`.

In [None]:
from google.genai import types
prompt = "Write a story about a magic backpack."

# Increase the timeout to 15 minutes (HttpOptions.timeout is in milliseconds).
client.models.generate_content(
    model=MODEL_ID,
    contents=prompt,
    config=types.GenerateContentConfig(
        http_options=types.HttpOptions(
            timeout=15 * 60 * 1000
        )
    )
)

GenerateContentResponse(candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text='Flora had always been unremarkable. Brown hair, brown eyes, perpetually lost in a book. Even her backpack, a drab canvas thing she\'d inherited from her older brother, screamed "invisible." Until, one Tuesday morning, it didn\'t.\n\nShe was rushing to catch the bus, her fingers fumbling with the zipper of the aforementioned backpack, when it refused to budge. Frustrated, she yanked harder. There was a ripping sound, but instead of canvas tearing, a shimmering, iridescent light spilled out.\n\nFlora gasped. The inside of the backpack wasn\'t canvas anymore. It was… a swirling vortex of colours, like the aurora borealis compressed into a small space. Hesitantly, she reached in. Her fingers brushed against something soft, and she pulled it out.\n\nIt

**Caution:** Increasing timeouts can help, but avoid setting them too high: a long timeout delays error detection and can waste resources.
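
Because the timeout is expressed in milliseconds, it is easy to be off by a factor of 1000. One way to guard against that, and against accidentally huge values, is a small conversion helper. This is only a sketch: `timeout_ms` and its `cap_minutes` ceiling are hypothetical and not part of the SDK.

```python
def timeout_ms(minutes, cap_minutes=30):
    """Convert minutes to whole milliseconds, clamped to `cap_minutes`."""
    if minutes <= 0:
        raise ValueError("timeout must be positive")
    # Clamp first so an overly large request never exceeds the ceiling.
    return int(min(minutes, cap_minutes) * 60 * 1000)


print(timeout_ms(15))  # 900000
print(timeout_ms(90))  # 1800000 (clamped to 30 minutes)
```

The first result equals the `15 * 60 * 1000` written out by hand in the cell above.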

### Manually implement backoff and retry with error handling

For finer control over retry behavior and error handling, you can use the [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) library (or similar libraries like [`backoff`](https://pypi.org/project/backoff/) and [`tenacity`](https://tenacity.readthedocs.io/en/latest/)). This gives you precise control over retry strategies and allows you to handle specific types of errors differently.
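
For comparison, the mechanism that these libraries implement can be sketched by hand with only the standard library. The version below is an illustration under stated assumptions, not a replacement for `google.api_core.retry`: `TransientError` and `retry_with_backoff` are hypothetical names, the attempt limit is count-based rather than time-based, and the `sleep` parameter exists only so the demo can run without actually waiting.

```python
import functools
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def retry_with_backoff(max_attempts=5, initial=2.0, maximum=64.0,
                       multiplier=2.0, sleep=time.sleep):
    """Retry the wrapped function on TransientError with jittered backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except TransientError:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Full jitter: wait a random time up to the current cap.
                    sleep(random.uniform(0, delay))
                    delay = min(delay * multiplier, maximum)
        return wrapper
    return decorator


calls = {"n": 0}


@retry_with_backoff(max_attempts=4, sleep=lambda seconds: None)  # no real waiting
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("try again")
    return "ok"


print(flaky())     # ok
print(calls["n"])  # 3
```

Any exception other than `TransientError` propagates immediately, which is how this sketch treats specific error types differently.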

In [None]:
from google.api_core import retry

MODEL_ID = "gemini-2.0-flash" # @param ["gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-pro-exp-03-25"] {"allow-input":true, isTemplate: true}

@retry.Retry(
    predicate=retry.if_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_with_retry(prompt):
    return client.models.generate_content(
        model=MODEL_ID,
        contents=prompt
    )


prompt = "Write a one-liner advertisement for magic backpack."

generate_with_retry(prompt=prompt)

GenerateContentResponse(candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text='Unzip endless possibilities with the Magic Backpack - more space, more adventure!\n')], role='model'), citation_metadata=None, finish_message=None, token_count=None, finish_reason=<FinishReason.STOP: 'STOP'>, avg_logprobs=-0.6894590854644775, grounding_metadata=None, index=None, logprobs_result=None, safety_ratings=None)], create_time=None, response_id=None, model_version='gemini-2.0-flash', prompt_feedback=None, usage_metadata=GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=16, candidates_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=16)], prompt_token_count=10, prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TE

### Test the error handling with retry mechanism

To validate that your error handling and retry mechanism work as intended, define a wrapper function (`generate_content_first_fail` below) that deliberately raises a `ServiceUnavailable` error on its first call. Because `ServiceUnavailable` is one of the errors that `retry.if_transient_error` treats as transient, the retry decorator should catch it and call the function again, and the second call should succeed.

In [None]:
from google.api_core import retry, exceptions


@retry.Retry(
    predicate=retry.if_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_content_first_fail(prompt):
    if not hasattr(generate_content_first_fail, "call_counter"):
        generate_content_first_fail.call_counter = 0

    generate_content_first_fail.call_counter += 1

    try:
        if generate_content_first_fail.call_counter == 1:
            raise exceptions.ServiceUnavailable("Service Unavailable")

        response = client.models.generate_content(
            model=MODEL_ID,
            contents=prompt
        )
        return response.text
    except exceptions.ServiceUnavailable as e:
        print(f"Error: {e}")
        raise


prompt = "Write a one-liner advertisement for magic backpack."

generate_content_first_fail(prompt=prompt)

Error: 503 Service Unavailable


'Unzip the impossible with the Magic Backpack - where adventure always fits!\n'
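
The same fault-injection idea can be packaged as a reusable wrapper so that the failure count is not hard-coded into the function under test. The following is a minimal standard-library sketch. `fail_first` is a hypothetical helper, and the built-in `ConnectionError` stands in for whichever exception your retry predicate treats as transient.

```python
import functools


def fail_first(n, make_error):
    """Wrap a function so its first `n` calls raise `make_error()`."""
    def decorator(func):
        remaining = n

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal remaining
            if remaining > 0:
                remaining -= 1
                raise make_error()
            return func(*args, **kwargs)
        return wrapper
    return decorator


@fail_first(2, lambda: ConnectionError("simulated outage"))
def fetch():
    return "real response"


results = []
for _ in range(3):
    try:
        results.append(fetch())
    except ConnectionError as e:
        results.append(f"error: {e}")

print(results)
# ['error: simulated outage', 'error: simulated outage', 'real response']
```

Stacking a retry decorator on top of a function wrapped this way lets you check that the retry logic survives a chosen number of consecutive failures.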