##### Copyright 2025 Google LLC.

In [1]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Error handling

<a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Error_handling.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

This Colab notebook demonstrates strategies for handling common errors you might encounter when working with the Gemini API:

*   **Transient Errors:** Temporary failures due to network issues, server overload, etc.
*   **Rate Limits:** Restrictions on the number of requests you can make within a certain timeframe.
*   **Timeouts:** When an API call takes too long to complete.

You have two main approaches to explore:

1.  **Automatic retries:** A simple way to retry requests when they fail due to transient errors.
2.  **Manual backoff and retry:** A more customizable approach that provides finer control over retry behavior.


**Gemini Rate Limits**

The default rate limits for different Gemini models are outlined in the [Gemini API model documentation](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations). If your application requires a higher quota, consider [requesting a rate limit increase](https://ai.google.dev/gemini-api/docs/quota).

In [2]:
%pip install -q -U "google-genai>=1.0.0"


### Set up your API key

To run the following cells, store your API key in a Colab Secret named `GOOGLE_API_KEY`. If you don't have an API key or need help creating a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) guide.

In [3]:
from google import genai
from google.colab import userdata

GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")
client = genai.Client(api_key=GOOGLE_API_KEY)

### Automatic retries

The Gemini API's client library offers built-in retry mechanisms for handling transient errors. You can configure retry behavior when creating the client using `HttpRetryOptions`.

**Option 1: Built-in HttpRetryOptions (Recommended)**

The simplest approach is to configure retry behavior directly on the client:

```python
from google import genai
from google.genai import types

retry_options = types.HttpRetryOptions(
    attempts=5,
    initial_delay=2.0,
    max_delay=64.0,
    http_status_codes=[408, 429, 500, 502, 503, 504]
)

client = genai.Client(
    api_key=GOOGLE_API_KEY,
    http_options=types.HttpOptions(retry_options=retry_options)
)
```

**Option 2: Custom retry decorator**

For more control, you can use a custom retry decorator. Note that `google-genai` raises its own error types (`google.genai.errors.ClientError` for 4xx and `google.genai.errors.ServerError` for 5xx), which are different from `google.api_core.exceptions`. The decorator therefore needs a custom predicate to recognize these errors.

**Customize retry behavior:**

Use these settings in [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) to customize retry behavior:

* `predicate`:  (callable) Determines if an exception is retryable. Use a custom predicate for `google-genai` errors (see example below).
* `initial`: (float) Initial delay in seconds before the first retry. Default: `1.0`
* `maximum`: (float) Maximum delay in seconds between retries. Default: `60.0`
* `multiplier`: (float) Factor by which the delay increases after each retry. Default: `2.0`
* `timeout`: (float) Total retry duration in seconds. Default: `120.0`
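To see how `initial`, `maximum`, and `multiplier` interact, the following sketch computes the upper bound of each successive delay using the defaults listed above. It is a plain-Python illustration rather than the library's own code: `backoff_ceilings` is a hypothetical helper, and because the real `retry.Retry` randomizes each wait (jitter), actual sleeps may be shorter than these ceilings.

```python
def backoff_ceilings(initial=1.0, maximum=60.0, multiplier=2.0, retries=8):
    """Return the upper bound (in seconds) of each retry delay.

    Hypothetical helper for illustration only; jitter is ignored.
    """
    ceilings = []
    delay = min(initial, maximum)
    for _ in range(retries):
        ceilings.append(delay)
        # Grow the delay geometrically, but never past `maximum`.
        delay = min(delay * multiplier, maximum)
    return ceilings


schedule = backoff_ceilings()
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With the defaults, the ceiling doubles until it reaches `maximum`, then stays there until the overall `timeout` ends the retries.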

In [4]:
from google.api_core import retry
from google.genai import errors

MODEL_ID = "gemini-3-flash-preview" # @param ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro", "gemini-2.5-flash-preview-05-20"] {"allow-input":true, isTemplate: true}

prompt = "Write a story about a magic backpack."

# Custom predicate for google-genai errors
# Note: google-genai raises google.genai.errors.APIError (not google.api_core.exceptions)
def if_genai_transient_error(exception):
    """Predicate for retrying google-genai transient errors."""
    return (
        isinstance(exception, errors.APIError)
        and exception.code in (408, 429, 500, 502, 503, 504)
    )


@retry.Retry(
    predicate=if_genai_transient_error,
)
def generate_with_retry():
    return client.models.generate_content(
        model=MODEL_ID,
        contents=prompt
    )


generate_with_retry().text

'Oliver was the kind of boy who prepared for everything, yet felt ready for nothing. He carried a spare umbrella on cloudless days, three different types of pens in case two failed, and a map of a town he had lived in his entire life.\n\nBut everything changed the day he found the Rucksack of Requirement at a dusty garage sale for three dollars.\n\nIt was a faded, forest-green canvas bag with brass buckles that looked like squinting eyes. The seller, an old woman with wind-chimes in her hair, had whispered, "Be careful, dear. It doesn’t give you what you want. It gives you what you need."\n\nOliver didn\'t believe in magic, but he did believe in a bargain. He took it home, emptied his old nylon bag, and began to pack.\n\nThe first sign that something was wrong happened the next morning on his way to school. Oliver reached into the bag for his history textbook. Instead, his fingers brushed against something cold and metallic. He pulled it out. \n\nIt was a heavy, Victorian-style brass k

### Alternative: Using built-in HttpRetryOptions

For a simpler approach, you can configure retry behavior directly on the client using `HttpRetryOptions`. This is the recommended method as it uses the SDK's built-in retry mechanism that correctly handles `google-genai` errors.

In [5]:
from google import genai
from google.genai import types

# Configure retry options for transient errors
retry_options = types.HttpRetryOptions(
    attempts=5,
    initial_delay=2.0,
    max_delay=64.0,
    http_status_codes=[408, 429, 500, 502, 503, 504]
)

# Create a client with retry options enabled
client_with_retry = genai.Client(
    api_key=GOOGLE_API_KEY,
    http_options=types.HttpOptions(retry_options=retry_options)
)

# All calls through this client will automatically retry on transient errors
response = client_with_retry.models.generate_content(
    model=MODEL_ID,
    contents="Write a haiku about error handling."
)
response.text

'Try, then catch the fall,\nGraceful path through broken code,\nSafety in the block.'

### Manually increase timeout when responses take time

If you encounter `ReadTimeout` or `DeadlineExceeded` errors, meaning an API call exceeded the default timeout (600 seconds), you can raise the limit by setting `timeout` (in milliseconds) on `types.HttpOptions` and passing it through the `http_options` field of `types.GenerateContentConfig`.

In [6]:
from google.genai import types
prompt = "Write a story about a magic backpack."

client.models.generate_content(
    model=MODEL_ID,
    contents=prompt,
    config=types.GenerateContentConfig(
        # Timeout is in milliseconds: raise it to 15 minutes.
        http_options=types.HttpOptions(
            timeout=15 * 60 * 1000
        )
    )
)

GenerateContentResponse(
  automatic_function_calling_history=[],
  candidates=[
    Candidate(
      content=Content(
        parts=[
          Part(
            text="""The backpack didn’t look like much. It was a faded, olive-green canvas rucksack with brass buckles that had turned a dull, matte brown. It sat in the very back of "Old Man Miller’s Curiosities," smelling faintly of cedar shavings and North Sea salt.

Ten-year-old Oliver, who had exactly four dollars and twenty cents in his pocket, bought it because his old plastic backpack had split down the middle, spilling his geography homework into a puddle.

"A word of advice, lad," Miller had whispered as he tucked the bag into a brown paper sack. "Don't go looking for things in there. Let the things find you."

Oliver didn't know what that meant, and he didn't care. He just needed to carry his books.

The first strange thing happened the next morning. Oliver was halfway to school when he realized he’d forgotten his lunch on the

**Caution:** Increasing timeouts can help, but avoid setting them too high: a long timeout delays error detection and can waste resources.
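One way to follow this advice is to compute the timeout through a small helper that clamps it to a ceiling. The sketch below is only an illustration: `timeout_ms` and `MAX_TIMEOUT_MS` are hypothetical names, and the 30-minute ceiling is an assumed value, not an SDK limit. The result is in milliseconds, matching the `15*60*1000` value used in the cell above.

```python
# Assumed ceiling of 30 minutes; tune this for your own workload.
MAX_TIMEOUT_MS = 30 * 60 * 1000


def timeout_ms(minutes, ceiling_ms=MAX_TIMEOUT_MS):
    """Convert minutes to milliseconds, clamped to an upper bound."""
    if minutes <= 0:
        raise ValueError("timeout must be positive")
    return min(int(minutes * 60 * 1000), ceiling_ms)


print(timeout_ms(15))   # 900000
print(timeout_ms(120))  # 1800000 (clamped to the ceiling)
```

The returned value could then be passed as the `timeout` argument of `types.HttpOptions`.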

### Manually implement backoff and retry with error handling

For finer control over retry behavior and error handling, you can use the [`retry`](https://googleapis.dev/python/google-api-core/latest/retry.html) module from `google-api-core` (or similar libraries like [`backoff`](https://pypi.org/project/backoff/) and [`tenacity`](https://tenacity.readthedocs.io/en/latest/)). This gives you precise control over retry strategies and lets you handle specific types of errors differently.
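To make the mechanics concrete, here is a minimal hand-written backoff loop that uses no retry library at all. It is a sketch under stated assumptions: `TransientError`, `call_with_backoff`, and `flaky` are hypothetical names, and `TransientError` merely stands in for an API error that exposes an HTTP status in a `code` attribute. The `sleep` parameter is injectable so the demo runs without actually waiting.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for an API error carrying an HTTP status code."""

    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code


RETRYABLE_CODES = {408, 429, 500, 502, 503, 504}


def call_with_backoff(func, attempts=5, initial=2.0, maximum=64.0,
                      multiplier=2.0, sleep=time.sleep):
    """Call `func`, retrying retryable errors with exponential backoff."""
    delay = initial
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientError as exc:
            # Give up on non-retryable codes or when attempts are exhausted.
            if exc.code not in RETRYABLE_CODES or attempt == attempts:
                raise
            # Full jitter: wait a random time up to the current ceiling.
            sleep(random.uniform(0, delay))
            delay = min(delay * multiplier, maximum)


# Demo: fail twice with a 503, then succeed. Sleeping is stubbed out.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError(503)
    return "ok"


waits = []
result = call_with_backoff(flaky, sleep=waits.append)
print(result, calls["count"], len(waits))  # ok 3 2
```

The decorator-based cell that follows achieves the same effect with less code; a loop like this is mainly useful when you need per-error logic that a predicate cannot express.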

In [7]:
from google.api_core import retry
from google.genai import errors

MODEL_ID = "gemini-3-flash-preview" # @param ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro", "gemini-2.5-flash-preview-05-20"] {"allow-input":true, isTemplate: true}


def if_genai_transient_error(exception):
    """Predicate for retrying google-genai transient errors."""
    return (
        isinstance(exception, errors.APIError)
        and exception.code in (408, 429, 500, 502, 503, 504)
    )


@retry.Retry(
    predicate=if_genai_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_with_retry(prompt):
    return client.models.generate_content(
        model=MODEL_ID,
        contents=prompt
    )


prompt = "Write a one-liner advertisement for magic backpack."

generate_with_retry(prompt=prompt)

GenerateContentResponse(
  automatic_function_calling_history=[],
  candidates=[
    Candidate(
      content=Content(
        parts=[
          Part(
            text='Pack a mountain, carry a feather.',
            thought_signature=b'\x12\x98\x12\n\x95\x12\x01r\xc8\xda|-\x0c\x07\x96\x87\x9a\x9b|[\xe2>\x9a\x98tZ\xb4\xd5Y\x90:\xe0v\xf3\xbe\x12\xcb\x0c\r\x8dF\x1b\x86\x18`!\xfcSr\x1c\x0ft/\x93\n\x80W$\xd3\xd6\xb9\xaa\x02{\x97\x80ca\x7f\x16Y\xea\xbb\xda\xdb\xbb\xe2\xb5yV\xa6\x8f\xa27\xa5R\xdbg\xddY\x00"\x88\xf3\x8b\xb7h...'
          ),
        ],
        role='model'
      ),
      finish_reason=<FinishReason.STOP: 'STOP'>,
      index=0
    ),
  ],
  model_version='gemini-3-flash-preview',
  response_id='hRNWaYbcH-PKjuMPuKnWuQ8',
  sdk_http_response=HttpResponse(
    headers=<dict len=10>
  ),
  usage_metadata=GenerateContentResponseUsageMetadata(
    candidates_token_count=8,
    prompt_token_count=11,
    prompt_tokens_details=[
      ModalityTokenCount(
        modality=<MediaModali

### Test the error handling with retry mechanism

To validate that your error handling and retry mechanism work as intended, define a `generate_content_first_fail` function that deliberately raises a `ServerError` on the first call. This setup lets you confirm that the retry decorator handles the transient error and retries the operation.

**Note:** The `google-genai` library raises `google.genai.errors.ServerError` for 5xx errors and `google.genai.errors.ClientError` for 4xx errors (like rate limits). These are different from `google.api_core.exceptions`, so you must use a custom predicate that checks for these error types.
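Before wiring a predicate into a retry decorator, it can help to check it in isolation against a few representative exceptions. The sketch below does this without any network call or SDK import: `FakeAPIError` and `is_transient` are hypothetical stand-ins, and in real code the type check would target `errors.APIError` as in the notebook's own predicate.

```python
class FakeAPIError(Exception):
    """Hypothetical stand-in that mimics the `code` attribute of an API error."""

    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code


TRANSIENT_CODES = (408, 429, 500, 502, 503, 504)


def is_transient(exception, error_type=FakeAPIError):
    """Return True only for `error_type` instances with a retryable code."""
    return isinstance(exception, error_type) and exception.code in TRANSIENT_CODES


checks = {
    "503": is_transient(FakeAPIError(503)),
    "429": is_transient(FakeAPIError(429)),
    "400": is_transient(FakeAPIError(400)),
    "ValueError": is_transient(ValueError("bad input")),
}
print(checks)
# {'503': True, '429': True, '400': False, 'ValueError': False}
```

Server errors and rate limits are retried, while a 400 (a malformed request) and unrelated exceptions are not, since repeating them would not help.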

In [8]:
from google.api_core import retry
from google.genai import errors


def if_genai_transient_error(exception):
    """Predicate for retrying google-genai transient errors."""
    return (
        isinstance(exception, errors.APIError)
        and exception.code in (408, 429, 500, 502, 503, 504)
    )


@retry.Retry(
    predicate=if_genai_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_content_first_fail(prompt):
    if not hasattr(generate_content_first_fail, "call_counter"):
        generate_content_first_fail.call_counter = 0

    generate_content_first_fail.call_counter += 1

    try:
        if generate_content_first_fail.call_counter == 1:
            # Simulate a 503 Service Unavailable error from google-genai
            raise errors.ServerError(503, {"message": "Service Unavailable"}, None)

        response = client.models.generate_content(
            model=MODEL_ID,
            contents=prompt
        )
        return response.text
    except errors.ServerError as e:
        print(f"Error: {e}")
        raise


prompt = "Write a one-liner advertisement for magic backpack."

generate_content_first_fail(prompt=prompt)

Error: 503 None. {'message': 'Service Unavailable'}


'**Infinite space, zero weight—the only bag that’s bigger on the inside.**'