<a href="https://colab.research.google.com/github/carlosmscabral/demo-apigee-genai/blob/main/Apigee_Gemini_Cross_Region_Re_Routing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Apigee - Gemini Cross Region Re-Routing

This sample client runs a "stress test" against Gemini in order to reach the per-region rate limit. Instead of calling the Gemini endpoints directly, the code sends its requests through an Apigee API placed in front of them.

Apigee will automatically detect and recover from Vertex/Gemini rate-limit errors and transparently re-route the calls to a backup region.
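
To make the re-routing behaviour concrete, here is a minimal sketch of the same idea written as plain Python. It is not how Apigee implements it (the proxy configuration is not shown in this notebook); `RateLimitError`, `call_with_region_fallback`, `fake_send`, and the region names are all hypothetical and used only for illustration.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 / RESOURCE_EXHAUSTED response."""


def call_with_region_fallback(send, regions):
    """Try each region in order, moving on only when a region is rate-limited."""
    if not regions:
        raise ValueError("at least one region is required")
    last_error = None
    for region in regions:
        try:
            return region, send(region)
        except RateLimitError as err:
            last_error = err  # quota exhausted here, so try the next region
    raise last_error  # every region was rate-limited


def fake_send(region):
    # Simulated backend (assumption): the primary region is out of quota.
    if region == "us-central1":
        raise RateLimitError("429 RESOURCE_EXHAUSTED")
    return f"response from {region}"


served_by, body = call_with_region_fallback(fake_send, ["us-central1", "us-east4"])
print(served_by, body)
```

From the client's point of view the call simply succeeds; only the returned region reveals that the backup served it, which is what "transparent" means here.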

In [None]:
!gcloud auth application-default login
!gcloud config set project cabral-apigee

In [None]:
import time
from datetime import datetime
from multiprocessing import Pool

import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig


def exec_gemini_call(iteration):
    """Send one prompt to Gemini through the Apigee endpoint."""
    project_id = "cabral-apigee"
    # Point the SDK at the Apigee endpoint instead of the default Vertex AI endpoint.
    vertexai.init(
        project=project_id,
        api_endpoint="https://dev.35.227.240.213.nip.io/active-retry",
        api_transport="rest",
    )
    model = GenerativeModel("gemini-1.5-pro-001")
    config = GenerationConfig(
        max_output_tokens=100, temperature=0.4, top_p=1, top_k=32
    )
    # Prefix the prompt with the iteration number so each request is distinct.
    model.generate_content(
        f"{iteration} - What's a good name for a flower shop that specializes in selling bouquets of dried flowers?",
        generation_config=config,
    )

def main():
    iterations = range(100)

    start_time = time.time()
    print(f"Start time: {datetime.fromtimestamp(start_time).strftime('%Y-%m-%d %H:%M:%S')}")

    # Fan the calls out across a pool of worker processes to generate load.
    with Pool() as pool:
        pool.map(exec_gemini_call, iterations)

    end_time = time.time()
    print(f"End time: {datetime.fromtimestamp(end_time).strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"Total time elapsed: {end_time - start_time:.2f} seconds")

if __name__ == "__main__":
    main()
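
The run above only reports timing, and `pool.map` re-raises the first exception from any worker, so a single failed call aborts the whole test. One possible way to see how many calls succeeded is to wrap each call and tally the outcomes. This is a sketch only: `classify_outcome` and `flaky_call` are hypothetical helpers, and `flaky_call` stands in for `exec_gemini_call` so the example runs without credentials.

```python
from collections import Counter


def classify_outcome(call, iteration):
    """Run one call and return "ok" or the exception class name instead of raising."""
    try:
        call(iteration)
        return "ok"
    except Exception as err:  # broad on purpose: we only want a tally
        return type(err).__name__


def flaky_call(iteration):
    # Hypothetical stand-in for exec_gemini_call: every fifth call fails.
    if iteration % 5 == 0:
        raise RuntimeError("simulated rate limit")


outcomes = [classify_outcome(flaky_call, i) for i in range(10)]
summary = Counter(outcomes)
print(summary)
```

With the real client, the same wrapper could be passed to the pool, for example as `pool.map(functools.partial(classify_outcome, exec_gemini_call), iterations)`. If re-routing works as described, the tally should show no rate-limit failures.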