<a href="https://colab.research.google.com/github/Saif-Shines/pk-cookbook/blob/portkey-on-google-cookbook-repo-klyst-171/integrations/Resiliency_and_Observability_Essentials_for_Gemini.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Resiliency and Observability Essentials for Gemini

Gemini is a family of multimodal generative AI models developed by Google. Depending on the model variant you choose, Gemini models accept text and images in prompts and return text responses. Keeping track of token costs, monitoring your app, and improving it can be challenging in production. Portkey provides features that help you address these challenges in minutes.

This guide walks through what you can do with Portkey to make your Gemini-based app production-ready.

## Portkey

Portkey provides an observability suite and AI gateway capabilities for your gen-AI apps in production. It lets you analyze all your logs and get powerful analytics on top of them. Requests can be served from a cache, traffic can be load-balanced, and fallbacks can be applied to improve reliability and save costs. These are just a few of the features Portkey offers.


## Integration with Gemini

The easiest way to get Portkey working for you is through its client SDK.


```py
!pip install portkey_ai
```

Get your [Portkey API Key](https://portkey.ai/docs/api-reference/authentication#obtaining-your-api-key) and use it to instantiate the client. You can then start making chat completion calls.


```py
from portkey_ai import Portkey
from google.colab import userdata  # Reads secrets stored in Colab

PORTKEY_API_KEY = userdata.get('PORTKEY_API_KEY')
GOOGLE_VIRTUAL_KEY = userdata.get('GOOGLE_VIRTUAL_KEY')

# Client that routes requests through Portkey to Gemini
portkey = Portkey(
    api_key=PORTKEY_API_KEY,
    virtual_key=GOOGLE_VIRTUAL_KEY
)

response = portkey.chat.completions.create(
    model="gemini-1.0-pro-001",
    messages=[{"role": "user", "content": "c'est la vie"}]
)

print(response.choices[0].message.content)
```

C'est la vie is a French phrase that means "such is life." It is often used to express resignation or acceptance of a difficult situation. The phrase suggests that life is full of challenges and that it is important to learn to accept the things that you cannot change.


## Virtual Keys

With Portkey Vault, your Google Gemini API Key is stored safely, and a Virtual Key is generated. The Virtual Key has many advantages, such as the ability to rotate keys easily, multiple virtual keys for a single API key, and the option to impose restrictions based on cost, request volume, and user volume.

Learn [how to create Virtual Keys](https://portkey.ai/docs/product/ai-gateway-streamline-llm-integrations/virtual-keys#using-virtual-keys) and use them for your requests.

## Observability

Portkey's observability features help you keep track of what your app is doing in production:

1. Understand how many tokens are used and the costs associated with them.
2. Know when an issue occurred in production and what caused it, so you can troubleshoot it.
3. Analyze the status of each request and replay requests as necessary.

The sections below cover logging, tracing, and metadata.

### Logging

Every request sent through Portkey to Gemini appears as a log on the **Logs** page. Each log shows the request and response bodies, timestamps, request timings, tokens, costs, and more.


```py
response = portkey.chat.completions.create(
    model="gemini-1.0-pro-001",
    messages=[{"role": "user", "content": "c'est la vie"}]
)

print(response.choices[0].message.content)
```

C'est la vie (French pronunciation: ​[sɛ la vi]) is a French phrase that means "such is life" or "that's life". It is often used to express resignation or acceptance in the face of adversity. The phrase can also be used in a more positive sense, to express the idea that life is full of both good and bad experiences, and that it is important to accept both.

The phrase is thought to have originated in the 16th century, and it has been used by many famous people over the years, including Marie Antoinette, Napoleon Bonaparte, and Albert Einstein. Today, the phrase is still commonly used in both French and English.


Here is a screenshot of it:
![](https://github.com/Saif-Shines/pk-cookbook/blob/portkey-on-google-cookbook-repo-klyst-171/integrations/images/resiliency-and-observability-essentials-for-gemini/1-resiliency-and-observability-essentials-for-gemini.png?raw=true)


### Tracing

Using Tracing abilities, you can pinpoint and analyze requests throughout their life-cycle and quickly filter through the heap of logs.



```py
response = portkey.with_options(
    trace_id="gemini_and_portkey"  # any trace identifier
).chat.completions.create(
    model="gemini-1.0-pro-001",
    messages=[{"role": "user", "content": "Greet me in Japanese!"}]
)

print(response.choices[0].message.content)
```

こんにちは (Konnichiwa)!


Here is a screenshot of the trace in the logs:

![](https://github.com/Saif-Shines/pk-cookbook/blob/portkey-on-google-cookbook-repo-klyst-171/integrations/images/resiliency-and-observability-essentials-for-gemini/2-resiliency-and-observability-essentials-for-gemini.png?raw=true)
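The example above uses a fixed string as the trace ID. In an application you would probably want a distinct ID per conversation or workflow, so that filtering by trace returns only the related requests. A minimal sketch, assuming one ID per conversation; `new_trace_id` is a hypothetical helper and not part of the Portkey SDK:

```python
import uuid


def new_trace_id(prefix: str = "gemini") -> str:
    """Return a unique trace identifier of the form '<prefix>-<32 hex chars>'.

    Hypothetical helper for illustration; not part of the Portkey SDK.
    """
    return f"{prefix}-{uuid.uuid4().hex}"


# Create one ID per conversation and reuse it for every request in it
conversation_trace_id = new_trace_id("support-chat")
print(conversation_trace_id)
```

The resulting string can be passed as `trace_id` in the same way as `"gemini_and_portkey"` above.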

### Metadata

Segment your requests and analyze them by attaching metadata. Metadata can describe anything that might offer valuable insight. For example, to segment the conversations of a specific user, attach a user identifier as metadata.


```py
response = portkey.with_options(
    trace_id="gemini_and_portkey",
    metadata={
        "_user": "3511522",
        "environment": "dev"
    }
).chat.completions.create(
    model="gemini-1.0-pro-001",
    messages=[{"role": "user", "content": "Greet me in Japanese!"}]
)

print(response.choices[0].message.content)
```

こんにちは! (Konnichiwa!)


Filter through the logs using the metadata:

![](https://github.com/Saif-Shines/pk-cookbook/blob/portkey-on-google-cookbook-repo-klyst-171/integrations/images/resiliency-and-observability-essentials-for-gemini/3-resiliency-and-observability-essentials-for-gemini.png?raw=true)

For comprehensive information and more features, see the [Observability docs](https://portkey.ai/docs/product/observability-modern-monitoring-for-llms).


## Production Reliability

Portkey’s multimodal AI gateway lets you tackle failure scenarios and make your app more reliable and robust.



1. The AI gateway can cache LLM responses and serve them instantly.
2. It can switch your app from one LLM to another when a request fails unexpectedly, keeping the service uninterrupted.
3. It can distribute incoming traffic across the target LLMs according to the weights you set, providing load balancing.

### Caching, Retries and Request Timeouts

It’s easy to enable these features on your requests using [Gateway Configs](https://github.com/Portkey-AI/portkey-cookbook/blob/c0ee5af750b4edc8964339c3bd86adcd83721178/examples).

In your **Portkey** app, go to **Configs** and write the following JSON:

```json
{
  "retry": {
    "attempts": 3
  },
  "cache": {
    "mode": "simple"
  },
  "request_timeout": 10000
}
```


Hit **Save Config** and get a Config ID.

![](https://github.com/Saif-Shines/pk-cookbook/blob/portkey-on-google-cookbook-repo-klyst-171/integrations/images/resiliency-and-observability-essentials-for-gemini/4-resiliency-and-observability-essentials-for-gemini.png?raw=true)

When making requests to the AI gateway, including the Config ID is essential. This will enable caching, timeouts, and automatic retries.
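If you prefer to keep gateway settings in code rather than typing them by hand, the retry, cache, and timeout JSON from this section can be generated with a few lines of Python. This is only a sketch: `build_gateway_config` is a hypothetical helper, and treating `request_timeout` as milliseconds is an assumption you should confirm in the Portkey docs.

```python
import json


def build_gateway_config(retry_attempts: int = 3,
                         cache_mode: str = "simple",
                         request_timeout_ms: int = 10000) -> dict:
    """Assemble a dict with the same three keys as the config in this section.

    Hypothetical helper; the timeout unit (milliseconds) is an assumption.
    """
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be non-negative")
    if request_timeout_ms <= 0:
        raise ValueError("request_timeout_ms must be positive")
    return {
        "retry": {"attempts": retry_attempts},
        "cache": {"mode": cache_mode},
        "request_timeout": request_timeout_ms,
    }


config = build_gateway_config()
config_json = json.dumps(config, indent=2)
print(config_json)  # paste this into the Configs editor
```

Generating the JSON this way makes it easier to keep separate variants, for example a longer timeout for a staging environment.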



### Fallbacks and Load Balancing

Just as you enabled caching, retries, and timeouts through a config, you can add fallbacks and load balancing the same way. Both are powerful means of building reliability into your system.

Fallbacks


```json
{
  "strategy": {
    "mode": "fallback"
  },
  "targets": [
    {
      "virtual_key": "gemini-virtual-key",
      "override_params": {
        "model": "gemini-1.0-pro"
      }
    },
    {
      "virtual_key": "gemini-virtual-key",
      "override_params": {
        "model": "gemini-1.0-pro-001"
      }
    }
  ]
}
```
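To build intuition for what a fallback strategy does, here is a small local simulation: targets are tried in the order listed, and the first successful response is returned. It is not Portkey's implementation, and `call_with_fallback`, `primary`, and `secondary` are hypothetical stand-ins for the gateway and its targets.

```python
from typing import Callable, Sequence


def call_with_fallback(targets: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each target in order and return the first successful response."""
    errors = []
    for target in targets:
        try:
            return target(prompt)
        except Exception as exc:  # a real gateway would inspect error types
            errors.append(exc)
    raise RuntimeError(f"all {len(targets)} targets failed: {errors}")


# Stand-in targets: the first always fails, the second answers
def primary(prompt: str) -> str:
    raise TimeoutError("primary model unavailable")


def secondary(prompt: str) -> str:
    return f"secondary answered: {prompt}"


result = call_with_fallback([primary, secondary], "c'est la vie")
print(result)  # secondary answered: c'est la vie
```

With the gateway config, this ordering logic runs on Portkey's side, so your application code stays a single request.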


Load balancing


```json
{
  "strategy": {
    "mode": "loadbalance"
  },
  "targets": [
    {
      "virtual_key": "gemini-virtual-key",
      "override_params": {
        "model": "gemini-1.0-pro-latest"
      },
      "weight": 0.7
    },
    {
      "virtual_key": "gemini-virtual-key",
      "override_params": {
        "model": "gemini-1.0-pro"
      },
      "weight": 0.3
    }
  ]
}
```
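The effect of the `weight` values can be illustrated with a local simulation that picks a target with probability proportional to its weight. This sketch only shows the expected traffic split; how the gateway actually selects a target is not described in this guide, and `pick_target` is a hypothetical helper.

```python
import random
from collections import Counter


def pick_target(targets, rng):
    """Pick one target at random, with probability proportional to its weight."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


# Same models and weights as the load-balance config in this section
targets = [
    {"model": "gemini-1.0-pro-latest", "weight": 0.7},
    {"model": "gemini-1.0-pro", "weight": 0.3},
]

rng = random.Random(0)  # seeded so repeated runs give the same split
total = 10_000
counts = Counter(pick_target(targets, rng)["model"] for _ in range(total))
for model, n in counts.items():
    print(f"{model}: {n / total:.1%}")
```

Over many requests, the share of traffic each model receives should approach its weight, roughly 70% and 30% here.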


After saving either config, you can reference it in your code by its Config ID:


```py
CONFIG_ID = "LOADBALANCE/FALLBACK_CONFIG_ID"

portkey = Portkey(
    api_key=PORTKEY_API_KEY,
    virtual_key=GOOGLE_VIRTUAL_KEY,
    config=CONFIG_ID  # passing Config ID
)

response = portkey.chat.completions.create(
    model="gemini-1.0-pro-001",
    messages=[{"role": "user", "content": "c'est la vie"}]
)

print(response.choices[0].message.content)
```

C'est la vie (translated to "such is life") is a French phrase that is often used to express the idea that life is full of unexpected events and that we must accept them with equanimity. It is a reminder that life is not always fair or easy, and that we must learn to embrace both the good and the bad.


## Continuous Improvement

Assessing how your prompts affect users can be difficult: some prompts delight users while others fall flat.

But what if your users could rate the quality of the responses provided by your LLMs? With Portkey, this is possible. By collecting user feedback, you can fine-tune your models and continuously improve their accuracy through autonomous fine-tuning. In other words, user feedback becomes the dataset used to train these models.


### Collect Feedback

We previously passed a trace ID (`gemini_and_portkey`) in our requests. Portkey’s feedback method can attach feedback to that trace as follows:


```py
feedback = portkey.feedback.create(
    trace_id="gemini_and_portkey",
    value=5,   # Integer between -10 and 10
)
```


Connect any user interaction to record feedback on your response quality. You can also view feedback in the dashboard under the Feedback tab.
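User interfaces often collect ratings on a different scale than the -10 to 10 integer range used above. As one possible approach, a 1 to 5 star rating can be mapped linearly onto that range. `stars_to_feedback_value` is a hypothetical helper, and the linear mapping is an illustrative choice rather than something Portkey prescribes.

```python
def stars_to_feedback_value(stars: int) -> int:
    """Map a 1-5 star rating onto the -10..10 integer range.

    1 star -> -10, 3 stars -> 0, 5 stars -> 10.
    """
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    return (stars - 3) * 5


for stars in range(1, 6):
    print(stars, "->", stars_to_feedback_value(stars))
```

The returned integer can then be passed as `value` to `portkey.feedback.create`, as in the snippet above.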


### Autonomous Fine-tuning

Our Fine-Tuning feature can automatically fine-tune models based on feedback. It is currently in private beta. If you are interested, please email us at support@portkey.ai or reach out on our [Discord](https://www.portkey.ai/community) channel.


## Conclusion

Portkey adds production-grade features on top of Gemini models. What we have covered here is just the tip of the iceberg. You can read about more ways to use it in the [Portkey documentation](https://portkey.ai/docs).
