Gemini - Priority Tiering

### Feature Type

Nice to have

### Feature Description

Gemini on Vertex AI now supports priority-tier inference in preview:
https://docs.cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo

The Gemini API (genai) exposes the same capability as service tiers with priority inference:
https://ai.google.dev/gemini-api/docs/priority-inference

TL;DR: API users can pay extra for inference in exchange for lower latency and more flexible throughput.
For voice agents that have spiky traffic and value low-latency inference, this is a critical feature.

This should be supported explicitly through the Gemini plugin. Currently, it can only be implemented by passing custom HTTP options to the Gemini LLM.
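A rough sketch of what the HTTP-options workaround could look like. It uses a stdlib stand-in (`HttpOptionsSketch`, hypothetical) that mirrors the `headers` and `extra_body` fields of the SDK's HTTP options object. The header name and body field name below are assumptions, not confirmed values, and should be checked against the two linked docs pages before use.

```python
from dataclasses import dataclass, field

# Assumed names: verify both against the linked docs before relying on them.
VERTEX_PRIORITY_HEADER = "X-Vertex-AI-LLM-Shared-Request-Type"  # assumption
GEMINI_API_TIER_FIELD = "service_tier"  # assumption


@dataclass
class HttpOptionsSketch:
    """Hypothetical stand-in for the SDK's HTTP options (headers + extra body)."""
    headers: dict = field(default_factory=dict)
    extra_body: dict = field(default_factory=dict)


def priority_http_options(vertexai: bool, tier: str = "priority") -> HttpOptionsSketch:
    """Build HTTP options that request the given tier for one backend."""
    opts = HttpOptionsSketch()
    if vertexai:
        # Vertex AI: the tier is assumed to be selected via a request header.
        opts.headers[VERTEX_PRIORITY_HEADER] = tier
    else:
        # Gemini API: the tier is assumed to be a field in the request body.
        opts.extra_body[GEMINI_API_TIER_FIELD] = tier
    return opts


if __name__ == "__main__":
    print(priority_http_options(vertexai=True))
    print(priority_http_options(vertexai=False))
```

With explicit plugin support, the same choice could instead be a single constructor argument on the Gemini LLM, so users would not need to know which header or body field each backend expects.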



### Workarounds / Alternatives

_No response_

### Additional Context

_No response_

Gemini - Priority Tiering #5664

Feature Type

Feature Description

Workarounds / Alternatives

Additional Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Gemini - Priority Tiering #5664

Description

Feature Type

Feature Description

Workarounds / Alternatives

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions