VertexAiRagRetrieval.run_async() blocks the event loop with synchronous gRPC call

## Description
`VertexAiRagRetrieval.run_async()` calls `rag.retrieval_query()` directly, which uses synchronous gRPC under the hood (`grpc._channel._blocking` → `call.next_event()`). Although the method is declared `async def`, it never yields during that call, so the event loop is blocked for 1-2 seconds on every invocation.

This is particularly problematic in real-time/streaming applications (e.g. StreamingMode.BIDI with gemini-live-* models) where blocking the event loop causes audio/video desync and dropped frames.
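The effect can be reproduced without Vertex AI. The sketch below is only an illustration: `fake_retrieval_query` is a hypothetical stand-in that sleeps instead of making a gRPC call, and `max_tick_gap` is a helper invented here to measure how long a background task is starved while the "tool" runs.

```python
import asyncio
import time


def fake_retrieval_query(delay: float = 0.3) -> str:
    """Hypothetical stand-in for a synchronous call such as rag.retrieval_query()."""
    time.sleep(delay)  # blocks the calling thread, like a blocking gRPC call
    return "contexts"


async def blocking_tool() -> str:
    # Declared async, but it never awaits, so the loop cannot run anything else.
    return fake_retrieval_query()


async def max_tick_gap(coro_fn, interval: float = 0.01) -> float:
    """Run coro_fn while a ticker task records the largest gap between its wakeups."""
    loop = asyncio.get_running_loop()
    worst = 0.0

    async def ticker() -> None:
        nonlocal worst
        last = loop.time()
        while True:
            await asyncio.sleep(interval)
            now = loop.time()
            worst = max(worst, now - last)
            last = now

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)             # let the ticker start
    await coro_fn()                    # the call under test
    await asyncio.sleep(interval * 2)  # let the ticker observe the gap
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return worst
```

With these assumptions, `asyncio.run(max_tick_gap(blocking_tool))` reports a gap of roughly the full sleep duration, even though the ticker asked to wake every 10 ms. In a BIDI session the starved task would be the audio/video stream handler.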

## Stack trace captured via event loop watchdog

File "google/adk/tools/retrieval/vertex_ai_rag_retrieval.py", line 95, in run_async
    response = rag.retrieval_query(
File "vertexai/preview/rag/rag_retrieval.py", line 270, in retrieval_query
    response = client.retrieve_contexts(request=request)
File "grpc/_channel.py", line 1150, in _blocking
    event = call.next_event()

The event loop is blocked for 1.2 s or more on every RAG tool call, not just the first.
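For anyone who wants to capture a similar trace, here is a minimal sketch of an event loop watchdog. It is not the exact tool used for the trace above. `LoopWatchdog`, its thresholds, and `demo` are hypothetical names, and the sketch relies on the CPython-specific `sys._current_frames()` to read the loop thread's stack from a second thread.

```python
import asyncio
import sys
import threading
import time
import traceback


class LoopWatchdog:
    """Record the loop thread's stack whenever the loop stops ticking for too long."""

    def __init__(self, threshold: float = 0.2, poll: float = 0.05) -> None:
        self.threshold = threshold
        self.poll = poll
        self.reports: list[str] = []
        self._last_tick = time.monotonic()
        self._loop_thread_id = 0
        self._stop = threading.Event()

    async def _heartbeat(self) -> None:
        # Runs on the event loop; the timestamp only advances while the loop is free.
        while not self._stop.is_set():
            self._last_tick = time.monotonic()
            await asyncio.sleep(self.poll)

    def _watch(self) -> None:
        # Runs in a separate thread, so it keeps working while the loop is blocked.
        while not self._stop.wait(self.poll):
            if time.monotonic() - self._last_tick > self.threshold:
                frame = sys._current_frames().get(self._loop_thread_id)
                if frame is not None:
                    self.reports.append("".join(traceback.format_stack(frame)))
                self._last_tick = time.monotonic()  # rate-limit repeated reports

    def start(self) -> asyncio.Task:
        # Must be called from the event loop thread.
        self._loop_thread_id = threading.get_ident()
        threading.Thread(target=self._watch, daemon=True).start()
        return asyncio.create_task(self._heartbeat())

    def stop(self) -> None:
        self._stop.set()


async def demo() -> list[str]:
    dog = LoopWatchdog()
    task = dog.start()
    await asyncio.sleep(0)    # let the heartbeat start
    time.sleep(0.5)           # stand-in for the blocking gRPC call
    await asyncio.sleep(0.1)  # loop is responsive again
    dog.stop()
    await task
    return dog.reports
```

Each entry in `reports` is a formatted stack ending at the frame that was holding the loop, which in a real run would be the `run_async` → `retrieval_query` → `_blocking` chain shown above.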

## Affected code
https://github.com/google/adk-python/blob/main/src/google/adk/tools/retrieval/vertex_ai_rag_retrieval.py#L87-L109

```python
@override
async def run_async(self, *, args, tool_context) -> Any:
    from ...dependencies.vertexai import rag

    # This is a synchronous gRPC call that blocks the event loop
    response = rag.retrieval_query(...)
```
## Suggested fix
Wrap the synchronous call in `asyncio.to_thread()` so it runs in a worker thread while the event loop keeps servicing other tasks:

```python
import asyncio

@override
async def run_async(self, *, args, tool_context) -> Any:
    from ...dependencies.vertexai import rag

    response = await asyncio.to_thread(
        rag.retrieval_query,
        text=args['query'],
        rag_resources=self.vertex_rag_store.rag_resources,
        rag_corpora=self.vertex_rag_store.rag_corpora,
        similarity_top_k=self.vertex_rag_store.similarity_top_k,
        vector_distance_threshold=self.vertex_rag_store.vector_distance_threshold,
    )
```
This is the same pattern used to fix the equivalent issue in GCS artifact storage (#1346 / PR #1347).
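The pattern can be sanity-checked without Vertex AI credentials. In the sketch below, `fake_retrieval_query` is a hypothetical blocking stand-in with a keyword-only signature loosely modeled on the real call, and `run_async` here is a free function rather than the actual tool method. It shows that `asyncio.to_thread()` forwards keyword arguments and that several calls overlap instead of running back to back.

```python
import asyncio
import time


def fake_retrieval_query(
    *, text: str, similarity_top_k: int = 3, delay: float = 0.3
) -> list[str]:
    """Hypothetical stand-in for the blocking rag.retrieval_query() call."""
    time.sleep(delay)  # simulates the synchronous gRPC round trip
    return [f"{text}-{i}" for i in range(similarity_top_k)]


async def run_async(query: str) -> list[str]:
    # The blocking call runs in the default thread pool; the loop stays free.
    return await asyncio.to_thread(
        fake_retrieval_query, text=query, similarity_top_k=2
    )


async def main() -> tuple[list[list[str]], float]:
    start = time.perf_counter()
    results = await asyncio.gather(
        run_async("a"), run_async("b"), run_async("c")
    )
    return results, time.perf_counter() - start
```

Three calls of 0.3 s each finish in well under the 0.9 s that sequential blocking would take. Two caveats: `asyncio.to_thread()` requires Python 3.9 or later, and cancelling the awaiting task does not interrupt a call that is already running in the worker thread.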


## Environment
- `google-adk==1.23.0` (also verified on `main`; still present)
- `google-cloud-aiplatform==1.135.0`
- Python 3.13
- `StreamingMode.BIDI` with `gemini-live-2.5-flash-native-audio`

VertexAiRagRetrieval.run_async() blocks the event loop with synchronous gRPC call #5033
